Similar literature
1.
Recent advances in statistical inference have significantly expanded the toolbox of probabilistic modeling. Historically, probabilistic modeling has been constrained to very restricted model classes, where exact or approximate probabilistic inference is feasible. However, developments in variational inference, a general form of approximate probabilistic inference that originated in statistical physics, have enabled probabilistic modeling to overcome these limitations: (i) Approximate probabilistic inference is now possible over a broad class of probabilistic models containing a large number of parameters, and (ii) scalable inference methods based on stochastic gradient descent and distributed computing engines allow probabilistic modeling to be applied to massive data sets. One important practical consequence of these advances is the possibility to include deep neural networks within probabilistic models, thereby capturing complex non-linear stochastic relationships between the random variables. These advances, in conjunction with the release of novel probabilistic modeling toolboxes, have greatly expanded the scope of applications of probabilistic models, and allowed the models to take advantage of the recent strides made by the deep learning community. In this paper, we provide an overview of the main concepts, methods, and tools needed to use deep neural networks within a probabilistic modeling framework.
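As a concrete illustration of these ideas, the following is a minimal sketch (not tied to any particular toolbox mentioned above) of stochastic variational inference in which a deep neural network parameterizes the variational posterior and the ELBO is maximized by stochastic gradient descent via the reparameterization trick; all names, dimensions, and data are illustrative.

```python
import torch
import torch.nn as nn

class GaussianEncoder(nn.Module):
    """Deep network mapping data x to the mean/log-variance of q(z|x)."""
    def __init__(self, x_dim=10, z_dim=2, hidden=32):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.log_var = nn.Linear(hidden, z_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.log_var(h)

def elbo(x, encoder, decoder):
    mu, log_var = encoder(x)
    eps = torch.randn_like(mu)                 # reparameterization trick:
    z = mu + (0.5 * log_var).exp() * eps       # differentiable sample from q(z|x)
    recon = decoder(z)
    log_lik = -((x - recon) ** 2).sum(dim=1)   # Gaussian log-likelihood, up to a constant
    kl = 0.5 * (mu ** 2 + log_var.exp() - 1 - log_var).sum(dim=1)  # KL(q || N(0, I))
    return (log_lik - kl).mean()

encoder = GaussianEncoder()
decoder = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 10))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
x = torch.randn(64, 10)                        # toy data batch
loss = -elbo(x, encoder, decoder)              # maximize the ELBO = minimize -ELBO
opt.zero_grad(); loss.backward(); opt.step()
```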

2.
In this paper, we study the learnability of the Boolean inner product by a systematic simulation study. The family of Boolean inner product functions is known to be representable by neural networks of threshold neurons of depth 3 with only 2n+1 units (where n is the input dimension), whereas an exact representation by a depth 2 network cannot possibly be of polynomial size. This result can be seen as a strong argument for deep neural network architectures. In our study, we found that this depth 3 architecture for the Boolean inner product is difficult to train, much harder than the depth 2 network, at least for the small input size scenarios n ≤ 16. Nonetheless, the accuracy of the deep architecture increased with the dimension of the input space, reaching 94% on average; since not every training run succeeds, multiple restarts are needed to find the compact depth 3 architecture. Replacing the fully connected first layer by a partially connected layer (a kind of convolutional layer, sparsely connected with weight sharing) can significantly improve learning performance, up to 99% accuracy in our simulations. The learnability of the compact depth 3 representation of the inner product can also be improved by adding just a few additional units to the first hidden layer.
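A minimal sketch of this setup, under my own assumptions about the details: labels are the Boolean inner product mod 2 of the two input halves, and the depth 3 network uses n + n + 1 = 2n + 1 units.

```python
import torch
import torch.nn as nn

n = 8                                            # half the input dimension; inputs are 2n bits
X = torch.randint(0, 2, (1024, 2 * n)).float()
y = (X[:, :n] * X[:, n:]).sum(dim=1) % 2         # inner product: parity of the pairwise ANDs

model = nn.Sequential(                           # depth 3 network, 2n + 1 units in total
    nn.Linear(2 * n, n), nn.Sigmoid(),           # n units (enough to form the pairwise ANDs)
    nn.Linear(n, n), nn.Sigmoid(),               # n units (threshold the sum for parity)
    nn.Linear(n, 1), nn.Sigmoid(),               # 1 output unit
)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCELoss()
for step in range(2000):                         # in practice several restarts may be needed
    opt.zero_grad()
    loss = loss_fn(model(X).squeeze(1), y)
    loss.backward(); opt.step()
acc = ((model(X).squeeze(1) > 0.5).float() == y).float().mean()
print(f"training accuracy: {acc:.3f}")           # often far from 100% on a single run
```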

3.
Recently, there has been growing interest in applying Transfer Entropy (TE) to quantify the effective connectivity between artificial neurons. In a feedforward network, the TE can be used to quantify the relationships between pairs of neuron outputs located in different layers. Our focus is on how to include the TE in the learning mechanisms of a Convolutional Neural Network (CNN) architecture. We introduce a novel training mechanism for CNN architectures that integrates TE feedback connections. Adding the TE feedback parameter accelerates the training process, as fewer epochs are needed; on the flip side, it adds computational overhead to each epoch. According to our experiments on CNN classifiers, to achieve a reasonable overhead-accuracy trade-off it is efficient to consider only the inter-neural information transfer of the neuron pairs between the last two fully connected layers. The TE acts as a smoothing factor, generating stability, and becomes active only periodically rather than after processing each input sample. We can therefore regard the TE in our model as a slowly changing meta-parameter.
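To make the TE quantity concrete, here is a minimal sketch (my own illustration, not the authors' code) of a plug-in transfer entropy estimator for binarized neuron outputs with history length 1.

```python
import numpy as np

def transfer_entropy(src, dst, eps=1e-12):
    """TE(src -> dst) in bits for binary sequences, history length 1."""
    x_next, x_prev, y_prev = dst[1:], dst[:-1], src[:-1]
    te = 0.0
    for xn in (0, 1):
        for xp in (0, 1):
            for yp in (0, 1):
                joint = np.mean((x_next == xn) & (x_prev == xp) & (y_prev == yp))
                p_xp_yp = np.mean((x_prev == xp) & (y_prev == yp))
                p_xn_g_xp_yp = joint / (p_xp_yp + eps)
                p_xp = np.mean(x_prev == xp)
                p_xn_g_xp = np.mean((x_next == xn) & (x_prev == xp)) / (p_xp + eps)
                if joint > 0:
                    te += joint * np.log2((p_xn_g_xp_yp + eps) / (p_xn_g_xp + eps))
    return te

rng = np.random.default_rng(0)
src = rng.integers(0, 2, 5000)
dst = np.roll(src, 1)                     # dst copies src with lag 1
dst[0] = 0
print(transfer_entropy(src, dst))         # close to 1 bit: src drives dst
print(transfer_entropy(dst, src))         # close to 0 bits: no transfer back
```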

4.
Image steganography is a scheme that hides secret information in a cover image without it being perceived. Most existing steganography methods are concerned mainly with the visual similarity between the stego image and the cover image, and they ignore the recovery accuracy of the secret information. In this paper, a steganography method based on invertible neural networks is proposed that can generate stego images with high invisibility and security and can achieve lossless recovery of the secret information. In addition, this paper introduces a mapping module that compresses the information actually embedded, improving the quality of the stego image and its anti-detection ability. To allow the message to be restored without loss, the secret information is converted into a binary sequence and then embedded into the cover image through the forward operation of the invertible neural network; the information is later recovered from the stego image through the inverse operation of the same network. Experimental results show that the proposed method achieves competitive results in the visual quality and security of stego images, and achieves 100% accuracy in information extraction.
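A minimal sketch of the invertibility property the method relies on: an affine coupling layer, the standard building block of invertible neural networks. The paper's specific embedding and mapping modules are not reproduced here; the point is only that the inverse pass recovers the forward input exactly.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        half = dim // 2
        self.net = nn.Sequential(nn.Linear(half, 32), nn.ReLU(), nn.Linear(32, 2 * half))

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        log_s, t = self.net(x1).chunk(2, dim=1)
        return torch.cat([x1, x2 * log_s.exp() + t], dim=1)    # invertible by construction

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        log_s, t = self.net(y1).chunk(2, dim=1)
        return torch.cat([y1, (y2 - t) * (-log_s).exp()], dim=1)

layer = AffineCoupling(dim=8)
cover_and_secret = torch.randn(4, 8)           # e.g., cover features plus a binarized secret
stego = layer(cover_and_secret)                # forward operation: embedding
recovered = layer.inverse(stego)               # inverse operation: lossless extraction
print(torch.allclose(recovered, cover_and_secret, atol=1e-5))  # True
```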

5.
Artificial neural networks have become the go-to solution for computer vision tasks, including problems of the security domain. One such example is re-identification, where deep learning can be part of the surveillance pipeline. This use case necessitates considering an adversarial setting, and neural networks have been shown to be vulnerable to a range of attacks. In this paper, preprocessing defences against adversarial attacks are evaluated, including a block-matching convolutional neural network for image denoising used as an adversarial defence. The benefit of preprocessing defences is that they do not require retraining the classifier, which, in computer vision problems, is a computationally heavy task. The defences are tested in a realistic scenario: a pre-trained, widely available neural network architecture adapted to a specific task with the use of transfer learning. Multiple preprocessing pipelines are tested, and the results are promising.
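A minimal sketch of such a preprocessing pipeline. The median filter below is a crude stand-in for the paper's block-matching CNN denoiser, and the untrained ResNet is a stand-in for the pre-trained, transfer-learned classifier; the structural point is that the classifier stays frozen.

```python
import torch
import torch.nn.functional as F
from torchvision import models                   # assumes torchvision is installed

classifier = models.resnet18(weights=None)       # in practice: pre-trained + transfer-learned
classifier.eval()                                # the classifier itself is never retrained

def median_denoise(x, k=3):
    """Crude median-filter denoiser used as the preprocessing defence step."""
    pad = k // 2
    patches = F.unfold(F.pad(x, [pad] * 4, mode="reflect"), k)
    med = patches.view(x.size(0), x.size(1), k * k, -1).median(dim=2).values
    return med.view_as(x)

attacked = torch.rand(1, 3, 224, 224)            # stand-in for an adversarial image
with torch.no_grad():
    logits = classifier(median_denoise(attacked))
print(logits.argmax(dim=1))
```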

6.
In this paper, we propose a new approach to train a deep neural network with multiple intermediate auxiliary classifiers branching from it. These 'multi-exit' models can reduce inference time by performing an early exit at an intermediate branch if the confidence of the prediction is higher than a threshold. They rely on the assumption that not all samples require the same amount of processing to yield a good prediction. We propose a way to jointly train all the branches of a multi-exit model without additional hyper-parameters, by weighting the predictions from each branch with a trained confidence score. Each confidence score approximates the real confidence of its branch, and it is calculated and regularized while training the rest of the model. We evaluate our proposal on a set of image classification benchmarks, using different neural models and early-exit stopping criteria.
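A minimal sketch of the idea (my reading, not the authors' code): a two-exit model whose first branch also emits a trained confidence score, used both to weight the joint loss and to decide early exit at inference.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoExitNet(nn.Module):
    def __init__(self, d=64, classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(784, d), nn.ReLU())
        self.block2 = nn.Sequential(nn.Linear(d, d), nn.ReLU())
        self.exit1, self.exit2 = nn.Linear(d, classes), nn.Linear(d, classes)
        self.conf1 = nn.Sequential(nn.Linear(d, 1), nn.Sigmoid())  # trained confidence score

    def forward(self, x):
        h1 = self.block1(x)
        h2 = self.block2(h1)
        return self.exit1(h1), self.conf1(h1).squeeze(1), self.exit2(h2)

def joint_loss(model, x, y):
    logits1, conf, logits2 = model(x)
    ce1 = F.cross_entropy(logits1, y, reduction="none")
    ce2 = F.cross_entropy(logits2, y, reduction="none")
    return (conf * ce1 + (1 - conf) * ce2).mean()   # confidence-weighted, no extra weights

@torch.no_grad()
def predict(model, x, threshold=0.8):
    logits1, conf, logits2 = model(x)               # a real deployment would skip block2
    exit_early = (conf > threshold).unsqueeze(1)    # for the samples that exit early
    return torch.where(exit_early, logits1, logits2)
```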

7.
This paper investigates the problem of adaptive event-triggered synchronization for uncertain FNNs subject to double deception attacks and time-varying delay. During network transmission, a practical deception attack phenomenon in FNNs should be considered; that is, we investigate the situation in which the attack occurs simultaneously on both communication channels, from sensor to controller (S-C) and from controller to actuator (C-A), rather than on only one, as in many papers; moreover, the double attacks are described by high-level Markov processes rather than simple random variables. To further reduce the network load, an adaptive event-triggered scheme (AETS) with an adaptive threshold coefficient is first used in FNNs to deal with the deception attacks. Moreover, given the engineering background, uncertain parameters and time-varying delay are also considered, and a feedback control scheme is adopted. On this basis, a closed-loop synchronization error system is constructed. Sufficient conditions that guarantee the stability of the closed-loop system are derived by the Lyapunov–Krasovskii functional method. Finally, a numerical example is presented to verify the effectiveness of the proposed method.
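A minimal sketch of a generic adaptive event-triggered transmission rule, to illustrate how such schemes reduce network load. The triggering condition and the adaptive threshold law below are assumed placeholder forms, not the paper's exact scheme.

```python
import numpy as np

def adaptive_event_trigger(x_traj, sigma0=0.3, rho=0.05, sigma_min=0.05):
    """Return the indices of transmitted samples for a state trajectory."""
    sigma, last, sent = sigma0, x_traj[0], [0]
    for k in range(1, len(x_traj)):
        e = x_traj[k] - last                          # error since the last transmission
        if e @ e > sigma * (x_traj[k] @ x_traj[k]):   # assumed triggering condition
            sent.append(k)
            last = x_traj[k]
            sigma = max(sigma_min, sigma - rho * (e @ e))  # assumed adaptive threshold law
    return sent                                       # untriggered samples are never sent

t = np.linspace(0, 10, 1000)
x_traj = np.stack([np.sin(t), np.cos(2 * t)], axis=1)
sent = adaptive_event_trigger(x_traj)
print(f"{len(sent)} of {len(x_traj)} samples transmitted")
```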

8.
In this paper, a grid-free deep learning method based on a physics-informed neural network (PINN) is proposed for solving the coupled Stokes–Darcy equations with Beavers–Joseph–Saffman interface conditions. This method avoids grid generation and can greatly reduce the amount of computation when solving complex problems. Although the original PINN algorithm has been used to solve many differential equations, we find that directly applying it to the coupled Stokes–Darcy equations does not yield accurate solutions in some cases, for example in the presence of stiff terms caused by small parameters or of discontinuities at the interface. To improve the approximation ability of the PINN, we propose a weighted loss-function strategy, a parallel network structure strategy, and a locally adaptive activation function strategy. In addition, the PINN augmented with these strategies offers guidance for solving other, more complicated multi-physics coupling problems. Finally, the effectiveness of the proposed strategies is verified by numerical experiments.
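A minimal sketch of the loss-weighting idea on a toy one-dimensional problem rather than the coupled Stokes–Darcy system: a PINN for -u''(x) = π² sin(πx) with u(0) = u(1) = 0, where the PDE residual and boundary terms receive separate weights (w_pde and w_bc are my assumed analogue of the paper's loss-function weighting).

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                    nn.Linear(32, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
w_pde, w_bc = 1.0, 50.0                      # separate weights for the two loss terms

for step in range(3000):
    x = torch.rand(128, 1, requires_grad=True)          # random collocation points
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    pde_res = (-d2u - torch.pi ** 2 * torch.sin(torch.pi * x)) ** 2
    xb = torch.tensor([[0.0], [1.0]])                   # boundary points
    loss = w_pde * pde_res.mean() + w_bc * (net(xb) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

print(net(torch.tensor([[0.5]])))            # exact solution is sin(pi/2) = 1
```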

9.
We show that neural networks with an absolute value activation function and with network path norm, network sizes, and network weights having logarithmic dependence on 1/ε can ε-approximate functions that are analytic on certain regions of ℂ^d.

10.
In this work, we formulate image inpainting as a matrix completion problem. Traditional matrix completion methods are generally based on linear models that assume the matrix is low-rank; when the original matrix is large and few elements are observed, they easily overfit and their performance degrades significantly. Recently, researchers have tried to apply deep learning and nonlinear techniques to matrix completion. However, most existing deep-learning-based methods restore each column or row of the matrix independently, which discards the global structure of the matrix and therefore does not achieve the expected results in image inpainting. In this paper, we propose a deep matrix factorization completion network (DMFCNet) for image inpainting that combines deep learning with a traditional matrix completion model. The main idea of DMFCNet is to map the iterative variable updates of a traditional matrix completion model into a fixed-depth neural network. The potential relationships between the observed matrix data are learned in a trainable, end-to-end manner, which yields a high-performance and easy-to-deploy nonlinear solution. Experimental results show that DMFCNet provides higher matrix completion accuracy than state-of-the-art matrix completion methods in a shorter running time.
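A minimal sketch of the unrolling idea behind such networks (my illustration, not DMFCNet itself): iterative singular-value-thresholding updates for matrix completion mapped into a fixed-depth network with one learnable threshold per "layer", trained end-to-end.

```python
import torch
import torch.nn as nn

class UnrolledSVT(nn.Module):
    def __init__(self, depth=5):
        super().__init__()
        self.taus = nn.Parameter(torch.full((depth,), 1.0))   # learnable per-layer thresholds

    def forward(self, M, mask):
        X = M * mask
        for tau in self.taus:                                 # fixed-depth unrolled iterations
            U, S, Vh = torch.linalg.svd(X, full_matrices=False)
            X = U @ torch.diag(torch.relu(S - tau)) @ Vh      # singular-value shrinkage
            X = mask * M + (1 - mask) * X                     # keep the observed entries
        return X

torch.manual_seed(0)
truth = torch.randn(30, 4) @ torch.randn(4, 30)               # rank-4 ground truth
mask = (torch.rand(30, 30) < 0.5).float()                     # 50% of entries observed
model = UnrolledSVT()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(200):                                       # end-to-end training
    loss = ((model(truth, mask) - truth) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```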

11.
Purpose: In this work, we propose an implementation of the Bienenstock–Cooper–Munro (BCM) model obtained by combining the classical framework with modern deep learning methodologies. The BCM model remains one of the most promising approaches to modeling the synaptic plasticity of neurons, but its application has remained mainly confined to neuroscience simulations and a few applications in data science. Methods: To improve the convergence efficiency of the BCM model, we combine the original plasticity rule with the optimization tools of modern deep learning. Through numerical simulations on standard benchmark datasets, we demonstrate the efficiency of the BCM model in learning, memorization capacity, and feature extraction. Results: In all the numerical simulations, visualization of the neuronal synaptic weights confirms the memorization of human-interpretable subsets of patterns. We show numerically that the selectivity obtained by BCM neurons reflects an internal feature extraction procedure that is useful for pattern clustering and classification. Introducing competitiveness between neurons in the same BCM network allows the network to modulate its memorization capacity and the consequent model selectivity. Conclusions: The proposed improvements make the BCM model a suitable alternative to standard machine learning techniques for both feature selection and classification tasks.
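A minimal sketch of the classical BCM plasticity rule itself; the paper's contribution of coupling this rule with modern deep-learning optimizers is not reproduced here, and all constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
patterns = rng.choice([0.0, 1.0], size=(4, 10))   # four recurring input patterns
w = rng.normal(scale=0.1, size=10)
theta, eta, tau = 1.0, 0.01, 0.99                 # sliding threshold, learning rate, averaging

for step in range(20000):
    x = patterns[rng.integers(4)]                 # present one stored pattern
    y = w @ x                                     # neuron activity
    w += eta * y * (y - theta) * x                # BCM: potentiate only if y exceeds theta
    theta = tau * theta + (1 - tau) * y ** 2      # threshold tracks a running mean of y^2

responses = patterns @ w
print(np.round(responses, 2))                     # the neuron typically becomes selective,
                                                  # responding strongly to few of the patterns
```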

12.
13.
Financial and economic time series forecasting has never been an easy task due to its sensitivity to political, economic, and social factors. For this reason, people who invest in financial markets and currency exchange usually look for robust models that can maximize their profits and minimize their losses as much as possible. Fortunately, various recent studies have suggested that a special type of Artificial Neural Network (ANN), the Recurrent Neural Network (RNN), can improve the predictive accuracy for the behavior of financial data over time. This paper aims to forecast: (i) the closing price of eight stock market indexes; and (ii) the closing price of six currency exchange rates against the USD, using the RNN model and its variants, the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). The results show that the GRU gives the overall best results, especially for univariate out-of-sample forecasting of the currency exchange rates and multivariate out-of-sample forecasting of the stock market indexes.
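A minimal sketch (generic, not the paper's exact configuration) of a GRU one-step-ahead forecaster for a univariate closing-price series; the random-walk series stands in for real price data.

```python
import torch
import torch.nn as nn

class GRUForecaster(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, window, 1)
        out, _ = self.gru(x)
        return self.head(out[:, -1])       # predict the next closing price

series = torch.cumsum(torch.randn(500), dim=0)    # stand-in for a price series
window = 30
X = torch.stack([series[i:i + window] for i in range(len(series) - window)]).unsqueeze(-1)
y = series[window:].unsqueeze(-1)                 # next-step targets

model = GRUForecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(50):
    loss = nn.functional.mse_loss(model(X), y)
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```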

14.
In recent years, neural network based image priors have been shown to be highly effective for linear inverse problems, often significantly outperforming conventional methods that are based on sparsity and related notions. While pre-trained generative models are perhaps the most common, it has additionally been shown that even untrained neural networks can serve as excellent priors in various imaging applications. In this paper, we seek to broaden the applicability and understanding of untrained neural network priors by investigating the interaction between architecture selection, measurement models (e.g., inpainting vs. denoising vs. compressive sensing), and signal types (e.g., smooth vs. erratic). We motivate the problem via statistical learning theory, and provide two practical algorithms for tuning architectural hyperparameters. Using experimental evaluations, we demonstrate that the optimal hyperparameters may vary significantly between tasks and can exhibit large performance gaps when tuned for the wrong task. In addition, we investigate which hyperparameters tend to be more important, and which are robust to deviations from the optimum.
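A minimal sketch of an untrained network used as a prior, deep-image-prior style, under an inpainting measurement model; the depth and width below are exactly the kind of architectural hyperparameters whose task dependence the paper studies, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

depth, width = 4, 32                        # architectural hyperparameters to tune per task
body = []
for _ in range(depth - 1):
    body += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU()]
net = nn.Sequential(nn.Conv2d(3, width, 3, padding=1), nn.ReLU(), *body,
                    nn.Conv2d(width, 3, 3, padding=1))

image = torch.rand(1, 3, 64, 64)            # stand-in for the unknown image
mask = (torch.rand(1, 1, 64, 64) < 0.7).float()   # inpainting measurement model
z = torch.randn(1, 3, 64, 64)               # fixed random input to the untrained net

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(500):                     # early stopping acts as the regularizer
    loss = ((mask * (net(z) - image)) ** 2).mean()   # fit the observed pixels only
    opt.zero_grad(); loss.backward(); opt.step()
restored = net(z)                           # the network fills in the masked pixels
```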

15.
Identifying the diffusion type of molecules in living cells is crucial for deducing their driving forces and hence for gaining insight into the characteristics of the cells. In this paper, deep residual networks are used to classify the trajectories of molecules. We started from the well-known ResNet architecture, developed for image classification, and carried out a series of numerical experiments to adapt it to the detection of diffusion modes. We managed to find a model that has better accuracy than the initial network but contains only a small fraction of its parameters. The reduced size significantly shortened the training time of the model. Moreover, the resulting network has less tendency to overfit and generalizes better to unseen data.
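A minimal sketch of the kind of slimmed-down 1D residual network such an adaptation might produce; the channel counts and the number of diffusion classes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ResBlock1d(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv1d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv1d(ch, ch, 3, padding=1)
        self.bn1, self.bn2 = nn.BatchNorm1d(ch), nn.BatchNorm1d(ch)

    def forward(self, x):
        h = torch.relu(self.bn1(self.conv1(x)))
        return torch.relu(x + self.bn2(self.conv2(h)))   # identity shortcut

model = nn.Sequential(
    nn.Conv1d(2, 16, 3, padding=1),      # (x, y) coordinates as 2 input channels
    ResBlock1d(16), ResBlock1d(16),      # far fewer parameters than a full ResNet
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(16, 3),                    # e.g., 3 diffusion modes
)
trajectories = torch.randn(8, 2, 100)    # batch of 8 trajectories of length 100
print(model(trajectories).shape)         # torch.Size([8, 3])
```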

16.
One of the most effective image processing techniques is the use of convolutional neural networks, built from convolutional layers. In each such layer, the value of the layer's output signal at each point is a combination of the layer's input signals corresponding to several neighboring points. To improve accuracy, researchers have developed a version of this technique in which only data from some of the neighboring points is processed. It turns out that the most efficient case, called dilated convolution, is when we select the neighboring points whose differences in both coordinates are divisible by some constant (the dilation factor). In this paper, we explain this empirical efficiency by proving that, for all reasonable optimality criteria, dilated convolution is indeed better than the possible alternatives.
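A minimal sketch showing what the divisibility condition means in practice: a dilated convolution only combines input points whose offsets from the output location are multiples of the dilation constant (here called l, shown in 1D for brevity).

```python
import torch
import torch.nn as nn

l = 2                                              # the dilation constant
conv = nn.Conv1d(1, 1, kernel_size=3, dilation=l, bias=False)
nn.init.ones_(conv.weight)

x = torch.zeros(1, 1, 9)
x[0, 0, 4] = 1.0                                   # unit impulse at position 4
print(conv(x))                                     # [1, 0, 1, 0, 1]: taps spaced l apart,
                                                   # i.e., offsets divisible by the constant
```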

17.
To address the problem that, during deep neural network training, the residual (back-propagated error) shrinks as it propagates deeper and therefore leaves the lower layers insufficiently trained, this paper analyzes the limitations of the traditional sigmoid activation function in deep neural networks and proposes a two-parameter sigmoid activation function. One parameter keeps the activation function's inputs concentrated on both sides of the origin, preventing the activation from entering its saturation region; the other parameter suppresses the rate at which the residual decays. Combined, the two parameters effectively strengthen the training of deep neural networks. Digit classification experiments on the MNIST dataset with a DBN show that the two-parameter sigmoid activation function can be applied directly to deep networks without pre-training, and that it also improves the training performance of the sigmoid activation function in pre-trained deep networks.
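A minimal sketch of one plausible two-parameter sigmoid; the paper's exact parameterization may differ, so treat a (input scaling, which keeps pre-activations away from the saturation region) and b (output scaling, which slows the decay of the back-propagated residual) as assumed placeholders.

```python
import numpy as np

def sigmoid2(x, a=0.5, b=4.0):
    """Two-parameter sigmoid: a scales the input, b scales the output."""
    return b / (1.0 + np.exp(-a * x))

def sigmoid2_grad(x, a=0.5, b=4.0):
    s = 1.0 / (1.0 + np.exp(-a * x))
    return a * b * s * (1.0 - s)           # peak gradient a*b/4 vs 1/4 for plain sigmoid

x = np.linspace(-10, 10, 5)
print(sigmoid2_grad(x))                    # larger gradients, so residuals decay more slowly
print(sigmoid2_grad(x, a=1.0, b=1.0))      # reduces to the standard sigmoid derivative
```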

18.
19.
A PID controller based on a BP neural network
Using neural networks and feedback control theory, a servo control system structure based on a neural-network PID controller is proposed. Its application to a high-precision simulation test turntable confirms that the method avoids the difficulty of tuning well-matched PID parameters and reduces the influence of dry friction on low-speed motion. Experiments show that the method has strong adaptive ability and good regulation quality, and is of considerable practical value.
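A minimal sketch (textbook-style, with assumed details) of the adaptive idea behind a neural-network PID controller: the gains are adjusted online by gradient descent on the squared tracking error, using the usual sign-of-plant-Jacobian approximation that BP-PID schemes rely on. The first-order plant and all constants are illustrative.

```python
import numpy as np

kp, ki, kd = 0.5, 0.05, 0.05        # gains, adapted online (the network's outputs, in effect)
y, e_prev, e_int = 0.0, 0.0, 0.0
lr, setpoint = 1e-3, 1.0

for k in range(1000):
    e = setpoint - y
    e_int += e
    terms = np.array([e, e_int, e - e_prev])            # P, I and D error terms
    u = np.array([kp, ki, kd]) @ terms                  # PID control law
    y = 0.9 * y + 0.1 * u                               # first-order plant stand-in
    # gradient step on J = e^2/2, taking sign(dy/du) > 0 as the Jacobian approximation
    kp, ki, kd = np.array([kp, ki, kd]) + lr * e * terms
    e_prev = e

print(round(y, 3))                                      # settles near the setpoint 1.0
```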

20.
Bistable behavior of neuronal complex networks is investigated in the limited-sustained-activity regime when the network is composed of excitatory and inhibitory neurons. A standard stability analysis is performed on the two metastable states separately. Both theoretical analysis and numerical simulations consistently show that the difference between the time scales of the excitatory and inhibitory populations can dramatically influence the dynamical behavior of the network, leading to a transition from bistable behavior with memory effects to the collapse of bistability. These results suggest a possible mechanism of neuronal information processing that relies solely on tuning time scales.
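A minimal sketch of a generic excitatory-inhibitory rate model (not the paper's network) whose sustained-activity state is bistable; the time constants tau_e and tau_i are the knobs whose ratio, per the abstract, governs whether the bistability survives. All couplings are assumed illustrative values.

```python
import numpy as np

def simulate(e0, tau_e=1.0, tau_i=1.0, dt=0.01, steps=20000):
    f = lambda x: 1.0 / (1.0 + np.exp(-10.0 * (x - 0.5)))   # sigmoidal gain function
    e, i = e0, 0.0
    for _ in range(steps):
        de = (-e + f(2.0 * e - 1.0 * i)) / tau_e            # excitatory population
        di = (-i + f(1.5 * e)) / tau_i                      # inhibitory population
        e, i = e + dt * de, i + dt * di
    return e

print(round(simulate(e0=0.05), 3))   # weak kick: activity dies out (low state)
print(round(simulate(e0=0.70), 3))   # strong kick: sustained activity (high state)
# varying tau_i relative to tau_e probes when this bistability collapses
```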
