Similar Articles
1.
Information field theory (IFT), the information theory for fields, is a mathematical framework for signal reconstruction and non-parametric inverse problems. Artificial intelligence (AI) and machine learning (ML) aim at generating intelligent systems, including systems for perception, cognition, and learning. This overlaps with IFT, which is designed to address perception, reasoning, and inference tasks. Here, the relations between concepts and tools in IFT and those in AI and ML research are discussed. In the context of IFT, fields denote physical quantities that change continuously as a function of space (and time), and information theory refers to Bayesian probabilistic logic equipped with the associated entropic information measures. Reconstructing a signal with IFT is a computational problem similar to training a generative neural network (GNN) in ML. In this paper, the process of inference in IFT is reformulated in terms of GNN training. In contrast to classical neural networks, IFT-based GNNs can operate without pre-training, thanks to the expert knowledge incorporated into their architecture. Furthermore, the cross-fertilization of the variational inference methods used in IFT and in ML is discussed. These discussions suggest that IFT is well suited to address many problems in AI and ML research and application.

2.
Many real-life processes are black-box problems, i.e., their internal workings are inaccessible or a closed-form mathematical expression of the likelihood function cannot be defined. For continuous random variables, likelihood-free inference problems can be solved via Approximate Bayesian Computation (ABC). However, an optimal alternative for discrete random variables has yet to be formulated. Here, we aim to fill this research gap. We propose an adjusted population-based MCMC ABC method by re-defining the standard ABC parameters for discrete variables and by introducing a novel Markov kernel inspired by differential evolution. We first assess the proposed Markov kernel on a likelihood-based inference problem, namely discovering the underlying diseases in a QMR-DT network, and subsequently the entire method on three likelihood-free inference problems: (i) the QMR-DT network with an unknown likelihood function, (ii) learning a binary neural network, and (iii) neural architecture search. The obtained results indicate the high potential of the proposed framework and the superiority of the new Markov kernel.
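
The abstract does not spell the kernel out; as a minimal sketch, a differential-evolution-inspired proposal on binary parameter vectors inside a population of chains could look as follows (the flip probability `gamma` and the plain distance-threshold acceptance are illustrative, and the prior and kernel-symmetry corrections of a full ABC-MCMC acceptance are omitted):

```python
import numpy as np

def de_binary_proposal(pop, i, gamma=0.5, rng=None):
    """Propose a new state for chain i: where two other randomly chosen
    chains disagree, flip the corresponding bits of chain i w.p. gamma."""
    rng = rng or np.random.default_rng()
    others = [j for j in range(len(pop)) if j != i]
    r1, r2 = rng.choice(others, size=2, replace=False)
    diff = pop[r1] ^ pop[r2]                      # disagreement mask (0/1 ints)
    flip = diff * (rng.random(diff.shape) < gamma)
    return pop[i] ^ flip

def abc_step(pop, i, simulate, data, dist, eps, rng):
    """Likelihood-free acceptance: keep the proposal if the simulated data
    fall within distance eps of the observations."""
    theta = de_binary_proposal(pop, i, rng=rng)
    if dist(simulate(theta, rng), data) <= eps:
        pop[i] = theta
```

For a fixed rest of the population the move is symmetric (the reverse move flips the same disagreement bits with the same probability), which is what keeps the acceptance rule this simple.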

3.
A polynomial learning algorithm for a perceptron with binary bonds and random patterns is investigated within dynamic mean-field theory. A discontinuous freezing transition is found at a temperature where the entropy is still positive. Critical slowing down is observed on approaching this temperature from above. The fraction of errors resulting from this learning procedure is finite in the thermodynamic limit for all temperatures and all finite values of the number of patterns per bond. Monte Carlo simulations on larger samples (N ≤ 127) are in quantitative agreement. Simulations on smaller samples indicate a finite bound for the existence of perfect solutions, in agreement with replica theory and the zero-entropy criterion. This suggests that perfect solutions also exist in larger samples but cannot be found with a polynomial procedure, as expected for a combinatorially hard problem.

4.
Learning from examples by a perceptron with N binary synaptic couplings is investigated within dynamic mean-field theory. This applies to learning by simulated annealing, which is a polynomial algorithm for finite cooling rates. For examples created by a teacher perceptron of the same type, a discontinuous freezing transition occurs for a training set of size \(\alpha N\) with \(\alpha < 1.58\) at a temperature where the entropy is still positive. The resulting perceptrons have finite training and generalization error. For \(\alpha > 1.58\) the couplings of the teacher are found by the above process. This work extends previous investigations on a binary perceptron trained with random patterns.

5.
Variational inference is an optimization-based method for approximating the posterior distribution of the parameters of Bayesian probabilistic models. A key challenge of variational inference is to approximate the posterior with a distribution that is computationally tractable yet sufficiently expressive. We propose a novel method for generating samples from a highly flexible variational approximation. The method starts with a coarse initial approximation and generates samples by refining it in selected, local regions. This allows the samples to capture dependencies and multi-modality in the posterior even when these are absent from the initial approximation. We demonstrate theoretically that our method always improves the quality of the approximation, as measured by the evidence lower bound (ELBO). In experiments, our method consistently outperforms recent variational inference methods in terms of log-likelihood and ELBO across three example tasks: the Eight-Schools problem (inference in a hierarchical model), training a ResNet-20 (Bayesian inference in a large neural network), and the Mushroom task (posterior sampling in a contextual bandit problem).
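
For reference, the quality measure invoked here is the standard evidence lower bound; a generic Monte-Carlo estimator of it (not the paper's refinement scheme; `log_joint`, `sample_q`, and `log_q` are placeholder callables) is:

```python
import numpy as np

def elbo_estimate(log_joint, sample_q, log_q, n=1000):
    """ELBO = E_q[log p(x, z) - log q(z)], estimated with n samples z ~ q."""
    zs = sample_q(n)
    return float(np.mean([log_joint(z) - log_q(z) for z in zs]))
```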

6.
In this paper, we propose to leverage the Bayesian uncertainty information encoded in parameter distributions to inform the learning procedure of Bayesian models. We derive a first-principles stochastic differential equation for the training dynamics of the mean and uncertainty parameters of the variational distributions. On the basis of the derived Bayesian stochastic differential equation, we apply the methodology of stochastic optimal control to the variational parameters to obtain individually controlled learning rates. We show that the resulting optimizer, StochControlSGD, is significantly more robust to large learning rates and can adaptively and individually control the learning rates of the variational parameters. The evolution of the control suggests separate and distinct dynamical behaviours in the training regimes of the mean and uncertainty parameters of Bayesian neural networks.
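
The paper's control law is derived from stochastic optimal control and is not given in the abstract; the sketch below only illustrates the structure of an update in which the posterior mean `mu` and the uncertainty parameter `rho` each carry their own controlled learning rate (the damping rule used here is a stand-in assumption, not the derived law):

```python
import numpy as np

def controlled_sgd_step(mu, rho, g_mu, g_rho, lr_mu, lr_rho,
                        base_lr=1e-2, beta=0.9):
    """One step with an individually controlled learning rate per variational
    parameter. The control below (shrink a rate when its gradient is large)
    is purely illustrative."""
    lr_mu = beta * lr_mu + (1 - beta) * base_lr / (1.0 + np.abs(g_mu))
    lr_rho = beta * lr_rho + (1 - beta) * base_lr / (1.0 + np.abs(g_rho))
    return mu - lr_mu * g_mu, rho - lr_rho * g_rho, lr_mu, lr_rho
```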

7.
It is desirable to combine the expressive power of deep learning with Gaussian processes (GPs) in one expressive Bayesian learning model. Deep kernel learning has shown success by using a deep network for feature extraction with a GP as the function model. Recently, it has been suggested that, despite training with the marginal likelihood, the deterministic nature of the feature extractor may lead to overfitting, and that replacing it with a Bayesian network appears to cure this. Here, we propose the conditional deep Gaussian process (DGP), in which the intermediate GPs in the hierarchical composition are supported by hyperdata and the exposed GP remains zero-mean. Motivated by the inducing points in sparse GPs, the hyperdata also play the role of function supports, but are hyperparameters rather than random variables. Following our previous moment-matching approach, the marginal prior of the conditional DGP is approximated by a GP carrying an effective kernel. Thus, as in empirical Bayes, the hyperdata are learned by optimizing the approximate marginal likelihood, which depends on the hyperdata implicitly via the kernel. We show equivalence with deep kernel learning in the limit of dense hyperdata in latent space. However, the conditional DGP and the corresponding approximate inference enjoy the benefit of being more Bayesian than deep kernel learning. Preliminary extrapolation results demonstrate the expressive power gained from the depth of the hierarchy, by exploiting the exact covariance and hyperdata learning, in comparison with GP kernel composition, DGP variational inference, and deep kernel learning. We also address the non-Gaussian aspects of our model as well as a way of upgrading to full Bayesian inference.

8.
We consider the generalization problem for a perceptron with binary synapses, implementing the Stochastic Belief-Propagation-Inspired (SBPI) learning algorithm which we proposed earlier, and perform a mean-field calculation to obtain a differential equation which describes the behaviour of the device in the limit of a large number of synapses N. We show that the solving time of SBPI is of order \(N\sqrt{\log N}\), while the similar, well-known clipped perceptron (CP) algorithm does not converge to a solution at all within the time frame we considered. The analysis gives some insight into the underlying process and shows that, in this context, the SBPI algorithm is equivalent to a new, simpler algorithm, which differs from the CP algorithm only by the addition of a stochastic, unsupervised meta-plastic reinforcement process, whose rate of application must be less than \(\sqrt{2/(\pi N)}\) for learning to be achieved effectively. The analytical results are confirmed by simulations.
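
A minimal sketch of such an update, assuming integer hidden states `h` behind the visible binary weights `w = sign(h)` and a ±2 step (these specifics are assumptions; only the structure, a CP step plus rare stochastic reinforcement, follows the abstract):

```python
import numpy as np

def cp_plus_r_step(h, xi, sigma, p_s, rng):
    """h: odd integer hidden states behind visible weights w = sign(h);
    xi: +/-1 pattern; sigma: +/-1 target label."""
    w = np.sign(h).astype(int)
    if sigma * int(w @ xi) <= 0:
        h += 2 * sigma * xi        # error: clipped-perceptron correction
    elif rng.random() < p_s:
        h += 2 * w                 # stochastic, unsupervised meta-plastic
                                   # reinforcement of the current state
    return h

# per the abstract, the reinforcement rate must stay below sqrt(2 / (pi N))
N = 1001
p_s = 0.9 * np.sqrt(2.0 / (np.pi * N))
```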

9.
We introduce superposition-based quantum networks composed of (i) the classical perceptron model of multilayered, feedforward neural networks and (ii) the algebraic model of evolving reticular quantum structures as described in quantum gravity. The main feature of this model is moving from particular neural topologies to a quantum metastructure which embodies many differing topological patterns. Using quantum parallelism, training is possible on superpositions of different network topologies, so that not only the classical transition functions but also the topology becomes a subject of training. In this model, particular neural networks with different topologies are quantum states. We consider high-dimensional dissipative quantum structures as candidates for implementing the model.

10.
A supervised learning algorithm for obtaining the template coefficients of completely stable cellular neural networks (CNNs) is analysed in this paper. The algorithm resembles the well-known perceptron learning algorithm and is hence called the Recurrent Perceptron Learning Algorithm (RPLA) when applied to a dynamical network. The RPLA learns pointwise-defined algebraic mappings from the initial-state and input spaces into the steady-state output space, rather than learning whole trajectories through desired equilibrium points. The RPLA has been used to train CNNs for image processing tasks and found to be successful in binary image processing. The edge-detection templates found by RPLA perform comparably to Canny's edge detector on binary images.
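
RPLA adjusts the 3x3 feedback template A, control template B, and bias with a perceptron-style rule computed on the settled network output; the learned object is the steady-state map sketched below (standard Chua-Yang CNN dynamics with a piecewise-linear output and zero boundary conditions; the step count and step size are illustrative):

```python
import numpy as np
from scipy.signal import correlate2d

def cnn_steady_output(A, B, bias, u, x0, steps=500, dt=0.05):
    """Integrate x' = -x + A*y + B*u + bias (with y = sat(x)) toward
    equilibrium and return the steady-state output image."""
    sat = lambda v: np.clip(v, -1.0, 1.0)     # piecewise-linear CNN output
    x = x0.astype(float).copy()
    for _ in range(steps):
        x += dt * (-x + correlate2d(sat(x), A, mode="same")
                   + correlate2d(u, B, mode="same") + bias)
    return sat(x)
```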

11.
The principles of the photorefractive perceptron learning algorithm are described. The influences of the finite response time and of hologram erasure of the photorefractive gratings on the convergence of photorefractive perceptron learning are discussed. A novel neural network that overcomes these constraints is presented. It is a hybrid system which uses photorefractive holographic gratings to implement the inner product between the input image and the interconnection matrix. A personal computer stores the interconnection matrix, runs the updating procedure, and also serves as the feedback path during the learning phase. After training, the weight vectors are recorded in the volume hologram of an optical processor. This method combines the massive parallelism of optical systems with the programmability of electronic computers. Experimental results on image classification show that the system correctly classifies input patterns into one of two groups after training on four examples per group over successive iterations. The system has been extended to perform multi-category image classification.

12.
A learning mechanism for neural networks with binary synapses is defined and investigated. The algorithm is based on minimizing the energy of an Ising model. A replica-symmetric calculation gives the parameter range where perfect learning is possible. A simple descent algorithm is studied by numerical simulation, and storage capacities, learning times, and basins of attraction are determined.
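
As a concrete instance of such a descent, here is a minimal single-flip scheme for couplings \(w_i \in \{-1, +1\}\) (a sketch: the paper minimizes an Ising-model energy, which is taken here simply as the number of misclassified patterns):

```python
import numpy as np

def descent_binary_synapses(patterns, labels, sweeps=100, seed=0):
    """patterns: (P, N) array of +/-1 inputs; labels: (P,) array of +/-1
    targets. Returns binary couplings found by greedy single-flip descent."""
    rng = np.random.default_rng(seed)
    n_patterns, N = patterns.shape
    w = rng.choice([-1, 1], size=N)
    energy = lambda w: int(np.sum(labels * (patterns @ w) <= 0))
    for _ in range(sweeps):
        for i in rng.permutation(N):
            e0 = energy(w)
            w[i] = -w[i]              # trial flip of one binary coupling
            if energy(w) > e0:
                w[i] = -w[i]          # revert flips that raise the energy
        if energy(w) == 0:            # perfect learning reached
            break
    return w
```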

13.
T.L.H. Watkin, Physica A 1993, 200(1-4): 628-635
We introduce optimal learning with a neural network, which we define as building a network with minimal expected generalisation error. This procedure may be analysed exactly in idealized problems by exploiting the relationship between sampling a space of hypotheses and the replica method of statistical physics. We find that the optimally trained spherical perceptron may learn a linearly separable rule as well as any possible computer, and present simulation results supporting our conclusions. Optimal learning of a well-known unlearnable problem, the "mismatched weight" problem, gives better asymptotic learning than conventional techniques, and may be simulated more easily. Unlike many other perceptron learning schemes, optimal learning extends to more general networks learning more complex rules.

14.
Analysis of finite, noisy time-series data leads to modern statistical inference methods. Here we adapt Bayesian inference to applied symbolic dynamics. We show that reconciling Kolmogorov's maximum-entropy partition with the methods of Bayesian model selection requires two separate optimizations. First, instrument design produces a maximum-entropy symbolic representation of the time-series data. Second, Bayesian model comparison with a uniform prior selects a minimum-entropy model, with respect to the considered Markov chain orders, of the symbolic data. We illustrate these steps using a binary partition of time-series data from the logistic and Hénon maps, as well as the Rössler and Lorenz attractors with dynamical noise. In each case we demonstrate the inference of effectively generating partitions and of kth-order Markov chain models.
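
The model-comparison step has a closed form for Markov chains. A sketch for binary symbolic data, assuming a symmetric Dirichlet prior (strength `a`, an illustrative choice) on each row of the k-th-order transition matrix, so that under a uniform prior over orders the order with the highest log-evidence wins:

```python
from collections import Counter
from math import lgamma

def log_evidence_markov(sym, k, alphabet=2, a=1.0):
    """Log marginal likelihood of a k-th order Markov chain over `sym`,
    with a Dirichlet(a) prior on each context's next-symbol distribution."""
    counts = Counter((tuple(sym[i:i + k]), sym[i + k])
                     for i in range(len(sym) - k))
    contexts = {}
    for (ctx, nxt), n in counts.items():
        contexts.setdefault(ctx, {})[nxt] = n
    logev = 0.0
    for row in contexts.values():           # Dirichlet-multinomial per context
        n_tot = sum(row.values())
        logev += lgamma(alphabet * a) - lgamma(alphabet * a + n_tot)
        logev += sum(lgamma(a + n) - lgamma(a) for n in row.values())
    return logev

sym = [0, 1, 1, 0, 1, 1, 0, 1]              # illustrative binary symbol stream
best_k = max(range(4), key=lambda k: log_evidence_markov(sym, k))
```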

15.
A modularly structured neural network model is considered. Each module, which we call a 'cell', consists of two parts: a Hopfield neural network and a multilayered perceptron. An array of such cells is used to simulate the Rule 110 cellular automaton with high accuracy, even when all the units of the neural networks are replaced by stochastic binary ones. We also find that noise not only degrades but can also facilitate computation when the outputs of the multilayered perceptrons are below the threshold required to update the states of the cells, a form of stochastic resonance in computation.
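
For reference, the target computation is elementary cellular automaton Rule 110, whose update table is the binary expansion of 110; one synchronous update with periodic boundaries:

```python
def rule110_step(cells):
    """Map each neighborhood (left, centre, right) through Rule 110
    (110 = 0b01101110, read off below from neighborhood 111 down to 000)."""
    table = {(1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
             (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0}
    n = len(cells)
    return [table[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])]
            for i in range(n)]
```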

16.
Ming-Jian Guo, Chinese Physics B 2022, 31(7): 078702
Memristive neural networks have attracted tremendous attention because a memristor array can perform parallel multiply-accumulate (MAC) operations and in-memory computation, in contrast to digital CMOS hardware systems. However, owing to device variability, implementing high-precision neural networks on memristive computation units remains difficult. Existing learning algorithms for memristive artificial neural networks (ANNs) cannot reach performance comparable to that of high-precision CMOS-based systems. Here, we propose an off-chip learning algorithm for low-precision memristive ANNs. The network is trained at high precision on digital CPUs, its weights are then quantized to low precision, and the quantized weights are mapped onto the memristor arrays (modelled with the VTEAM model) using a pulse-coding weight-mapping rule. In this work, we run inference of a trained five-layer convolutional neural network on the memristor arrays and achieve an accuracy close to that of high-precision (64-bit) inference. Compared with other off-chip learning algorithms, the proposed algorithm makes the mapping process easy to implement and is less affected by device variability. Our results provide an effective approach to implementing ANNs on memristive hardware platforms.
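
A minimal sketch of the off-chip flow described here, assuming uniform quantization and a linear level-to-conductance map (the VTEAM device physics and the exact pulse-coding rule are beyond this sketch; all names and ranges are illustrative):

```python
import numpy as np

def map_weights_to_conductance(W, n_bits=4, g_min=1e-6, g_max=1e-4):
    """Quantize trained weights W to 2**n_bits levels, then map each level
    linearly onto the programmable conductance range [g_min, g_max]."""
    levels = 2 ** n_bits - 1
    w_min, w_max = W.min(), W.max()
    q = np.round((W - w_min) / (w_max - w_min) * levels)   # integer levels
    G = g_min + q / levels * (g_max - g_min)               # target conductances
    pulses = q.astype(int)     # e.g., one programming pulse per level step
    return G, pulses
```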

17.
We show that the phase sensitivity \(\Delta\theta\) of a Mach-Zehnder interferometer illuminated by a coherent state in one input port and a squeezed-vacuum state in the other port is (i) independent of the true value of the phase shift and (ii) can reach the Heisenberg limit \(\Delta\theta \sim 1/N_T\), where \(N_T\) is the average number of input particles. We also demonstrate that the Cramér-Rao lower bound of the phase sensitivity, \(\Delta\theta \sim 1/\sqrt{|\alpha|^2 e^{2r} + \sinh^2 r}\), can be saturated for arbitrary values of the squeezing parameter \(r\) and of the amplitude \(\alpha\) of the coherent mode by using a Bayesian phase-inference protocol.

18.
Traditionally, Hawkes processes are used to model continuous-time point processes with history dependence. Here, we propose an extended model in which the self-effects are of both excitatory and inhibitory types and follow a Gaussian process. Whereas previous work either relies on a less flexible parameterization of the model or requires a large amount of data, our formulation allows for both a flexible model and learning when data are scarce. We continue the line of work on Bayesian inference for Hawkes processes and derive an inference algorithm by performing inference on an aggregated sum of Gaussian processes. Approximate Bayesian inference is achieved via data augmentation, and we describe a mean-field variational inference approach to learn the model parameters. To demonstrate the flexibility of the model, we apply our methodology to data from different domains and compare it to previously reported results.
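
The conditional intensity of such a process is the base rate plus the summed self-effects of past events; in the sketch below, `phi` stands in for a draw from the Gaussian process (any callable works here), and clipping at zero is an assumed way of keeping the intensity nonnegative when inhibition dominates:

```python
import numpy as np

def hawkes_intensity(t, events, mu, phi):
    """lambda(t) = max(mu + sum_{t_i < t} phi(t - t_i), 0): base rate mu
    plus the summed self-effects of all past events."""
    past = events[events < t]
    return max(mu + float(np.sum(phi(t - past))), 0.0)

# example: an exponentially decaying excitatory effect (illustrative choice)
events = np.array([0.5, 1.2, 3.0])
lam = hawkes_intensity(3.5, events, mu=0.2,
                       phi=lambda s: 0.8 * np.exp(-2.0 * s))
```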

19.
The accurate prediction of the solar diffuse fraction (DF), sometimes called the diffuse ratio, is an important topic in solar energy research. In the present study, the current state of diffuse-irradiance research is discussed, and three robust machine learning (ML) models are examined using a large dataset (almost eight years) of hourly readings from Almeria, Spain. The ML models used herein are a hybrid adaptive network-based fuzzy inference system (ANFIS), a single multi-layer perceptron (MLP), and a hybrid multi-layer perceptron grey wolf optimizer (MLP-GWO). These models were evaluated for their predictive precision using various solar and DF irradiance data from Spain. The results were then assessed using frequently used evaluation criteria: the mean absolute error (MAE), mean error (ME), and root mean square error (RMSE). The results showed that the MLP-GWO model, followed by the ANFIS model, provided higher performance in both the training and the testing procedures.
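
Of the three models, the plain MLP has a standard off-the-shelf implementation; a sketch of training and scoring one with the paper's three criteria, on stand-in random data (the real study uses hourly solar readings from Almeria, whose exact feature set is not listed here):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

# hypothetical feature matrix X (hourly solar/meteorological readings)
# and target y (diffuse fraction in [0, 1])
rng = np.random.default_rng(0)
X, y = rng.random((1000, 5)), rng.random(1000)

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                     random_state=0)
model.fit(X[:800], y[:800])
pred = model.predict(X[800:])

mae = mean_absolute_error(y[800:], pred)
me = np.mean(pred - y[800:])                        # mean error (bias)
rmse = np.sqrt(mean_squared_error(y[800:], pred))   # RMSE as in the paper
print(f"MAE={mae:.3f}  ME={me:.3f}  RMSE={rmse:.3f}")
```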
