Similar Documents (20 results)
1.
A fairly comprehensive analysis is presented for the gradient descent dynamics for training two-layer neural network models in the situation when the parameters in both layers are updated. General initialization schemes as well as general regimes for the network width and training data size are considered. In the overparametrized regime, it is shown that gradient descent dynamics can achieve zero training loss exponentially fast regardless of the quality of the labels. In addition, it is proved that throughout the training process the functions represented by the neural network model are uniformly close to those of a kernel method. For general values of the network width and training data size, sharp estimates of the generalization error are established for target functions in the appropriate reproducing kernel Hilbert space.
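As a hedged illustration of the dynamics analyzed above, here is a minimal NumPy sketch of full-batch gradient descent on both layers of a width-m ReLU network with NTK-style 1/sqrt(m) output scaling. The architecture, scaling, and step size are illustrative assumptions, not the paper's construction; in the overparametrized regime (m much larger than n) the printed training loss should decay rapidly even for random labels.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 5, 2000                        # samples, input dim, hidden width
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = rng.standard_normal(n)                   # arbitrary labels, no structure

W = rng.standard_normal((m, d))              # inner-layer weights
a = rng.standard_normal(m)                   # outer-layer weights
lr = 1.0

for step in range(501):
    H = np.maximum(X @ W.T, 0.0)             # ReLU features, shape (n, m)
    r = H @ a / np.sqrt(m) - y               # residuals under 1/sqrt(m) scaling
    if step % 100 == 0:
        print(step, 0.5 * np.mean(r ** 2))
    # mean-squared-error gradients for both layers (ReLU derivative = H > 0)
    grad_a = H.T @ r / (n * np.sqrt(m))
    grad_W = (np.outer(r, a) * (H > 0)).T @ X / (n * np.sqrt(m))
    a -= lr * grad_a
    W -= lr * grad_W
```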

2.
Convergence Analysis of the Online Gradient Method for Training Product-Unit Neural Networks
1 Introduction. Traditional feedforward neural networks composed solely of summation units have been widely used in fields such as pattern recognition and function approximation. When handling relatively complex problems, however, such networks often need a large number of additional hidden nodes, which inevitably incr…
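For readers unfamiliar with product units, the sketch below contrasts the two unit types under the usual definition, in which a product unit computes a product of its inputs raised to trainable exponents. The names and input values are illustrative assumptions, not taken from the truncated text.

```python
import numpy as np

def product_unit(x, w):
    """Product unit: prod_k x_k ** w_k, for positive inputs x."""
    return np.prod(np.power(x, w))

def summation_unit(x, w):
    """Conventional additive unit: sum_k w_k * x_k."""
    return np.dot(w, x)

x = np.array([1.5, 2.0, 0.5])
w = np.array([0.3, -1.0, 2.0])
print(product_unit(x, w), summation_unit(x, w))
```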

3.
In this study, two manufacturing systems, a kanban-controlled system and a multi-stage, multi-server production line in a diamond tool production system, are optimized utilizing neural network metamodels (tst_NNM) trained via tabu search (TS), which was developed previously by the authors. The most widely used training algorithm for neural networks has been backpropagation (BP), which is based on a gradient technique that requires significant computational effort. To deal with the major shortcomings of BP, such as the tendency to converge to a local optimum and a slow convergence rate, the TS metaheuristic is used to train the artificial neural networks and improve the performance of the metamodelling approach. The metamodels are analysed based on their ability to predict simulation results, compared with traditional neural network metamodels trained by the BP algorithm (bp_NNM). Computational results show that tst_NNM is superior to bp_NNM for both manufacturing systems.
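As a hedged sketch of the training idea, the following is a generic tabu search loop (not the authors' tst_NNM procedure) that fits a tiny metamodel by searching weight space with a fixed-length tabu list. The data, network size, and TS settings are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (50, 2))                  # stand-in simulation inputs
y = np.sin(X[:, 0]) + X[:, 1] ** 2               # stand-in simulation responses

def mlp(w, X):
    """Unpack a flat weight vector into a 2-8-1 tanh network and evaluate it."""
    W1 = w[:16].reshape(2, 8); b1 = w[16:24]
    W2 = w[24:32];             b2 = w[32]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def mse(w):
    return np.mean((mlp(w, X) - y) ** 2)

w = rng.standard_normal(33) * 0.5
best_w, best_f = w.copy(), mse(w)
tabu = []                                        # hashes of recent solutions
for it in range(300):
    # neighborhood: random perturbations of the current solution
    candidates = [w + rng.normal(0, 0.1, w.size) for _ in range(20)]
    candidates = [c for c in candidates
                  if hash(np.round(c, 2).tobytes()) not in tabu]
    if not candidates:
        continue
    w = min(candidates, key=mse)                 # best admissible neighbor
    tabu.append(hash(np.round(w, 2).tobytes()))
    if len(tabu) > 50:
        tabu.pop(0)                              # fixed tabu tenure
    if mse(w) < best_f:
        best_w, best_f = w.copy(), mse(w)
print("best MSE found by tabu search:", best_f)
```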

4.
The online gradient algorithm has been widely used as a learning algorithm for feedforward neural network training. In this paper, we prove a weak convergence theorem for an online gradient algorithm with a penalty term, assuming that the training examples are input in a stochastic way. The monotonicity of the error function in the iteration and the boundedness of the weights are both guaranteed. We also present a numerical experiment to support our results.
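A minimal sketch of the algorithm class studied here: an online gradient step on a penalized error, with examples drawn stochastically. The one-neuron model and the constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 3))
y = np.tanh(X @ np.array([1.0, -2.0, 0.5]))      # synthetic teacher target

w = rng.standard_normal(3)
lr, lam = 0.1, 0.01                              # step size, penalty coefficient
for step in range(2000):
    i = rng.integers(len(X))                     # examples input stochastically
    out = np.tanh(X[i] @ w)
    e = out - y[i]
    # instantaneous penalized error 0.5*e^2 + 0.5*lam*||w||^2; the lam*w term
    # is what keeps the weight sequence bounded during training
    w -= lr * (e * (1 - out ** 2) * X[i] + lam * w)
print("final weights:", w, "norm:", np.linalg.norm(w))
```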

5.
Deep neural networks have successfully been trained in various application areas with stochastic gradient descent. However, there exists no rigorous mathematical explanation of why this works so well. The training of neural networks with stochastic gradient descent has four different discretization parameters: (i) the network architecture; (ii) the amount of training data; (iii) the number of gradient steps; and (iv) the number of randomly initialized gradient trajectories. While it can be shown that the approximation error converges to zero if all four parameters are sent to infinity in the right order, we demonstrate in this paper that stochastic gradient descent fails to converge for ReLU networks if their depth is much larger than their width and the number of random initializations does not increase to infinity fast enough.
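The following hedged experiment probes the failure mode informally: when depth far exceeds width, a large fraction of random initializations already yield a ReLU network that is constant on the data, so a single gradient trajectory cannot reduce the loss. The depth, width, and He-style init scheme are assumptions for illustration, not the paper's exact setting.

```python
import numpy as np

rng = np.random.default_rng(3)

def init_net(depth=30, width=2, d_in=4):
    """He-initialized weights for a deep, narrow ReLU network."""
    dims = [d_in] + [width] * depth
    return [rng.standard_normal((dims[i + 1], dims[i])) * np.sqrt(2.0 / dims[i])
            for i in range(depth)]

def forward(Ws, x):
    h = x
    for W in Ws:
        h = np.maximum(W @ h, 0.0)               # ReLU layer
    return h

trials = 200
xs = rng.standard_normal((20, 4))
constant = 0
for _ in range(trials):
    Ws = init_net()
    outs = np.array([forward(Ws, x) for x in xs])
    if np.allclose(outs, outs[0]):               # e.g. an all-dead network
        constant += 1
print(f"{constant}/{trials} random inits are constant on all 20 inputs")
```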

6.
Online Gradient Methods with a Punishing Term for Neural Networks
1 Introduction
Online gradient methods (OGM, for short) are widely used for training neural networks (cf. [1,2,3,4]). Its iterative convergence for linear models is proved in e.g. [5,6,7]. A nonlinear model is considered in [8]. During the iterative training procedure, sometimes (see the next section of this paper) the weight of the network may become very large, causing difficulties in the implementation of the network by electronic circuits. A revised error function is presented in [9] to prev…

7.
An artificial neural network (ANN) model for economic analysis of risky projects is presented in this paper. Outputs of conventional simulation models are used as neural network training inputs. The neural network model is then used to predict the potential returns from an investment project with stochastic parameters. The nondeterministic aspects of the project include the initial investment, the magnitude of the rate of return, and the investment period. The backpropagation method is used in the neural network modeling, with sigmoid and hyperbolic tangent functions used in the learning aspect of the system. Analysis of the outputs of the neural network model indicates that more predictive capability can be achieved by coupling conventional simulation with neural network approaches. The trained network was able to predict simulation output from the input values with very good accuracy for conditions not in its training set. This allows analysis of the future performance of the investment project without having to run additional expensive and time-consuming simulation experiments.
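A hedged sketch of the workflow: a toy Monte Carlo "simulation" generates (investment, rate, period) to NPV pairs, and a small backpropagation network with tanh activation (scikit-learn's MLPRegressor) is trained as the surrogate. The economic model and all constants are illustrative assumptions, not the paper's.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)

def simulate_npv(invest, rate, years, runs=200):
    """Toy stochastic simulation: mean NPV of noisy annual cash flows."""
    years = int(years)
    cash = invest * 0.25 * (1 + rng.normal(0, 0.1, (runs, years)))
    t = np.arange(1, years + 1)
    return np.mean(np.sum(cash / (1 + rate) ** t, axis=1) - invest)

# generate training data from the "expensive" simulator
X = np.column_stack([rng.uniform(50, 150, 300),      # initial investment
                     rng.uniform(0.03, 0.15, 300),   # rate of return
                     rng.integers(3, 11, 300)])      # investment period
y = np.array([simulate_npv(*row) for row in X])

net = MLPRegressor(hidden_layer_sizes=(10,), activation="tanh",
                   max_iter=5000, random_state=0).fit(X, y)
# later queries need no new simulation runs
print("predicted NPV:", net.predict([[100, 0.08, 5]])[0])
```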

8.
This paper compares three ways of generating neural network models: a conventionally trained neural network, a neural network trained by a genetic algorithm, and a hybrid in which the parameters trained by the second method are further refined by conventional neural network optimization. The three methods are used to fit and forecast the prices of representative stocks and commodities. Comparing the accuracy and stability of the forecasts shows that introducing the genetic algorithm reduces the in-sample fitting error, while the third method achieves the lowest out-of-sample forecasting error and the best stability.
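A minimal sketch of the third, hybrid scheme on a toy problem: a small genetic algorithm searches the weight space first, and plain gradient descent then fine-tunes the best individual. The linear model and GA settings are assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((100, 4))
y = X @ np.array([0.5, -1.0, 2.0, 0.0]) + rng.normal(0, 0.1, 100)

def mse(w):
    return np.mean((X @ w - y) ** 2)

# --- stage 1: genetic algorithm over weight vectors ---
pop = rng.standard_normal((30, 4))
for gen in range(50):
    fit = np.array([mse(w) for w in pop])
    parents = pop[np.argsort(fit)[:10]]                  # truncation selection
    kids = [(a + b) / 2 + rng.normal(0, 0.05, 4)         # crossover + mutation
            for a, b in zip(parents[rng.integers(0, 10, 20)],
                            parents[rng.integers(0, 10, 20)])]
    pop = np.vstack([parents, kids])

w = pop[np.argmin([mse(w) for w in pop])]

# --- stage 2: gradient fine-tuning from the GA solution ---
for _ in range(200):
    w = w - 0.05 * 2 * X.T @ (X @ w - y) / len(y)
print("final MSE after hybrid training:", mse(w))
```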

9.
A finite impulse response (FIR) neural network, with tap delay lines after each neuron in the hidden layer, is used. A genetic algorithm with arithmetic decimal crossover, roulette selection, and normally distributed mutation with a linear combination rule is used to optimize the FIR neural network. The method is applied to the prediction of several important benchmark chaotic time series, such as the geomagnetic activity index time series and the famous Mackey–Glass time series. The simulation results show that dynamic neural models are more satisfactory than feedforward neural networks for modeling highly nonlinear chaotic systems. Likewise, a global optimization method such as the genetic algorithm is more efficient than nonlinear gradient-based optimization methods such as momentum and conjugate gradient.
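A hedged sketch of the FIR-synapse idea: each connection is a short tapped delay line (an FIR filter over recent activations) rather than a single scalar weight, which gives the network temporal memory. In the paper the taps would be optimized by the genetic algorithm; here they are set randomly, and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)

def fir_neuron(x_hist, taps):
    """One FIR connection: weighted sum over the last len(taps) inputs."""
    return np.dot(taps, x_hist[-len(taps):][::-1])

# a hidden layer of 3 tanh units, each fed through a 4-tap delay line
taps = rng.standard_normal((3, 4)) * 0.5
out_w = rng.standard_normal(3) * 0.5

def predict(series, t):
    hidden = np.tanh([fir_neuron(series[:t], taps[j]) for j in range(3)])
    return np.dot(out_w, hidden)

series = np.sin(0.3 * np.arange(100))        # stand-in for a chaotic series
print("one-step prediction at t=50:", predict(series, 50))
```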

10.
We rigorously prove a central limit theorem for neural network models with a single hidden layer. The central limit theorem is proven in the asymptotic regime of simultaneously (A) large numbers of hidden units and (B) large numbers of stochastic gradient descent training iterations. Our result describes the neural network’s fluctuations around its mean-field limit. The fluctuations have a Gaussian distribution and satisfy a stochastic partial differential equation. The proof relies upon weak convergence methods from stochastic analysis. In particular, we prove relative compactness for the sequence of processes and uniqueness of the limiting process in a suitable Sobolev space.
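In the notation common to such mean-field analyses (an assumption, since the abstract does not fix notation), with $\theta^i_t$ the parameters of the $N$ hidden units after training time $t$, the result concerns the scaled fluctuation process

$$\mu^N_t = \frac{1}{N}\sum_{i=1}^{N} \delta_{\theta^i_t}, \qquad \eta^N_t = \sqrt{N}\left(\mu^N_t - \bar{\mu}_t\right),$$

where $\bar{\mu}_t$ is the mean-field limit; the theorem states that $\eta^N_t$ converges to a Gaussian limit that solves a stochastic partial differential equation.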

11.
We present a method to solve boundary value problems using artificial neural networks (ANN). A trial solution of the differential equation is written as a feedforward neural network containing adjustable parameters (the weights and biases). From the differential equation and its boundary conditions we prepare the energy function, which is used in the backpropagation method with a momentum term to update the network parameters. We improve the energy function of the ANN, which is derived from the Schrödinger equation and the boundary conditions. With this improved energy function we can use an unsupervised training method in the ANN for solving the equation; unsupervised training aims to minimize a non-negative energy function. We used the ANN method to solve the Schrödinger equation for a few quantum systems. Eigenfunctions and energy eigenvalues are calculated. Our numerical results are in agreement with the corresponding analytical solutions and show the efficiency of the ANN method for solving eigenvalue problems.
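A minimal sketch of the trial-solution technique on a simpler boundary value problem in the same spirit (not the paper's Schrödinger energy function): the factor x(1-x) hard-wires the boundary conditions u(0)=u(1)=0, so the squared equation residual can be minimized without constraints. The tiny network and optimizer choice are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

xs = np.linspace(0, 1, 41)

def net(theta, x):
    """Tiny 1-5-1 tanh network with flat parameter vector theta (16 values)."""
    w1, b1, w2, b2 = theta[:5], theta[5:10], theta[10:15], theta[15]
    return np.tanh(np.outer(x, w1) + b1) @ w2 + b2

def trial(theta, x):
    return x * (1 - x) * net(theta, x)          # boundary conditions built in

def residual(theta):
    """Squared residual of u'' = -pi^2 sin(pi x), whose solution is sin(pi x)."""
    u = trial(theta, xs)
    upp = np.gradient(np.gradient(u, xs), xs)   # numerical second derivative
    return np.mean((upp + np.pi ** 2 * np.sin(np.pi * xs)) ** 2)

theta = minimize(residual, np.random.default_rng(7).standard_normal(16),
                 method="BFGS").x
print("max error vs sin(pi x):",
      np.max(np.abs(trial(theta, xs) - np.sin(np.pi * xs))))
```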

12.
A method for controlling chaos when the mathematical model of the system is unknown is presented in this paper. The controller is designed by the pole placement algorithm, which provides a linear feedback control method. For calculating the feedback gain, a neural network is used to identify the system, from which the Jacobian of the system at its fixed point can be approximated. The weights of the neural network are adjusted online by the gradient descent algorithm, in which the difference between the system output and the network output is the error to be decreased. The method is applied to both discrete-time and continuous-time systems. For continuous-time systems, equivalent discrete-time systems are constructed using the Poincaré map concept. Two discrete-time systems and one continuous-time system are tested as simulation examples, and the results show good functionality of the proposed method. It can be concluded that chaos in systems with unknown dynamics may be eliminated by the presented intelligent control system based on pole placement and a neural network.
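A hedged one-dimensional sketch of the scheme on the logistic map (an illustrative system, not necessarily one of the paper's examples): a single-parameter model estimates the local Jacobian online by gradient descent, and linear feedback then places the closed-loop pole at zero. Constants may need tuning.

```python
import numpy as np

r = 3.9
f = lambda x, u: r * x * (1 - x) + u             # controlled logistic map
x_star = 1 - 1 / r                               # fixed point of the free map

a_hat, lr, x = 0.0, 0.3, 0.3                     # Jacobian estimate, rate, state
for t in range(300):
    dx = x - x_star
    u = -a_hat * dx if t > 100 else 0.0          # feedback places the pole at 0
    x_next = f(x, u)
    err = (x_next - x_star) - (a_hat * dx + u)   # linear-model prediction error
    a_hat += lr * err * dx                       # online gradient on 0.5*err^2
    x = x_next
print("estimated Jacobian:", a_hat, "exact:", r * (1 - 2 * x_star))
print("final state:", x, "fixed point:", x_star)
```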

13.
This paper studies the convergence of an online gradient learning algorithm with a penalty term for three-layer BP neural networks. Before each training epoch begins, the training samples are randomly permuted, so that the learning process can more easily escape local minima. A monotonicity theorem for the error function and weak and strong convergence theorems for the algorithm are given.
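A minimal sketch of the training loop described above: an online gradient pass with a weight penalty in which the sample order is randomly permuted before every epoch. The linear model and constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)
X = rng.standard_normal((100, 3))
y = X @ np.array([1.0, -0.5, 0.2])

w, lr, lam = rng.standard_normal(3), 0.05, 0.001
for epoch in range(50):
    for i in rng.permutation(len(X)):            # fresh random order each epoch
        e = X[i] @ w - y[i]
        w -= lr * (e * X[i] + lam * w)           # gradient of 0.5*e^2 + 0.5*lam*|w|^2
print("final weights:", w)
```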

14.
This study compares the predictive performance of three neural network methods, namely learning vector quantization, the radial basis function network, and the feedforward network trained with the conjugate gradient optimization algorithm, with the performance of logistic regression and the backpropagation algorithm. All these methods are applied to a dataset of 139 matched pairs of bankrupt and non-bankrupt US firms for the period 1983–1994. The results of this study indicate that the contemporary neural network methods applied in the present study provide superior results to those obtained from the logistic regression method and the backpropagation algorithm.

15.
Incremental Gradient Algorithms with Stepsizes Bounded Away from Zero
We consider the class of incremental gradient methods for minimizing a sum of continuously differentiable functions. An important novel feature of our analysis is that the stepsizes are kept bounded away from zero. We derive the first convergence results of any kind for this computationally important case. In particular, we show that a certain ε-approximate solution can be obtained and establish the linear dependence of ε on the stepsize limit. Incremental gradient methods are particularly well-suited for large neural network training problems, where obtaining an approximate solution is typically sufficient and often preferable to computing an exact solution. Thus, in the context of neural networks, the approach presented here is related to the principle of tolerant training. Our results justify numerous stepsize rules that were derived on the basis of extensive numerical experimentation but for which no theoretical analysis was previously available. In addition, convergence to (exact) stationary points is established when the gradient satisfies a certain growth property.
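A hedged numerical illustration of the main point: cycling through the component functions with a constant stepsize drives the iterates into a neighborhood of the minimizer whose radius shrinks roughly linearly with the stepsize. The least-squares instance is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((50, 3))
b = rng.standard_normal(50)
w_exact = np.linalg.lstsq(A, b, rcond=None)[0]   # exact minimizer for reference

def run(step, sweeps=200):
    """Incremental gradient on F(w) = sum_i 0.5*(a_i.w - b_i)^2, constant step."""
    w = np.zeros(3)
    for _ in range(sweeps):
        for i in range(len(A)):                  # one incremental step per f_i
            w -= step * (A[i] @ w - b[i]) * A[i]
    return np.linalg.norm(w - w_exact)

for step in (0.02, 0.01, 0.005):                 # halving the stepsize limit
    print(f"step={step}: distance to minimizer = {run(step):.4f}")
```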

16.
The Fokker–Planck equation is a useful tool to analyze the transient probability density function of the states of a stochastic differential equation. In this paper, a multilayer perceptron neural network is utilized to approximate the solution of the Fokker–Planck equation. To use unconstrained optimization in neural network training, a special form of the trial solution is considered to satisfy the initial and boundary conditions. The weights of the neural network are calculated by the Levenberg–Marquardt training algorithm with Bayesian regularization. Three practical examples demonstrate the efficiency of the proposed method.
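A generic construction in this spirit (an assumption; the paper's exact trial form is not given in the abstract) for a problem on $[a,b]$ with initial density $p_0$ is

$$p(x,t;\theta) = p_0(x) + t\,(x-a)(b-x)\,N(x,t;\theta),$$

which equals $p_0$ at $t=0$ for every $\theta$ and preserves the boundary values whenever $p_0(a)=p_0(b)=0$, so the Fokker–Planck residual can be minimized over $\theta$ as an unconstrained least-squares problem, here via Levenberg–Marquardt.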

17.
To address the poor reliability of the water-cooling control system and the wide temperature fluctuations on the high-speed wire rod production line at the large rolling mill of Wuhan Iron and Steel (Group) Corporation, intelligent computing theory and methods are applied to identify, model, and optimize this industrial control system. Feedforward neural networks based on gradient-descent BP, radial basis function networks, and the Levenberg–Marquardt BP algorithm are analyzed and compared in terms of their approximation accuracy and training speed for the SMS water-cooling system. The behavior of the Levenberg–Marquardt BP feedforward network on the training and test sets is studied, and a water-cooling control system model based on the Levenberg–Marquardt BP feedforward network is established, solving the reliability and temperature-control accuracy problems of the high-speed wire rod water-cooling control system.
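For reference, the Levenberg–Marquardt weight update that distinguishes this training algorithm from plain gradient-descent BP is the textbook form (not a detail taken from the paper):

$$w_{k+1} = w_k - \left(J_k^{\top} J_k + \mu_k I\right)^{-1} J_k^{\top} e_k,$$

where $J_k$ is the Jacobian of the network errors $e_k$ with respect to the weights, and $\mu_k$ interpolates between a gradient step (large $\mu_k$) and a Gauss–Newton step (small $\mu_k$), which is why it typically trains much faster than gradient-descent BP.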

18.
Research on the Application of Fuzzy Neural Networks in Fault Diagnosis
This paper presents the mathematical model, basic principles, methods, and steps for fault diagnosis with a fuzzy neural network, together with the learning procedure of the fuzzy network, and derives two diagnostic algorithms using the gradient method. In simulations on typical lubricating-oil fault samples from an engine, the results were entirely correct; for faults outside the sample set, the accuracy reached 90%.

19.
This paper solves the one-dimensional time-fractional diffusion equation with a combined neural network for the first time. The combined neural network is a new architecture built by coupling a radial basis function (RBF) network with a power-activation feedforward network. First, a numerical scheme satisfying the conditions of the time-fractional diffusion equation is constructed with this network, and an error function is defined so that the original problem becomes the minimization of this error function. Then, the gradient descent learning algorithm of the neural network model is iterated to obtain the optimal network weights and parameters, yielding the numerical solution of the problem. Numerical examples verify the feasibility, effectiveness, and accuracy of the method. This work opens a new avenue for solving time-fractional diffusion equations.
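For context, the one-dimensional time-fractional diffusion equation typically takes the form (with the Caputo derivative; the paper's exact formulation, source term, and conditions are not shown in this abstract)

$$\frac{\partial^{\alpha} u(x,t)}{\partial t^{\alpha}} = \frac{\partial^{2} u(x,t)}{\partial x^{2}} + f(x,t), \qquad 0 < \alpha < 1,$$

where the Caputo fractional derivative is

$$\frac{\partial^{\alpha} u(x,t)}{\partial t^{\alpha}} = \frac{1}{\Gamma(1-\alpha)} \int_{0}^{t} (t-s)^{-\alpha}\, \frac{\partial u(x,s)}{\partial s}\, ds.$$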

20.
Alzheimer's disease (AD) and mild cognitive impairment (MCI) affect many patients and are difficult to diagnose. This paper improves the BP neural network and proposes an adaptive BP (ABP) neural network, running 100 simulated AD and MCI diagnoses; the diagnostic accuracy of the ABP network is significantly higher than that of the BP and RBF networks. Using the leave-one-out method, samples of 101 healthy controls, 200 MCI patients, and 90 AD patients are split into training and test sets, and diagnosis with the ABP network reaches an overall accuracy of 73.91%.
