Similar Documents
20 similar documents found (search time: 31 ms).
1.
Under consideration is the steepest descent method for determining a coefficient in a hyperbolic equation posed in integral form. The properties of solutions to the direct and inverse problems are studied, and estimates for the objective functional and its gradient are obtained. Convergence in the mean is proved for the steepest descent method minimizing the residual functional.
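As a hedged illustration of this setting, the sketch below runs steepest descent on a toy coefficient inverse problem: recovering c in u' = -c*u, u(0) = 1, from noisy observations of u. The forward map, step size, and iteration count are illustrative assumptions, not the paper's hyperbolic problem, and a fixed step stands in for an exact line search.

```python
import numpy as np

# Toy coefficient inverse problem (illustrative; the paper treats a hyperbolic
# equation in integral form): recover c in u' = -c*u, u(0) = 1, so u(t) = exp(-c*t),
# from noisy data d_i = u(t_i) + noise, by steepest descent on the residual
# functional J(c) = 0.5 * sum_i (u(t_i; c) - d_i)^2.
rng = np.random.default_rng(0)
t = np.linspace(0.1, 2.0, 20)
c_true = 1.3
d = np.exp(-c_true * t) + 0.01 * rng.standard_normal(t.size)

def J(c):
    return 0.5 * np.sum((np.exp(-c * t) - d) ** 2)

def dJ(c):  # gradient of the residual functional with respect to c
    r = np.exp(-c * t) - d
    return np.sum(r * (-t) * np.exp(-c * t))

c, step = 0.1, 0.05   # initial guess; fixed step stands in for an exact line search
for _ in range(2000):
    c -= step * dJ(c)
print(f"recovered c = {c:.3f} (true value {c_true}), J = {J(c):.2e}")
```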

2.
A fairly comprehensive analysis is presented for the gradient descent dynamics for training two-layer neural network models in the situation when the parameters in both layers are updated. General initialization schemes as well as general regimes for the network width and training data size are considered. In the overparametrized regime, it is shown that gradient descent dynamics can achieve zero training loss exponentially fast regardless of the quality of the labels. In addition, it is proved that throughout the training process the functions represented by the neural network model are uniformly close to those of a kernel method. For general values of the network width and training data size, sharp estimates of the generalization error are established for target functions in the appropriate reproducing kernel Hilbert space.
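A minimal numpy sketch of the overparametrized regime described above, assuming an NTK-style 1/sqrt(m) output scaling and purely random labels; the width, learning rate, and iteration budget are illustrative choices. Both layers are updated, and with m much larger than n the training loss decays roughly geometrically despite the noise labels.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 5, 2000                     # samples, input dim, hidden width (m >> n)
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)                # random labels: label quality is irrelevant

W = rng.standard_normal((m, d))           # first layer (trained)
a = rng.standard_normal(m)                # second layer (trained)
s = 1.0 / np.sqrt(m)                      # NTK-style output scaling (assumed)

lr = 0.1
for step in range(1, 3001):
    H = np.maximum(X @ W.T, 0.0)          # ReLU features, shape (n, m)
    r = s * (H @ a) - y                   # residuals of f(x) = s * a' relu(Wx)
    if step == 1 or step % 500 == 0:
        print(f"step {step:4d}  loss = {0.5 * np.mean(r**2):.3e}")
    ga = s * (H.T @ r) / n                                      # grad w.r.t. a
    gW = s * ((r[:, None] * (H > 0)) * a[None, :]).T @ X / n    # grad w.r.t. W
    a -= lr * ga
    W -= lr * gW
```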

3.
Some Properties of the Stochastic Gradient Descent Method (in English)
Wang Baobin, Wang Yuxia. Journal of Mathematics (数学杂志), 2011, 31(6): 1041-1044
This paper studies the stochastic gradient descent method in general kernel spaces. Via the iteration scheme, several important properties of the algorithm are established; these properties play a crucial role in analyzing its convergence rate.
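A hedged sketch of one such iteration: online stochastic gradient descent in an RKHS with a Gaussian kernel, squared loss, Tikhonov term lam, and an assumed 1/sqrt(t) step-size schedule, maintaining the iterate as a kernel expansion. The update is f_{t+1} = (1 - eta_t*lam) f_t - eta_t (f_t(x_t) - y_t) K(x_t, .).

```python
import numpy as np

def K(x, z, s=0.5):                       # Gaussian kernel (assumed choice)
    return np.exp(-(x - z) ** 2 / (2 * s ** 2))

rng = np.random.default_rng(1)
centers, coefs = [], []                   # f_t = sum_i coefs[i] * K(centers[i], .)

def f(x):
    return sum(c * K(xc, x) for xc, c in zip(centers, coefs))

lam = 0.01                                # regularization parameter (assumed)
for t in range(1, 501):
    x = rng.uniform(-1.0, 1.0)
    y = np.sin(3.0 * x) + 0.1 * rng.standard_normal()
    eta = 0.5 / np.sqrt(t)                # decaying step size (assumed schedule)
    err = f(x) - y                        # evaluate with the current iterate
    coefs = [(1.0 - eta * lam) * c for c in coefs]   # shrink: regularization term
    coefs.append(-eta * err)              # new kernel section centered at x_t
    centers.append(x)

print("f(0.3) =", f(0.3), " target =", np.sin(0.9))   # roughly matches
```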

4.
This article derives characterizations and computational algorithms for continuous general gradient descent trajectories in high-dimensional parameter spaces for statistical model selection, prediction, and classification. Examples include proportional gradient shrinkage as an extension of LASSO and LARS, threshold gradient descent with right-continuous variable selectors, threshold ridge regression, and many more with proper combinations of variable selectors and functional forms of a kernel. In all these problems, general gradient descent trajectories are continuous piecewise analytic vector-valued curves as solutions to matrix differential equations. We show the monotonicity and convergence of the proposed algorithms in the loss or negative likelihood functions. We prove that approximations of continuous solutions via infinite series expansions are computationally more efficient and accurate compared with discretization methods. We demonstrate the applicability of our algorithms through numerical experiments with real and simulated datasets.
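One concrete member of this family is threshold gradient descent; a minimal sketch on synthetic sparse linear regression follows (the threshold tau, step size, and data are illustrative assumptions). Coordinates are updated only when their gradient magnitude is within a factor tau of the largest one, the right-continuous variable selector mentioned above; tau near 1 gives LASSO/LARS-like paths, while tau = 0 recovers ordinary gradient descent.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 50
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]          # sparse ground truth
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta = np.zeros(p)
tau, lr = 0.9, 0.02                       # threshold in [0, 1] and step size
for _ in range(1000):
    g = -X.T @ (y - X @ beta) / n         # gradient of the squared-error loss
    sel = np.abs(g) >= tau * np.abs(g).max()   # right-continuous variable selector
    beta[sel] -= lr * g[sel]

print("coordinates with |beta| > 0.1:", np.flatnonzero(np.abs(beta) > 0.1))
```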

5.
The aim of this paper is an analysis of geometric inverse problems in linear elasticity and thermoelasticity related to the identification of cavities in two and three spatial dimensions. The overdetermined boundary data used for the reconstruction are the displacement and temperature on a part of the boundary. We derive identifiability results and directional stability estimates, the latter using the concept of shape derivatives, whose form is known in elasticity and newly derived for thermoelasticity. For numerical reconstructions we use a least-squares formulation and a geometric gradient descent based on the associated shape derivatives. The directional stability estimates guarantee the stability of the gradient descent approach, so that an iterative regularization is obtained. This iterative scheme is then regularized by a level set approach allowing the reconstruction of multiply connected shapes.

6.
Plane and axisymmetric cavitation flow problems are considered using Riabouchinsky's scheme. The incoming flow is assumed to be irrotational and steady, and the fluid is assumed to be inviscid and incompressible. The flow problems are solved by applying the boundary element method with quadrature formulas without saturation. The free boundary is determined using a gradient descent technique based on Riabouchinsky's principle. The drag force acting on the cavitator is expressed in terms of the Riabouchinsky functional. As a result, for small cavitation numbers, the force is calculated with fairly high accuracy. The dependence of the drag coefficient is investigated for variously shaped cavitators: a wedge, a cone, a circular arc, and a spherical segment.

7.
We present an analytical study of gradient descent algorithms applied to a classification problem in machine learning based on artificial neural networks. Our approach is based on entropy–entropy dissipation estimates that yield explicit rates. Specifically, as long as the neural nets remain within a set of “good classifiers”, we establish a striking feature of the algorithm: it mathematically diverges as the number of gradient descent iterations (“time”) goes to infinity but this divergence is only logarithmic, while the loss function vanishes polynomially. As a consequence, this algorithm still yields a classifier that exhibits good numerical performance and may even appear to converge.
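The same qualitative picture can be reproduced on the simplest classifier. In the hedged sketch below (assumed setup: logistic loss on linearly separable data, full-batch gradient descent), the loss decays roughly like 1/t while ||w|| grows roughly like log t, so the parameters diverge logarithmically even as classification performance becomes perfect.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
X = rng.standard_normal((n, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)      # linearly separable labels

w = np.zeros(2)
lr = 0.5
for t in range(1, 10001):
    m = y * (X @ w)                       # margins
    if t in (10, 100, 1000, 10000):
        loss = np.logaddexp(0.0, -m).mean()       # mean log(1 + exp(-m)), stable
        print(f"t={t:6d}  loss={loss:.2e}  ||w||={np.linalg.norm(w):.2f}")
    p = 0.5 * (1.0 - np.tanh(0.5 * m))    # stable form of 1 / (1 + exp(m))
    w -= lr * (-(X * (y * p)[:, None]).mean(axis=0))   # gradient step
```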

8.
In this paper, two level set functions are used to approximate the features of a reservoir model, and Uzawa-type algorithms are constructed for the numerical simulation. The permeability in a two-phase flow can be recovered numerically from well measurements and seismic data. The resulting constrained optimization problem is solved with a modified Lagrangian method: when the permeability function is approximated by two level set functions, the Lagrangian must be adjusted accordingly, turning the constrained optimization problem into an unconstrained one. Exploiting the advantages of the two-level-set representation, a steepest-descent Uzawa algorithm and an operator-splitting Uzawa algorithm are further constructed to solve the corresponding optimization subproblems. Numerical examples show that the proposed algorithms are efficient and stable.
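The reservoir problem itself is too large for a snippet, but the Uzawa iteration at its core is easy to sketch on a model problem: minimize 0.5*x'Ax - b'x subject to Bx = c, alternating an exact primal minimization with a gradient-ascent step on the multiplier. All data below are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 6, 2
M0 = rng.standard_normal((n, n))
A = M0 @ M0.T + n * np.eye(n)             # SPD stiffness-like matrix
b = rng.standard_normal(n)
B = rng.standard_normal((m, n))           # constraint matrix
c = rng.standard_normal(m)

lam = np.zeros(m)                         # Lagrange multiplier
rho = 0.5                                 # dual step size (assumed)
for _ in range(500):
    x = np.linalg.solve(A, b - B.T @ lam) # primal step: minimize Lagrangian in x
    lam += rho * (B @ x - c)              # dual step: ascent on the multiplier
print("constraint residual:", np.linalg.norm(B @ x - c))
```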

9.
We introduce the notion of predicted decrease approximation (PDA) for constrained convex optimization, a flexible framework which includes as special cases known algorithms such as generalized conditional gradient, proximal gradient, greedy coordinate descent for separable constraints and working set methods for linear equality constraints with bounds. The new scheme allows the development of a unified convergence analysis for these methods. We further consider a partially strongly convex nonsmooth model and show that dual application of PDA-based methods yields new sublinear convergence rate estimates in terms of both primal and dual objectives. As an example of an application, we provide an explicit working set selection rule for SMO-type methods for training the support vector machine with an improved primal convergence analysis.
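As a hedged illustration, the sketch below instantiates one of the named special cases, the conditional gradient (Frank-Wolfe) method, on a quadratic over the probability simplex; the quantity g'(x - s) plays the role of the predicted decrease and doubles as a stopping test. The problem data and the open-loop step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10
Q0 = rng.standard_normal((n, n))
Q = Q0 @ Q0.T + np.eye(n)                 # SPD Hessian of the quadratic
b = rng.standard_normal(n)

x = np.ones(n) / n                        # feasible start in the simplex
for k in range(500):
    g = Q @ x - b                         # gradient of f(x) = 0.5 x'Qx - b'x
    s = np.zeros(n)
    s[np.argmin(g)] = 1.0                 # linear minimization oracle on the simplex
    pda = g @ (x - s)                     # predicted decrease; bounds the duality gap
    if pda < 1e-8:
        break
    x += 2.0 / (k + 2) * (s - x)          # standard open-loop step size

print(f"stopped at k={k}, predicted decrease {pda:.2e}")
```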

10.
We consider optimal shape design in Stokes flow using $H^1$ shape gradient flows based on distributed Eulerian derivatives. The MINI element is used to discretize the Stokes equation, and Galerkin finite elements are used to discretize the distributed and boundary $H^1$ shape gradient flows. Convergence analysis with a priori error estimates is provided under general and different regularity assumptions. We investigate the performance of shape gradient descent algorithms for energy dissipation minimization and obstacle flow. Numerical comparisons in 2D and 3D show that the distributed $H^1$ shape gradient flow is more accurate than the popular boundary type, and the corresponding distributed shape gradient algorithm is more effective.

11.
For the adjoint topology optimisation of a fluid-dynamic cost functional we apply an Armijo step length selection rule in the gradient descent algorithm. To reduce the computational effort for evaluating the cost functional value in the step length prediction, the function evaluation step is performed on a coarse mesh.
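The Armijo rule itself is generic; below is a minimal sketch of backtracking gradient descent on the Rosenbrock function (the constant c1, the shrink factor, and the test function are assumptions). In the spirit of the paper, the handle f used inside the backtracking test could be swapped for a cheaper coarse-mesh surrogate of the true cost functional.

```python
import numpy as np

def armijo_gd(f, grad, x0, c1=1e-4, shrink=0.5, t0=1.0, iters=2000):
    """Gradient descent with an Armijo backtracking step length rule.
    `f` may be a cheap surrogate (e.g. a coarse-mesh evaluation) of the true cost."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad(x)
        fx = f(x)
        t = t0
        while f(x - t * g) > fx - c1 * t * (g @ g):   # sufficient-decrease test
            t *= shrink
        x = x - t * g
    return x

f = lambda x: (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2   # Rosenbrock
grad = lambda x: np.array([
    -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
    200 * (x[1] - x[0] ** 2),
])
print(armijo_gd(f, grad, [-1.0, 1.0]))    # slowly approaches the minimizer (1, 1)
```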

12.
The radial and circumferential (azimuthal) transient dependence of the strength of a volumetric heat source in a cylindrical rod is estimated with Alifanov's iterative regularization method. This inverse problem is solved as an optimization problem in which a squared residual functional is minimized with the conjugate gradient method. A sensitivity problem is used to determine the step size in the direction of descent, while an adjoint problem is solved to determine the gradient. To examine the accuracy of the estimates, two test cases are considered: one with radial and timewise dependence, and a second with radial, azimuthal, and timewise dependence. The effects of the number of sensors and of measurement errors are investigated.

13.
In most applications, image denoising is fundamental to subsequent image processing operations. This paper proposes a spectral conjugate gradient (CG) method for impulse noise removal based on a two-phase scheme. The noise candidates are first identified by the adaptive (center-weighted) median filter; these candidates are then restored by minimizing an edge-preserving regularization functional, which is accomplished by the proposed spectral CG method. A favorable property of the proposed method is that the search direction generated at each iteration is a descent direction. Under strong Wolfe line search conditions, its global convergence is established. Numerical experiments illustrate the efficiency of the spectral conjugate gradient method for impulse noise removal.
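A hedged sketch of a spectral CG iteration is given below, using a Barzilai-Borwein spectral scaling, a Polak-Ribiere-type beta, and a steepest-descent fallback that keeps every direction a descent direction; these formulas are one common variant, not necessarily the paper's, and Armijo backtracking stands in for the strong Wolfe search. A small quadratic stands in for the edge-preserving restoration functional.

```python
import numpy as np

def spectral_cg(f, grad, x0, iters=100, c1=1e-4, shrink=0.5):
    """Sketch of a spectral CG iteration with a descent-direction safeguard."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for _ in range(iters):
        t, fx = 1.0, f(x)
        while f(x + t * d) > fx + c1 * t * (g @ d):   # Armijo backtracking
            t *= shrink
        x_new = x + t * d
        g_new = grad(x_new)
        if np.linalg.norm(g_new) < 1e-10:             # converged
            return x_new
        s, yk = x_new - x, g_new - g
        theta = (s @ s) / (s @ yk) if (s @ yk) > 1e-12 else 1.0   # BB scaling
        beta = (g_new @ yk) / (g @ g)                             # PR-type beta
        d = -theta * g_new + beta * d
        if d @ g_new >= 0.0:                          # enforce descent, else restart
            d = -theta * g_new
        x, g = x_new, g_new
    return x

# toy smooth objective standing in for the restoration functional
A = np.diag(np.arange(1.0, 6.0))
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
print(spectral_cg(f, grad, np.ones(5)))   # tends to the minimizer at 0
```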

14.
Deep neural networks have successfully been trained in various application areas with stochastic gradient descent. However, there exists no rigorous mathematical explanation of why this works so well. The training of neural networks with stochastic gradient descent has four different discretization parameters: (i) the network architecture; (ii) the amount of training data; (iii) the number of gradient steps; and (iv) the number of randomly initialized gradient trajectories. While it can be shown that the approximation error converges to zero if all four parameters are sent to infinity in the right order, we demonstrate in this paper that stochastic gradient descent fails to converge for ReLU networks if their depth is much larger than their width and the number of random initializations does not increase to infinity fast enough.

15.
The shape derivative of a functional related to a Bernoulli problem is derived without using the shape derivative of the state. The gradient information is combined with level set ideas in a steepest descent algorithm. Numerical examples show the feasibility of the approach.

16.
The Penrose regression problem, which includes the orthonormal Procrustes problem and the rotation problem with a partially specified target, is an important class of data-matching problems arising frequently in multivariate analysis, yet its optimality conditions have never been clearly understood. This work offers a way to calculate the projected gradient and the projected Hessian explicitly. One consequence of this calculation is a complete characterization of the first-order and second-order necessary and sufficient optimality conditions for this problem. Another application is the natural formulation of a continuous steepest descent flow that can serve as a globally convergent numerical method. Applications to the orthonormal Procrustes problem and the Penrose regression problem with a partially specified target are demonstrated in this article. Finally, some numerical results are reported and discussed.
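For the classical orthonormal Procrustes special case, the global minimizer of ||AQ - B||_F over orthogonal Q has a closed form via the SVD of A'B, which makes a convenient benchmark for a steepest descent flow; a minimal sketch with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(6)
m, n = 8, 3
A = rng.standard_normal((m, n))
Q_true, _ = np.linalg.qr(rng.standard_normal((n, n)))  # ground-truth orthogonal factor
B = A @ Q_true + 0.01 * rng.standard_normal((m, n))    # noisy target

U, _, Vt = np.linalg.svd(A.T @ B)         # closed-form Procrustes solution: Q = U V'
Q = U @ Vt
print("orthogonality check:", np.linalg.norm(Q.T @ Q - np.eye(n)))
print("fit ||AQ - B||_F:", np.linalg.norm(A @ Q - B))
```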

17.
In this paper we are concerned with a posteriori error estimates for the solution of a state-constrained optimization problem subject to an elliptic PDE. The solution is obtained using an interior point method combined with a finite element method for the discretization of the problem. We derive separate estimates for the error in the cost functional introduced by the interior point parameter and by the discretization of the problem. Finally, we show numerical examples to illustrate the findings for pointwise state constraints and pointwise constraints on the gradient of the state.

18.
Descent methods with linesearch in the presence of perturbations
We consider the class of descent algorithms for unconstrained optimization with an Armijo-type stepsize rule in the case when the gradient of the objective function is computed inexactly. An important novel feature in our theoretical analysis is that perturbations associated with the gradient are not assumed to be relatively small or to tend to zero in the limit (as a practical matter, we expect them to be reasonably small, so that a meaningful approximate solution can be obtained). This feature makes our analysis applicable to various difficult problems encountered in practice. We propose a modified Armijo-type rule for computing the stepsize which guarantees that the algorithm obtains a reasonable approximate solution. Furthermore, if perturbations are small relative to the size of the gradient, then our algorithm retains all the standard convergence properties of descent methods.
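A toy illustration of the phenomenon (deliberately not the paper's modified rule): gradient descent whose Armijo-type backtracking test is driven by the computed, perturbed gradient, with a floor on the step size so backtracking always terminates. The iterates stall in a neighborhood of the minimizer whose radius scales with the perturbation bound eps.

```python
import numpy as np

rng = np.random.default_rng(7)
f = lambda x: 0.5 * (x @ x)               # model objective with minimizer 0
true_grad = lambda x: x
eps = 0.05                                # perturbation bound (does NOT tend to zero)

x = np.array([5.0, -3.0])
for _ in range(300):
    g = true_grad(x) + eps * rng.uniform(-1.0, 1.0, size=2)   # inexact gradient
    t = 1.0
    # Armijo-type test on the computed gradient, with a step-size floor
    while f(x - t * g) > f(x) - 1e-4 * t * (g @ g) and t > 1e-8:
        t *= 0.5
    x = x - t * g
print("final ||x||:", np.linalg.norm(x), "(stalls near the noise level)")
```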

19.
The topic of this paper is the convergence analysis of subspace gradient iterations for the simultaneous computation of a few of the smallest eigenvalues and associated eigenvectors of a symmetric positive definite matrix pair (A, M). The methods are based on subspace iterations for A^{-1}M and use the Rayleigh-Ritz procedure for convergence acceleration. New sharp convergence estimates are proved by generalizing estimates that have been presented for vectorial steepest descent iterations (see SIAM J. Matrix Anal. Appl., 32(2):443-456, 2011).
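A minimal sketch of such a subspace gradient iteration, assuming a dense SPD pair (A, M) and exact solves with A: each sweep applies one subspace-iteration step for A^{-1}M, followed by the Rayleigh-Ritz projection described above.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(8)
n, p = 100, 3
R = rng.standard_normal((n, n))
A = R @ R.T + n * np.eye(n)               # SPD stiffness-like matrix
M = np.diag(rng.uniform(1.0, 2.0, n))     # SPD mass matrix

V = rng.standard_normal((n, p))           # initial subspace basis
for _ in range(100):
    V = np.linalg.solve(A, M @ V)         # one subspace-iteration step for A^{-1} M
    Ar, Mr = V.T @ A @ V, V.T @ M @ V     # Rayleigh-Ritz projection
    w, C = eigh(Ar, Mr)                   # small generalized eigenproblem
    V = V @ C                             # Ritz vectors (M-orthonormal columns)

print("smallest Ritz values:", w)
print("reference:          ", eigh(A, M, eigvals_only=True)[:p])
```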

20.
This paper deals with the convergence analysis of various preconditioned iterations for computing the smallest eigenvalue of a discretized self-adjoint elliptic partial differential operator. Several preconditioned iterative solvers are known for these eigenproblems, but unfortunately the convergence theory for some of them is not very well understood. The aim is to show that preconditioned eigensolvers (such as the preconditioned steepest descent iteration (PSD) and the locally optimal preconditioned conjugate gradient method (LOPCG)) can be interpreted as truncated approximate Krylov subspace iterations. In the limit of preconditioning with the exact inverse of the system matrix (such preconditioning can be approximated by multiple steps of a preconditioned linear solver), the iterations behave like Invert-Lanczos processes, for which convergence estimates are derived.
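A hedged sketch of a preconditioned steepest descent (PSD/PINVIT-type) iteration on a 1-D Laplacian, with a deliberately crude Jacobi preconditioner and a fixed unit step; a preconditioner closer to A^{-1} would push the iteration toward the Invert-Lanczos behavior discussed above.

```python
import numpy as np

n = 50
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1-D Laplacian (SPD)
M = np.eye(n)                                          # mass matrix (identity here)
B_inv = np.diag(1.0 / np.diag(A))                      # Jacobi preconditioner (crude)

rng = np.random.default_rng(9)
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
for _ in range(3000):
    lam = x @ A @ x / (x @ M @ x)         # Rayleigh quotient
    r = A @ x - lam * (M @ x)             # eigenvalue residual
    x = x - B_inv @ r                     # preconditioned steepest-descent step
    x /= np.linalg.norm(x)

exact = 2.0 * (1.0 - np.cos(np.pi / (n + 1)))
print(f"computed lambda_min = {lam:.6e}, exact = {exact:.6e}")
```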
