Similar Literature
 20 similar documents found (search time: 578 ms)
1.
Parallel preconditioned conjugate gradient algorithm on GPU (total citations: 1; self-citations: 0; citations by others: 1)
We propose a parallel implementation of the preconditioned conjugate gradient algorithm on a GPU platform. The preconditioning matrix is an approximate inverse derived from the SSOR preconditioner. Because it is applied through sparse matrix–vector multiplication, the proposed preconditioner is well suited to the massively parallel GPU architecture. Compared to a CPU implementation of the conjugate gradient algorithm, our GPU preconditioned conjugate gradient implementation is up to 10 times faster (8 times faster at worst).
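For orientation, the sketch below shows a generic preconditioned conjugate gradient loop in which the preconditioner is an explicit sparse approximate inverse applied purely through sparse matrix–vector products, the property this abstract highlights as GPU-friendly. It is a plain NumPy/SciPy CPU sketch, not the authors' GPU implementation; the function name and parameters are illustrative, and the Jacobi (diagonal) approximate inverse in the usage example is only a stand-in for their SSOR-derived preconditioner.

```python
import numpy as np
import scipy.sparse as sp

def pcg(A, b, M, tol=1e-8, max_iter=1000):
    """Conjugate gradient preconditioned by an explicit approximate inverse M ~ A^{-1}.
    The preconditioning step is a single sparse matrix-vector product z = M @ r."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M @ r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = M @ r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Hypothetical usage: a 1-D Laplacian with a Jacobi (diagonal) approximate inverse
# standing in for the SSOR-derived approximate inverse of the paper.
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(1000, 1000), format="csr")
M = sp.diags(1.0 / A.diagonal(), format="csr")
x = pcg(A, np.ones(1000), M)
```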

2.
A meaningful rank, as well as efficient methods for computing it, is necessary in many areas of application. Major ranking methodologies often exploit principal eigenvectors; Kleinberg's HITS model is one such methodology. The standard approach for computing the HITS rank is the power method. Unlike PageRank calculations, for which many acceleration schemes have been proposed, relatively little work exists on accelerating HITS rank calculation, mainly because the power method often works quite well in the HITS setting. However, there are cases where the power method is ineffective; moreover, a systematic acceleration over the power method is desirable even when it works well. We propose a practical acceleration scheme for HITS rank calculations based on a filtered power method with adaptive Chebyshev polynomials. For cases where the gap ratio is below 0.85, for which the power method works well, our scheme is about twice as fast as the power method. For cases where the gap ratio is unfavorable for the power method, our scheme can provide significant speedup. When the ranking problems are very large, even a single matrix–vector product can be expensive, making acceleration highly necessary. The proposed scheme is attractive in that it provides a consistent reduction in the number of matrix–vector products, as well as in CPU time, over the power method, with little memory overhead.
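For context, the baseline power iteration for HITS that this abstract sets out to accelerate can be sketched as follows (plain NumPy, with L denoting the link/adjacency matrix); the adaptive Chebyshev filtering itself is not reproduced here, and the small example graph is hypothetical.

```python
import numpy as np

def hits_power(L, tol=1e-10, max_iter=1000):
    """Baseline power method for HITS: the authority vector is the dominant
    eigenvector of L.T @ L and the hub vector that of L @ L.T, where L is the
    (possibly sparse) link/adjacency matrix of the graph."""
    n = L.shape[1]
    a = np.ones(n) / np.sqrt(n)        # authority scores
    for _ in range(max_iter):
        h = L @ a                       # hub update
        a_new = L.T @ h                 # authority update: one application of L.T @ L
        a_new /= np.linalg.norm(a_new)
        if np.linalg.norm(a_new - a) < tol:
            a = a_new
            break
        a = a_new
    h = L @ a
    return a, h / np.linalg.norm(h)

# Hypothetical 4-page link graph: entry (i, j) is 1 if page i links to page j.
L = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
authority, hub = hits_power(L)
```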

3.
The limiting factors of second-order methods for large-scale semidefinite optimization are the storage and factorization of the Newton matrix. For a particular algorithm based on the modified barrier method, we propose to use iterative solvers instead of the routinely used direct factorization techniques. The preconditioned conjugate gradient method proves to be a viable alternative for problems with a large number of variables and a constrained matrix of modest size. We further propose to avoid explicit calculation of the Newton matrix, either by an implicit scheme in the matrix–vector product or by a finite-difference formula. This leads to huge savings in memory requirements and, for certain problems, to a further speed-up of the algorithm. Dedicated to the memory of Jos Sturm.

4.
Issues of indefinite preconditioning of reduced Newton systems arising in optimization with interior point methods are addressed in this paper. Constraint preconditioners have shown much promise in this context. However, there are situations in which an unfavorable sparsity pattern of the Jacobian matrix may adversely affect the preconditioner and make its inverse representation unacceptably dense, and hence too expensive to use in practice. A remedy for such situations is proposed in this paper. An approximate constraint preconditioner is considered in which a sparse approximation of the Jacobian is used instead of the complete matrix. Spectral analysis of the preconditioned matrix is performed and bounds on its non-unit eigenvalues are provided. Preliminary computational results are encouraging.

5.
Naive implementations of Newton's method for unconstrained N-stage discrete-time optimal control problems with Bolza objective functions tend to increase in cost like N^3 as N increases. However, if the inherent recursive structure of the Bolza problem is properly exploited, the cost of computing a Newton step will increase only linearly with N. The efficient Newton implementation scheme proposed here is similar to Mayne's DDP (differential dynamic programming) method but produces the Newton step exactly, even when the dynamical equations are nonlinear. The proposed scheme is also related to a Riccati treatment of the linear two-point boundary-value problems that characterize optimal solutions. For discrete-time problems, the dynamic programming approach and the Riccati substitution differ in an interesting way; however, these differences essentially vanish in the continuous-time limit. This work was supported by the National Science Foundation, Grant No. DMS-85-03746.

6.
Solving large radial basis function (RBF) interpolation problems with non-customised methods is computationally expensive, and the matrices that occur are typically badly conditioned. For example, using the usual direct methods to fit an RBF with N centres requires O(N^2) storage and O(N^3) flops. Thus such an approach is not viable for large problems with N ≥ 10,000. In this paper we present preconditioning strategies which, in combination with fast matrix–vector multiplication and GMRES iteration, make the solution of large RBF interpolation problems orders of magnitude less expensive in storage and operations. In numerical experiments with thin-plate spline and multiquadric RBFs, the preconditioning typically results in dramatic clustering of eigenvalues and improves the condition numbers of the interpolation problem by several orders of magnitude. As a result of the eigenvalue clustering, the number of GMRES iterations required to solve the preconditioned problem is of the order of 10–20. Taken together, the combination of a suitable approximate cardinal function preconditioner, the GMRES iterative method, and existing fast matrix–vector algorithms for RBFs [4,5] reduces the computational cost of solving an RBF interpolation problem to O(N) storage and O(N log N) operations.
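To illustrate the overall structure (a Krylov solve plus a sparse approximate-cardinal-function preconditioner), here is a rough NumPy/SciPy sketch. It departs from the paper in several assumed simplifications: a multiquadric kernel with an arbitrary shape parameter, a dense kernel matrix in place of a fast matrix–vector algorithm, and a crude k-nearest-neighbour construction of the preconditioner; all names and parameter values are illustrative.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import gmres, LinearOperator
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
n, shape_c, k = 2000, 0.1, 30                 # centres, multiquadric parameter, stencil size
x = rng.random((n, 2))
f = np.sin(2 * np.pi * x[:, 0]) * np.cos(2 * np.pi * x[:, 1])

phi = lambda r: np.sqrt(r**2 + shape_c**2)    # multiquadric basis function
A = phi(cdist(x, x))                          # dense here; a fast matvec would replace it

# Crude approximate-cardinal-function preconditioner: for each centre, solve a small
# local interpolation problem on its k nearest neighbours and store the weights as
# one column of a sparse matrix P, so that A @ P is roughly the identity.
tree = cKDTree(x)
P = lil_matrix((n, n))
for i in range(n):
    _, nbrs = tree.query(x[i], k)
    w = np.linalg.solve(phi(cdist(x[nbrs], x[nbrs])),
                        (nbrs == i).astype(float))
    for j, wj in zip(nbrs, w):
        P[j, i] = wj
P = P.tocsc()

M = LinearOperator((n, n), matvec=lambda v: P @ v, dtype=float)   # acts as ~A^{-1}
coef, info = gmres(A, f, M=M)
print("GMRES converged" if info == 0 else f"GMRES info = {info}")
```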

7.
In this paper, we compare two block triangular preconditioners for different linearizations of the Rayleigh–Bénard convection problem discretized with finite element methods. The two preconditioners differ in the nested or nonnested use of a certain approximation of the Schur complement associated with the Navier–Stokes block. First, bounds on the generalized eigenvalues are obtained for the preconditioned systems linearized with both Picard and Newton methods. Then, the performance of the proposed preconditioners is studied in terms of computational time. This investigation reveals some inconsistencies in the literature, which are discussed here. We observe that the nonnested preconditioner works best for both the Picard and the Newton cases. Therefore, we further investigate its performance by extending its application to a mixed Picard–Newton scheme. Numerical results for two- and three-dimensional cases show that the convergence is robust with respect to the mesh size. We also characterize the performance of the various preconditioned linearization schemes in terms of the Rayleigh number.

8.
Tikhonov regularization for large-scale linear ill-posed problems is commonly implemented by determining a partial Lanczos bidiagonalization of the matrix of the given system of equations. This paper explores the possibility of instead computing a partial Arnoldi decomposition of the given matrix. Computed examples illustrate that this approach may require fewer matrix–vector product evaluations and, therefore, less arithmetic work. Moreover, the proposed range-restricted Arnoldi–Tikhonov regularization method does not require the adjoint matrix and, hence, is convenient to use for problems for which the adjoint is difficult to evaluate.
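As a point of reference, a minimal dense-NumPy sketch of the Arnoldi–Tikhonov idea (build a Krylov basis with the Arnoldi process, then solve the small projected Tikhonov problem) is given below. The range-restricted variant described in the abstract differs mainly in starting the Arnoldi process from A b rather than b; here the regularization parameter is simply fixed rather than chosen adaptively, and all names and defaults are illustrative.

```python
import numpy as np

def arnoldi(A, b, k):
    """k steps of the Arnoldi process: A @ V[:, :k] = V @ H with H of size (k+1, k)."""
    n = b.size
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):                 # modified Gram-Schmidt orthogonalization
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-14:                # breakdown: Krylov subspace exhausted
            return V[:, : j + 2], H[: j + 2, : j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

def arnoldi_tikhonov(A, b, k=20, lam=1e-2):
    """Approximate the Tikhonov solution of min ||A x - b||^2 + lam^2 ||x||^2 by
    restricting x to the Krylov subspace K_k(A, b) and solving the small
    projected least-squares problem in the Arnoldi basis."""
    V, H = arnoldi(A, b, k)
    m = H.shape[1]                             # actual subspace dimension
    rhs = np.zeros(H.shape[0] + m)
    rhs[0] = np.linalg.norm(b)                 # beta * e_1 from the Arnoldi relation
    stacked = np.vstack([H, lam * np.eye(m)])  # solve [H; lam*I] y ~ [beta*e1; 0]
    y, *_ = np.linalg.lstsq(stacked, rhs, rcond=None)
    return V[:, :m] @ y
```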

9.
We consider an iterative preconditioning technique for non-convex large-scale optimization. First, we address the solution of large-scale indefinite linear systems by a Krylov subspace method, and describe the iterative construction of a preconditioner that involves neither matrix products nor matrix storage. The set of directions generated by the Krylov subspace method is used, as a by-product, to provide an approximate inverse preconditioner. We then test our preconditioner within truncated Newton schemes for large-scale unconstrained optimization, where we also generalize the truncation rule of Nash–Sofer (Oper. Res. Lett. 9:219–221, 1990) to the indefinite case. We use a Krylov subspace method both to approximately solve the Newton equation and to construct the preconditioner used at the current outer iteration. Extensive numerical experience shows that the proposed preconditioning strategy, compared with the unpreconditioned strategy and with PREQN (Morales and Nocedal in SIAM J. Optim. 10:1079–1096, 2000), may lead to a reduction in the overall number of inner iterations. Finally, we show that our proposal has some similarities with the Limited Memory Preconditioners of Gratton et al. (SIAM J. Optim. 21:912–935, 2011).

10.
The Jacobian-free Newton–Krylov (JFNK) method is a special kind of Newton–Krylov algorithm in which the matrix–vector product is approximated by a finite difference scheme. Consequently, it is not necessary to form and store the Jacobian matrix. This can greatly improve the efficiency and enlarge the application area of the Newton–Krylov method. The finite difference scheme has a strong influence on the accuracy and robustness of the JFNK method. In this paper, several approaches to approximating the Jacobian-vector product, including the choice of finite difference scheme and of the finite difference step size, are analyzed and compared. Numerical results are given to verify the effectiveness of the different finite difference methods.
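To make the central trick concrete, the sketch below approximates the Jacobian-vector product by a forward difference and wraps it as a LinearOperator inside a bare-bones Newton–GMRES loop. The step-size rule shown is just one common heuristic (the paper compares several), the globalization safeguards of a production JFNK solver are omitted, and the function names and the small test system are purely illustrative.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def jac_vec(F, x, v, Fx=None, eps=None):
    """Finite-difference approximation of the Jacobian-vector product:
    J(x) v ~ (F(x + eps*v) - F(x)) / eps, so the Jacobian is never formed."""
    if Fx is None:
        Fx = F(x)
    if eps is None:
        # one common heuristic for the perturbation size
        eps = (np.sqrt(np.finfo(float).eps) * max(1.0, np.linalg.norm(x))
               / max(np.linalg.norm(v), 1e-30))
    return (F(x + eps * v) - Fx) / eps

def jfnk(F, x0, tol=1e-8, max_newton=20):
    """Bare-bones Jacobian-free Newton-Krylov: each Newton correction solves
    J dx = -F(x) with GMRES, where J is available only through jac_vec."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_newton):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        J = LinearOperator((x.size, x.size), dtype=float,
                           matvec=lambda v, x=x, Fx=Fx: jac_vec(F, x, v, Fx))
        dx, _ = gmres(J, -Fx)
        x = x + dx
    return x

# Hypothetical usage on a small nonlinear system with solution (1, 2).
F = lambda x: np.array([x[0]**2 + x[1] - 3.0, x[0] + x[1]**2 - 5.0])
print(jfnk(F, np.array([1.0, 1.0])))
```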

11.
In this paper we propose a parallel preconditioner for the CG solver based on successive applications of the FSAI preconditioner. We first compute an FSAI factor G_out for the coefficient matrix A, and then another FSAI preconditioner is computed for either the preconditioned matrix S = G_out A G_out^T or a sparse approximation of S. This process can be iterated to obtain a sequence of triangular factors whose product forms the final preconditioner. Numerical results on large SPD matrices arising from geomechanical models demonstrate the efficiency of the proposed preconditioner, which reduces both the iteration count and the CPU time of the iterative phase with respect to the original FSAI preconditioner. The proposed preconditioner proves particularly efficient for accelerating an iterative procedure for finding the smallest eigenvalues of SPD matrices, where the increased setup cost of the RFSAI preconditioner, being a small percentage of the total CPU time, does not affect the overall performance.

12.
We propose a simple and effective hybrid (multiplicative) Schwarz preconditioner for solving systems of algebraic equations resulting from the mortar finite element discretization of second-order elliptic problems on nonmatching meshes. The preconditioner is embedded in a variant of the classical preconditioned conjugate gradient (PCG) method for an effective implementation, reducing the cost of computing the matrix–vector multiplication in each iteration of the PCG. In fact, it serves as a framework for the effective implementation of a class of hybrid Schwarz preconditioners. The preconditioners of this class are based on solving a sequence of non-overlapping local subproblems exactly, and the coarse problems either exactly or inexactly (approximately). The classical PCG algorithm is reformulated in order to reuse the results of matrix–vector multiplications that are already available from the preconditioning step, resulting in a cost-effective algorithm. An analysis of the proposed preconditioner is given, together with numerical results showing scalability with respect to the number of subdomains and convergence that is independent of jumps in the coefficients.

13.
The implementation of the recently proposed semi-monotonic augmented Lagrangian algorithm for the solution of large convex equality-constrained quadratic programming problems is considered. It is proved that if the auxiliary problems are approximately solved by the conjugate gradient method, then the algorithm finds an approximate solution of the class of problems with uniformly bounded spectrum of the Hessian matrix in O(1) matrix–vector multiplications. If, in addition, the Hessian matrices are either sufficiently sparse or can be expressed as a product of such sparse matrices, then the cost of the solution is proportional to the dimension of the problems. Theoretical results are illustrated by numerical experiments. This research is supported by grants of the Ministry of Education No. S3086102, ET400300415 and MSM 6198910027.
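To show the basic structure only, here is a deliberately simplified augmented-Lagrangian loop for an equality-constrained QP, min ½xᵀAx − bᵀx subject to Bx = c, in which each inner unconstrained subproblem is solved approximately by CG. The semi-monotonic multiplier and precision updates that the paper actually analyses are not reproduced, and the function name, penalty parameter, and test data are all hypothetical.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def auglag_eq_qp(A, B, b, c, rho=100.0, tol=1e-8, max_outer=50):
    """Simplified augmented-Lagrangian method for  min 1/2 x'Ax - b'x  s.t.  Bx = c.
    Each outer step minimizes the augmented Lagrangian in x by solving
    (A + rho B'B) x = b - B'lam + rho B'c  approximately with CG, then updates lam."""
    n = A.shape[0]
    x = np.zeros(n)
    lam = np.zeros(B.shape[0])
    H = LinearOperator((n, n), dtype=float,
                       matvec=lambda v: A @ v + rho * (B.T @ (B @ v)))
    for _ in range(max_outer):
        rhs = b - B.T @ lam + rho * (B.T @ c)
        x, _ = cg(H, rhs, x0=x)               # inexact inner solve by conjugate gradients
        viol = B @ x - c                      # constraint violation drives the update
        if np.linalg.norm(viol) < tol:
            break
        lam = lam + rho * viol
    return x, lam

# Hypothetical usage: a small random SPD Hessian and a single equality constraint.
rng = np.random.default_rng(1)
Q = rng.standard_normal((20, 20))
A = Q @ Q.T + 20 * np.eye(20)
B = np.ones((1, 20))
x, lam = auglag_eq_qp(A, B, rng.standard_normal(20), np.array([1.0]))
```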

14.
Newton's method for unconstrained optimization problems on Euclidean space can be generalized to Riemannian manifolds. The truncated singular value problem is one particular problem defined on the product of two Stiefel manifolds, and an algorithm based on the Riemannian Newton method has been designed for it. However, this algorithm is not easy to implement in its original form because the Newton equation is expressed as a system of matrix equations that is difficult to solve directly. In the present paper, we propose an effective implementation of the Newton algorithm. A matrix-free Krylov subspace method is used to solve a symmetric linear system into which the Newton equation is rewritten. The presented approach can be used on other problems as well. Numerical experiments demonstrate that the proposed method is effective for the above optimization problem.

15.
Mesh shape-quality optimization using the inverse mean-ratio metric (total citations: 1; self-citations: 0; citations by others: 1)
Meshes containing poor-quality elements can result in poorly conditioned systems of equations that must be solved when a discretization method, such as the finite-element method, is used to solve a partial differential equation. Moreover, such meshes can lead to poor accuracy in the computed approximate solution. In this paper, we present a nonlinear fractional program that relocates the vertex coordinates of a given mesh to optimize the average element shape quality as measured by the inverse mean-ratio metric. To solve the resulting large-scale optimization problems, we apply an efficient implementation of an inexact Newton algorithm that uses the conjugate gradient method with a block Jacobi preconditioner to compute the direction. We show that the block Jacobi preconditioner is positive definite by proving a general theorem concerning the convexity of fractional functions, applying this result to components of the inverse mean-ratio metric, and showing that each block in the preconditioner is invertible. Numerical results obtained with this special-purpose code on several test meshes are presented and used to quantify the impact on solution time and memory requirements of using a modeling language and general-purpose algorithm to solve these problems.

16.
The implementation of implicit Runge-Kutta methods requires the solution of large systems of non-linear equations. Normally these equations are solved by a modified Newton process, which can be very expensive for problems of high dimension. The recently proposed triangularly implicit iteration methods for ODE-IVP solvers [5] replace the Runge-Kutta matrix A in the Newton process with a triangular matrix T that approximates A, thereby making the method suitable for parallel implementation. The matrix T is constructed according to a simple procedure, such that the stiff error components in the numerical solution are strongly damped. In this paper we prove, for a large class of Runge-Kutta methods, that this procedure can be carried out and that the diagonal entries of T are positive. This means that the linear systems to be solved have a non-singular matrix. The research reported in this paper was supported by STW (Dutch Foundation for Technical Sciences).

17.
A multigrid method for discontinuous Galerkin (DG) discretizations of advection–diffusion problems is presented. It is based on a block Gauss–Seidel smoother with downwind ordering honoring the advection operator. The cell matrices of the DG scheme are inverted in this smoother in order to obtain robustness for higher-order elements. Through a set of experiments, we show that this technique yields an efficient preconditioner and that both ingredients, downwind ordering and blocking of the cell matrices, are crucial for robustness.

18.
A preconditioned scheme for solving sparse symmetric eigenproblems is proposed. The solution strategy relies upon the DACG algorithm, a preconditioned conjugate gradient algorithm for minimizing the Rayleigh quotient. A comparison with the well-established ARPACK code shows that when a small number of the leftmost eigenpairs is to be computed, DACG is more efficient than ARPACK. Effective convergence acceleration of DACG is shown to be achieved by a suitable approximate inverse preconditioner (AINV). The performance of this preconditioner is shown to be safe, i.e. not highly dependent on a drop-tolerance parameter. On sequential machines, AINV preconditioning proves a practicable alternative to the effective incomplete Cholesky factorization and is more efficient than block Jacobi. Owing to its parallelizability, the AINV preconditioner is exploited for a parallel implementation of the DACG algorithm. Numerical tests demonstrate the high degree of parallelization attainable on a Cray T3E machine and confirm the satisfactory scalability properties of the algorithm. A final comparison with PARPACK shows the relatively higher efficiency of AINV-DACG. Copyright © 2000 John Wiley & Sons, Ltd.

19.
We formulate a locally superlinearly convergent projected Newton method for constrained minimization in a Cartesian product of balls. For discrete-time, N-stage, input-constrained optimal control problems with Bolza objective functions, we then show how the required scaled tangential component of the objective function gradient can be approximated efficiently with a differential dynamic programming scheme; the computational cost and the storage requirements for the resulting modified projected Newton algorithm increase linearly with the number of stages. In calculations performed for a specific control problem with 10 stages, the modified projected Newton algorithm is shown to be one to two orders of magnitude more efficient than a standard unscaled projected gradient method. This work was supported by the National Science Foundation, Grant No. DMS-85-03746.

20.
A regularized Newton-like method for solving nonnegative least-squares problems is proposed and analysed in this paper. A preconditioner for KKT systems arising in the method is introduced and spectral properties of the preconditioned matrix are analysed. A bound on the condition number of the preconditioned matrix is provided. The bound does not depend on the interior-point scaling matrix. Preliminary computational results confirm the effectiveness of the preconditioner and fast convergence of the iterative method established by the analysis performed in this paper. Copyright © 2008 John Wiley & Sons, Ltd.
