首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
1 引 言 本文研究了广义特征值问题 Ax=λBx (1)的并行计算。其中,A,B均为半带宽为r的n阶实对称带状矩阵且其中之一是正定的.本文总假设B是正定的.  相似文献   

2.
解非对称矩阵特征值问题的一种并行分治算法   总被引:3,自引:0,他引:3  
1引言考虑矩阵特征值问题其中A是非对称矩阵.通过正交变换(如Householder变换或Givens变换),A可化为上Hessenberg形.因而,本文假设A为上Hessenberg矩阵,表示如下:不失一般性,进一步假设所有的(j=2,…,n),即认为A是不可约的关于如何求解上述问题,人们进行了不懈的努力,提出了许多行之有效的算法[1-8].其中分治算法因具有良好的并行性而引人注目.分治算法的典型代表是基于同伦连续的分治算法[2,3,4]和基于Newton迭代的分治算法[1].本文提出一种新的分…  相似文献   

3.
Numerical results are obtained on sequential and parallel versions of ABS algorithms for linear systems for both full matrices andq-band matrices. The results using the sequential algorithm on full matrices indicate the superiority of a particular implementation of the symmetric algorithm. The condensed form of the algorithm is well suited for implementation in a parallel environment, and results obtained on the IBM 4381 system favor a synchronous implementation over the asynchronous one. Results are obtained from sequential implementations of theLU, Cholesky, and symmetric algorithms of the ABS class forq-band matrices able to reduce memory storage. A simple parallelization of do-loops for calculating components gives interesting performances.This work has been developed in the framework of a collaboration between IBM-ECSEC, Rome, Italy, and the Department of Mathematics of the University of Bergamo, Bergamo, Italy.The author is grateful to Prof. J. Abaffy (University of Economics, Budapest), Prof. L. Dixon (Hatfield Polytechnic), and Prof. E. Spedicato (Department of Mathematics, University of Bergamo) for useful suggestions.  相似文献   

4.
A new parallel algorithm for inverting Toeplitz triangular matrices as well as solving Toeplitz triangular linear systems is presented in this paper. The algorithm possesses very good parallelism, which can easily be adjusted to match the natural hardware parallelism of the computer systems, that was assumed to be much smaller than the order $n$ of the matrices to be considered since this is the usual case in practical applications. The parallel time complexity of the algorithm is $O([n/p|\log n+\log^2p)$, where $p$ is the hardware parallelism.  相似文献   

5.
For solving systems of linear algebraic equations with block-tridiagonal matrices arising in geoelectrics problems, the parallel matrix sweep algorithm, conjugate gradient method with preconditioner, and square root method are proposed and implemented numerically on multi-core CPU Intel with graphics processors NVIDIA. Investigation of efficiency and optimization of parallel algorithms for solving the problem with quasi-model data are performed.  相似文献   

6.
The ability of the modern graphics processors to operate on large matrices in parallel can be exploited for solving constrained image deblurring problems in a short time. In particular, in this paper we propose the parallel implementation of two iterative regularization methods: the well known expectation maximization algorithm and a recent scaled gradient projection method. The main differences between the considered approaches and their impact on the parallel implementations are discussed. The effectiveness of the parallel schemes and the speedups over standard CPU implementations are evaluated on test problems arising from astronomical images.  相似文献   

7.
Based on a series of recent papers, a powerful algorithm is reformulated for computing the maximal eigenpair of self-adjoint complex tridiagonal matrices. In parallel, the same problem in a particular case for computing the sub-maximal eigenpair is also introduced. The key ideas for each critical improvement are explained. To illustrate the present algorithm and compare it with the related algorithms, more than 10 examples are included.  相似文献   

8.
The ILU class preconditioners (ILU(0), ILU(1), ILUT) employed for iterative algorithms for non-symmetrical linear sparse matrix systems are considered. Test matrices used in this study originate from discretization of systems of partial differential equations describing multicomponent fluid flows in porous media. A new parallel algorithm for block ILU factorization is suggested. This algorithm demonstrates a good convergence and significant speed-up in comparison with sequential algorithms. New integrated approach was tested on the wide range of matrices resulted from real hydrodynamic simulations of oil fields of Western Siberia and demonstrated significant reduction in computational time.  相似文献   

9.
Interior Point algorithms have become a very successful tool for solving large-scale linear programming problems. The Dual Affine algorithm is one of the Interior Point algorithms implemented in the computer program OB1. It is a good candidate for implementation on a parallel computer because it is very computing-intensive. A parallel Dual Affine algorithm is presented which is suitable for a parallel computer with a distributed memory. The algorithm obtains its speedup from parallel sparse linear algebra computations such as Cholesky factorisation, matrix multiplication, and triangular system solving, which form the bulk of the computing work. Efficient algorithms based on the grid distribution of matrices are presented for each of these computations. The algorithm is implemented in occam 2 on a square mesh of transputers. The resulting parallel program is connected to the sequentialFortran 77 program OB1, which performs the preprocessing and the postprocessing. Experimental results on a mesh of 400 transputers are given for a test set of seven realistic planning and scheduling problems from Shell and seven problems from the NETLIB LP collection; the results show a speedup of 88 for the largest problem.  相似文献   

10.
1引言 考虑在并行计算机上解大型线性方程组AX=b假定有K台处理机可供使用,并且对于局部数据,处理机能执行不同的指令序列,毗邻的处理机之间能自然地通讯。  相似文献   

11.
For solving inverse gravimetry problems, efficient stable parallel algorithms based on iterative gradient methods are proposed. For solving systems of linear algebraic equations with block-tridiagonal matrices arising in geoelectrics problems, a parallel matrix sweep algorithm, a square root method, and a conjugate gradient method with preconditioner are proposed. The algorithms are implemented numerically on a parallel computing system of the Institute of Mathematics and Mechanics (PCS-IMM), NVIDIA graphics processors, and an Intel multi-core CPU with some new computing technologies. The parallel algorithms are incorporated into a system of remote computations entitled “Specialized Web-Portal for Solving Geophysical Problems on Multiprocessor Computers.” Some problems with “quasi-model” and real data are solved.  相似文献   

12.
In this paper, two accelerated divide‐and‐conquer (ADC) algorithms are proposed for the symmetric tridiagonal eigenvalue problem, which cost O(N2r) flops in the worst case, where N is the dimension of the matrix and r is a modest number depending on the distribution of eigenvalues. Both of these algorithms use hierarchically semiseparable (HSS) matrices to approximate some intermediate eigenvector matrices, which are Cauchy‐like matrices and are off‐diagonally low‐rank. The difference of these two versions lies in using different HSS construction algorithms, one (denoted by ADC1) uses a structured low‐rank approximation method and the other (ADC2) uses a randomized HSS construction algorithm. For the ADC2 algorithm, a method is proposed to estimate the off‐diagonal rank. Numerous experiments have been carried out to show their stability and efficiency. These algorithms are implemented in parallel in a shared memory environment, and some parallel implementation details are included. Comparing the ADCs with highly optimized multithreaded libraries such as Intel MKL, we find that ADCs could be more than six times faster for some large matrices with few deflations. Copyright © 2016 John Wiley & Sons, Ltd.  相似文献   

13.
张明  王润秋 《计算数学》1995,17(2):127-135
研究弹性固体中波的传播问题由来已久,由于地震学、石油勘探、地震层析成像等方面的重要应用,三维弹性波方程的求解问题在工业应用中显得日益重要.自七十年代末由石油勘探界首先引入有限元方法求解二维弹性波方程以来,这一公认精度较高的方法在求解三维弹性波方程方面研究成果尚不多见,其主要原因是该问题需要较大的计算机存贮量和计算量.对三维问题使用一般的有限元方法求解,常常需要几十万次反复递推求解高达10~7阶以上的大型线性代数方程组.对于这类超大型的计算课题,目前的  相似文献   

14.
Summary. In this paper we propose an algorithm based on Laguerre's iteration, rank two divide-and-conquer technique and a hybrid strategy for computing singular values of bidiagonal matrices. The algorithm is fully parallel in nature and evaluates singular values to tiny relative error if necessary. It is competitive with QR algorithm in serial mode in speed and advantageous in computing partial singular values. Error analysis and numerical results are presented. Received March 15, 1993 / Revised version received June 7, 1994  相似文献   

15.
This paper introduces some efficient initials for a well-known algorithm (an inverse iteration) for computing the maximal eigenpair of a class of real matrices. The initials not only avoid the collapse of the algorithm but are also unexpectedly efficient. The initials presented here are based on our analytic estimates of the maximal eigenvalue and a mimic of its eigenvector for many years of accumulation in the study of stochastic stability speed. In parallel, the same problem for computing the next to the maximal eigenpair is also studied.  相似文献   

16.
给出了交替方向的二维扩散方程的精细积分算法,将一个时间步积分分为两个方向,使大规模矩阵的计算转化为一些小矩阵的计算,减小了每一步求解的计算量.对于方形区域的齐次方程,计算结果与全城精细积分完全相同,而计算量和存储量都要小得多.算例表明了算法具有较高的并行计算加速比和计算效率.  相似文献   

17.
We consider the parallel factorization of sparse finite element matrices on distributed memory machines. Our method is based on a nested dissection approach combined with a cyclic re‐distribution of the interface Schur complements. We present a detailed definition of the parallel method, and the well‐posedness and the complexity of the algorithm are analyzed. A lean and transparent functional interface to existing finite element software is defined, and the performance is demonstrated for several representative examples. Copyright © 2016 John Wiley & Sons, Ltd.  相似文献   

18.
Rutishauser, Gragg and Harrod and finally H.Y. Zha used the same class of chasing algorithms for transforming arrowhead matrices to tridiagonal form. Using a graphical theoretical approach, we propose a new chasing algorithm. Although this algorithm has the same sequential computational complexity and backward error properties as the old algorithms, it is better suited for a pipelined approach. The parallel algorithm for this new chasing method is described, with performance results on the Paragon and nCUBE. Comparison results between the old and the new algorithms are also presented.

  相似文献   


19.
The directed acyclic graph (DAG) associated with a parallel algorithm captures the partial order in which separaT.L.cal computations are completed and how their outputs are subsequently used in further computations. Unlike in a synchronous parallel algorithm, the DAG associated with an asynchronous parallel algorithm is not predetermined. Instead, it is a product of the asynchronous timing dynamics of the machine and cannot be known in advance, as such it is best thought of as a pseudorandom variable. In this paper, we present a formalism for analyzing the performance of asynchronous parallel Jacobi’s method in terms of its DAG. We use this app.roach to prove error bounds and bounds on the rate of convergence. The rate of convergence bounds is based on the statistical properties of the DAG and is valid for systems with a non-negative iteration matrix. We supp.ort our theoretical results with a suit of numerical examples, where we compare the performance of synchronous and asynchronous parallel Jacobi to certain statistical properties of the DAGs associated with the computations. We also present some examples of small matrices with elements of mixed sign, which demonstrate that determining whether a system will converge under asynchronous iteration in this more general setting is a far more difficult problem.  相似文献   

20.
The parallel quasi-Newton method based on updating conjugate subspaces proposed in [4] can be very effective for large-scale sparse minimization because conjugate subspaces with respect to sparse Hessians are usually easy to obtain. We demonstrate this point in this paper for the partially separable case with matrices updated by a quasi-Newton scheme ofGriewank andToint [2,3]. The algorithm presented is suitable for parallel computation and economical in computer storage. Some testing results of the algorithm on an Alliant FX/8 minisupercomputer are reported.The material is based on work supported in part by the National Science Foundation under Grant No. DMS 8602419 and by the Center for Supercomputing Research and Development at the University of Illinois.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号