首页 | 本学科首页   官方微博 | 高级检索  
     


Efficient algorithms for computing rank-revealing factorizations on a GPU
Authors:Nathan Heavner  Chao Chen  Abinand Gopal  Per-Gunnar Martinsson
Affiliation:1. Department of Applied Mathematics, University of Colorado at Boulder, Boulder, Colorado, USA;2. Oden Institute, University of Texas at Austin, Austin, Texas, USA;3. Department of Mathematics, Yale University, New Haven, Connecticut, USA;4. Oden Institute & Department of Mathematics, University of Texas at Austin, Austin, Texas, USA
Abstract:Standard rank-revealing factorizations such as the singular value decomposition (SVD) and column pivoted QR factorization are challenging to implement efficiently on a GPU. A major difficulty in this regard is the inability of standard algorithms to cast most operations in terms of the Level-3 BLAS. This article presents two alternative algorithms for computing a rank-revealing factorization of the form A = U T V $$ mathbf{mathsf{A}}=mathbf{mathsf{UT}}{mathbf{mathsf{V}}}^{ast } $$ , where U $$ mathbf{mathsf{U}} $$ and V $$ mathbf{mathsf{V}} $$ are orthogonal and T $$ mathbf{mathsf{T}} $$ is trapezoidal (or triangular if A $$ mathbf{mathsf{A}} $$ is square). Both algorithms use randomized projection techniques to cast most of the flops in terms of matrix-matrix multiplication, which is exceptionally efficient on the GPU. Numerical experiments illustrate that these algorithms achieve significant acceleration over finely tuned GPU implementations of the SVD while providing low rank approximation errors close to that of the SVD.
Keywords:parallel algorithm for GPU  randomized numerical linear algebra  rank-revealing matrix factorization
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号