Parallel GMRES solver for fast analysis of large linear dynamic systems on GPU platforms

Institution:	1. Department of Electrical and Computer Engineering, University of California at Riverside, Riverside, CA 92521, USA; 2. Synopsys Inc., Mountain View, CA 94043, USA; 3. School of Microelectronics & Solid-State Electronics, University of Electronic Science & Technology of China, Chengdu, Sichuan 610054, China; 4. School of Microelectronics, Shanghai Jiao Tong University, Shanghai 200240, China; 1. University of Gafsa, Tunisia; 2. CTS, Uninova, Faculdade de Ciências e Tecnologia, FCT, Universidade Nova de Lisboa, Portugal; 3. University of Sfax, Tunisia; 1. Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA; 2. Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada T6G 2V4; 1. Universidad Autónoma de Querétaro, Facultad de Ingeniería, A.P. 3-24, C.P. 76150, Querétaro, Qro., Mexico; 2. CIDESI, Dirección de Investigación y Posgrado, Av. Playa Pie de la Cuesta No. 702, Desarrollo San Pablo, C.P. 76130, Querétaro, Qro., Mexico; 1. National Research Institute of Electronics and Cryptology, TÜBİTAK, 41470 Kocaeli, Turkey; 2. Bogazici University, Department of Electrical and Electronics Engineering, 34342 Bebek, Istanbul, Turkey

Abstract:	In this paper, we propose an efficient parallel dynamic linear solver, called GPU-GMRES, for transient analysis of large linear dynamic systems such as large power grid networks. The method is based on the preconditioned generalized minimum residual (GMRES) iterative method, implemented on heterogeneous CPU–GPU platforms. The solver is robust: it can be applied to power grids with different structures as well as to general analysis problems for large linear dynamic systems with asymmetric matrices. GPU-GMRES adopts the general and robust incomplete LU (ILU) based preconditioner. We show that by properly selecting the amount of fill-in in the incomplete LU factors, a good trade-off between GPU efficiency and convergence rate can be achieved for the best overall performance. This tunable feature makes the algorithm adaptable to different problems. The GPU-GMRES solver partitions the major computing tasks of GMRES so as to minimize the data traffic between the CPU and the GPUs. Furthermore, we propose a new fast parallel sparse matrix–vector multiplication (SpMV) algorithm to accelerate the GPU-GMRES solver. The new algorithm, called segSpMV, achieves fully coalesced memory access, in contrast to existing approaches. To improve scalability and efficiency, segSpMV is extended to multi-GPU platforms, which leads to a more scalable and faster multi-GPU GMRES solver. Experimental results on the published IBM benchmark circuits and on mesh-structured power grid networks show that the GPU-GMRES solver can deliver orders-of-magnitude speedup over the direct LU solver UMFPACK. The resulting multi-GPU-GMRES can also deliver 3–12× speedup over the CPU implementation of the same GMRES method on transient analysis.
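
The fill-in trade-off described in the abstract can be illustrated with a small CPU-only sketch. It is not the authors' GPU-GMRES or segSpMV implementation: it uses SciPy's `spilu` and `gmres` as stand-ins, and the test matrix, the helper names `build_grid_matrix` and `solve_ilu_gmres`, and all parameter values are assumptions made for illustration. A larger `fill_factor` permits more nonzeros in the incomplete LU factors, which usually reduces the GMRES iteration count while making each preconditioner application more expensive.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, gmres, spilu


def build_grid_matrix(m=20):
    """Hypothetical asymmetric, diagonally dominant matrix on an m x m mesh.

    This is an illustrative stand-in, not one of the IBM benchmark circuits.
    """
    # 1D asymmetric stencil; the 2D operator is assembled with Kronecker products.
    t = sp.diags([-1.2, 2.5, -0.8], offsets=[-1, 0, 1], shape=(m, m))
    eye = sp.identity(m)
    return (sp.kron(eye, t) + sp.kron(t, eye)).tocsc()


def solve_ilu_gmres(A, b, fill_factor=10.0, drop_tol=1e-4, restart=30):
    """Solve A x = b with restarted GMRES and an incomplete LU preconditioner."""
    ilu = spilu(A.tocsc(), fill_factor=fill_factor, drop_tol=drop_tol)
    # The preconditioner applies the approximate inverse via the ILU factors.
    M = LinearOperator(A.shape, matvec=ilu.solve)

    iterations = 0

    def count(_residual_norm):
        nonlocal iterations
        iterations += 1  # called once per inner GMRES iteration

    x, info = gmres(A, b, M=M, restart=restart, maxiter=200,
                    callback=count, callback_type="pr_norm")
    return x, info, iterations


if __name__ == "__main__":
    A = build_grid_matrix(20)
    b = np.ones(A.shape[0])
    for fill in (1.0, 5.0, 20.0):
        x, info, its = solve_ilu_gmres(A, b, fill_factor=fill)
        rel_res = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
        print(f"fill_factor={fill}: info={info}, inner iterations={its}, "
              f"relative residual={rel_res:.2e}")
```

On a GPU the balance shifts further, because the triangular solves in the preconditioner parallelize less well than SpMV; that is the efficiency side of the trade-off the abstract refers to, which this CPU sketch does not measure.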

Keywords:	Parallel analysis; GPU; Sparse vector and matrix multiplication; Dynamic linear systems; Circuit simulation
This article is indexed in ScienceDirect and other databases.
