
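Efficient OpenMP Parallelization and GPU-Accelerated Implementation of Tight-Binding Time-Dependent Density Functional Theory
Cite this article: Guo-hong Fan, Ke-li Han, Guo-zhong He. Efficient OpenMP Parallelization and GPU-Accelerated Implementation of Tight-Binding Time-Dependent Density Functional Theory [J]. Chinese Journal of Chemical Physics, 2013, 26(6): 635-645.
Authors: Guo-hong Fan, Ke-li Han, Guo-zhong He
Institution: State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023
Abstract: Tight-binding time-dependent density functional theory is implemented efficiently on multi-core and GPU systems and applied to excited-state electronic structure calculations for systems with hundreds to thousands of atoms. The program uses sparse matrices and OpenMP parallelization to accelerate construction of the Hamiltonian matrix, while the most time-consuming part, the ground-state diagonalization, is accelerated on the GPU in double precision. The ground-state GPU acceleration achieves a speedup of 8.73 while preserving computational accuracy. For the excited states, a Krylov-subspace iterative algorithm is combined with OpenMP parallelization and GPU acceleration to solve the large TDDFT matrix for its eigenvalues and eigenvectors, which greatly reduces the number of iterations and the final solution time. With the matrix-vector product accelerated on the GPU, the Krylov algorithm converges quickly, giving a 206-fold speedup over a program that uses a conventional algorithm and CPU parallelization. Calculations on a series of small and large molecular systems show that, compared with the first-principles CIS and time-dependent density functional methods, the program obtains reasonable and accurate results at very low computational cost.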

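Keywords: Density functional theory, tight-binding approximation method, time-dependent density functional theory, excited state, GPU computing, Krylov subspace iterative algorithm, sparse matrix, OpenMP parallelization
Received: 16 August 2013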

Time-dependent Density Functional-based Tight-bind Method Efficiently Implemented with OpenMP Parallel and GPU Acceleration
Guo-hong Fan,Ke-li Han and Guo-zhong He.Time-dependent Density Functional-based Tight-bind Method Efficiently Implemented with OpenMP Parallel and GPU Acceleration[J].Chinese Journal of Chemical Physics,2013,26(6):635-645.
Authors: Guo-hong Fan, Ke-li Han, and Guo-zhong He
Institution: State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
Abstract: The time-dependent density functional-based tight-binding (TD-DFTB) method is implemented on multi-core and graphical processing unit (GPU) systems for excited-state calculations of large systems with hundreds or thousands of atoms. Sparse matrices and OpenMP multithreading are used to build the Hamiltonian matrix. The diagonalization of the ground-state eigenvalue problem is carried out on GPUs in double precision. The GPU-based acceleration preserves the accuracy of the computed properties, and a total speedup of 8.73 is achieved. A Krylov-space-based algorithm with OpenMP parallelization and GPU acceleration is used to find the lowest eigenvalue and eigenvector of the large TDDFT matrix, which greatly reduces the number of iterations and the time spent on the excited-state eigenvalue problem. With GPU acceleration of the matrix-vector product, the Krylov solver converges quickly to the final result, and a speedup of 206 times is observed for a system of 812 atoms. Calculations on a series of small and large systems show that the fast TD-DFTB code obtains reasonable results at a much lower computational cost than first-principles CIS and full TDDFT calculations.
Keywords: Density-functional theory, Tight-binding method, Time-dependent density functional theory, Excited state, Graphical processing unit, Krylov iterative algorithm, Sparse matrix, OpenMP
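
The abstract names two building blocks: a sparse matrix representation and a Krylov-space solver that needs only matrix-vector products to find the lowest eigenvalue. The sketch below illustrates that combination in plain, serial Python. It is not the authors' implementation: the abstract does not say which Krylov method is used, so a Lanczos iteration with full reorthogonalization is assumed here, and the function names (`csr_matvec`, `chain_csr`, `lanczos_lowest`) and the one-dimensional tight-binding chain used as input are hypothetical choices made for illustration. In an accelerated code, `csr_matvec` is the step that would be offloaded to the GPU or threaded with OpenMP.

```python
import math
import random


def csr_matvec(data, indices, indptr, x):
    """Return y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def chain_csr(n, hop=-1.0):
    """Toy model (assumption): open 1D chain with nearest-neighbour hopping."""
    data, indices, indptr = [], [], [0]
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                data.append(hop)
                indices.append(j)
        indptr.append(len(data))
    return data, indices, indptr


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def count_below(alphas, betas, x):
    """Sturm count: eigenvalues of the tridiagonal matrix that are < x."""
    count, d = 0, 1.0
    for i, a in enumerate(alphas):
        b2 = betas[i - 1] ** 2 if i > 0 else 0.0
        d = a - x - b2 / d
        if d == 0.0:
            d = 1e-300  # avoid division by zero on the next step
        if d < 0.0:
            count += 1
    return count


def lowest_tridiag_eig(alphas, betas, tol=1e-12):
    """Lowest eigenvalue of a symmetric tridiagonal matrix by bisection."""
    m = len(alphas)
    radius = [(abs(betas[i - 1]) if i > 0 else 0.0)
              + (abs(betas[i]) if i < m - 1 else 0.0) for i in range(m)]
    lo = min(a - r for a, r in zip(alphas, radius))  # Gershgorin bounds
    hi = max(a + r for a, r in zip(alphas, radius))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(alphas, betas, mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def lanczos_lowest(matvec, n, steps, seed=0):
    """Estimate the lowest eigenvalue of a symmetric operator via Lanczos."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(v, v))
    basis = [[c / norm for c in v]]
    alphas, betas = [], []
    for j in range(steps):
        w = matvec(basis[j])  # the only place the matrix is touched
        alphas.append(dot(w, basis[j]))
        for q in basis:  # full reorthogonalization against the Krylov basis
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        beta = math.sqrt(dot(w, w))
        if beta < 1e-12 or j == steps - 1:
            break
        betas.append(beta)
        basis.append([wi / beta for wi in w])
    return lowest_tridiag_eig(alphas, betas)


if __name__ == "__main__":
    n = 40
    data, indices, indptr = chain_csr(n)
    energy = lanczos_lowest(lambda x: csr_matvec(data, indices, indptr, x), n, n)
    print("lowest eigenvalue estimate:", energy)
```

For this toy chain the exact lowest eigenvalue is known in closed form, $-2\cos(\pi/(n+1))$, which makes the sketch easy to check. Because the solver sees the matrix only through `matvec`, swapping in a faster product does not change the rest of the iteration.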