Similar literature
13 similar documents found.
1.
The approach used to calculate two‐electron integrals in many electronic structure packages, including the Generalized Atomic and Molecular Electronic Structure System‐UK (GAMESS‐UK), was designed for CPU‐based compute units. We redesigned the two‐electron integral algorithm for acceleration on a graphics processing unit (GPU). We report the acceleration strategy and illustrate it on the (ss|ss) type integrals. This strategy is general for Fortran‐based codes and uses the Accelerator compiler from Portland Group International and GPU‐based accelerators from Nvidia. The evaluation of (ss|ss) type integrals within calculations using Hartree–Fock ab initio methods and density functional theory is accelerated by single and quad GPU hardware systems by factors of 43 and 153, respectively. The overall speedup for a single self‐consistent field cycle is at least a factor of eight on a single GPU compared with a single CPU. © 2011 Wiley Periodicals, Inc. J Comput Chem, 2011
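As a rough illustration of what an (ss|ss) integral involves, the sketch below evaluates the standard closed-form expression for four unnormalized primitive s-type Gaussians using the zeroth-order Boys function. It is a plain CPU reference in Python, not the Fortran/GPU code described in the abstract; the function names `boys_f0` and `ssss` are hypothetical, and contraction and normalization are deliberately left out.

```python
import math


def boys_f0(t):
    """Zeroth-order Boys function F0(t) = integral from 0 to 1 of exp(-t u^2) du."""
    if t < 1e-12:
        return 1.0 - t / 3.0  # leading terms of the Taylor series near t = 0
    return 0.5 * math.sqrt(math.pi / t) * math.erf(math.sqrt(t))


def dist2(u, v):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def product_center(a, A, b, B):
    """Center of the Gaussian obtained by multiplying two s Gaussians."""
    p = a + b
    return tuple((a * x + b * y) / p for x, y in zip(A, B))


def ssss(a, A, b, B, c, C, d, D):
    """(ab|cd) for unnormalized primitive s Gaussians exp(-alpha |r - R|^2).

    a, b, c, d are exponents; A, B, C, D are the corresponding centers.
    """
    p, q = a + b, c + d
    P = product_center(a, A, b, B)
    Q = product_center(c, C, d, D)
    prefactor = 2.0 * math.pi ** 2.5 / (p * q * math.sqrt(p + q))
    k_ab = math.exp(-a * b / p * dist2(A, B))  # overlap factor of the bra pair
    k_cd = math.exp(-c * d / q * dist2(C, D))  # overlap factor of the ket pair
    return prefactor * k_ab * k_cd * boys_f0(p * q / (p + q) * dist2(P, Q))
```

Each integral depends only on its own four exponents and centers, so a batch of such integrals can be evaluated independently, which is what makes this integral class a natural first target for GPU offloading.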

2.
The generation of molecular conformations and the evaluation of interaction potentials are common tasks in molecular modeling applications, particularly in protein-ligand or protein-protein docking programs. In this work, we present a GPU-accelerated approach capable of speeding up these tasks considerably. For the evaluation of interaction potentials in the context of rigid protein-protein docking, the GPU-accelerated approach reached speedup factors of over 50 compared to an optimized CPU-based implementation. Treating the ligand and donor groups in the protein binding site as flexible, speedup factors of up to 16 can be observed in the evaluation of protein-ligand interaction potentials. Additionally, we introduce a parallel version of our protein-ligand docking algorithm PLANTS that can take advantage of this GPU-accelerated scoring function evaluation. We compared the GPU-accelerated parallel version to the same algorithm running on the CPU and also to the highly optimized sequential CPU-based version. With respect to the dependence on ligand size and on the number of rotatable bonds, speedup factors of up to 10 and 7, respectively, can be observed. Finally, a fitness landscape analysis in the context of rigid protein-protein docking was performed. Using a systematic grid-based search methodology, the GPU-accelerated version outperformed the CPU-based version with speedup factors of up to 60.

3.
Using a grid‐based method to search for the critical points of the electron density, we show how to accelerate such a method with graphics processing units (GPUs). When the GPU implementation is compared with the one running on central processing units (CPUs), we find a large difference in elapsed time between the two implementations: the shortest time is observed when GPUs are used. We tested two GPUs, one designed for video games and the other for high‐performance computing (HPC). On the CPU side, two processors were tested, one used in common personal computers and the other used for HPC, both of the latest generation. Although our parallel algorithm scales quite well on CPUs, the same implementation on GPUs runs around 10× faster than on 16 CPUs, with any of the tested GPUs and CPUs. We found that a GPU designed for video games can be used without any problem for our application and delivers remarkable performance; in fact, this GPU competes with an HPC GPU, in particular when single precision is used. © 2014 Wiley Periodicals, Inc.

4.
The NCI approach is a modern tool to reveal chemical noncovalent interactions. It is particularly attractive for describing ligand–protein binding. A custom implementation of NCI using promolecular density is presented. It is designed to leverage the computational power of NVIDIA graphics processing unit (GPU) accelerators through the CUDA programming model. The performance of three versions of the code is examined on a test set of 144 systems. NCI calculations are particularly well suited to the GPU architecture, which drastically reduces the computational time. On a single compute node, the dual‐GPU version leads to a 39‐fold improvement for the biggest instance compared to the optimal OpenMP parallel run (C code, icc compiler) with 16 CPU cores. Energy consumption measurements carried out on both CPU and GPU NCI tests show that the GPU approach provides substantial energy savings. © 2017 Wiley Periodicals, Inc.

5.
We investigated the performance of heterogeneous computing with graphics processing units (GPUs) and Many Integrated Core (MIC) coprocessors alongside 20 CPU cores (20×CPU). As a practical example toward large‐scale electronic structure calculations using grid‐based methods, we evaluated the Hartree potentials of silver nanoparticles of various sizes (3.1, 3.7, 4.9, 6.1, and 6.9 nm) via a direct integral method supported by the sinc basis set. The so‐called work stealing scheduler was used for efficient heterogeneous computing via the balanced dynamic distribution of workloads between all processors on a given architecture without any prior information on their individual performances. 20×CPU + 1GPU was up to ~1.5 and ~3.1 times faster than 1GPU and 20×CPU, respectively. 20×CPU + 2GPU was ~4.3 times faster than 20×CPU. The performance enhancement by CPU + MIC was considerably lower than expected because of the large initialization overhead of MIC, although its theoretical performance is similar to that of CPU + GPU. © 2016 Wiley Periodicals, Inc.

6.
Accelerating molecular modeling applications with graphics processors   (cited 3 times in total: 0 self-citations, 3 citations by others)
Molecular mechanics simulations offer a computational approach to study the behavior of biomolecules at atomic detail, but such simulations are limited in size and timescale by the available computing resources. State-of-the-art graphics processing units (GPUs) can perform over 500 billion arithmetic operations per second, a tremendous computational resource that can now be utilized for general purpose computing as a result of recent advances in GPU hardware and software architecture. In this article, an overview of recent advances in programmable GPUs is presented, with an emphasis on their application to molecular mechanics simulations and the programming techniques required to obtain optimal performance in these cases. We demonstrate the use of GPUs for the calculation of long-range electrostatics and nonbonded forces for molecular dynamics simulations, where GPU-based calculations are typically 10-100 times faster than heavily optimized CPU-based implementations. The application of GPU acceleration to biomolecular simulation is also demonstrated through the use of GPU-accelerated Coulomb-based ion placement and calculation of time-averaged potentials from molecular dynamics trajectories. A novel approximation to Coulomb potential calculation, the multilevel summation method, is introduced and compared with direct Coulomb summation. In light of the performance obtained for this set of calculations, future applications of graphics processors to molecular dynamics simulations are discussed.
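To make the idea of direct Coulomb summation concrete, here is a minimal CPU reference sketch in Python that sums q/r over all atoms at every point of a regular grid. It is only an illustration under stated assumptions: the function name `direct_coulomb_grid`, the `(x, y, z, charge)` atom tuples, and the omission of the Coulomb constant are all hypothetical choices, and no GPU code from the article is reproduced.

```python
import math


def direct_coulomb_grid(atoms, origin, spacing, shape):
    """Return pot[i][j][k] = sum over atoms of charge / distance.

    atoms   : list of (x, y, z, charge) tuples (hypothetical layout)
    origin  : (x, y, z) of grid point (0, 0, 0)
    spacing : distance between neighboring grid points
    shape   : (nx, ny, nz) number of grid points per axis
    The Coulomb constant is omitted, so the result is in charge/length units.
    """
    nx, ny, nz = shape
    ox, oy, oz = origin
    pot = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        gx = ox + i * spacing
        for j in range(ny):
            gy = oy + j * spacing
            for k in range(nz):
                gz = oz + k * spacing
                total = 0.0
                for ax, ay, az, q in atoms:
                    r = math.sqrt((gx - ax) ** 2 + (gy - ay) ** 2 + (gz - az) ** 2)
                    if r > 0.0:  # skip a grid point that coincides with an atom
                        total += q / r
                pot[i][j][k] = total
    return pot
```

The value at each grid point depends only on the atom list and that point's coordinates, so the three outer loops carry no dependencies between iterations. That independence is what allows a GPU version to assign one thread per grid point, at a cost proportional to (number of grid points) × (number of atoms).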

7.
A unified, computer algebra system‐based scheme of code‐generation for computational quantum‐chemistry programs is presented. Generation of electron‐repulsion integrals and their derivatives as well as exchange‐correlation potential and its derivatives is discussed. Application to general‐purpose computing on graphics processing units is considered.

8.
We present new algorithms to improve the performance of the ENUF method (F. Hedman, A. Laaksonen, Chem. Phys. Lett. 425, 2006, 142), which is essentially Ewald summation using the Non‐Uniform FFT (NFFT) technique. A NearDistance algorithm is developed to substantially reduce the neighbor list size in the real‐space computation. For the reciprocal‐space computation, a new NFFT algorithm is developed for the evaluation of electrostatic interaction energies and forces. Both real‐space and reciprocal‐space computations are further accelerated by using graphics processing units (GPUs) with CUDA technology. In particular, the use of CUNFFT (NFFT based on CUDA) greatly reduces the cost of the reciprocal‐space computation. In order to reach the best performance of this method, we propose a procedure for the selection of optimal parameters with controlled accuracies. With the choice of suitable parameters, we show that our method is a good alternative to the standard Ewald method, with the same computational precision but a dramatically higher computational efficiency. © 2015 Wiley Periodicals, Inc.

9.
The capabilities of polarizable force fields for alchemical free energy calculations have been limited by the high computational cost and complexity of the underlying potential energy functions. In this work, we present a GPU‐based general alchemical free energy simulation platform for the polarizable AMOEBA potential. Tinker‐OpenMM, the OpenMM implementation of the AMOEBA simulation engine, has been modified to enable both absolute and relative alchemical simulations on GPUs, which leads to a ∼200‐fold improvement in simulation speed over a single CPU core. We show that free energy values calculated using this platform agree with the results of Tinker simulations for the hydration of organic compounds and binding of host–guest systems within the statistical errors. In addition to absolute binding, we designed a relative alchemical approach for computing relative binding affinities of ligands to the same host, where a special path was applied to avoid numerical instability due to polarization between the different ligands that bind to the same site. This scheme is general and does not require ligands to have similar scaffolds. We show that relative hydration and binding free energies calculated using this approach match those computed from the absolute free energy approach. © 2017 Wiley Periodicals, Inc.

10.
The Si(111)2 × 1 surface has been widely studied via a range of different experimental and theoretical techniques, and found to adopt a π‐bonded chain configuration. To determine an accurate electronic structure for this system, however, it has been found necessary to use sophisticated and very computationally expensive methods such as GW or hybrid functionals. In this article, we show that the MBJLDA approach, originally proposed by Tran and Blaha for bulk materials (Tran and Blaha, Phys. Rev. Lett. 2009, 102, 226401), yields results which are comparable to GW, and generally superior to those obtained from hybrid functional density functional theory calculations. The MBJLDA method is also substantially more computationally efficient. A procedure and justification for the application of the MBJLDA approach to surfaces in general is also provided. © 2014 Wiley Periodicals, Inc.

11.
A sparse matrix multiplication scheme with multiatom blocks is reported, a tool that can be very useful for developing linear-scaling methods with atom-centered basis functions. Compared to conventional element-by-element sparse matrix multiplication schemes, efficiency is gained by the use of the highly optimized basic linear algebra subroutines (BLAS). However, some sparsity is lost in the multiatom blocking scheme because these matrix blocks will in general contain negligible elements. As a result, an optimal block size that minimizes the CPU time by balancing these two effects is recovered. In calculations on linear alkanes, polyglycines, estane polymers, and water clusters the optimal block size is found to be between 40 and 100 basis functions, where about 55-75% of the machine peak performance was achieved on an IBM RS6000 workstation. In these calculations, the blocked sparse matrix multiplications can be 10 times faster than a standard element-by-element sparse matrix package.
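The trade-off described above can be sketched in a few lines of Python. In this minimal sketch, a block-sparse matrix is assumed to be stored as a dictionary mapping `(block_row, block_col)` to a small dense block, and only the stored blocks are multiplied. The names `dense_matmul`, `block_sparse_matmul`, and `drop_tol` are hypothetical, and the pure-Python dense product merely stands in for the BLAS call a real implementation would use.

```python
def dense_matmul(a, b):
    """Dense product of two blocks (lists of rows); a stand-in for BLAS dgemm."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def add_into(acc, blk):
    """Accumulate blk into acc element by element, in place."""
    for acc_row, blk_row in zip(acc, blk):
        for j, value in enumerate(blk_row):
            acc_row[j] += value


def block_sparse_matmul(A, B, drop_tol=0.0):
    """Multiply block-sparse matrices stored as {(block_row, block_col): block}.

    Result blocks whose largest absolute element does not exceed drop_tol
    are discarded, so all-zero blocks are dropped even with the default.
    """
    # Index the blocks of B by block row so only matching pairs are visited.
    b_rows = {}
    for (k, j), blk in B.items():
        b_rows.setdefault(k, []).append((j, blk))

    C = {}
    for (i, k), a_blk in A.items():
        for j, b_blk in b_rows.get(k, []):
            prod = dense_matmul(a_blk, b_blk)
            if (i, j) in C:
                add_into(C[(i, j)], prod)
            else:
                C[(i, j)] = prod

    return {key: blk for key, blk in C.items()
            if max(abs(v) for row in blk for v in row) > drop_tol}
```

Larger blocks push more of the work into the dense kernel, where optimized libraries are efficient, but they also store and multiply more negligible elements; this is the balance that the optimal block size in the abstract refers to.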

12.
The triatomic C3 unit that is known to exist in Mg2C3 has recently been found in the new compounds Ca3Cl2C3 and Sc3C4. The electronic structure of these compounds is analyzed with the aid of extended Hückel calculations. A fragment molecular orbital (FMO) analysis is used to study the bonding characteristics of the C3 unit in the ionic Ca3Cl2C3 and in Sc3C4, the latter containing C2 units and single C atoms as well. Sc3C4 contains partially filled Sc (d) and C2 bands, leading to metallic conductivity and Pauli paramagnetism. The C–C bond distance in the diatomic C2 units is significantly increased (d(C–C) = 125 pm) relative to C2^2− or acetylene, because antibonding π*g orbitals are partially filled. The unusual bending of the C3 unit (d(C–C) = 134 pm) in Sc3C4 (175.8°) and in Ca3Cl2C3 (169.0°) is likely to be a result of the packing arrangement in these structures.

13.
FT-IR and FT-Raman (4000–100 cm−1) spectral measurements of 3-methyl-1,2-butadiene (3M12B) have been carried out in the present work. Ab initio HF and DFT (LSDA/B3LYP/B3PW91) calculations have been performed, giving energies, optimized structures, harmonic vibrational frequencies, IR intensities and Raman activities. Complete vibrational assignments of the observed spectra are made with vibrational frequencies obtained by HF and DFT (LSDA/B3LYP/B3PW91) with the 6-31G(d,p) and 6-311G(d,p) basis sets. The results of the calculations have been used to simulate IR and Raman spectra for the molecule, which showed good agreement with the observed spectra. The potential energy distribution (PED) corresponding to each of the observed frequencies is calculated, which confirms the reliability and precision of the assignment and analysis of the fundamental vibrational modes. The variation of the vibrational frequencies of butadiene due to coupling with the methyl group is also discussed. A study of the electronic properties, such as HOMO and LUMO energies, was performed using the time-dependent DFT (TD-DFT) approach. The calculated HOMO and LUMO energies show that charge transfer occurs within the molecule. The thermodynamic properties of the title compound at different temperatures reveal the correlations between standard heat capacities (C), standard entropies (S), and standard enthalpy changes (H).
