Similar articles
 20 similar articles found
1.
Accelerating molecular modeling applications with graphics processors (cited 3 times in total; self-citations: 0; citations by others: 3)
Molecular mechanics simulations offer a computational approach to study the behavior of biomolecules at atomic detail, but such simulations are limited in size and timescale by the available computing resources. State-of-the-art graphics processing units (GPUs) can perform over 500 billion arithmetic operations per second, a tremendous computational resource that can now be utilized for general purpose computing as a result of recent advances in GPU hardware and software architecture. In this article, an overview of recent advances in programmable GPUs is presented, with an emphasis on their application to molecular mechanics simulations and the programming techniques required to obtain optimal performance in these cases. We demonstrate the use of GPUs for the calculation of long-range electrostatics and nonbonded forces for molecular dynamics simulations, where GPU-based calculations are typically 10-100 times faster than heavily optimized CPU-based implementations. The application of GPU acceleration to biomolecular simulation is also demonstrated through the use of GPU-accelerated Coulomb-based ion placement and calculation of time-averaged potentials from molecular dynamics trajectories. A novel approximation to Coulomb potential calculation, the multilevel summation method, is introduced and compared with direct Coulomb summation. In light of the performance obtained for this set of calculations, future applications of graphics processors to molecular dynamics simulations are discussed.
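As a rough illustration of the direct Coulomb summation mentioned in this abstract, the Python sketch below evaluates the potential at a set of lattice points by summing q/r over all atoms. It is not the authors' GPU code: the function name, the tuple layout, and the omission of the unit-dependent Coulomb prefactor are assumptions made for brevity. What it is meant to show is that every lattice point is computed independently of the others, which is the property that lets such a calculation be spread across many GPU threads.

```python
import math


def direct_coulomb_potential(atoms, grid_points):
    """Sum q/r over all atoms for every grid point.

    atoms:       iterable of (x, y, z, charge) tuples (hypothetical layout)
    grid_points: iterable of (x, y, z) tuples
    Returns a list of potentials in units of charge/length; the Coulomb
    prefactor is deliberately left out (assumption for this sketch).
    """
    potentials = []
    for gx, gy, gz in grid_points:
        total = 0.0
        for ax, ay, az, charge in atoms:
            r = math.sqrt((gx - ax) ** 2 + (gy - ay) ** 2 + (gz - az) ** 2)
            if r == 0.0:
                # A grid point sitting exactly on an atom is skipped here
                # to avoid division by zero.
                continue
            total += charge / r
        potentials.append(total)
    return potentials
```

The cost of this loop grows as (number of atoms) × (number of grid points), which is why the abstract contrasts it with an approximate method for large systems.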

2.
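This review covers the applications of and progress in graphics processing units (GPUs) in computational chemistry. It first briefly introduces the development of GPU use in scientific computing, and then describes in detail several algorithms and programs for quantum chemistry calculations and molecular dynamics (MD) simulations that have been designed to date with GPUs and the CUDA development tools (compute unified device architecture, the computing platform introduced by the graphics card vendor Nvidia). In particular, it gives a thorough account of TeraChem, currently the only quantum chemistry package developed entirely with GPU technology, including its algorithms, implementation details, and current functionality. In addition, the article offers a highly optimistic outlook on the role that GPUs will play in computational chemistry.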

3.
The use of graphical processing units (GPUs) is emerging in computational chemistry and now extends to Hartree–Fock (HF) methods and electron‐correlation theories. However, ab initio calculations of large molecules face technical difficulties such as slow memory access between the central processing unit and the GPU and other shortfalls of GPU memory. The divide‐and‐conquer (DC) method, which is a linear‐scaling scheme that divides a total system into several fragments, could avoid these bottlenecks by separately solving local equations in individual fragments. In addition, the resolution‐of‐the‐identity (RI) approximation enables an effective reduction in computational cost with respect to the GPU memory. The present study implemented the DC‐RI‐HF code on GPUs using math libraries, which guarantee compatibility with future development of the GPU architecture. Numerical applications confirmed that the present code using GPUs significantly accelerated the HF calculations while maintaining accuracy. © 2014 Wiley Periodicals, Inc.

4.
Using a grid‐based method to search for the critical points in the electron density, we show how to accelerate such a method with graphics processing units (GPUs). When the GPU implementation is compared with the one used on central processing units (CPUs), we find a large difference in elapsed time between the two implementations: the shortest time is observed when GPUs are used. We tested two GPUs, one designed for video games and the other for high‐performance computing (HPC). On the CPU side, two processors were tested, one used in common personal computers and the other used for HPC, both of the latest generation. Although our parallel algorithm scales quite well on CPUs, the same implementation on GPUs runs around 10× faster than 16 CPUs, with any of the tested GPUs and CPUs. We have found that a GPU dedicated to video games can be used without any problem for our application, delivering remarkable performance; in fact, this GPU competes with an HPC GPU, in particular when single precision is used. © 2014 Wiley Periodicals, Inc.

5.
Modern graphics processing units (GPUs) are flexibly programmable and have peak computational throughput significantly faster than conventional CPUs. Herein, we describe the design and implementation of PAPER, an open‐source implementation of Gaussian molecular shape overlay for NVIDIA GPUs. We demonstrate one to two order‐of‐magnitude speedups on high‐end commodity GPU hardware relative to a reference CPU implementation of the shape overlay algorithm and speedups of over one order of magnitude relative to the commercial OpenEye ROCS package. In addition, we describe errors incurred by approximations used in common implementations of the algorithm. © 2009 Wiley Periodicals, Inc. J Comput Chem 2010
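To make the idea of Gaussian shape overlay concrete, the sketch below computes the overlap of two molecules whose atoms are modelled as spherical Gaussians, using the closed-form overlap integral of two Gaussians, and derives a shape Tanimoto score from it. This is a minimal illustration and not PAPER's implementation: the pairwise-only (first-order) sum, the function names, and the amplitude and width values used in any call are assumptions.

```python
import math


def gaussian_overlap(p_i, alpha_i, center_i, p_j, alpha_j, center_j):
    """Overlap integral of two spherical Gaussians p*exp(-alpha*|r - R|^2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(center_i, center_j))
    s = alpha_i + alpha_j
    return p_i * p_j * math.exp(-alpha_i * alpha_j * d2 / s) * (math.pi / s) ** 1.5


def overlap_volume(mol_a, mol_b):
    """Pairwise overlap between two molecules.

    Each molecule is a list of (amplitude, width, (x, y, z)) tuples
    (hypothetical layout). Higher-order intersection terms are ignored.
    """
    return sum(
        gaussian_overlap(pa, aa, ca, pb, ab, cb)
        for pa, aa, ca in mol_a
        for pb, ab, cb in mol_b
    )


def shape_tanimoto(mol_a, mol_b):
    """Shape similarity in (0, 1]; equals 1 for identical, aligned molecules."""
    v_ab = overlap_volume(mol_a, mol_b)
    return v_ab / (overlap_volume(mol_a, mol_a) + overlap_volume(mol_b, mol_b) - v_ab)
```

An overlay optimizer would repeatedly rotate and translate one molecule to maximize `overlap_volume`; the many independent atom-pair terms are the kind of work that parallel hardware handles well.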

6.
Ray casting on graphics processing units (GPUs) opens new possibilities for molecular visualization. We describe the implementation and calculation of diverse molecular representations such as licorice, ball-and-stick, space-filling van der Waals spheres, and approximated solvent-accessible surfaces using GPUs. We introduce HyperBalls, an improved ball-and-stick representation replacing tubes, linking the atom spheres by hyperboloids that can smoothly connect them. This type of depiction is particularly useful to represent dynamic phenomena, such as the evolution of noncovalent bonds. It is furthermore well suited to represent coarse-grained models and spring networks. All these representations can be defined by a single general algebraic equation that is adapted for the ray-casting technique and is well suited for execution on the GPU. Using GPU capabilities, this implementation can routinely, accurately, and interactively render molecules ranging from a few atoms up to huge macromolecular assemblies with more than 500,000 particles. In simple cases, based only on spheres, we have been able to display up to two million atoms smoothly.
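One way to read the "single general algebraic equation" in this abstract is as a quadric surface, x^T Q x = 0 in homogeneous coordinates, which covers spheres, cylinders, and hyperboloids alike. The sketch below intersects a ray with such a quadric by solving the resulting quadratic in the ray parameter t. It is an illustration under that assumption, not the HyperBalls shader code; the helper names and the two example matrices are hypothetical.

```python
import math


def quad_form(Q, u, v):
    """Return u^T Q v for a 4x4 matrix Q and 4-vectors u, v."""
    return sum(u[i] * Q[i][j] * v[j] for i in range(4) for j in range(4))


def ray_quadric_hit(Q, origin, direction):
    """Smallest t >= 0 at which origin + t*direction meets x^T Q x = 0.

    Q must be symmetric. Returns None if the ray misses the surface.
    """
    o = (*origin, 1.0)      # homogeneous point
    d = (*direction, 0.0)   # homogeneous direction
    a = quad_form(Q, d, d)
    b = 2.0 * quad_form(Q, o, d)
    c = quad_form(Q, o, o)
    if abs(a) < 1e-12:      # the equation degenerates to a linear one
        if abs(b) < 1e-12:
            return None
        t = -c / b
        return t if t >= 0.0 else None
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in sorted(((-b - root) / (2.0 * a), (-b + root) / (2.0 * a))):
        if t >= 0.0:
            return t
    return None


def sphere_quadric(radius):
    """x^2 + y^2 + z^2 - radius^2 = 0, centred at the origin."""
    return [[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, -radius ** 2]]


def hyperboloid_quadric(waist_radius, flare):
    """One-sheet hyperboloid x^2 + y^2 - flare*z^2 - waist_radius^2 = 0."""
    return [[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, -flare, 0], [0, 0, 0, -waist_radius ** 2]]
```

Because only the matrix Q changes between shapes, one intersection routine can serve every primitive, which is a plausible reason such a formulation suits per-pixel execution on a GPU.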

7.
During the past few years, graphics processing units (GPUs) have become extremely popular in the high performance computing community. In this study, we present an implementation of an acceleration engine for the solvent–solvent interaction evaluation of molecular dynamics simulations. By careful optimization of the algorithm, speed‐ups of up to a factor of 54 (single‐precision GPU vs. double‐precision CPU) could be achieved. The accuracy of the single‐precision GPU implementation is carefully investigated, and single precision is found not to influence structural, thermodynamic, and dynamic quantities. Therefore, the implementation enables users of the GROMOS software for biomolecular simulation to run the solvent–solvent interaction evaluation on a GPU, and thus to speed up their simulations by a factor of 6–9. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2010
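The gap between the kernel speed-up (up to 54) and the whole-simulation speed-up (6–9) is what Amdahl's law would predict when only part of the run time is accelerated. The sketch below applies that law; the abstract does not state the fraction of run time spent in solvent–solvent interactions, so the fraction computed here is an inference under the Amdahl assumption, and the function names are hypothetical.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: whole-program speed-up when a fraction of the run time
    is accelerated by kernel_speedup and the rest is unchanged."""
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / kernel_speedup)


def implied_fraction(kernel_speedup, observed_overall_speedup):
    """Invert Amdahl's law to get the accelerated fraction of the original
    run time that would explain the observed overall speed-up."""
    return (1.0 - 1.0 / observed_overall_speedup) / (1.0 - 1.0 / kernel_speedup)


if __name__ == "__main__":
    for overall in (6.0, 9.0):
        fraction = implied_fraction(54.0, overall)
        print(f"overall {overall:.0f}x -> accelerated fraction {fraction:.3f}")
```

Under these assumptions, a 54-fold kernel speed-up and a 6- to 9-fold overall gain correspond to roughly 85% to 91% of the original run time being spent in the accelerated part. Since 54 is quoted as an upper bound, the real fractions may differ.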

8.
We present a way to improve the performance of the electronic structure Vienna Ab initio Simulation Package (VASP) program. We show that high-performance computers equipped with graphics processing units (GPUs) as accelerators may drastically reduce the computation time when the time-consuming sections of the code are offloaded to the graphics chips. The procedure consists of (i) profiling the performance of the code to isolate the time-consuming parts, (ii) rewriting these so that the algorithms become better suited to the chosen graphics accelerator, and (iii) optimizing memory traffic between the host computer and the GPU accelerator. We chose to accelerate VASP with NVIDIA GPUs using CUDA. We compare the GPU and original versions of VASP by evaluating the Davidson and RMM-DIIS algorithms on chemical systems of up to 1100 atoms. In these tests, the total time is reduced by a factor between 3 and 8 when running on n (CPU core + GPU) compared to n CPU cores only, without any accuracy loss. © 2012 Wiley Periodicals, Inc.

9.
Compute Unified Device Architecture (CUDA) was used to design and implement molecular dynamics (MD) simulations on graphics processing units (GPUs). With an NVIDIA Tesla C870, a 20–60 fold speedup over that of one core of the Intel Xeon 5430 CPU was achieved, reaching up to 150 Gflops. MD simulation of cavity flow and particle-bubble interaction in liquid was implemented on multiple GPUs using a message passing interface (MPI). Up to 200 GPUs were tested on a special network topology, which achieves good scalability. The capability of GPU clusters for large-scale molecular dynamics simulation of meso-scale flow behavior was, therefore, uncovered. Supported by the National Natural Science Foundation of China (Grant Nos. 20336040, 20221603 and 20490201), and the Chinese Academy of Sciences (Grant No. Kgcxz-yw-124).

10.
GALAMOST [graphics processing unit (GPU)‐accelerated large‐scale molecular simulation toolkit] is a molecular simulation package designed to utilize the computational power of GPUs. Besides the common features of molecular dynamics (MD) packages, it is developed specially for studies of self‐assembly, phase transition, and other properties of polymeric systems at the mesoscopic scale by using some recently developed simulation techniques. To accelerate the simulations, GALAMOST contains a hybrid particle‐field MD technique where particle–particle interactions are replaced by interactions of particles with density fields. Moreover, the numerical potential obtained by bottom‐up coarse‐graining methods can be implemented in simulations with GALAMOST. By combining these force fields and the particle‐density coupling method in GALAMOST, simulations of polymers can be performed with very large system sizes over long simulation times. In addition, GALAMOST encompasses two specific models, that is, a soft anisotropic particle model and a chain‐growth polymerization model, by which the hierarchical self‐assembly of soft anisotropic particles and the problems related to polymerization can be studied, respectively. The optimized algorithms implemented on the GPU, package characteristics, and benchmarks of GALAMOST are reported in detail. © 2013 Wiley Periodicals, Inc.

11.
The molecular dynamics simulation package GROMACS runs efficiently on a wide variety of hardware from commodity workstations to high performance computing clusters. Hardware features are well‐exploited with a combination of single instruction multiple data, multithreading, and message passing interface (MPI)‐based single program multiple data/multiple program multiple data parallelism while graphics processing units (GPUs) can be used as accelerators to compute interactions off‐loaded from the CPU. Here, we evaluate which hardware produces trajectories with GROMACS 4.6 or 5.0 in the most economical way. We have assembled and benchmarked compute nodes with various CPU/GPU combinations to identify optimal compositions in terms of raw trajectory production rate, performance‐to‐price ratio, energy efficiency, and several other criteria. Although hardware prices are naturally subject to trends and fluctuations, general tendencies are clearly visible. Adding any type of GPU significantly boosts a node's simulation performance. For inexpensive consumer‐class GPUs this improvement is equally reflected in the performance‐to‐price ratio. Although memory issues in consumer‐class GPUs could pass unnoticed as these cards do not support error checking and correction memory, unreliable GPUs can be sorted out with memory checking tools. Apart from the obvious determinants for cost‐efficiency like hardware expenses and raw performance, the energy consumption of a node is a major cost factor. Over the typical hardware lifetime until replacement of a few years, the costs for electrical power and cooling can become larger than the costs of the hardware itself. Taking that into account, nodes with a well‐balanced ratio of CPU and consumer‐class GPU resources produce the maximum amount of GROMACS trajectory over their lifetime. © 2015 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.

12.
A unified, computer algebra system‐based scheme of code‐generation for computational quantum‐chemistry programs is presented. Generation of electron‐repulsion integrals and their derivatives as well as exchange‐correlation potential and its derivatives is discussed. Application to general‐purpose computing on graphics processing units is considered.

13.
Molecular dynamics (MD) simulations are a vital tool in chemical research, as they are able to provide an atomistic view of chemical systems and processes that is not obtainable through experiment. However, large‐scale MD simulations require access to multicore clusters or supercomputers that are not always available to all researchers. Recently, scientists have returned to exploring the power of graphics processing units (GPUs) for various applications, such as MD, enabled by the recent advances in hardware and integrated programming interfaces such as NVIDIA's CUDA platform. One area of particular interest within the context of chemical applications is that of aqueous interfaces, the salt solutions of which have found application as model systems for studying atmospheric processes as well as physical behaviors such as the Hofmeister effect. Here, we present results of GPU‐accelerated simulations of the liquid–vapor interface of aqueous sodium iodide solutions. Analysis of various properties, such as density and surface tension, demonstrates that our model is consistent with previous studies of similar systems. In particular, we find that the current combination of water and ion force fields coupled with the ability to simulate surfaces of differing area enabled by GPU hardware is able to reproduce the experimental trend of increasing salt solution surface tension relative to pure water. In terms of performance, our GPU implementation performs equivalently to CHARMM running on 21 CPUs. Finally, we address possible issues with the accuracy of MD simulations caused by nonstandard single‐precision arithmetic implemented on current GPUs. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2011

14.
We present a highly parallel algorithm to convert internal coordinates of a polymeric molecule into Cartesian coordinates. Traditionally, converting the structures of polymers (e.g., proteins) from internal to Cartesian coordinates has been performed serially, due to an inherent linear dependency along the polymer chain. We show this dependency can be removed using a tree-based concatenation of coordinate transforms between segments, and then parallelized efficiently on graphics processing units (GPUs). The conversion algorithm is applicable to protein engineering and fitting protein structures to experimental data, and we observe an order of magnitude speedup using parallel processing on a GPU compared to serial execution on a CPU.
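The reason a tree-based scheme can remove the serial dependency is that composing rigid transforms is associative, so the running products along the chain can be formed in about log2(n) rounds instead of n sequential steps. The sketch below shows this with a simplified two-dimensional chain and a Hillis-Steele prefix scan. Both the 2D model and the choice of scan are assumptions made for illustration; they stand in for, and are not, the authors' algorithm.

```python
import math


def local_transform(bond_length, turn_angle):
    """3x3 homogeneous 2D transform: rotate by turn_angle, then advance
    bond_length along the rotated x axis (hypothetical chain model)."""
    c, s = math.cos(turn_angle), math.sin(turn_angle)
    return [[c, -s, bond_length * c],
            [s, c, bond_length * s],
            [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def serial_prefix(transforms):
    """Reference: cumulative products formed one after another."""
    out, acc = [], None
    for t in transforms:
        acc = t if acc is None else matmul(acc, t)
        out.append(acc)
    return out


def scan_prefix(transforms):
    """Inclusive prefix scan. Within each round every product depends only
    on the previous round, so a round could run fully in parallel."""
    result = list(transforms)
    step = 1
    while step < len(result):
        previous = result
        result = [previous[i] if i < step else matmul(previous[i - step], previous[i])
                  for i in range(len(previous))]
        step *= 2
    return result


def positions(prefix):
    """Atom positions are the translation parts of the cumulative transforms."""
    return [(g[0][2], g[1][2]) for g in prefix]
```

For example, `positions(scan_prefix([local_transform(1.0, 0.0)] * 4))` places four atoms along the x axis, and the result matches `serial_prefix` for any list of bond lengths and turn angles up to floating-point rounding.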

15.
The conductor-like polarizable continuum model (C-PCM) with switching/Gaussian smooth discretization is a widely used implicit solvation model in quantum chemistry. We have previously implemented C-PCM solvation for Hartree-Fock (HF) and density functional theory (DFT) on graphical processing units (GPUs), enabling the quantum mechanical treatment of large solvated biomolecules. Here, we first propose a GPU-based algorithm for the PCM conjugate gradient linear solver that greatly improves the performance for very large molecules. The overhead for PCM-related evaluations now consumes less than 15% of the total runtime for DFT calculations on large molecules. Second, we demonstrate that our algorithms tailored for ground state C-PCM are transferable to excited state properties. Using a single GPU, our method evaluates the analytic gradient of the linear response PCM time-dependent density functional theory energy up to 80× faster than a conventional central processing unit (CPU)-based implementation. In addition, our C-PCM algorithms are transferable to other methods that require electrostatic potential (ESP) evaluations. For example, we achieve speed-ups of up to 130× for restricted ESP-based atomic charge evaluations, when compared to CPU-based codes. We also summarize and compare the different PCM cavity discretization schemes used in some popular quantum chemistry packages as a reference for both users and developers.
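For readers unfamiliar with the conjugate gradient solver named in this abstract, the sketch below is a textbook version for a symmetric positive-definite system A x = b. It is a generic illustration rather than the authors' GPU algorithm: the matrix is supplied only through a `matvec` callback (a hypothetical name), and no preconditioner is used. The matrix-vector product is the dominant cost per iteration, which is presumably the part that benefits most from GPU execution.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    matvec: function returning A @ v for a list v (assumed interface)
    b:      right-hand side as a list of floats
    """
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = matvec(p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def dense_matvec(matrix):
    """Wrap a dense list-of-lists matrix as a matvec callback."""
    return lambda v: [dot(row, v) for row in matrix]
```

A call such as `conjugate_gradient(dense_matvec([[4.0, 1.0], [1.0, 3.0]]), [1.0, 2.0])` uses a toy 2×2 matrix, not a PCM matrix. Because the solver only needs the action of A on a vector, the matrix never has to be stored explicitly.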

16.
Presented is the implementation of the Drude force field in the open‐source OpenMM simulation package allowing for access to graphical processing unit (GPU) hardware. In the Drude model, electronic degrees of freedom are represented by negatively charged particles attached to their parent atoms via harmonic springs, such that extra computational overhead comes from these additional particles and virtual sites representing lone pairs on electronegative atoms, as well as the associated thermostat and integration algorithms. This leads to an approximately fourfold increase in computational demand over additive force fields. However, by making the Drude model accessible to consumer‐grade desktop GPU hardware it will be possible to perform simulations of one microsecond or more in less than a month, indicating that the barrier to employ polarizable models has largely been removed such that polarizable simulations with the classical Drude model are readily accessible and practical.

17.
A custom code for molecular dynamics simulations has been designed to run on CUDA‐enabled NVIDIA graphics processing units (GPUs). The double‐precision code simulates multicomponent fluids, with intramolecular and intermolecular forces, coarse‐grained and atomistic models, holonomic constraints, Nosé–Hoover thermostats, and the generation of distribution functions. Algorithms to compute Lennard‐Jones and Gay‐Berne interactions, and the electrostatic force using Ewald summations, are discussed. A neighbor list is introduced to improve scaling with respect to system size. Three test systems are examined: SPC/E water; an n‐hexane/2‐propanol mixture; and a liquid crystal mesogen, 2‐(4‐butyloxyphenyl)‐5‐octyloxypyrimidine. Code performance is analyzed for each system. With one GPU, a 33–119 fold increase in performance is achieved compared with the serial code while the use of two GPUs leads to a 69–287 fold improvement and three GPUs yield a 101–377 fold speedup. © 2015 Wiley Periodicals, Inc.

18.
We investigated the performance of heterogeneous computing with graphics processing units (GPUs) and many integrated core (MIC) with 20 CPU cores (20×CPU). As a practical example toward large scale electronic structure calculations using grid‐based methods, we evaluated the Hartree potentials of silver nanoparticles with various sizes (3.1, 3.7, 4.9, 6.1, and 6.9 nm) via a direct integral method supported by the sinc basis set. The so‐called work stealing scheduler was used for efficient heterogeneous computing via the balanced dynamic distribution of workloads between all processors on a given architecture without any prior information on their individual performances. 20×CPU + 1GPU was up to ~1.5 and ~3.1 times faster than 1GPU and 20×CPU, respectively. 20×CPU + 2GPU was ~4.3 times faster than 20×CPU. The performance enhancement by CPU + MIC was considerably lower than expected because of the large initialization overhead of MIC, although its theoretical performance is similar to that of CPU + GPU. © 2016 Wiley Periodicals, Inc.
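The quoted speed-ups allow a quick consistency check on how well the scheduler balances the load. If the 20 CPU cores and the GPU worked at their stand-alone rates with no overhead, the combined throughput would be the sum of the two individual throughputs. The sketch below compares the reported combined performance against that ideal; it treats the approximate "up to" figures from the abstract as exact, which is an assumption, and the function name is hypothetical.

```python
def scheduling_efficiency(speedup_over_device_a, speedup_over_device_b):
    """Measured combined throughput divided by the sum of the two
    stand-alone throughputs (all in units of the combined throughput).

    A value near 1 means the combination loses little to overhead."""
    ideal_rate = 1.0 / speedup_over_device_a + 1.0 / speedup_over_device_b
    return 1.0 / ideal_rate


if __name__ == "__main__":
    # 20xCPU + 1GPU was ~1.5x faster than 1GPU and ~3.1x faster than 20xCPU.
    print(f"efficiency: {scheduling_efficiency(1.5, 3.1):.3f}")
    # The same two numbers imply the GPU alone is about 3.1 / 1.5 times
    # faster than the 20 CPU cores alone.
    print(f"GPU vs 20xCPU: {3.1 / 1.5:.2f}x")
```

With these inputs the ratio comes out at about 1.01, that is, indistinguishable from ideal within the rounding of the reported values.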

19.
Usually based on molecular mechanics force fields, the post-optimization of ligand poses is typically the most time-consuming step in protein–ligand docking procedures. In return, it bears the potential to overcome the limitations of discretized conformation models. Because of the parallel nature of the problem, recent graphics processing units (GPUs) can be applied to address this dilemma. We present a novel algorithmic approach for parallelizing and thus massively speeding up protein–ligand complex optimizations with GPUs. The method, customized to pose-optimization, performs at least 100 times faster than widely used CPU-based optimization tools. An improvement in Root-Mean-Square Distance (RMSD) compared to the original docking pose of up to 42% can be achieved. © 2012 Wiley Periodicals, Inc.

20.
Large-scale computational screening of thirty thousand zeolite structures was conducted to find optimal structures for separation of ethane/ethene mixtures. Efficient grand canonical Monte Carlo (GCMC) simulations were performed with graphics processing units (GPUs) to obtain pure component adsorption isotherms for both ethane and ethene. We have utilized the ideal adsorbed solution theory (IAST) to obtain the mixture isotherms, which were used to evaluate the performance of each zeolite structure based on its working capacity and selectivity. In our analysis, we have determined that specific arrangements of zeolite framework atoms create sites for the preferential adsorption of ethane over ethene. The majority of optimum separation materials can be identified by utilizing this knowledge, and screening structures for the presence of this feature will enable the efficient selection of promising candidate materials for ethane/ethene separation prior to performing molecular simulations.
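To show how IAST turns pure-component isotherms into mixture loadings and a selectivity, the sketch below solves the IAST equations for a binary mixture. It assumes each pure-component isotherm is a single-site Langmuir curve, q = q_sat·b·P/(1 + b·P), whereas the study obtains its isotherms from GCMC simulation; the function name and all parameter values are hypothetical. The solver finds, by bisection, the reduced spreading pressure at which the adsorbed-phase mole fractions sum to one.

```python
import math


def iast_binary_langmuir(pressure, y1, q_sat, b):
    """IAST for two components with Langmuir pure-component isotherms.

    pressure: total gas pressure
    y1:       gas-phase mole fraction of component 1 (0 < y1 < 1)
    q_sat, b: pairs of Langmuir saturation loadings and affinities
    Returns ([loading_1, loading_2], selectivity of 1 over 2).
    """
    y = (y1, 1.0 - y1)

    def fictitious_pressure(psi, i):
        # Pure-component pressure giving reduced spreading pressure psi,
        # from psi = q_sat * ln(1 + b * P0).
        return math.expm1(psi / q_sat[i]) / b[i]

    def mole_fraction_sum(psi):
        return sum(pressure * y[i] / fictitious_pressure(psi, i) for i in range(2))

    lo, hi = 1e-12, 1.0
    while mole_fraction_sum(hi) > 1.0:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):                 # bisection; the sum decreases with psi
        mid = 0.5 * (lo + hi)
        if mole_fraction_sum(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    psi = 0.5 * (lo + hi)

    p0 = [fictitious_pressure(psi, i) for i in range(2)]
    x = [pressure * y[i] / p0[i] for i in range(2)]
    pure = [q_sat[i] * b[i] * p0[i] / (1.0 + b[i] * p0[i]) for i in range(2)]
    q_total = 1.0 / sum(x[i] / pure[i] for i in range(2))
    loadings = [x[i] * q_total for i in range(2)]
    selectivity = (x[0] / x[1]) / (y[0] / y[1])
    return loadings, selectivity
```

When both components share the same saturation loading, the selectivity reduces to the ratio of the Langmuir affinities, which gives a convenient sanity check for the solver.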
