Similar Articles
5 similar articles found (search time: 0 ms)
1.
The molecular dynamics simulation package GROMACS runs efficiently on a wide variety of hardware, from commodity workstations to high-performance computing clusters. Hardware features are well exploited with a combination of single instruction multiple data (SIMD), multithreading, and message passing interface (MPI)-based single program multiple data/multiple program multiple data parallelism, while graphics processing units (GPUs) can be used as accelerators to compute interactions off-loaded from the CPU. Here, we evaluate which hardware produces trajectories with GROMACS 4.6 or 5.0 in the most economical way. We have assembled and benchmarked compute nodes with various CPU/GPU combinations to identify optimal compositions in terms of raw trajectory production rate, performance-to-price ratio, energy efficiency, and several other criteria. Although hardware prices are naturally subject to trends and fluctuations, general tendencies are clearly visible. Adding any type of GPU significantly boosts a node's simulation performance. For inexpensive consumer-class GPUs, this improvement is equally reflected in the performance-to-price ratio. Although memory errors in consumer-class GPUs could pass unnoticed because these cards do not support error-checking-and-correction (ECC) memory, unreliable GPUs can be sorted out with memory checking tools. Apart from the obvious determinants of cost-efficiency, such as hardware expenses and raw performance, the energy consumption of a node is a major cost factor. Over the typical hardware lifetime of a few years until replacement, the costs for electrical power and cooling can exceed the cost of the hardware itself. Taking that into account, nodes with a well-balanced ratio of CPU and consumer-class GPU resources produce the maximum amount of GROMACS trajectory over their lifetime. © 2015 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.
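As a rough illustration of the cost argument above, the sketch below (not from the paper; all prices, power draws, and performance figures are hypothetical placeholders) estimates how much trajectory a node delivers per euro once electricity and cooling over its lifetime are included.

def ns_per_euro(perf_ns_day, hardware_eur, power_watt,
                years=4.0, eur_per_kwh=0.25, cooling_overhead=0.5):
    """Nanoseconds of trajectory produced per euro over the node's lifetime."""
    hours = years * 365 * 24
    energy_eur = power_watt / 1000.0 * hours * eur_per_kwh * (1 + cooling_overhead)
    total_ns = perf_ns_day * years * 365
    return total_ns / (hardware_eur + energy_eur)

# Hypothetical comparison: CPU-only node vs. node with one consumer-class GPU
print(ns_per_euro(perf_ns_day=20, hardware_eur=2500, power_watt=250))
print(ns_per_euro(perf_ns_day=60, hardware_eur=3300, power_watt=400))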

2.
The influence of the total number of cores, the number of cores dedicated to the particle mesh Ewald (PME) calculation, and the choice of single vs. double precision on the performance of molecular dynamics (MD) simulations of 70,000 to 1.7 million atoms was analyzed on three different high-performance computing facilities employing GROMACS 4, by running about 6000 benchmark simulations. Small and medium-sized systems scaled linearly up to 64 and 128 cores, respectively. Systems with half a million to 1.2 million atoms scaled linearly up to 256 cores. The best performance was achieved by dedicating 25% of the total number of cores to the PME calculation. Double-precision calculations lowered the performance by 30-50%. A database for collecting information about MD simulations and the achieved performance was created; it is freely available online and allows fast estimation of the performance that can be expected in similar environments. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2011
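As an illustration of the 25% rule of thumb reported above, the sketch below (an assumption, not taken from the paper; the executable name mdrun_mpi and the MPI launcher depend on the installation) computes a suggested number of dedicated PME cores and prints a corresponding GROMACS 4 launch line.

def suggest_npme(total_cores, pme_fraction=0.25):
    """Number of cores to dedicate to PME, using the ~25% rule of thumb."""
    return max(1, round(total_cores * pme_fraction))

cores = 256
npme = suggest_npme(cores)
# Executable and launcher names are installation-dependent (assumed here).
print(f"mpirun -np {cores} mdrun_mpi -npme {npme} -deffnm benchmark")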

3.
We present a method of parallelizing flat-histogram Monte Carlo simulations, which yield the free energy of a molecular system as an output. In the serial version, a constant probability distribution, as a function of any system parameter, is obtained by updating an external potential that is added to the system Hamiltonian; this external potential is related to the free energy. In the parallel implementation, the simulation is distributed over different processors. At regular intervals, the modifying potential is summed over all processors and distributed back to every processor, thus spreading the information about which parts of parameter space have been explored. This implementation is shown to decrease the execution time linearly with the number of added processors.
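The synchronization step described above can be sketched as follows, assuming an MPI-based implementation with mpi4py (the paper does not prescribe a particular parallel library; the bin count, update increment, and synchronization interval are placeholder values).

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
nbins = 100                      # discretization of the chosen system parameter
bias = np.zeros(nbins)           # global modifying potential (related to the free energy)
local_update = np.zeros(nbins)   # bias updates accumulated since the last synchronization
increment = 0.01                 # placeholder update size
sync_interval = 1000             # placeholder synchronization interval

for step in range(100000):
    b = np.random.randint(nbins)      # placeholder for the actual Monte Carlo move
    local_update[b] += increment      # penalize the visited bin
    if (step + 1) % sync_interval == 0:
        summed = np.empty_like(local_update)
        comm.Allreduce(local_update, summed, op=MPI.SUM)  # sum updates over all processors
        bias += summed                # every processor now sees what the others explored
        local_update[:] = 0.0

if comm.rank == 0:
    print("bias range:", bias.min(), bias.max())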

4.
Based on our critique of the requirements for performing an efficient molecular dynamics simulation with the particle-mesh Ewald (PME) implementation in GROMACS 4.5, we present a computational tool that enables the discovery of parameters producing a given accuracy in the PME approximation of the full electrostatics. Calculations on two parallel computers with different processor and communication structures showed that a given accuracy can be attained over a range of parameter space, and that the attributes of the hardware and of the simulation system control which parameter sets are optimal. This information can be used to find the fastest available PME parameter sets that achieve a given accuracy. We hope that this tool will stimulate future work to assess the impact of the quality of the PME approximation on simulation outcomes, particularly with regard to the trade-off between cost and scientific reliability in biomolecular applications.
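One ingredient of such a parameter search is fixing the Ewald splitting parameter from the real-space cutoff and the requested direct-space tolerance; the sketch below (not the paper's tool) solves the commonly used relation erfc(beta * r_cut) = rtol by bisection.

from math import erfc

def ewald_beta(r_cut, rtol=1e-5):
    """Solve erfc(beta * r_cut) = rtol for the splitting parameter beta by bisection."""
    lo, hi = 0.0, 10.0
    while erfc(hi * r_cut) > rtol:          # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if erfc(mid * r_cut) > rtol:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(ewald_beta(r_cut=1.0, rtol=1e-5))     # e.g. beta for a 1.0 nm cutoff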

5.
Evaluation of long-range Coulombic interactions still represents a bottleneck in molecular dynamics (MD) simulations of biological macromolecules. Despite the advent of sophisticated fast algorithms, such as the fast multipole method (FMM), accurate simulations still demand a great amount of computation time due to the accuracy/speed trade-off inherent in these algorithms. Unless higher-order multipole expansions, which are extremely expensive to evaluate, are employed, a large fraction of the execution time is still spent directly calculating particle-particle interactions within the nearby region of each particle. To reduce this execution time for pair interactions, we developed a computation unit (board), called MD-Engine II, that calculates nonbonded pairwise interactions using specially designed hardware. Four custom arithmetic processors and a processor for memory manipulation ("particle processor") are mounted on the computation board. The arithmetic processors are responsible for calculating the pair interactions, while the particle processor plays a central role in realizing efficient cooperation with the FMM. The results of a series of 50-ps MD simulations of a protein-water system (50,764 atoms) indicated that a more stringent accuracy setting in the FMM computation, compared with those previously reported, was required for accurate simulations over long time periods. Such a level of accuracy was efficiently achieved using the cooperative calculations of the FMM and MD-Engine II. On an Alpha 21264 PC, the FMM computation at a moderate but tolerable level of accuracy was accelerated by a factor of 16.0 using three boards. At a high level of accuracy, the cooperative calculation achieved a 22.7-fold acceleration over the corresponding conventional FMM calculation. In the cooperative calculations of the FMM and MD-Engine II, more accurate computation could be achieved at a comparable execution time by incorporating larger nearby regions.
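A plain software stand-in for the near-field work the board off-loads is sketched below (illustrative only, not the MD-Engine II implementation; reduced units, no periodic boundaries): the direct Coulomb sum over particle pairs within a cutoff, with the far field left to the FMM.

import numpy as np

def near_field_coulomb(pos, q, r_cut):
    """Direct Coulomb energy over all particle pairs closer than r_cut."""
    n = len(q)
    energy = 0.0
    for i in range(n - 1):
        d = pos[i + 1:] - pos[i]            # vectors to all later particles
        r = np.linalg.norm(d, axis=1)
        mask = r < r_cut                    # keep only the nearby region
        energy += np.sum(q[i] * q[i + 1:][mask] / r[mask])
    return energy

rng = np.random.default_rng(0)
pos = rng.random((500, 3)) * 5.0            # hypothetical coordinates
q = rng.choice([-1.0, 1.0], size=500)       # hypothetical unit charges
print(near_field_coulomb(pos, q, r_cut=1.0))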
