Similar literature
 20 similar documents found (search time: 31 ms)
1.
A message passing interface (MPI)-based implementation (Autodock4.lga.MPI) of the grid-based docking program Autodock4 has been developed to allow simultaneous and independent docking of multiple compounds on up to thousands of central processing units (CPUs) using the Lamarckian genetic algorithm. The MPI version reads a single binary file containing precalculated grids that represent the protein-ligand interactions, i.e., van der Waals, electrostatic, and desolvation potentials, and needs only two input parameter files for the entire docking run. In comparison, the serial version of Autodock4 reads ASCII grid files and requires one parameter file per compound. These modifications significantly reduce input/output activity compared with the serial version. Autodock4.lga.MPI scales up to 8192 CPUs with a maximal overhead of 16.3%, of which two thirds is due to input/output operations and one third originates from MPI operations. The optimal docking strategy, which minimizes docking CPU time without lowering the quality of the database enrichments, comprises the docking of ligands preordered from the most to the least flexible and the assignment of the number of energy evaluations as a function of the number of rotatable bonds. In 24 h, on 8192 high-performance computing CPUs, the present MPI version would allow docking to a rigid protein of about 300K small flexible compounds or 11 million rigid compounds.
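
The docking strategy described in this abstract (ligands preordered from most to least flexible, with an energy-evaluation budget that grows with the number of rotatable bonds) can be illustrated with a minimal Python sketch. The abstract does not give the actual function used, so the linear budget, its constants, and the names `Ligand`, `energy_evaluations`, and `docking_schedule` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ligand:
    name: str
    rotatable_bonds: int


def energy_evaluations(rotatable_bonds, base=250_000, per_bond=250_000, cap=5_000_000):
    """Hypothetical linear budget: more rotatable bonds, more energy evaluations.

    The constants are illustrative assumptions, not values from the paper.
    """
    return min(base + per_bond * rotatable_bonds, cap)


def docking_schedule(ligands):
    """Order ligands from most to least flexible and attach a budget to each."""
    ordered = sorted(ligands, key=lambda lig: lig.rotatable_bonds, reverse=True)
    return [(lig.name, energy_evaluations(lig.rotatable_bonds)) for lig in ordered]


library = [Ligand("lig_a", 2), Ligand("lig_b", 9), Ligand("lig_c", 0)]
schedule = docking_schedule(library)
for name, budget in schedule:
    print(name, budget)
```

Placing the most flexible (most expensive) ligands first is a common way to keep the last workers from being stuck on long jobs at the end of a run; whether that is the authors' exact motivation is not stated in the abstract.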

2.
A new parallel algorithm and its implementation for the RI‐MP2 energy calculation utilizing peta‐flop‐class many‐core supercomputers are presented. Two improvements over the previous algorithm (J. Chem. Theory Comput. 2013, 9, 5373) have been made: (1) a dual‐level hierarchical parallelization scheme that enables the use of more than 10,000 Message Passing Interface (MPI) processes and (2) a new data communication scheme that reduces network communication overhead. A multi‐node and multi‐GPU implementation of the present algorithm is presented for calculations on a central processing unit (CPU)/graphics processing unit (GPU) hybrid supercomputer. Benchmark results of the new algorithm and its implementation using the K computer (CPU clustering system) and TSUBAME 2.5 (CPU/GPU hybrid system) demonstrate high efficiency. The peak performance of 3.1 PFLOPS is attained using 80,199 nodes of the K computer. The peak performance of the multi‐node and multi‐GPU implementation is 514 TFLOPS using 1349 nodes and 4047 GPUs of TSUBAME 2.5. © 2016 Wiley Periodicals, Inc.

3.
Glide SP mode enrichment results for two preparations of the DUD dataset and native ligand docking RMSDs for two preparations of the Astex dataset are presented. Following a best-practices preparation scheme, an average RMSD of 1.140 Å for native ligand docking with Glide SP is computed. Following the same best-practices preparation scheme for the DUD dataset, an average area under the ROC curve (AUC) of 0.80 and average early enrichment via the ROC (0.1%) metric of 0.12 were observed. 74% and 56% of the 39 best-practices prepared targets showed AUC over 0.7 and 0.8, respectively. Average AUC was greater than 0.7 for all best-practices protein families, demonstrating consistent enrichment performance across a broad range of proteins and ligand chemotypes. In both Astex and DUD datasets, docking performance is significantly improved employing a best-practices preparation scheme over using minimally-prepared structures from the PDB. Enrichment results for WScore, a new scoring function and sampling methodology integrating WaterMap and Glide, are presented for four DUD targets: hivrt, hsp90, cdk2, and fxa. WScore performance in early enrichment is consistently strong, and all systems examined show AUC > 0.9 and superior early enrichment to DUD best-practices Glide SP results.
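
For readers unfamiliar with the AUC metric quoted in this abstract, the following Python sketch computes the area under the ROC curve as the probability that a randomly chosen active is scored better than a randomly chosen decoy, with ties counted as one half. It assumes that a higher score means a better rank (docking programs often report the opposite sign), and the function name `roc_auc` and the toy data are illustrative only.

```python
def roc_auc(scores, is_active):
    """ROC AUC via pairwise comparison of actives against decoys.

    Assumes higher score = better rank; negate the scores first if lower is better.
    """
    actives = [s for s, a in zip(scores, is_active) if a]
    decoys = [s for s, a in zip(scores, is_active) if not a]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Toy ranking: two actives among six compounds.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [True, False, True, False, False, False]
auc = roc_auc(scores, labels)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from decoys. The pairwise loop is quadratic, which is fine for illustration; a rank-based formula would be preferable for large decoy sets.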

4.
Recent advances in high performance computing (HPC) resources have opened the possibility of expanding the scope of density functional theory (DFT) simulations toward large and complex molecular systems. This work proposes a numerically robust method that enables scalable diagonalization of large DFT Hamiltonian matrices, particularly on the thousands of computing cores that are typical of today's HPC resources. The well‐known Lanczos method is extensively refactored to overcome its weakness in evaluating multiple degenerate eigenpairs, which is the substance of DFT simulations, and a multilevel parallelization is adopted so that simulations scale to as many cores as possible. With solid benchmark tests on realistic molecular systems, the fidelity of our method is validated against the locally optimal block preconditioned conjugate gradient (LOBPCG) method that is widely used to simulate electronic structures. Our method may waste computing resources for simulations of molecules whose degeneracy cannot be reasonably estimated. Compared with the LOBPCG method, however, it performs well in both speed and scalability, and in particular its performance is remarkably less sensitive (< 10%) to the random nature of the initial basis vectors. As a promising candidate for solving electronic structures of highly degenerate systems, the proposed method can make a meaningful contribution to migrating DFT simulations toward extremely large computing environments that normally have more than several tens of thousands of computing cores.

5.
Ligand enrichment among top-ranking hits is a key metric of virtual screening. To avoid bias, decoys should resemble ligands physically, so that enrichment is not attributable to simple differences of gross features. We therefore created a directory of useful decoys (DUD) by selecting decoys that resembled annotated ligands physically but not topologically to benchmark docking performance. DUD has 2950 annotated ligands and 95,316 property-matched decoys for 40 targets. It is by far the largest and most comprehensive public data set for benchmarking virtual screening programs that I am aware of. This paper outlines several ways that DUD can be improved to provide better telemetry to investigators seeking to understand both the strengths and the weaknesses of current docking methods. I also highlight several pitfalls for the unwary: a risk of over-optimization, questions about chemical space, and the proper scope for using DUD. Careful attention to both the composition of benchmarks and how they are used is essential to avoid being misled by overfitting and bias.

6.
Results of a previous docking study are reanalyzed and extended to include results from the docking program FRED and a detailed statistical analysis of both structure reproduction and virtual screening results. FRED is run both in a traditional docking mode and in a hybrid mode that makes use of the structure of a bound ligand in addition to the protein structure to screen molecules. This analysis shows that most docking programs are effective overall but highly inconsistent, tending to do well on one system and poorly on the next. Comparing methods, the difference in mean performance on DUD is found to be statistically significant (95% confidence) 61% of the time when using a global enrichment metric (AUC). Early enrichment metrics are found to have relatively poor statistical power, with 0.5% early enrichment only able to distinguish methods to 95% confidence 14% of the time.
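
An early enrichment metric of the kind mentioned in this abstract can be sketched as an enrichment factor: the hit rate among the top-ranked fraction of the library divided by the hit rate in the whole library. The abstract does not specify the exact definition used in the study, so the formula below is the common textbook form, and the function name `enrichment_factor` and the synthetic library are assumptions for illustration.

```python
import math


def enrichment_factor(scores, is_active, fraction):
    """Enrichment factor at the given top fraction (e.g. 0.005 for 0.5%).

    Assumes higher score = better rank. At least one compound is always kept.
    """
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_total = len(ranked)
    n_actives = sum(1 for _, active in ranked if active)
    if n_actives == 0:
        raise ValueError("library contains no actives")
    # Small epsilon guards against floating-point overshoot before the ceiling.
    n_top = max(1, math.ceil(fraction * n_total - 1e-9))
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    return (hits_top * n_total) / (n_top * n_actives)


# Synthetic library: 200 compounds, every 20th one active, best score first.
scores = list(range(200, 0, -1))
labels = [i % 20 == 0 for i in range(200)]
ef_half_percent = enrichment_factor(scores, labels, 0.005)
print(ef_half_percent)
```

With 200 compounds, the top 0.5% is a single molecule, so the metric can only take the values 0 or 20 here. That coarseness on small selections is consistent with the abstract's observation that early enrichment metrics have relatively poor statistical power.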

7.
The programs ESCF, EGRAD, and AOFORCE are parts of the TURBOMOLE program package and compute excited-state properties and ground-state geometric hessians, respectively, for Hartree-Fock and density functional methods. The range of applicability of these programs has been extended by allowing them to use all CPU cores on a given node in parallel. The parallelization strategy is not new and duplicates what is standard today in the calculation of ground-state energies and gradients. The focus is on how this can be achieved without needing extensive modifications of the existing serial code. The key ingredient is to fork off worker processes with separated address spaces as they are needed. Test calculations on a molecule with about 80 atoms and 1000 basis functions show good parallel speedup up to 32 CPU cores.

8.
Structure‐based virtual screening usually involves docking of a library of chemical compounds onto the functional pocket of the target receptor so as to discover novel classes of ligands. However, the overall success rate remains low and screening a large library is computationally intensive. An alternative to this “ab initio” approach is virtual screening by binding homology search. In this approach, potential ligands are predicted based on similar interaction pairs (similarity in receptors and ligands). SPOT‐Ligand is an approach that integrates ligand similarity, measured by the Tanimoto coefficient, and receptor similarity, measured by the protein structure alignment program SPalign. The method was found to yield a consistent performance in DUD and DUD‐E docking benchmarks even if model structures were employed. It improves over docking methods (DOCK6 and AUTODOCK Vina) and has a performance comparable to or better than other binding‐homology methods (FINDsite and PoLi) with higher computational efficiency. The server is available at http://sparks-lab.org. © 2016 Wiley Periodicals, Inc.
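
The Tanimoto coefficient named in this abstract is the size of the intersection of two fingerprints divided by the size of their union. The sketch below represents each fingerprint as a set of "on" bit indices. How SPOT‐Ligand builds its fingerprints is not described in the abstract, so the bit sets and the function name `tanimoto` are purely illustrative.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention assumed here: two empty fingerprints have similarity 0.
        return 0.0
    return len(a & b) / union


query = {1, 2, 3, 4}
candidate = {3, 4, 5, 6}
similarity = tanimoto(query, candidate)
print(round(similarity, 3))
```

The value ranges from 0 (no shared bits) to 1 (identical fingerprints), which makes it convenient to combine with other normalized similarity scores.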

9.
The molecular dynamics simulation package GROMACS runs efficiently on a wide variety of hardware from commodity workstations to high performance computing clusters. Hardware features are well‐exploited with a combination of single instruction multiple data, multithreading, and message passing interface (MPI)‐based single program multiple data/multiple program multiple data parallelism while graphics processing units (GPUs) can be used as accelerators to compute interactions off‐loaded from the CPU. Here, we evaluate which hardware produces trajectories with GROMACS 4.6 or 5.0 in the most economical way. We have assembled and benchmarked compute nodes with various CPU/GPU combinations to identify optimal compositions in terms of raw trajectory production rate, performance‐to‐price ratio, energy efficiency, and several other criteria. Although hardware prices are naturally subject to trends and fluctuations, general tendencies are clearly visible. Adding any type of GPU significantly boosts a node's simulation performance. For inexpensive consumer‐class GPUs this improvement equally reflects in the performance‐to‐price ratio. Although memory issues in consumer‐class GPUs could pass unnoticed as these cards do not support error checking and correction memory, unreliable GPUs can be sorted out with memory checking tools. Apart from the obvious determinants for cost‐efficiency like hardware expenses and raw performance, the energy consumption of a node is a major cost factor. Over the typical hardware lifetime until replacement of a few years, the costs for electrical power and cooling can become larger than the costs of the hardware itself. Taking that into account, nodes with a well‐balanced ratio of CPU and consumer‐class GPU resources produce the maximum amount of GROMACS trajectory over their lifetime. © 2015 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.

10.
Fully ab initio treatment of complex solid systems needs computational software which is able to efficiently take advantage of the growing power of high performance computing (HPC) architectures. Recent improvements in CRYSTAL, a periodic ab initio code that uses a Gaussian basis set, allow treatment of very large unit cells for crystalline systems on HPC architectures with high parallel efficiency in terms of running time and memory requirements. The latter is a crucial point, due to the trend toward architectures relying on a very high number of cores with associated relatively low memory availability. An exhaustive performance analysis shows that density functional calculations, based on a hybrid functional, of low‐symmetry systems containing up to 100,000 atomic orbitals and 8000 atoms are feasible on the most advanced HPC architectures available to European researchers today, using thousands of processors. © 2012 Wiley Periodicals, Inc.

11.
Using a grid‐based method to search for the critical points of the electron density, we show how to accelerate such a method with graphics processing units (GPUs). When the GPU implementation is compared with the one running on central processing units (CPUs), we find a large difference in elapsed time between the two implementations, with the shortest times observed on GPUs. We tested two GPUs, one designed for video games and the other for high‐performance computing (HPC). On the CPU side, two processors were tested, one used in common personal computers and the other used for HPC, both of the latest generation. Although our parallel algorithm scales quite well on CPUs, the same implementation on GPUs runs around 10× faster than 16 CPUs, with any of the tested GPUs and CPUs. We have found that a GPU intended for video games can be used without any problem for our application, delivering remarkable performance; in fact, this GPU competes with an HPC GPU, in particular when single precision is used. © 2014 Wiley Periodicals, Inc.

12.
In conjunction with the recent American Chemical Society symposium titled "Docking and Scoring: A Review of Docking Programs" the performance of the DOCK6 program was evaluated through (1) pose reproduction and (2) database enrichment calculations on a common set of organizer-specified systems and datasets (ASTEX, DUD, WOMBAT). Representative baseline grid score results averaged over five docking runs yield a relatively high pose identification success rate of 72.5% (symmetry corrected rmsd) and sampling rate of 91.9% for the multi site ASTEX set (N = 147) using organizer-supplied structures. Numerous additional docking experiments showed that ligand starting conditions, symmetry, multiple binding sites, clustering, and receptor preparation protocols all affect success. Encouragingly, in some cases, use of more sophisticated scoring and sampling methods yielded results which were comparable (Amber score ligand movable protocol) or exceeded (LMOD score) analogous baseline grid-score results. The analysis highlights the potential benefit and challenges associated with including receptor flexibility and indicates that different scoring functions have system dependent strengths and weaknesses. Enrichment studies with the DUD database prepared using the SB2010 preparation protocol and native ligand pairings yielded individual area under the curve (AUC) values derived from receiver operating characteristic curve analysis ranging from 0.29 (bad enrichment) to 0.96 (good enrichment) with an average value of 0.60 (27/38 have AUC ≥ 0.5). Strong early enrichment was also observed in the critically important 1.0-2.0% region. Somewhat surprisingly, an alternative receptor preparation protocol yielded comparable results. As expected, semi-random pairings yielded poorer enrichments, in particular, for unrelated receptors. Overall, the breadth and number of experiments performed provide a useful snapshot of current capabilities of DOCK6 as well as starting points to guide future development efforts to further improve sampling and scoring.

13.
The program VinaMPI has been developed to enable massively large virtual drug screens on leadership‐class computing resources, using a large number of cores to decrease the time‐to‐completion of the screen. VinaMPI is a massively parallel Message Passing Interface (MPI) program based on the multithreaded virtual docking program AutodockVina, and is used to distribute tasks while multithreading is used to speed‐up individual docking tasks. VinaMPI uses a distribution scheme in which tasks are evenly distributed to the workers based on the complexity of each task, as defined by the number of rotatable bonds in each chemical compound investigated. VinaMPI efficiently handles multiple proteins in a ligand screen, allowing for high‐throughput inverse docking that presents new opportunities for improving the efficiency of the drug discovery pipeline. VinaMPI successfully ran on 84,672 cores with a continual decrease in job completion time with increasing core count. The ratio of the number of tasks in a screening to the number of workers should be at least around 100 in order to have a good load balance and an optimal job completion time. The code is freely available and downloadable. Instructions for downloading and using the code are provided in the Supporting Information. © 2013 Wiley Periodicals, Inc.
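
The idea of distributing docking tasks evenly by complexity, with complexity measured by the number of rotatable bonds, can be sketched with a generic greedy heuristic: sort tasks from most to least complex and always hand the next task to the currently least-loaded worker. This is not taken from the VinaMPI source; the cost model (rotatable bonds plus one) and the function name `distribute` are assumptions made for illustration.

```python
import heapq


def distribute(tasks, n_workers):
    """Greedy longest-task-first assignment of (name, rotatable_bonds) tasks.

    Returns (assignment, loads): ligand names per worker and each worker's total cost.
    The cost of a task is assumed to be rotatable_bonds + 1 so rigid ligands are not free.
    """
    heap = [(0, worker) for worker in range(n_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_workers)]
    loads = [0] * n_workers
    for name, bonds in sorted(tasks, key=lambda task: task[1], reverse=True):
        load, worker = heapq.heappop(heap)  # least-loaded worker so far
        assignment[worker].append(name)
        loads[worker] = load + bonds + 1
        heapq.heappush(heap, (loads[worker], worker))
    return assignment, loads


tasks = [(f"lig_{bonds}", bonds) for bonds in range(1, 9)]
assignment, loads = distribute(tasks, n_workers=2)
print(assignment, loads)
```

With many more tasks than workers, individual large tasks matter less and the loads even out, which is in line with the abstract's recommendation of roughly 100 tasks per worker.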

14.
A new three‐dimensional reference interaction site model (3D‐RISM) program for massively parallel machines combined with the volumetric 3D fast Fourier transform (3D‐FFT) was developed and tested on the RIKEN K supercomputer. The ordinary parallel 3D‐RISM program is limited in its degree of parallelization because of the limitations of the slab‐type 3D‐FFT. The volumetric 3D‐FFT relieves this limitation drastically. We tested the 3D‐RISM calculation on a large and fine calculation cell (2048³ grid points) on 16,384 nodes, each having eight CPU cores. The new 3D‐RISM program achieved excellent parallel scalability on the RIKEN K supercomputer. As a benchmark application, we employed the program, combined with molecular dynamics simulation, to analyze the oligomerization process of a chymotrypsin inhibitor 2 mutant. The results demonstrate that the massively parallel 3D‐RISM program is effective for analyzing the hydration properties of large biomolecular systems. © 2014 Wiley Periodicals, Inc.

15.
16.
Ligand-based shape matching approaches have become established as important and popular virtual screening (VS) techniques. However, despite their relative success, many authors have discussed how best to choose the initial query compounds and which of their conformations should be used. Furthermore, it is increasingly the case that pharmaceutical companies have multiple ligands for a given target and these may bind in different ways to the same pocket. Conversely, a given ligand can sometimes bind to multiple targets, and this is clearly of great importance when considering drug side-effects. We recently introduced the notion of spherical harmonic-based "consensus shapes" to help deal with these questions. Here, we apply a consensus shape clustering approach to the 40 protein-ligand targets in the DUD data set using PARASURF/PARAFIT. Results from clustering show that in some cases the ligands for a given target are split into two subgroups which could suggest they bind to different subsites of the same target. In other cases, our clustering approach sometimes groups together ligands from different targets, and this suggests that those ligands could bind to the same targets. Hence spherical harmonic-based clustering can rapidly give cross-docking information while avoiding the expense of performing all-against-all docking calculations. We also report on the effect of the query conformation on the performance of shape-based screening of the DUD data set and the potential gain in screening performance by using consensus shapes calculated in different ways. We provide details of our analysis of shape-based screening using both PARASURF/PARAFIT and ROCS, and we compare the results obtained with shape-based and conventional docking approaches using MSSH/SHEF and GOLD. The utility of each type of query is analyzed using commonly reported statistics such as enrichment factors (EF) and receiver-operator-characteristic (ROC) plots as well as other early performance metrics.  

17.
An outline of the improvements to the pseudospectral electronic structure program Jaguar is presented, showing efficient and robust performance of hybrid‐DFT calculations for large systems with thousands of basis functions, focusing on materials applications. The improvements include re‐engineered parallelization, the design of a fragment‐based initial guess generation method, and the validation of small eigenvalue cutoff values. An OpenMP/MPI hybrid parallelization has been implemented for the pseudospectral algorithm, which extends Jaguar's scalability to up to 256 cores in tests of TiO2 clusters with 1295‐4961 basis functions. In the largest test case, the code delivers 84.4× speedup for 128 cores in total calculation time. In addition, a fragment‐based initial guess method has been constructed for large systems containing many transition metals, where the conventional (atomic) approach often fails. Overall, Jaguar is now capable of efficiently and robustly performing hybrid‐DFT geometrical optimizations for large systems with more than 600 atoms in reasonable runtimes. © 2015 Wiley Periodicals, Inc.
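
The speedup figure quoted in this abstract can be turned into a parallel efficiency, defined here in the usual way as speedup divided by the number of cores. The short Python sketch below does the arithmetic for the reported 84.4× speedup on 128 cores, assuming the speedup is measured relative to a single core, which the abstract does not state explicitly. The function name `parallel_efficiency` is illustrative.

```python
def parallel_efficiency(speedup, cores):
    """Parallel efficiency = speedup / cores (1.0 means ideal linear scaling)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


efficiency = parallel_efficiency(84.4, 128)
print(f"{efficiency:.1%}")
```

Under that assumption the largest test case runs at roughly two thirds of ideal linear scaling on 128 cores.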

18.
We investigated the performance of heterogeneous computing with graphics processing units (GPUs) and many integrated core (MIC) with 20 CPU cores (20×CPU). As a practical example toward large scale electronic structure calculations using grid‐based methods, we evaluated the Hartree potentials of silver nanoparticles with various sizes (3.1, 3.7, 4.9, 6.1, and 6.9 nm) via a direct integral method supported by the sinc basis set. The so‐called work stealing scheduler was used for efficient heterogeneous computing via the balanced dynamic distribution of workloads between all processors on a given architecture without any prior information on their individual performances. 20×CPU + 1GPU was up to ~1.5 and ~3.1 times faster than 1GPU and 20×CPU, respectively. 20×CPU + 2GPU was ~4.3 times faster than 20×CPU. The performance enhancement by CPU + MIC was considerably lower than expected because of the large initialization overhead of MIC, although its theoretical performance is similar to that of CPU + GPU. © 2016 Wiley Periodicals, Inc.

19.
A new parallel algorithm has been developed for calculating the analytic energy derivatives of full accuracy second order Møller‐Plesset perturbation theory (MP2). Its main projected application is the optimization of geometries of large molecules, in which noncovalent interactions play a significant role. The algorithm is based on the two‐step MP2 energy calculation algorithm developed recently and implemented into the quantum chemistry program, GAMESS. Timings are presented for test calculations on taxol (C47H51NO14) with the 6‐31G and 6‐31G(d) basis sets (660 and 1032 basis functions, 328 correlated electrons) and luciferin (C11H8N2O3S2) with aug‐cc‐pVDZ and aug‐cc‐pVTZ (530 and 1198 basis functions, 92 correlated electrons). The taxol 6‐31G(d) calculations are also performed with up to 80 CPU cores. The results demonstrate the high parallel efficiency of the program. © 2007 Wiley Periodicals, Inc. J Comput Chem, 2007

20.
Benchmarks for molecular docking have historically focused on re-docking the cognate ligand of a well-determined protein-ligand complex to measure geometric pose prediction accuracy, and measurement of virtual screening performance has been focused on increasingly large and diverse sets of target protein structures, cognate ligands, and various types of decoy sets. Here, pose prediction is reported on the Astex Diverse set of 85 protein ligand complexes, and virtual screening performance is reported on the DUD set of 40 protein targets. In both cases, prepared structures of targets and ligands were provided by symposium organizers. The re-prepared data sets yielded results not significantly different from previous reports of Surflex-Dock on the two benchmarks. Minor changes to protein coordinates resulting from complex pre-optimization had large effects on observed performance, highlighting the limitations of cognate ligand re-docking for pose prediction assessment. Docking protocols developed for cross-docking, which address protein flexibility and produce discrete families of predicted poses, produced substantially better performance for pose prediction. Virtual screening performance was shown to benefit from employing and combining multiple screening methods: docking, 2D molecular similarity, and 3D molecular similarity. In addition, use of multiple protein conformations significantly improved screening enrichment.
