Similar articles
Found 20 similar articles (search time: 15 ms)
1.
We discuss issues in developing scalable parallel algorithms and focus on the distribution, as opposed to the replication, of key data structures. Replication of large data structures limits the maximum calculation size by imposing a low ratio of processors to memory. Only applications which distribute both data and computation across processors are truly scalable. The use of shared data structures that may be independently accessed by each process even in a distributed memory environment greatly simplifies development and provides a significant performance enhancement. We describe tools we have developed to support this programming paradigm. These tools are used to develop a highly efficient and scalable algorithm to perform self-consistent field calculations on molecular systems. A simple and classical strip-mining algorithm suffices to achieve an efficient and scalable Fock matrix construction in which all matrices are fully distributed. By strip mining over atoms, we also exploit all available sparsity and pave the way to adopting more sophisticated methods for summation of the Coulomb and exchange interactions. © 1996 by John Wiley & Sons, Inc.
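
The strip-mining idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a closed-shell two-electron contribution G[m][n] = sum over l,s of D[l][s] * ((mn|ls) - 0.5 * (ml|ns)), a dense in-memory integral array `eri`, and strips given as lists of basis-function indices per atom. The names `build_g_strip` and `build_g` are hypothetical. Each strip is computed independently, which is the property that would let a distributed code assign strips to different processes.

```python
def build_g_strip(eri, density, rows):
    """Compute the rows `rows` of G, where
    G[m][n] = sum_{l,s} D[l][s] * ((mn|ls) - 0.5 * (ml|ns)).

    `eri[m][n][l][s]` holds the integral (mn|ls); this dense layout is an
    assumption made for illustration only.
    """
    size = len(density)
    strip = {}
    for m in rows:
        strip[m] = [
            sum(
                density[l][s] * (eri[m][n][l][s] - 0.5 * eri[m][l][n][s])
                for l in range(size)
                for s in range(size)
            )
            for n in range(size)
        ]
    return strip


def build_g(eri, density, atom_blocks):
    """Assemble G strip by strip; `atom_blocks` lists the basis-function
    indices belonging to each atom (hypothetical input format)."""
    g = [None] * len(density)
    for rows in atom_blocks:  # each strip needs no data from other strips of G
        for m, row in build_g_strip(eri, density, rows).items():
            g[m] = row
    return g
```

Because the strips do not depend on one another, the result is the same for any grouping of rows into blocks; a real code would additionally skip negligible integrals to exploit sparsity.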

2.
There are now a wide variety of packages for electronic structure calculations, each of which differs in the algorithms implemented and the output format. Many computational chemistry algorithms are only available to users of a particular package despite being generally applicable to the results of calculations by any package. Here we present cclib, a platform for the development of package-independent computational chemistry algorithms. Files from several versions of multiple electronic structure packages are automatically detected, parsed, and the extracted information converted to a standard internal representation. A number of population analysis algorithms have been implemented as a proof of principle. In addition, cclib is currently used as an input filter for two GUI applications that analyze output files: PyMOlyze and GaussSum.

3.
We extend the spin-adapted density matrix renormalization group (DMRG) algorithm of McCulloch and Gulacsi [Europhys. Lett. 57, 852 (2002)] to quantum chemical Hamiltonians. This involves using a quasi-density matrix, to ensure that the renormalized DMRG states are eigenfunctions of Ŝ², and the Wigner-Eckart theorem, to reduce overall storage and computational costs. We argue that the spin-adapted DMRG algorithm is most advantageous for low spin states. Consequently, we also implement a singlet-embedding strategy due to Tatsuaki [Phys. Rev. E 61, 3199 (2000)] where we target high spin states as a component of a larger fictitious singlet system. Finally, we present an efficient algorithm to calculate one- and two-body reduced density matrices from the spin-adapted wavefunctions. We evaluate our developments with benchmark calculations on transition metal system active space models. These include the Fe2S2, [Fe2S2(SCH3)4]²⁻, and Cr2 systems. In the case of Fe2S2, the spin-ladder spacing is on the microhartree scale, and here we show that we can target such very closely spaced states. In [Fe2S2(SCH3)4]²⁻, we calculate particle and spin correlation functions, to examine the role of sulfur bridging orbitals in the electronic structure. In Cr2 we demonstrate that spin-adaptation with the Wigner-Eckart theorem and using singlet embedding can yield up to an order of magnitude increase in computational efficiency. Overall, these calculations demonstrate the potential of using spin-adaptation to extend the range of DMRG calculations in complex transition metal problems.

4.
The one-electron density matrix (DM) of crystalline systems is discussed, especially concerning its long-range behavior; reference is made throughout to systems treated at a Hartree–Fock–LCAO–SCF level of approximation. The analysis is performed on the assumption of generally smooth behavior of eigenvalues and eigenvectors in k (reciprocal) space, so that they can be expressed by means of a truncated Fourier expansion. This assumption allows us to obtain analytic approximations for the DM on the basis of the information collected at a few suitably selected sampling k points. It is therefore possible at the same time to discuss the influence of structural parameters (dimensionality of the system, existence and shape of the Fermi surface, structure of the chemical bonds) and to set up a computational scheme that is general and simple enough.

5.
A parallel Fock matrix construction program for a hierarchical network has been developed on the molecular orbital calculation-specific EHPC system. To obtain high parallelization efficiency on the hierarchical network system, a multilevel dynamic load-balancing scheme was adopted, which provides equal load balance and localization of communications on a tree-structured hierarchical network. The parallelized Fock matrix construction routine was implemented in the GAMESS program on the EHPC system, which has a tree-structured hierarchical network. Benchmark results on a 63-processor system showed high parallelization efficiency even on the tree-structured hierarchical network.

6.
A new and experimental photodensitometer designed for quantitative chromatography is described. The principal features of the instrument were based upon the results of an extensive theoretical analysis and incorporate a mechanical arrangement for the production of a flying spot and an optical path in which two beams of light are separated after interaction with the medium. The device is constructed so as to be suitable for operation in the three principal modes: in reflectance measurements only the ratio of the beam signals is formed, whilst in transmittance measurements the ratio is converted to logarithmic form; in the fluorescence mode only a single beam is used. The spectral range of the instrument extends from the red end of the visible spectrum to the medium ultraviolet, and quartz optics are utilized in most of the optic elements. A quartz halogen lamp and a xenon-mercury lamp may be used alternatively as the light source. Changeable interference filters are employed to determine the spectral position of the light beams, and semiconductor photo-diodes with sensitivities extending into the ultraviolet are used as photo-detectors. In the determination of the sensitivity limits of the device the photo-diodes were replaced by photomultipliers, and the apparatus was shown to fulfil most of the calculated theoretical predictions.

7.
We present parallelization of a quantum-chemical tree-code for linear scaling computation of the Coulomb matrix. Equal time partition is used to load balance computation of the Coulomb matrix. Equal time partition is a measurement-based algorithm for domain decomposition that exploits small variation of the density between self-consistent-field cycles to achieve load balance. Efficiency of the equal time partition is illustrated by several tests involving both finite and periodic systems. It is found that equal time partition is able to deliver 91%-98% efficiency with 128 processors in the most time-consuming part of the Coulomb matrix calculation. The current parallel quantum chemical tree code is able to deliver 63%-81% overall efficiency on 128 processors with fine-grained parallelism (less than two heavy atoms per processor).
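
The measurement-based idea behind equal time partition can be sketched in one dimension. This is a deliberate simplification, not the paper's spatial domain decomposition: it assumes the work is an ordered list of items whose cost was timed in the previous SCF cycle, and it cuts the list so that each processor receives roughly the same measured time. The function name `equal_time_partition` is hypothetical.

```python
from itertools import accumulate


def equal_time_partition(measured_times, n_parts):
    """Split contiguous work items into n_parts chunks of roughly equal
    measured time.

    `measured_times[i]` is the time item i took in the previous cycle.
    Returns (start, stop) index pairs that together cover all items.
    """
    total = sum(measured_times)
    prefix = list(accumulate(measured_times))  # cumulative time up to each item
    bounds = [0]
    item = 0
    for part in range(1, n_parts):
        target = total * part / n_parts  # cumulative time this cut should reach
        while item < len(prefix) and prefix[item] < target:
            item += 1
        bounds.append(min(item + 1, len(measured_times)))
    bounds.append(len(measured_times))
    return list(zip(bounds[:-1], bounds[1:]))
```

Because the density changes little between cycles, timings from one cycle are a reasonable predictor for the next, so the cuts can be recomputed each cycle at low cost. A chunk can come out empty when a single item dominates the total time.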

8.
It is shown that compression of two-electron integrals and their indices significantly improves the efficiency of the conventional self-consistent field (SCF) algorithm for solving the Hartree-Fock equation by decreasing the Fock matrix calculation time. The improvement is achieved not only through a reduction of the integral file size, but mainly because data compression reduces or can even eliminate a cache conflict in data transfer from the hard drive to the main computer memory. Thus, the conventional SCF algorithm with data compression becomes very efficient and permits large-scale Hartree-Fock calculations to be carried out. The largest Hartree-Fock calculations have been performed for the RNA structure 433D from the PDB data bank with 6080 basis functions formed from the 6-31G basis on a workstation with a 1 GHz Alpha processor.

9.
We present an outline of the parallel implementation of our pseudospectral electronic structure program, Jaguar, including the algorithm and timings for the Hartree–Fock and analytic gradient portions of the program. We also present the parallel algorithm and timings for our Lanczos eigenvector refinement code and demonstrate that its performance is superior to the ScaLAPACK diagonalization routines. The overall efficiency of our code increases as the size of the calculation is increased, demonstrating actual as well as theoretical scalability. For our largest test system, alanine pentapeptide [818 basis functions in the cc-pVTZ(-f) basis set], our Fock matrix assembly procedure has an efficiency of nearly 90% on a 16-processor SP2 partition. The SCF portion for this case (including eigenvector refinement) has an overall efficiency of 87% on a partition of 8 processors and 74% on a partition of 16 processors. Finally, our parallel gradient calculations have a parallel efficiency of 84% on 8 processors for porphine (430 basis functions). © 1998 John Wiley & Sons, Inc. J Comput Chem 19: 1017–1029, 1998

10.
A parallel Fock matrix construction program for the FMO‐MO method has been developed with the distributed shared memory model. To construct a large‐sized Fock matrix during FMO‐MO calculations, a distributed parallel algorithm was designed to make full use of local memory to reduce communication, and was implemented on the Global Array toolkit. A benchmark calculation for a small system indicates that the parallelization efficiency of the matrix construction portion is as high as 93% on 1,024 processors. A large FMO‐MO application on the epidermal growth factor receptor (EGFR) protein (17,246 atoms and 96,234 basis functions) was also carried out at the HF/6‐31G level of theory, with the frontier orbitals being extracted by a Sakurai‐Sugiura eigensolver. It takes 11.3 h for the FMO calculation, 49.1 h for the Fock matrix construction, and 10 min to extract 94 eigen‐components on a PC cluster system using 256 processors. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2010

11.
We developed a novel parallel algorithm for large-scale Fock matrix calculation with small locally distributed memory architectures, and named it the "RT parallel algorithm." The RT parallel algorithm actively involves the concept of integral screening, which is indispensable for reduction of computing times with large-scale biological molecules. The primary characteristic of this algorithm is parallel efficiency, which is achieved by well-balanced reduction of both communication and computation volume. Only the density matrix data necessary for Fock matrix calculations are communicated, and the data once communicated are reused for calculations as many times as possible. The RT parallel algorithm is a scalable method because the required memory volume does not depend on the number of basis functions. This algorithm automatically includes a partial summing technique that is indispensable for maintaining computing accuracy, and can also include some conventional methods to reduce calculation times. In our analysis, the RT parallel algorithm had better performance than other methods for massively parallel processors. The RT parallel algorithm is most suitable for massively parallel and distributed Fock matrix calculations for large-scale biological molecules with thousands of basis functions or more.

12.
A direct comparison is made between two recently proposed methods for linear scaling computation of the Hartree–Fock exchange matrix to investigate the importance of exploiting two-electron integral permutational symmetry. Calculations on three-dimensional water clusters and graphitic sheets with different basis sets and levels of accuracy are presented to identify specific cases where permutational symmetry may or may not be useful. We conclude that a reduction in integrals via permutational symmetry does not necessarily translate into a reduction in computation times. For large insulating systems and weakly contracted basis sets the advantage of permutational symmetry is found to be negligible, while for noninsulating systems and highly contracted basis sets a fourfold speedup is approached. Received: 8 October 1999 / Accepted: 3 January 2000 / Published online: 21 June 2000
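
For orientation, the reduction in integral count that permutational symmetry offers can be counted directly. The sketch below assumes real basis functions, for which (ij|kl) is unchanged under swapping i with j, k with l, and the pair ij with the pair kl (eightfold symmetry). It only counts unique integrals and is unrelated to the two methods compared in the abstract; all function names are hypothetical.

```python
from itertools import product


def canonical(i, j, k, l):
    """Canonical representative of (ij|kl) under the eightfold symmetry
    that holds for real basis functions."""
    ij = (max(i, j), min(i, j))
    kl = (max(k, l), min(k, l))
    return max(ij, kl) + min(ij, kl)


def count_unique(n):
    """Count symmetry-unique integrals by brute-force enumeration."""
    return len({canonical(*q) for q in product(range(n), repeat=4)})


def closed_form(n):
    """Same count from the formula: P = n(n+1)/2 index pairs, P(P+1)/2 integrals."""
    pairs = n * (n + 1) // 2
    return pairs * (pairs + 1) // 2
```

The ratio n**4 / closed_form(n) approaches 8 for large n. This is an upper bound on the saving in integral evaluations only; as the abstract concludes, it does not necessarily translate into the same reduction in computation time.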

13.
Cuby is a computational chemistry framework written in the Ruby programming language. It provides unified access to a wide range of computational methods by interfacing external software and it implements various protocols that operate on their results. Using structured input files, elementary calculations can be combined into complex workflows. For users, Cuby provides a unified and user-friendly way to automate their work, seamlessly integrating calculations carried out in different computational chemistry programs. For example, the QM/MM module allows combining methods across the interfaced programs and the built-in molecular dynamics engine makes it possible to run a simulation on the resulting potential. For programmers, it provides a high‐level, object‐oriented environment that allows rapid development and testing of new methods and computational protocols. The Cuby framework is available for download at http://cuby4.molecular.cz . © 2016 Wiley Periodicals, Inc.

14.
Here, we present just a collection of beans (JACOB): an integrated batch‐based framework designed for the rapid development of computational chemistry applications. The framework expedites developer productivity by handling the generic infrastructure tier, and can be easily extended by user‐specific scientific code. Paradigms from enterprise software engineering were rigorously applied to create a scalable, testable, secure, and robust framework. A centralized web application is used to configure and control the operation of the framework. The application‐programming interface provides a set of generic tools for processing large‐scale noninteractive jobs (e.g., systematic studies), or for coordinating systems integration (e.g., complex workflows). The code for the JACOB framework is open sourced and is available at: www.wallerlab.org/jacob . © 2013 Wiley Periodicals, Inc.

15.
Reaction pathways for the formation of the zirconocene phosphinidene complex Cp2Zr(PR3)PR from Cp2ZrCl2, LiH, and LiPRH and its reactivity toward 1,2-dichloroethane are explored with density functional theory using model structures that are devoid of substituents. After the initial Cp2Zr(Cl)PH2 is generated with LiPH2, reaction with LiH is likely to eliminate HCl in a single step to give directly the 16-electron complex Cp2ZrPH, which is stabilized by the PH3 phosphine ligand. The intermediate formation of a phosphine hydride complex, Cp2Zr(H)PH2, resulting from hydride substitution, is unlikely both on the basis of unfavorable reaction energies and calculated 31P NMR chemical shifts that indicate that such a species cannot have been observed experimentally. It is likely that a diphosphine complex, Cp2Zr(PH2)2, results on using an excess of the lithium phosphide, which on H-transfer gives directly the phosphine-stabilized phosphinidene complex. The reactivity of this species is dominated by the release of its stabilizing phosphine ligand to give a highly reactive 16-electron phosphinidene complex, Cp2ZrPH, which reacts with 1,2-dichloroethane after coordination to one of the chlorine atoms in two asynchronous metathesis steps to the three-membered phosphirane ring. In this process, ZrCl2 is reformed, enabling its recycling to regenerate the phosphinidene complex. This study highlights the special reactivity of the 16-electron Cp2ZrPH and suggests that related complexes may be generated similarly, thereby expanding the synthetic potential of these nucleophilic reagents.

16.
A computational chemistry study of the artificial redox enzyme synthesized by covalently attaching flavin to cyclodextrins explains some of its properties. Calculations indicate that the flavin moiety covalently attached to cyclodextrin is not within the cavity of cyclodextrin. This result is consistent with the UV-vis spectrum of the artificial enzyme. The calculations also indicate that hydrogen bonds formed between the carbonyl groups of the catalytic functionality and the hydroxyl groups of cyclodextrin play a role in their most stable conformation. This explains the observed overall stability of these artificial enzymes compared to riboflavin. Electrostatic energies and solvation energies play a major role in the stability of the hosts and the orientation of guests included within the artificial enzymes. The rates of oxidation of various thiols catalyzed by the artificial enzyme can be explained by the relative distances between the sulfur atom of the substrates and C(4a) of the flavin moiety.

17.
18.
A simple message‐passing implementation for distributed disk storage, called array files (AF), is described. It is designed primarily for parallelizing computational chemistry applications but it should be useful for any application that handles large amounts of data stored on disk. AF allows transparent distributed storage and access of large data files. An AF consists of a set of logically related records, i.e., blocks of data. It is assumed that the records have the typical dimension of matrices in quantum chemical calculations, i.e., they range from 0.1 to ~32 MB in size. The individual records are not striped over nodes; each record is stored on a single node. As a simple application, second‐order Møller‐Plesset (MP2) energies have been implemented using AF. The AF implementation approaches the efficiency of the hand‐coded program. MP2 is relatively simple to parallelize but for more complex applications, such as Coupled Cluster energies, the AF system greatly simplifies the programming effort. © 2007 Wiley Periodicals, Inc. J Comput Chem, 2007.

19.
20.