Similar literature
20 similar documents found (search time: 46 ms)
1.
Machine learning algorithms have been demonstrated to predict atomistic properties approaching the accuracy of quantum chemical calculations at significantly less computational cost. Difficulties arise, however, when attempting to apply these techniques to large systems, or systems possessing excessive conformational freedom. In this article, the machine learning method kriging is applied to predict both the intra‐atomic and interatomic energies, as well as the electrostatic multipole moments, of the atoms of a water molecule at the center of a 10‐water‐molecule (decamer) cluster. Unlike previous work, where the properties of small water clusters were predicted using a molecular local frame, and where training set inputs (features) were based on atomic index, a variety of feature definitions and coordinate frames are considered here to increase prediction accuracy. It is shown that, for a water molecule at the center of a decamer, no single method of defining features or coordinate schemes is optimal for every property. However, explicitly accounting for the structure of the first solvation shell in the definition of the features of the kriging training set, and centring the coordinate frame on the atom of interest, will in general return better predictions than models that apply the standard methods of feature definition or a molecular coordinate frame. © 2016 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.

2.
A method for the prediction of geometries and the de novo design of oligodentate ligands for octahedral high‐spin Fe(III) complexes with chemically diverse coordinating functions is described. Based on a set of 23 complexes with two nitrogens and four oxygens as coordinating atoms, a computational method was elaborated that describes and predicts the geometries of high‐spin Fe(III) complexes, including small variations in bond lengths and angles. The method uses partial atomic charges of the ligand, which are obtained from ab initio calculations, and empirically derived angular and dihedral constraints, which are added to a molecular‐mechanics force field. Conformational analyses of the complex geometries were performed. The method was iteratively optimized by fitting calculated geometries to the corresponding crystal structures of the Fe(III) complexes. Three representative examples of calculated structures superimposed on the crystal structure are shown to illustrate the accuracy of the method.

3.
Conformational Memories (CM) is a simulated annealing/Monte Carlo method that explores peptide and protein dihedral conformational space completely and efficiently, independent of the original conformation. Here we extend the CM method to include the variation of a randomly chosen bond angle, in addition to the standard variation of two or three randomly chosen dihedral angles, in each Monte Carlo trial of the CM exploratory and biased phases. We test the hypothesis that the inclusion of variable bond angles in CM leads to an improved sampling of conformational space. We compare the results with variable bond angles to CM with no bond angle variation for the following systems: (1) the pentapeptide Met-enkephalin, which is a standard test case for conformational search methods; (2) the proline ring pucker in a 17mer model peptide, (Ala)₈Pro(Ala)₈; and (3) the conformations of the Ser 7.39 χ1 in transmembrane helix 7 (TMH7) of the cannabinoid CB1 receptor, a 25-residue system. In each case, analysis of the CM results shows that the inclusion of variable bond angles results in sampling of regions of conformational space that are inaccessible to CM calculations with only variable dihedral angles, and/or a shift in conformational populations from those calculated when variable bond angles are not included. The incorporation of variable bond angles leads to an improved sampling of conformational space without loss of efficiency. Our examples show that this improved sampling leads to better exploration of biologically relevant conformations that have been experimentally validated.

4.
Relative amino acid residue solvent accessibility values allow the quantitative comparison of atomic solvent-accessible surface areas in different residue types and physical environments in proteins and in protein structural alignments. Geometry-optimised tri-peptide structures in extended solvent-exposed reference conformations have been obtained for 43 amino acid residue types at a high level of quantum chemical theory. Significant increases in side-chain solvent accessibility, offset by reductions in main-chain atom solvent exposure, were observed for standard residue types in partially geometry-optimised structures when compared to non-minimised models built from identical sets of proper dihedral angles abstracted from the literature. Optimisation of proper dihedral angles led most notably to marked increases of up to 54% in proline main-chain atom solvent accessibility compared to literature values. Similar effects were observed for fully-optimised tri-peptides in implicit solvent. The relief of internal strain energy was associated with systematic variation in N, Cα and Cβ atom solvent accessibility across all standard residue types. The results underline the importance of optimisation of ‘hard’ degrees of freedom (bond lengths and valence bond angles) and improper dihedral angle values from force field or other context-independent reference values, and impact on the use of standardised fixed internal co-ordinate geometry in sampling approaches to the determination of absolute values of protein amino acid residue solvent accessibility. Quantum chemical methods provide a useful and accurate alternative to molecular mechanics methods to perform energy minimisation of peptides containing non-standard (chemically modified) amino acid residues frequently present in experimental protein structure data sets, for which force field parameters may not be available. Reference tri-peptide atomic co-ordinate sets including hydrogen atoms are made freely available.

5.
Presented is an extension of the CHARMM additive all-atom carbohydrate force field to enable the modeling of phosphate and sulfate linked to carbohydrates. The parameters are developed in a hierarchical fashion using model compounds containing the key atoms in the full carbohydrates. Target data for parameter optimization included full two-dimensional energy surfaces defined by the glycosidic dihedral angle pairs in the phosphate/sulfate model compound analogs of hexopyranose monosaccharide phosphates and sulfates, as determined by quantum mechanical (QM) MP2/cc-pVTZ single point energies on MP2/6-31+G(d) optimized structures. In order to achieve balanced, transferable dihedral parameters, surfaces for all possible anomeric and conformational states were included during the parametrization process. In addition, to model physiologically relevant systems, both the mono- and di-anionic charged states were studied for the phosphates. This resulted in over 7000 MP2/cc-pVTZ//MP2/6-31+G(d) model compound conformational energies which, supplemented with QM geometries, were the main target data for the parametrization. Parameters were validated against crystals of relevant monosaccharide derivatives obtained from the Cambridge Structural Database (CSD) and larger systems, namely inositol-(tri/tetra/penta) phosphates non-covalently bound to the pleckstrin homology (PH) domain and oligomeric chondroitin sulfate in solution and in complex with cathepsin K protein.

6.
7.
We propose a generic method to model polarization in the context of high‐rank multipolar electrostatics. This method involves the machine learning technique kriging, here used to capture the response of an atomic multipole moment of a given atom to a change in the positions of the atoms surrounding this atom. The atoms are malleable boxes with sharp boundaries; they do not overlap and they exhaust space. The method is applied to histidine, where it is able to predict atomic multipole moments (up to hexadecapole) for unseen configurations, after training on 600 geometries distorted using normal modes of each of its 24 local energy minima at the B3LYP/apc‐1 level. The quality of the predictions is assessed by calculating the Coulomb energy between an atom for which the moments have been predicted and the surrounding atoms (having exact moments). Only interactions between atoms separated by three or more bonds (“1, 4 and higher” interactions) are included in this energy error. This energy is compared with that of a central atom with exact multipole moments interacting with the same environment. The resulting energy discrepancies are summed for 328 atom–atom interactions, for each of the 29 atoms of histidine being a central atom in turn. For 80% of the 539 test configurations (outside the training set), this summed energy deviates by less than 1 kcal mol⁻¹. © 2013 Wiley Periodicals, Inc.
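The kriging step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes ordinary kriging with a Gaussian correlation function, a fixed (not optimized) hyperparameter `theta`, and a toy one-dimensional feature in place of real atomic coordinates. The names `solve`, `corr`, `fit_kriging` and `predict` are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def corr(u, v, theta):
    """Gaussian correlation between two feature vectors (assumed kernel)."""
    return math.exp(-theta * sum((a - b) ** 2 for a, b in zip(u, v)))

def fit_kriging(xs, ys, theta=1.0):
    """Fit ordinary kriging: constant mean mu plus correlated residuals."""
    n = len(xs)
    r = [[corr(xs[i], xs[j], theta) for j in range(n)] for i in range(n)]
    rinv_y = solve(r, ys)
    rinv_1 = solve(r, [1.0] * n)
    mu = sum(rinv_y) / sum(rinv_1)            # generalized least-squares mean
    weights = solve(r, [y - mu for y in ys])  # R^-1 (y - mu 1)
    return {"xs": xs, "theta": theta, "mu": mu, "weights": weights}

def predict(model, x):
    """Predict the property at a new feature vector x."""
    rv = [corr(x, xi, model["theta"]) for xi in model["xs"]]
    return model["mu"] + sum(w * c for w, c in zip(model["weights"], rv))

# Toy training set: one geometric feature -> one atomic property value.
xs = [(0.0,), (0.5,), (1.0,), (1.5,)]
ys = [0.0, 0.48, 0.84, 1.0]
model = fit_kriging(xs, ys)
print(predict(model, (0.75,)))  # interpolated value at an unseen geometry
```

Without a noise term the predictor reproduces the training values exactly, which is the interpolating behaviour the abstract relies on. A practical model would also optimize `theta` per feature.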

8.
We describe a package of some IBM PC programs that may find application in computer-aided molecular design. PCGEOM constructs and visualizes molecular models from bond lengths, bond angles, and dihedral angles, from Cartesian coordinates, or from stored fragments. It may prepare output files to be used as input for other programs, like CNDOB (conventional CNDO/2) or PCMEP using the bond increment (BI) method for the calculation of molecular electrostatic potentials. PCPROT is in preparation and will use Protein Data Bank coordinates to visualize and manipulate protein molecular models. Starting from these, it will calculate electrostatic potentials using the BI method and/or monopoles adjusted to reproduce ab initio values for amino acid residues. FSCF is based on a CNDO-type approximation and uses strictly localized molecular orbitals in order to partition large molecules into a central fragment, a polarizable region, and a fully transferable environment. The partition allows one to handle relatively large systems with up to 200 atoms. To illustrate applications, we present estimation of relative inhibitory potencies of a series of substituted triazines on chicken liver dihydrofolate reductase.

9.
We have developed a computer program, named PDBETA, that performs normal mode analysis (NMA) based on an elastic network model that uses dihedral angles as independent variables. Taking advantage of the relatively small number of degrees of freedom required to describe a molecular structure in dihedral angle space and a simple potential-energy function independent of atom types, we aimed to develop a program applicable to a full-atom system of any molecule in the Protein Data Bank (PDB). The algorithm for NMA used in PDBETA is the same as that of the previously developed computer program FEDER/2. Therefore, the main challenge in developing PDBETA was to find a method that can automatically convert PDB data into molecular structure information in dihedral angle space. Here, we illustrate the performance of PDBETA with a protein–DNA complex, a protein–tRNA complex, and some non-protein small molecules, and show that the atomic fluctuations calculated by PDBETA reproduce the temperature factor data of these molecules in the PDB. A comparison was also made with elastic-network-model based NMA in a Cartesian-coordinate system.
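Converting Cartesian coordinates into dihedral angle space starts from the dihedral defined by four consecutive atoms. The sketch below shows only that textbook calculation and says nothing about how PDBETA itself parses PDB files; the function name `dihedral_deg` is hypothetical.

```python
import math

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points given as (x, y, z) tuples."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    b0 = sub(p1, p0)       # first bond vector
    b1 = sub(p2, p1)       # central bond (rotation axis)
    b2 = sub(p3, p2)       # last bond vector
    n1 = cross(b0, b1)     # normal of the plane through p0, p1, p2
    n2 = cross(b1, b2)     # normal of the plane through p1, p2, p3
    x = dot(n1, n2)
    y = dot(cross(n1, n2), b1) / math.sqrt(dot(b1, b1))
    return math.degrees(math.atan2(y, x))

# Four points arranged so that the two planes are perpendicular.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))  # 90.0
```

Using `atan2` rather than `acos` keeps the sign of the angle and stays numerically stable near 0 and 180 degrees.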

10.
Accurately predicting loop structures is important for understanding functions of many proteins. In order to obtain loop models with high accuracy, efficiently sampling the loop conformation space to discover reasonable structures is a critical step. In loop conformation sampling, coarse-grain energy (scoring) functions coupled with reduced protein representations are often used to reduce the number of degrees of freedom as well as sampling computational time. However, because reduced representations account for many factors only implicitly, the coarse-grain scoring functions may be insensitive and inaccurate, which can mislead the sampling process and consequently cause important loop conformations to be missed. In this paper, we present a new computational sampling approach to obtain reasonable loop backbone models, the so-called Pareto optimal sampling (POS) method. The rationale of the POS method is to sample the function space of multiple, carefully selected scoring functions to discover an ensemble of diversified structures yielding Pareto optimality to all sampled conformations. The POS method can efficiently tolerate insensitivity and inaccuracy in individual scoring functions and thereby lead to significant accuracy improvement in loop structure prediction. We apply the POS method to a set of 4-12-residue loop targets using a function space composed of backbone-only Rosetta and distance-scale finite ideal-gas reference (DFIRE) and a triplet backbone dihedral potential developed in our lab. Our computational results show that in 501 out of 502 targets, the model sets generated by POS contain structure models that are within subangstrom resolution. Moreover, the top-ranked models have a root mean square deviation (rmsd) less than 1 Å in 96.8, 84.1, and 72.2% of the short (4-6 residues), medium (7-9 residues), and long (10-12 residues) targets, respectively, when the all-atom models are generated by local optimization from the backbone models and are ranked by our recently developed Pareto optimal consensus (POC) method. Similar sampling effectiveness can also be found in a set of 13-residue loop targets.
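The notion of Pareto optimality across several scoring functions can be made concrete with a small sketch. This is only a generic non-dominated filter, not the POS sampling algorithm itself. It assumes that lower scores are better for every function, and the name `pareto_front` and the example scores are hypothetical.

```python
def pareto_front(scores):
    """Return indices of candidates that no other candidate dominates.

    scores: list of tuples, one tuple of scoring-function values per
    candidate conformation; lower is assumed to be better in every position.
    """
    def dominates(a, b):
        # a dominates b if it is no worse everywhere and strictly better somewhere
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]

# Five candidate loops scored by two (hypothetical) scoring functions.
candidates = [(1, 5), (2, 2), (5, 1), (3, 3), (2, 6)]
print(pareto_front(candidates))  # [0, 1, 2]
```

Keeping the whole non-dominated set, rather than the minimum of any single score, is what lets a conformation survive even when one scoring function ranks it poorly.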

11.
A redundant internal coordinate system for optimizing molecular geometries is constructed from all bonds, all valence angles between bonded atoms, and all dihedral angles between bonded atoms. Redundancies are removed by using the generalized inverse of the G matrix; constraints can be added by using an appropriate projector. For minimizations, redundant internal coordinates provide substantial improvements in optimization efficiency over Cartesian and nonredundant internal coordinates, especially for flexible and polycyclic systems. Transition structure searches are also improved when redundant coordinates are used and when the initial steps are guided by the quadratic synchronous transit approach. © 1996 by John Wiley & Sons, Inc.

12.
We present a method to rapidly identify hydrogen-mediated interactions in proteins (e.g., hydrogen bonds, water-mediated hydrogen bonds, salt bridges, and aromatic π-hydrogen interactions) through heavy atom geometry alone, that is, without needing to explicitly determine hydrogen atom positions using either experimental or theoretical methods. By including specific real (or virtual) partner atoms as defined by the atom type of both the donor and acceptor heavy atoms, a set of unique angles can be rapidly calculated. By comparing the distance between the donor and the acceptor and these unique angles to the statistical preferences observed in the Protein Data Bank (PDB), we were able to identify a set of conserved geometries (15 for donor atoms and 7 for acceptor atoms) for hydrogen-mediated interactions in proteins. This set of identified interactions includes every polar atom type present in the Protein Data Bank except OE1 (glutamate/glutamine sidechain) and shows a clear geometric preference for the methionine sulfur atom (SD) to act as a hydrogen bond acceptor. This method could be readily applied to protein design efforts.

13.
The conformational flexibility of carbohydrates is challenging within the field of computational chemistry. This flexibility causes the electron density to change, which leads to fluctuating atomic multipole moments. Quantum Chemical Topology (QCT) allows for the partitioning of an “atom in a molecule,” thus localizing electron density to finite atomic domains, which permits the unambiguous evaluation of atomic multipole moments. By selecting an ensemble of physically realistic conformers of a chemical system, one evaluates the various multipole moments at defined points in configuration space. The subsequent implementation of the machine learning method kriging delivers the evaluation of an analytical function, which smoothly interpolates between these points. This allows for the prediction of atomic multipole moments at new points in conformational space, not trained for but within prediction range. In this work, we demonstrate that the carbohydrates erythrose and threose are amenable to the above methodology. We investigate how kriging models respond when the training ensemble incorporates multiple energy minima and their environment in conformational space. Additionally, we evaluate the gains in predictive capacity of our models as the size of the training ensemble increases. We believe this approach to be entirely novel within the field of carbohydrates. For a modest training set size of 600, more than 90% of the external test configurations have an error in the total (predicted) electrostatic energy (relative to ab initio) of at most 1 kJ mol⁻¹ for open chains, and just over 90% have an error of at most 4 kJ mol⁻¹ for rings. © 2015 Wiley Periodicals, Inc.

14.
15.
Protein structure determination has been one of the most challenging problems in molecular biology for the past 60 years. Here we present an ab initio protein tertiary-structure prediction method assisted by predicted contact maps from SPOT-Contact and predicted dihedral angles from SPIDER 3. These predicted properties were then fed to the crystallography and NMR system (CNS) for restrained structure modeling. The resulting structures are first evaluated by the potential energy calculated by CNS, followed by the dDFIRE energy function for model selection. The method, called SPOT-Fold, has been tested on 241 CASP targets of between 67 and 670 amino acid residues and on 60 randomly selected globular proteins under 100 amino acids. The method has an accuracy comparable to that of other contact-map-based modeling techniques. © 2019 Wiley Periodicals, Inc.

16.
Recent results from Preuss et al. (J Comput Chem 2004, 25, 112) on DNA base molecules, obtained by plane wave density functional calculations using ultrasoft pseudopotentials, are compared with calculations using Gaussian basis sets. Bond lengths and angles agree closely, but dihedral angles and vibrational frequencies show significant differences. The Gaussian basis calculations are at least an order of magnitude more efficient than the plane wave/ultrasoft pseudopotential calculations at a similar level of accuracy; the advantage is even larger if the Fourier Transform Coulomb method is used. To obtain definite benchmark values, the geometries of the four DNA bases were optimized at the MP2 level with large basis sets, up to cc-pVQZ and aug-cc-pVTZ.

17.
Protein structure prediction is a long‐standing problem in molecular biology. Due to the lack of an accurate energy function, it is often difficult to know whether the sampling algorithm or the energy function is the more important factor in the failure to locate near‐native conformations of proteins. This article examines the size dependence of sampling effectiveness by using a perfect “energy function”: the root‐mean‐squared distance from the target native structure. Using protein targets up to 460 residues from critical assessment of structure prediction techniques (CASP11, 2014), we show that the accuracy of near‐native structures sampled is relatively independent of protein size but strongly depends on the errors of predicted torsion angles. Even with 40% out‐of‐range angle prediction, near‐native conformations of 2 Å or less can be sampled. The result supports the view that a poor energy function is one of the bottlenecks of structure prediction and that predicted torsion angles are useful for overcoming the bottleneck by restricting the sampling space in the absence of a perfect energy function. © 2015 Wiley Periodicals, Inc.
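The idea of using distance from the native structure as a perfect "energy function" can be sketched as follows. This is a simplification, not the paper's procedure: it assumes the two coordinate sets are already superimposed and have matching atom order, and the names `rmsd` and `best_decoy` are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) points.

    Assumes the structures are already optimally superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    sq = sum((xa - xb) ** 2
             for a, b in zip(coords_a, coords_b)
             for xa, xb in zip(a, b))
    return math.sqrt(sq / len(coords_a))

def best_decoy(decoys, native):
    """Index of the sampled conformation with the lowest RMSD to the native structure."""
    return min(range(len(decoys)), key=lambda i: rmsd(decoys[i], native))

native = [(0, 0, 0), (1, 0, 0)]
decoys = [[(0, 0, 1), (1, 0, 1)],      # shifted by 1 along z
          [(0, 0, 0.1), (1, 0, 0.1)]]  # shifted by 0.1 along z
print(rmsd(decoys[0], native))     # 1.0
print(best_decoy(decoys, native))  # 1
```

Scoring by RMSD to the known answer removes energy-function error from the experiment, so any remaining failure can be attributed to sampling.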

18.
Conformational properties of polymers, such as average dihedral angles or molecular alpha-helicity, display a rather weak dependence on the detailed arrangement of the elementary constituents (atoms). We propose a computer simulation method to explore the polymer phase space using a variant of the standard multicanonical method, in which the density of states associated with suitably chosen configurational variables is considered in place of the standard energy density of states. This configurational density of states is used in the Metropolis acceptance/rejection test when configurations are generated with the help of a hybrid Monte Carlo algorithm. The resulting configurational probability distribution is then modulated by exponential factors derived from the general principle of the maximal constrained entropy by requiring that certain average configurational quantities take preassigned (possibly temperature dependent) values. Thermal averages of other configurational quantities can be computed by using the probability distributions obtained in this way. Moments of the energy distribution require an extra canonical sampling of the system phase space at the desired temperature, in order to locally thermalize the configurational degrees of freedom. As an application of these ideas we present the study of the structural properties of two simple models: a bead-and-spring model of polyethylene with independent hindered torsions and an all-atom model of alanine and glycine oligomers with 12 amino acids in vacuum.

19.
Constraint generation for 3D structure prediction and structure-based database searches benefit from fine-grained prediction of local structure. In this work, we present LOCUSTRA, a novel scheme for the multiclass prediction of local structure that uses two layers of support vector machines (SVM). Using a 16-letter structural alphabet from de Brevern et al. (Proteins: Struct., Funct., Bioinf. 2000, 41, 271-287), we assess its prediction ability for an independent test set of 222 proteins and compare our method to three-class secondary structure prediction and direct prediction of dihedral angles. The prediction accuracy is Q16 = 61.0% for the 16 classes of the structural alphabet and Q3 = 79.2% for a simple mapping to the three secondary classes helix, sheet, and coil. We achieve a mean φ (ψ) error of 24.74 degrees (38.35 degrees) and a median RMSDA (root-mean-square deviation of the dihedral angles) per protein chain of 52.1 degrees. These results compare favorably with related approaches. The LOCUSTRA web server is freely available to researchers at http://www.fz-juelich.de/nic/cbb/service/service.php.
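Angular error measures such as the mean φ/ψ error and RMSDA have to respect the 360-degree periodicity of dihedral angles. The sketch below shows one common way to do that. The exact RMSDA definition used by LOCUSTRA is not given in the abstract, so the pooled form here is an assumption, and the function names are hypothetical.

```python
import math

def angle_diff_deg(a, b):
    """Smallest absolute difference between two angles in degrees, in [0, 180]."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)

def mean_angle_error(pred, ref):
    """Mean periodic absolute error over paired angle lists."""
    return sum(angle_diff_deg(p, r) for p, r in zip(pred, ref)) / len(pred)

def rms_angle_dev(pred, ref):
    """Root-mean-square periodic deviation over paired angle lists
    (assumed RMSDA form: phi and psi values pooled into one list)."""
    return math.sqrt(sum(angle_diff_deg(p, r) ** 2 for p, r in zip(pred, ref)) / len(pred))

# 179 and -179 degrees are only 2 degrees apart once periodicity is handled.
print(angle_diff_deg(179, -179))                # 2.0
print(mean_angle_error([179, 10], [-179, 10]))  # 1.0
```

A naive subtraction would report an error of 358 degrees for the first pair, which would badly inflate any averaged error near the ±180 degree boundary.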

20.
A novel design of a next-generation force field considers not only the electronic inter-atomic energy but also intra-atomic energy. This strategy promises a faithful mapping between the force field and the quantum mechanics that underpins it. Quantum chemical topology provides an energy partitioning in which atoms have well-defined electronic kinetic energies, and we are interested in capturing how they respond to changes in the positions of surrounding atoms. A machine learning method called kriging successfully creates models from a training set of molecular configurations that can then be used to predict the atomic kinetic energies occurring in previously unseen molecular configurations. We present a proof-of-concept based on four molecules of increasing complexity (methanol, N-methylacetamide, glycine and triglycine). We test how well the atomic kinetic energies can be modelled with respect to training set size, molecule size and elemental composition. For all atoms tested, the mean atomic kinetic energy errors fall below 1.5 kJ mol⁻¹, and far below this in most cases. These errors are all under 0.5%, and thus the kinetic energies are well modelled using the kriging method, even when using modest-to-small training set sizes.
