Similar articles
1.
2.
One of the main challenges in computational protein design (CPD) is the huge size of the protein sequence and conformational space that has to be computationally explored. Recently, we showed that state‐of‐the‐art combinatorial optimization technologies based on Cost Function Network (CFN) processing allow speeding up provable rigid backbone protein design methods by several orders of magnitude. Building on this, we improved and injected CFN technology into the well‐established CPD package Osprey to allow all Osprey CPD algorithms to benefit from the associated speedups. Because Osprey fundamentally relies on the ability of A* to produce conformations in increasing order of energy, we defined new A* strategies combining CFN lower bounds with a new side‐chain positioning‐based branching scheme. Beyond the speedups obtained in the new A*‐CFN combination, this novel branching scheme enables a much faster enumeration of suboptimal sequences, far beyond what is reachable without it. Together with the immediate and important speedups provided by CFN technology, these developments directly benefit all the algorithms that previously relied on the DEE/A* combination inside Osprey and make it possible to solve larger CPD problems with provable algorithms. © 2016 Wiley Periodicals, Inc.

3.
A novel, yet simple and automated, protocol for reconstruction of complete peptide backbones from Cα coordinates only is described, validated, and benchmarked. The described method collates a set of possible backbone conformations for each set of residue triads from a structural library derived from the PDB. The optimal permutation of these three residue segments of backbone conformations is determined using the dead-end elimination (DEE) algorithm. Putative conformations are evaluated using a pairwise-additive knowledge-based forcefield term and a fragment overlap term. The protocol described in this report is able to restore the full backbone coordinates to within 0.2-0.6 Å of the actual crystal structure from Cα coordinates only. In addition, it is insensitive to errors in the input Cα coordinates with RMSDs of 3.0 Å, and this is illustrated through application to deliberately distorted Cα traces. The entire process, as described, is rapid, requiring of the order of a few minutes for a typical protein on a typical desktop PC. Approximations enable this to be reduced to a few seconds, although this is at the expense of prediction accuracy. This compares very favorably to previously published methods, being sufficiently fast for general use and being one of the most accurate methods. Because the method is not restricted to the reconstruction from only Cα coordinates, reconstruction based on Cβ coordinates is also demonstrated.

4.
Backbone–backbone hydrogen bonds (BBHBs) are one of the most abundant interactions at the interface of protein–protein complexes. Here, we propose an angle‐dependent potential energy function for BBHB based on density functional theory (DFT) calculations and the operation of a genetic algorithm to find the optimal parameters in the potential energy function. The angular part of the energy function is assumed to be the product of the power series of sine and cosine functions with respect to the two angles associated with BBHB. Two radial functions are taken into account in this study: Morse and Lennard‐Jones 12‐10 potential functions. Of these two functions under consideration, the former is found to be more accurate than the latter in terms of predicting the binding energies obtained from DFT calculations. The new HB potential function also compares well with the knowledge‐based potential derived by applying Boltzmann statistics for a variety of protein–protein complexes in the Protein Data Bank. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010
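The two radial forms named in this abstract can be compared with a minimal sketch. It uses the common textbook expressions for the Morse and 12‐10 potentials, both normalized so that the minimum lies at r0 with value -depth. The parameter values (depth, r0, a) are illustrative placeholders rather than the fitted parameters of the paper, and the angular factor is left out.

```python
import math

def morse(r, depth=1.0, r0=1.9, a=2.0):
    """Morse radial term: minimum of -depth at r = r0 (parameters are placeholders)."""
    return depth * ((1.0 - math.exp(-a * (r - r0))) ** 2 - 1.0)

def lj_12_10(r, depth=1.0, r0=1.9):
    """12-10 radial term: minimum of -depth at r = r0 (parameters are placeholders)."""
    x = r0 / r
    return depth * (5.0 * x ** 12 - 6.0 * x ** 10)

if __name__ == "__main__":
    # Print both curves on a small grid of distances (arbitrary length units).
    for r in (1.7, 1.8, 1.9, 2.0, 2.2, 2.5):
        print(f"{r:.1f}  morse={morse(r):+.4f}  lj12-10={lj_12_10(r):+.4f}")
```

Both functions share the same well depth and position here, so any difference in fit quality comes from the shape of the repulsive wall and the attractive tail.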

5.
An important unsolved problem in molecular and structural biology is the protein folding and structure prediction problem. One major bottleneck for solving this is the lack of an accurate energy to discriminate near‐native conformations against other possible conformations. Here we have developed the sDFIRE energy function, which is an optimized linear combination of DFIRE (the Distance‐scaled Finite Ideal gas Reference state based Energy), the orientation dependent (polar‐polar and polar‐nonpolar) statistical potentials, and the matching scores between predicted and model structural properties including predicted main‐chain torsion angles and solvent accessible surface area. The weights for these scoring terms are optimized by three widely used decoy sets consisting of a total of 134 proteins. Independent tests on CASP8 and CASP9 decoy sets indicate that sDFIRE outperforms other state‐of‐the‐art energy functions in selecting near native structures and in the Pearson's correlation coefficient between the energy score and structural accuracy of the model (measured by TM‐score). © 2016 Wiley Periodicals, Inc.

6.
The global structural optimization is carried out for off-lattice protein AB models in two and three dimensions by conformational space annealing (CSA). The models consist of hydrophobic and hydrophilic monomers in Fibonacci sequences. To accelerate the convergence, we have introduced a shift operator in the internal coordinate system, and effectively reduced the search space by forming a quotient space. With this, we significantly improve our previous results on AB models, and provide new low energy conformations. This work provides insights on exploring complicated energy landscapes by exploiting the advantages and limitations of CSA.

7.
The search for the global minimum energy conformation (GMEC) of protein side chains is an important computational challenge in protein structure prediction and design. Using rotamer models, the problem is formulated as an NP‐hard optimization problem. Dead‐end elimination (DEE) methods combined with systematic A* search (DEE/A*) have proven useful, but may not be strong enough as we attempt to solve protein design problems where a large number of similar rotamers is eligible and the network of interactions between residues is dense. In this work, we present an exact solution method, named BroMAP (branch‐and‐bound rotamer optimization using MAP estimation), for such protein design problems. The design goal of BroMAP is to be able to expand smaller search trees than conventional branch‐and‐bound methods while performing only a moderate amount of computation in each node, thereby reducing the total running time. To achieve that, BroMAP attempts reduction of the problem size within each node through DEE and elimination by lower bounds from approximate maximum‐a‐posteriori (MAP) estimation. The lower bounds are also exploited in branching and subproblem selection for fast discovery of strong upper bounds. Our computational results show that BroMAP tends to be faster than DEE/A* for large protein design cases. BroMAP also solved cases that were not solved by DEE/A* within the maximum allowed time, and did not incur significant disadvantage for cases where DEE/A* performed well. Therefore, BroMAP is particularly applicable to large protein design problems where DEE/A* struggles and can also substitute for DEE/A* in general GMEC search. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2009
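As an illustration of the pruning step that DEE contributes, here is a minimal sketch of the Goldstein elimination criterion, one widely used DEE test. It is not BroMAP itself. The data layout (`self_e`, `pair_e`, `rotamers`) and the toy energies are hypothetical and chosen only for this example.

```python
def goldstein_prunable(i, r, t, self_e, pair_e, rotamers):
    """Return True if rotamer r at position i can never beat rotamer t.

    Goldstein criterion: r is a dead end when
    E(i_r) - E(i_t) + sum_j min_s [E(i_r, j_s) - E(i_t, j_s)] > 0.
    """
    gap = self_e[i][r] - self_e[i][t]
    for j, rots_j in rotamers.items():
        if j == i:
            continue
        # Best case for r relative to t over all rotamers s at position j.
        gap += min(pair_e[(i, r, j, s)] - pair_e[(i, t, j, s)] for s in rots_j)
    return gap > 0.0

# Toy problem (hypothetical energies): two positions with two rotamers each.
rotamers = {0: [0, 1], 1: [0, 1]}
self_e = {0: [0.0, 5.0], 1: [0.0, 0.0]}
pair_e = {
    (0, 0, 1, 0): 0.0, (0, 0, 1, 1): 1.0,
    (0, 1, 1, 0): -1.0, (0, 1, 1, 1): 0.0,
}

if __name__ == "__main__":
    print(goldstein_prunable(0, 1, 0, self_e, pair_e, rotamers))  # rotamer 1 vs 0
    print(goldstein_prunable(0, 0, 1, self_e, pair_e, rotamers))  # rotamer 0 vs 1
```

In the toy data, rotamer 1 at position 0 carries a self-energy penalty of 5 that its pairwise interactions can offset by at most 1, so it is eliminated, while rotamer 0 is kept.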

8.
A Monte Carlo sampling algorithm for searching a scale-transformed conformational energy space of polypeptides is presented. This algorithm is based on the assumption that energy barriers can be overcome by a uniform sampling of the logarithmically transformed energy space. This algorithm is tested with Met-enkephalin. For comparison, the entropy sampling Monte Carlo (ESMC) simulation is performed. First, the global minimum is easily found by the optimization of a scale-transformed energy space. With a new Monte Carlo sampling, energy barriers of 3000 kcal/mol are frequently overcome, and low-energy conformations are sampled more efficiently than with ESMC simulations. Several thermodynamic quantities are calculated with good accuracy.

9.
Genetic algorithms constitute a powerful optimization method that has already been used in the study of the protein folding problem. However, they often suffer from a lack of convergence in a reasonably short time for complex fitness functions. Here, we propose an evolutionary strategy that can reproducibly find structures close to the minimum of a potential function for a simplified protein model in an efficient way. The model reduces the number of degrees of freedom of the system by treating the protein structure as composed of rigid fragments. The search incorporates a double encoding procedure and a merging operation from subpopulations that evolve independently of one another, both contributing to the good performance of the full algorithm. We have tested it with protein structures of different degrees of complexity, and present our conclusions related to its possible application as an efficient tool for the analysis of folding potentials.

10.
The performances of three different stochastic optimization methods for all-atom protein structure prediction are investigated and compared. We use the recently developed all-atom free-energy force field (PFF01), which was demonstrated to correctly predict the native conformation of several proteins as the global optimum of the free energy surface. The trp-cage protein (PDB-code 1L2Y) is folded with the stochastic tunneling method, a modified parallel tempering method, and the basin-hopping technique. All the methods correctly identify the native conformation, and their relative efficiency is discussed.

11.
Lead optimization is one of the crucial steps in the drug discovery pipeline. After identifying the lead molecule and obtaining its 2D geometry, understanding the best conformation it would attain in 3D still remains one of the most challenging steps in drug discovery. There have been multiple methods and algorithms that are directed toward achieving the best conformation for the lead molecules. TANGO focuses on conformation generation and its optimization using semiempirical energy calculations. The conformation generation is based on torsion angle rotation of the exocyclic bonds. The energy calculations are performed using MOPAC. The unique feature of this tool lies in the implementation of Message Passing Interface (MPI) for conformation generation and semiempirical-based optimization. A well-defined architecture handling the input and output generation has been used. The master and slave approach to handle operations involved in torsion angle rotation and energy calculations has helped in load balancing the process of conformation generation. The benchmarking results suggest that TANGO scales significantly well across eight nodes with each node utilizing 16 cores. This tool may prove to be very useful in high throughput generation of semiempirically optimized small molecule conformations. The use of semiempirical methods for optimization generates a conformational ensemble thereby helping to obtain stable and alternate stable conformers for a given ligand molecule. © 2018 Wiley Periodicals, Inc.

12.
Loop closure in proteins requires computing the values of the inverse kinematics (IK) map for a backbone fragment with 2n ≥ 6 torsional degrees of freedom (dofs). It occurs in a variety of contexts, e.g., structure determination from electron-density maps, loop insertion in homology-based structure prediction, backbone tweaking for protein energy minimization, and the study of protein mobility in folded states. The first part of this paper analyzes the global structure of the IK map for a fragment of protein backbone with 6 torsional dofs for a slightly idealized kinematic model, called the canonical model. This model, which assumes that every two consecutive torsional bonds Cα–C and N–Cα are exactly parallel, makes it possible to separately compute the inverse orientation map and the inverse position map. The singularities of both maps and their images, the critical sets, respectively, decompose SO(3) × R³ into open regions where the number of IK solutions is constant. This decomposition leads to a constructive proof of the existence of a region in R³ × SO(3) where the IK of the 6-dof fragment attains its theoretical maximum of 16 solutions. The second part of this paper extends this analysis to study fragments with more than 6 torsional dofs. It describes an efficient recursive algorithm to sample IK solutions for such fragments, by identifying the feasible range of each successive torsional dof. A numerical homotopy algorithm is then used to deform the IK solutions for a canonical fragment into solutions for a noncanonical fragment. Computational results for fragments ranging from 8 to 30 dofs are presented.

13.
Carbohydrate‐binding proteins (CBPs) are potential biomarkers and drug targets. However, the interactions between carbohydrates and proteins are challenging to study experimentally and computationally because of their low binding affinity, high flexibility, and the lack of a linear sequence in carbohydrates as exists in RNA, DNA, and proteins. Here, we describe a structure‐based function‐prediction technique called SPOT‐Struc that identifies carbohydrate‐recognizing proteins and their binding amino acid residues by structural alignment program SPalign and binding affinity scoring according to a knowledge‐based statistical potential based on the distance‐scaled finite‐ideal gas reference state (DFIRE). The leave‐one‐out cross‐validation of the method on 113 carbohydrate‐binding domains and 3442 noncarbohydrate binding proteins yields a Matthews correlation coefficient of 0.56 for SPalign alone and 0.63 for SPOT‐Struc (SPalign + binding affinity scoring) for CBP prediction. SPOT‐Struc is a technique with high positive predictive value (79% correct predictions in all positive CBP predictions) with a reasonable sensitivity (52% positive predictions in all CBPs). The sensitivity of the method was changed slightly when applied to 31 APO (unbound) structures found in the protein databank (14/31 for APO versus 15/31 for HOLO). The result of SPOT‐Struc will not change significantly if highly homologous templates were used. SPOT‐Struc predicted 19 out of 2076 structural genome targets as CBPs. In particular, one uncharacterized protein in Bacillus subtilis (1oq1A) was matched to galectin‐9 from Mus musculus. Thus, SPOT‐Struc is useful for uncovering novel carbohydrate‐binding proteins. SPOT‐Struc is available at http://sparks‐lab.org . © 2014 Wiley Periodicals, Inc.

14.
True ab initio prediction of protein 3D structure requires only the protein primary structure, a physicochemical free energy model, and a search method for identifying the free energy global minimum. Various characteristics of evolutionary algorithms (EAs) mean they are in principle well suited to the latter. Studies to date have been less than encouraging, however. This is because of the limited consideration given to EA design and control parameter issues. A comprehensive study of these issues was, therefore, undertaken for ab initio protein fold prediction using a full atomistic protein model. The performance and optimal control parameter settings of twelve EA designs were first established using a 15-residue polyalanine molecule; design aspects varied include the encoding alphabet, crossover operator, and replacement strategy. It can be concluded that real encoding and multipoint crossover are superior, while both generational and steady-state replacement strategies have merits. The scaling between the optimal control parameter settings and polyalanine size was also identified for both generational and steady-state designs based on real encoding and multipoint crossover. Application of the steady-state design to met-enkephalin indicated that these scalings are potentially transferable to real proteins. Comparison of the performance of the steady-state design for met-enkephalin with other ab initio methods indicates that EAs can be competitive provided the correct design and control parameter values are used.

15.
There are several very difficult problems related to genetic or genomic analysis that belong to the field of discrete optimization in a set of all possible orders. With n elements (points, markers, clones, sequences, etc.), the number of all possible orders is n!/2 and only one of these is considered to be the true order. A classical formulation of a similar mathematical problem is the well-known traveling salesperson problem model (TSP). Genetic analogues of this problem include: ordering in multilocus genetic mapping, evolutionary tree reconstruction, building physical maps (contig assembling for overlapping clones and radiation hybrid mapping), and others. A novel, fast and reliable hybrid algorithm based on evolution strategy and guided local search discrete optimization was developed for TSP formulation of the multilocus mapping problems. High performance and high precision of the employed algorithm named guided evolution strategy (GES) allows verification of the obtained multilocus orders based on different computing-intensive approaches (e.g., bootstrap or jackknife) for detecting and removing unreliable marker loci, hence, stabilizing the resulting paths. The efficiency of the proposed algorithm is demonstrated on standard TSP problems and on simulated data of multilocus genetic maps up to 1000 points per linkage group.
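The n!/2 count quoted in this abstract follows from treating an order and its reverse as the same map. The short sketch below checks this by brute force for small n. The function name `count_orders` is hypothetical, and the enumeration is only practical for very small n.

```python
from itertools import permutations
from math import factorial

def count_orders(n):
    """Count orderings of n markers, treating an order and its reverse as one."""
    seen = set()
    for p in permutations(range(n)):
        # Keep one canonical representative of each {order, reversed order} pair.
        seen.add(min(p, p[::-1]))
    return len(seen)

if __name__ == "__main__":
    for n in range(2, 8):
        print(n, count_orders(n), factorial(n) // 2)
```

The factorial growth makes exhaustive search hopeless long before the 1000 markers per linkage group mentioned above, which is why heuristic search is used for such orderings.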

16.
Exact rotamer optimization for protein design
Computational methods play a central role in the rational design of novel proteins. The present work describes a new hybrid exact rotamer optimization (HERO) method that builds on previous dead-end elimination algorithms to yield dramatic performance enhancements. Measured on experimentally validated physical models, these improvements make it possible to perform previously intractable designs of entire protein core, surface, or boundary regions. Computational demonstrations include a full core design of the variable domains of the light and heavy chains of catalytic antibody 48G7 FAB with 74 residues and 10^128 conformations, a full core/boundary design of the beta1 domain of protein G with 25 residues and 10^53 conformations, and a full surface design of the beta1 domain of protein G with 27 residues and 10^60 conformations. In addition, a full sequence design of the beta1 domain of protein G is used to demonstrate the strong dependence of algorithm performance on the exact form of the potential function and the fidelity of the rotamer library. These results emphasize that search algorithm performance for protein design can only be meaningfully evaluated on physical models that have been subjected to experimental scrutiny. The new algorithm greatly facilitates ongoing efforts to engineer increasingly complex protein features.

17.
All‐atom sampling is a critical and compute‐intensive end stage to protein structural modeling. Because of the vast size and extreme ruggedness of conformational space, even close to the native structure, the high‐resolution sampling problem is almost as difficult as predicting the rough fold of a protein. Here, we present a combination of new algorithms that considerably speed up the exploration of very rugged conformational landscapes and are capable of finding heretofore hidden low‐energy states. The algorithm is based on a hierarchical workflow and can be parallelized on supercomputers with up to 128,000 compute cores with near perfect efficiency. Such scaling behavior is notable, as with Moore's law continuing only in the number of cores per chip, parallelizability is a critical property of new algorithms. Using the enhanced sampling power, we have uncovered previously invisible deficiencies in the Rosetta force field and created an extensive decoy training set for optimizing and testing force fields. © 2012 Wiley Periodicals, Inc.

18.
We adapt a combinatorial optimization algorithm, extremal optimization (EO), for the search problem in computational protein design. This algorithm takes advantage of the knowledge of local energy information and systematically improves on the residues that have high local energies. Power-law probability distributions are used to select the backbone sites to be improved on and the rotamer choices to be changed to. We compare this method with simulated annealing (SA) and motivate and present an improved method, which we call reference energy extremal optimization (REEO). REEO uses reference energies to convert a problem with a structured local-energy profile to one with more random profile, and extremal optimization proves to be extremely efficient for the latter problem. We show in detail the large improvement we have achieved using REEO as compared to simulated annealing and discuss a number of other heuristics we have attempted to date.
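The power-law site selection mentioned in this abstract can be sketched as follows, assuming the usual extremal-optimization formulation in which sites are ranked from worst to best local energy and rank k is chosen with probability proportional to k^(-tau). The exponent `tau`, the function names, and the example energies are assumptions made for illustration. The abstract does not give the authors' actual distributions, and the reference-energy step of REEO is not modeled.

```python
import random

def rank_probabilities(n, tau):
    """Normalized power-law probabilities P(k) proportional to k**(-tau), k = 1..n."""
    weights = [k ** -tau for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def pick_site(local_energies, tau=1.5, rng=random):
    """Pick a site index, favoring sites with high (bad) local energy."""
    # Rank 1 is the worst site, i.e. the one with the highest local energy.
    order = sorted(range(len(local_energies)),
                   key=lambda i: local_energies[i], reverse=True)
    probs = rank_probabilities(len(order), tau)
    return rng.choices(order, weights=probs, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(0)
    energies = [0.1, 3.0, 1.0]  # hypothetical per-site local energies
    picks = [pick_site(energies, rng=rng) for _ in range(1000)]
    print({site: picks.count(site) for site in range(len(energies))})
```

Unlike a greedy rule that always modifies the worst site, the power-law tail occasionally selects well-placed sites, which is what lets the search leave local minima.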

19.
Program to engineer peptides (PEP) is a build‐up approach for ligand docking and design with implicit solvation. It requires the knowledge of a seed from which it iteratively grows polymeric ligands consisting of any type of amino acid, i.e., natural and/or nonnatural from a user‐defined library. At every growing step, a genetic algorithm is used for conformational optimization of the last added monomer in the rigid binding site. Pruning is performed at every growing step by selecting sequences according to binding energy with electrostatic solvation. PEP is applied to three members of the caspase family of cysteine proteases using Asp at P1 as seed. The optimal P4–P2 peptide recognition motifs and variants thereof are docked correctly in the active site (backbone root‐mean‐square deviation < 0.9 Å). Moreover, for each caspase, the P4–P2 sequences of potent aldehyde inhibitors are ranked among the 15 hits with the most favorable PEP energy. © 2001 John Wiley & Sons, Inc. J Comput Chem 22: 1956–1970, 2001

20.
A quantum chemical method for rapid optimization of protein structures is proposed. In this method, a protein structure is treated as an assembly of amino acid units, and the geometry optimization of each unit is performed taking the effect of its surrounding environment into account. The optimized geometry of a whole protein is obtained by repeated application of such a local optimization procedure over the entire protein. Here, we implemented this method in the MOPAC program and performed geometry optimization for three different sizes of proteins. The results demonstrate that the total energies of the proteins are minimized much more efficiently than with conventional optimization methods, including the MOZYME algorithm (a representative linear-scaling method) with the BFGS routine. The proposed method is superior to the conventional methods in both CPU time and memory requirements.

