Similar articles
20 similar articles found (search time: 0 ms)
1.
Understanding the mechanisms by which protein stability changes is one of the most important and valuable tasks in molecular biology. Conventional methods for predicting protein stability changes focus mainly on improving prediction accuracy. However, it is also desirable to extract domain knowledge from large databases that benefits accurate prediction of protein stability changes. This paper presents an interpretable prediction tree method (named iPTREE) that produces explanatory rules to uncover hidden knowledge while maintaining high prediction accuracy, and uses those rules to analyze the factors influencing protein stability changes. To evaluate iPTREE and the knowledge it yields about protein stability changes, a thermodynamic dataset from ProTherm consisting of 1615 single-point mutants is adopted. As a predictor of protein stability changes, the rule-based approach achieves a prediction accuracy of 87%, which is better than other methods based on artificial neural networks (ANN) and support vector machines (SVM). Moreover, those methods lack the ability to support biological knowledge discovery. The human-interpretable rules produced by iPTREE reveal that temperature is a factor of concern in predicting protein stability changes. For example, one of the interpretable rules with high support is as follows: if the introduced residue type is alanine and the temperature is between 4 °C and 40 °C, then the stability change will be negative (destabilizing). The present study demonstrates that iPTREE can easily be applied to protein stability change problems where more understandable knowledge is required.

2.
We introduce PULCHRA, a fast and robust method for the reconstruction of full-atom protein models starting from a reduced protein representation. The algorithm is particularly suitable as an intermediate step between coarse-grained model-based structure prediction and applications requiring an all-atom structure, such as molecular dynamics, protein-ligand docking, structure-based function prediction, or assessment of the quality of the predicted structure. The accuracy of the method was tested on a set of high-resolution crystallographic structures as well as on a set of low-resolution protein decoys generated by the protein structure prediction algorithm TASSER. The method is implemented as a standalone program that is available for download from http://cssb.biology.gatech.edu/skolnick/files/PULCHRA.

3.
In this contribution, we present an algorithm for protein backbone reconstruction that combines very high computational efficiency with high accuracy. Reconstruction of the main chain atomic coordinates from the alpha carbon trace is a common task in protein modeling, including de novo structure prediction, comparative modeling, and processing of experimental data. The method employed in this work follows the main idea of some earlier approaches to the problem. The details and careful design of the present approach are new and lead to an algorithm that outperforms all commonly used earlier applications. The BBQ (Backbone Building from Quadrilaterals) program has been extensively tested on both native structures and near-native decoy models and compared with the other available methods. The results provide a comprehensive benchmark of existing tools and evaluate their applicability to large-scale modeling using a reduced representation of protein conformational space. The BBQ package is available for download from our website at http://biocomp.chem.uw.edu.pl/services/BBQ/. This webpage also provides a user manual that describes BBQ functions in detail.

4.
Database-assisted ab initio protein structure prediction methods have exhibited considerable promise in the recent past, with several implementations being successful in community-wide experiments (CASP). We have employed combinatorial optimization techniques toward solving the protein structure prediction problem. A Monte Carlo minimization algorithm has been employed on a constrained search space to identify minimum energy configurations. The search space is constrained by using radius of gyration cutoffs, the loop backbone dihedral probability distributions, and various secondary structure packing conformations. Simulations have been carried out on several sequences, with 1000 conformations initially generated for each. Of these, the 50 best candidates have then been selected as probable conformations. The search for the optimum has been simplified by incorporating various geometrical constraints on secondary structural elements using distance restraint potential functions. The advantages of the reported methodology are its simplicity and the ease with which it can be modified to include other geometric and probabilistic restraints.
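
To make the constrained search described in this abstract concrete, the following is a minimal sketch of a radius-of-gyration cutoff combined with a standard Metropolis acceptance test. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `rg_cutoff` and `kT` parameters, and the use of a plain Metropolis criterion are all hypothetical choices, and the energy function is left to the caller.

```python
import math
import random


def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their centroid."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    sq = sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2
             for p in coords)
    return math.sqrt(sq / n)


def accept_move(old_energy, new_energy, new_coords, rg_cutoff,
                kT=1.0, rng=random):
    """Metropolis test restricted to conformations within the Rg cutoff.

    Assumption: energies and kT share the same (arbitrary) units.
    """
    if radius_of_gyration(new_coords) > rg_cutoff:
        return False  # outside the constrained search space
    if new_energy <= old_energy:
        return True   # downhill moves are always accepted
    # Uphill moves are accepted with Boltzmann probability.
    return rng.random() < math.exp(-(new_energy - old_energy) / kT)
```

Rejecting over-extended conformations before the energy comparison keeps the sampler inside the compact region that the cutoff defines.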

5.
Protein structure prediction and design often involve discrete modeling of side‐chain conformations on structural templates. Introducing backbone flexibility into such models has proven important in many different applications. Backbone flexibility improves model accuracy and provides access to larger sequence spaces in computational design, although at a cost in complexity and time. Here, we show that the influence of backbone flexibility on protein conformational energetics can be treated implicitly, at the level of sequence, using the technique of cluster expansion. Cluster expansion provides a way to convert structure‐based energies into functions of sequence alone. It leads to dramatic speed‐ups in energy evaluation and provides a convenient functional form for the analysis and optimization of sequence‐structure relationships. We show that it can be applied effectively to flexible‐backbone structural models using four proteins: α‐helical coiled‐coil dimers and trimers, zinc fingers, and Bcl‐xL/peptide complexes. For each of these, low errors for the sequence‐based models when compared with structure‐based evaluations show that this new way of treating backbone flexibility has considerable promise, particularly for protein design. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2009

6.
Prediction of protein accessibility from sequence, like prediction of protein secondary structure, is an intermediate step toward predicting the structures and consequently the functions of proteins. Most currently used methods are based on single-residue prediction, using either statistical means or evolutionary information, in which the accessibility state of the central residue in a window is predicted. Taking advantage of the expansion of databases of proteins with known 3D structures, we extracted information on pairwise residue types and the conformational states of the pairs simultaneously. To resolve the ambiguity in state prediction that arises when the window slides by one residue, we used a dynamic programming algorithm to find the path with the maximum score. The three-state overall per-residue accuracy, Q3, of this method in a jackknife test on a dataset of known proteins is more than 65%, which is an improvement on the results of methods based on evolutionary information.
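
The maximum-score path search mentioned in this abstract can be illustrated with a generic Viterbi-style dynamic program. This is a sketch only: the abstract does not give the scoring scheme, so the per-residue `site_scores`, the adjacent-pair `pair_scores`, and the state labels are hypothetical inputs chosen for illustration.

```python
def best_state_path(site_scores, pair_scores):
    """Return the state path with the maximum total score.

    site_scores: list of dicts, one per residue, mapping state -> score.
    pair_scores: dict mapping (previous_state, current_state) -> score.
    Both scoring tables are illustrative assumptions.
    """
    states = list(site_scores[0])
    best = dict(site_scores[0])  # best score of any path ending in each state
    back = []                    # back-pointers, one dict per transition
    for scores in site_scores[1:]:
        new_best, pointers = {}, {}
        for cur in states:
            # Pick the predecessor that maximizes the running total.
            prev = max(states, key=lambda s: best[s] + pair_scores[(s, cur)])
            new_best[cur] = best[prev] + pair_scores[(prev, cur)] + scores[cur]
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]
```

With pair scores that reward neighbouring residues sharing a state, the dynamic program can override an isolated per-residue preference, which is the kind of ambiguity a one-residue sliding window cannot resolve by itself.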

7.
The present article is concerned with the practical and theoretical aspects of gelation in protein systems. The processes involved are complicated, in part because several types of protein molecule assemblies and conformational changes during association have to be considered. The gel network can be stabilized by five types of interaction: covalent crosslinking, polar interactions, hydrogen bonding, salt bridges and hydrophobic interactions. The microstructures of gels formed from fibrillar proteins (collagen) and from globular proteins (e.g. α- and β-casein) differ considerably. Of practical interest are methods of controlling the solubility of gels, for example the use of cross-linking agents and the addition of salts.

8.
A model of hydrophobic collapse, which is treated as the driving force for protein folding, is presented. This model is the superposition of three models commonly used in protein structure prediction: (1) the 'oil-drop' model introduced by Kauzmann, (2) a lattice model introduced to decrease the number of degrees of freedom for structural changes, and (3) a model of the formation of the hydrophobic core as a key feature in driving the folding of proteins. These three models together helped to develop the idea of a fuzzy-oil-drop as a model for an external force field of hydrophobic character mimicking the hydrophobicity-differentiated environment for hydrophobic collapse. All amino acids in the polypeptide interact pair-wise during the folding process (energy minimization procedure) and interact with the external hydrophobic force field defined by a three-dimensional Gaussian function. The value of the Gaussian function, usually interpreted as a probability distribution, is treated as a normalized hydrophobicity distribution, with its maximum at the center of the ellipsoid and decreasing with increasing distance from the center. The fuzzy-oil-drop is elastic and changes its shape and size during the simulated folding procedure.
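
The three-dimensional Gaussian field described in this abstract can be sketched in a few lines. The example assumes an axis-aligned ellipsoid with one standard deviation per axis and a normalization over a finite set of sample points; the function and parameter names are hypothetical, and the abstract does not specify how the published model sets its parameters.

```python
import math


def fuzzy_oil_drop(point, center, sigmas):
    """Unnormalized 3D Gaussian: 1.0 at the center, decaying with distance.

    Assumption: the ellipsoid is axis-aligned, with one sigma per axis.
    """
    exponent = sum((p - c) ** 2 / (2.0 * s ** 2)
                   for p, c, s in zip(point, center, sigmas))
    return math.exp(-exponent)


def normalized_hydrophobicity(points, center, sigmas):
    """Scale the Gaussian values at the given points so they sum to 1."""
    values = [fuzzy_oil_drop(p, center, sigmas) for p in points]
    total = sum(values)
    return [v / total for v in values]
```

Changing `sigmas` during a simulation would stretch or shrink the ellipsoid, which is one simple way to picture the "elastic" drop the abstract refers to.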

9.
A united-residue model of polypeptide chains developed in our laboratories with united side-chains and united peptide groups as interaction sites is presented. The model is designed to work in continuous space; hence efficient global-optimization methods can be applied. In this work, we adopted the distance-scaling method that is based on continuous deformation of the original rugged energy hypersurface to obtain a smoothed surface. The method has been applied successfully to predict the structures of simple motifs, such as the three-helix bundle structure of the 10-58 fragment of staphylococcal protein A in de novo folding simulations and more complicated motifs in inverse-folding simulations. Received: 24 April 1998 / Accepted: 4 August 1998 / Published online: 2 November 1998

10.
11.
Water is an important component of living systems and deserves to be better understood in chemistry and biology. However, because of the difficulty of investigating the functions of water in protein structures, it is usually ignored in computational modeling, especially in the field of computer‐aided drug design. Here, using the potential of mean force (PMF) approach, we constructed a water PMF (wPMF) based on 3946 non‐redundant high‐resolution crystal structures. The extracted wPMF potential was first used to investigate the structural pattern of water and to analyze residue hydrophilicity. Then, the relationship between the wPMF score and the B factor of crystal waters was studied. It was found that wPMF agrees well with some previously reported experimental observations. In addition, the wPMF score was tested in parallel with 3D‐RISM for its ability to retrieve experimentally observed waters, and showed comparable performance at much lower computational cost. Finally, we proposed a grid‐based clustering scheme together with a distance‐weighted wPMF score to extend wPMF to the prediction of potential hydration sites in protein structures. In our test, this approach predicted hydration sites with an accuracy of about 80% when the calculated score was lower than −4.0. It also allows the assessment of whether or not a given water molecule should be targeted for displacement in ligand design. Overall, the wPMF presented here provides an alternative solution to many water‐related computational modeling problems, some of which can be highly valuable as part of a rational drug design strategy. © 2012 Wiley Periodicals, Inc.

12.
Summary: P450SU1 and P450SU2 are herbicide-inducible bacterial cytochrome P450 enzymes from Streptomyces griseolus. They have two of the highest sequence identities to camphor hydroxylase (P450cam from Pseudomonas putida), the cytochrome P450 with the first known crystal structure. We have built several models of these two proteins to investigate the variability in the structures that can arise from using different modeling protocols. We looked at variability due to alignment methods, backbone loop conformations and refinement methods. We have constructed two models for each protein using two alignment algorithms, and then an additional model using an identical alignment but different loop conformations for both buried and surface loops. The alignments used to build the models were created using the Needleman-Wunsch method, adapted for multiple sequences, and a manual method that utilized both a dot-matrix search matrix and the Needleman-Wunsch method. After constructing the initial models, several energy minimization methods were used to explore the variability in the final models caused by the choice of minimization techniques. Features of cytochrome P450cam and the cytochrome P450 superfamily, such as the ferredoxin binding site, the heme binding site and the substrate binding site, were used to evaluate the validity of the models. Although the final structures were very similar between the models with different alignments, active-site residues were found to be dependent on the conformations of buried loops and the early stages of energy minimization. We show which regions of the active site are the most dependent on the particular methods used, and which parts of the structures seem to be independent of the methods.
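
For readers unfamiliar with the Needleman-Wunsch method named in this abstract, the sketch below computes the optimal global alignment score for two sequences. It is the textbook pairwise form with a linear gap penalty, not the multiple-sequence adaptation the authors used, and the `match`, `mismatch` and `gap` values are illustrative assumptions rather than a real substitution matrix.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the DP table.
    The scoring values are placeholders for illustration.
    """
    # Row 0: aligning an empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # column 0: prefix of a against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap      # gap inserted in b
            left = cur[j - 1] + gap  # gap inserted in a
            cur.append(max(diag, up, left))
        prev = cur
    return prev[-1]
```

Because the score depends on the chosen penalties, two reasonable parameter sets can yield different alignments of the same sequences, which is one source of the model variability the abstract investigates.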

13.
The protein‐protein docking server ClusPro is used by thousands of laboratories, and models built by the server have been reported in over 300 publications. Although the structures generated by the docking include near‐native ones for many proteins, selecting the best model is difficult due to the uncertainty in scoring. Small angle X‐ray scattering (SAXS) is an experimental technique for obtaining low resolution structural information in solution. While not sufficient on its own to uniquely predict complex structures, accounting for SAXS data improves the ranking of models and facilitates the identification of the most accurate structure. Although SAXS profiles are currently available only for a small number of complexes, due to its simplicity the method is becoming increasingly popular. Since combining docking with SAXS experiments will provide a viable strategy for fairly high‐throughput determination of protein complex structures, the option of using SAXS restraints is added to the ClusPro server. © 2015 Wiley Periodicals, Inc.

14.
The Fas antigen, a cell surface receptor belonging to the tumor necrosis factor receptor (TNFR) superfamily, triggers programmed cell death (apoptosis) in the immune system. The three-dimensional structure of Fas and molecular details of the interaction between Fas and its ligand are currently unknown. A three-dimensional model of the Fas extracellular region was generated by comparative modeling. Inverse folding analysis suggested good sequence–structure compatibility of the model and thus reasonable accuracy. The model was analyzed in the light of information provided by studies on TNFR and CD40, another member of the TNFR family, and the Fas ligand binding site was predicted.

15.
Protein–protein interactions play a central role in the biological processes of cells. Accurate prediction of the interacting residues in protein–protein interactions enhances understanding of the interaction mechanisms and enables in silico mutagenesis, which can help facilitate drug design and deepen our understanding of the inner workings of cells. Correlations have been found among interacting residues as a result of selection pressure to retain the interaction during evolution. In previous work, incorporating such correlations into interaction profile hidden Markov models with a special decoding algorithm (ETB-Viterbi) led to improved prediction accuracy. In this work, we first demonstrate the sub-optimality of the ETB-Viterbi algorithm, and then reformulate the optimality of decoding paths to include correlations between interacting residues. To identify optimal decoding paths, we propose a post-decoding re-ranking algorithm based on a genetic algorithm with simulated annealing and show that the new method achieves an increase of nearly 14% in prediction accuracy over the ETB-Viterbi algorithm.

16.
One of the major challenges for protein tertiary structure prediction strategies is the quality of conformational sampling algorithms, which must effectively and readily search the protein fold space to generate near‐native conformations. In an effort to advance the field by making the best use of available homology as well as fold recognition approaches along with ab initio folding methods, we have developed Bhageerath‐H Strgen, a homology/ab initio hybrid algorithm for protein conformational sampling. The methodology is tested on the benchmark CASP9 dataset of 116 targets. In 93% of the cases, a structure with TM‐score ≥ 0.5 is generated in the pool of decoys. Further, the performance of Bhageerath‐H Strgen was seen to be efficient in comparison with different decoy generation methods. The algorithm is web‐enabled as the Bhageerath‐H Strgen web tool, which is freely accessible for protein decoy generation ( http://www.scfbio‐iitd.res.in/software/Bhageerath‐HStrgen1.jsp ). © 2013 Wiley Periodicals, Inc.

17.
Parameterization and test calculations of a reduced protein model with new energy terms are presented. The new energy terms retain the steric properties and the most significant degrees of freedom of protein side chains in an efficient way using only one to three virtual atoms per amino acid residue. The energy terms are implemented in a force field containing predefined secondary structure elements as constraints, electrostatic interaction terms, and a solvent‐accessible surface area term to include the effect of solvation. In the force field the main‐chain peptide units are modeled as electric dipoles, which have constant directions in α‐helices and β‐sheets and variable conformation‐dependent directions in loops. Protein secondary structures can be readily modeled using these dipole terms. Parameters of the force field were derived using a large set of experimental protein structures and refined by minimizing RMS errors between the experimental structures and structures generated using molecular dynamics simulations. The final average RMS error was 3.7 Å for the main‐chain virtual atoms (Cα atoms) and 4.2 Å for all virtual atoms for a test set of 10 proteins with 58–294 amino acid residues. The force field was further tested with a substantially larger test set of 608 proteins yielding somewhat lower accuracy. The fold recognition capabilities of the force field were also evaluated using a set of 27,814 misfolded decoy structures. © 2001 John Wiley & Sons, Inc. J Comput Chem 22: 1229–1242, 2001

18.
Summary: Proteins tend to use recurrent structural motifs on all levels of organization. In this paper we first survey the topics of recurrent motifs on the local secondary structure level and on the global fold level. Then, we focus on the intermediate level which we call the short structural motifs. We were able to identify a set of structural building blocks that are very common in protein structure. We suggest that these building blocks can be used as an important link between the primary sequence and the tertiary structure. In this framework, we present our latest results on the structural variability of the extended strand motifs. We show that extended strands can be divided into three distinct structural classes, each with its own sequence specificity. Other approaches to the study of short structural motifs are reviewed.

19.
The variable predictive model based class discrimination (VPMCD) algorithm is proposed as an effective protein secondary structure classification tool. The algorithm mathematically represents the characteristic amino acid interactions specific to each protein structure and exploits them to distinguish between different structures. The new concept and the VPMCD classifier are established using well-studied datasets containing four protein classes as benchmarks. The protein samples, selected from the SCOP and PDB databases with varying homology (25-100%) and a non-uniform distribution of class samples, pose a challenging classification problem. The performance of the new method is compared with advanced classification algorithms such as component-coupled methods, SVM and neural networks. VPMCD provides superior performance for high-homology datasets. 100% classification is achieved in the self-consistency test, and an improvement of 5% in prediction accuracy is obtained in the jackknife test. The sensitivity of the new algorithm is investigated by varying model structures/types and sequence homology. The VPMCD algorithm, which is simpler to implement, is observed to be a robust classification technique and shows potential for effective extension to other clinical diagnosis and data mining applications in biological systems.

20.
All currently leading protein secondary structure prediction methods use a multiple protein sequence alignment to predict the secondary structure of the top sequence. In most of these methods, alignment positions showing a gap in the top sequence are deleted prior to prediction, which shrinks the alignment and discards position-specific information. In this paper we investigate the effect of this removal of information on secondary structure prediction accuracy. To this end, we have designed SymSSP, an algorithm that post-processes the predicted secondary structure of all sequences in a multiple sequence alignment by (i) making use of the alignment's evolutionary information and (ii) re-introducing most of the information that would otherwise be lost. The post-processed information is then given to a new dynamic programming routine that produces an optimally segmented consensus secondary structure for each of the sequences in the multiple alignment. We have tested our method on the state-of-the-art secondary structure prediction methods PHD, PROFsec, SSPro2 and JNET using the HOMSTRAD database of reference alignments. Our consensus-deriving dynamic programming strategy is consistently better at improving the segmentation quality of the predictions than the commonly used majority voting technique. In addition, we have applied several weighting schemes from the literature to our novel consensus-deriving dynamic programming routine. Finally, we have investigated the level of noise introduced by prediction errors into the consensus and show that predictions of the edges of helices and strands are wrong half of the time for all four tested prediction methods.
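
The majority voting baseline that this abstract compares against can be sketched as a per-column vote over aligned predictions. This shows only the baseline, not SymSSP's dynamic programming routine; the function name and the H/E/C state strings are illustrative assumptions, and ties are broken in favour of the state seen first in the column.

```python
from collections import Counter


def majority_vote(predictions):
    """Per-position consensus of equal-length secondary structure strings.

    Each string is one predicted state sequence (e.g. H, E, C), already
    aligned column by column. Ties go to the state encountered first.
    """
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)
```

Because every column is decided independently, the vote can produce implausibly short segments, which is the segmentation weakness a consensus built by dynamic programming is meant to address.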
