Similar literature
 Found 20 similar documents; search took 78 ms
1.
Prediction of protein folding rates from amino acid sequences is one of the most important challenges in molecular biology. In this work, I have related the protein folding rates with physical-chemical, energetic and conformational properties of amino acid residues. I found that the classification of proteins into different structural classes shows an excellent correlation between amino acid properties and folding rates of two- and three-state proteins, indicating the importance of native state topology in determining the protein folding rates. I have formulated a simple linear regression model for predicting the protein folding rates from amino acid sequences along with structural class information and obtained an excellent agreement between predicted and experimentally observed folding rates of proteins; the correlation coefficients are 0.99, 0.96 and 0.95, respectively, for all-alpha, all-beta and mixed class proteins. This is the first available method, which is capable of predicting the protein folding rates just from the amino acid sequence with the aid of generic amino acid properties and structural class information.
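The regression idea in this abstract can be illustrated with a minimal sketch. It assumes a single property (the Kyte-Doolittle hydropathy scale, chosen here only as a stand-in; the abstract does not say which properties were used) and a single fitted line, whereas the paper describes separate models per structural class. The function names `mean_property`, `fit_line`, and `predict_ln_kf` are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used only as a stand-in property scale
# (an assumption; the abstract does not name the properties it uses).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_property(seq, scale=KD):
    """Average a per-residue property over the whole sequence."""
    return sum(scale[r] for r in seq) / len(seq)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_ln_kf(seq, slope, intercept, scale=KD):
    """Predict ln(k_f) for a sequence from a fitted line (hypothetical helper)."""
    return slope * mean_property(seq, scale) + intercept
```

In use, one would call `fit_line` on the mean property values and measured ln(k_f) values of the proteins in one structural class, then apply `predict_ln_kf` to a new sequence of the same class.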

2.
The prediction of protein unfolding rates from amino acid sequences is one of the most important challenges in computational biology and chemistry. Analysis of the relationship between protein unfolding rates and physical-chemical, energetic, and conformational properties of amino acid residues provides valuable information to understand and predict the unfolding rates of two- and three-state proteins. We found that the classification of proteins into different structural classes shows an excellent correlation between amino acid properties and unfolding rates of two- and three-state proteins, indicating the importance of native-state topology in determining the protein unfolding rates. We have formulated three independent linear regression equations for different structural classes of proteins for predicting their unfolding rates from amino acid sequences and obtained an excellent agreement between predicted and experimentally observed unfolding rates of proteins; the correlation coefficients are 0.999, 0.990, and 0.992, respectively, for all-alpha, all-beta, and mixed-class proteins. Further, we have derived a general equation applicable to all structural classes of proteins, which can be used for predicting the unfolding rates for proteins of an unknown structural class. We observed a correlation of 0.987 and 0.930, respectively, for back-check and jack-knife tests. These accuracy levels are better than those of other methods in the literature.
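The difference between the back-check and jack-knife correlations quoted above can be made concrete with a small sketch. It assumes a one-variable linear model, which is a simplification of the regression equations in the abstract; `jackknife_predictions` and the other names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return sab / math.sqrt(saa * sbb)


def back_check_predictions(xs, ys):
    """Fit on all samples, then predict the same samples (in-sample test)."""
    slope, intercept = fit_line(xs, ys)
    return [slope * x + intercept for x in xs]


def jackknife_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    return preds
```

Comparing `pearson(ys, back_check_predictions(xs, ys))` with `pearson(ys, jackknife_predictions(xs, ys))` shows how much of the in-sample agreement survives when each protein is excluded from its own fit.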

3.
Prediction of protein folding rate change upon amino acid substitution is an important and challenging problem in protein folding kinetics and design. In this work, we have analyzed the relationship between amino acid properties and folding rate change upon mutation. Our analysis showed that the correlation is not significant with any of the studied properties in a dataset of 476 mutants. Further, we have classified the mutants based on their locations in different secondary structures and solvent accessibility. For each category, we have selected a specific combination of amino acid properties using a genetic algorithm and developed a prediction scheme based on quadratic regression models for predicting the folding rate change upon mutation. Our results showed a 10-fold cross validation correlation of 0.72 between experimental and predicted change in protein folding rates. The correlation is 0.73, 0.65 and 0.79, respectively, in strand, helix and coil segments. The method has been further tested with an extended dataset of 621 mutants and a blind dataset of 62 mutants, and we observed a good agreement with experiments. We have developed a web server for predicting the folding rate change upon mutation and it is available at .

4.
Understanding the relationship between amino acid sequences and folding rate of proteins is a challenging task similar to the protein folding problem. In this work, we have analyzed the relative importance of protein sequence and structure for predicting the protein folding rates in terms of amino acid properties and contact distances, respectively. We found that the parameters derived with protein sequence (physical-chemical, energetic, and conformational properties of amino acid residues) show very weak correlation (|r| < 0.39) with folding rates of 28 two-state proteins, indicating that the sequence information alone is not sufficient to understand the folding rates of two-state proteins. However, the maximum positive correlation obtained for the properties, number of medium-range contacts, and alpha-helical tendency reveals the importance of local interactions to initiate protein folding. On the other hand, a remarkable correlation (r varies from -0.74 to -0.88) has been obtained between structural parameters (contact order, long-range order, and total contact distance) and protein folding rates. Further, we found that the secondary structure content and solvent accessibility play a marginal role in determining the folding rates of two-state proteins. Multiple regression analysis carried out with the combination of three properties, beta-strand tendency, enthalpy change, and total contact distance improved the correlation to 0.92 with protein folding rates. The relative importance of existing methods along with multiple-regression model proposed in this work will be discussed. Our results demonstrate that the native-state topology is the major determinant for the folding rates of two-state proteins.
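Of the structural parameters named in this abstract, relative contact order is the simplest to sketch. The version below assumes one representative atom per residue (for example C-alpha coordinates), an 8 Å distance cutoff, and a minimum sequence separation of 2; all three are illustrative assumptions rather than values taken from the abstract, and `relative_contact_order` is a hypothetical name.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_sep=2):
    """Mean sequence separation of contacting residue pairs, divided by chain length.

    coords  : list of (x, y, z) tuples, one per residue (assumed C-alpha atoms)
    cutoff  : distance in angstroms below which two residues count as in contact
    min_sep : smallest |i - j| considered, so trivially adjacent residues are skipped
    """
    n = len(coords)
    separations = [
        j - i
        for i in range(n)
        for j in range(i + min_sep, n)
        if math.dist(coords[i], coords[j]) <= cutoff
    ]
    if not separations:
        return 0.0
    return sum(separations) / (n * len(separations))
```

A low value means that contacts are mostly local in sequence, which is the regime the abstract associates with faster folding of two-state proteins.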

5.
One of the most important challenges in computational and molecular biology is to understand the relationship between amino acid sequences and the folding rates of proteins. Recent works suggest that topological parameters, amino acid properties, chain length and the composition index relate well with protein folding rates; however, sequence order information has seldom been considered as a property for predicting protein folding rates. In this study, amino acid sequence order was used to derive an effective method, based on an extended version of the pseudo-amino acid composition, for predicting protein folding rates without any explicit structural information. Using the jackknife cross validation test, the method was demonstrated on the largest dataset (99 proteins) reported. The method was found to provide a good correlation between the predicted and experimental folding rates. The correlation coefficient is 0.81 (with a highly significant level) and the standard error is 2.46. The reported algorithm was found to perform better than several representative sequence-based approaches using the same dataset. The results indicate that sequence order information is an important determinant of protein folding rates.
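To show how sequence order can enter a fixed-length feature vector, here is a minimal sketch of the basic pseudo-amino acid composition, not the extended version used in the paper. It assumes a single standardized property (Kyte-Doolittle hydropathy as a stand-in), a maximum lag `lam` of 3, and a weight `w` of 0.05; these choices and the name `pse_aac` are assumptions for illustration.

```python
from statistics import mean, pstdev

AA = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy, used as the single stand-in property (an assumption).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Standardize the property over the 20 amino acids (zero mean, unit variance).
_MU = mean(KD.values())
_SD = pstdev(KD.values())
H = {a: (KD[a] - _MU) / _SD for a in AA}


def pse_aac(seq, lam=3, w=0.05):
    """Return 20 composition terms followed by lam sequence-order terms."""
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    freqs = [seq.count(a) / n for a in AA]
    # theta_k: mean squared property difference between residues k positions apart
    thetas = [
        sum((H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(n - k)) / (n - k)
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]
```

The first 20 components carry plain composition, while the remaining `lam` components change when the same residues are rearranged, which is what lets a regression model see sequence order.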

6.
The ability to predict protein folding rates constitutes an important step in understanding the overall folding mechanisms. Although many of the prediction methods are structure based, successful predictions can also be obtained from the sequence. We developed a novel method called prediction of protein folding rates (PPFR), for the prediction of protein folding rates from protein sequences. PPFR implements a linear regression model for each of the mainstream folding dynamics including two-, multi-, and mixed-state proteins. The proposed method provides predictions characterized by strong correlations with the experimental folding rates, which equal 0.87 for the two- and multistate proteins and 0.82 for the mixed-state proteins, when evaluated with out-of-sample jackknife test. Based on in-sample and out-of-sample tests, the PPFR's predictions are shown to be better than most other sequence-only and structure-based predictors and complementary to the predictions of the most recent sequence-based QRSM method. We show that simultaneous incorporation of several characteristics, including the sequence, physicochemical properties of residues, and predicted secondary structure provides improved quality. This hybridized prediction model was analyzed to reveal the complementary factors that can be used in tandem to predict folding rates. We show that bigger proteins require more time for folding, higher helical and coil content and the presence of Phe, Asn, and Gln may accelerate the folding process, the inclusion of Ile, Val, Thr, and Ser may slow down the folding process, and for the two-state proteins increased beta-strand content may decelerate the folding process. Finally, PPFR provides strong correlation when predicting sequences with low similarity.

7.
Machine learning algorithms have a wide range of applications in bioinformatics and computational biology such as prediction of protein secondary structures, solvent accessibility, binding site residues in protein complexes, protein folding rates, stability of mutant proteins, and discrimination of proteins based on their structure and function. In this work, we focus on two aspects of predictions: (i) protein folding rates and (ii) stability of proteins upon mutations. We briefly introduce the concepts of protein folding rates and stability along with available databases, features for prediction methods and measures for prediction performance. Subsequently, the development of structure based parameters and their relationship with protein folding rates will be outlined. The structure based parameters are helpful to understand the physical basis for protein folding and stability. Further, basic principles of major machine learning techniques will be mentioned and their applications for predicting protein folding rates and stability of mutant proteins will be illustrated. The machine learning techniques could achieve the highest accuracy of predicting protein folding rates and stability. In essence, statistical methods and machine learning algorithms are complementing each other for understanding and predicting protein folding rates and the stability of protein mutants. The available online resources on protein folding rates and stability will be listed.

8.
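Predicting folding rates is of great significance for elucidating the mechanism of protein folding. In this work, we collected 115 protein samples with currently known folding rates (including two-state, multi-state, and mixed-state proteins). To characterize the primary-structure information of the protein molecules comprehensively, we extracted a total of 9357 features: sequence length, multi-scale composition of amino acid residues, k-space features of residue pairs, and geostatistical correlations based on the physicochemical properties of residues. After an improved binary matrix shuffling filter and multiple rounds of nonlinear worst-descriptor elimination, 23 retained features with clear physicochemical meaning were obtained. The resulting nonlinear support vector regression model achieved a correlation coefficient of R = 0.95 in jackknife cross-validation, better than values reported in the literature and those obtained with other reference feature selection methods. The support vector regression interpretation system showed that the nonlinear regression between folding rate and the retained descriptors is highly significant. We analyzed the effect of each retained descriptor on the folding rate; the results indicate that protein folding rates are closely related to sequence length, medium- and short-range correlation features, and triplet residue composition features.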

9.
Prediction of transmembrane beta-strands in outer membrane proteins (OMP) is one of the important problems in computational chemistry and biology. In this work, we propose a method based on neural networks for identifying the membrane-spanning beta-strands. We introduce the concept of "residue probability" for assigning residues in transmembrane beta-strand segments. The performance of our method is evaluated with single-residue accuracy, correlation, specificity, and sensitivity. Our predicted segments show a good agreement with experimental observations with an accuracy level of 73% solely from amino acid sequence information. Further, the predictive power of N- and C-terminal residues in each segment, number of segments in each protein, and the influence of cutoff probability for identifying membrane-spanning beta-strands will be discussed. We have developed a Web server for predicting the transmembrane beta-strands from the amino acid sequence, and the prediction results are available at http://psfs.cbrc.jp/tmbeta-net/.

10.
A modified Sammon algorithm was developed to display a relationship between proteins based on their amino acid composition. In the first stage of the method, a 19-dimensional compositional space of representative proteins was mapped into a two-dimensional space (2D) using the original Sammon projection creating a contour map. In the second stage, this contour map was used as a reference for new proteins projected into 2D. Data analysis showed that proteins belonging to the same structural classes formed characteristic and distinct clusters, which could be potentially useful in the prediction of protein structural classes. However, we observed significant overlapping of the clusters, which may explain the limited success of previous protein folding prediction based solely on amino acid composition. Regardless, the modified Sammon projections can generate a unique index for each individually projected protein related to its amino acid composition, which may be a useful tool in the exploratory classification of proteins. ©1999 John Wiley & Sons, Inc. J Comput Chem 20: 1049–1059, 1999

11.
Protein structures can be mainly classified into four classes: all-alpha, all-beta, alpha/beta, and alpha + beta proteins, according to their chain fold topologies. For the purpose of predicting the protein structural class, a new predicting algorithm, in which the increment of diversity is combined with quadratic discriminant analysis, is presented to study and predict protein structural class. On the basis of the concept of the pseudo amino acid composition (Chou, Proteins: Struct Funct Genet 2001, 43, 246; Erratum: Proteins Struct Funct Genet 2001, 44, 60), 400 dipeptide components and 20 amino acid compositions are, respectively, selected as parameters of diversity source. A total of 204 nonhomologous proteins constructed by Chou (Chou, Biochem Biophys Res Commun 1999, 264, 216) are used for training and testing the predictive model. The predicted results by using the pseudo amino acids approach as proposed in this paper can remarkably improve the success rates, and hence the current method may play a complementary role to other existing methods for predicting protein structural classification.
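The increment of diversity mentioned above can be sketched under a commonly used definition, which is assumed here rather than taken from the abstract: for a count vector with entries n_i and total N, the diversity is D = N ln N - Σ n_i ln n_i, and the increment between two sources X and Y is ID(X, Y) = D(X + Y) - D(X) - D(Y). The sketch covers only this measure and leaves out the quadratic discriminant step; the function names are hypothetical.

```python
import math


def diversity(counts):
    """D = N ln N - sum(n_i ln n_i), with zero counts contributing nothing."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(c * math.log(c) for c in counts if c > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y) for two equal-length count vectors."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)
```

Under this definition the increment is zero when the two count vectors have the same proportions and grows as they diverge, so a query protein would be assigned to the class whose reference counts (for example the 400 dipeptide counts) give the smallest increment.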

12.
The structural class is an important attribute used to characterize the overall folding type of a protein or its domain. Since the concept of protein structural class was developed about 3 decades ago based on a visual inspection of polypeptide chain topologies in a dataset of only 31 globular proteins, the number of structure-known proteins has increased rapidly. For example, as of 12-July-2005, the entries deposited into RCSB PDB Protein Data Bank for proteins, peptides, and viruses whose 3-dimensional structures were determined by X-ray and NMR techniques have increased to 28,920. To properly cover more and more structure-known proteins, some modification and expansion from the original structural classification scheme have been developed. Meanwhile, many different approaches have been proposed for predicting the structural class of proteins. In this review, the new classification schemes are briefly introduced. The attention is focused on the progress in structural class prediction and its impact in stimulating the development of identifying the other attributes of proteins. It is interesting to point out that the development of the latter has actually in turn greatly enriched the power of the former. Some promising approaches for the further development of protein structural class prediction are also addressed.

13.
14.
Discriminating outer membrane proteins from other folding types of globular and membrane proteins is an important problem both for detecting outer membrane proteins from genomic sequences and for the successful prediction of their secondary and tertiary structures. In this work, we have systematically analyzed the distribution of amino acid residues in the sequences of globular and outer membrane proteins. We observed that the occurrence of two neighboring aliphatic and polar residues is significantly higher in outer membrane proteins than in globular proteins. From the information about the dipeptide composition we have devised a statistical method for discriminating outer membrane proteins from other globular and membrane proteins. Our approach correctly picked up the outer membrane proteins with an accuracy of 95% for the training set of 337 proteins. On the other hand, our method has correctly excluded the globular proteins at an accuracy of 79% in a non-redundant dataset of 674 proteins. Furthermore, the present method is able to correctly exclude alpha-helical membrane proteins up to an accuracy of 87%. These accuracy levels are comparable to other methods in the literature. The influence of protein size and structural class for discrimination is discussed.
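The dipeptide composition that this abstract builds on is straightforward to compute. The sketch below counts overlapping residue pairs and returns a 400-component frequency vector; skipping pairs that contain non-standard residue codes is a design choice made here, and the name `dipeptide_composition` is hypothetical. The statistical discrimination rule itself is not described in the abstract and is not reproduced.

```python
from itertools import product

AA = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order ("AA", "AC", ..., "YY").
DIPEPTIDES = ["".join(p) for p in product(AA, repeat=2)]


def dipeptide_composition(seq):
    """Frequency of each of the 400 dipeptides among overlapping pairs in seq."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard codes such as X
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]
```

Comparing such vectors between a set of outer membrane proteins and a set of globular proteins is one way to see which neighboring residue pairs are over-represented in each group.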

15.
Hydroxyl radical protein footprinting coupled to mass spectrometry has been developed over the last decade and has matured into a powerful method for analyzing protein structure and dynamics. It has been successfully applied in the analysis of protein structure, protein folding, protein dynamics, and protein–protein and protein–DNA interactions. Using synchrotron radiolysis, exposure of proteins to a ‘white’ X‐ray beam for milliseconds provides sufficient oxidative modification to surface amino acid side chains, which can be easily detected and quantified by mass spectrometry. Thus, conformational changes in proteins or protein complexes can be examined using a time‐resolved approach, which would be a valuable method for the study of macromolecular dynamics. In this review, we describe a new application of hydroxyl radical protein footprinting to probe the time evolution of the calcium‐dependent conformational changes of gelsolin on the millisecond timescale. The data suggest a cooperative transition as multiple sites in different molecular subdomains have similar rates of conformational change. These findings demonstrate that time‐resolved protein footprinting is suitable for studies of protein dynamics that occur over periods ranging from milliseconds to seconds. In this review, we also show how the structural resolution and sensitivity of the technology can be improved. The hydroxyl radical varies in its reactivity to different side chains by over two orders of magnitude, thus oxidation of amino acid side chains of lower reactivity is more rarely observed in such experiments. Here we demonstrate that the selected reaction monitoring (SRM)‐based method can be utilized for quantification of oxidized species, improving the signal‐to‐noise ratio. This expansion of the set of oxidized residues of lower reactivity will improve the overall structural resolution of the technique. This approach is also suggested as a basis for developing hypothesis‐driven structural mass spectrometry experiments. Copyright © 2010 John Wiley & Sons, Ltd.

16.
17.
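Prediction of protein structural classes based on correlations between amino acid residues   Cited by: 2 (self-citations: 0, citations by others: 2)
As the building blocks of proteins, the 20 amino acid residues show specific mutual correlations in each type of protein; these reflect constraints among the residues and have deep underlying physical and chemical causes. The correlation coefficients between certain amino acid residue pairs can serve as features that distinguish one type of protein from the others, and can be used to predict protein structural class. We studied the correlation coefficients of amino acid residue pairs in 204 samples from four types of proteins, identified the correlated residue pairs that can serve as features of protein structural class, and used them for structural class prediction. In a cross test on the 204 protein samples of α, β, α/β, and α+β proteins, the accuracies were 94%, 89%, 79%, and 89%, respectively, with an average of 88%, higher than those of the simple distance method and the Euclidean distance method.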

18.
The folding of an extended protein to its unique native state requires establishment of specific, predetermined, often distant, contacts between amino acid residue pairs. The dynamics of contact pair formation between various hydrophobic residues during folding of two different small proteins, the chicken villin head piece (HP-36) and the Alzheimer protein beta-amyloid (betaA-40), are investigated by Brownian dynamics (BD) simulations. These two proteins represent two very different classes: HP-36 is globular, while betaA-40 is nonglobular and stringlike. Hydropathy scale and nonlocal helix propensity of amino acids are used to model the complex interaction potential among the various amino acid residues. The minimalistic model we use here employs a connected backbone chain of atoms of equal size while an amino acid is attached to each backbone atom as an additional atom of differing sizes and interaction parameters, determined by the characteristics of each amino acid. Even for such simple models, we find that the low-energy structures obtained by BD simulations of both the model proteins mimic the native state of the real protein rather well, with a best root-mean-square deviation of 4.5 Å for HP-36. For betaA-40 (where a single well-defined structure is not available), the simulated structures resemble the reported ensemble rather well, with the well-known beta-bend correctly reproduced. We introduce and calculate a contact pair distance time correlation function, C^P_ij(t), to quantify the dynamical evolution of the pair contact formation between the amino acid residue pairs i and j. The contact pair time correlation function exhibits multistage dynamics, including a two stage fast collapse, followed by a slow (microsecond long) late stage dynamics for several specific pairs. The slow late stage dynamics is in accordance with the findings of Sali et al. Analysis of the individual trajectories shows that the slow decay is due to the attempt of the protein to form energetically more favorable pair contacts to replace the less favorable ones. This late stage contact formation is a highly cooperative process, involving participation of several pairs and thus entropically unfavorable and expected to face a large free energy barrier. This is because any new pair contact formation among hydrophobic pairs will require breaking of several contacts, before the favorable ones can be formed. This aspect of protein folding dynamics is similar to relaxation in glassy liquids, where also alpha relaxation requires highly cooperative process of hopping. The present analysis suggests that waiting time for the necessary pair contact formation may obey the Poissonian distribution. We also study the dynamics of Förster energy transfer during folding between two tagged amino acid pairs. This dynamics can be studied by fluorescence resonance energy transfer (FRET). It is found that suitably placed donor-acceptor pairs can capture the slow dynamics during folding. The dynamics probed by FRET is predicted to be nonexponential.
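The distance dependence that makes Förster transfer a probe of folding can be written down in a few lines. The sketch below uses the standard Förster expressions for the transfer rate and efficiency as functions of donor-acceptor distance; it is an illustration of that textbook relation, not of the simulation protocol in the abstract, and the function names and parameter values are hypothetical.

```python
def transfer_rate(r, r0, tau_donor):
    """Forster transfer rate k = (1 / tau_donor) * (r0 / r)**6.

    r         : donor-acceptor distance
    r0        : Forster radius, in the same units as r
    tau_donor : donor lifetime in the absence of the acceptor
    """
    return (1.0 / tau_donor) * (r0 / r) ** 6


def fret_efficiency(r, r0):
    """Transfer efficiency E = 1 / (1 + (r / r0)**6)."""
    return 1.0 / (1.0 + (r / r0) ** 6)
```

Because both quantities depend on the sixth power of the distance, a donor-acceptor pair whose separation fluctuates slowly around the Förster radius during late-stage folding produces a strongly varying signal, which is consistent with the nonexponential kinetics the abstract predicts.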

19.
Beta-barrel membrane proteins perform a variety of functions, such as mediating non-specific, passive transport of ions and small molecules, selectively passing molecules such as maltose and sucrose, and acting as voltage-dependent anion channels. Understanding the structural features of beta-barrel membrane proteins and detecting them in genomic sequences are challenging tasks in structural and functional genomics. In this review, with the survey of experimentally known amino acid sequences and structures, the characteristic features of amino acid residues in beta-barrel membrane proteins and novel parameters for understanding their folding and stability will be described. The development of statistical methods and machine learning techniques for discriminating beta-barrel membrane proteins from other folding types of globular and membrane proteins will be explained along with their relative importance. Further, different methods including hydrophobicity profiles, rule based approach, amino acid properties, neural networks, hidden Markov models etc. for predicting membrane spanning segments of beta-barrel membrane proteins will be discussed. In addition, the applications of discrimination techniques for detecting beta-barrel membrane proteins in genomic sequences will be outlined. In essence, this comprehensive review would provide an overall picture about beta-barrel membrane proteins starting from the construction of datasets to genome-wide applications.

20.
Prediction of membrane spanning segments in β‐barrel outer membrane proteins (OMP) and their topology is an important problem in structural and functional genomics. In this work, we propose a method based on radial basis networks for predicting the number of β‐strands in OMPs and identifying their membrane spanning segments. Our method showed a leave‐one‐out cross validation accuracy of 96% in a set of 28 OMPs, which have the range of 8–22 β‐strand segments. The β‐strand segments in OMPs and the residues in membrane spanning segments are correctly predicted with the accuracy of 96% and 87%, respectively. We have developed a web server, TMBETAPRED‐RBF for predicting the transmembrane β‐strands from amino acid sequence and it is available at http://rbf.bioinfo.tw/~sachen/tmrbf.html . We suggest that our method could be an effective tool for predicting the membrane spanning regions and topology of β‐barrel membrane proteins. © 2009 Wiley Periodicals, Inc. J Comput Chem 2010
