Similar literature
 20 similar documents found (search time: 15 ms)
1.
Proteins are often characterized in terms of their primary, secondary, tertiary, and quaternary structure. Algorithms such as define secondary structure of proteins (DSSP) can automatically assign protein secondary structure based on the backbone hydrogen‐bonding pattern. However, the assignment of secondary structure elements (SSEs) becomes a challenge when only the Cα coordinates are available. In this work, we present protein C‐alpha secondary structure output (PCASSO), a fast and accurate program for assigning protein SSEs using only the Cα positions. PCASSO achieves ~95% accuracy with respect to DSSP and takes ~0.1 s using a single processor to analyze a 1000 residue system with multiple chains. Our approach was compared with current state‐of‐the‐art Cα‐based methods and was found to outperform all of them in both speed and accuracy. A practical application is also presented and discussed. © 2014 Wiley Periodicals, Inc.  相似文献   

2.
Preferred conformations of amino acid side chains have been well established through statistically obtained rotamer libraries. Typically, these provide bond torsion angles allowing a side chain to be traced atom by atom. In cases where it is desirable to reduce the complexity of a protein representation or prediction, fixing all side-chain atoms may prove unwieldy. Therefore, we introduce a general parametrization to allow positions of representative atoms (in the present study, these are terminal atoms) to be predicted directly given backbone atom coordinates. Using a large, culled data set of amino acid residues from high-resolution protein crystal structures, anywhere from 1 to 7 preferred conformations were observed for each terminal atom of the non-glycine residues. Side-chain length from the backbone Cα is one of the parameters determined for each conformation, which should itself be useful. Prediction of terminal atoms was then carried out for a second, nonredundant set of protein structures to validate the data set. Using four simple probabilistic approaches, the Monte Carlo style prediction of terminal atom locations given only backbone coordinates produced an average root mean-square deviation (RMSD) of approximately 3 Å from the experimentally determined terminal atom positions. With prediction using conditional probabilities based on the side-chain χ1 rotamer, this average RMSD was improved to 1.74 Å. The observed terminal atom conformations therefore provide reasonable and potentially highly accurate representations of side-chain conformation, offering a viable alternative to existing all-atom rotamers for any case where reduction in protein model complexity, or in the amount of data to be handled, is desired. One application of this representation with strong potential is the prediction of charge density in proteins. This would likely be especially valuable on protein surfaces, where side chains are much less likely to be fixed in single rotamers. Prediction of ensembles of structures provides a method to determine the probability density of charge and atom location; such a prediction is demonstrated graphically.

3.
The sequence-specific assignment of resonances is still the most time-consuming procedure that is necessary as the first step in high-resolution NMR studies of proteins. In many cases a reliable three-dimensional (3D) structure of the protein is available, for example, from X-ray spectroscopy or homology modeling. Here we introduce the st2nmr program that uses the 3D structure and Nuclear Overhauser Effect spectroscopy (NOESY) peak list(s) to evaluate and optimize trial sequence-specific assignments of spin systems derived from correlation spectra to residues of the protein. A distance-dependent target function that scores trial assignments based on the presence of expected NOESY crosspeaks is optimized in a Monte Carlo fashion. The performance of the program st2nmr is tested on real NMR data of an alpha-helical (cytochrome c) and beta-sheet (lipocalin) protein using homology models and/or X-ray structures; it succeeded in completely reproducing the correct sequence-specific assignments in most cases using 2D and/or 15N/13C Nuclear Overhauser Effect (NOE) data. Additionally to amino acid residues the program can also handle ligands that are bound to the protein, such as heme, and can be used as a complementary tool to fully automated assignment procedures.  相似文献   

4.
The major rate-limiting step in high-throughput NMR protein structure determination involves the calculation of a reliable initial fold, the elimination of incorrect nuclear Overhauser enhancement (NOE) assignments, and the resolution of NOE assignment ambiguities. We present a robust approach to automatically calculate structures with a backbone coordinate accuracy of 1.0-1.5 A from datasets in which as much as 80% of the long-range NOE information (i.e., between residues separated by more than five positions in the sequence) is incorrect. The current algorithm differs from previously published methods in that it has been expressly designed to ensure that the results from successive cycles are not biased by the global fold of structures generated in preceding cycles. Consequently, the method is highly error tolerant and is not easily funnelled down an incorrect path in either three-dimensional structure or NOE assignment space. The algorithm incorporates three main features: a linear energy function representation of the NOE restraints to allow maximization of the number of simultaneously satisfied restraints during the course of simulated annealing; a method for handling the presence of multiple possible assignments for each NOE cross-peak which avoids local minima by treating each possible assignment as if it were an independent restraint; and a probabilistic method to permit both inactivation and reactivation of all NOE restraints on the fly during the course of simulated annealing. NOE restraints are never removed permanently, thereby significantly reducing the likelihood of becoming trapped in a false minimum of NOE assignment space. The effectiveness of the algorithm is demonstrated using completely automatically peak-picked experimental NOE data from two proteins: interleukin-4 (136 residues) and cyanovirin-N (101 residues). The limits of the method are explored using simulated data on the 56-residue B1 domain of Streptococcal protein G.  相似文献   

5.
A first step toward predicting the structure of a protein is to determine its secondary structure. The secondary structure information is generally used as starting point to solve protein crystal structures. In the present study, a machine learning approach based on a complete set of two-class scoring functions was used. Such functions discriminate between two specific structural classes or between a single specific class and the rest. The approach uses a hierarchical scheme of scoring functions and a neural network. The parameters are determined by optimizing the recall of learning data. Quality control is performed by predicting separate independent test data. A first set of scoring functions is trained to correlate the secondary structures of residues with profiles of sequence windows of width 15, centered at these residues. The sequence profiles are obtained by multiple sequence alignment with PSI-BLAST. A second set of scoring functions is trained to correlate the secondary structures of the center residues with the secondary structures of all other residues in the sequence windows used in the first step. Finally, a neural network is trained using the results from the second set of scoring functions as input to make a decision on the secondary structure class of the residue in the center of the sequence window. Here, we consider the three-class problem of helix, strand, and other secondary structures. The corresponding prediction scheme "SPARROW" was trained with the ASTRAL40 database, which contains protein domain structures with less than 40% sequence identity. The secondary structures were determined with DSSP. In a loose assignment, the helix class contains all DSSP helix types (α, 3-10, π), the strand class contains β-strand and β-bridge, and the third class contains the other structures. In a tight assignment, the helix and strand classes contain only α-helix and β-strand classes, respectively. 
A 10-fold cross validation showed less than 0.8% deviation in the fraction of correct structure assignments between true prediction and recall of data used for training. Using sequences of 140,000 residues as a test data set, 80.46% ± 0.35% of secondary structures are predicted correctly in the loose assignment, a prediction performance that is very close to the best results in the field. Most applications use the loose assignment. However, the tight assignment yields 2.25% better prediction performance. With each individual prediction, we also provide a confidence measure giving the probability that the prediction is correct. The SPARROW software can be used and downloaded from the Web page http://agknapp.chemie.fu-berlin.de/sparrow/ .
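The loose and tight class assignments described in the SPARROW abstract can be illustrated with a small sketch. It is not the SPARROW code: the function and dictionary names are hypothetical, and it assumes the usual DSSP one-letter codes (H for α-helix, G for 3-10 helix, I for π-helix, E for β-strand, B for β-bridge), with every other state placed in the third class, written here as C.

```python
# Illustrative sketch only; names are hypothetical and not taken from SPARROW.
# Assumes DSSP one-letter codes: H (alpha), G (3-10), I (pi), E (strand), B (bridge).

# Loose assignment: all helix types count as helix, strand and bridge as strand.
LOOSE_MAP = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}

# Tight assignment: only alpha-helix and beta-strand keep their class.
TIGHT_MAP = {"H": "H", "E": "E"}


def reduce_dssp(states: str, scheme: str = "loose") -> str:
    """Reduce a string of DSSP states to three classes: H, E, or C (other)."""
    if scheme == "loose":
        mapping = LOOSE_MAP
    elif scheme == "tight":
        mapping = TIGHT_MAP
    else:
        raise ValueError("scheme must be 'loose' or 'tight'")
    # Any state not in the mapping (turns, bends, blanks, ...) becomes 'C'.
    return "".join(mapping.get(state, "C") for state in states)


if __name__ == "__main__":
    example = "HGIEBTS-"
    print(reduce_dssp(example, "loose"))
    print(reduce_dssp(example, "tight"))
```

Under the tight scheme the 3-10 and π-helix states and the β-bridge fall into the third class, which is the only difference between the two mappings in this sketch.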

6.
Protein NMR spectroscopy has expanded dramatically over the last decade into a powerful tool for the study of their structure, dynamics, and interactions. The primary requirement for all such investigations is sequence‐specific resonance assignment. The demand now is to obtain this information as rapidly as possible and in all types of protein systems, stable/unstable, soluble/insoluble, small/big, structured/unstructured, and so on. In this context, we introduce here two reduced dimensionality experiments – (3,2)D‐hNCO canH and (3,2)D‐hN coCA nH – which enhance the previously described 2D NMR‐based assignment methods quite significantly. Both the experiments can be recorded in just about 2–3 h each and hence would be of immense value for high‐throughput structural proteomics and drug discovery research. The applicability of the method has been demonstrated using alpha‐helical bovine apo calbindin‐D9k P43M mutant (75 aa) protein. Automated assignment of this data using AUTOBA has been presented, which enhances the utility of these experiments. The backbone resonance assignments so derived are utilized to estimate secondary structures and the backbone fold using Web‐based algorithms. Taken together, we believe that the method and the protocol proposed here can be used for routine high‐throughput structural studies of proteins. Copyright © 2014 John Wiley & Sons, Ltd.  相似文献   

7.
A novel, yet simple and automated, protocol for reconstruction of complete peptide backbones from C(alpha) coordinates only is described, validated, and benchmarked. The described method collates a set of possible backbone conformations for each set of residue triads from a structural library derived from the PDB. The optimal permutation of these three residue segments of backbone conformations is determined using the dead-end elimination (DEE) algorithm. Putative conformations are evaluated using a pairwise-additive knowledge-based forcefield term and a fragment overlap term. The protocol described in this report is able to restore the full backbone coordinates to within 0.2-0.6 A of the actual crystal structure from C(alpha) coordinates only. In addition, it is insensitive to errors in the input C(alpha) coordinates with RMSDs of 3.0 A, and this is illustrated through application to deliberately distorted C(alpha) traces. The entire process, as described, is rapid, requiring of the order of a few minutes for a typical protein on a typical desktop PC. Approximations enable this to be reduced to a few seconds, although this is at the expense of prediction accuracy. This compares very favorably to previously published methods, being sufficiently fast for general use and being one of the most accurate methods. Because the method is not restricted to the reconstruction from only C(alpha) coordinates, reconstruction based on C(beta) coordinates is also demonstrated.  相似文献   

8.
Machine learning algorithms have a wide range of applications in bioinformatics and computational biology, such as prediction of protein secondary structures, solvent accessibility, binding site residues in protein complexes, protein folding rates, stability of mutant proteins, and discrimination of proteins based on their structure and function. In this work, we focus on two aspects of prediction: (i) protein folding rates and (ii) stability of proteins upon mutation. We briefly introduce the concepts of protein folding rates and stability, along with available databases, features for prediction methods, and measures of prediction performance. Subsequently, the development of structure-based parameters and their relationship with protein folding rates will be outlined. The structure-based parameters are helpful for understanding the physical basis of protein folding and stability. Further, basic principles of major machine learning techniques will be mentioned, and their applications for predicting protein folding rates and stability of mutant proteins will be illustrated. The machine learning techniques could achieve the highest accuracy in predicting protein folding rates and stability. In essence, statistical methods and machine learning algorithms complement each other for understanding and predicting protein folding rates and the stability of protein mutants. The available online resources on protein folding rates and stability will be listed.

9.
Fully automated structure determination of proteins in solution (FLYA) yields, without human intervention, three-dimensional protein structures starting from a set of multidimensional NMR spectra. Integrating existing and new software, automated peak picking over all spectra is followed by peak list filtering, the generation of an ensemble of initial chemical shift assignments, the determination of consensus chemical shift assignments for all (1)H, (13)C, and (15)N nuclei, the assignment of NOESY cross-peaks, the generation of distance restraints, and the calculation of the three-dimensional structure by torsion angle dynamics. The resulting, preliminary structure serves as additional input to the second stage of the procedure, in which a new ensemble of chemical shift assignments and a refined structure are calculated. The three-dimensional structures of three 12-16 kDa proteins computed with the FLYA algorithm coincided closely with the conventionally determined structures. Deviations were below 0.95 A for the backbone atom positions, excluding the flexible chain termini. 96-97% of all backbone and side-chain chemical shifts in the structured regions were assigned to the correct residues. The purely computational FLYA method is suitable for substituting all manual spectra analysis and thus overcomes a main efficiency limitation of the NMR method for protein structure determination.  相似文献   

10.
A new method is described for generating all-atom protein structures from Cα-atom information. The method, which combines both local structural trace alignments and comparative side chain modeling with ab initio side chain modeling, makes use of both the virtual-bond and the dipole-path methods. Provided that 3D structures of structurally and functionally related proteins exist, the method presented here is highly suitable for generating all-atom coordinates of partly solved, low-resolution crystal structures. In particular, the active site region can be modeled accurately with this procedure, which enables investigation of the binding modes of different classes of ligands with molecular dynamics simulations. The method is applied to the trace of Streptococcus pneumoniae, in order to construct an all-atom structure of the transpeptidase domain. Since after generation of full coordinates of the transpeptidase domain the structure had been solved to 2.4 Å resolution, new X-ray coordinates for the worst modeled loop (residues T370 to M386; 17 out of a total number of 351 residues constituting the transpeptidase domain) were incorporated, as kindly provided by Dr. Dideberg. The structure was relaxed with molecular dynamics simulations and simulated annealing methods. The RMS deviation between the 144 aligned Cα-atoms and the corresponding ones in the originally solved 3.5 Å resolution crystal structure was 0.98 Å. The 351 Cα-atoms of the whole transpeptidase domain of the final model showed an RMS deviation of 1.58 Å. The Ramachandran plot showed that 79.3% of the residues are in the most favored regions, with only 1.0% occurring in disallowed regions. The model presented here can be used to investigate the three-dimensional influences of mutations around the active site of PBP2x.

11.
Accurate prediction of protein secondary structure is essential for accurate sequence alignment, three-dimensional structure modeling, and function prediction. The accuracy of ab initio secondary structure prediction from sequence, however, has only increased from around 77 to 80% over the past decade. Here, we developed a multistep neural-network algorithm by coupling secondary structure prediction with prediction of solvent accessibility and backbone torsion angles in an iterative manner. Our method, called SPINE X, was applied to a dataset of 2640 proteins (25% sequence identity cutoff) previously built for the first version of SPINE and achieved an 82.0% accuracy based on 10-fold cross validation (Q(3)). Surpassing 81% accuracy by SPINE X is further confirmed by employing an independently built test dataset of 1833 protein chains, a recently built dataset of 1975 proteins, and 117 CASP 9 targets (critical assessment of structure prediction techniques), with accuracies of 81.3%, 82.3%, and 81.8%, respectively. The prediction accuracy is further improved to 83.8% for the dataset of 2640 proteins if the DSSP assignment used above is replaced by a more consistent consensus secondary structure assignment method. Comparison to the popular PSIPRED and CASP-winning structure-prediction techniques is made. SPINE X predicts the number of helices and sheets correctly for 21.0% of 1833 proteins, compared to 17.6% by PSIPRED. It is further shown that SPINE X consistently makes more accurate predictions in helical residues (6%) without overprediction, while PSIPRED makes more accurate predictions in coil residues (3-5%) and overpredicts them by 7%. The SPINE X server and its training/test datasets are available at http://sparks.informatics.iupui.edu/
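The Q(3) figure quoted in the SPINE X abstract is conventionally the percentage of residues whose predicted three-class state (helix, strand, coil) matches the reference assignment. A minimal sketch of that calculation follows; the function name and the example strings are made up for illustration and are not taken from SPINE X.

```python
# Illustrative sketch of a per-residue three-state accuracy (Q3).
# The function name and example strings are hypothetical.


def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of positions where predicted == observed."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count residues whose predicted class matches the reference class.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    pred = "HHHEECCC"
    ref = "HHHEECCH"
    print(q3_accuracy(pred, ref))  # 7 of 8 residues match: 87.5
```

Because the score depends on the reference assignment, changing that reference (for example from DSSP to a consensus method, as the abstract reports) changes the number even when the predictions stay the same.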

12.
A novel strategy for fast NMR resonance assignment of (15)N HSQC spectra of proteins is presented. It requires the structure coordinates of the protein, a paramagnetic center, and one or more residue-selectively (15)N-labeled samples. Comparison of sensitive undecoupled (15)N HSQC spectra recorded of paramagnetic and diamagnetic samples yields data for every cross-peak on pseudocontact shift, paramagnetic relaxation enhancement, cross-correlation between Curie-spin and dipole-dipole relaxation, and residual dipolar coupling. Comparison of these four different paramagnetic quantities with predictions from the three-dimensional structure simultaneously yields the resonance assignment and the anisotropy of the susceptibility tensor of the paramagnetic center. The method is demonstrated with the 30 kDa complex between the N-terminal domain of the epsilon subunit and the theta subunit of Escherichia coli DNA polymerase III. The program PLATYPUS was developed to perform the assignment, provide a measure of reliability of the assignment, and determine the susceptibility tensor anisotropy.  相似文献   

13.
Four types of polypeptide (1)J(C alpha X) couplings are examined, involving the main-chain carbon C(alpha) and any of four possible substituents. A total of 3105 values of (1)J(C alpha H alpha), (1)J(C alpha C beta), (1)J(C alpha C'), and (1)J(C alpha N') were collected from six proteins, averaging 143.4 +/- 3.3, 34.9 +/- 2.5, 52.6 +/- 0.9, and 10.7 +/- 1.2 Hz, respectively. Analysis of variance (ANOVA) reveals a variety of factors impacting on (1)J and ranks their relative statistical significance and importance to biomolecular NMR structure refinement. Accordingly, the spread in the (1)J values is attributed, in equal proportions, to amino-acid specific substituent patterns and to polypeptide-chain geometry, specifically torsions phi, psi, and chi(1) circumjacent to C(alpha). The (1)J coupling constants correlate with protein secondary structure. For alpha-helical phi, psi combinations, (1)J(C alpha H alpha) is elevated by more than one standard deviation (147.8 Hz), while both (1)J(C alpha N') and (1)J(C alpha C beta) fall short of their grand means (9.5 and 33.7 Hz). Rare positive phi torsion angles in proteins exhibit concomitant small (1)J(C alpha H alpha) and (1)J(C alpha N') (138.4 and 9.6 Hz) and large (1)J(C alpha C beta) (39.9 Hz) values. The (1)J(C alpha N') coupling varies monotonically over the phi torsion range typical of beta-sheet secondary structure and is largest (13.3 Hz) for phi around -160 degrees. All four coupling types depend on psi and thus help determine a torsion that is notoriously difficult to assess by traditional approaches using (3)J. Influences on (1)J stemming from protein secondary structure and other factors, such as amino-acid composition, are largely independent.

14.
Knowledge of chemical shift-structure relationships could greatly facilitate the NMR chemical shift assignment and structure refinement processes that occur during peptide/protein structure determination via NMR spectroscopy. To determine whether such correlations exist for amino acid residues containing polar side chains, the serine dipeptide model, For-L-Ser-NH(2), was studied. Using the GIAO-RHF/6-31+G(d) and GIAO-RHF/TZ2P levels of theory, the NMR chemical shifts of all hydrogen ((1)H(N), (1)H(alpha), (1)H(beta1), (1)H(beta2)), carbon ((13)C(alpha), (13)C(beta), (13)C'), and nitrogen ((15)N) atoms have been computed for all 44 stable conformers of For-L-Ser-NH(2). An attempt was made to establish a correlation between the chemical shift of each nucleus and the major conformational variables (omega(0), phi, psi, omega(1), chi(1), and chi(2)). At both levels of theory, a linear correlation can be observed for (1)H(alpha)/phi, (13)C(alpha)/phi, and (13)C(alpha)/psi. These results indicate that the backbone and side-chain structures of For-L-Ser-NH(2) have a strong influence on its chemical shifts.

15.
In the NMR experiment, the protein backbone motion can be described by the N–H order parameters. Though protein dynamics is determined by a complex network of atomic interactions, we show that the order parameter of residues can be determined using a very simple method, the weighted protein contact number model. We computed for each Cα atom the number of neighboring Cα atoms weighted by the inverse distance squared between them. We show that the weighted contact number of each residue is directly related to its order parameter. Despite the simplicity of this model, it performs better than the other method. Since we can compute the order parameters directly from the topological properties (such as protein contact number) of protein structures, our study underscores a very direct link between protein topological structure and its dynamics.  相似文献   
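The weighted contact number described in this abstract, the count of neighboring Cα atoms weighted by the inverse squared distance, can be written as w_i = Σ_{j≠i} 1 / r_ij². The sketch below computes it in plain Python. It assumes the sum runs over all other Cα atoms with no distance cutoff, which the abstract does not specify; the function name and the toy coordinates are hypothetical, and the mapping from w_i to an order parameter is not reproduced because the abstract does not give it.

```python
# Illustrative sketch of a weighted contact number per C-alpha atom:
#   w_i = sum over j != i of 1 / r_ij**2
# Assumption: no distance cutoff is applied (not stated in the abstract).
from typing import List, Sequence


def weighted_contact_numbers(coords: Sequence[Sequence[float]]) -> List[float]:
    """Return one weighted contact number per C-alpha coordinate (x, y, z)."""
    n = len(coords)
    wcn = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Squared Euclidean distance between atoms i and j.
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if d2 == 0.0:
                raise ValueError("two atoms share the same coordinates")
            weight = 1.0 / d2
            # The pair contributes equally to both atoms.
            wcn[i] += weight
            wcn[j] += weight
    return wcn


if __name__ == "__main__":
    # Three toy atoms on a line, 2 units apart (not real protein data).
    toy = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    print(weighted_contact_numbers(toy))  # [0.3125, 0.5, 0.3125]
```

In the toy example the middle atom has the largest value because it is close to both neighbors, which matches the intuition that densely packed residues are more restricted in their motion.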

16.
An automatic procedure is proposed for reconstruction of a protein backbone from its C(alpha)-trace; it is based on optimization of a simplified energy function of a peptide backbone, given its alpha-carbon trace. The energy is expressed as a sum of the energies of interaction between backbone peptide groups that are not neighbors in the sequence, the energies of local interactions within all amino acid residues, and a harmonic penalty function accounting for the conservation of standard bond angles. The energy of peptide group interactions is calculated using the assumption that each peptide group acts as a point dipole. For local interaction energy, use is made of a two-dimensional Fourier series expansion of the energies of model terminally blocked amino acid residues, calculated with the Empirical Conformational Energy Program for Peptides (ECEPP/3) force field in the angles lambda((1)) and lambda((2)) defining the rotation of peptide groups adjacent to a C(alpha) carbon atom about the corresponding C(alpha)···C(alpha) virtual-bond axes. To explore all possible rotations of peptide groups within a fixed C(alpha)-trace, a Monte Carlo search is carried out. The initial lambda angles are calculated by aligning the dipoles of the peptide groups that are close in space, subject to the condition of favorable local interactions. After the Monte Carlo search is accomplished with the simplified energy function, the energy of the structure is minimized with the ECEPP/3 force field, with imposition of distance constraints corresponding to the initial C(alpha)-trace geometry. The procedure was tested on model alpha-helices and beta-sheets, as well as on the crystal structure of the immunoglobulin binding protein (PDB code: 1IGD, an alpha/beta protein). In all cases, complete backbone geometry was reconstructed with a root-mean-square (rms) deviation of 0.5 Å from the all-atom target structure.

17.
We present a computer simulation model of polymer melts representing each chain as one single particle. Besides the position coordinate of each particle, we introduce a parameter n(ij) for each pair of particles i and j within a specified distance from each other. These numbers, called entanglement numbers, describe the deviation of the system of ignored coordinates from its equilibrium state for the given configuration of the centers of mass of the polymers. The deviations of the entanglement numbers from their equilibrium values give rise to transient forces, which, together with the conservative forces derived from the potential of mean force, govern the displacements of the particles. We have applied our model to a melt of C(800)H(1602) chains at 450 K and have found good agreement with experiments and more detailed simulations. Properties addressed in this paper are radial distribution functions, dynamic structure factors, and linear as well as nonlinear rheological properties.  相似文献   

18.
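Using four density functional theory methods (the hybrid functionals B3LYP and B3PW91, the Perdew-Wang 91 exchange and correlation functional PW91PW91, and the local spin density approximation SVWN), a variety of possible structures of the Al5, Al5- and Al5+ clusters were studied. Their stable structures and spin states were found and compared with existing theoretical results. The adiabatic and vertical electron detachment energies of Al5- and the adiabatic and vertical ionization potentials of Al5 were also calculated; they agree well with the relevant experimental data. In addition, the results of the four density functional methods are compared and discussed.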

19.
In a wide variety of proteins, insolubility presents a challenge to structural biology, as X-ray crystallography and liquid-state NMR are unsuitable. Indeed, no general approach is available as of today for studying the three-dimensional structures of membrane proteins and protein fibrils. We here demonstrate, at the example of the microcrystalline model protein Crh, how high-resolution 3D structures can be derived from magic-angle spinning solid-state NMR distance restraints for fully labeled protein samples. First, we show that proton-mediated rare-spin correlation spectra, as well as carbon-13 spin diffusion experiments, provide enough short, medium, and long-range structural restraints to obtain high-resolution structures of this 2 x 10.4 kDa dimeric protein. Nevertheless, the large number of 13C/15N spins present in this protein, combined with solid-state NMR line widths of about 0.5-1 ppm, induces substantial ambiguities in resonance assignments, preventing 3D structure determination by using distance restraints uniquely assigned on the basis of their chemical shifts. In the second part, we thus demonstrate that an automated iterative assignment algorithm implemented in a dedicated solid-state NMR version of the program ARIA permits to resolve the majority of ambiguities and to calculate a de novo 3D structure from highly ambiguous solid-state NMR data, using a unique fully labeled protein sample. We present, using distance restraints obtained through the iterative assignment process, as well as dihedral angle restraints predicted from chemical shifts, the 3D structure of the fully labeled Crh dimer refined at a root-mean-square deviation of 1.33 A.  相似文献   

20.
β-Barrel membrane proteins are found in the outer membrane of gram-negative bacteria, mitochondria, and chloroplasts. They are important for pore formation, membrane anchoring, and enzyme activity. These proteins are also often responsible for bacterial virulence. Due to difficulties in experimental structure determination, they are sparsely represented in the protein structure databank. We have developed a computational method for predicting structures of the transmembrane (TM) domains of β-barrel membrane proteins. Based on physical principles, our method can predict structures of the TM domain of β-barrel membrane proteins of novel topology, including those from eukaryotic mitochondria. Our method is based on a model of physical interactions, a discrete conformational state space, an empirical potential function, as well as a model to account for interstrand loop entropy. We are able to construct three-dimensional atomic structures of the TM domains from sequences for a set of 23 nonhomologous proteins (resolution 1.8-3.0 Å). The median rmsd of TM domains containing 75-222 residues between predicted and measured structures is 3.9 Å for main chain atoms. In addition, stability determinants and protein-protein interaction sites can be predicted. Such predictions on the eukaryotic mitochondrial outer membrane proteins Tom40 and VDAC are confirmed by independent mutagenesis and chemical cross-linking studies. These results suggest that our model captures key components of the organization principles of β-barrel membrane protein assembly.
