Similar literature
1.
Discriminating outer membrane proteins from other folding types of globular and membrane proteins is an important problem both for detecting outer membrane proteins from genomic sequences and for the successful prediction of their secondary and tertiary structures. In this work, we have systematically analyzed the distribution of amino acid residues in the sequences of globular and outer membrane proteins. We observed that the occurrence of two neighboring aliphatic and polar residues is significantly higher in outer membrane proteins than in globular proteins. From the information about the dipeptide composition we have devised a statistical method for discriminating outer membrane proteins from other globular and membrane proteins. Our approach correctly picked up the outer membrane proteins with an accuracy of 95% for the training set of 337 proteins. On the other hand, our method correctly excluded the globular proteins with an accuracy of 79% in a non-redundant dataset of 674 proteins. Furthermore, the present method is able to correctly exclude alpha-helical membrane proteins with an accuracy of up to 87%. These accuracy levels are comparable to other methods in the literature. The influence of protein size and structural class on discrimination is discussed.
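As a rough illustration of the dipeptide composition mentioned in the abstract above, the following sketch counts overlapping residue pairs in a sequence and reports each pair as a percentage of all pairs. The function name `dipeptide_composition`, the percentage normalization, and the handling of non-standard residues are assumptions made for this example; the abstract does not give the authors' exact procedure.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_composition(sequence):
    """Return the percentage of each of the 400 possible dipeptides.

    Overlapping neighbouring pairs are counted; pairs containing a
    non-standard residue are skipped (an assumption of this sketch).
    """
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if all(c in AMINO_ACIDS for c in p)]
    counts = Counter(pairs)
    total = len(pairs)
    return {
        a + b: (100.0 * counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


if __name__ == "__main__":
    comp = dipeptide_composition("AVAV")
    # "AVAV" contains the pairs AV, VA, AV.
    print(round(comp["AV"], 2), round(comp["VA"], 2))
```

A discriminator of the kind described would compare such composition vectors between protein classes; that comparison step is not shown here.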

2.
Prediction of protein folding rates from amino acid sequences is one of the most important challenges in molecular biology. In this work, I have related protein folding rates to physical-chemical, energetic and conformational properties of amino acid residues. I found that the classification of proteins into different structural classes shows an excellent correlation between amino acid properties and folding rates of two- and three-state proteins, indicating the importance of native state topology in determining protein folding rates. I have formulated a simple linear regression model for predicting protein folding rates from amino acid sequences along with structural class information and obtained an excellent agreement between predicted and experimentally observed folding rates of proteins; the correlation coefficients are 0.99, 0.96 and 0.95, respectively, for all-alpha, all-beta and mixed class proteins. This is the first available method capable of predicting protein folding rates from the amino acid sequence alone, with the aid of generic amino acid properties and structural class information.

3.
Unlike all-helical membrane proteins, beta-barrel membrane proteins cannot be successfully discriminated from other proteins, especially from all-beta soluble proteins. This paper analyzes the amino acid composition in the membrane parts of 12 beta-barrel membrane proteins versus the beta-strands of 79 all-beta soluble proteins. The average and variance of the amino acid composition in these two classes are calculated. Amino acids such as Gly, Asn and Val, which are most likely associated with classification, are selected based on Fisher's discriminant ratio. A linear classifier built with the composition of these selected amino acids in observed beta-strands achieves 100% classification accuracy for the 12 membrane proteins and 79 soluble proteins in a four-fold cross-validation experiment. Since the accuracy of secondary structure prediction is at present quite high, a promising method to identify beta-barrel membrane proteins is presented based on the linear classifier coupled with predicted secondary structure. Applied to 241 beta-barrel membrane proteins and 3855 soluble proteins with various structures, the method achieves 85.48% sensitivity (206/241) and 92.53% specificity (3567/3855).
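The sensitivity and specificity quoted in the abstract above follow directly from the reported counts. The short check below reproduces them; the function names are chosen for this example only, and the false-negative and false-positive counts are derived by subtraction from the totals given in the abstract.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual positives that were correctly identified."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of actual negatives that were correctly rejected."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # 206 of 241 beta-barrel membrane proteins were identified.
    sens = sensitivity(206, 241 - 206)
    # 3567 of 3855 soluble proteins were correctly excluded.
    spec = specificity(3567, 3855 - 3567)
    print(f"sensitivity = {sens:.2%}")  # sensitivity = 85.48%
    print(f"specificity = {spec:.2%}")  # specificity = 92.53%
```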

4.
Machine learning algorithms have a wide range of applications in bioinformatics and computational biology, such as prediction of protein secondary structures, solvent accessibility, binding site residues in protein complexes, protein folding rates, stability of mutant proteins, and discrimination of proteins based on their structure and function. In this work, we focus on two aspects of prediction: (i) protein folding rates and (ii) stability of proteins upon mutation. We briefly introduce the concepts of protein folding rates and stability along with available databases, features for prediction methods and measures of prediction performance. Subsequently, the development of structure based parameters and their relationship with protein folding rates will be outlined. The structure based parameters are helpful for understanding the physical basis of protein folding and stability. Further, basic principles of major machine learning techniques will be mentioned and their applications for predicting protein folding rates and stability of mutant proteins will be illustrated. The machine learning techniques could achieve the highest accuracy in predicting protein folding rates and stability. In essence, statistical methods and machine learning algorithms complement each other for understanding and predicting protein folding rates and the stability of protein mutants. The available online resources on protein folding rates and stability will be listed.

5.
Understanding the relationship between amino acid sequences and folding rates of proteins is a challenging task similar to the protein folding problem. In this work, we have analyzed the relative importance of protein sequence and structure for predicting protein folding rates in terms of amino acid properties and contact distances, respectively. We found that the parameters derived from protein sequence (physical-chemical, energetic, and conformational properties of amino acid residues) show very weak correlation (|r| < 0.39) with folding rates of 28 two-state proteins, indicating that sequence information alone is not sufficient to understand the folding rates of two-state proteins. However, the maximum positive correlation obtained for the properties, number of medium-range contacts, and alpha-helical tendency reveals the importance of local interactions in initiating protein folding. On the other hand, a remarkable correlation (r varies from -0.74 to -0.88) has been obtained between structural parameters (contact order, long-range order, and total contact distance) and protein folding rates. Further, we found that secondary structure content and solvent accessibility play a marginal role in determining the folding rates of two-state proteins. Multiple regression analysis carried out with the combination of three properties, beta-strand tendency, enthalpy change, and total contact distance, improved the correlation with protein folding rates to 0.92. The relative importance of existing methods along with the multiple-regression model proposed in this work will be discussed. Our results demonstrate that native-state topology is the major determinant of the folding rates of two-state proteins.

6.
Understanding the relationship between amino acid sequences and folding rates of proteins is an important task in computational and molecular biology. In this work, we have systematically analyzed the composition of amino acid residues for proteins with different ranges of folding rates. We observed that the polar residues Asn, Gln, Ser, and Lys are dominant in fast folding proteins, whereas the hydrophobic residues Ala, Cys, Gly, and Leu prefer to be in slow folding proteins. Further, we have developed a method based on quadratic response surface models for predicting the folding rates of 77 two- and three-state proteins. Our method showed a correlation of 0.90 between experimental and predicted protein folding rates using the leave-one-out cross-validation method. The classification of proteins based on structural class improved the correlation to 0.98, and it is 0.99, 0.98, and 0.96, respectively, for all-alpha, all-beta, and mixed class proteins. In addition, we have utilized Bayesian classification theory for discriminating two- and three-state proteins, which showed an accuracy of 90%. We have developed a web server for predicting protein folding rates and it is available at http://bioinformatics.myweb.hinet.net/foldrate.htm.

7.
Circular Dichroism (CD) relies on the differential absorption of left and right circularly polarised radiation by chromophores which either possess intrinsic chirality or are placed in chiral environments. Proteins possess a number of chromophores which can give rise to CD signals. In the far UV region (240-180 nm), which corresponds to peptide bond absorption, the CD spectrum can be analysed to give the content of regular secondary structural features such as alpha-helix and beta-sheet. The CD spectrum in the near UV region (320-260 nm) reflects the environments of the aromatic amino acid side chains and thus gives information about the tertiary structure of the protein. Other non-protein chromophores such as flavin and haem moieties can give rise to CD signals which depend on the precise environment of the chromophore concerned. Because of its relatively modest resource demands, CD has been used extensively to give useful information about protein structure, the extent and rate of structural changes and ligand binding. In the protein design field, CD is used to assess the structure and stability of the designed protein fragments. Studies of protein folding make extensive use of CD to examine the folding pathway; the technique has been especially important in characterising molten globule intermediates which may be involved in the folding process. CD is an extremely useful technique for assessing the structural integrity of membrane proteins during extraction and characterisation procedures. The interactions between chromophores can give rise to characteristic CD signals. This is well illustrated by the case of the light harvesting complex from photosynthetic bacteria, where the CD spectra can be analysed to indicate the extent of orbital overlap between the rings of bacteriochlorophyll molecules. It is therefore evident that CD is a versatile technique in structural biology, with an increasingly wide range of applications.

8.
The localization of membrane transporters at the forefront of natural barriers makes these proteins very interesting due to their involvement in the absorption and distribution of nutrients and xenobiotics, including drugs. Over the years, structure/function relationship studies have been performed employing several strategies, including chemical modification of exposed amino acid residues. These approaches are very meaningful when applied to membrane transporters, given that these proteins are characterized by both hydrophobic and hydrophilic domains with different degrees of accessibility to the chemicals employed. Besides basic features, the chemical targeting approaches can also disclose information useful for pharmacological applications. An eminent example is the histidine/large amino acid transporter SLC7A5, known as LAT1 (Large Amino Acid Transporter 1). This protein is crucial in cell life because it is responsible for mediating the absorption and distribution of essential amino acids in specific body districts, such as the blood brain barrier and placenta. Furthermore, LAT1 can recognize a large variety of molecules of pharmacological interest and is also considered a hot target for drugs due to its over-expression in virtually all human cancers. Therefore, it is not surprising that the chemical targeting approach, coupled with bioinformatics, site-directed mutagenesis and transport assays, proved fundamental in describing features of LAT1 such as the substrate binding site, regulatory domains and interactions with drugs, which will be discussed in this review. The results on LAT1 can be considered to have general applicability to other transporters linked with human diseases.

9.
Combinatorial protein libraries provide a promising route to investigate the determinants and features of protein folding and to identify novel folding amino acid sequences. A library of sequences based on a pool of different monomer types is screened for folding molecules, consistent with a particular foldability criterion. The number of sequences grows exponentially with the length of the polymer, making both experimental and computational tabulations of sequences infeasible. Herein a statistical theory is extended to specify the properties of sequences having particular values of global energetic quantities that specify their energy landscape. The theory yields the site-specific monomer probabilities. A foldability criterion is derived that characterizes the properties of sequences by quantifying the energetic separation of the target state from low-energy states in the unfolded ensemble and the fluctuations of the energies in the unfolded state ensemble. For a simple lattice model of proteins, excellent agreement is observed between the theory and the results of exact enumeration. The theory may be used to provide a quantitative framework for the design and interpretation of combinatorial experiments.

10.
We have developed a novel approach for dissecting transmembrane beta-barrel proteins (TMBs) in genomic sequences. The features include (i) the identification of TMBs using the preference of residue pairs in globular, transmembrane helical (TMH) and TMBs, (ii) elimination of globular/TMH proteins that show sequence identity of more than 70% for the coverage of 80% residues with known structures, (iii) elimination of globular/TMH proteins that have sequence identity of more than 60% with known sequences in SWISS-PROT, and (iv) exclusion of TMH proteins using SOSUI, a prediction system for TMH proteins. Our approach identified 7% TMBs in all the considered genomes. The comparison between the identified TMBs in the E. coli genome and available experimental data demonstrated that the new approach could correctly identify all 11 known TMBs whose crystal structures are available. Further, it revealed the presence of 19 TMBs with homology to known structures, 60 TMBs similar to well annotated sequences, and 54 TMBs that have high sequence similarity with Escherichia coli beta-barrel proteins deposited in the Transport Classification Database (TCDB). Interestingly, the present approach identified TMBs from all 15 families in TCDB. In the human genome, the occurrence of TMBs varies from 0 to 3% in different chromosomes. We suggest that our approach could lead to a step forward in the advancement of structural and functional genomics.

11.
The intriguing structural diversity in folded topologies available to guanine-rich nucleic acid repeat sequences has made four-stranded G-quadruplex structures the focus of both basic and applied research, from cancer biology and novel therapeutics through to nanoelectronics. Distributed widely in the human genome as targets for regulating gene expression and chromosomal maintenance, they offer unique avenues for future cancer drug development. In particular, recent advances in chemical and structural biology have enabled the construction of bespoke selective DNA based aptamers to be used as novel therapeutic agents, and access to detailed structural models for structure based drug discovery. In this critical review, we will explore the important underlying characteristics of G-quadruplexes that make them functional, stable, and predictable nanoscaffolds. We will review the current structural database of folding topologies, molecular interfaces and novel interaction surfaces, with consideration of their future exploitation in drug discovery, molecular biology, supermolecular assembly and aptamer design. In recent years the number of potential applications for G-quadruplex motifs has grown rapidly, so in this review we aim to explore the many future challenges and highlight where possible successes may lie. We will highlight the similarities and differences between DNA and RNA folded G-quadruplexes in terms of stability, distribution, and exploitability as small molecule targets. Finally, we will provide a detailed review of basic G-quadruplex geometry, the experimental tools used, and a critical evaluation of the application of high-resolution structural biology and its ability to provide meaningful and valid models for future applications (255 references).

12.
One of the most important challenges in computational and molecular biology is to understand the relationship between amino acid sequences and the folding rates of proteins. Recent works suggest that topological parameters, amino acid properties, chain length and the composition index relate well with protein folding rates; however, sequence order information has seldom been considered as a property for predicting protein folding rates. In this study, amino acid sequence order was used to derive an effective method, based on an extended version of the pseudo-amino acid composition, for predicting protein folding rates without any explicit structural information. Using the jackknife cross validation test, the method was demonstrated on the largest dataset (99 proteins) reported. The method was found to provide a good correlation between the predicted and experimental folding rates. The correlation coefficient is 0.81 (highly significant) and the standard error is 2.46. The reported algorithm was found to perform better than several representative sequence-based approaches using the same dataset. The results indicate that sequence order information is an important determinant of protein folding rates.
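The jackknife (leave-one-out) test named in the abstract above can be sketched as follows: each protein is held out in turn, a model is fitted on the rest, and the held-out value is predicted; the correlation is then computed between predicted and observed values. The single-feature linear model and all function names here are assumptions for illustration and stand in for the pseudo-amino acid composition model, which the abstract does not specify in detail.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def jackknife_predictions(xs, ys):
    """Predict each y from a model fitted on all other data points."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


if __name__ == "__main__":
    # Toy values invented for this example, not data from the paper.
    feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    observed = [2.9, 5.2, 6.8, 9.1, 11.2, 12.7]
    predicted = jackknife_predictions(feature, observed)
    print(round(pearson_r(predicted, observed), 3))
```

Because every prediction comes from a model that never saw the held-out point, the resulting correlation is a less optimistic estimate than one computed on the training fit.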

13.
Peptides based on the amino acid sequences found at protein-protein interaction sites make excellent leads for antagonist development. A statistical picture of amino acids involved in protein-protein interactions indicates that proteins recognize and interact with one another through a restricted set of specialized interface amino acid residues: Pro, Ile, Tyr, Trp, Asp and Arg. These amino acids represent residues from each of the three classes of amino acids, hydrophobic, aromatic and charged, with one anionic and one cationic residue at neutral pH. Peptides have been used successfully as drug leads in the search for antagonists of cell-surface receptors. Peptide, peptidomimetic, and non-peptide organic inhibitors of a class of cell surface receptors, the integrins, currently serve as therapeutic and diagnostic imaging agents. In this review, we discuss the structural features of protein-protein interactions as well as the design of peptides, peptidomimetics, and small organic molecules for the inhibition of protein-protein interactions. Information gained from studying inhibitors of integrin functions is now being applied to the design and testing of inhibitors of other protein-protein interactions. Most drug development progress in the past several decades has been made using the enzyme binding-pocket model of drug targets. Small molecules are designed to fit into the substrate-binding pockets of proteins based on a lock-and-key, induced-fit, or conformational ensemble model of the protein binding site. Traditionally, enzymes have been used as therapeutic drug targets because it was easier to develop rapid, sensitive screening assays, and to find low molecular weight inhibitors that blocked the active site. However, for proteins which interact with other proteins, rather than with small substrate molecules, the lack of binding pockets means that this approach will not generally succeed.
There exist many diseases in which the inhibition of protein-protein interactions would provide therapeutic benefit, but there are no general methods available to address such problems. The focus of the first part of this review is to discuss the features of protein-protein interactions which may serve as general guidelines for the development and design of inhibitors for protein-protein interactions. In the second part we focus on the design of peptides (lead compounds) and their conversion into peptidomimetics or small organic molecules for the inhibition of protein-protein interactions. We draw examples from the important and emerging area of integrin-based cell adhesion and show how the principles of protein-protein interactions are followed in the discovery, optimization and usage of specific protein interface peptides as drug leads.

14.
Only a vanishingly small proportion of the almost infinite number of possible proteins occur in nature. Can this remaining potential of structural and functional diversity be used in the construction of new proteins? Is a “second evolution” of proteins and enzymes about to occur? These questions have suddenly become of interest because the recombinant DNA technique allows the synthesis of any given amino acid sequence. Examples of enzyme models demonstrate clearly that the unusual catalytic properties of enzymes are associated with the presence of a specifically folded polypeptide chain which has a complex three-dimensional form. The critical hurdle in the path of artificial proteins is thus the design of amino acid sequences which are able to fold into tertiary structures. Recent studies on the topology and the mechanism of folding have provided considerable insight into the occurrence of, and the rules governing, the three-dimensional architecture of proteins. Secondary structures apparently play a key role in the folding process; helices and “β-structures” act as nucleation centers directing folding and account for the surprisingly small number of different folding topologies. The problem of secondary structure formation can be investigated directly by means of conformational studies on model peptides. Oligopeptides with tailor-made physicochemical, structural and conformational properties can already be designed. The theoretical and experimental basis for the construction of polypeptides with stable tertiary structures is therefore established. The path to macromolecules with an immense variety of novel properties lies before us.

15.
Despite a wide variety of biological functions, alpha-helical membrane proteins display a rather simple transmembrane architecture. Although not many high resolution structures of transmembrane proteins are available today, our understanding of membrane protein folding has advanced in recent years. We are now beginning to develop a basic understanding of the forces that guide folding and interaction of alpha-helical membrane proteins. Some structural requirements for transmembrane helix interactions have been defined, and common motifs which can drive helix-helix interactions have been discovered in recent years. Nevertheless, many open questions remain to be addressed in future studies. One general problem with investigating transmembrane helix interactions is the limited number of appropriate tools which can be applied to investigate membrane protein folding. Only recently have several new techniques been developed and established, including genetic systems, which allow measuring transmembrane helix interactions in vitro and in vivo. In the first part of this review, we summarize several aspects of the current understanding of membrane protein folding and assembly. In the second part, we discuss genetic systems which were developed in recent years to measure interaction of transmembrane helices in the inner membrane of E. coli.

16.
De novo design of artificial proteins is an essential approach to elucidate the principles of protein architecture, to understand specific functions of natural proteins, and to yield novel molecules for medical and industrial aims. We have designed artificial sequences of 153 amino acids to fit the main-chain framework of the sperm whale myoglobin structure, using knowledge-based energy functions to evaluate the compatibility between protein tertiary structures and amino acid sequences. The synthesized artificial globins bind a single heme per protein molecule as designed, and show well-defined electrochemical and spectroscopic features characteristic of proteins with a low-spin heme. Redox and ligand binding reactions of the artificial heme proteins were investigated and these heme-related functions were found to vary with their structural uniqueness. Relationships between the structural and functional properties are discussed.

17.
Through billions of years of evolution nature has created and refined structural proteins for a wide variety of specific purposes. Amino acid sequences and their associated folding patterns combine to create elastic, rigid or tough materials. In many respects, nature's intricately designed products provide challenging examples for materials scientists, but translation of natural structural concepts into bio-inspired materials requires a level of control of macromolecular architecture far higher than that afforded by conventional polymerization processes. An increasingly important approach to this problem has been to use biological systems for production of materials. Through protein engineering, artificial genes can be developed that encode protein-based materials with desired features. Structural elements found in nature, such as beta-sheets and alpha-helices, can be combined with great flexibility, and can be outfitted with functional elements such as cell binding sites or enzymatic domains. The possibility of incorporating non-natural amino acids increases the versatility of protein engineering still further. It is expected that such methods will have a large impact in the field of materials science, and especially in biomedical materials science, in the future.

18.
Interactions of lipids are central to the folding and stability of membrane proteins. Coarse-grained molecular dynamics simulations have been used to reveal the mechanisms of self-assembly of protein/membrane and protein/detergent complexes for representatives of two classes of membrane protein, namely, glycophorin (a simple alpha-helical bundle) and OmpA (a beta-barrel). The accuracy of the coarse-grained simulations is established via comparison with the equivalent atomistic simulations of self-assembly of protein/detergent micelles. The simulation of OmpA/bilayer self-assembly reveals how a folded outer membrane protein can be inserted in a bilayer. The glycophorin/bilayer simulation supports the two-state model of membrane folding, in which transmembrane helix insertion precedes dimer self-assembly within a bilayer. The simulations also suggest that a dynamic equilibrium exists between the glycophorin helix monomer and dimer within a bilayer. The simulated glycophorin helix dimer is remarkably close in structure to that revealed by NMR. Thus, coarse-grained methods may help to define mechanisms of membrane protein (re)folding and will prove suitable for simulation of larger scale dynamic rearrangements of biological membranes.

19.
NMR of membrane-associated peptides and proteins   Total citations: 1, self-citations: 0, citations by others: 1
In living cells, membrane proteins are essential to signal transduction, nutrient use, and energy exchange between the cell and environment. Due to challenges in protein expression, purification and crystallization, deposition of membrane protein structures in the Protein Data Bank lags far behind existing structures for soluble proteins. This review describes recent advances in solution NMR allowing the study of a select set of peripheral and integral membrane proteins. Surface-binding proteins discussed include amphitropic proteins, antimicrobial and anticancer peptides, the HIV-1 gp41 peptides, human alpha-synuclein and apolipoproteins. Also discussed are transmembrane proteins including bacterial outer membrane beta-barrel proteins and oligomeric alpha-helical proteins. These structural studies are possible due to solubilization of the proteins in membrane-mimetic constructs such as detergent micelles and bicelles. In addition to protein dynamics, protein-lipid interactions such as those between arginines and phosphatidylglycerols have been detected directly by NMR. These examples illustrate the unique role solution NMR spectroscopy plays in structural biology of membrane proteins.

20.
The efficient synthesis of small molecules having many molecular skeletons is an unsolved problem in diversity-oriented synthesis (DOS). We describe the development and application of a synthesis strategy that uses common reaction conditions to transform a collection of similar substrates into a collection of products having distinct molecular skeletons. The substrates have different appendages that pre-encode skeletal information, called sigma-elements. This approach is analogous to the natural process of protein folding in which different primary sequences of amino acids are transformed into macromolecules having distinct three-dimensional structures under common folding conditions. Like sigma-elements, the amino acid sequences pre-encode structural information. An advantage of using folding processes to generate skeletal diversity in DOS is that skeletal information can be pre-encoded into substrates in a combinatorial fashion, similar to the way protein structural information is pre-encoded combinatorially in polypeptide sequences, thus making it possible to generate skeletal diversity in an efficient manner. This efficiency was realized in the context of a fully encoded, split-pool synthesis of approximately 1260 compounds potentially representing all possible combinations of building block, stereochemical, and skeletal diversity elements.
