Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Unlike all-helical membrane proteins, β-barrel membrane proteins cannot yet be reliably discriminated from other proteins, especially from all-β soluble proteins. This paper analyzes the amino acid composition of the membrane parts of 12 β-barrel membrane proteins versus the β-strands of 79 all-β soluble proteins. The mean and variance of the amino acid composition in these two classes are calculated. Amino acids most likely to be associated with the classification, such as Gly, Asn, and Val, are selected based on Fisher's discriminant ratio. A linear classifier built on the composition of these selected amino acids in observed β-strands achieves 100% classification accuracy for the 12 membrane proteins and 79 soluble proteins in a four-fold cross-validation experiment. Since the accuracy of secondary structure prediction is now quite high, a promising method for identifying β-barrel membrane proteins is presented, based on the linear classifier coupled with predicted secondary structure. Applied to 241 β-barrel membrane proteins and 3855 soluble proteins with various structures, the method achieves 85.48% sensitivity (206/241) and 92.53% specificity (3567/3855).
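
The feature-selection step described in this abstract can be illustrated with a minimal sketch. It assumes the common single-feature form of Fisher's discriminant ratio, (m1 - m2)^2 / (v1 + v2); the abstract does not state the exact formula, and the function names and toy sequences used here are hypothetical, not the authors' data or code.

```python
from statistics import mean, pvariance

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the fraction of each of the 20 standard amino acids in seq."""
    seq = seq.upper()
    total = sum(seq.count(a) for a in AMINO_ACIDS)
    return {a: (seq.count(a) / total if total else 0.0) for a in AMINO_ACIDS}


def fisher_ratio(values_a, values_b):
    """Fisher's discriminant ratio for one feature: (m1 - m2)^2 / (v1 + v2).

    Assumption: population variances are used, and a feature with zero
    variance in both classes is scored 0.0 to avoid division by zero.
    """
    m1, m2 = mean(values_a), mean(values_b)
    denom = pvariance(values_a) + pvariance(values_b)
    return (m1 - m2) ** 2 / denom if denom > 0 else 0.0


def rank_residues(class_a_seqs, class_b_seqs):
    """Rank amino acids by how well their composition separates two classes."""
    comps_a = [composition(s) for s in class_a_seqs]
    comps_b = [composition(s) for s in class_b_seqs]
    scores = {
        a: fisher_ratio([c[a] for c in comps_a], [c[a] for c in comps_b])
        for a in AMINO_ACIDS
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Residues at the top of the ranking would be the candidates to keep as inputs to a linear classifier; how many to keep is a separate choice that this sketch does not make.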

2.
A novel method is developed to model and predict the transmembrane regions of beta-barrel membrane proteins. It is based on a hidden Markov model (HMM) with an architecture that follows those proteins' construction principles. The HMM is trained and tested with a jack-knife procedure on a non-redundant set of 11 beta-barrel membrane proteins known to date at atomic resolution. As a result, the method correctly locates 97% of 172 transmembrane beta-strands. Out of the 11 proteins, the barrel size is correctly predicted for ten proteins and the overall topology for seven. Additionally, the method successfully assigns the entire topology for two new beta-barrel membrane proteins that have no significant sequence homology to the 11 proteins. The predicted topology of two candidates for beta-barrel structure in the outer mitochondrial membrane is also presented in the paper.

3.
Discriminating outer membrane proteins from other folding types of globular and membrane proteins is an important problem both for detecting outer membrane proteins in genomic sequences and for the successful prediction of their secondary and tertiary structures. In this work, we have systematically analyzed the distribution of amino acid residues in the sequences of globular and outer membrane proteins. We observed that the occurrence of two neighboring aliphatic and polar residues is significantly higher in outer membrane proteins than in globular proteins. From the information about the dipeptide composition we have devised a statistical method for discriminating outer membrane proteins from other globular and membrane proteins. Our approach correctly picked up the outer membrane proteins with an accuracy of 95% for the training set of 337 proteins. On the other hand, our method correctly excluded the globular proteins with an accuracy of 79% in a non-redundant dataset of 674 proteins. Furthermore, the present method is able to correctly exclude alpha-helical membrane proteins with an accuracy of up to 87%. These accuracy levels are comparable to those of other methods in the literature. The influence of protein size and structural class on discrimination is discussed.
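
The dipeptide composition this abstract relies on can be computed as in the sketch below. It only shows how the 400 overlapping-pair frequencies might be obtained; the statistical discrimination rule itself is not given in the abstract and is not reproduced here. The function name is hypothetical, and skipping pairs that contain non-standard residues is an assumption.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_composition(seq):
    """Return the frequency of each of the 400 dipeptides in seq.

    Overlapping neighboring pairs are counted; pairs containing a
    non-standard residue are skipped (an assumption of this sketch).
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid)
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }
```

For the sequence "AVAV" the overlapping pairs are AV, VA, and AV, so the AV frequency is 2/3 and the VA frequency is 1/3.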

4.
Beta-barrel membrane proteins perform a variety of functions, such as mediating non-specific, passive transport of ions and small molecules and selectively passing molecules such as maltose and sucrose; they are also involved in voltage-dependent anion channels. Understanding the structural features of beta-barrel membrane proteins and detecting them in genomic sequences are challenging tasks in structural and functional genomics. In this review, based on a survey of experimentally known amino acid sequences and structures, the characteristic features of amino acid residues in beta-barrel membrane proteins and novel parameters for understanding their folding and stability are described. The development of statistical methods and machine learning techniques for discriminating beta-barrel membrane proteins from other folding types of globular and membrane proteins is explained, along with their relative importance. Further, different methods for predicting the membrane-spanning segments of beta-barrel membrane proteins are discussed, including hydrophobicity profiles, rule-based approaches, amino acid properties, neural networks, and hidden Markov models. In addition, the application of discrimination techniques to detecting beta-barrel membrane proteins in genomic sequences is outlined. In essence, this comprehensive review provides an overall picture of beta-barrel membrane proteins, from the construction of datasets to genome-wide applications.

5.
Prediction of transmembrane beta-strands in outer membrane proteins (OMP) is one of the important problems in computational chemistry and biology. In this work, we propose a method based on neural networks for identifying the membrane-spanning beta-strands. We introduce the concept of "residue probability" for assigning residues to transmembrane beta-strand segments. The performance of our method is evaluated with single-residue accuracy, correlation, specificity, and sensitivity. Our predicted segments show good agreement with experimental observations, with an accuracy level of 73% obtained solely from amino acid sequence information. Further, the predictive power of N- and C-terminal residues in each segment, the number of segments in each protein, and the influence of the cutoff probability for identifying membrane-spanning beta-strands are discussed. We have developed a Web server for predicting the transmembrane beta-strands from the amino acid sequence, and the prediction results are available at http://psfs.cbrc.jp/tmbeta-net/.
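
How a cutoff probability turns per-residue scores into predicted segments can be sketched as follows. This is an illustration only: the per-residue probabilities would come from the neural network, which is not shown, and the function name, the default cutoff of 0.5, and the minimum segment length of 3 are assumptions rather than values from the abstract.

```python
def segments_from_probabilities(probs, cutoff=0.5, min_length=3):
    """Return (start, end) index pairs, end exclusive, for runs of residues
    whose probability is at least `cutoff` and that span at least
    `min_length` residues."""
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= cutoff and start is None:
            start = i  # a new candidate segment begins here
        elif p < cutoff and start is not None:
            if i - start >= min_length:
                segments.append((start, i))
            start = None
    # Close a segment that runs to the end of the sequence.
    if start is not None and len(probs) - start >= min_length:
        segments.append((start, len(probs)))
    return segments
```

Raising the cutoff shortens or removes segments and lowering it merges or extends them, which is why the abstract treats the cutoff as a parameter whose influence needs to be examined.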

6.
Protein structures can be broadly classified into four classes according to their chain fold topologies: all-alpha, all-beta, alpha/beta, and alpha + beta. For predicting the protein structural class, a new algorithm is presented in which the increment of diversity is combined with quadratic discriminant analysis. On the basis of the concept of the pseudo amino acid composition (Chou, Proteins: Struct Funct Genet 2001, 43, 246; Erratum: Proteins Struct Funct Genet 2001, 44, 60), 400 dipeptide components and the 20 amino acid compositions are each selected as parameters of the diversity source. A total of 204 nonhomologous proteins constructed by Chou (Chou, Biochem Biophys Res Commun 1999, 264, 216) are used for training and testing the predictive model. The results obtained with the pseudo amino acid approach proposed in this paper show remarkably improved success rates, and hence the current method may play a complementary role to other existing methods for predicting protein structural class.

7.
A new method is presented for the identification of beta-barrel membrane proteins. It is based on a hidden Markov model (HMM) with an architecture that follows these proteins' construction principles. Once the HMM is trained, the log-odds score relative to a null model is used to discriminate beta-barrel membrane proteins from other proteins. The method achieves false positive and false negative rates of only 10% in a six-fold cross-validation procedure. The results compare favorably with existing methods. The method is proposed as a valuable tool for quickly scanning the proteomes of entirely sequenced organisms for beta-barrel membrane proteins.
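
The log-odds scoring idea can be sketched with a toy model. The code below computes log P(sequence | HMM) with the scaled forward algorithm and subtracts the log-likelihood under an independent-residue null model. Everything about the model is hypothetical: the two-letter alphabet (H for hydrophobic, P for polar), the two states, and all probabilities are illustrative assumptions and do not reproduce the architecture described in the abstract.

```python
import math


def forward_log_likelihood(seq, states, start_p, trans_p, emit_p):
    """Log P(seq | HMM) via the scaled forward algorithm.

    Assumes seq is non-empty and every symbol has non-zero probability
    under at least one reachable state.
    """
    alpha = {s: start_p[s] * emit_p[s][seq[0]] for s in states}
    log_like = 0.0
    for t in range(len(seq)):
        if t > 0:
            alpha = {
                s: emit_p[s][seq[t]] * sum(alpha[r] * trans_p[r][s] for r in states)
                for s in states
            }
        scale = sum(alpha.values())  # rescale to avoid numerical underflow
        log_like += math.log(scale)
        alpha = {s: a / scale for s, a in alpha.items()}
    return log_like


def null_log_likelihood(seq, background):
    """Log-likelihood under a null model of independent residues."""
    return sum(math.log(background[x]) for x in seq)


def log_odds(seq, states, start_p, trans_p, emit_p, background):
    """Log-odds score of the HMM relative to the null model."""
    hmm = forward_log_likelihood(seq, states, start_p, trans_p, emit_p)
    return hmm - null_log_likelihood(seq, background)


# Hypothetical toy model: two states that prefer to alternate, loosely
# mimicking the alternating residue pattern of a membrane-facing strand.
TOY_STATES = ["a", "b"]
TOY_START = {"a": 0.5, "b": 0.5}
TOY_TRANS = {"a": {"a": 0.1, "b": 0.9}, "b": {"a": 0.9, "b": 0.1}}
TOY_EMIT = {"a": {"H": 0.9, "P": 0.1}, "b": {"H": 0.1, "P": 0.9}}
TOY_BACKGROUND = {"H": 0.5, "P": 0.5}
```

With this toy model an alternating sequence such as "HPHPHP" receives a higher log-odds score than "HHHHHH". A classifier built on the score would call a sequence positive when it exceeds a threshold chosen on training data; the threshold is not part of this sketch.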

8.
Membrane transporters catalyze the active transport of molecules across biological barriers such as lipid bilayer membranes. Currently, the experimental annotation of which proteins transport which substrates is far from complete and will likely remain so for a long time. Therefore, it is highly desirable to develop computational methods that may aid in the substrate annotation of putative membrane transport proteins. Here, we measured the similarity of membrane transporters from Arabidopsis thaliana by their amino acid composition, higher sequence order information, amino acid characteristics, or sequence conservation. We considered four substrate classes: amino acids, oligopeptides, phosphates, and hexoses. Substrate classification based on the amino acid frequency yielded an accuracy of 75% or higher. Integrating additional information improved the prediction performance to 90% and higher.

9.
Electromembrane extraction (EME) proved to be a simple and rapid pretreatment method for the analysis of amino acids and related compounds in body fluid samples. Body fluids were acidified to a final concentration of 2.5 M acetic acid and served as donor solutions. Amino acids, present as cations in the donor solutions, migrated through a supported liquid membrane (SLM) composed of 1-ethyl-2-nitrobenzene/bis-(2-ethylhexyl)phosphonic acid (85:15 (v/v)) into the lumen of a porous polypropylene hollow fiber (HF) on application of an electric field. The HF was filled with 2.5 M acetic acid serving as the acceptor solution. Matrix components in body fluids were efficiently retained on the SLM and did not interfere with subsequent analysis. Capillary electrophoresis with capacitively coupled contactless conductivity detection was used for the determination of 17 underivatized amino acids in a background electrolyte solution consisting of 2.5 M acetic acid. Parameters of EME, such as the composition of the SLM, the pH and composition of the donor and acceptor solutions, agitation speed, extraction voltage, and extraction time, were studied in detail. At optimized conditions, the repeatability of migration times and peak areas of the 17 amino acids was better than 0.3% and 13%, respectively; calibration curves were linear over a range of two orders of magnitude (r² = 0.9968-0.9993), and limits of detection ranged from 0.15 to 10 μM. Endogenous concentrations of 12 amino acids were determined in EME-treated human serum, plasma, and whole blood. The method was also suitable for simple and rapid pretreatment and determination of elevated concentrations of selected amino acids, which are markers of severe inborn metabolic disorders.

10.
Biomolecular surface engineering of materials often requires precise, versatile, and efficient quantification of immobilized proteins at solid surfaces. Acidic hydrolysis of surface-bound proteins and subsequent HPLC analysis of fluorescence-derivatized amino acids were adapted and critically evaluated for that purpose. Contamination and concentration-dependent amino acid retrieval during HPLC were found to influence the accuracy of the method. In addition to choosing adequate conditions for hydrolysis, derivatization, and chromatographic separation, extensions of the data evaluation were suggested to improve the accuracy of the approach when applied to single-protein systems: comparing the experimentally obtained amino acid ratio to the known composition of the protein made it possible to identify the properly separated and detected amino acids. Those amino acids were selected for a more precise calculation of the amount of immobilized protein. To further increase the accuracy of the method, the retrieval of amino acids corresponding to protein amounts in the range between 0.5 and 4.0 μg was analyzed for a variety of proteins of interest to derive protein-specific correction factors. The evaluation of amino acid data was furthermore applied to quantify binary protein mixtures under similar settings. The method proved useful for determining the composition of protein mixtures throughout a wide range of absolute and relative concentrations.

11.
Motivation: Primary and secondary active transport are two types of active transport that use energy to move substances. Active transport mechanisms use proteins to assist in transport and play essential roles in regulating the traffic of ions or small molecules across a cell membrane against the concentration gradient. In this study, the two main types of proteins involved in such transport are classified from transmembrane transport proteins. We propose a Support Vector Machine (SVM) with contextualized word embeddings from Bidirectional Encoder Representations from Transformers (BERT) to represent protein sequences. BERT is a deep learning language representation model developed by Google; it is a powerful model for transfer learning and one of the highest-performing pre-trained models for Natural Language Processing (NLP) tasks. The idea of transfer learning with a pre-trained BERT model is applied to extract fixed feature vectors from the hidden layers and to learn contextual relations between amino acids in the protein sequence. The contextualized word representations of proteins are thus introduced to effectively model complex structures of amino acids in the sequence and the variations of these amino acids in context. By generating context information, we capture multiple meanings for the same amino acid to reveal the importance of specific residues in the protein sequence. Results: The performance of the proposed method is evaluated using five-fold cross-validation and an independent test. The proposed method achieves an accuracy of 85.44%, 88.74%, and 92.84% for Class-1, Class-2, and Class-3, respectively. Experimental results show that this approach can outperform other feature extraction methods by using context information, effectively classify the two types of active transport, and improve the overall performance.

12.
The prediction of protein unfolding rates from amino acid sequences is one of the most important challenges in computational biology and chemistry. Analysis of the relationship between protein unfolding rates and the physical-chemical, energetic, and conformational properties of amino acid residues provides valuable information for understanding and predicting the unfolding rates of two- and three-state proteins. We found that the classification of proteins into different structural classes shows an excellent correlation between amino acid properties and unfolding rates of two- and three-state proteins, indicating the importance of native-state topology in determining protein unfolding rates. We have formulated three independent linear regression equations, one for each structural class, for predicting unfolding rates from amino acid sequences and obtained excellent agreement between predicted and experimentally observed unfolding rates; the correlation coefficients are 0.999, 0.990, and 0.992, respectively, for all-alpha, all-beta, and mixed-class proteins. Further, we have derived a general equation applicable to all structural classes of proteins, which can be used for predicting the unfolding rates of proteins of unknown structural class. We observed correlations of 0.987 and 0.930, respectively, for back-check and jack-knife tests. These accuracy levels are better than those of other methods in the literature.

13.
Understanding the relationship between amino acid sequences and folding rates of proteins is an important task in computational and molecular biology. In this work, we have systematically analyzed the composition of amino acid residues for proteins with different ranges of folding rates. We observed that the polar residues Asn, Gln, Ser, and Lys are dominant in fast-folding proteins, whereas the hydrophobic residues Ala, Cys, Gly, and Leu prefer to be in slow-folding proteins. Further, we have developed a method based on quadratic response surface models for predicting the folding rates of 77 two- and three-state proteins. Our method showed a correlation of 0.90 between experimental and predicted protein folding rates using the leave-one-out cross-validation method. Classifying the proteins by structural class improved the correlation to 0.98; the correlations are 0.99, 0.98, and 0.96, respectively, for all-alpha, all-beta, and mixed-class proteins. In addition, we have utilized Bayesian classification theory for discriminating two- and three-state proteins, which showed an accuracy of 90%. We have developed a web server for predicting protein folding rates, available at http://bioinformatics.myweb.hinet.net/foldrate.htm.

14.
Transmembrane beta-barrel (TMB) proteins play pivotal roles in many aspects of bacterial function. This paper presents a k-nearest neighbor (k-NN) method for discriminating TMB and non-TMB proteins. We start with a method that makes predictions based on a distance computed from residue composition and gradually improve the prediction performance by including homologous sequences and by searching for a set of residues and dipeptides for calculating the distance. The final method achieves an accuracy of 97.1%, with an MCC of 0.876, 86.4% sensitivity, and 98.8% specificity. A web server based on the proposed method is available at http://yanbioinformatics.cs.usu.edu:8080/TMBKNNsubmit.
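
The starting point described in this abstract, a nearest-neighbor vote on a distance computed from residue composition, might look like the sketch below. The abstract does not name the distance, so Euclidean distance is an assumption, as are the function names, the choice k=3, and the toy training sequences. The later refinements (homologous sequences, selected residues and dipeptides) are not shown.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return a 20-element list of amino acid fractions for seq."""
    seq = seq.upper()
    total = sum(seq.count(a) for a in AMINO_ACIDS)
    return [seq.count(a) / total if total else 0.0 for a in AMINO_ACIDS]


def distance(u, v):
    """Euclidean distance between two composition vectors (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def knn_predict(query_seq, labeled_seqs, k=3):
    """Predict a label by majority vote among the k nearest training sequences.

    labeled_seqs is a list of (sequence, label) pairs.
    """
    q = composition(query_seq)
    neighbors = sorted(
        labeled_seqs, key=lambda item: distance(q, composition(item[0]))
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy training set, for illustration only.
TOY_TRAINING = [
    ("GGGGVV", "TMB"),
    ("GGGVVV", "TMB"),
    ("KKKEEE", "non-TMB"),
    ("KKEEEE", "non-TMB"),
]
```

An odd k avoids tied votes in a two-class problem; with real data k would be tuned by cross-validation rather than fixed in advance.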

15.
Prediction of protein folding rates from amino acid sequences is one of the most important challenges in molecular biology. In this work, I have related protein folding rates to the physical-chemical, energetic, and conformational properties of amino acid residues. I found that the classification of proteins into different structural classes shows an excellent correlation between amino acid properties and folding rates of two- and three-state proteins, indicating the importance of native-state topology in determining protein folding rates. I have formulated a simple linear regression model for predicting protein folding rates from amino acid sequences along with structural class information and obtained excellent agreement between predicted and experimentally observed folding rates; the correlation coefficients are 0.99, 0.96, and 0.95, respectively, for all-alpha, all-beta, and mixed-class proteins. This is the first available method capable of predicting protein folding rates from the amino acid sequence alone, with the aid of generic amino acid properties and structural class information.

16.
This paper focuses on predicting the dimensionless retention time (DRT) of proteins in hydrophobic interaction chromatography (HIC) by means of mathematical models based essentially only on amino acid composition. The results show that such prediction is indeed possible. Our main contribution was the design of models that predict the DRT using the minimal information concerning a protein: its amino acid composition. The performance is similar to that observed in models that use much more sophisticated information, such as the three-dimensional structure of proteins. Three models that, in addition to the amino acid composition, use different assumptions about the amino acids' tendency to be exposed to the solvent were evaluated on 12 proteins with known experimental DRT. In all the cases analyzed, the model that obtained the best results was the one based on a linear estimation of the amino acid surface composition. The models were adjusted using a collection of 74 vectors of amino acid properties plus a set of 6388 vectors derived from these using two mathematical tools: the k-means and self-organizing map (SOM) algorithms. The best vector was generated by the SOM algorithm and was interpreted as a hydrophobicity scale based partly on the tendency of the amino acids to be hidden in proteins. The prediction error (MSE(JK)) obtained by this model was almost 35% smaller than that obtained by the model that supposes that all the amino acids are completely exposed, and 40% smaller than that obtained by the model that uses a simple correction factor considering the general tendency of each amino acid to be exposed to the solvent. In fact, the performance of the best model based on the amino acid composition was 5% better than that observed in the model based on the three-dimensional structure of proteins.

17.
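Separation and extraction of amino acids by a strip dispersion hybrid liquid membrane   Total citations: 1 (self-citations: 0, citations by others: 1)
A strip dispersion hybrid liquid membrane system of di(2-ethylhexyl) phosphate (D2EHPA)-kerosene-HCl was established for the separation and extraction of methionine, leucine, phenylalanine, and tryptophan. The effects of the feed-phase pH, the concentration of the carrier D2EHPA, the volume ratio of the membrane phase to the stripping phase, the composition of the stripping phase, the flow rates of the feed phase and the strip dispersion phase, the transport time, and the number of times the support membrane was reused on the permeability coefficient and transport efficiency of the amino acids were investigated. Under optimized conditions, the system achieved transport efficiencies above 35% for all four amino acids; those of tryptophan and leucine exceeded 79%, and the transport efficiency followed the order Et,Trp > Et,Leu > Et,Phe > Et,Met. After the support membrane had been reused 25 times, the transport efficiency for the amino acids showed no obvious change. The liquid membrane system showed high transport efficiency and excellent transport selectivity for the amino acids studied and is a simple and environmentally friendly separation technique.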

18.
19.
PreSSAPro is software, available to the scientific community as a free web service, designed to predict secondary structures starting from the amino acid sequence of a given protein. Predictions are based on our recently published work on amino acid propensities for secondary structures, evaluated both in large but non-homogeneous protein data sets and in smaller but homogeneous data sets corresponding to protein structural classes, i.e. all-alpha, all-beta, or alpha–beta proteins. Predictions are improved by using propensities evaluated for the correct protein class. PreSSAPro predicts the secondary structure according to the correct protein class, if known, or gives multiple predictions with reference to the different structural classes. Comparing these predictions offers a novel tool for evaluating which sequence regions can assume different secondary structures depending on the structural class assignment, with a view to identifying proteins able to fold into different conformations. The service is available at the URL http://bioinformatica.isa.cnr.it/PRESSAPRO/.

20.
Determining the amino acid content of a protein involves the hydrolysis of that protein, usually in acid, until the protein-bound amino acids are released and made available for detection. Both the variability in the ease of peptide bond cleavage and differences in the acid stability of certain amino acids can significantly affect the determination of a protein's amino acid content. By using multiple hydrolysis intervals, a greater degree of accuracy can be obtained in amino acid analysis. Correction factors derived by linear extrapolation of serial hydrolysis data are currently used. Compartmental modeling of the simultaneous hydrolysis (yield) and degradation (decay) of amino acids by nonlinear multiple regression of serial hydrolysis data has also been validated and applied to determine the amino acid composition of various biological samples, including egg-white lysozyme, human milk protein, and hair. Implicit in the routine application of serial hydrolysis in amino acid analysis, however, is an understanding that correction factors, derived either linearly or through the more accurate nonlinear multiple regression approach, need to be determined for individual proteins rather than applied uniformly across all protein types.
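
The linear-extrapolation correction mentioned in this abstract can be sketched for a single amino acid whose measured yield declines with hydrolysis time. The code fits a least-squares line to yield versus time, extrapolates to time zero, and expresses the result as a multiplicative correction for a chosen routine hydrolysis time. The function names and the numbers in the usage note are hypothetical, and the nonlinear compartmental model described in the abstract is not shown.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return slope, intercept


def correction_factor(times, values, reference_time):
    """Ratio of the yield extrapolated to time zero to the fitted yield at
    `reference_time`; multiply a routine measurement by this factor."""
    slope, intercept = fit_line(times, values)
    fitted_at_reference = slope * reference_time + intercept
    return intercept / fitted_at_reference
```

As a usage example with made-up values, yields of 95, 90, and 85 (arbitrary units) after 24, 48, and 72 h of hydrolysis extrapolate to 100 at time zero, giving a correction factor of 100/95 for a 24 h hydrolysis. Consistent with the abstract's closing point, such a factor would apply only to the protein it was measured on.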
