Similar literature
 Found 20 similar documents; search took 15 ms
1.
We propose a method for predicting RNA base pairing that imposes no restrictions on the order of base pairs, allows for pseudoknots, and runs in O(mN²) time for N base pairs and m iterations. It employs a self‐consistent mean field method in which all base pairs are possible, but with each iteration, the most energetically favored base pairs become more likely as long as they are consistent with their neighbors. Performance was compared against three other programs using three test sets. Sensitivity varied from 20% to 74% and specificity from 44% to 77%; in general, the method predicts too many base pairs, leading to good sensitivity and worse specificity. The predicted structures have excellent energies, suggesting that, algorithmically, the method performs well, but the classic literature energy models may not be appropriate when pseudoknots are permitted. Website and source code for the simulations are available at http://cardigan.zbh.uni‐hamburg.de/~rnascmf . © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010

2.
The literature contains over fifty years of accumulated methods proposed by researchers for predicting the secondary structures of proteins in silico. A large part of this collection comprises artificial neural network-based approaches, a field of artificial intelligence and machine learning that is gaining increasing popularity in various application areas. The primary objective of this paper is to bring together a summary of works that are important but scattered over time, giving new researchers a clear view of the domain in a single place. An informative introduction to protein secondary structure and artificial neural networks is also included for context. This review will be valuable in designing future methods to improve protein secondary structure prediction accuracy. The various neural network methods found in this problem domain employ varying architectures and feature spaces, and a handful stand out due to significant improvements in prediction. Neural networks with larger feature scope and higher architecture complexity have been found to produce better protein secondary structure prediction. The current prediction accuracy lies around the 84% mark, leaving much room for further improvement in the prediction of secondary structures in silico. The estimated limit of 88% prediction accuracy has not yet been reached, so further research is timely.

3.
A fast genetic algorithm for RNA secondary structure analysis   (total citations: 2; self-citations: 0; citations by others: 2)
A fast genetic algorithm, GArna, for mass calculations of RNA secondary structures through the Internet is proposed. GArna was used to study the effects of nucleotide composition on characteristics of the secondary structure of random RNA sequences. A contextual characteristic for evaluating stability was proposed, and the application of standard statistical tests to heterogeneous RNA samples was justified. The structure-contextual characteristics by which the 5'-untranslated regions of high- and low-expression genes of dicot plants and mammals differ were found, and the results were interpreted in terms of the influence of secondary structure on translation initiation and on the general scheme of expression regulation. The application of the results to the development of computer methods for RNA structural genomics, in particular for RNA search in genome sequences, is discussed.

4.
Consider the network of all secondary structures of a given RNA sequence, where nodes are connected when the corresponding structures have base pair distance one. The expected degree of the network is the average number of neighbors, where the average may be computed with respect to either the uniform or the Boltzmann probability. Here, we describe the first algorithm, RNAexpNumNbors, that can compute the expected number of neighbors, or expected network degree, of an input sequence. For RNA sequences from the Rfam database, the expected degree is significantly less than the degree of the constrained minimum free energy structure, defined to have minimum free energy (MFE) over all structures consistent with the Rfam consensus structure. The expected degree of structural RNAs, such as purine riboswitches, paradoxically appears to be smaller than that of random RNA, yet the difference between the degree of the MFE structure and the expected degree is larger than that of random RNA. Expected degree does not seem to correlate with standard structural diversity measures of RNA, such as positional entropy and ensemble defect. The program RNAexpNumNbors is written in C, runs in cubic time and quadratic space, and is publicly available at http://bioinformatics.bc.edu/clotelab/RNAexpNumNbors . © 2014 Wiley Periodicals, Inc.
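
The neighbor relation in this abstract rests on base pair distance, which is commonly defined as the number of base pairs present in exactly one of two structures. The sketch below illustrates that definition only; it is not the RNAexpNumNbors algorithm. It assumes structures are given in dot-bracket notation without pseudoknots, and the function names `parse_pairs` and `base_pair_distance` are hypothetical.

```python
def parse_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string.

    Assumes balanced parentheses and no pseudoknots (illustrative only).
    """
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)          # remember the opening position
        elif ch == ")":
            pairs.add((stack.pop(), j))  # close the most recent open position
    return pairs


def base_pair_distance(s, t):
    """Count base pairs that occur in exactly one of the two structures."""
    return len(parse_pairs(s) ^ parse_pairs(t))


# Two structures are neighbors in the network when this distance is one.
print(base_pair_distance("((..))", "(....)"))
```

Under this definition, the degree of a structure is the number of structures at distance one from it, that is, those reachable by adding or removing a single base pair.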

5.
The graphical representation of biological sequences is an important subject in the area of genome studies. We propose a novel visual representation for RNA secondary structures. Some symmetric properties and information on the base distribution and compositions can be intuitively reflected by the projection graphs of the points corresponding to the RNA secondary structures. Then our method is applied to compute the similarity of 12 classical samples and 11 real RNA secondary structures. The results indicate that our method not only analyzes the similarity between RNA secondary structures effectively but also agrees closely with results reported in the literature. Moreover, our method only needs the geometrical center of the characteristic curve of the RNA secondary structure to compute the similarity matrix, which means a low computational complexity. © 2011 Wiley Periodicals, Inc. Int J Quantum Chem, 2011

6.
A novel protocol for all‐atom RNA tertiary structure prediction is presented that uses restrained molecular mechanics and simulated annealing. The restraints are from secondary structure, covariation analysis, coaxial stacking predictions for helices in junctions, and, when available, cross‐linking data. Results are demonstrated on the Alu domain of the mammalian signal recognition particle RNA, the Saccharomyces cerevisiae phenylalanine tRNA, the hammerhead ribozyme, the hepatitis C virus internal ribosomal entry site, and the P4–P6 domain of the Tetrahymena thermophila group I intron. The predicted structure is selected from a pool of decoy structures with a score that maximizes radius of gyration and base–base contacts, which was empirically found to select higher quality decoys. This simple ab initio approach is sufficient to make good predictions of the structure of RNAs compared to current crystal structures using both root mean square deviation and the accuracy of base–base contacts. © 2011 Wiley Periodicals, Inc. J Comput Chem, 2011

7.
Predicting RNA secondary structure using evolutionary history can be carried out by using an alignment of related RNA sequences with conserved structure. Accurately determining evolutionary substitution rates for base pairs and single-stranded nucleotides is a concern for methods based on this type of approach. Determining these rates can be hard to do reliably without a large and accurate initial alignment, which ideally also has structural annotation. Hence, one must often apply rates extracted from other RNA families with trusted alignments and structures. Here, we investigate this problem by applying rates derived from tRNA and rRNA to the prediction of the much more rapidly evolving 5'-region of HIV-1. We find that the HIV-1 prediction is in agreement with experimental data, even though the relative evolutionary rate between A and G is significantly increased, both in stem and loop regions. In addition, we obtained an alignment of the 5' HIV-1 region that is more consistent with the structure than that currently in the database. We added randomized noise to the original values of the rates to investigate the stability of predictions to rate matrix deviations. We find that changes within a fairly large range still produce reliable predictions and conclude that using rates from a limited set of RNA sequences is valid over a broader range of sequences.

8.
The identification of RNA secondary structure has been an important tool for the characterization of nucleic acids. Computational structure prediction has been an effective approach toward this end, but improvement of established methods is often slow and reliant on redundant methodology. Here we present a novel consensus scoring approach, created to incorporate inputs from an array of established methods with the goal of producing outputs that contain mutual structures from these programs. This method is implemented in RNAdemocracy, a Python program capable of competing with existing methods. This ensemble approach was limited by commonalities in established methods, such as parameter sourcing, which may lead to agreement error, an unavoidable outcome given the limited number of available RNA structure datasets. The modular construction of RNAdemocracy allows for easy upgrading and customization to suit users' needs. RNAdemocracy, while capable of accurate predictions, is best suited to guiding users to regions of the sequence space that exhibit agreement rather than being relied on entirely as a predictor of structure. It is also capable of grading predictions for potential accuracy by providing a percentage of consensus between contributing methods in the final structure.
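
The abstract does not spell out how consensus is scored, so the following is only a minimal sketch of one plausible voting scheme, not RNAdemocracy's actual method. It assumes each contributing program returns a pseudoknot-free dot-bracket string of the same length; `parse_pairs`, `consensus_pairs`, and the `min_fraction` threshold are all hypothetical names introduced for illustration.

```python
from collections import Counter


def parse_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def consensus_pairs(predictions, min_fraction=0.5):
    """Map each base pair to the fraction of predictions that contain it.

    Only pairs supported by at least `min_fraction` of the predictions are kept.
    Conflicts (one base appearing in two retained pairs) are not resolved here.
    """
    votes = Counter()
    for dot_bracket in predictions:
        votes.update(parse_pairs(dot_bracket))
    n = len(predictions)
    return {pair: count / n for pair, count in votes.items()
            if count / n >= min_fraction}


predictions = ["((..))", "(....)", "(....)"]  # outputs of three hypothetical tools
print(consensus_pairs(predictions))
```

The per-pair fraction plays the role of a consensus percentage: pairs found by every method score 1.0, while pairs found by a single method fall below the threshold and are dropped.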

9.
Recently, we proposed a three‐dimensional cube representation of RNA secondary structure. An efficient method for mutation analysis has been proposed based on the introduced representation. Building on the proposed three‐dimensional cube representations, we introduce an extended binary coding method for RNA secondary structure alignment that converts structure alignment to sequence alignment. Using our method, the result of structure alignment can be obtained quickly. © 2010 Wiley Periodicals, Inc. Int J Quantum Chem, 2011

10.
RNA structure comparison is a fundamental problem in structural biology, structural chemistry, and bioinformatics. It can be used for analysis of RNA energy landscapes, conformational switches, and facilitating RNA structure prediction. The purpose of our integrated tool RNACluster is twofold: to provide a platform for computing and comparing different distances between RNA secondary structures, and to perform cluster identification to derive useful information about RNA structure ensembles, using a minimum spanning tree (MST) based clustering algorithm. RNACluster employs a cluster identification approach based on an MST representation of the RNA ensemble data and currently supports six distance measures between RNA secondary structures. RNACluster provides a user-friendly graphical interface to allow a user to compare different structural distances, analyze the structure ensembles, and visualize predicted structural clusters.
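
A common form of MST-based clustering builds the minimum spanning tree over a pairwise distance matrix and then cuts its longest edges, leaving each connected component as a cluster. The sketch below shows that generic technique under stated assumptions; it is not taken from RNACluster, whose exact procedure the abstract does not describe. The function name `mst_clusters`, the choice of Kruskal's algorithm, and the requirement that the number of clusters `k` be given in advance are all assumptions.

```python
def mst_clusters(dist, k):
    """Cluster items 0..n-1 into k groups by cutting the k-1 longest MST edges.

    `dist` is a symmetric n x n matrix (list of lists) of pairwise distances,
    for example distances between RNA secondary structures. Assumes 1 <= k <= n.
    """
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal's algorithm: scan edges by increasing weight and keep those
    # that join two different components. `mst` ends up sorted by weight.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    mst = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            mst.append((w, i, j))

    # Keep only the n-k lightest MST edges, i.e. drop the k-1 heaviest,
    # then rebuild the components from the edges that remain.
    parent = list(range(n))
    for _, i, j in mst[: n - k]:
        parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


dist = [
    [0, 1, 10, 10],
    [1, 0, 10, 10],
    [10, 10, 0, 1],
    [10, 10, 1, 0],
]
print(mst_clusters(dist, 2))
```

Because the tree has only n-1 edges, the cut step is cheap once the MST is built, and the same tree can be reused to explore several values of `k`.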

11.
A two‐dimensional graphical representation (2DGRR) of RNA secondary structures using a two-Cartesian-coordinate system has been derived for mathematical denotation of RNA structure. The 2DGRR resolves structure degeneracy and avoids loss of information and the limitation that different structures correspond to the same curve. RNA pseudo‐knots can also be represented as 2D graphical representations. © 2006 Wiley Periodicals, Inc. Int J Quantum Chem, 2006

12.
13.
In this article, we propose a relative similarity measure to compare RNA secondary structures. We first transform an RNA secondary structure into a special sequence representation. Then, on the basis of symbolic sequence complexity, we obtain the relative distance of RNA secondary structures. The examination of similarities/dissimilarities of a set of RNA secondary structures at the 3'-terminus of different viruses illustrates the utility of the approach.

14.
Nucleic acid secondary structure models usually exclude pseudoknots due to the difficulty of treating these nonnested structures efficiently in structure prediction and partition function algorithms. Here, the standard secondary structure energy model is extended to include the most physically relevant pseudoknots. We describe an O(N⁵) dynamic programming algorithm, where N is the length of the strand, for computing the partition function and minimum energy structure over this class of secondary structures. Hence, it is possible to determine the probability of sampling the lowest energy structure, or any other structure of particular interest. This capability motivates the use of the partition function for the design of DNA or RNA molecules for bioengineering applications.

15.
Accurate prediction of protein secondary structure is essential for accurate sequence alignment, three-dimensional structure modeling, and function prediction. The accuracy of ab initio secondary structure prediction from sequence, however, has only increased from around 77 to 80% over the past decade. Here, we developed a multistep neural-network algorithm by coupling secondary structure prediction with prediction of solvent accessibility and backbone torsion angles in an iterative manner. Our method, called SPINE X, was applied to a dataset of 2640 proteins (25% sequence identity cutoff) previously built for the first version of SPINE and achieved 82.0% accuracy (Q3) based on 10-fold cross validation. That SPINE X surpasses 81% accuracy is further confirmed on an independently built test dataset of 1833 protein chains, a recently built dataset of 1975 proteins, and 117 CASP 9 targets (critical assessment of structure prediction techniques), with accuracies of 81.3%, 82.3% and 81.8%, respectively. The prediction accuracy is further improved to 83.8% for the dataset of 2640 proteins if the DSSP assignment used above is replaced by a more consistent consensus secondary structure assignment method. Comparison to the popular PSIPRED and CASP-winning structure-prediction techniques is made. SPINE X predicts the number of helices and sheets correctly for 21.0% of 1833 proteins, compared to 17.6% by PSIPRED. It further shows that SPINE X consistently makes more accurate predictions in helical residues (6%) without overprediction, while PSIPRED makes more accurate predictions in coil residues (3-5%) and overpredicts them by 7%. The SPINE X server and its training/test datasets are available at http://sparks.informatics.iupui.edu/
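
The Q3 figure quoted above is the standard three-state accuracy: the percentage of residues whose predicted state (helix H, strand E, or coil C) matches the reference assignment. The sketch below computes it for a single pair of strings; the function name `q3_accuracy` and the example strings are hypothetical and do not come from SPINE X.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose three-state label (H, E, C) is predicted correctly.

    `predicted` and `observed` are equal-length strings over the alphabet H/E/C.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Five of the six residues agree in this made-up example.
print(round(q3_accuracy("HHHEEC", "HHHECC"), 1))
```

When reporting a dataset-level value, the per-residue counts are usually pooled across all chains before dividing, which weights long chains more heavily than averaging per-chain scores.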

16.
Deep learning methods for RNA secondary structure prediction have shown higher performance than traditional methods, but there is still much room to improve. It is known that the lengths of RNAs are very different, as are their secondary structures. However, the current deep learning methods all use length-independent models, so it is difficult for these models to learn very different secondary structures. Here, we propose a length-dependent model that is obtained by further training the length-independent model for different length ranges of RNAs through transfer learning. 2dRNA, a coupled deep learning neural network for RNA secondary structure prediction, is used to do this. Benchmarking shows that the length-dependent model performs better than the usual length-independent model.

17.
The protein disulfide bond is a covalent bond that forms during post-translational modification by the oxidation of a pair of cysteines. In proteins, the disulfide bond is the most frequent covalent link between amino acids after the peptide bond. It plays a significant role in three-dimensional (3D) ab initio protein structure prediction (aiPSP), stabilizing protein conformation, post-translational modification, and protein folding. In aiPSP, the location of disulfide bonds can strongly reduce the conformational search space by imposing geometrical constraints. Existing experimental techniques for the determination of disulfide bonds are time-consuming and expensive. Thus, developing sequence-based computational methods for disulfide bond prediction becomes indispensable. This study proposed a stacking-based machine learning approach for disulfide bond prediction (diSBPred). Various useful sequence and structure-based features are extracted for effective training, including conservation profile, residue solvent accessibility, torsion angle flexibility, disorder probability, a sequential distance between cysteines, and more. The prediction of disulfide bonds is carried out in two stages: first, individual cysteines are predicted as either bonding or non-bonding; second, the cysteine-pairs are predicted as either bonding or non-bonding by including the results from cysteine bonding prediction as a feature. A comparison of the features employed in this study with those used in the existing nearest neighbor algorithm (NNA) method shows that the features used in this study yield an improvement of about 7.39% in jackknife validation balanced accuracy. Moreover, for individual cysteine bonding prediction and cysteine-pair bonding prediction, diSBPred provides a 10-fold cross-validation balanced accuracy of 82.29% and 94.20%, respectively. Altogether, our predictor achieves an improvement of 43.25% based on balanced accuracy compared to the existing NNA based approach. Thus, diSBPred can be utilized to annotate the cysteine bonding residues of protein sequences whose structures are unknown as well as improve the accuracy of the aiPSP method, which can further aid in experimental studies of the disulfide bond and structure determination.
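
Balanced accuracy, the metric reported above, is the mean of sensitivity and specificity for a binary classifier, which makes it less sensitive to class imbalance than plain accuracy. The sketch below computes it from 0/1 labels; the function name `balanced_accuracy` and the example labels are hypothetical and unrelated to the diSBPred data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = bonding, 0 = non-bonding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(balanced_accuracy(y_true, y_pred))
```

In this made-up example sensitivity is 3/4 and specificity is 1/2, so the balanced accuracy is their mean.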

18.
We propose a 4-D representation of RNA secondary structures. The four-dimensional representation resolves structural degeneracy and avoids loss of information and the limitation that different structures correspond to the same plot set (or presentation). RNA pseudoknots can also be represented as four-dimensional representations. Based on this representation, we outline an approach to compute the similarities between six RNA secondary structures to illustrate the utility of our approach.

19.
According to the characterization of RNA secondary structures, the RNA secondary structures are transformed into elementary sequences, namely characteristic sequences of RNA secondary structures, by representing A, U, G, C in A-U/G-C pairs as A′, U′, G′, C′. Based on this representation, three recurrences for mapping RNA secondary structures into a 1-D graph, a 2-D graph, and a 3-D graph are given, respectively. Furthermore, a frequency-based method for RNA secondary structures is given in terms of the 1-D graph.

20.