Similar literature
1.
We have sequenced cDNA and genomic clones coding for phytochrome of the fern Selaginella. On the amino acid level, this phytochrome shares sequence homologies with phytochromes of higher plants which range between 62 (phytochrome B of Arabidopsis) and 55 (56)% [phytochrome C of Arabidopsis (Avena)]. Introns in the Selaginella gene are short and occupy positions known from phytochrome sequences of higher plants. A rooted phylogenetic tree based on mutation distances puts Selaginella phytochrome closest to the hypothetical ancestor. A similar tree arises if the tree is constructed with partial sequences (about 200 amino acids) around the chromophore attachment site. An extension of this tree by sequences of other cryptogamic plants (Mougeotia, Ceratodon, Psilotum) shows all these sequences including those of the phytochromes B and C of Arabidopsis on a branch, well separated from the branch formed by phytochromes known to accumulate in etiolated plants. The rooted phytochrome phylogenetic tree, however, is difficult to reconcile with the fossil record.

2.
A novel computer algorithm, FluClass, has been developed to facilitate the phylogenetic classification of influenza virus using mass spectral data. FluClass accepts a DNA or protein-based phylogenetic tree as input and generates theoretical peptide mass lists for each node. An experimental mass spectrum from an influenza virus protein digest is then placed onto the phylogenetic tree using a novel random resampling function (Z-score) that allows the spectrum to be scored against both internal and leaf nodes. Testing of the algorithm using hemagglutinin protein sequences from human-host influenza viruses showed that the Z-score performs comparably to the Profound scoring method for the scoring of leaf nodes and is substantially better at scoring internal nodes. Scoring of internal nodes allows the nodes of the phylogenetic tree to be colored, so that the classification of the query spectrum can be rapidly visualized. Finally, we demonstrate the utility of FluClass on experimental spectra from six strains. Given that mass spectrometry data can be generated rapidly for influenza virus proteins, FluClass provides a fast and direct method for phylogenetic analysis of influenza proteins.
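
The abstract above does not spell out how the resampling Z-score is computed, so the following is only a minimal sketch of one plausible reading: count the spectrum peaks matched by a node's theoretical mass list, then compare that count with the counts obtained from randomly drawn mass lists of the same size. The function names, the mass tolerance `tol`, and the `background` pool of masses are all assumptions made for illustration and are not taken from FluClass.

```python
import random
from statistics import mean, pstdev


def count_matches(spectrum, theoretical, tol=0.5):
    """Number of spectrum peaks lying within `tol` of any theoretical mass."""
    return sum(any(abs(peak - m) <= tol for m in theoretical) for peak in spectrum)


def resampling_z_score(spectrum, node_masses, background, n_draws=1000, tol=0.5, seed=0):
    """Z-score of the observed match count against a resampled null distribution.

    Assumption: the null is built by drawing random mass lists, of the same
    size as the node's list, from a pooled `background` list of masses.
    """
    rng = random.Random(seed)
    observed = count_matches(spectrum, node_masses, tol)
    null = [
        count_matches(spectrum, rng.sample(background, len(node_masses)), tol)
        for _ in range(n_draws)
    ]
    sigma = pstdev(null)
    if sigma == 0:
        return 0.0  # degenerate null: no basis for a score
    return (observed - mean(null)) / sigma


# Toy data (hypothetical masses, not real peptide masses).
background = [500.0 + 25 * i for i in range(100)]
spectrum = [525.0, 1000.0, 1500.0, 2000.0, 2750.0]
node_a = [525.0, 1000.0, 1500.0, 2000.0, 2750.0]   # matches every peak
node_b = [510.0, 1010.0, 1510.0, 2010.0, 2760.0]   # matches no peak

z_a = resampling_z_score(spectrum, node_a, background)
z_b = resampling_z_score(spectrum, node_b, background)
```

Because the score depends only on a mass list, the same function could in principle be applied to leaf nodes and internal nodes alike, which is the property the abstract emphasizes.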

3.
We propose an algorithm of global multiple sequence alignment that is based on a measure of what we call information discrepancy. The algorithm follows a progressive alignment iteration strategy that makes use of what we call a function of degree of disagreement (FDOD). MSAID begins with distance calculation of pairwise sequences, based on FDOD as a numerical scoring measure. In the next step, the resulting distance matrix is used to construct a guide tree via the neighbor-joining method. The tree is then used to produce a multiple alignment. Current alignment is next used to produce a new matrix and a new tree (with FDOD scoring measure again). This iterative process continues until convergence criteria (or a stopping rule) are satisfied. MSAID was tested and compared with other prior methods by using reference alignments from BAliBASE 2.01. For the alignments with no large N/C-terminal extensions or internal insertions MSAID received the top overall average in the tests. Moreover, the results of testing indicate that MSAID performs as well as other alignment methods with an occasional tendency to perform better than these prior techniques. We, therefore, believe that MSAID is a solid and reliable method of choice, which is often (if not always) superior to other global alignment techniques.

4.
Mutual information (MI) is an approach commonly used to estimate the evolutionary correlation of two amino acid sites. Although several MI methods exist, prior to our contribution no systematic method had been developed to assess their performance, or to establish numerical thresholds to detect co-evolving amino acid sites. The current study performed a Markov chain Monte Carlo (MCMC) algorithm on influenza viral sequences to capture their evolutionary characteristics. A consensus maximum clade credibility (MCC) tree was estimated from the samples, together with their amino acid substitution statistics, from which we generated synthetic sequences of known dependent and independent paired amino acid sites. A pair-to-pair and influenza-specific amino acid substitution matrix (P2PFLU) incorporated into Bayesian Evolutionary Analysis Sampling Trees (BEAST) enumerated these synthetic sequences. The sequences inherited evolutionary features and co-varying characteristics from the real viral sequences, rendering these synthetic data ideal for exploring their co-evolving features. For the MI measure, we proposed a novel metric called the empirical MI (MIEm), which outperformed other MI measures in analysis of receiver operating characteristics (ROC). We implemented our approach on 1086 all-time PB2 sequences of influenza A H5N1 viruses, in which we found 97 sites exhibiting co-evolutionary substitution of one or more amino acid sites. In particular, PB2 451, along with eight other PB2 sites of various MIEm scores, was found to co-evolve with PB2 627, a known species-associated amino acid residue which plays a critical role in influenza virus replication.
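
For readers unfamiliar with the baseline quantity, the sketch below computes plain Shannon mutual information between two columns of a multiple sequence alignment. It is not the MIEm metric proposed in the abstract above, whose definition is not given there; the function name and the toy columns are illustrative assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Shannon mutual information (in bits) between two alignment columns.

    Each column is a string with one residue per sequence, in the same
    sequence order. Frequencies are estimated by simple counting.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(c * n / (count_a[a] * count_b[b]))
    return mi


# Toy columns (hypothetical): site_j tracks site_i exactly, site_k does not.
site_i = "KKKKEEEE"
site_j = "TTTTAAAA"
site_k = "TATATATA"

coupled = mutual_information(site_i, site_j)      # 1 bit: fully dependent
independent = mutual_information(site_i, site_k)  # 0 bits: no dependence
```

Raw MI of this kind is sensitive to sample size and to shared ancestry, which is one reason a threshold calibrated on synthetic sequences with known dependent and independent site pairs, as described in the abstract, is useful.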

5.
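To enable fast and accurate structural modeling of the human adenovirus hexon protein family, this study developed a new rapid modeling method, based on a phylogenetic tree, for predicting the three-dimensional structures of a series of molecules within a protein family. First, a distance-based phylogenetic tree was built by the neighbor-joining method from the hexon sequences of seven strains of subgroup D human adenovirus. The information provided by the tree was used to determine the optimal progressive modeling path for the hexon family proteins, and the Modeler and Charmm programs were then used to model the family rapidly. Compared with the traditional method, the new method requires far less computation. Structure evaluation, together with a comparison against structures obtained by the traditional method, confirmed that the tree-based rapid modeling results are reliable. Rapid modeling of the human adenovirus hexon protein family is of great significance for fast, high-throughput epitope prediction and for the development of multivalent human adenovirus vaccines and adenovirus typing diagnostic reagents.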

6.
SSCP is a widespread method for mutation detection in biomedical research. Yet, its potential as a tool for population genetics is still not fully utilized. Based on mitochondrial DNA sequences of 96 specimens of the wood-boring beetle Pityogenes chalcographus we constructed a phylogenetic tree of European populations. This tree consisted of six broadly sympatric diverged lineages containing in total 34 haplotypes. Genetic regions of high mutational activity were determined and used for targeted SSCP primer development. In an SSCP mass screening of 427 individuals more than 80% could be assigned to a distinct clade, revealing the insect's genetic structure in Europe. It was demonstrated that analysis of known sequences allows the setup of a functional SSCP protocol within less than two weeks of working time and that phylogenetic data may be retrieved with high accuracy and significantly reduced costs compared to direct sequencing of PCR products.

7.
Plant proteome databases were mined for flavin monooxygenase (YUCCA), tryptophan decarboxylase (TDC), nitrilase (NIT), and aldehyde oxidase (AO) enzymes that could be involved in the tryptophan-dependent pathway of auxin biosynthesis. Phylogenetic trees were constructed for the enzyme sequences obtained. The YUCCA and TDC trees showed that these enzymes were conserved across the plant kingdom and therefore could be involved in auxin synthesis. YUCCAs branched into two clades. Most experimentally studied YUCCAs were found in the first clade. The second clade, which has representatives from seed plants only, contained Arabidopsis sequences linked to embryonic development. Therefore, sequences in this clade were suggested to have evolved with seed development. Examination of TDC activity and expression had previously linked this enzyme to the synthesis of secondary products. However, the phylogenetic finding of a conserved TDC clade across land plants suggested an essential role in plant growth. Phylogenetic analysis of AOs showed that plants inherited one AO. Recent gene duplication was suggested, as AO sequences from each species were more similar to each other than to AOs from other species. Taken together, and based on the experimental support for the involvement of AO in abscisic acid synthesis, AO was excluded as an intermediate in IAA production. The phylogenetic tree for NIT showed that the first clade contained sequences from species across the plant kingdom, whereas the second clade contained sequences from Brassicaceae only. Even though NIT4 orthologues were conserved in the second clade, their major role seems to be detoxification of hydrogen cyanide rather than production of IAA.

8.
This paper presents a novel method of mining biological data using a self-organizing map (SOM). After partitioning a set of protein sequences using SOM, conventional homology alignment is applied to each cluster to determine the conserved local motif (biological pattern) for the cluster. These local motifs are then regarded as rules for prediction and classification. In the application to the prediction of HIV protease cleavage sites in proteins, we found that the rules derived from this method are much more robust than those derived from the decision tree method.

9.
In this article, we introduce a new method to analyze avian influenza virus (AIV) of subtype H5N1 and study the similarity of its sequences. We compare nucleic acid sequences of H5N1 AIV from Asia using 2D and 3D graphical representations. From this comparison, we constructed a phylogenetic tree and discussed the evolutionary relationships among these viruses. The sequence analysis shows that there are some obvious traits that depend on area, period, and host. © 2009 Wiley Periodicals, Inc. Int J Quantum Chem, 2010

10.
The homology of peptide sequences selected from a 7mer phage display library with antibodies elicited by the multicelled parasite Taenia solium in cerebrospinal fluid and serum of neurocysticercosis (NCC) patients and by antibodies of uninfected control patients with similar neurological complications of other etiology (non-NCC) was analyzed using the PILEUP-Tudos sequence alignment program. The analysis generated dendrograms bearing two types of sequence clusters: those containing (1) only NCC patient-derived peptides and (2) both NCC and control non-NCC patient-derived peptides. By using ELISA, peptides that were selected by the antibodies were identified predominantly in the NCC-derived clusters. In repeated analyses in which sequences were added or removed, the first type of cluster maintained its structure, while the second type of cluster was split into many separate homology units dispersed throughout the guide tree. These results are interpreted as the ability of the analysis to segregate NCC-specific peptide sequences from other sequences. Altogether, this study demonstrates the high potential of the PILEUP-Tudos computer program to analyze phagotope collections recovered through biopanning with polyclonal antibodies elicited in patients by complex and as yet unknown multiple pathogenic antigens and to separate all phagotopes that are disease-relevant on the basis of the sequence homology.

11.
In this paper, we propose a method to create the 60-dimensional feature vector for protein sequences via the general form of pseudo amino acid composition. The construction of the feature vector is based on the contents of amino acids, total distance of each amino acid from the first amino acid in the protein sequence and the distribution of 20 amino acids. The obtained cosine distance metric (also called the similarity matrix) is used to construct the phylogenetic tree by the neighbour joining method. In order to show the applicability of our approach, we tested it on three proteins: 1) ND5 protein sequences from nine species, 2) ND6 protein sequences from eight species, and 3) 50 coronavirus spike proteins. The results are in agreement with known history and the output from the multiple sequence alignment program ClustalW, which is widely used. We have also compared our phylogenetic results with six other recently proposed alignment-free methods. These comparisons show that our proposed method gives a more consistent biological relationship than the others. In addition, the time complexity is linear and space required is less as compared with other alignment-free methods that use graphical representation. It should be noted that the multiple sequence alignment method has exponential time complexity.
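
The abstract above names the three ingredients of the 60-dimensional vector (content, total distance from the first residue, and distribution) but not their exact formulas. The sketch below is therefore an assumption-laden illustration: content is taken as relative frequency, total distance as the summed offset of each residue type from position 1, and distribution as the variance of its positions; the normalizations by sequence length are likewise assumptions chosen only to keep the components on a comparable scale.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def feature_vector(seq):
    """Return 60 numbers: 20 contents, 20 total distances, 20 position spreads."""
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    positions = {aa: [] for aa in AMINO_ACIDS}
    for i, aa in enumerate(seq.upper(), start=1):
        if aa in positions:  # non-standard symbols are skipped
            positions[aa].append(i)
    content, total_dist, spread = [], [], []
    for aa in AMINO_ACIDS:
        pos = positions[aa]
        k = len(pos)
        content.append(k / n)                           # relative frequency
        total_dist.append(sum(p - 1 for p in pos) / n)  # offsets from residue 1
        if k:
            mean = sum(pos) / k
            spread.append(sum((p - mean) ** 2 for p in pos) / (k * n * n))
        else:
            spread.append(0.0)
    return content + total_dist + spread


def cosine_distance(u, v):
    """One minus the cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


v = feature_vector("ACDA")  # toy sequence, not a real protein
```

Each sequence is scanned once, which is consistent with the linear time complexity claimed in the abstract. A pairwise matrix of `cosine_distance` values could then be passed to any neighbour-joining implementation.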

12.
In this paper we propose a matrix depiction and two new invariants of DNA sequences. This approach is illustrated on the primate mitochondrial DNA sequences for 11 different species and 80 different H5N1 avian influenza virus DNA sequences. We also construct the dendrogram tree for them. The phylogenies obtained are generally consistent with evolutionary trees constructed in previous studies.

13.
Tree shrews are more closely related to primates than rodents are in many respects. In addition, they possess several advantageous characteristics, including small body size, high brain-to-body mass ratio, low cost of feeding and maintenance, and a short reproductive cycle and life span, which make them promising novel laboratory animals to replace more precious larger primates. Testis-specific serine/threonine kinase (Tssk) plays important roles in spermatogenesis and/or the regulation of sperm function. However, studies on Tssk in tree shrews have not been reported yet. In the present study, the full-length sequences of five members of the Tssk family in tree shrews were cloned and their CDS region sequences were analyzed by basic bioinformatics. A phylogenetic tree and a prokaryotic protein expression system for the tree shrew Tssk genes were constructed. The mRNA expression of Tssk genes in 11 tissues/organs from tree shrews was studied. The results showed that: (1) the length of the CDS region of the tree shrew Tssk genes Tssk1B, Tssk2, Tssk3 (variant X1/X2), Tssk4 (variant X1/X2) and Tssk6 is 1080 bp, 1077 bp, 867/807 bp, 1014/984 bp and 822 bp, respectively, encoding 359, 358, 288/268, 337/327 and 273 amino acids, respectively; the cloned sequences of the Tssk genes have been submitted to GenBank with the following accession numbers: KX091161 (Tssk1B), KX091162 (Tssk2), KX091163 (Tssk3 variant X1)/KX091164 (Tssk3 variant X2), KX091165 (Tssk4 variant X1)/KX091166 (Tssk4 variant X2), KX091160 (Tssk6). (2) All tree shrew Tssk proteins are located in the cytoplasm, indicating that they are hydrophilic and non-secretory proteins, with multiple serine and/or threonine phosphorylation sites. In addition, they are all mixed proteins with similar tertiary structures sharing a highly conserved functional domain, S_TKc (Serine/Threonine protein kinases, catalytic domain). (3) The molecular phylogenetic tree of the five Tssk genes indicates that tree shrews are neither rodents nor primates, but are closely related to primates. (4) Recombinant proteins for all five Tssk members were successfully obtained using the constructed prokaryotic protein expression system. (5) The five Tssk genes are specifically expressed in the testis and/or sperm of tree shrews. Additionally, a small amount of Tssk1B was expressed in several tissues other than testis and sperm. Limited mRNA levels of Tssk2 and Tssk4 were expressed in the brain, while mRNA of Tssk3 or Tssk6 could only be detected in the testis and sperm. This study provides fundamental data on the reproductive biology of tree shrews, which paves the way for further study of the biological function of Tssk in this novel model animal.

14.
15.
A new three‐dimensional graphical representation of DNA sequences, the three‐unit semicircles (TUS)‐curve, which maps a given sequence into a dot sequence embedded in three‐unit semicircles, is proposed based on three biclassifications of nucleotides. The TUS‐curve has the merit of compactness and avoids degeneracy and loss of information. The geometrical center of the curve, which indicates the distribution of base frequencies of the corresponding DNA sequence, is extracted and applied to analyze the similarity of various species. A phylogenetic tree of 11 species based on the first exons of their β‐globin genes showed that the TUS‐curve is a powerful tool for obtaining valuable biological information. © 2011 Wiley Periodicals, Inc. Int J Quantum Chem, 2011

16.
He J, Fang G, Deng Q, Wang S. Analytica chimica acta, 2011, 704(1-2): 57-62
Classification and regression trees (CART) have the advantage of being able to handle large data sets and yield readily interpretable models. A conventional method of building a regression tree is recursive partitioning, which results in a good but not optimal tree. Ant colony system (ACS), a meta-heuristic algorithm derived from the observation of real ants, can be used to overcome this problem. The purpose of this study was to explore the use of CART and its combination with ACS for modeling the melting points of a large variety of chemical compounds. Genetic algorithm (GA) operators (e.g., crossover and mutation operators) were combined with the ACS algorithm to select the best solution model. In addition, at each terminal node of the resulting tree, variable selection was done by the ACS-GA algorithm to build an appropriate partial least squares (PLS) model. To test the ability of the resulting tree, a set of approximately 4173 structures and their melting points was used (3000 compounds as the training set and 1173 as the validation set). Further, an external test set consisting of 277 drugs was used to validate the prediction ability of the tree. Comparison of the results obtained from both trees showed that the tree constructed by the ACS-GA algorithm performs better than that produced by the recursive partitioning procedure.
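
To make concrete why recursive partitioning yields "a good but not optimal tree", the sketch below shows the greedy step it repeats at every node: pick the single threshold on one descriptor that minimizes the summed squared error of the two children. This is a generic illustration of the conventional procedure, not the ACS-GA method of the abstract; the function names and the toy descriptor and melting-point values are assumptions.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Greedy CART step: best threshold on one descriptor for a regression node.

    Returns (threshold, total_sse). The threshold is None when no split
    improves on leaving the node unsplit.
    """
    pairs = sorted(zip(x, y))
    best = (None, sse(list(y)))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot separate identical descriptor values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Toy data (hypothetical descriptor values and melting points).
descriptor = [1, 2, 3, 10, 11, 12]
melting_point = [50, 52, 51, 120, 118, 122]

threshold, error = best_split(descriptor, melting_point)
```

Each split is chosen without regard to the splits that follow it, so the final tree is only locally optimal; a global search such as the ACS-GA combination described above explores alternatives that this greedy step never considers.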

17.
We propose a 6D representation of protein sequences consisting of 20 amino acids. Based on this 6D representation, we propose a proteome distance measure and use the corresponding similarity matrix to construct a phylogenic tree. Examination of the phylogenic tree of 30 mitochondrial sequences illustrates the utility of our approach. © 2006 Wiley Periodicals, Inc. Int J Quantum Chem, 2007

18.
High‐affinity aptamers for important signal transduction proteins, i.e. Cdc42‐GTP, p21‐activated kinase1 (PAK1) and MRCK (myotonic dystrophy kinase‐related Cdc42‐binding kinase) α were successfully selected in the low micro‐ to nanomolar range using non‐systematic evolution of ligands by exponential enrichment (SELEX) with at least three orders of magnitude enhancement from their respective bulk affinity of naïve DNA library. In the non‐SELEX procedure, CE was used as a highly efficient affinity method to select aptamers for the desired molecular target through a process that involved repetitive steps of partitioning, known as non‐equilibrium CE of equilibrium mixtures with no PCR amplification between successive steps. Various non‐SELEX conditions including the type, concentration and pH of the run buffer were optimized. Other considerations such as salt composition of selection buffer, protein concentration and sample injection size were also studied for high stringency during selection. After identifying the best enriched aptamer pool, randomly selected clones from the aptamer pool were sequenced to obtain the individual DNA sequences. The dissociation constants (Kd) of these sequences were in the low micromolar to nanomolar range, indicating high affinity to the respective proteins. The best binders were also subjected to sequence alignment to generate a phylogenetic tree. No significant consensus region based on approximately 50 sequences for each protein was observed, suggesting the high efficiency of non‐SELEX for the selection of numerous unique sequences with high selectivity.

19.
A 3D graphical representation of DNA sequences, which has no circuit or degeneracy, is derived for the mathematical denotation of DNA sequences. Based on this graphical representation, we propose a new sequence distance measure. We make use of the corresponding similarity matrix to construct a phylogenic tree by virtue of fuzzy theory. Examination of the phylogenic tree of eight species illustrates the utility of our approach. © 2006 Wiley Periodicals, Inc. Int J Quantum Chem, 2006

20.