Similar literature
Found 20 similar articles; search took 296 ms
1.
A novel approach to the problem of protein alignment is described, which is noticeably more efficient than existing approaches. The approach is based on superposition of the amino acid adjacency matrices of a pair of proteins, which have been modified to record the sequential order of amino acids. As a result, one obtains simultaneously all segments of the two proteins that are shifted relative to one another by one or more positions in either direction, without the need for a prior exhaustive search for an alignment that includes unproductive directions and unknown displacement steps.
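The idea of obtaining all shifted matching segments in one pass can be illustrated with a simple offset table. This is only a minimal sketch and not the authors' adjacency-matrix construction: it assumes that a match means residue identity, and the function name `matches_by_shift` and the toy sequences are hypothetical.

```python
from collections import defaultdict

def matches_by_shift(seq_a, seq_b):
    """Group identical residues of two sequences by their relative shift.

    A key d = j - i means that position i of seq_a matches position j of seq_b.
    Illustrative only; this is not the adjacency-matrix method of the abstract.
    """
    # Index the positions of every residue type in seq_b once.
    positions_b = defaultdict(list)
    for j, res in enumerate(seq_b):
        positions_b[res].append(j)

    # Each identical pair is filed under its displacement j - i.
    by_shift = defaultdict(list)
    for i, res in enumerate(seq_a):
        for j in positions_b.get(res, []):
            by_shift[j - i].append((i, j))
    return dict(by_shift)

# Toy sequences (made up for illustration).
shifts = matches_by_shift("MKTAYIAK", "KTAYLAKM")
```

In this toy case the shift `-1` collects six matching pairs, so the two sequences align best when one is displaced by a single position.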

2.
We illustrate an exact solution of the protein alignment problem using the algorithm VESPA (very efficient search for protein alignment). We compare our result with the approximate solution obtained with BLAST (basic local alignment search tool), currently the most widely used software for protein alignment searches. Human and mouse proteins of around 170 amino acids were selected for comparison. The exact solution found 78 pairs of amino acids, to which one should add 17 individual amino acid alignments, giving a total of 95 aligned amino acids. BLAST identified 64 aligned amino acids in groups of more than two adjacent amino acids. However, the difference between the two outputs is not as large as it may appear, because a number of adjacent amino acids were reported by BLAST as single amino acids. If one counts all amino acids, whether isolated (single) or in a group of two or more, the count is 89 for BLAST and 95 for VESPA, a difference of only six. © 2015 Wiley Periodicals, Inc.

3.
The program ALIGN_MTX aligns two textual sequences, allowing sequence elements to be designated by strings of several characters and supporting arbitrary user-defined substitution matrices. It can be used not only to align amino acid and nucleotide sequences but also for sequence-structure alignment in threading, for amino acid sequence alignment with a previously computed PSSM, and in other cases where biological or non-biological textual sequences must be aligned. This distinguishes it from most similar alignment programs, which as a rule align only amino acid or nucleotide sequences represented as strings of single alphabetic characters. ALIGN_MTX is available as a downloadable zip archive at http://www.imbbp.org/software/ALIGN_MTX/ and is free to use. As an application of the program, different types of substitution matrix are compared for alignment quality on sets of distantly related protein pairs. Alignment accuracy was tested with the threading matrix SORDIS, which is based on side-chain orientation relative to hydrophobic core centers; the evolutionary-change-based substitution matrix BLOSUM; and position-specific score matrices (PSSM), which use multiple sequence alignment information. PSSM performs best overall, but on the reduced set with lower sequence similarity the threading matrix SORDIS performs equally well, and a combined potential using SORDIS and PSSM can improve alignment quality for evolutionarily distantly related protein pairs.

4.
The dielectric relaxation spectrum over the frequency range 10^2 to 1.8×10^9 Hz of 4‐octyl‐4′‐cyanobiphenyl, 8CB, in bulk and confined to 200 nm diameter cylindrical pores is reported. We used matrices with parallel cylindrical pores, obtaining different alignments of the molecular director depending on the treatment. Results show that there are two relaxations in the isotropic phase and in the mesophases for parallel alignment and three for perpendicular alignment. The molecular origin of these modes and the effect of confinement on their dynamics are discussed. To compare the results for bulk and confined 8CB properly, a re‐scaling of the experimental data is proposed.

5.
We consider a novel numerical representation of proteins obtained by assigning to each amino acid a polar coordinate on a unit circle. As a result, a protein sequence can be represented as a one-dimensional numerical sequence, the entries of which, when subtracted, facilitate the search for an alignment between pairs of proteins of interest. The alignment is sought by shifting one sequence relative to the other by several sequence units to the left or to the right. The novel approach is illustrated on two yeast proteins having 174 and 171 amino acids. Visiting Emeritus from the Department of Mathematics & Computer Science, Drake University, Des Moines, Iowa.
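The shift-and-subtract search described above can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say which angle is assigned to which amino acid, so the alphabetical, equally spaced assignment in `ANGLE` is an assumption, and the helper names `encode`, `aligned_positions` and `best_shift` are hypothetical.

```python
import math

# Assumed ordering of the 20 amino acids; the source does not specify one.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Each amino acid gets an equally spaced angle (in radians) on the unit circle.
ANGLE = {aa: 2 * math.pi * k / len(AMINO_ACIDS) for k, aa in enumerate(AMINO_ACIDS)}

def encode(seq):
    """Turn a protein sequence into a one-dimensional numerical sequence."""
    return [ANGLE[aa] for aa in seq]

def aligned_positions(seq_a, seq_b, shift):
    """Return pairs (i, i + shift) at which the subtracted entries are zero."""
    a, b = encode(seq_a), encode(seq_b)
    pairs = []
    for i, angle in enumerate(a):
        j = i + shift
        # Identical residues map to the same float, so the difference is exactly 0.0.
        if 0 <= j < len(b) and angle - b[j] == 0.0:
            pairs.append((i, j))
    return pairs

def best_shift(seq_a, seq_b, max_shift):
    """Try shifts to the left and right and keep the one with most zeros."""
    return max(range(-max_shift, max_shift + 1),
               key=lambda s: len(aligned_positions(seq_a, seq_b, s)))
```

For example, `best_shift("MKTAYIAK", "KTAYLAKM", 3)` scans the shifts from -3 to 3 on two made-up sequences and returns the displacement that yields the most zero differences.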

6.
For biological applications, sequence alignment is an important strategy for analyzing DNA and protein sequences. Multiple sequence alignment is an essential methodology for studying biological data in tasks such as homology modeling and phylogenetic reconstruction. However, multiple sequence alignment is an NP-hard problem. In the past decades, progressive approaches have been proposed that successfully align multiple sequences by adopting iterative pairwise alignments. Owing to the rapid growth of next-generation sequencing technologies, a large number of sequences can be produced in a short period of time. When the problem instance is large, progressive alignment is time consuming. Parallel computing is a suitable solution for such applications, and the GPU is one of the important architectures in contemporary parallel computing research. Therefore, in this work we propose a GPU version of ClustalW v2.0.11, called CUDA ClustalW v1.0. The experimental results show that CUDA ClustalW v1.0 achieves more than 33× speedup in overall execution time compared with ClustalW v2.0.11.

7.
Chemical Physics Letters, 2006, 417(1-3): 173-178
We considered constructing three 8-component vectors for a DNA primary sequence using triplets of nucleic acid bases. For two DNA sequences, using the corresponding vectors, we constructed a set of 3 × 3 matrices called the related matrix. The normalized leading eigenvalues from the constructed matrices were selected as invariants to characterize the degree of similarity between the two DNA sequences. Construction of similarity/dissimilarity tables based on this invariant for sequences of DNA of the first exon of the β-globin gene from 11 species illustrates the utility of the newly formulated matrices for DNA.

8.
Point Accepted Mutation (PAM) is the Markov model of amino acid replacements in proteins introduced by Dayhoff and her co-workers (Dayhoff et al., 1978). The PAM matrices and other matrices based on the PAM model have been widely accepted as the standard scoring system of protein sequence similarity in protein sequence alignment tools. Here, we present Contact Accepted mutatiOn (CAO), a Markov model of protein residue contact mutations. The CAO model simulates the interchanging of structurally defined side-chain contacts, and introduces additional structural information into protein sequence alignments. Therefore, similarities between structurally conserved sequences can be detected even without apparent sequence similarity. CAO has been benchmarked on the HOMSTRAD database and a subset of the CATH database, by comparing sequence alignments with reference alignments derived from structural superposition. CAO yields scores that reflect coherently the structural quality of sequence alignments, which has implications particularly for homology modelling and threading techniques.
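The Markov property behind PAM-style models can be made concrete: the mutation probability matrix for n evolutionary steps is the one-step matrix raised to the n-th power. The sketch below shows this on a toy three-letter alphabet. The numbers in `TOY_PAM1` are invented for illustration and are neither Dayhoff's data nor CAO parameters, and the helper names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, n):
    """Raise a square matrix to the n-th power by repeated multiplication."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, m)
    return result

# Toy one-step matrix: row i holds the probabilities that state i is kept
# or replaced in one step. Values are made up; each row sums to 1.
TOY_PAM1 = [
    [0.990, 0.005, 0.005],
    [0.005, 0.990, 0.005],
    [0.005, 0.005, 0.990],
]

# The 250-step matrix is the one-step matrix applied 250 times.
toy_pam250 = mat_pow(TOY_PAM1, 250)
```

After many steps the diagonal entries fall from 0.99 toward the equilibrium value of 1/3, while each row still sums to 1, which is the behaviour expected of a Markov replacement model.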

9.
Recently, KOD and its related DNA polymerases have been used for preparing various modified nucleic acids, including not only base-modified nucleic acids, but also sugar-modified ones, such as bridged/locked nucleic acid (BNA/LNA), which would be promising candidates for nucleic acid drugs. However, thus far, reasons for the effectiveness of KOD DNA polymerase for such purposes have not been clearly elucidated. Therefore, using mutated KOD DNA polymerases, we studied here their catalytic properties upon enzymatic incorporation of nucleotide analogues with base/sugar modifications. Experimental data indicate that their characteristic kinetic properties enabled incorporation of various modified nucleotides. Among those KOD mutants, one achieved efficient successive incorporation of bridged nucleotides with a 2'-ONHCH2CH2-4' linkage. In this study, the characteristic kinetic properties of KOD DNA polymerase for modified nucleoside triphosphates were shown, and the effectiveness of genetic engineering in improvement of the enzyme for modified nucleotide polymerization has been demonstrated.

10.
A simple and efficient transformation of the zwitterionic luminarosine into a brightly fluorescent cationic analogue, namely 1‐amino‐9‐methoxy‐2,4,10‐triaza‐4b‐azoniaphenanthrene (3), is reported. The fluorescence quenching of 3 by common nucleotides, calf‐thymus (CT) DNA, and halide ions was investigated by means of spectrophotometric and spectrofluorometric methods. Intermolecular static and dynamic fluorescence‐quenching constants for quenching of 3 by nucleotides and halide ions were determined in aqueous solution. Evidence for formation of nonfluorescent ground‐state complexes of 3 with nucleotides and CT‐DNA is presented. Scatchard analysis of the CT‐DNA quenching data resulted in a binding constant of 2.8×10^4 M^-1 and a number of binding sites per base pair of 0.049.

11.
We introduce a method for ungapped local multiple alignment (ULMA) in a given set of amino acid or nucleotide sequences. This method explores two search spaces using a linked optimization strategy. The first search space M consists of all possible words of a given length W, defined on the residue alphabet. An evolutionary algorithm searches this space globally. The second search space P consists of all possible ULMAs in the sequence set, each ULMA being represented by a position vector defining exactly one subsequence of length W per sequence. This search space is sampled with hill-climbing processes. The searches of the two spaces are coupled by projecting high-scoring results from the global evolutionary search of M onto P. The hill-climbing processes then refine the optimization by local search, using the relative entropy between the ULMA and background residue frequencies as an objective function. We demonstrate some advantages of our strategy by analyzing difficult natural amino acid sequences and artificial datasets. A web interface is available at
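The objective function named above, the relative entropy between the ULMA columns and background residue frequencies, can be sketched as below. This is a minimal illustration and not the authors' implementation: it assumes the score is summed over columns in bits, uses no pseudocounts, and takes a uniform DNA background; the function name `ulma_relative_entropy` and the toy sequences are hypothetical.

```python
import math
from collections import Counter

def ulma_relative_entropy(seqs, positions, width, background):
    """Sum over columns of sum_r f(r) * log2(f(r) / background(r)).

    `positions` is the position vector: one start index per sequence,
    each selecting exactly one subsequence of length `width`.
    """
    words = [s[p:p + width] for s, p in zip(seqs, positions)]
    total = 0.0
    for col in range(width):
        counts = Counter(w[col] for w in words)
        for res, c in counts.items():
            f = c / len(words)  # observed frequency in this column
            total += f * math.log2(f / background[res])
    return total

# Assumed uniform background for a four-letter nucleotide alphabet.
UNIFORM_DNA = {base: 0.25 for base in "ACGT"}

toy_seqs = ["ACGTACGT", "TTACGTAA", "GGGACGTC"]  # made-up sequences
score = ulma_relative_entropy(toy_seqs, [0, 2, 3], 4, UNIFORM_DNA)
worse = ulma_relative_entropy(toy_seqs, [0, 0, 0], 4, UNIFORM_DNA)
```

The position vector `[0, 2, 3]` lines up the shared word `ACGT` in all three toy sequences and scores higher than the unaligned vector `[0, 0, 0]`; a hill-climbing step would accept moves in the position vector that raise this score.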

12.
The binding specificity of silver cations to abasic (AP) site-containing DNA was electrochemically investigated by comparison with fully matched DNA without the AP site. The AP site-containing DNA is designed so that only the nucleotide opposite the AP site is variable, which allows an unpaired nucleotide to coexist with a number of DNA base pairs. The surface of a gold electrode was modified with an AP site-containing DNA duplex, on which the Ag+ binding specificity was evaluated. Electrochemical investigations on the AP-DNA-modified electrodes reveal that Ag+ preferentially associates with the unpaired nucleotides rather than the coexisting base pairs and shows sequence-dependent binding, which is stronger for purines than for pyrimidines. Additionally, the hydrogen-bonding moieties of the unpaired nucleotides appear to be involved in Ag+ binding, as evidenced by a decrease of the redox signal when a ligand with a hydrogen-bonding moiety complementary to the nucleotide deoxycytidine is introduced. This is the first attempt to compare, within one DNA molecule, metal ion binding to a coexisting unpaired nucleotide and DNA base pairs. The present method demonstrates an easy way of investigating the binding specificity of heavy metal ions to an AP site in the presence of coexisting DNA base pairs.

13.
The carp mitochondrial URFA6L gene consists of 165 base pairs. The overall structural organization of the gene is very similar to that of the Xenopus URFA6L gene. Their nucleotide sequences exhibit 68% homology. The carp URFA6L gene encodes a protein of 54 amino acids. The amino acid composition of the protein is unusual because almost half of the residues consist of 5 hydrophobic amino acids (proline, tryptophan, leucine, isoleucine and tyrosine). A comparison between the amino acid sequences of 5 vertebrate URFA6L proteins and the yeast ATPase8 showed that they have weak but very important common structural features, suggesting that the vertebrate URFA6L proteins may function as ATPase8. The nucleotide sequence of the lysine tRNA gene from carp has been determined and represented in cloverleaf secondary structure. Similar to amphibian and mammalian mitochondrial tRNA(Lys) genes, the carp mitochondrial tRNA(Lys) gene also has some unusual structural features as compared with its cytoplasmic counterpart.

14.
The carp mitochondrial URFA6L gene consists of 165 base pairs. The overall structural organization of the gene is very similar to that of the Xenopus URFA6L gene. Their nucleotide sequences exhibit 68% homology. The carp URFA6L gene encodes a protein of 54 amino acids. The amino acid composition of the protein is unusual because almost half of the residues consist of 5 hydrophobic amino acids (proline, tryptophan, leucine, isoleucine and tyrosine). A comparison between the amino acid sequences of 5 vertebrate URFA6L proteins and the yeast ATPase 8 showed that they have weak but very important common structural features, suggesting that the vertebrate URFA6L proteins may function as ATPase8. The nucleotide sequence of the lysine tRNA gene from carp has been determined and represented in clover-leaf secondary structure. Similar to amphibian and mammalian mitochondrial tRNA(Lys) genes, the carp mitochondrial tRNA(Lys) gene also has some unusual structural features as compared with its cytoplasmic counterpart. A comparison between the nucleotide sequences of the tRNA(Lys) gene from 7 vertebrates showed that the most conservative portions are the anticodon loop, nucleotides 8 and 9, the variable loop, the anticodon stem and the aminoacyl stem. The least conservative portions are the D-loop and the T-loop. These structural features may show that the mitochondrial tRNA(Lys) has a different interaction with mitochondrial ribosome.

15.
Predicting RNA secondary structure using evolutionary history can be carried out by using an alignment of related RNA sequences with conserved structure. Accurately determining evolutionary substitution rates for base pairs and single stranded nucleotides is a concern for methods based on this type of approach. Determining these rates can be hard to do reliably without a large and accurate initial alignment, which ideally also has structural annotation. Hence, one must often apply rates extracted from other RNA families with trusted alignments and structures. Here, we investigate this problem by applying rates derived from tRNA and rRNA to the prediction of the much more rapidly evolving 5'-region of HIV-1. We find that the HIV-1 prediction is in agreement with experimental data, even though the relative evolutionary rate between A and G is significantly increased, both in stem and loop regions. In addition we obtained an alignment of the 5' HIV-1 region that is more consistent with the structure than that currently in the database. We added randomized noise to the original values of the rates to investigate the stability of predictions to rate matrix deviations. We find that changes within a fairly large range still produce reliable predictions and conclude that using rates from a limited set of RNA sequences is valid over a broader range of sequences.

16.
A training set of 55 antifungal P450 analogue inhibitors was used to construct receptor-independent four-dimensional quantitative structure-activity relationship (RI 4D-QSAR) models. Ten different alignments were used to build the models, and one alignment yields a significantly better model than the other alignments. Two different methodologies were used to measure the similarity of the best 4D-QSAR models of each alignment. One method compares the residual of fit between pairs of models using the cross-correlation coefficient of their residuals of fit as a similarity measure. The other method compares the spatial distributions of the IPE types (3D-pharmacophores) of pairs of 4D-QSAR models from different alignments. Optimum models from several different alignments have nearly the same correlation coefficients, r(2), and cross-validation correlation coefficients, xv-r(2), yet the 3D-pharmacophores of these models are very different from one another. The highest 3D-pharmacophore similarity correlation coefficient between any pair of 4D-QSAR models from the 10 alignments considered is only 0.216. However, the best 4D-QSAR models of each alignment do contain some proximate common pharmacophore sites. A test set of 10 compounds was used to validate the predictivity of the best 4D-QSAR models of each alignment. The "best" model from the 10 alignments has the highest predictivity. The inferred active sites mapped out by the 4D-QSAR models suggest that hydrogen bond interactions are not prevalent when this class of P450 analogue inhibitors binds to the receptor active site. This feature of the 4D-QSAR models is in agreement with the crystal structure results that indicate no ligand-receptor hydrogen bonds are formed.

17.
Summary: A new database of conserved amino acid residues is derived from the multiple sequence alignment of over 84 families of protein sequences that have been reported in the literature. This database contains sequences of conserved hydrophobic core patterns which are probably important for structure and function, since they are conserved for most sequences in that family. This database differs from other single-motif or signature databases reported previously, since it contains multiple patterns for each family. The new database is used to align a new sequence with the conserved regions of a family. This is analogous to reports in the literature where multiple sequence alignments are used to improve a sequence alignment. A program called HomologyPlot (suitable for IBM or compatible computers) uses this database to find homology of a new sequence to a family of protein sequences. There are several advantages to using multiple patterns. First, the program correctly identifies a new sequence as a member of a known family. Second, the search of the entire database is rapid and requires less than one minute. This is similar to performing a multiple sequence alignment of a new sequence to all of the known protein family sequences. Third, the alignment of a new sequence to family members is reliable and can reproduce the alignment of conserved regions already described in the literature. The speed and efficiency of this method are enhanced, since there is no need to score for insertions or deletions as is done in the more commonly used sequence alignment methods. In this method only the patterns are aligned. HomologyPlot also provides general information on each family, as well as a listing of patterns in a family.

18.
Thiourea, like some nucleotides and certain amino acids in a relatively narrow potential region, shows catalytic properties for the transfer of Cu(II) ions through an adsorbed layer of tri-n-butyl phosphate (TBP). In the trace region between 5×10^-8 M and 2×10^-5 M, thiourea forms current peaks whose height is proportional to the thiourea concentration in solution. In favourable cases, if thiourea is present with, for example, adenosine-5′-diphosphate (ADP), two peaks appear beside one another. It is supposed that the formation of an adsorption complex is the cause of the development of these peaks.

19.
The positions of a given fold always occupied by strong hydrophobic amino acids (V, I, L, F, M, Y, W), which we call “topohydrophobic positions”, were detected and their properties demonstrated within 153 non-redundant families of homologous domains, through 3D structural alignments. Sets of divergent sequences possessing at least four to five members appear to be as informative as larger sets, provided that their mean pairwise sequence identity is low. Amino acids in topohydrophobic positions exhibit several interesting features: they are much more buried than their equivalents in non-topohydrophobic positions, their side chains are far less dispersed, and they often constitute a lattice of close contacts in the inner core of globular domains. In most cases, each regular secondary structure possesses one to three topohydrophobic positions, which cluster in the domain core. Moreover, using sensitive alignment processes such as hydrophobic cluster analysis (HCA), it is possible to identify topohydrophobic positions from only a small set of divergent sequences. Amino acids in topohydrophobic positions, which can be identified directly from sequences, constitute key markers of protein folds and define long-range structural constraints, which, together with secondary structure predictions, limit the number of possible conformations for a given fold. Received: 24 April 1998 / Accepted: 4 August 1998 / Published online: 16 November 1998
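The definition of a topohydrophobic position translates directly into a column test on an alignment. The sketch below is a minimal illustration: it assumes the sequences have already been aligned (the study itself uses 3D structural alignments, which this code does not perform), treats a gap as a non-hydrophobic entry, and uses the hypothetical function name `topohydrophobic_columns` on made-up rows.

```python
# The seven strong hydrophobic amino acids listed in the abstract.
STRONG_HYDROPHOBIC = set("VILFMYW")

def topohydrophobic_columns(alignment):
    """Return the indices of columns occupied only by V, I, L, F, M, Y or W.

    `alignment` is a list of equal-length aligned sequences; '-' marks a gap
    and therefore disqualifies its column.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [col for col in range(length)
            if all(row[col] in STRONG_HYDROPHOBIC for row in alignment)]

# Toy alignment of three made-up sequences.
toy_alignment = [
    "MKVLA-F",
    "LRIVAGF",
    "IKLFSDW",
]
columns = topohydrophobic_columns(toy_alignment)
```

In the toy alignment, columns 0, 2, 3 and 6 qualify, while the column containing a gap and the columns with polar or small residues do not.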

20.
A fast computer algorithm enables computation of the permanents of sparse matrices, specifically molecular adjacency matrices. Examples and results are presented, along with a discussion of the relationship of the permanent to the Kekulé structure count. A simple method is presented for determining the Kekulé structure count of alternant hydrocarbons. For these hydrocarbons, the square of the Kekulé structure count is equal to the permanent of the adjacency matrix. In addition, for alternant structures the adjacency matrix for N atoms can be written in such a way that only an N/2 × N/2 matrix need be evaluated. The Kekulé structure count correlates with topological indices. The inclusion of the number of cycles improves the fit. Compared with previous results, the variance decreases by 74%. The calculated standard heat of formation correlates with the logarithm of the Kekulé structure count. This heat increases by 349 kJ/mol each time the Kekulé structure count increases by one order of magnitude. © 2002 Wiley Periodicals, Inc. Int J Quantum Chem, 2002
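The relation between the permanent and the Kekulé structure count can be checked on benzene, which has two Kekulé structures. The sketch below is a brute-force illustration only, not the fast sparse-matrix algorithm of the abstract: it sums over all permutations, which is practical only for very small matrices. The atom numbering (0 to 5 around the ring, even atoms taken as one class of the alternant structure) and the helper names are assumptions.

```python
from itertools import permutations

def permanent(matrix):
    """Permanent by direct summation over permutations (small matrices only)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        prod = 1
        for i, j in enumerate(perm):
            prod *= matrix[i][j]
            if prod == 0:
                break  # this permutation contributes nothing
        total += prod
    return total

def ring_adjacency(n):
    """Adjacency matrix of an n-membered ring; atom i is bonded to i-1 and i+1."""
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        a[i][(i + 1) % n] = 1
        a[(i + 1) % n][i] = 1
    return a

benzene = ring_adjacency(6)

# N/2 x N/2 block: rows are the even-numbered atoms (0, 2, 4) and columns
# the odd-numbered atoms (1, 3, 5) of the alternant structure.
half_block = [[benzene[i][j] for j in (1, 3, 5)] for i in (0, 2, 4)]

kekule_count = permanent(half_block)   # permanent of the half-size block
full_permanent = permanent(benzene)    # permanent of the 6 x 6 matrix
```

For benzene the half-size block gives 2 and the full adjacency matrix gives 4, which is consistent with the statement that the permanent of the adjacency matrix equals the square of the Kekulé structure count.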
