Similar articles
 20 similar articles found.
1.
Point Accepted Mutation (PAM) is the Markov model of amino acid replacements in proteins introduced by Dayhoff and her co-workers (Dayhoff et al., 1978). The PAM matrices and other matrices based on the PAM model have been widely accepted as the standard scoring system for protein sequence similarity in protein sequence alignment tools. Here, we present Contact Accepted mutatiOn (CAO), a Markov model of protein residue contact mutations. The CAO model simulates the interchange of structurally defined side-chain contacts and introduces additional structural information into protein sequence alignments. As a result, similarities between structurally conserved sequences can be detected even without apparent sequence similarity. CAO has been benchmarked on the HOMSTRAD database and a subset of the CATH database by comparing sequence alignments with reference alignments derived from structural superposition. CAO yields scores that coherently reflect the structural quality of sequence alignments, which has implications particularly for homology modelling and threading techniques.
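To make the Markov-model idea concrete: in the Dayhoff scheme, the transition matrix for n accepted point mutations is the one-step matrix multiplied by itself n times. The sketch below shows this on a toy three-letter alphabet. The matrix `TOY_STEP` and its probabilities are invented for illustration; they are not Dayhoff's values and not CAO parameters.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, n):
    """Return m raised to the n-th power by repeated multiplication."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, m)
    return result


# Hypothetical one-step transition matrix: each row is a probability
# distribution over the replacement of one letter by each letter.
TOY_STEP = [
    [0.98, 0.01, 0.01],
    [0.01, 0.98, 0.01],
    [0.01, 0.01, 0.98],
]

# Analogue of a "250-step" matrix for the toy alphabet.
toy_250 = mat_pow(TOY_STEP, 250)
```

Each row of `toy_250` remains a probability distribution, and its diagonal entries drift from 0.98 toward the uniform value 1/3 as more steps accumulate, which is why higher-numbered matrices of this kind suit more distant comparisons.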

2.
Despite recent advances in fold recognition algorithms that identify template structures with distant homology to the target sequence, the quality of the target-template alignment can be a major problem for distantly related proteins in comparative modeling. Here we report for the first time on the use of ensembles of pairwise alignments obtained by stochastic backtracking as a means to improve three-dimensional comparative protein models. In every one of the 35 cases, the ensemble produced by the program probA contained alignments that were closer to the structural alignment than the optimal alignment was. In addition, we identified the lowest energy structure in each ensemble according to four different structural assessment methods and compared these structures with the models built from the optimal and structural alignments. The structural assessment methods consisted of the DFIRE, DOPE, and ProsaII statistical potential energies and the potential energy from the CHARMM protein force field coupled to a Generalized Born implicit solvent model. The results demonstrate that the generation of alignment ensembles through stochastic backtracking using probA, combined with one of the statistical potentials for assessing three-dimensional structures, can be used to improve comparative models.

3.
The use of a computational docking protocol in conjunction with a protein homology model to derive molecular alignments for Comparative Molecular Field Analysis (CoMFA) was examined. In particular, the DOCK program and a model of the herbicidal target site, photosystem II (PSII), were used to derive alignments for two PSII inhibitor training sets, a set of benzo- and naphthoquinones and a set of butenanilides. The protein design software in the QUANTA molecular modeling package was used to develop a homology model of spinach PSII based on the reported amino acid sequence and the X-ray crystal structure of the purple bacterium reaction center. The model is very similar to other reported PSII protein homology models. DOCK was then used to derive alignments for CoMFA modeling by docking the inhibitors in the PSII binding pocket. The molecular alignments produced from docking yielded highly predictive CoMFA models. By comparison, the more traditional atom-atom alignments of the same two training sets failed to produce predictive CoMFA models. The general utility of this application for homology model refinement and as an alternative scoring method is discussed.

4.
5.
6.
ProBiS-Database is a searchable repository of precalculated local structural alignments in proteins detected by the ProBiS algorithm in the Protein Data Bank. Identification of functionally important binding regions of the protein is facilitated by structural similarity scores mapped to the query protein structure. PDB structures that have been aligned with a query protein may be rapidly retrieved from the ProBiS-Database, which is thus able to generate hypotheses concerning the roles of uncharacterized proteins. Presented with an uncharacterized protein structure, ProBiS-Database can discern relationships between such a query protein and other, better-known proteins in the PDB. Fast access and a user-friendly graphical interface promote easy exploration of this database of over 420 million local structural alignments. The ProBiS-Database is updated weekly and is freely available online at http://probis.cmm.ki.si/database.

7.
Herein, we describe a method to flexibly align molecules (FLAME = FLexibly Align MolEcules). FLAME aligns two molecules by first finding maximum common pharmacophores between them using a genetic algorithm. The resulting alignments are then subjected to simultaneous optimization of their internal energies and an alignment score. The utility of the method in pairwise alignment, multiple molecule flexible alignment, and database searching was examined. For pairwise alignment, two carboxypeptidase ligands (Protein Data Bank codes and ), two estrogen receptor ligands ( and ), and two thrombin ligands ( and ) were used as test sets. Alignments generated by FLAME starting from CONCORD structures compared very well to the X-ray structures (average root-mean-square deviation = 0.36 Å) even without further minimization in the presence of the protein. For multiple flexible alignments, five structurally diverse D3 receptor ligands were used as a test set. The FLAME alignment automatically identified three common pharmacophores: a base, a hydrogen-bond acceptor, and a hydrophobe/aromatic ring. The best alignment was then used to search the MDDR database. The search results were compared to the results using atom pair and Daylight fingerprint similarity. A similar database search comparison was also performed using estrogen receptor modulators. In both cases, hits identified by FLAME were structurally more diverse than those from the atom pair and Daylight fingerprint methods.

8.
Protein function prediction is one of the central problems in computational biology. We present a novel automated protein structure-based function prediction method using libraries of local residue packing patterns that are common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph in which residue vertices (with the residue name used as the vertex label) are connected by geometrical proximity edges. The approach employs two steps. First, it uses a fast subgraph mining algorithm to find all occurrences of family-specific labeled subgraphs for all well-characterized protein structural and functional families. Second, it queries a new structure for occurrences of a set of motifs characteristic of a known family, using a graph index to speed up Ullmann's subgraph isomorphism algorithm. The confidence of function inference from structure depends on the number of family-specific motifs found in the query structure compared with their distribution in a large non-redundant database of proteins. This method can assign a new structure to a specific functional family in cases where sequence alignments, sequence patterns, structural superposition and active site templates fail to provide accurate annotation.
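The graph representation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `contact_graph`, the use of one representative point per residue, and the 8.0 Å distance cutoff are all assumptions, since the abstract does not say how proximity edges are defined.

```python
import math


def contact_graph(residues, cutoff=8.0):
    """Build a labeled proximity graph from residues.

    residues: list of (residue_name, (x, y, z)) tuples, one point per residue.
    cutoff:   hypothetical distance threshold in angstroms for adding an edge.
    Returns (labels, edges), where labels[i] is the name of vertex i and
    edges is a set of index pairs (i, j) with i < j.
    """
    labels = [name for name, _ in residues]
    edges = set()
    for i in range(len(residues)):
        for j in range(i + 1, len(residues)):
            if math.dist(residues[i][1], residues[j][1]) <= cutoff:
                edges.add((i, j))
    return labels, edges


# Toy input: three residues with made-up coordinates.
toy_residues = [
    ("ALA", (0.0, 0.0, 0.0)),
    ("GLY", (5.0, 0.0, 0.0)),
    ("SER", (20.0, 0.0, 0.0)),
]
toy_labels, toy_edges = contact_graph(toy_residues)
```

With the toy coordinates, only the first two residues lie within the cutoff, so the graph has a single edge. Labeled subgraphs of such a graph are the kind of object that a subgraph mining or subgraph isomorphism step would then operate on.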

10.
Summary: A new database of conserved amino acid residues is derived from the multiple sequence alignment of over 84 families of protein sequences that have been reported in the literature. This database contains sequences of conserved hydrophobic core patterns, which are probably important for structure and function since they are conserved in most sequences of a family. This database differs from other single-motif or signature databases reported previously in that it contains multiple patterns for each family. The new database is used to align a new sequence with the conserved regions of a family. This is analogous to reports in the literature where multiple sequence alignments are used to improve a sequence alignment. A program called Homology-Plot (suitable for IBM or compatible computers) uses this database to find homology of a new sequence to a family of protein sequences. There are several advantages to using multiple patterns. First, the program correctly identifies a new sequence as a member of a known family. Second, the search of the entire database is rapid and requires less than one minute. This is similar to performing a multiple sequence alignment of a new sequence to all of the known protein family sequences. Third, the alignment of a new sequence to family members is reliable and can reproduce the alignment of conserved regions already described in the literature. The speed and efficiency of this method are enhanced because there is no need to score insertions or deletions, as is done in the more commonly used sequence alignment methods. In this method only the patterns are aligned. HomologyPlot also provides general information on each family, as well as a listing of patterns in a family.

11.
In the absence of an experimentally solved structure, a homology model of a protein target can be used instead for virtual screening of drug candidates by docking and scoring. This approach poses a number of questions regarding the choice of the template to use in constructing the model, the accuracy of the screening results, and the importance of allowing for protein flexibility. The present study addresses such questions with compound screening calculations for multiple homology models of five drug targets. A central result is that docking to homology models frequently yields enrichments of known ligands as good as those obtained by docking to a crystal structure of the actual target protein. Interestingly, however, standard measures of the similarity between the template used to build the homology model and the targeted protein show little correlation with the effectiveness of the screening calculations, and docking to the template itself is often as successful as docking to the corresponding homology model. Treating key side chains as mobile produces a modest improvement in the results. The reasons for these sometimes unexpected results, and their implications for future methodologic development, are discussed.
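Enrichment of known ligands is often summarized with an enrichment factor: the hit rate among the top-ranked fraction of the screened library divided by the hit rate in the whole library. The abstract does not state which enrichment measure the study used, so the sketch below is only one common definition; the function name and the toy ranking are hypothetical.

```python
def enrichment_factor(ranked_is_active, top_fraction):
    """Enrichment factor for a ranked screening list.

    ranked_is_active: list of booleans ordered from best to worst docking
                      score; True marks a known ligand.
    top_fraction:     fraction of the list treated as the "top" (e.g. 0.2).
    """
    n_total = len(ranked_is_active)
    actives_total = sum(ranked_is_active)
    if n_total == 0 or actives_total == 0:
        raise ValueError("need a non-empty list with at least one active")
    n_top = max(1, round(n_total * top_fraction))
    actives_top = sum(ranked_is_active[:n_top])
    # Hit rate in the top slice relative to the hit rate overall.
    return (actives_top / n_top) / (actives_total / n_total)


# Toy ranking of ten compounds, three of which are known ligands.
toy_ranking = [True, True, False, False, False,
               False, False, False, True, False]
toy_ef = enrichment_factor(toy_ranking, 0.2)
```

A value above 1 means known ligands are concentrated near the top of the ranking; a value of 1 corresponds to what random selection would give on average.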

12.
The success of ligand docking calculations typically depends on the quality of the receptor structure. Given improvements in protein structure prediction approaches, approximate protein models now can be routinely obtained for the majority of gene products in a given proteome. Structure‐based virtual screening of large combinatorial libraries of lead candidates against theoretically modeled receptor structures requires fast and reliable docking techniques capable of dealing with structural inaccuracies in protein models. Here, we present Q‐DockLHM, a method for low‐resolution refinement of binding poses provided by FINDSITELHM, a ligand homology modeling approach. We compare its performance to that of classical ligand docking approaches in ligand docking against a representative set of experimental (both holo and apo) as well as theoretically modeled receptor structures. Docking benchmarks reveal that, unlike all‐atom docking, Q‐DockLHM exhibits the desired tolerance to deformation of the receptor structure. Our results suggest that an evolution‐based approach to ligand homology modeling followed by fast low‐resolution refinement can achieve satisfactory performance in ligand‐binding pose prediction, with promising applicability at the proteome scale. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010

13.
There is currently far more sequence information than structural information available, and the ability to use homology models for virtual screening is desirable in many cases where structures have not yet been solved. This review focuses on the application of protein kinase homology models to virtual screening. In addition to reviewing previous cases in which kinase homology models have been used in inhibitor design, we present new data, useful for template selection in homology modeling applications, indicating that the template structure with the highest sequence or structural similarity to the target structure may not always be the best choice. This new work explored the simple hypothesis that better results might be obtained for docking a ligand to a target receptor using a homology model of the target created from a different kinase template co-crystallized with the ligand than using a crystal structure of the actual kinase target that is unliganded or bound to an unrelated ligand. This hypothesis was tested in docking studies of staurosporine with eight different kinases: AutoDock was used to dock staurosporine to homology models of each kinase created from staurosporine-bound template structures, and the results were compared with docking staurosporine to crystal structures of the target kinase that were obtained in complex with a non-staurosporine ligand or no ligand. It was found that the homology models performed as well as or better than the crystal structures, suggesting that using a homology model created from a template crystallized with a representative ligand may in some cases be a preferred approach, especially in virtual screening experiments that focus on enriching for members of a particular inhibitor class.

14.
The three-dimensional (3D) superimposition of molecules of one biological target reflecting their relative bioactive orientation is key for several ligand-based drug design studies (e.g., QSAR studies, pharmacophore modeling). However, without sufficient ligand-protein complex structures, an experimental alignment is difficult or often impossible to obtain. Several computational 3D alignment tools have been developed by academic or commercial groups to address this challenge. Here, we present a new approach, MARS (Multiple Alignments by ROCS-based Similarity), that is based on the pairwise alignment of all molecules within the data set using the tool ROCS (Rapid Overlay of Chemical Structures). Each pairwise alignment is scored, and the results are captured in a score matrix. The ideal superimposition of the compounds in the set is then identified by analyzing the score matrix and building a superimposition of all molecules stepwise. The algorithm exploits similarities among all molecules in the data set to compute an optimal 3D alignment. The alignment tool presented here can be used for several applications, including pharmacophore model generation, 3D QSAR modeling, 3D clustering, identification of structural outliers, and addition of compounds to an already existing alignment. Case studies are shown, validating the 3D alignments for six different data sets.
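One simple way to turn a pairwise score matrix into a stepwise build order is a greedy scheme: start from the best-scoring pair, then repeatedly add the molecule that scores highest against any molecule already placed. The abstract does not describe how MARS actually analyzes its score matrix, so the sketch below is an assumed illustration of the general idea only; `greedy_order` and the toy scores are hypothetical.

```python
def greedy_order(scores):
    """Return a build order for a symmetric pairwise score matrix.

    scores[i][j] is the (hypothetical) overlay score of molecules i and j;
    higher is better. Requires at least two molecules.
    """
    n = len(scores)
    # Seed with the single best-scoring pair.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: scores[pair[0]][pair[1]],
    )
    placed = [first, second]
    remaining = set(range(n)) - {first, second}
    while remaining:
        # Add the molecule with the best score to anything already placed.
        nxt = max(
            sorted(remaining),
            key=lambda m: max(scores[m][p] for p in placed),
        )
        placed.append(nxt)
        remaining.remove(nxt)
    return placed


# Toy symmetric score matrix for four molecules.
toy_scores = [
    [0.0, 0.3, 0.8, 0.1],
    [0.3, 0.0, 0.2, 0.7],
    [0.8, 0.2, 0.0, 0.5],
    [0.1, 0.7, 0.5, 0.0],
]
toy_order = greedy_order(toy_scores)
```

For the toy matrix, molecules 0 and 2 form the seed pair, molecule 3 joins next through its score of 0.5 against molecule 2, and molecule 1 is added last.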

15.
Class A G-protein-coupled receptors (GPCRs) are among the most important targets for drug discovery. However, a large set of experimental structures, essential for a structure-based approach, will likely remain unavailable in the near future. Thus, there is a real need for modeling tools that can satisfactorily characterize at least the binding site of these receptors. Using experimentally solved GPCRs, we have enhanced and validated the ligand-steered homology method through cross-modeling and investigated the performance of the resulting models in docking-based screening. The ligand-steered modeling method uses information about existing ligands to optimize the binding site by accounting for protein flexibility. We found that our method is able to generate quality models of GPCRs by using one structural template. These models perform better than templates, crude homology models, and random selection in small-scale high-throughput docking. Better quality models typically exhibit higher enrichment in docking exercises. Moreover, they were found to be reliable for selectivity prediction. Our results support the conclusion that the ligand-steered homology modeling method can successfully characterize pharmacologically relevant sites through a fully flexible-ligand, flexible-receptor procedure.

16.
17.
Evolutionarily related proteins have similar sequences. Such similarity is called homology and can be described using substitution matrices such as Blosum 60. Naturally occurring homologous proteins usually have similar stable tertiary structures, and this fact is used in so-called homology modeling. In contrast, the artificial protein designed by the Regan group has a sequence 50% identical to that of the B1 domain of streptococcal IgG-binding protein but a structure similar to that of the protein Rop. In this study, we asked whether artificial similar protein sequences (pseudohomologs) tend to encode similar protein structures, as similar proteins existing in nature do. To answer this question, we designed sets of protein sequences (pseudohomologs) homologous to sequences having known three-dimensional structures (template structures), with the same number of identities, the same composition, and the same level of homology according to the Blosum 60 substitution matrix as the known natural homolog. We compared the structural features of homologs and pseudohomologs by fitting them to the template structure. The quality of such structures was evaluated by threading potentials. The packing quality was measured using three-dimensional homology models. The packing quality of the models was worse for the pseudohomologs than for real homologs. The native homologs have better threading potentials (indicating better sequence-structure fit) in the native structure than the designed sequences. Therefore, we have shown that threading potentials and proper packing are evolutionarily more strongly conserved than sequence homology measured using the Blosum 60 matrix. Our results indicate that three-dimensional protein structure is evolutionarily more conserved than would be expected from sequence conservation alone.

18.
For biological applications, sequence alignment is an important strategy for analyzing DNA and protein sequences. Multiple sequence alignment is an essential methodology for studying biological data in tasks such as homology modeling and phylogenetic reconstruction. However, multiple sequence alignment is an NP-hard problem. In past decades, progressive approaches have been proposed that successfully align multiple sequences by adopting iterative pairwise alignments. Owing to the rapid growth of next-generation sequencing technologies, a large number of sequences can be produced in a short period of time. When the problem instance is large, progressive alignment is time consuming. Parallel computing is a suitable solution for such applications, and the GPU is one of the important architectures in contemporary parallel computing research. Therefore, in this work we propose a GPU version of ClustalW v2.0.11, called CUDA ClustalW v1.0. The experimental results show that CUDA ClustalW v1.0 achieves a speedup of more than 33× in overall execution time compared with ClustalW v2.0.11.
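The pairwise alignments that progressive methods build on are classically computed by dynamic programming. The sketch below computes a Needleman-Wunsch global alignment score with a simple linear gap penalty. It is a plain CPU illustration under assumed toy scoring values (match +1, mismatch -1, gap -1), not the scoring scheme of ClustalW and not the GPU implementation described in the abstract.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score of strings a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table at a time.
    """
    # Row 0: aligning the empty prefix of a with prefixes of b costs gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[len(b)]


toy_identical = nw_score("ACGT", "ACGT")
toy_one_gap = nw_score("ACGT", "AGT")
```

The table has one cell per pair of sequence positions, so the cost of a single pairwise alignment grows with the product of the two sequence lengths, and a set of N sequences requires N(N-1)/2 such comparisons when every pair is scored.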

19.
Due to the exponential growth of sequenced genomes, the need to quickly provide accurate annotation for existing and new sequences is paramount to facilitate biological research. Current sequence comparison approaches fail to detect homologous relationships when sequence similarity is low. Support vector machine (SVM) algorithms approach this problem by transforming all proteins into a feature space of equal dimension based on protein properties, such as sequence similarity scores against a basis set of proteins or motifs. This multivariate representation of the protein space is then used to build a classifier specific to a pre-defined protein family. However, this approach is not well suited to large-scale annotation. We have developed an SVM approach that formulates remote homology detection as a single pairwise comparison problem: the two feature vectors for a pair of sequences are integrated into a single vector representation, which is used to build one classifier that separates sequence pairs into homologs and non-homologs. This pairwise SVM approach significantly improves remote homology detection on the benchmark dataset, quantified as the area under the receiver operating characteristic curve: 0.97 versus 0.73 and 0.70 for PSI-BLAST and Basic Local Alignment Search Tool (BLAST), respectively.
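The area under the receiver operating characteristic curve quoted in this abstract has a direct interpretation: it is the probability that a randomly chosen positive example (here, a homologous pair) receives a higher classifier score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it that way on invented scores; the function name and the toy data are hypothetical, not taken from the benchmark.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: classifier outputs, higher meaning "more likely homologous".
    labels: booleans, True for positive examples.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy scores for four sequence pairs, two of them true homologs.
toy_auc = roc_auc([0.9, 0.8, 0.7, 0.3], [True, False, True, False])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a sense of the gap between the reported 0.97 and the 0.73 and 0.70 baselines.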

20.
X-ray-based alignments of bioactive compounds are commonly used to correlate structural changes with changes in potencies, ultimately leading to three-dimensional quantitative structure–activity relationships such as CoMFA or CoMSIA models that can provide further guidance for the design of new compounds. We have analyzed data sets where the alignment of the compounds is entirely based on experimentally derived ligand poses from X-ray crystallography. We developed CoMFA and CoMSIA models from these X-ray-determined receptor-bound conformations and compared the results with models generated from ligand-centric template CoMFA, finding that the fluctuations in the positions and conformations of compounds that dominate X-ray-based alignments can yield poorer predictions than those from the self-consistent template CoMFA alignments. Also, when multiple different binding modes exist, structural interpretation in terms of binding site constraints can often be simpler with template-based alignments than with X-ray-based alignments.
