Similar literature
Found 20 similar documents; search took 78 ms.
1.
Summary: This paper describes techniques for calculating the degree of similarity between an input query molecule and each of the molecules in a database of 3-D chemical structures. The inter-molecular similarity measure used is the number of atoms in the 3-D common substructure (CS) between the two molecules being compared. Identifying 3-D CSs is very demanding of computational resources, even when an efficient clique detection algorithm is used for this purpose. Two types of upper-bound calculation are described that reduce the number of exact CS searches needed to identify those molecules in a database that are similar to a 3-D target molecule.
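
The abstract above does not say which two upper bounds are used, so the following is only a minimal sketch of the screening idea under an assumed bound: a common substructure cannot contain more atoms of any element than the smaller of the two molecules' counts for that element. The function names, the element-list molecule representation, and the `min_size` threshold are all hypothetical.

```python
from collections import Counter

def element_count_upper_bound(elements_a, elements_b):
    """Upper bound on common-substructure size from element counts alone (assumed bound)."""
    counts_a = Counter(elements_a)
    counts_b = Counter(elements_b)
    # Counter & Counter keeps, for each shared element, the smaller of the two counts.
    return sum((counts_a & counts_b).values())

def candidates_for_exact_search(query, database, min_size):
    """Keep only molecules whose cheap bound could still reach min_size atoms."""
    return [name for name, elements in database.items()
            if element_count_upper_bound(query, elements) >= min_size]

# Hypothetical toy data: each molecule is just a list of element symbols.
query = ["C", "C", "C", "N", "O"]
database = {
    "mol_a": ["C", "C", "N", "O", "O"],
    "mol_b": ["C", "S", "S", "Cl"],
}
survivors = candidates_for_exact_search(query, database, min_size=3)
print(survivors)
```

Only the molecules that survive this cheap filter would be passed to the expensive exact CS search; the bound never underestimates the true CS size, so no true match is discarded.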

2.
Summary: This paper reports a comparison of several methods for measuring the degree of similarity between pairs of 3-D chemical structures that are represented by inter-atomic distance matrices. The methods tested use the distance information in very different ways and have very different computational requirements. Experiments with 10 small datasets, for which both structural and biological activity data are available, suggest that the most cost-effective technique is based on a mapping procedure that tries to match pairs of atoms, one from each of the molecules being compared, that have neighbouring atoms at approximately the same distances.

3.
This paper describes a method for calculating the similarity between pairs of chemical structures represented by 3D molecular graphs. The method is based on a graph matching procedure that accommodates conformational flexibility by using distance ranges between pairs of atoms, rather than fixing the atom pair distances. These distance ranges are generated using triangle and tetrangle bound smoothing techniques from distance geometry. The effectiveness of the proposed method in retrieving other compounds of like biological activity is evaluated, and the results are compared with those obtained from other, 2D-based methods for similarity searching.
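
As a rough illustration of the triangle bound smoothing mentioned above, the sketch below tightens lower and upper distance bounds using the triangle inequality (upper bound of i-j is at most the sum of the upper bounds via k; lower bound of i-j is at least a lower bound minus an upper bound via k). It is an assumption-laden sketch, not the paper's implementation: the function name and matrices are hypothetical, the lower bounds get a single sweep rather than being iterated to convergence, and tetrangle smoothing is not shown.

```python
def triangle_smooth(lower, upper):
    """Tighten distance bounds with the triangle inequality (Floyd-Warshall style)."""
    n = len(upper)
    lo = [row[:] for row in lower]
    up = [row[:] for row in upper]
    # Upper bounds: d(i,j) <= d(i,k) + d(k,j).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if up[i][j] > up[i][k] + up[k][j]:
                    up[i][j] = up[i][k] + up[k][j]
    # Lower bounds: d(i,j) >= d(i,k) - d(k,j), using the smoothed upper bounds.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                candidate = max(lo[i][k] - up[k][j], lo[j][k] - up[k][i])
                if candidate > lo[i][j]:
                    lo[i][j] = candidate
    return lo, up

# Hypothetical three-atom example: distances 0-1 and 1-2 are fixed,
# while the 0-2 distance starts with a very loose range.
lower = [[0.0, 3.0, 0.5], [3.0, 0.0, 1.0], [0.5, 1.0, 0.0]]
upper = [[0.0, 3.0, 100.0], [3.0, 0.0, 1.0], [100.0, 1.0, 0.0]]
smoothed_lower, smoothed_upper = triangle_smooth(lower, upper)
print(smoothed_lower[0][2], smoothed_upper[0][2])
```

The resulting range for each atom pair is what a flexible matching procedure could compare, instead of a single fixed distance.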

4.
This paper describes a program for 3D similarity searching, called CLIP (for Candidate Ligand Identification Program), that uses the Bron-Kerbosch clique detection algorithm to find those structures in a file that have large structures in common with a target structure. Structures are characterized by the geometric arrangement of pharmacophore points, and the similarity between two structures is calculated using modifications of the Simpson and Tanimoto association coefficients. These modifications take into account the fact that a distance tolerance is required to ensure that pairs of interatomic distances can be regarded as equivalent during the clique-construction stage of the matching algorithm. Experiments with HIV assay data demonstrate the effectiveness and the efficiency of this approach to virtual screening.
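
The matching scheme described above can be sketched as follows: build a correspondence graph over pairs of same-type pharmacophore points, connect pairs whose inter-point distances agree within a tolerance, and find the largest clique with a basic Bron-Kerbosch recursion. This is not CLIP's code. The data layout, function names, and tolerance are hypothetical, and the coefficients are the standard Tanimoto and Simpson forms, not the modified versions the abstract refers to.

```python
from itertools import combinations
from math import dist

def correspondence_graph(mol_a, mol_b, tol):
    """Nodes pair same-type points; edges join pairs whose distances agree within tol."""
    nodes = [(i, j) for i, (type_a, _) in enumerate(mol_a)
             for j, (type_b, _) in enumerate(mol_b) if type_a == type_b]
    adj = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # a point may be matched only once
        d_a = dist(mol_a[i][1], mol_a[k][1])
        d_b = dist(mol_b[j][1], mol_b[l][1])
        if abs(d_a - d_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj

def largest_clique(adj):
    """Basic Bron-Kerbosch (no pivoting); returns one maximum clique."""
    best = []
    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    expand([], set(adj), set())
    return best

def tanimoto(common, size_a, size_b):
    return common / (size_a + size_b - common)

def simpson(common, size_a, size_b):
    return common / min(size_a, size_b)

# Hypothetical pharmacophore points: (type, (x, y, z)).
target = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("ring", (0, 4, 0))]
candidate = [("donor", (1, 1, 0)), ("acceptor", (4.1, 1, 0)),
             ("ring", (1, 5, 0)), ("donor", (9, 9, 9))]
clique = largest_clique(correspondence_graph(target, candidate, tol=0.2))
common = len(clique)
print(common, tanimoto(common, len(target), len(candidate)),
      simpson(common, len(target), len(candidate)))
```

The Simpson form normalizes by the smaller structure, so it rewards a small target that is fully contained in a larger candidate, whereas Tanimoto penalizes the size difference.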

5.
The use of two types of parallel computer hardware for increasing the efficiency of processing in chemical structure data bases is discussed. The distributed array processor can be used for the clustering of 2-D chemical structure data bases by using the Jarvis-Patrick clustering method and for the ranking of output in an experimental system for substructure searching in the 3-D macromolecules in the Protein Data Bank. The Inmos transputer can be used in the construction of PC-based systems for 2-D substructure searching and in the identification of the maximal substructures common to pairs of 3-D molecules.

6.
An algorithm for similarity recognition of molecules and molecular clusters is presented which also establishes the optimum matching among atoms of different structures. In the first step of the algorithm, a set of molecules is coarsely superimposed by transforming them into a common reference coordinate system. The optimum atomic matching among structures is then found with the help of the Hungarian algorithm. For this, pairs of structures are represented as complete bipartite graphs with a weight function that uses intermolecular atomic distances. In the final step, a rotational superposition method is applied using the optimum atomic matching found. This yields the minimum root mean square deviation of intermolecular atomic distances with respect to arbitrary rotation and translation of the molecules. Combined with an effective similarity prescreening method, the algorithm shows robustness and an effective quadratic scaling of computational time with the number of atoms.
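
The atom-matching step above is a minimum-cost assignment problem on a complete bipartite graph. The sketch below solves the same problem by brute force over all permutations, which is only a small-scale stand-in for the Hungarian algorithm (the Hungarian algorithm reaches the same optimum in polynomial time). It assumes the two structures are already coarsely superimposed and have equal atom counts; the squared-distance weight, the function name, and the coordinates are hypothetical, and the final rotational refinement is not shown.

```python
from itertools import permutations
from math import dist, sqrt

def best_assignment(coords_a, coords_b):
    """Return the atom permutation minimizing summed squared distances, and its RMSD."""
    n = len(coords_a)
    # Weight of edge (i, j) in the complete bipartite graph: squared distance.
    cost = [[dist(a, b) ** 2 for b in coords_b] for a in coords_a]
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):  # brute force: feasible only for tiny n
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, sqrt(best_cost / n)

# Hypothetical example: the same three atoms listed in a different order.
coords_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
coords_b = [(0.0, 2.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
matching, rmsd = best_assignment(coords_a, coords_b)
print(matching, rmsd)
```

Here `matching[i]` is the index in `coords_b` assigned to atom `i` of `coords_a`; a rotational superposition would then be run on the matched pairs to minimize the RMSD further.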

7.
8.
9.
This paper presents an exploratory study of a novel method for flexible 3-D similarity searching based on autocorrelation vectors and smoothed bounded distance matrices. Although the new approach is unable to outperform an existing 2-D similarity searching method in terms of enrichment factors, it is able to retrieve different compounds at a given percentage of the hit-list and so may be a useful adjunct to other similarity searching methods.

10.
In this study we evaluate how far the scope of similarity searching can be extended to identify not only ligands binding to the same target as the reference ligand(s) but also ligands of other homologous targets without initially known ligands. This "homology-based similarity searching" requires molecular representations reflecting the ability of a molecule to interact with target proteins. The Similog keys, which are introduced here as a new molecular representation, were designed to fulfill such requirements. They are based only on the molecular constitution and are counts of atom triplets. Each triplet is characterized by the graph distances and the types of its atoms. The atom-typing scheme classifies each atom by its function as H-bond donor or acceptor and by its electronegativity and bulkiness. In this study the Similog keys are investigated in retrospective in silico screening experiments and compared with other conformation-independent molecular representations. The molecules studied were taken from the MDDR database, with the activity data augmented by standardized target classification information from public protein classification databases. The MDDR molecule set was split randomly into two halves. The first half formed the candidate set. Ligands of four targets (dopamine D2 receptor, opioid delta-receptor, factor Xa serine protease, and progesterone receptor) were taken from the second half to form the respective reference sets. Different similarity calculation methods were used to rank the molecules of the candidate set by their similarity to each of the four reference sets. The accumulated counts of molecules binding to the reference target and groups of targets with decreasing homology to it were examined as a function of the similarity rank for each reference set and similarity method.
In summary, similarity searching based on Unity 2D-fingerprints or Similog keys is found to be equally effective in the identification of molecules binding to the same target as the reference set. However, the application of the Similog keys is more effective than the other investigated methods in the identification of ligands binding to any target belonging to the same family as the reference target. We attribute this superiority to the fact that the Similog keys provide a generalization of the chemical elements and that the keys are counted instead of merely noting their presence or absence in a binary form. The second most effective molecular representation is the occurrence counts of the public ISIS key fragments, which, like the Similog method, incorporate key counting as well as a generalization of the chemical elements. The results obtained suggest that ligands for a new target can be identified by the following three-step procedure: 1. Select at least one target with known ligands which is homologous to the new target. 2. Combine the known ligands of the selected target(s) into a reference set. 3. Search for candidate ligands for the new target by their similarity to the reference set using the Similog method. This clearly enlarges the scope of similarity searching from the classical application for a single target to the identification of candidate ligands for whole target families and is expected to be of key utility for further systematic chemogenomics exploration of previously well explored target families.

11.
12.
We propose a suite of novel algorithms for image analysis of protein expression images obtained from 2-D electrophoresis. The suite comprises a segmentation algorithm for protein spot identification and an algorithm for matching protein spots from two corresponding images for differential expression study. The proposed segmentation algorithm employs the watershed transformation, k-means analysis, and distance transform to locate the centroids and to extract the regions of the protein spots. The proposed spot matching algorithm is an integration of the hierarchical-based and optimization-based methods. The hierarchical method is first used to find corresponding pairs of protein spots satisfying the local cross-correlation and overlapping constraints. The matching energy function based on local structure similarity, image similarity, and spatial constraints is then formulated and optimized. Our new algorithm suite has been extensively tested on synthetic and actual 2-D gel images from various biological experiments, and in quantitative comparisons with ImageMaster2D Platinum the proposed algorithms exhibit better spot detection and spot matching.

13.
We present a new algorithm for the fast and reliable structure prediction of synthetic receptor-ligand complexes. Our method is based on the protein-ligand docking program FlexX and extends our recently introduced docking technique for synthetic receptors, which has been implemented in the program FlexR. To handle the flexibility of the relevant molecules, we apply a novel docking strategy that uses an adaptive two-sided incremental construction algorithm which incorporates the structural flexibility of both the ligand and synthetic receptor. We follow an adaptive strategy, in which one molecule is expanded by attaching its next fragment in all possible torsion angles, whereas the other (partially assembled) molecule serves as a rigid binding partner. Then the roles of the molecules are exchanged. Geometric filters are used to discard partial conformations that cannot realize a targeted interaction pattern derived in a graph-based precomputation phase. The process is repeated until the entire complex is built up. Our algorithm produces promising results on a test data set comprising 10 complexes of synthetic receptors and ligands. The method generated near-native solutions compared to crystal structures in all but one case. It is able to generate solutions within a couple of minutes and has the potential of being used as a virtual screening tool for searching for suitable guest molecules for a given synthetic receptor in large databases of guests and vice versa.

14.
Similarity-based methods for virtual screening are widely used. However, conventional searching using 2D chemical fingerprints or 2D graphs may retrieve only compounds which are structurally very similar to the original target molecule. Of particular current interest then is scaffold hopping, that is, the ability to identify molecules that belong to different chemical series but which could form the same interactions with a receptor. Reduced graphs provide summary representations of chemical structures and, therefore, offer the potential to retrieve compounds that are similar in terms of their gross features rather than at the atom-bond level. Using only a fingerprint representation of such graphs, we have previously shown that actives retrieved were more diverse than those found using Daylight fingerprints. Maximum common substructures give an intuitively reasonable view of the similarity between two molecules. However, their calculation using graph-matching techniques is too time-consuming for use in practical similarity searching in larger data sets. In this work, we exploit the low cardinality of the reduced graph in graph-based similarity searching. We reinterpret the reduced graph as a fully connected graph using the bond-distance information of the original graph. We describe searches, using both the maximum common induced subgraph and maximum common edge subgraph formulations, on the fully connected reduced graphs and compare the results with those obtained using both conventional chemical and reduced graph fingerprints. We show that graph matching using fully connected reduced graphs is an effective retrieval method and that the actives retrieved are likely to be topologically different from those retrieved using conventional 2D methods.

15.
16.
As an intuitive concept, molecular similarity has played a fundamental role in chemistry. It is implicit in Hammond's postulate, in the principle of minimum structure change, and in the assumption that similar structures tend to have similar properties. With the advent of large computers, computable definitions of similarity are being used in the pharmaceutical industry for similarity searching, dissimilarity selection, molecular superpositioning, structure generation, and quantitative structure-activity analysis. The diversity of applications of computable definitions of molecular similarity has often obscured important mathematical commonalities underlying these definitions. The broadest commonalities are relationships based on equivalence, matching, partial ordering, and proximity. A mathematical space suitable for molecular similarity analysis consists of a set of mathematical structures and one or more of these similarity relationships defined on that set. This report surveys the mathematical spaces used in molecular similarity analysis. The survey covers the types of chemical information, similarity relationships, and applications associated with the use of each mathematical space in a molecular similarity context.

17.
18.
Recognition of small molecules by proteins depends on three-dimensional molecular surface complementarity. However, the dominant techniques for analyzing the similarity of small molecules are based on two-dimensional chemical structure, with such techniques often outperforming three-dimensional techniques in side-by-side comparisons of correlation to biological activity. This paper introduces a new molecular similarity method, termed morphological similarity (MS), that addresses the apparent paradox. Two sets of molecule pairs are identified from a set of ligands whose protein-bound states are known crystallographically. Pairs that bind the same protein sites form the first set, and pairs that bind different sites form the second. MS is shown to separate the two sets significantly better than a benchmark 2D similarity technique. Further, MS agrees with crystallographic observation of bound ligand states, independent of information about bound states. MS is efficient to compute and can be practically applied to large libraries of compounds.

19.
Semi-empirical calculations including an empirical dispersive correction are used to calculate intermolecular interaction energies and structures for a large database containing 156 biologically relevant molecules (hydrogen-bonded DNA base pairs, interstrand base pairs, stacked base pairs and amino acid base pairs) for which MP2 and CCSD(T) complete basis set (CBS) limit estimates of the interaction energies are available. The dispersion corrected semi-empirical methods are parameterised against a small training set of 22 complexes having a range of biologically important non-covalent interactions. For the full molecule set (156 complexes), compared to the high-level ab initio database, the mean unsigned errors of the interaction energies at the corrected semi-empirical level are 1.1 (AM1-D) and 1.2 (PM3-D) kcal mol⁻¹, being a significant improvement over existing AM1 and PM3 methods (8.6 and 8.2 kcal mol⁻¹). Importantly, the new semi-empirical methods are capable of describing the diverse range of biological interactions, most notably stacking interactions, which are poorly described by both current AM1 and PM3 methods and by many DFT functionals. The new methods require no more computer time than existing semi-empirical methods and therefore represent an important advance in the study of important biological interactions.

20.
A knowledge-based method for calculating the similarity of functional groups is described and validated. The method is based on experimental information derived from small molecule crystal structures. These data are used in the form of scatterplots that show the likelihood of a non-bonded interaction being formed between functional group A (the 'central group') and functional group B (the 'contact group' or 'probe'). The scatterplots are converted into three-dimensional maps that show the propensity of the probe at different positions around the central group. Here we describe how to calculate the similarity of a pair of central groups based on these maps. The similarity method is validated using bioisosteric functional group pairs identified in the Bioster database and Relibase. The Bioster database is a critical compilation of thousands of bioisosteric molecule pairs, including drugs, enzyme inhibitors and agrochemicals. Relibase is an object-oriented database containing structural data about protein-ligand interactions. The distributions of the similarities of the bioisosteric functional group pairs are compared with similarities for all the possible pairs in IsoStar, and are found to be significantly different. Enrichment factors are also calculated showing the similarity method is statistically significantly better than random in predicting bioisosteric functional group pairs.
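
The abstract above does not spell out how its enrichment factors are computed. As an assumption, the sketch below uses the common definition: the hit rate in the top fraction of a similarity-ranked list divided by the hit rate over the whole list, so that a value of 1.0 corresponds to random selection. The function name and the toy ranking are hypothetical.

```python
def enrichment_factor(ranked_labels, top_fraction):
    """Hit rate in the top of the ranking divided by the overall hit rate."""
    n_total = len(ranked_labels)
    n_top = max(1, round(n_total * top_fraction))
    hits_top = sum(ranked_labels[:n_top])
    hits_total = sum(ranked_labels)
    if hits_total == 0:
        raise ValueError("no hits in the ranked list")
    return (hits_top / n_top) / (hits_total / n_total)

# Hypothetical ranking of 10 functional group pairs by decreasing similarity;
# 1 marks a known bioisosteric pair, 0 marks any other pair.
ranked_labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
ef_top20 = enrichment_factor(ranked_labels, top_fraction=0.2)
print(ef_top20)
```

In this toy ranking, one of the two known pairs falls in the top 20% of the list, which is 2.5 times the rate expected from a random ordering.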
