Similar literature
20 similar articles found (search time: 140 ms)
1.
We introduce a method to determine a structural distance between any pair of molecular scaffolds. The development of this approach was motivated by the need to accurately evaluate scaffold hopping studies in virtual screening and medicinal chemistry and to assess the degree of difficulty involved in making a transition from one structure to another. To derive structural distances consistently, scaffolds of different composition and topology are subjected to molecular editing procedures that abstract from the original scaffolds in a defined manner until compositional and topological equivalence can be established. Pairs of corresponding scaffold representations are transformed into one-dimensional atom sequences that are aligned using approaches adapted from biological sequence comparison. Interscaffold distances are derived from the best-scoring atom sequence alignments. The algorithm is evaluated at different levels, including the analysis of a series of model scaffolds with defined chemical changes, a scaffold library, and scaffolds from reference compounds and hits of successful virtual screening applications. It is demonstrated that chemically intuitive scaffold distances are obtained for pairs of scaffolds with varying composition and topology. Distance threshold values for close and remote structural relationships between scaffolds are also determined. The methodology is made publicly available to provide a basis for a consistent assessment of scaffold hopping ability and to aid in the evaluation and comparison of virtual screening methods.
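The alignment step described in this abstract can be illustrated with a minimal sketch. It assumes that each scaffold has already been reduced to a string of single-letter atom symbols, and it uses a plain Needleman-Wunsch global alignment with made-up scores (`match`, `mismatch`, `gap`). The function names and the distance definition are hypothetical and are not taken from the published method.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global (Needleman-Wunsch) alignment score of two atom sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per atom.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two atoms
                score[i - 1][j] + gap,       # gap in sequence b
                score[i][j - 1] + gap,       # gap in sequence a
            )
    return score[-1][-1]


def scaffold_distance(a, b, match=1):
    """Hypothetical distance: score of a perfect alignment of the longer
    sequence minus the best alignment score actually achieved."""
    return max(len(a), len(b)) * match - align_score(a, b, match=match)


# Toy atom sequences (assumed encoding, one letter per atom).
print(scaffold_distance("CCNCC", "CCNCC"))
print(scaffold_distance("CCNCC", "CCOCC"))
```

With these assumed scores, identical sequences get a distance of zero, and each substitution or gap increases the distance.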

2.
Computational scaffold hopping aims to identify core structure replacements in active compounds. To evaluate scaffold hopping potential in principle, regardless of the computational methods that are applied, a global analysis of conventional scaffolds in analog series from compound activity classes was carried out. The majority of analog series were found to contain multiple scaffolds, thus enabling the detection of intra-series scaffold hops among closely related compounds. More than 1000 activity classes were found to contain increasing proportions of multi-scaffold analog series. Thus, using such activity classes for scaffold hopping analysis is likely to overestimate the scaffold hopping (core structure replacement) potential of computational methods, because of an abundance of artificial scaffold hops that are possible within analog series.

3.
For a systematic exploration of structural relationships between molecular scaffolds, ~24,000 unique scaffolds were extracted from 458 different target sets. Substructure relationships between these scaffolds were systematically determined. The scaffold tree data structure was used to study structural relationships between original scaffolds and derivative scaffolds obtained by rule-based decomposition. Leaf-to-root substructure relationships that resulted from rule-based decomposition were compared to leaf-to-leaf relationships between original scaffolds, most of which were not part of the scaffold tree hierarchy. Decomposed scaffolds not contained in active target set compounds were prioritized on the basis of hierarchical scaffold patterns and additional substructure relationships. For high-priority virtual scaffolds, activity predictions were carried out, and these scaffolds were often found in external test compounds having the predicted activity. Taken together, our results suggest that leaf-to-root substructure relationships in scaffold trees are best complemented with additional substructure relationships to determine high-priority virtual scaffolds for activity prediction.

4.
We describe a method for docking a scaffold-based series and present its advantages over docking of individual ligands for determining the binding mode of a molecular scaffold in a binding site. The method has been applied to eight different scaffolds of protein kinase inhibitors (PKI). A single analog of each of these eight scaffolds was previously crystallized with different protein kinases. We have used FlexX to dock a set of molecules that share the same scaffold, rather than docking a single molecule. The main mode of binding is determined by the mode of binding of the largest cluster among the docked molecules that share a scaffold. Clustering is based on our 'nearest single neighbor' method [J. Chem. Inf. Comput. Sci., 43 (2003) 208-217]. Additional criteria are applied in those cases in which more than one significant binding mode is found. Using the proposed method, most of the crystallographic binding modes of these scaffolds were reconstructed. Alternative modes that have not yet been detected experimentally could also be identified. The method was applied to predict the binding mode of an additional molecular scaffold that had not yet been reported, and the predicted binding mode was found to be very similar to experimental results for a closely related scaffold. We suggest that this approach be used as a virtual screening tool for scaffold-based design processes.

5.
6.
7.
Benchmark calculations are essential for the evaluation of virtual screening (VS) methods. Typically, classes of known active compounds taken from the medicinal chemistry literature are divided into reference molecules (search templates) and potential hits that are added to background databases assumed to consist of compounds not sharing this activity. Then VS calculations are carried out, and the recall of known active compounds is determined. However, conventional benchmarking is affected by a number of problems that reduce its value for method evaluation. In addition to often insufficient statistical validation and the lack of generally accepted evaluation standards, the artificial nature of typical benchmark settings is often criticized. Retrospective benchmark calculations generally overestimate the potential of VS methods and do not scale with their performance in prospective applications. To provide additional opportunities for benchmarking that more closely resemble practical VS conditions, we have designed a publicly available compound database (DB) of reproducible virtual screens (REPROVIS-DB) that organizes information from successful ligand-based VS applications, including reference compounds, screening databases, compound selection criteria, and experimentally confirmed hits. Using the currently available 25 hand-selected compound data sets, one can attempt to reproduce successful virtual screens with methods other than those originally applied and assess their potential for practical applications.
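The retrospective benchmark setting described in this abstract (known actives seeded into a background database, followed by a recall calculation) can be sketched as follows. This is a minimal illustration, assuming that a VS method returns a ranked list of compound identifiers and that every active is present in that list. The names `recall_at` and `enrichment_factor` and the toy ranking are hypothetical.

```python
def recall_at(ranked_ids, actives, k):
    """Fraction of all known actives found among the top-k ranked compounds."""
    found = set(ranked_ids[:k]) & actives
    return len(found) / len(actives)


def enrichment_factor(ranked_ids, actives, k):
    """Hit rate in the top-k selection relative to the hit rate of the whole
    database (1.0 corresponds to random selection)."""
    hit_rate_top = len(set(ranked_ids[:k]) & actives) / k
    hit_rate_all = len(actives) / len(ranked_ids)
    return hit_rate_top / hit_rate_all


# Toy ranking: "a*" are seeded actives, "d*" are background compounds.
ranked = ["a1", "d1", "a2", "d2", "d3", "d4", "a3", "d5", "d6", "d7"]
actives = {"a1", "a2", "a3"}
print(recall_at(ranked, actives, 5))
print(enrichment_factor(ranked, actives, 5))
```

Both numbers depend strongly on how the background database was assembled, which is one reason such retrospective values may not carry over to prospective screens.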

8.
9.
In chemoinformatics, searching for compounds that are structurally diverse and share a biological activity is called scaffold hopping. Scaffold hopping is important because it can be used to obtain alternative structures when the compound under development has unexpected side effects. Pharmaceutical companies use scaffold hopping when they wish to circumvent prior patents for targets of interest. We propose a new method for scaffold hopping using inductive logic programming (ILP). ILP uses the observed spatial relationships between pharmacophore types in pretested active and inactive compounds and learns human-readable rules describing the diverse structures of active compounds. The ILP-based scaffold hopping method is compared to two previous algorithms (chemically advanced template search, CATS, and CATS3D) on 10 data sets with diverse scaffolds. The comparison shows that the ILP-based method is significantly better than random selection, while the other two algorithms are not. In addition, the ILP-based method retrieves new active scaffolds that were not found by CATS and CATS3D. The results show that the ILP-based method is at least as good as the other methods in this study. ILP produces human-readable rules, which makes it possible to identify the three-dimensional features that lead to scaffold hopping. A minor variant of a rule learnt by ILP for scaffold hopping was subsequently found to cover an inhibitor identified by an independent study. This provides a successful result in a blind trial of the effectiveness of ILP in generating rules for scaffold hopping. We conclude that ILP provides a valuable new approach for scaffold hopping.

10.
11.
An analysis method termed similarity search profiling has been developed to evaluate fingerprint-based virtual screening calculations. The analysis is based on systematic similarity search calculations using multiple template compounds over the entire value range of a similarity coefficient. In graphical representations, numbers of correctly identified hits and other detected database compounds are separately monitored. The resulting profiles make it possible to determine whether a virtual screening trial can in principle succeed for a given compound class, search tool, similarity metric, and selection criterion. As a test case, we have analyzed virtual screening calculations using a recently designed fingerprint on 23 different biological activity classes in a compound source database containing approximately 1.3 million molecules. Based on our predefined selection criteria, we found that virtual screening analysis was successful for 19 of 23 compound classes. Profile analysis also makes it possible to determine compound class-specific similarity threshold values for similarity searching.
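A profile of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: fingerprints are modeled as Python sets of "on" bit positions, the Tanimoto coefficient is used as the similarity coefficient, and each database compound is scored by its highest similarity to any template. The abstract does not specify these choices, and the names `tanimoto` and `search_profile` are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def search_profile(templates, database, actives, steps=10):
    """For each similarity threshold, count detected actives (hits) and other
    detected database compounds separately.

    Returns a list of (threshold, hits, others) tuples.
    """
    # Score every database compound by its best similarity to any template.
    best = {
        cid: max(tanimoto(fp, t) for t in templates)
        for cid, fp in database.items()
    }
    profile = []
    for i in range(steps + 1):
        threshold = i / steps
        detected = [cid for cid, s in best.items() if s >= threshold]
        hits = sum(1 for cid in detected if cid in actives)
        profile.append((threshold, hits, len(detected) - hits))
    return profile


# Toy data: one template, two actives, two other database compounds.
templates = [{1, 2, 3, 4}]
database = {
    "act1": {1, 2, 3, 4},
    "act2": {1, 2, 3, 5},
    "dec1": {1, 6, 7, 8},
    "dec2": {9, 10},
}
for row in search_profile(templates, database, actives={"act1", "act2"}):
    print(row)
```

Reading the rows from low to high threshold shows where other database compounds drop out while hits are still retained, which is the kind of class-specific threshold the abstract refers to.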

12.
Protein-ligand interaction fingerprints have been used to postprocess docking poses of three ligand data sets: a set of 40 low-molecular-weight compounds from the Protein Data Bank, a collection of 40 scaffolds from pharmaceutically relevant protein ligands, and a database of 19 scaffolds extracted from true cdk2 inhibitors seeded in 2230 scaffold decoys. Four popular docking tools (FlexX, Glide, Gold, and Surflex) were used to generate poses for ligands of the three data sets. In all cases, scoring by the similarity of interaction fingerprints to a given reference was statistically superior to conventional scoring functions in posing low-molecular-weight fragments, predicting protein-bound scaffold coordinates according to the known binding mode of related ligands, and screening a scaffold library to enrich a hit list in true cdk2-targeted scaffolds.

13.
Medicinal chemists have traditionally carried out assessments of chemical diversity and subsequent compound acquisition, although a recent study suggests that experts are usually inconsistent in reviewing large data sets. To analyze the scaffold diversity of commercially available screening collections, we have developed a general workflow aimed at (1) identifying druglike compounds, (2) clustering them by maximum common substructures (scaffolds), (3) measuring the scaffold diversity encoded by each screening collection independently of its size, and finally (4) merging all common substructures into a nonredundant scaffold library that can easily be browsed by structural and topological queries. Starting from 2.4 million compounds from 12 commercial sources, four categories of libraries could be identified: large- and medium-sized combinatorial libraries (low scaffold diversity), diverse libraries (medium diversity, medium size), and highly diverse libraries (high diversity, low size). The chemical space covered by the scaffold library can be searched to prioritize scaffold-focused libraries.

14.
We developed a novel approach called SHAFTS (SHApe-FeaTure Similarity) for 3D molecular similarity calculation and ligand-based virtual screening. SHAFTS adopts a hybrid similarity metric that combines molecular shape with colored (labeled) chemistry groups annotated by pharmacophore features for 3D similarity calculation and ranking, and it is designed to integrate the strengths of pharmacophore matching and volumetric overlay approaches. A feature triplet hashing method is used for fast enumeration of molecular alignment poses, and the optimal superposition between the target and the query molecules can be prioritized by calculating corresponding "hybrid similarities". SHAFTS is suitable for large-scale virtual screening with single or multiple bioactive compounds as the query "templates", regardless of whether corresponding experimentally determined conformations are available. Two public test sets (DUD and Jain's sets), including active and decoy molecules from a panel of useful drug targets, were adopted to evaluate the virtual screening performance. SHAFTS outperformed several other widely used virtual screening methods in terms of enrichment of known active compounds as well as novel chemotypes, thereby indicating its robustness in hit compound identification and its potential for scaffold hopping in virtual screening.

15.
High-throughput screening (HTS) campaigns in pharmaceutical companies have accumulated a large amount of data for several million compounds over a couple of hundred assays. Despite the general awareness that rich information is hidden inside the vast amount of data, little has been reported on systematic data mining methods that can reliably extract knowledge of interest to chemists and biologists. We developed a data mining approach based on an algorithm called ontology-based pattern identification (OPI) and applied it to our in-house HTS database. We identified nearly 1500 scaffold families with statistically significant structure-HTS activity profile relationships. Among them, dozens of scaffolds were characterized as leading to artifactual results stemming from the screening technology employed, such as assay format and/or readout. Four types of compound scaffolds can be characterized based on this data mining effort: tumor cytotoxic, general toxic, potential reporter gene assay artifact, and target family specific. The OPI-based data mining approach can reliably identify compounds that are not only structurally similar but also share statistically significant biological activity profiles. Statistical tests such as the Kruskal-Wallis test and analysis of variance (ANOVA) can then be applied to the discovered scaffolds for effective assignment of relevant biological information. The scaffolds identified by our HTS data mining efforts are an invaluable resource for designing SAR-robust diversity libraries, generating in silico biological annotations of compounds on a scaffold basis, and providing novel target family specific scaffolds for focused compound library design.

16.
17.
Identification of meaningful chemical patterns in the increasing amounts of high-throughput-generated bioactivity data available today is an increasingly important challenge for successful drug discovery. Herein, we present the scaffold network as a novel approach for mapping and navigation of chemical and biological space. A scaffold network represents the chemical space of a library of molecules and consists of all molecular scaffolds and the smaller "parent" scaffolds generated from them by the pruning of rings, effectively leading to a network of common scaffold substructure relationships. This algorithm extends the scaffold tree algorithm, which, instead of a network, generates a tree relationship among a subset of parent scaffolds selected by heuristic rules. The approach was evaluated for the identification of statistically significantly active scaffolds from primary screening data, a task for which the scaffold tree approach has already been shown to be successful. Because of the exhaustive enumeration of smaller scaffolds and the full enumeration of relationships between them, about twice as many statistically significantly active scaffolds were identified compared to the scaffold-tree-based approach. We suggest visualizing scaffold networks as islands of active scaffolds.
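The difference between exhaustive enumeration (network) and rule-based selection of a single parent (tree) can be shown with a deliberately simplified toy model. It assumes that a scaffold is just a set of ring labels and that any ring may be pruned, which ignores ring connectivity and all chemistry. The function names are hypothetical, and `pick_ring=min` is an arbitrary stand-in for the heuristic rules of the scaffold tree, not the actual rule set.

```python
def scaffold_network(scaffolds):
    """Enumerate every parent obtained by removing one ring at a time.

    Returns (nodes, edges); each edge is a (child, parent) pair.
    """
    nodes, edges = set(), set()
    stack = [frozenset(s) for s in scaffolds]
    while stack:
        child = stack.pop()
        if child in nodes:
            continue
        nodes.add(child)
        if len(child) <= 1:
            continue  # single rings have no smaller parent
        for ring in child:
            parent = child - {ring}
            edges.add((child, parent))
            stack.append(parent)
    return nodes, edges


def scaffold_tree_edges(scaffolds, pick_ring=min):
    """Keep only one parent per scaffold, chosen by a rule (toy rule here)."""
    seen, edges = set(), set()
    stack = [frozenset(s) for s in scaffolds]
    while stack:
        child = stack.pop()
        if child in seen or len(child) <= 1:
            continue
        seen.add(child)
        parent = child - {pick_ring(child)}
        edges.add((child, parent))
        stack.append(parent)
    return edges


library = [{"benzene", "pyridine", "furan"}]
nodes, edges = scaffold_network(library)
print(len(nodes), len(edges))
print(len(scaffold_tree_edges(library)))
```

Even for a single three-ring scaffold, the network contains more parent scaffolds and relationships than the tree, which is the effect the abstract credits for the larger number of active scaffolds identified.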

18.
We present a ligand-based virtual screening technique (PhAST) for rapid hit and lead structure searching in large compound databases. Molecules are represented as strings encoding the distribution of pharmacophoric features on the molecular graph. In contrast to other text-based methods using SMILES strings, we introduce a new form of text representation that describes the pharmacophore of molecules. This string representation opens the opportunity for revealing functional similarity between molecules by sequence alignment techniques in analogy to homology searching in protein or nucleic acid sequence databases. We favorably compared PhAST with other current ligand-based virtual screening methods in a retrospective analysis using the BEDROC metric. In a prospective application, PhAST identified two novel inhibitors of 5-lipoxygenase product formation with minimal experimental effort. This outcome demonstrates the applicability of PhAST to drug discovery projects and provides an innovative concept of sequence-based compound screening with substantial scaffold hopping potential.

19.
The scaffold concept is widely applied in chemoinformatics and medicinal chemistry to organize bioactive compounds according to common core structures or to associate compound classes with specific biological activities. A variety of scaffold analyses have been carried out to derive statistics for scaffold distributions, generate structural organization schemes, or identify scaffolds that preferentially occur in given compound activity classes. Herein we further extend scaffold analysis by identifying scaffolds that display defined SAR profiles consisting of multiple properties. A structural relationship-based scaffold network has been designed as the basic data structure underlying our analysis. From network representations of scaffolds extracted from compounds active against 32 different target families, scaffolds with different SAR profiles have been extracted on the basis of decision trees that capture structural and functional characteristics of scaffolds in different ways. More than 600 scaffolds and 100 scaffold clusters were assigned to 10 SAR profiles. These scaffold sets represent different activity and target selectivity profiles and are provided for further SAR investigations including, for example, the exploration of alternative analog series for a given target or target family or the design of novel compounds on the basis of scaffold(s) with desired SAR profiles.

20.
