Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Docking studies have become popular approaches in drug design, where the binding energy of the ligand in the active site of the protein is estimated by a scoring function. Many promising techniques were developed to enhance the performance of scoring functions, including the fusion of the outcomes of multiple scoring functions into a so-called consensus scoring function. Here, we evaluated the target-oriented consensus technique using the energetic terms of several scoring functions. The approach was denoted PLSDA-DOCET. Optimization strategies for consensus energetic terms and scoring functions based on the ROC metric were compared to classical rigid docking and to ligand-based similarity search methods comprising 2D fingerprints and ROCS. The ROCS results indicate large performance variations depending on the biological target. The AUC-based strategy of PLSDA-DOCET outperformed the other docking approaches regarding simple retrieval and scaffold-hopping. The superior performance of the PLSDA-DOCET protocol relative to single and combined scoring functions was validated on an external test set. We found a relatively low mean correlation between the ranks of the chemotypes retrieved by the PLSDA-DOCET protocol and those retrieved by all the other methods employed here.

2.
This paper reports an evaluation of both graph-based and fingerprint-based measures of structural similarity, when used for virtual screening of sets of 2D molecules drawn from the MDDR and ID Alert databases. The graph-based measures employ a new maximum common edge subgraph isomorphism algorithm, called RASCAL, with several similarity coefficients described previously for quantifying the similarity between pairs of graphs. The effectiveness of these graph-based searches is compared with that resulting from similarity searches using BCI, Daylight and Unity 2D fingerprints. Our results suggest that graph-based approaches provide an effective complement to existing fingerprint-based approaches to virtual screening.

3.
Virtual screening benchmarking studies were carried out on 11 targets to evaluate the performance of three commonly used approaches: 2D ligand similarity (Daylight, TOPOSIM), 3D ligand similarity (SQW, ROCS), and protein structure-based docking (FLOG, FRED, Glide). Active and decoy compound sets were assembled from both the MDDR and the Merck compound databases. Averaged over multiple targets, ligand-based methods outperformed docking algorithms. This was true for 3D ligand-based methods only when chemical typing was included. Using mean enrichment factor as a performance metric, Glide appears to be the best docking method among the three with FRED a close second. Results for all virtual screening methods are database dependent and can vary greatly for particular targets.
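
The enrichment factor mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the commonly used definition (hit rate in the top fraction of the ranked list divided by the hit rate in the whole list); the abstract does not state which definition or cutoff the authors used, and the function name and the `fraction` parameter are hypothetical.

```python
def enrichment_factor(ranked_labels, fraction=0.01):
    """Enrichment factor for a ranked screening list.

    ranked_labels: list of 1 (active) / 0 (decoy), best-scored compound first.
    fraction: top fraction of the list to inspect (assumed cutoff, e.g. 1%).
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    # Number of compounds in the inspected top slice (at least one).
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(ranked_labels[:n_top])
    # Ratio of the hit rate in the top slice to the overall hit rate.
    return (hits_top / n_top) / (n_actives / n_total)


def mean_enrichment_factor(per_target_labels, fraction=0.01):
    """Average the enrichment factor over several targets."""
    values = [enrichment_factor(labels, fraction) for labels in per_target_labels]
    return sum(values) / len(values)
```

A value of 1 corresponds to random ranking; larger values mean that actives are concentrated near the top of the list.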

4.
Inhibition of amyloid fibril formation by stabilization of the native form of the protein transthyretin (TTR) is a viable approach for the treatment of familial amyloid polyneuropathy that has been gaining momentum in the field of amyloid research. The TTR stabilizer molecules discovered to date have shown efficacy at inhibiting fibrilization in vitro but display impairing issues of solubility, affinity for TTR in the blood plasma and/or adverse effects. In this study we present a benchmark of four protein- and ligand-based virtual screening (VS) methods for identifying novel TTR stabilizers: (i) two-dimensional (2D) similarity searches with chemical hashed, pharmacophore, and UNITY fingerprints, (ii) 3D searches based on shape, chemical, and electrostatic similarity, (iii) LigMatch, a new ligand-based method which uses multiple templates and combines 3D geometric hashing with a 2D preselection process, and (iv) molecular docking to consensus X-ray crystal structures of TTR. We illustrate the potential of the best-performing VS protocols to retrieve promising new leads by ranking a tailored library of 2.3 million commercially available compounds. Our predictions show that the top-scoring molecules possess distinctive features from the known TTR binders, holding better solubility, fraction of halogen atoms, and binding affinity profiles. To the best of our knowledge, this is the first attempt to rationalize the utilization of a large battery of in silico screening techniques toward the identification of a new generation of TTR amyloid inhibitors.

5.
Molecular fingerprints are widely used for similarity-based virtual screening in drug discovery projects. In this paper we discuss the performance and the complementarity of nine two-dimensional fingerprints (Daylight, Unity, AlFi, Hologram, CATS, TRUST, Molprint 2D, ChemGPS, and ALOGP) in retrieving active molecules by similarity searching against a set of query compounds. For this purpose, we used biological data from HTS screening campaigns of four protein families (GPCRs, kinases, ion channels, and proteases). We have established threshold values for the similarity index (Tanimoto index) to be used as starting points for similarity searches. Based on the complementarities between the selections made by using different fingerprints we propose a multifingerprint approach as an efficient tool to balance the strengths and weaknesses of various fingerprints.

6.
Virtual screening (VS) can be accomplished using either ligand- or structure-based methods. In recent times, an increasing number of 2D fingerprint and 3D shape similarity methods have been used in ligand-based VS. To evaluate the performance of these ligand-based methods, retrospective VS was performed on a tailored directory of useful decoys (DUD). The VS performances of 14 2D fingerprints and four 3D shape similarity methods were compared. The results revealed that 2D fingerprints ECFP_2 and FCFP_4 yielded better performance than the 3D Phase Shape methods. These ligand-based methods were also compared with structure-based methods, such as Glide docking and Prime molecular mechanics generalized Born surface area rescoring, which demonstrated that both 2D fingerprint and 3D shape similarity methods could yield higher enrichment during early retrieval of active compounds. The results demonstrated the superiority of ligand-based methods over docking-based screening in terms of both speed and hit enrichment. Therefore, considering ligand-based methods first in any VS workflow would be a wise option.

7.
Similarity-based methods for virtual screening are widely used. However, conventional searching using 2D chemical fingerprints or 2D graphs may retrieve only compounds which are structurally very similar to the original target molecule. Of particular current interest then is scaffold hopping, that is, the ability to identify molecules that belong to different chemical series but which could form the same interactions with a receptor. Reduced graphs provide summary representations of chemical structures and, therefore, offer the potential to retrieve compounds that are similar in terms of their gross features rather than at the atom-bond level. Using only a fingerprint representation of such graphs, we have previously shown that actives retrieved were more diverse than those found using Daylight fingerprints. Maximum common substructures give an intuitively reasonable view of the similarity between two molecules. However, their calculation using graph-matching techniques is too time-consuming for use in practical similarity searching in larger data sets. In this work, we exploit the low cardinality of the reduced graph in graph-based similarity searching. We reinterpret the reduced graph as a fully connected graph using the bond-distance information of the original graph. We describe searches, using both the maximum common induced subgraph and maximum common edge subgraph formulations, on the fully connected reduced graphs and compare the results with those obtained using both conventional chemical and reduced graph fingerprints. We show that graph matching using fully connected reduced graphs is an effective retrieval method and that the actives retrieved are likely to be topologically different from those retrieved using conventional 2D methods.

8.
In this work, we calculated the pairwise chemical similarity for a subset of small molecules screened against the NCI60 cancer cell line panel. Four different compound similarity calculation methods were used: Brutus, GRIND, Daylight and UNITY. The chemical similarity scores of each method were related to the biological similarity data set. The same was also done for combinations of methods. In the end, we had an estimate of biological similarity for a given chemical similarity score or combinations thereof. These data were used to identify chemical similarity ranges where combining two or more methods (data fusion) led to synergy. The results were also applied in ligand-based virtual screening using the DUD data set. With respect to their ability to enrich biologically similar compound pairs, the ranking of the four methods in descending performance is UNITY, Daylight, Brutus and GRIND. Combining methods always resulted in positive synergy within a restricted range of chemical similarity scores. We observed no negative synergy. We also noted that combining three or four methods had only limited added advantage compared to combining just two. In the virtual screening, using the estimated biological similarity for ranking compounds produced more consistent results than using the methods in isolation.

9.
Reduced graphs provide summary representations of chemical structures. Here, a variety of different types of reduced graphs are compared in similarity searches. The reduced graphs are found to give comparable performance to Daylight fingerprints in terms of the number of active compounds retrieved. However, no one type of reduced graph is found to be consistently superior across a variety of different data sets. Consequently, a representative set of reduced graphs was chosen and used together with Daylight fingerprints in data fusion experiments. The results show improved performance in 10 out of 11 data sets compared to using Daylight fingerprints alone. Finally, the potential of using reduced graphs to build SAR models is demonstrated using recursive partitioning. An SAR model consistent with a published model is found following just two splits in the decision tree.

10.
11.
Rapid overlay of chemical structures (ROCS) is a standard tool for the calculation of 3D shape and chemical (“color”) similarity. ROCS uses unweighted sums to combine many aspects of similarity, yielding parameter-free models for virtual screening. In this report, we decompose the ROCS color force field into color components and color atom overlaps, novel color similarity features that can be weighted in a system-specific manner by machine learning algorithms. In cross-validation experiments, these additional features significantly improve virtual screening performance relative to standard ROCS.

12.
Similarity searching using reduced graphs   Total citations: 3; self-citations: 0; citations by others: 3
Reduced graphs provide summary representations of chemical structures. In this work, the effectiveness of reduced graphs for similarity searching is investigated. Different types of reduced graphs are introduced that aim to summarize features of structures that have the potential to form interactions with receptors while retaining the topology between the features. Similarity searches have been carried out across a variety of different activity classes. The effectiveness of the reduced graphs at retrieving compounds with the same activity as known target compounds is compared with searching using Daylight fingerprints. The reduced graphs are shown to be effective for similarity searching and to retrieve more diverse active compounds than those found using Daylight fingerprints; they thus represent a complementary similarity searching tool.

13.
Chemical fingerprints are used to represent chemical molecules by recording the presence or absence, or by counting the number of occurrences, of particular features or substructures, such as labeled paths in the 2D graph of bonds, of the corresponding molecule. These fingerprint vectors are used to search large databases of small molecules, currently containing millions of entries, using various similarity measures, such as the Tanimoto or Tversky's measures and their variants. Here, we derive simple bounds on these similarity measures and show how these bounds can be used to considerably reduce the subset of molecules that need to be searched. We consider both the case of single-molecule and multiple-molecule queries, as well as queries based on fixed similarity thresholds or aimed at retrieving the top K hits. We study the speedup as a function of query size and distribution, fingerprint length, similarity threshold, and database size |D| and derive analytical formulas that are in excellent agreement with empirical values. The theoretical considerations and experiments show that this approach can provide linear speedups of one or more orders of magnitude in the case of searches with a fixed threshold, and achieve sublinear speedups in the range of O(|D|^0.6) for the top K hits in current large databases. This pruning approach yields subsecond search times across the 5 million compounds in the ChemDB database, without any loss of accuracy.
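
To make the pruning idea in this abstract concrete, here is a minimal Python sketch for a fixed-threshold search. It assumes binary fingerprints stored as sets of on-bit indices and uses one simple bound of this kind: for fingerprints with a and b bits set, the Tanimoto similarity cannot exceed min(a, b) / max(a, b). Whether this matches the exact bounds derived in the paper is an assumption, and the function names and the toy database layout are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two binary fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def threshold_search(query, database, threshold):
    """Return (hits, n_skipped) for a fixed similarity threshold.

    database: dict mapping a compound name to its set of on-bits (toy layout).
    A compound is skipped without computing the similarity when the
    bit-count bound min(a, b) / max(a, b) is already below the threshold.
    """
    a = len(query)
    hits, n_skipped = [], 0
    for name, fp in database.items():
        b = len(fp)
        # Upper bound on the Tanimoto similarity from bit counts alone.
        bound = min(a, b) / max(a, b) if max(a, b) else 0.0
        if bound < threshold:
            n_skipped += 1
            continue
        score = tanimoto(query, fp)
        if score >= threshold:
            hits.append((name, score))
    # Best score first; ties broken by name for a stable order.
    hits.sort(key=lambda h: (-h[1], h[0]))
    return hits, n_skipped
```

Because the bound is a true upper limit, skipping a compound never discards a hit, which is consistent with the abstract's claim of no loss of accuracy. In a real database the fingerprints would be pre-sorted by bit count so that whole ranges can be skipped at once.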

14.
Recent studies into the use of a selection of similarity coefficients, when applied to searches of chemical databases represented by binary fingerprints, have shown considerable variation in their retrieval performance and in the sets of compounds being retrieved. The main factor influencing performance is the density distribution of the bitstrings for the active class, a feature which is closely related to molecular size. If this is the case when these coefficients are applied to similarity searches, then we would expect considerable variation in performance when applied to dissimilarity methods, namely clustering and compound selection. Here we report on several studies which have been undertaken to investigate the relative performance of 13 association and correlation coefficients, which have been shown to exhibit complementary performance in similarity searches, when applied to hierarchical and nonhierarchical clustering methods and to a compound selection methodology. Results suggest that the correlation coefficients perform consistently well for clustering and compound selection, as does the Baroni-Urbani/Buser association coefficient. Surprisingly, these often outperform the Tanimoto coefficient, while the Simple Match (effectively the complement of the Squared Euclidean Distance) performs very poorly.

15.
FLAP fingerprints are applied in the ligand-, structure- and pharmacophore-based mode in a case study on antagonists of all four adenosine receptor (AR) subtypes. Structurally diverse antagonist collections with respect to the different ARs were constructed by including binding data to human species only. FLAP models discriminate well between "active" (= highly potent) and "inactive" (= weakly potent) AR antagonists, as indicated by enrichment curves, numbers of false positives, and AUC values. For all FLAP modes, model predictivity slightly decreases as follows: A2BR > A2AR > A3R > A1R antagonists. General performance of FLAP modes in this study is: ligand- > structure- > pharmacophore-based mode. We also compared the FLAP performance with other common ligand- and structure-based fingerprints. Concerning the ligand-based mode, FLAP model performance is superior to ECFP4 and ROCS for all AR subtypes. Although focusing on the early first part of the A2A, A2B and A3 enrichment curves, ECFP4 and ROCS still retain a satisfactory retrieval of actives. FLAP is also superior when comparing the structure-based mode with PLANTS and GOLD. In this study we applied for the first time the novel FLAPPharm tool for pharmacophore generation. Pharmacophore hypotheses generated with this tool convincingly match formerly published data. Finally, we could demonstrate the capability of FLAP models to uncover selectivity aspects although single AR subtype models were not trained for this purpose.

16.
A wide variety of computational algorithms have been developed that strive to capture the chemical similarity between two compounds for use in virtual screening and lead discovery. One limitation of such approaches is that, while a returned similarity value reflects the perceived degree of relatedness between any two compounds, there is no direct correlation between this value and the expectation or confidence that any two molecules will in fact be equally active. A lack of a common framework for interpretation of similarity measures also confounds the reliable fusion of information from different algorithms. Here, we present a probabilistic framework for interpreting similarity measures that directly correlates the similarity value to a quantitative expectation that two molecules will in fact be equipotent. The approach is based on extensive benchmarking of 10 different similarity methods (MACCS keys, Daylight fingerprints, maximum common subgraphs, rapid overlay of chemical structures (ROCS) shape similarity, and six connectivity-based fingerprints) against a database of more than 150,000 compounds with activity data against 23 protein targets. Given this unified and probabilistic framework for interpreting chemical similarity, principles derived from decision theory can then be applied to combine the evidence from different similarity measures in such a way that both capitalizes on the strengths of the individual approaches and maintains a quantitative estimate of the likelihood that any two molecules will exhibit similar biological activity.

17.
18.
Fingerprint-based similarity searching is widely used for virtual screening when only a single bioactive reference structure is available. This paper reviews three distinct ways of carrying out such searches when multiple bioactive reference structures are available: merging the individual fingerprints into a single combined fingerprint; applying data fusion to the similarity rankings resulting from individual similarity searches; and approximations to substructural analysis. Extended searches on the MDL Drug Data Report database suggest that fusing similarity scores is the most effective general approach, with the best individual results coming from the binary kernel discrimination technique.
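
The score-fusion idea described in this abstract can be sketched in a few lines of Python. The sketch assumes a MAX fusion rule (each database compound keeps its best similarity score over all reference structures); the abstract does not say which fusion rule the authors found best, so the rule, the function name, and the input layout are illustrative assumptions only.

```python
def max_score_fusion(per_reference_scores):
    """Fuse similarity scores from several reference structures.

    per_reference_scores: list of dicts, one per reference structure,
    each mapping a compound identifier to its similarity score.
    Returns compound identifiers ranked by their fused (maximum) score.
    """
    fused = {}
    for scores in per_reference_scores:
        for compound, score in scores.items():
            # MAX rule (assumed): keep the best score seen for each compound.
            if compound not in fused or score > fused[compound]:
                fused[compound] = score
    # Highest fused score first; ties broken by identifier.
    return sorted(fused, key=lambda c: (-fused[c], c))
```

With two reference structures, a compound that resembles only one of them still ranks highly under this rule, which is the usual motivation for fusing scores when the references are structurally diverse.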

19.
Crystal structures taken from the Cambridge Structural Database were used to build a ring scaffold database containing 19 050 3D structures, with each such scaffold then being used to generate a centroid connecting path (CCP) representation. The CCP is a novel object that connects ring centroids, ring linker atoms, and other important points on the connection path between ring centroids. Unsupervised searching in the scaffold and CCP data sets was carried out using the atom-based LAMDA and RigFit search methods and the field-based similarity search method. The performance of these methods was tested with three different ring scaffold queries. These searches demonstrated that unsupervised 3D scaffold searching methods can find not only the types of ring systems that might be retrieved in carefully defined pharmacophore searches (supervised approach) but also additional, structurally diverse ring systems that could form the starting point for lead discovery programs or other scaffold-hopping applications. Not only are the methods effective but some are sufficiently rapid to permit scaffold searching in large chemical databases on a routine basis.

20.
On the basis of the recently introduced reduced graph concept of ErG (extending reduced graphs), a straightforward weighting approach to include additional (e.g., structural or SAR) knowledge into similarity searching procedures for virtual screening (wErG) is proposed. This simple procedure is exemplified with three data sets, for which interaction patterns available from X-ray structures of native or peptidomimetic ligands with their target protein are used to significantly improve retrieval rates of known actives from the MDL Drug Report database. The results are compared to those of other virtual screening techniques such as Daylight fingerprints, FTrees, UNITY, and various FlexX docking protocols. Here, it is shown that wErG exhibits a very good and stable performance independent of the target structure. On the basis of this (and the fact that ErG retrieves structurally more dissimilar compounds due to its potential to perform scaffold-hopping), the combination of wErG and FlexX is successfully explored. Overall, wErG is not only an easily applicable weighting procedure that efficiently identifies actives in large data sets but it is also straightforward to understand for both medicinal and computational chemists and can, therefore, be driven by several aspects of project-related knowledge (e.g., X-ray, NMR, SAR, and site-directed mutagenesis) in a very early stage of the hit identification process.
