Similar literature
 Found 20 similar documents; search took 250 ms
1.
Virtual docking algorithms are often evaluated on their ability to separate active ligands from decoy molecules. The current state-of-the-art benchmark, the Directory of Useful Decoys (DUD), minimizes bias by including decoys from a library of synthetically feasible molecules that are physically similar yet chemically dissimilar to the active ligands. We show that by ignoring synthetic feasibility, we can compile a benchmark that is comparable to the DUD and less biased with respect to physical similarity.

2.
Due to the large number of different docking programs and scoring functions available, researchers are faced with the problem of selecting the most suitable one when starting a structure-based drug discovery project. To guide the decision process, several studies comparing different docking and scoring approaches have been published. In the context of comparing scoring function performance, it is common practice to use a predefined, computer-generated set of ligand poses (decoys) and to reevaluate their score using the set of scoring functions to be compared. But are predefined decoy sets able to unambiguously evaluate and rank different scoring functions with respect to pose prediction performance? This question arose when the pose prediction performance of our piecewise linear potential derived scoring functions (Korb et al. in J Chem Inf Model 49:84–96, 2009) was assessed on a standard decoy set (Cheng et al. in J Chem Inf Model 49:1079–1093, 2009). While they showed excellent pose identification performance when they were used for rescoring of the predefined decoy conformations, a pronounced degradation in performance could be observed when they were directly applied in docking calculations using the same test set. This implies that on a discrete set of ligand poses only the rescoring performance can be evaluated. For comparing the pose prediction performance in a more rigorous manner, the search space of each scoring function has to be sampled extensively as done in the docking calculations performed here. We were able to identify relative strengths and weaknesses of three scoring functions (ChemPLP, GoldScore, and Astex Statistical Potential) by analyzing the performance for subsets of the complexes grouped by different properties of the active site. However, reasons for the overall poor performance of all three functions on this test set compared to other test sets of similar size could not be identified.

3.
Docking and scoring are critical issues in virtual drug screening methods. Fast and reliable methods are required for the prediction of binding affinity especially when applied to a large library of compounds. The implementation of receptor flexibility and refinement of scoring functions for this purpose are extremely challenging in terms of computational speed. Here we propose a knowledge-based multiple-conformation docking method that efficiently accommodates receptor flexibility thus permitting reliable virtual screening of large compound libraries. Starting with a small number of active compounds, a preliminary docking operation is conducted on a large ensemble of receptor conformations to select the minimal subset of receptor conformations that provides a strong correlation between the experimental binding affinity (e.g., Ki, IC50) and the docking score. Only this subset is used for subsequent multiple-conformation docking of the entire data set of library (test) compounds. In conjunction with the multiple-conformation docking procedure, a two-step scoring scheme is employed by which the optimal scoring geometries obtained from the multiple-conformation docking are re-scored by a molecular mechanics energy function including desolvation terms. To demonstrate the feasibility of this approach, we applied this integrated approach to the estrogen receptor alpha (ERalpha) system for which published binding affinity data were available for a series of structurally diverse chemicals. The statistical correlation between docking scores and experimental values was significantly improved from those of single-conformation dockings. This approach led to substantial enrichment of the virtual screening conducted on mixtures of active and inactive ERalpha compounds.

4.
A new method for the postprocessing of docking outputs has been developed, based on encoding putative 3D binding modes (docking solutions) as ligand-protein interactions into simple bit strings, a method analogous to the structural interaction fingerprint. Instead of employing traditional scoring functions, the method uses a series of new, knowledge-based scores derived from the similarity of the bit strings for each docking solution to that of a known reference binding mode. A GOLD docking study was carried out using the Bissantz estrogen receptor antagonist set along with the new scoring method. Superior recovery rates, with up to 2-fold enrichments, were observed when the new knowledge-based scoring was compared to the GOLD fitness score. In addition, top ranking sets of molecules (actives and potential actives or decoys) were structurally diverse with low molecular weights and structural complexities. Principal component analysis and clustering of the fingerprints permit the easy separation of active from inactive binding modes and the visualization of diverse binding modes.
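
The bit-string scoring idea in this abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure was used, so the Tanimoto coefficient below is an assumption, and the function names and example fingerprints are hypothetical.

```python
from typing import Dict, List, Tuple


def tanimoto(a: str, b: str) -> float:
    """Tanimoto similarity between two equal-length '0'/'1' bit strings."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have equal length")
    both = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    either = sum(1 for x, y in zip(a, b) if x == "1" or y == "1")
    # Two all-zero fingerprints share no interactions; treat as dissimilar.
    return both / either if either else 0.0


def rank_poses(reference: str, poses: Dict[str, str]) -> List[Tuple[str, float]]:
    """Rank docking solutions by fingerprint similarity to a reference mode.

    `reference` is the interaction fingerprint of the known binding mode;
    `poses` maps a (hypothetical) pose name to its fingerprint.
    """
    scored = [(name, tanimoto(reference, fp)) for name, fp in poses.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    reference_fp = "110010"
    docked = {"pose_1": "110010", "pose_2": "100011", "pose_3": "001100"}
    for name, score in rank_poses(reference_fp, docked):
        print(name, round(score, 3))
```

Each bit stands for one ligand-protein interaction (for example, a hydrogen bond to a given residue), so a pose that reproduces the reference interactions ranks first regardless of its conventional docking score.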

5.
Target-based virtual screening is increasingly used to generate leads for targets for which high quality three-dimensional (3D) structures are available. To allow large molecular databases to be screened rapidly, a tiered scoring scheme is often employed whereby a simple scoring function is used as a fast filter of the entire database and a more rigorous and time-consuming scoring function is used to rescore the top hits to produce the final list of ranked compounds. Molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) approaches are currently thought to be quite effective at incorporating implicit solvation into the estimation of ligand binding free energies. In this paper, the ability of a high-throughput MM-PBSA rescoring function to discriminate between correct and incorrect docking poses is investigated in detail. Various initial scoring functions are used to generate docked poses for a subset of the CCDC/Astex test set and to dock one set of actives/inactives from the DUD data set. The effectiveness of each of these initial scoring functions is discussed. Overall, the ability of the MM-PBSA rescoring function to (i) regenerate the set of X-ray complexes when docking the bound conformation of the ligand, (ii) regenerate the X-ray complexes when docking conformationally expanded databases for each ligand which include "conformation decoys" of the ligand, and (iii) enrich known actives in a virtual screen for the mineralocorticoid receptor in the presence of "ligand decoys" is assessed. While a pharmacophore-based molecular docking approach, PhDock, is used to carry out the docking, the results are expected to be general to use with any docking method.

6.
Computational methods involving virtual screening could potentially be employed to discover new biomolecular targets for an individual molecule of interest (MOI). However, existing scoring functions may not accurately differentiate proteins to which the MOI binds from a larger set of macromolecules in a protein structural database. An MOI will most likely have varying degrees of predicted binding affinities to many protein targets. However, correctly interpreting a docking score as a hit for the MOI docked to any individual protein can be problematic. In our method, which we term "Virtual Target Screening (VTS)", a set of small drug-like molecules is docked against each structure in the protein library to produce benchmark statistics. This calibration provides a reference for each protein so that hits can be identified for an MOI. VTS can then be used as a tool for: drug repositioning (repurposing), specificity and toxicity testing, identifying potential metabolites, probing protein structures for allosteric sites, and testing focused libraries (collection of MOIs with similar chemotypes) for selectivity. To validate our VTS method, twenty kinase inhibitors were docked to a collection of calibrated protein structures. Here, we report our results where VTS predicted protein kinases as hits in preference to other proteins in our database. Concurrently, a graphical interface for VTS was developed.

7.
8.
Structure-based screening using fully flexible docking is still too slow for large molecular libraries. High quality docking of a million molecule library can take days even on a cluster with hundreds of CPUs. This performance issue prohibits the use of fully flexible docking in the design of large combinatorial libraries. We have developed a fast structure-based screening method, which utilizes docking of a limited number of compounds to build a 2D QSAR model used to rapidly score the rest of the database. We compare here a model based on radial basis functions and a Bayesian categorization model. The number of compounds that need to be actually docked depends on the number of docking hits found. In our case studies reasonable quality models are built after docking of the number of molecules containing 50 docking hits. The rest of the library is screened by the QSAR model. Optionally a fraction of the QSAR-prioritized library can be docked in order to find the true docking hits. The quality of the model only depends on the training set size – not on the size of the library to be screened. Therefore, for larger libraries the method yields a higher gain in speed with no change in performance. Prioritizing a large library with these models provides a significant enrichment with docking hits: it attains the values of 13 and 35 at the beginning of the score-sorted libraries in our two case studies: screening of the NCI collection and of a combinatorial library on the CDK2 kinase structure. With such enrichments, only a fraction of the database must actually be docked to find many of the true hits. The throughput of the method allows its use in screening of large compound collections and in the design of large combinatorial libraries. The strategy proposed has an important effect on efficiency but does not affect retrieval of actives, the latter being determined by the quality of the docking method itself. 
Electronic supplementary material is available at http://dx.doi.org/10.1007/s10822-005-9002-6.

9.
Applications in structural biology and medicinal chemistry require protein-ligand scoring functions for two distinct tasks: (i) ranking different poses of a small molecule in a protein binding site and (ii) ranking different small molecules by their complementarity to a protein site. Using probability theory, we developed two atomic distance-dependent statistical scoring functions: PoseScore was optimized for recognizing native binding geometries of ligands from other poses and RankScore was optimized for distinguishing ligands from nonbinding molecules. Both scores are based on a set of 8,885 crystallographic structures of protein-ligand complexes but differ in the values of three key parameters. Factors influencing the accuracy of scoring were investigated, including the maximal atomic distance and non-native ligand geometries used for scoring, as well as the use of protein models instead of crystallographic structures for training and testing the scoring function. For the test set of 19 targets, RankScore improved the ligand enrichment (logAUC) and early enrichment (EF(1)) scores computed by DOCK 3.6 for 13 and 14 targets, respectively. In addition, RankScore performed better at rescoring than each of seven other scoring functions tested. Accepting both the crystal structure and decoy geometries with all-atom root-mean-square errors of up to 2 Å from the crystal structure as correct binding poses, PoseScore gave the best score to a correct binding pose among 100 decoys for 88% of all cases in a benchmark set containing 100 protein-ligand complexes. PoseScore accuracy is comparable to that of DrugScore(CSD) and ITScore/SE and superior to 12 other tested scoring functions. Therefore, RankScore can facilitate ligand discovery, by ranking complexes of the target with different small molecules; PoseScore can be used for protein-ligand complex structure prediction, by ranking different conformations of a given protein-ligand pair. 
The statistical potentials are available through the Integrative Modeling Platform (IMP) software package (http://salilab.org/imp) and the LigScore Web server (http://salilab.org/ligscore/).

10.
11.
A molecular docking method designated as ADDock, an anchor-dependent molecular docking process for docking small flexible molecules into rigid protein receptors, is presented in this article. ADDock makes the bond connection lists for atoms based on anchors chosen for building molecular structures for docking small flexible molecules or ligands into rigid active sites of protein receptors. ADDock employs an extended version of piecewise linear potential for scoring the docked structures. Since no translational motion for small molecules is implemented during the docking process, ADDock searches for the best docking result by systematically changing the anchors chosen, which are usually the single-edge connected nodes or terminal hydrogen atoms of ligands. ADDock takes intact ligand structures generated during the docking process for computing the docked scores; therefore, no energy minimization is required in the evaluation phase of docking. The docking accuracy by ADDock for 92 receptor-ligand complexes docked is 91.3%. All these complexes have been docked by other groups using other docking methods. The receptor-ligand steric interaction energies computed by ADDock for some sets of active and inactive compounds selected and docked into the same receptor active sites are apparently separated. These results show that based on the steric interaction energies computed between the docked structures and receptor active sites, ADDock is able to separate active from inactive compounds when both are docked into the same receptor.

12.
Comparative study of several algorithms for flexible ligand docking   Total citations: 3, self-citations: 0, citations by others: 3
We have performed a comparative assessment of several programs for flexible molecular docking: DOCK 4.0, FlexX 1.8, AutoDock 3.0, GOLD 1.2 and ICM 2.8. This was accomplished using two different studies: docking experiments on a data set of 37 protein-ligand complexes and screening a library containing 10,037 entries against 11 different proteins. The docking accuracy of the methods was judged based on the corresponding rank-one solutions. We have found that the fraction of molecules docked with acceptable accuracy is 0.47, 0.31, 0.35, 0.52 and 0.93 for, respectively, AutoDock, DOCK, FlexX, GOLD and ICM. Thus ICM provided the highest accuracy in ligand docking against these receptors. The results from the other programs are found to be less accurate and of approximately the same quality. A speed comparison demonstrated that FlexX was the fastest and AutoDock was the slowest among the tested docking programs. The database screening was performed using DOCK, FlexX and ICM. ICM was able to identify the original ligands within the top 1% of the total library in 17 cases. The corresponding number for DOCK and FlexX was 7 and 8, respectively. We have estimated that in virtual database screening, 50% of the potentially active compounds will be found among approximately 1.5% of the top scoring solutions found with ICM and among approximately 9% of the top scoring solutions produced by DOCK and FlexX.

13.
Two sets of ligand binding decoys have been constructed for the community structure-activity resource (CSAR) benchmark by using the MDock and DOCK programs for rigid- and flexible-ligand docking, respectively. The decoys generated for each complex in the benchmark thoroughly cover the binding site and also contain a certain number of near-native binding modes. A few scoring functions have been evaluated using the ligand binding decoy sets for their ability to predict near-native binding modes. Among them, ITScore achieved a success rate of 86.7% for the rigid-ligand decoys and 79.7% for the flexible-ligand decoys, under the common definition of a successful prediction as root-mean-square deviation <2.0 Å from the native structure if the top-scored binding mode was considered. The decoy sets may serve as benchmarks for binding mode prediction of a scoring function, and are available at the CSAR Web site (http://www.csardock.org/).
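
The success criterion quoted in this abstract (top-scored binding mode within 2.0 Å RMSD of the native structure) can be sketched as follows. This is only an illustration: it assumes the pose and native coordinates are already in the same frame with matching atom order, and it ignores symmetry-equivalent atoms, which real evaluations may handle differently. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(pose: Sequence[Coord], native: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two matched coordinate lists (in Å)."""
    if len(pose) != len(native) or not pose:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose, native)
    )
    return math.sqrt(squared / len(pose))


def success_rate(top_pose_rmsds: List[float], cutoff: float = 2.0) -> float:
    """Fraction of complexes whose top-scored pose has RMSD below `cutoff`."""
    if not top_pose_rmsds:
        raise ValueError("need at least one complex")
    hits = sum(1 for value in top_pose_rmsds if value < cutoff)
    return hits / len(top_pose_rmsds)


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    pose = [(0.0, 0.0, 0.5), (1.5, 0.0, 0.5)]
    print(rmsd(pose, native))
    print(success_rate([0.5, 1.9, 2.0, 3.1]))
```

Note that the cutoff is strict (`<`), matching the "<2.0 Å" wording, so a pose at exactly 2.0 Å does not count as a success.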

14.
Protein-ligand interaction fingerprints have been used to postprocess docking poses of three ligand data sets: a set of 40 low-molecular-weight compounds from the Protein Data Bank, a collection of 40 scaffolds from pharmaceutically relevant protein ligands, and a database of 19 scaffolds extracted from true cdk2 inhibitors seeded in 2230 scaffold decoys. Four popular docking tools (FlexX, Glide, Gold, and Surflex) were used to generate poses for ligands of the three data sets. In all cases, scoring by the similarity of interaction fingerprints to a given reference was statistically superior to conventional scoring functions in posing low-molecular-weight fragments, predicting protein-bound scaffold coordinates according to the known binding mode of related ligands, and screening a scaffold library to enrich a hit list in true cdk2-targeted scaffolds.

15.
Protein–ligand docking is a useful tool for providing atomic-level understanding of protein functions in nature and design principles for artificial ligands or proteins with desired properties. The ability to identify the true binding pose of a ligand to a target protein among numerous possible candidate poses is an essential requirement for successful protein–ligand docking. Many previously developed docking scoring functions were trained to reproduce experimental binding affinities and were also used for scoring binding poses. However, in this study, we developed a new docking scoring function, called GalaxyDock BP2 Score, by directly training the scoring power of binding poses. This function is a hybrid of physics-based, empirical, and knowledge-based score terms that are balanced to strengthen the advantages of each component. The performance of the new scoring function exhibits significant improvement over existing scoring functions in decoy pose discrimination tests. In addition, when the score is used with the GalaxyDock2 protein–ligand docking program, it outperformed other state-of-the-art docking programs in docking tests on the Astex diverse set, the Cross2009 benchmark set, and the Astex non-native set. GalaxyDock BP2 Score and GalaxyDock2 with this score are freely available at http://galaxy.seoklab.org/softwares/galaxydock.html.

16.
For widely applied in silico screening techniques success depends on the rational selection of an appropriate method. We herein present a fast, versatile, and robust method to construct demanding evaluation kits for objective in silico screening (DEKOIS). This automated process enables creating tailor-made decoy sets for any given sets of bioactives. It facilitates a target-dependent validation of docking algorithms and scoring functions helping to save time and resources. We have developed metrics for assessing and improving decoy set quality and employ them to investigate how decoy embedding affects docking. We demonstrate that screening performance is target-dependent and can be impaired by latent actives in the decoy set (LADS) or enhanced by poor decoy embedding. The presented method allows extending and complementing the collection of publicly available high quality decoy sets toward new target space. All present and future DEKOIS data sets will be made accessible at www.dekois.com.

17.
We report here a robust automated active site detection, docking, and scoring (AADS) protocol for proteins with known structures. The active site finder identifies all cavities in a protein and scores them based on the physicochemical properties of functional groups lining the cavities in the protein. The accuracy realized on 620 proteins with sizes ranging from 100 to 600 amino acids with known drug active sites is 100% when the top ten cavity points are considered. These top ten cavity points identified are then submitted for an automated docking of an input ligand/candidate molecule. The docking protocol uses an all atom energy based Monte Carlo method. Eight low energy docked structures corresponding to different locations and orientations of the candidate molecule are stored at each cavity point, giving 80 docked structures overall, which are then ranked using an effective free energy function, and the top five structures are selected. The predicted structure and energetics of the complexes agree quite well with experiment when tested on a data set of 170 protein-ligand complexes with known structures and binding affinities. The AADS methodology is implemented on an 80 processor cluster and presented as a freely accessible, easy to use tool at http://www.scfbio-iitd.res.in/dock/ActiveSite_new.jsp.

18.
An NMR fragment screening dataset with known binders and decoys was used to evaluate the ability of docking and re-scoring methods to identify fragment binders. Re-scoring docked poses using the Molecular Mechanics Poisson-Boltzmann Surface Area (MM-PBSA) implicit solvent model identifies additional active fragments relative to either docking or random fragment screening alone. Early enrichment, which is clearly most important in practice for selecting relatively small sets of compounds for experimental testing, is improved by MM-PBSA re-scoring. In addition, the value in MM-PBSA re-scoring of docked poses for virtual screening may be in lessening the effect of the variation in the protein complex structure used.

19.
Ligand enrichment among top-ranking hits is a key metric of virtual screening. To avoid bias, decoys should resemble ligands physically, so that enrichment is not attributable to simple differences of gross features. We therefore created a directory of useful decoys (DUD) by selecting decoys that resembled annotated ligands physically but not topologically to benchmark docking performance. DUD has 2950 annotated ligands and 95,316 property-matched decoys for 40 targets. It is by far the largest and most comprehensive public data set for benchmarking virtual screening programs that I am aware of. This paper outlines several ways that DUD can be improved to provide better telemetry to investigators seeking to understand both the strengths and the weaknesses of current docking methods. I also highlight several pitfalls for the unwary: a risk of over-optimization, questions about chemical space, and the proper scope for using DUD. Careful attention to both the composition of benchmarks and how they are used is essential to avoid being misled by overfitting and bias.
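
Ligand enrichment among top-ranking hits is commonly summarized as an enrichment factor (EF). The sketch below uses one common definition, the active rate in the top fraction of the ranked list divided by the active rate in the whole library. The abstract itself does not give a formula, so this definition, the rounding of the cutoff, and the default that lower docking scores are better are all assumptions.

```python
from typing import Sequence


def enrichment_factor(
    scores: Sequence[float],
    is_active: Sequence[bool],
    fraction: float = 0.01,
    lower_is_better: bool = True,
) -> float:
    """Enrichment factor of actives within the top `fraction` of a ranked list."""
    if len(scores) != len(is_active):
        raise ValueError("scores and labels must have equal length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n_total = len(scores)
    n_actives = sum(1 for flag in is_active if flag)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    # Always inspect at least one compound, even for tiny libraries.
    n_top = max(1, int(round(n_total * fraction)))
    order = sorted(
        range(n_total), key=lambda i: scores[i], reverse=not lower_is_better
    )
    top_actives = sum(1 for i in order[:n_top] if is_active[i])
    return (top_actives / n_top) / (n_actives / n_total)


if __name__ == "__main__":
    docking_scores = [-10.0, -9.0, -8.0, -7.0, -6.0, -5.0, -4.0, -3.0, -2.0, -1.0]
    labels = [True, False, True, False, False, False, False, False, False, False]
    print(enrichment_factor(docking_scores, labels, fraction=0.2))
```

An EF of 1 corresponds to random selection. Property-matched decoys matter here because a scoring function that merely rewards, say, larger molecules would otherwise produce a high EF without recognizing binding.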

20.
A continuing problem in protein-ligand docking is the correct relative ranking of active molecules versus inactives. Using the ChemScore scoring function as implemented in the GOLD docking software, we have investigated the effect of scaling hydrogen bond, metal-ligand, and lipophilic interactions based on the buriedness of the interaction. Buriedness was measured using the receptor density, the number of protein heavy atoms within 8.0 Å. Terms in the scaling functions were optimized using negative data, represented by docked poses of inactive molecules. The objective function was the mean rank of the scores of the active poses in the Astex Diverse Set (Hartshorn et al. J. Med. Chem., 2007, 50, 726) with respect to the docked poses of 99 inactives. The final four-parameter model gave a substantial improvement in the average rank from 18.6 to 12.5. Similar results were obtained for an independent test set. Receptor density scaling is available as an option in the recent GOLD release.
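
Two quantities from this abstract lend themselves to a short illustration: the receptor density (number of protein heavy atoms within 8.0 Å of a point) and the mean rank of active poses relative to inactives. The sketch below is not the GOLD implementation. It assumes an inclusive distance cutoff, that higher scores are better, and that the rank of an active is one plus the number of inactives that outscore it. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def receptor_density(
    point: Coord, protein_heavy_atoms: Sequence[Coord], radius: float = 8.0
) -> int:
    """Count protein heavy atoms within `radius` Å of `point` (inclusive)."""
    return sum(1 for atom in protein_heavy_atoms if math.dist(point, atom) <= radius)


def mean_active_rank(
    active_scores: Sequence[float],
    inactive_scores_per_target: Sequence[Sequence[float]],
) -> float:
    """Mean rank of each active pose among the inactives docked to its target.

    Rank 1 means the active outscores every inactive (higher score = better).
    """
    if len(active_scores) != len(inactive_scores_per_target) or not active_scores:
        raise ValueError("need one list of inactive scores per active")
    ranks: List[int] = []
    for active, inactives in zip(active_scores, inactive_scores_per_target):
        ranks.append(1 + sum(1 for score in inactives if score > active))
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    atoms = [(1.0, 0.0, 0.0), (8.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
    print(receptor_density((0.0, 0.0, 0.0), atoms))
    print(mean_active_rank([5.0, 1.0], [[4.0, 6.0], [0.5, 0.2]]))
```

With 99 inactives per target the rank runs from 1 to 100, so a drop in mean rank from 18.6 to 12.5 means that, on average, about six fewer inactives outscore the active pose.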
