Similar literature
 Found 20 similar articles; search took 15 ms
1.
The evaluation of ligand conformations is a crucial aspect of structure-based virtual screening, and scoring functions play significant roles in it. While consensus scoring (CS) generally improves enrichment by compensating for the deficiencies of each scoring function, selecting the individual scoring functions remains a challenging task when few known active compounds are available. To address this problem, we propose feature selection-based consensus scoring (FSCS), which performs supervised feature selection with docked native ligand conformations to select complementary scoring functions. We evaluated the enrichments of five scoring functions (F-Score, D-Score, PMF, G-Score, and ChemScore), FSCS, and RCS (rank-by-rank consensus scoring) for four different target proteins: acetylcholine esterase (AChE), thrombin, phosphodiesterase 5 (PDE5), and peroxisome proliferator-activated receptor gamma (PPARgamma). The results indicated that FSCS was able to select the complementary scoring functions and enhance ligand enrichments and that it outperformed RCS and the individual scoring functions for all target proteins. They also indicated that the performances of the single scoring functions were strongly dependent on the target protein. An especially favorable result with implications for practical drug screening is that FSCS performs well even if only one 3D structure of the protein-ligand complex is known. Moreover, we found that one can infer which scoring functions significantly enrich active compounds by using feature selection before actual docking and that the selected scoring functions are complementary.
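
As a rough illustration of the rank-by-rank consensus scoring (RCS) baseline named in this abstract, the following Python sketch averages each compound's rank across several scoring functions. The compound names, the score values, and the convention that a lower score is better are assumptions made for the example; they are not taken from the paper, and FSCS itself is not reproduced here.

```python
from statistics import mean

def rank_by_rank(scores):
    """Order compounds by their mean rank across scoring functions.

    scores: {function_name: {compound: score}}; a lower score is
    assumed to be better (an assumption of this sketch).
    """
    ranks = {}
    for per_compound in scores.values():
        ordered = sorted(per_compound, key=per_compound.get)
        for position, compound in enumerate(ordered, start=1):
            ranks.setdefault(compound, []).append(position)
    return sorted(ranks, key=lambda c: mean(ranks[c]))

# Hypothetical scores for three compounds from three scoring functions.
example = {
    "F-Score": {"cpd_a": -30.0, "cpd_b": -25.0, "cpd_c": -10.0},
    "PMF": {"cpd_a": -80.0, "cpd_b": -95.0, "cpd_c": -40.0},
    "ChemScore": {"cpd_a": -22.0, "cpd_b": -18.0, "cpd_c": -20.0},
}
consensus_order = rank_by_rank(example)
```

Averaging ranks rather than raw scores sidesteps the fact that different scoring functions report values on different scales.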

2.
Protein-ligand docking programs have been used to efficiently discover novel ligands for target proteins from large-scale compound databases. However, better scoring methods are needed. Generally, scoring functions are optimized by means of various techniques that affect their fitness for reproducing X-ray structures and protein-ligand binding affinities. However, these scoring functions do not always work well for all target proteins. A scoring function should be optimized for a target protein to enhance enrichment for structure-based virtual screening. To address this problem, we propose the supervised scoring model (SSM), which takes into account the protein-ligand binding process using docked ligand conformations with supervised learning for optimizing scoring functions against a target protein. SSM employs a rough linear correlation between binding free energy and the root mean square deviation of a native ligand for predicting binding energy. We applied SSM to the FlexX scoring function, that is, F-Score, with five different target proteins: thymidine kinase (TK), estrogen receptor (ER), acetylcholine esterase (AChE), phosphodiesterase 5 (PDE5), and peroxisome proliferator-activated receptor gamma (PPARgamma). For these five proteins, SSM always enhanced enrichment better than F-Score, exhibiting superior performance that was particularly remarkable for TK, AChE, and PPARgamma. We also demonstrated that SSM is especially good at enhancing enrichments of the top ranks of screened compounds, which is useful in practical drug screening.

3.
The efficiency of scoring functions for hit identification is usually quantified in terms of enrichment factors and enrichment curves. Close inspection of simulated and real score distributions from virtual screening, however, suggests that 'analysis of variance' (ANOVA) is a more reliable method for assessing their performance. Using ANOVA to quantify the discriminatory power of scoring functions with respect to ligands, decoys, and a reproducible reference database has the potential to facilitate the advancement of scoring functions significantly.
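
To make the ANOVA idea concrete, here is a minimal Python sketch of the standard one-way ANOVA F statistic applied to two groups of docking scores. The ligand and decoy scores are invented for illustration, and the abstract does not say exactly how the authors set up their analysis, so this is only the textbook formula under those assumptions.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F statistic for a list of score lists."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical docking scores (lower = better binding, assumed).
ligand_scores = [-9.0, -8.0, -10.0]
decoy_scores = [-5.0, -6.0, -4.0]
f_value = anova_f([ligand_scores, decoy_scores])
```

A larger F value means the score distributions of ligands and decoys are better separated relative to the spread within each group.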

4.
Poor performance of scoring functions is a well-known bottleneck in structure-based virtual screening (VS), which is most frequently manifested in the scoring functions' inability to discriminate between true ligands and known nonbinders (therefore designated as binding decoys). This deficiency leads to a large number of false positive hits resulting from VS. We have hypothesized that filtering out or penalizing docking poses recognized as non-native (i.e., pose decoys) should improve the performance of VS in terms of improved identification of true binders. Using several concepts from the field of cheminformatics, we have developed a novel approach to identifying pose decoys from an ensemble of poses generated by computational docking procedures. We demonstrate that the use of a target-specific pose (scoring) filter in combination with a physical force field-based scoring function (MedusaScore) leads to significant improvement of hit rates in VS studies for 12 of the 13 benchmark sets from the clustered version of the Database of Useful Decoys (DUD). This new hybrid scoring function outperforms several conventional structure-based scoring functions, including XSCORE::HMSCORE, ChemScore, PLP, and Chemgauss3, in 6 out of 13 data sets at the early stage of VS (up to 1% decoys of the screening database). We compare our hybrid method with several novel VS methods that were recently reported to have good performances on the same DUD data sets. We find that the ligands retrieved using our method are chemically more diverse in comparison with two ligand-based methods (FieldScreen and FLAP::LBX). We also compare our method with FLAP::RBLB, a high-performance VS method that also utilizes both the receptor and the cognate ligand structures. Interestingly, we find that the top ligands retrieved using our method are highly complementary to those retrieved using FLAP::RBLB, hinting at effective directions for best VS applications.
We suggest that this integrative VS approach combining cheminformatics and molecular mechanics methodologies may be applied to a broad variety of protein targets to improve the outcome of structure-based drug discovery studies.

5.
In order to identify novel chemical classes of factor Xa inhibitors, five scoring functions (FlexX, DOCK, GOLD, ChemScore and PMF) were engaged to evaluate the multiple docking poses generated by FlexX. The compound collection was composed of confirmed potent factor Xa inhibitors and a subset of the LeadQuest screening compound library. Except for PMF, the other four scoring functions succeeded in reproducing the crystal complex (PDB code: 1FAX). During virtual screening the highest hit rate (80%) was demonstrated by FlexX at an energy cutoff of -40 kJ/mol, which is about 40-fold over random screening (2.06%). Limited results suggest that presenting more poses of a single molecule to the scoring functions could deteriorate their enrichment factors. A series of promising scaffolds with favorable binding scores was retrieved from LeadQuest. Consensus scoring by pair-wise intersection failed to enrich the hit rate yielded by single scorings (i.e. FlexX). We note that reported successes of consensus scoring in hit rate enrichment could be artificial because their comparisons were based on a selected subset of single scoring and a markedly reduced subset of double or triple scoring. The findings presented in this report are based upon a single biological system and support further studies.
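
The "about 40-fold" figure in this abstract follows from dividing the reported hit rate by the random hit rate. A minimal Python check, using only the two percentages quoted in the abstract (the function name is a hypothetical helper):

```python
def enrichment_factor(hit_rate, random_hit_rate):
    """Ratio of the observed hit rate to the hit rate of random selection."""
    return hit_rate / random_hit_rate

# 80% hit rate for FlexX versus 2.06% for random screening.
ef = enrichment_factor(0.80, 0.0206)
```

The ratio comes out slightly below 39, which is consistent with the abstract's rounded "about 40-fold".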

6.
Journal of Computer-Aided Molecular Design - Structure-based virtual screening plays a significant role in drug-discovery. The method virtually docks millions of compounds from corporate or public...

7.
8.
The performance of all four GOLD scoring functions has been evaluated for pose prediction and virtual screening under the standardized conditions of the comparative docking and scoring experiment reported in this Edition. Excellent pose prediction and good virtual screening performance were demonstrated using unmodified protein models and default parameter settings. The best performing scoring function for both pose prediction and virtual screening was demonstrated to be the recently introduced scoring function ChemPLP. We conclude that existing docking programs already perform close to optimally in the cognate pose prediction experiments currently carried out and that more stringent pose prediction tests should be used in the future. These should employ cross-docking sets. Evaluation of virtual screening performance remains problematic and much remains to be done to improve the usefulness of publicly available active and decoy sets for virtual screening. Finally we suggest that, for certain target/scoring function combinations, good enrichment may sometimes be a consequence of 2D property recognition rather than a modelling of the correct 3D interactions.

9.
In many practical applications of structure-based virtual screening (VS), ligands are already known. This circumstance requires that the obtained hits satisfy initially made expectations, i.e., they have to fulfill a predefined binding pattern and/or lie within a predefined physico-chemical property range. Based on the RApid Index-based Screening Engine (RAISE) approach, we introduce cRAISE, a user-controllable structure-based VS method. It efficiently realizes pharmacophore-guided protein-ligand docking to assess the library content but thereby concentrates only on molecules that have a chance to fulfill the given binding pattern. In order to focus only on hits satisfying given molecular properties, library profiles can be utilized to simultaneously filter compounds. cRAISE was evaluated on a range of strict to rather relaxed hypotheses with respect to its capability to guide binding-mode predictions and VS runs. The results reveal insights into a guided VS process. If a pharmacophore model is chosen appropriately, a binding mode below 2 Å is successfully reproduced for 85 % of well-prepared structures, enrichment is increased up to a median AUC of 73 %, and the selectivity of the screening process is significantly enhanced, leading to up to seven times accelerated runtimes. In general, cRAISE supports a versatile structure-based VS approach that allows hypotheses about putative ligands to be assessed on a large scale.

10.
In today's research environment, a wealth of experimental/theoretical structural data is available and the number of therapeutically relevant macromolecular structures is growing rapidly. This, coupled with the huge number of small non-peptide potential drug candidates easily available (over 7 million compounds), highlights the need for computer-aided techniques for the efficient identification and optimization of novel hit compounds. Virtual (or in silico) ligand screening based on the three-dimensional structure of macromolecular targets (SB-VLS) is firmly established as an important approach to identify chemical entities that have a high likelihood of binding to a target molecule to elicit desired biological responses. A myriad of free applications and services facilitating the drug discovery process have been posted on the Web. In this review, we cite over 350 URLs that are useful for SB-VLS projects and essentially free for academic groups. We attempt to provide links for in silico ADME/tox prediction tools, compound collections, some ligand-based methods, characterization/simulation of 3D targets and homology modeling tools, druggable pocket predictions, active site comparisons, analysis of macromolecular interfaces, protein docking tools to help identify binding pockets and protein-ligand docking/scoring methods. As such, we aim at providing both methods pertaining to the field of Structural Bioinformatics (defined here as tools to study macromolecules) and methods pertaining to the field of Chemoinformatics (defined here as tools to make better decisions faster in the arena of drug/lead identification and optimization). We also report several recent success stories using these free computer methods. This review should help readers find free computer tools useful for their projects. Overall, we are confident that these tools will facilitate rapid and cost-effective identification of new hit compounds.
The URLs presented in this review will be updated regularly in the "Links" section of www.vls3d.com in the coming months.

11.
12.
The potential for therapeutic specificity in regulating diseases has made cannabinoid (CB) receptors one of the most important G-protein-coupled receptor (GPCR) targets in the search for new drugs. Considering the lack of related 3D experimental structures, we have established a structure-based virtual screening protocol to search for CB2 bioactive antagonists based on the 3D CB2 homology structure model. However, the existing homology-predicted 3D models often deviate from the native structure and therefore may incorrectly bias the in silico design. To overcome this problem, we have developed a 3D testing database query algorithm to examine the constructed 3D CB2 receptor structure model as well as the predicted binding pocket. In the present study, an antagonist-bound CB2 receptor complex model was initially generated using flexible docking simulation and then further optimized by molecular dynamics and mechanics (MD/MM) calculations. The refined 3D structural model of the CB2-ligand complex was then inspected by exploring the interactions between the receptor and ligands in order to predict the potential CB2 binding pocket for its antagonist. The ligand-receptor complex model and the predicted antagonist binding pockets were further processed and validated by FlexX-Pharm docking against a testing compound database that contains known antagonists. Furthermore, a consensus scoring (CScore) function algorithm was established to rank the binding interaction modes of a ligand on the CB2 receptor. Our results indicated that the known antagonists seeded in the testing database can be distinguished from a significant number of randomly chosen molecules. Our studies demonstrated that the established GPCR structure-based virtual screening approach provided a new strategy with a high potential for identifying novel CB2 antagonist leads in silico based on the homology-generated 3D CB2 structure model.

13.
MOTIVATION: Virtual screening of molecular compound libraries is a potentially powerful and inexpensive method for the discovery of novel lead compounds for drug development. The major weakness of virtual screening, the inability to consistently identify true positives (leads), is likely due to our incomplete understanding of the chemistry involved in ligand binding and the subsequently imprecise scoring algorithms. It has been demonstrated that combining multiple scoring functions (consensus scoring) improves the enrichment of true positives. Previous efforts at consensus scoring have largely focused on empirical results, but they have yet to provide a theoretical analysis that gives insight into real features of combinations and data fusion for virtual screening. RESULTS: We demonstrate that combining multiple scoring functions improves the enrichment of true positives only if (a) each of the individual scoring functions has relatively high performance and (b) the individual scoring functions are distinctive. Notably, these two prediction variables are previously established criteria for the performance of data fusion approaches using either rank or score combinations. This work, thus, establishes a potential theoretical basis for the probable success of data fusion approaches to improve yields in in silico screening experiments. Furthermore, it is similarly established that the second criterion (b) can, in at least some cases, be functionally defined as the area between the rank versus score plots generated by the two (or more) algorithms. Because rank-score plots are independent of the performance of the individual scoring function, this establishes a second theoretically defined approach to determining the likely success of combining data from different predictive algorithms.
This approach is, thus, useful in practical settings in the virtual screening process when the performance of at least two individual scoring functions (such as in criterion a) can be estimated as having a high likelihood of having high performance, even if no training sets are available. We provide initial validation of this theoretical approach using data from five scoring systems with two evolutionary docking algorithms on four targets, thymidine kinase, human dihydrofolate reductase, and estrogen receptors of antagonists and agonists. Our procedure is computationally efficient, able to adapt to different situations, and scalable to a large number of compounds as well as to a greater number of combinations. Results of the experiment show a fairly significant improvement (vs single algorithms) in several measures of scoring quality, specifically "goodness-of-hit" scores, false positive rates, and "enrichment". This approach (available online at http://gemdock.life.nctu.edu.tw/dock/download.php) has practical utility for cases where the basic tools are known or believed to be generally applicable, but where specific training sets are absent.
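
One possible reading of criterion (b), the area between the rank versus score plots of two scoring functions, can be sketched in Python as follows. The min-max normalization, the convention that a higher score is better, and the example scores are all assumptions of this sketch; the abstract does not specify how the authors normalize scores or compute the area.

```python
def rank_score_curve(scores):
    """Normalized score at each rank, best rank first.

    Assumes higher scores are better and that the scores are not
    all identical (otherwise min-max normalization is undefined).
    """
    lo, hi = min(scores), max(scores)
    return sorted(((s - lo) / (hi - lo) for s in scores), reverse=True)

def rank_score_area(scores_a, scores_b):
    """Sum of gaps between two rank-score curves over all ranks."""
    curve_a = rank_score_curve(scores_a)
    curve_b = rank_score_curve(scores_b)
    return sum(abs(a - b) for a, b in zip(curve_a, curve_b))

# Hypothetical scores from two scoring functions on the same three compounds.
area = rank_score_area([0.0, 5.0, 10.0], [0.0, 1.0, 10.0])
same = rank_score_area([0.0, 5.0, 10.0], [0.0, 5.0, 10.0])
```

Because each curve depends only on how a function distributes its scores over ranks, not on which compounds are active, the area can be computed without a training set, which matches the point the abstract makes.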

14.
15.
Docking programs are widely used to discover novel ligands efficiently and can predict protein-ligand complex structures with reasonable accuracy and speed. However, there is an emerging demand for better performance from the scoring methods. Consensus scoring (CS) methods improve the performance by compensating for the deficiencies of each scoring function. However, conventional CS and existing scoring functions have the same problems, such as a lack of protein flexibility, inadequate treatment of solvation, and the simplistic nature of the energy function used. Although there are many problems in current scoring functions, we focus our attention on the incorporation of unbound ligand conformations. To address this problem, we propose supervised consensus scoring (SCS), which takes into account the protein-ligand binding process using unbound ligand conformations with supervised learning. An evaluation of docking accuracy for 100 diverse protein-ligand complexes shows that SCS outperforms both CS and 11 scoring functions (PLP, F-Score, LigScore, DrugScore, LUDI, X-Score, AutoDock, PMF, G-Score, ChemScore, and D-Score). The success rates of SCS range from 89% to 91% in the range of rmsd < 2 Å, while those of CS range from 80% to 85%, and those of the scoring functions range from 26% to 76%. Moreover, we also introduce a method for judging whether a compound is active or inactive with the appropriate criterion for virtual screening. SCS performs quite well in docking accuracy and is presumably useful for screening large-scale compound databases before predicting binding affinity.
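
The success rates quoted in this abstract are fractions of complexes whose top-ranked pose lies within 2 Å rmsd of the native ligand. A minimal Python sketch of that bookkeeping, with invented rmsd values standing in for real docking results:

```python
def success_rate(rmsds, cutoff=2.0):
    """Fraction of complexes whose top pose has rmsd below the cutoff (in Å)."""
    return sum(r < cutoff for r in rmsds) / len(rmsds)

# Hypothetical top-pose rmsd values (Å) for four complexes.
rate = success_rate([0.5, 1.9, 2.0, 3.1])
```

Note that the comparison is strict, so a pose at exactly 2.0 Å does not count as a success in this sketch.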

16.
The docking program LigandFit/Cerius2 has been used to perform shape-based virtual screening of databases against the aspartic protease renin, a target of determined three-dimensional structure. The protein structure was used in the induced fit binding conformation that occurs when renin is bound to the highly active renin inhibitor 1 (IC50 = 2 nM). The scoring was calculated using several different scoring functions in order to get insight into the predictability of the magnitude of binding interactions. A database of 1000 diverse and druglike compounds, comprised of 990 members of a virtual database generated by using the iLib diverse software and 10 known active renin inhibitors, was docked flexibly and scored to determine appropriate scoring functions. All seven scoring functions used (LigScore1, LigScore2, PLP1, PLP2, JAIN, PMF, LUDI) were able to retrieve at least 50% of the active compounds within the first 20% (200 molecules) of the entire test database. A hit rate of 90% in the top 1.4% resulted using the quadruple consensus scoring of LigScore2, PLP1, PLP2, and JAIN. Additionally, a focused database was created with the iLib diverse software and used for the same procedure as the test database. Docking and scoring of the 990 focused compounds and the 10 known actives were performed. A hit rate of 100% in the top 8.4% resulted with use of the triple consensus scoring of PLP1, PLP2, and PMF. As expected, a ranking of the known active compounds within the focused database compared to the test database was observed. Adequate virtual screening conditions were derived empirically. They can be used for proximate docking and scoring application of compounds with putative renin inhibiting potency.

17.
Query expansion is the process of reformulating an original query to improve retrieval performance in information retrieval systems. Relevance feedback is one of the most useful query modification techniques in information retrieval systems. In this paper, we introduce query expansion into ligand-based virtual screening (LBVS) using the relevance feedback technique. In this approach, a few high-ranking molecules of unknown activity are filtered from the outputs of a Bayesian inference network based on a single ligand molecule to form a set of ligand molecules. This set of ligand molecules is used to form a new ligand molecule. Simulated virtual screening experiments with the MDL Drug Data Report and maximum unbiased validation data sets show that the use of ligand expansion provides a very simple way of improving the LBVS, especially when the active molecules being sought have a high degree of structural heterogeneity. However, the effectiveness of the ligand expansion is slightly less when structurally-homogeneous sets of actives are being sought.

18.
Summary: Structure-based screening using fully flexible docking is still too slow for large molecular libraries. High quality docking of a million molecule library can take days even on a cluster with hundreds of CPUs. This performance issue prohibits the use of fully flexible docking in the design of large combinatorial libraries. We have developed a fast structure-based screening method, which utilizes docking of a limited number of compounds to build a 2D QSAR model used to rapidly score the rest of the database. We compare here a model based on radial basis functions and a Bayesian categorization model. The number of compounds that need to be actually docked depends on the number of docking hits found. In our case studies reasonable quality models are built after docking of the number of molecules containing 50 docking hits. The rest of the library is screened by the QSAR model. Optionally a fraction of the QSAR-prioritized library can be docked in order to find the true docking hits. The quality of the model depends only on the training set size, not on the size of the library to be screened. Therefore, for larger libraries the method yields a higher gain in speed with no change in performance. Prioritizing a large library with these models provides a significant enrichment with docking hits: it attains the values of 13 and 35 at the beginning of the score-sorted libraries in our two case studies: screening of the NCI collection and a combinatorial library on the CDK2 kinase structure. With such enrichments, only a fraction of the database must actually be docked to find many of the true hits. The throughput of the method allows its use in screening of large compound collections and in the design of large combinatorial libraries. The strategy proposed has an important effect on efficiency but does not affect retrieval of actives, the latter being determined by the quality of the docking method itself.
Electronic supplementary material is available at http://dx.doi.org/10.1007/s10822-005-9002-6.

19.
20.
In silico methods play an essential role in modern drug discovery. Virtual screening, an in silico method, is used to narrow down the chemical space on which actual wet lab experiments need to be conducted. Ligand-based virtual screening is a computational strategy in which one builds a model of the target protein based on knowledge of the ligands that bind successfully to the target. This model is then used to predict whether a new molecule is likely to bind to the target. The support vector machine (SVM), a supervised learning algorithm used for classification, can be utilized for virtual screening of ligand data. When used for virtual screening, SVM can produce interesting results. But since the amount of ligand data is huge, the time taken to train the SVM model is quite high compared to other learning algorithms. By parallelizing these algorithms on multi-core processors, one can easily expedite these discoveries. In this paper, a GPU-based ligand-based virtual screening tool (GpuSVMScreen) which uses SVM is proposed and benchmarked. This data-parallel virtual screening tool provides high throughput by running in a short time. The proposed GpuSVMScreen can also successfully screen a large number of molecules (billions). The source code of this tool is available at http://ccc.nitc.ac.in/project/GPUSVMSCREEN.
