Similar literature
 20 similar documents found (search time: 15 ms)
1.
A major goal in drug design is the improvement of computational methods for docking and scoring. The Community Structure Activity Resource (CSAR) aims to collect available data from industry and academia which may be used for this purpose ( www.csardock.org ). CSAR is also charged with organizing community-wide exercises based on the collected data. The first of these exercises aimed to gauge the overall state of docking and scoring, using a large and diverse data set of protein-ligand complexes. Participants were asked to calculate the affinity of the complexes as provided and then recalculate with changes which may improve their specific method. This first data set was selected from existing PDB entries which had binding data (Kd or Ki) in Binding MOAD, augmented with entries from PDBbind. The final data set contains 343 diverse protein-ligand complexes and spans 14 pKd units. Sixteen proteins have three or more complexes in the data set, from which a user could start an inspection of congeneric series. Inherent experimental error limits the possible correlation between scores and measured affinity: Pearson R is limited to ~0.91 (R² ~0.83) when fitting to the data set without overparameterizing, and to ~0.83 (R² ~0.70) when scoring the data set with a method trained on outside data. The details of how the data set was initially selected, and the process by which it matured to better fit the needs of the community, are presented. Many groups generously participated in improving the data set, and this underscores the value of a supportive, collaborative effort in moving our field forward.
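The idea that experimental error caps the achievable correlation can be illustrated with a small simulation. This is only a sketch under stated assumptions: true affinities and measurement errors are both drawn from Gaussian distributions, and the standard deviations `sd_true` and `sd_err` are hypothetical inputs, not values taken from the CSAR data set. Under these assumptions, even a perfect predictor of the true affinity reaches an expected Pearson R of only sd_true / sqrt(sd_true² + sd_err²) against the noisy measurements.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def r_ceiling(sd_true, sd_err):
    """Expected R between error-free values and noisy measurements of them."""
    return sd_true / math.sqrt(sd_true ** 2 + sd_err ** 2)


def simulate_ceiling(sd_true, sd_err, n=20000, seed=0):
    """Empirical R of a 'perfect' predictor against simulated noisy measurements."""
    rng = random.Random(seed)
    true_vals = [rng.gauss(0.0, sd_true) for _ in range(n)]
    # Hypothetical measurement model: measured = true + Gaussian error.
    measured = [t + rng.gauss(0.0, sd_err) for t in true_vals]
    return pearson_r(true_vals, measured)
```

Comparing `simulate_ceiling(2.0, 1.0)` with `r_ceiling(2.0, 1.0)` shows the simulated correlation settling close to the analytical ceiling; a scoring function cannot be expected to exceed it on average, however good its physics.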

2.
Based on a statistical mechanics-based iterative method, we have extracted a set of distance-dependent, all-atom pairwise potentials for protein-ligand interactions from the crystal structures of 1300 protein-ligand complexes. The iterative method circumvents the long-standing reference state problem in knowledge-based scoring functions. The resulting scoring function, referred to as ITScore 2.0, has been tested with the CSAR (Community Structure-Activity Resource, 2009 release) benchmark of 345 diverse protein-ligand complexes. ITScore 2.0 achieved a Pearson correlation of R² = 0.54 in binding affinity prediction. A comparative analysis has been done on the scoring performances of ITScore 2.0, the van der Waals (VDW) scoring function, the VDW with heavy atoms only, and the force field (FF) scoring function of DOCK, which consists of a VDW term and an electrostatic term. The results reveal several important factors that affect scoring performance and could help improve scoring functions.

3.
4.
Scoring functions of protein–ligand interactions are widely used in computational docking software and structure-based drug discovery. Accurate prediction of the binding energy between the protein and the ligand is the main task of the scoring function. The accuracy of a scoring function is normally evaluated by testing it on benchmarks of protein–ligand complexes. In this work, we report the evaluation of an improved version of the SPecificity and Affinity (SPA) scoring function. Tested on two independent benchmarks, Community Structure-Activity Resource (CSAR) 2014 and Comparative Assessment of Scoring Functions (CASF) 2013, SPA is shown to be relatively more accurate than the other compared scoring functions in predicting the interactions between the protein and the ligand. We conclude that the inclusion of specificity in the optimization can effectively suppress the competitive states on the funnel-like binding energy landscape, and make SPA more accurate in identifying the “native” conformation and scoring the binding decoys. The evaluation of SPA highlights the importance of binding specificity in improving the accuracy of scoring functions.

5.
Molecular docking is a powerful computational method that has been widely used in many biomolecular studies to predict the geometry of a protein-ligand complex. However, while its conformational search algorithms are usually able to generate the correct conformation of a ligand in the binding site, the scoring methods often fail to discriminate it among many false variants. We propose to treat this problem by applying more precise ligand-specific scoring filters to re-rank docking solutions. In this way, specific features of interactions between the protein and different types of compounds can be implicitly taken into account. New scoring functions were constructed including hydrogen bonds, hydrophobic and hydrophilic complementarity terms. These scoring functions also discriminate ligands by the size of the molecule, the total hydrophobicity, and the number of peptide bonds for peptide ligands. Weighting coefficients of the scoring functions were adjusted using a training set of 60 protein-ligand complexes. The proposed method was then tested on the results of docking obtained for an additional 70 complexes. In both cases the success rate was 5-8% better compared to the standard functions implemented in popular docking software.

6.
Poor performance of scoring functions is a well-known bottleneck in structure-based virtual screening (VS), which is most frequently manifested in the scoring functions' inability to discriminate between true ligands and known nonbinders (therefore designated as binding decoys). This deficiency leads to a large number of false positive hits resulting from VS. We have hypothesized that filtering out or penalizing docking poses recognized as non-native (i.e., pose decoys) should improve the performance of VS in terms of improved identification of true binders. Using several concepts from the field of cheminformatics, we have developed a novel approach to identifying pose decoys from an ensemble of poses generated by computational docking procedures. We demonstrate that the use of a target-specific pose (scoring) filter in combination with a physical force field-based scoring function (MedusaScore) leads to significant improvement of hit rates in VS studies for 12 of the 13 benchmark sets from the clustered version of the Database of Useful Decoys (DUD). This new hybrid scoring function outperforms several conventional structure-based scoring functions, including XSCORE::HMSCORE, ChemScore, PLP, and Chemgauss3, in 6 out of 13 data sets at the early stage of VS (up to 1% decoys of the screening database). We compare our hybrid method with several novel VS methods that were recently reported to have good performances on the same DUD data sets. We find that the ligands retrieved using our method are chemically more diverse in comparison with two ligand-based methods (FieldScreen and FLAP::LBX). We also compare our method with FLAP::RBLB, a high-performance VS method that also utilizes both the receptor and the cognate ligand structures. Interestingly, we find that the top ligands retrieved using our method are highly complementary to those retrieved using FLAP::RBLB, hinting at effective directions for best VS applications. We suggest that this integrative VS approach combining cheminformatics and molecular mechanics methodologies may be applied to a broad variety of protein targets to improve the outcome of structure-based drug discovery studies.

7.
Molecular docking is a powerful computational method that has been widely used in many biomolecular studies to predict the geometry of a protein-ligand complex. However, while its conformational search algorithms are usually able to generate the correct conformation of a ligand in the binding site, the scoring methods often fail to discriminate it among many false variants. We propose to treat this problem by applying more precise ligand-specific scoring filters to re-rank docking solutions. In this way, specific features of interactions between the protein and different types of compounds can be implicitly taken into account. New scoring functions were constructed including hydrogen bonds, hydrophobic and hydrophilic complementarity terms. These scoring functions also discriminate ligands by the size of the molecule, the total hydrophobicity, and the number of peptide bonds for peptide ligands. Weighting coefficients of the scoring functions were adjusted using a training set of 60 protein–ligand complexes. The proposed method was then tested on the results of docking obtained for an additional 70 complexes. In both cases the success rate was 5–8% better compared to the standard functions implemented in popular docking software.

8.
A new method for the postprocessing of docking outputs has been developed, based on encoding putative 3D binding modes (docking solutions) as ligand-protein interactions into simple bit strings, a method analogous to the structural interaction fingerprint. Instead of employing traditional scoring functions, the method uses a series of new, knowledge-based scores derived from the similarity of the bit strings for each docking solution to that of a known reference binding mode. A GOLD docking study was carried out using the Bissantz estrogen receptor antagonist set along with the new scoring method. Superior recovery rates, with up to 2-fold enrichments, were observed when the new knowledge-based scoring was compared to the GOLD fitness score. In addition, top ranking sets of molecules (actives and potential actives or decoys) were structurally diverse with low molecular weights and structural complexities. Principal component analysis and clustering of the fingerprints permit the easy separation of active from inactive binding modes and the visualization of diverse binding modes.
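Scoring a docking solution by the similarity of its interaction bit string to a reference binding mode can be sketched as follows. The abstract does not say which similarity measure the authors used; the Tanimoto coefficient here is an assumption, and the function names and the string encoding of fingerprints (`'1'` for an interaction present, `'0'` for absent) are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two equal-length bit strings such as '10110'."""
    if len(bits_a) != len(bits_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(bits_a, bits_b) if a == "1" and b == "1")
    either = sum(1 for a, b in zip(bits_a, bits_b) if a == "1" or b == "1")
    # Two all-zero fingerprints share no interactions; treat similarity as 0.
    return both / either if either else 0.0


def rank_poses(reference, poses):
    """Sort (pose_id, fingerprint) pairs by similarity to the reference, best first."""
    scored = [(pose_id, tanimoto(reference, fp)) for pose_id, fp in poses]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

For example, `rank_poses("1100", [("a", "0011"), ("b", "1100"), ("c", "1000")])` places pose `b` first because it reproduces every interaction of the reference mode.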

9.
10.
Solvated interaction energy (SIE) is an end-point physics-based scoring function for predicting binding affinities from force-field nonbonded interaction terms, continuum solvation, and configurational entropy linear compensation. We tested the SIE function in the Community Structure-Activity Resource (CSAR) scoring challenge consisting of high-resolution cocrystal structures for 343 protein-ligand complexes with high-quality binding affinity data and high diversity with respect to protein targets. Particular emphasis was placed on the sensitivity of SIE predictions to the assignment of protonation and tautomeric states in the complex and the treatment of metal ions near the protein-ligand interface. These were manually curated from an originally distributed CSAR-HiQ data set version, leading to the currently distributed CSAR-NRC-HiQ version. We found that this manual curation was a critical step for accurately testing the performance of the SIE function. The standard SIE parametrization, previously calibrated on an independent data set, predicted absolute binding affinities with a mean unsigned error (MUE) of 2.41 kcal/mol for the CSAR-HiQ version, which improved to 1.98 kcal/mol for the upgraded CSAR-NRC-HiQ version. Half-half retraining-testing of SIE parameters on two predefined subsets of CSAR-NRC-HiQ led to only marginal further improvements to an MUE of 1.83 kcal/mol. Hence, we do not recommend altering the current default parameters of SIE at this time. For a sample of SIE outliers, additional calculations by molecular dynamics-based SIE averaging with or without incorporation of ligand strain, by MM-PB(GB)/SA methods with or without entropic estimates, or even by the linear interaction energy (LIE) formalism with an explicit solvent model, did not further improve predictions.
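The mean unsigned error reported above is the average absolute difference between predicted and experimental binding free energies. A minimal sketch follows; the function name is hypothetical, and the inputs are assumed to be in the same units (for example kcal/mol) and in the same complex order.

```python
def mean_unsigned_error(predicted, experimental):
    """Average of |predicted - experimental| over paired values."""
    if len(predicted) != len(experimental):
        raise ValueError("inputs must have the same length")
    if not predicted:
        raise ValueError("inputs must not be empty")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)
```

Because the MUE is computed on absolute affinities, it penalizes a constant offset that a correlation coefficient would ignore, which is why it suits the evaluation of a pre-calibrated function such as SIE.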

11.
For widely applied in silico screening techniques, success depends on the rational selection of an appropriate method. We herein present a fast, versatile, and robust method to construct demanding evaluation kits for objective in silico screening (DEKOIS). This automated process enables the creation of tailor-made decoy sets for any given set of bioactives. It facilitates a target-dependent validation of docking algorithms and scoring functions, helping to save time and resources. We have developed metrics for assessing and improving decoy set quality and employ them to investigate how decoy embedding affects docking. We demonstrate that screening performance is target-dependent and can be impaired by latent actives in the decoy set (LADS) or enhanced by poor decoy embedding. The presented method allows extending and complementing the collection of publicly available high-quality decoy sets toward new target space. All present and future DEKOIS data sets will be made accessible at www.dekois.com.

12.
13.
14.
15.
Target-specific optimization of scoring functions for protein–ligand docking is an effective method for significantly improving the discrimination of active and inactive molecules in virtual screening applications. Its applicability, however, is limited due to the narrow focus on, e.g., single protein structures. Using an ensemble of protein kinase structures, the publicly available Directory of Useful Decoys ligand dataset, and a novel multi-factorial optimization procedure, it is shown here that scoring functions can be tuned to multiple targets of a target class simultaneously. This leads to an improved robustness of the resulting scoring function parameters. Extensive validation experiments clearly demonstrate that (1) virtual screening performance for kinases improves significantly; (2) variations in database content affect this kind of machine-learning strategy to a lesser extent than binary QSAR models; and (3) the reweighting of interaction types is of particular importance for improved screening performance.

16.
Virtual Screening (VS) is a computational technique that allows selection and ranking of possible hits from a library of compounds. We have carried out VS on 128 selected EGFR kinase inhibitors with GOLD and LigandFit. From the experimental crystal structure of the erlotinib-EGFR complex, three key hydrogen bonds were identified as responsible for anchoring the ligand in the active site. These are of the N-H...N, O(w)-H...N, and C-H...O types. Failure to include the hydrogen-bonded water molecule that forms the O(w)-H...N bond leads to incorrect results. Of the three interactions, the C-H...O formed by an activated C-H group is the best conserved. On the basis of the efficacy of these hydrogen bonds, the poses were classified into one of three categories: close, shifted, and misoriented. In the VS context, all three interactions need to be modeled correctly so that correct poses and affinities are obtained, and this happens in ligands of the close variety. Cross scoring, wherein the poses from one program are input into another for scoring, and consensus scoring, wherein the scores from various software packages are weighted, are also helpful in obtaining better agreement.
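Consensus scoring, as described above, combines weighted scores from several programs. Since different programs report scores on different scales, the sketch below first converts each program's scores to z-scores; that normalization, the weights, and the convention that a higher score is better are all assumptions made for illustration, not details given in the abstract.

```python
import statistics


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mean) / sd for v in values]


def consensus(scores_by_program, weights):
    """Weighted average of per-program z-scores for each ligand.

    scores_by_program maps a program name to a list of scores (higher = better),
    with every list in the same ligand order; weights maps program name to weight.
    """
    programs = list(scores_by_program)
    n_ligands = len(scores_by_program[programs[0]])
    total_weight = sum(weights[p] for p in programs)
    combined = [0.0] * n_ligands
    for program in programs:
        for i, z in enumerate(z_scores(scores_by_program[program])):
            combined[i] += weights[program] * z / total_weight
    return combined
```

If one program reports scores where lower is better, its values would need to be negated before being passed in.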

17.
In computational biology, processes such as docking, binding, and folding are often described by simplified, empirical models. These models are fitted to physical properties of the process by adjustable parameters. An appropriate choice of these parameters is crucial for the quality of the models. Locating the best choices for the parameters is often a difficult task, depending on the complexity of the model. We describe a new method and program, POEM (Parameter Optimization using Ensemble Methods), for this task. In POEM we combine the DOE (Design Of Experiment) procedure with ensembles of different regression methods. We apply the method to the optimization of target-specific scoring functions in molecular docking. The method consists of an iterative procedure that uses alternate evaluation and prediction steps. During each cycle of optimization we fit an approximate function to a defined loss function landscape and improve the quality of this fit from cycle to cycle by constantly augmenting our data set. As test applications we fitted the FlexX and Screenscore scoring functions to the kinase and ATPase protein classes. The results are promising: starting from random parameters, we are able to locate parameter sets which show superior performance compared to the original values. The POEM approach converges quickly and the approximated loss function landscapes are smooth, thus making the approach a suitable method for optimizations on rugged landscapes.

18.
Empirical scoring functions provide estimates of the free energy of protein-ligand binding in situations when atomic-scale simulations are intractable, for example, in virtual high-throughput screening. Currently, such scoring functions are often inaccurate, and further improvements are complicated by the lack of reliable training data, the complex interplay between scoring functions and docking algorithms, and an inconsistent statistical treatment of positive and negative training data. In comparison to various other performance measures of scoring functions, "analysis of variance" provides a well-behaved objective function for optimization, which focuses on the signal-to-noise ratio of ligand-decoy discrimination. In combination with a large database of ligands and decoys, an in situ optimization of scoring function parameters was able to generate improved, target-specific scoring functions for three different proteins of pharmaceutical interest: cyclin-dependent kinase 2, the estrogen receptor, and cyclooxygenase-2. Statistical analysis of the improvements observed in "receiver-operating characteristic" curves showed that the optimized scoring functions achieved a significantly (between p < 0.0001 and p < 0.05) higher enrichment of true ligands. A scaffold dependence of the resulting binding modes was observed, which is discussed in conjunction with the rigid receptor hypothesis commonly made in protein-ligand docking. In summary, the approach described here represents a well-adapted statistical method for setting up scoring functions.
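The signal-to-noise view of ligand-decoy discrimination can be made concrete with the one-way ANOVA F statistic, the ratio of between-group to within-group variance of the scores. The sketch below is the textbook F statistic only; how the authors turned it into an optimization objective is not specified in the abstract, and the function name is hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of score groups (e.g. ligands, decoys)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Signal: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Noise: spread of individual scores around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))
```

A larger F means that ligand and decoy score distributions are separated by more than their internal scatter, so maximizing it over scoring-function parameters rewards discrimination rather than the ranking of a few top compounds.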

19.
The efficiency of scoring functions for hit identification is usually quantified in terms of enrichment factors and enrichment curves. Close inspection of simulated and real score distributions from virtual screening, however, suggests that 'analysis of variance' (ANOVA) is a more reliable method for assessing their performance. Using ANOVA to quantify the discriminatory power of scoring functions with respect to ligands, decoys, and a reproducible reference database has the potential to facilitate the advancement of scoring functions significantly.
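For comparison with ANOVA, the conventional enrichment factor can be sketched as the hit rate in the top-ranked fraction of the database divided by the hit rate in the whole database. This is the common textbook definition, not necessarily the exact variant used by the authors; the function name and the higher-is-better score convention are assumptions.

```python
def enrichment_factor(scores, is_active, fraction):
    """Enrichment factor in the top `fraction` of a ranked database.

    scores: one score per compound (higher = better, by assumption).
    is_active: booleans in the same order as scores.
    fraction: size of the top slice, e.g. 0.01 for the top 1%.
    """
    n = len(scores)
    total_actives = sum(1 for a in is_active if a)
    if total_actives == 0:
        raise ValueError("database contains no actives")
    n_top = max(1, int(round(n * fraction)))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits_top = sum(1 for i in order[:n_top] if is_active[i])
    return (hits_top / n_top) / (total_actives / n)
```

The result depends only on how many actives fall above a single cutoff, which hints at why a variance-based measure that uses the full score distributions may be more stable.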

20.
A dataset of protein‐drug complexes with experimental binding energy and crystal structure was analyzed, and the performance of different docking engines and scoring functions (as well as components of these) for predicting the free energy of binding and several ligand efficiency indices was compared. The aim was not to evaluate the best docking method, but to determine the effect of different efficiency indices on the experimental and predicted free energy. Some ligand efficiency indices, such as ΔG/W (Wiener index), ΔG/NoC (number of carbons), and ΔG/P (partition coefficient), improve the correlation between experimental and calculated values. This effect was shown to be valid across the different scoring functions and docking programs. It also removes the common bias of scoring functions in favor of larger ligands. For all scoring functions, the efficiency indices effectively normalize the free energy derived indices, to give values closer to experiment. Compound collection filtering can be done before or after docking, using pharmacokinetic as well as pharmacodynamic profiles. Achieving these better correlations with experiment can improve the ability of docking scoring functions to predict active molecules in virtual screening. © 2009 Wiley Periodicals, Inc. J Comput Chem 2010
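A ligand efficiency index of the kind listed above is simply the binding free energy divided by a size-related descriptor. The sketch below uses ΔG/NoC as the example; the function names and the tuple layout are hypothetical, and the descriptor values would have to come from a cheminformatics toolkit that is not shown here.

```python
def efficiency_index(delta_g, descriptor):
    """Binding free energy divided by a descriptor such as the number of carbons."""
    if descriptor == 0:
        raise ValueError("descriptor must be non-zero")
    return delta_g / descriptor


def rank_by_efficiency(ligands):
    """Sort (name, delta_g, n_carbons) tuples by dG/NoC, most negative first."""
    return sorted(ligands, key=lambda lig: efficiency_index(lig[1], lig[2]))
```

With `rank_by_efficiency([("big", -12.0, 30), ("small", -8.0, 10)])`, the smaller ligand ranks first even though its raw ΔG is weaker, which illustrates how normalizing by size counteracts the bias toward larger ligands.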
