Similar literature
20 similar articles found.
1.
Molecular docking is a powerful computational method that has been widely used in biomolecular studies to predict the geometry of a protein-ligand complex. However, while its conformational search algorithms are usually able to generate the correct conformation of a ligand in the binding site, the scoring methods often fail to discriminate it from the many false variants. We propose to address this problem by applying more precise ligand-specific scoring filters to re-rank docking solutions. In this way, specific features of the interactions between a protein and different types of compounds can be implicitly taken into account. New scoring functions were constructed that include hydrogen-bond, hydrophobic, and hydrophilic complementarity terms. These scoring functions also discriminate ligands by the size of the molecule, the total hydrophobicity, and, for peptide ligands, the number of peptide bonds. Weighting coefficients of the scoring functions were adjusted using a training set of 60 protein–ligand complexes. The proposed method was then tested on docking results obtained for an additional 70 complexes. In both cases the success rate was 5–8% better than that of the standard functions implemented in popular docking software.
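
The re-ranking idea in the abstract above can be illustrated with a minimal Python sketch. It assumes a plain weighted sum of the three complementarity terms and one weight set per ligand class; the term names, the class names, and every weight value are hypothetical placeholders, since the abstract reports neither the functional form nor the fitted coefficients.

```python
from typing import Dict, List

# Hypothetical weights per ligand class; the abstract gives no actual values.
WEIGHTS: Dict[str, Dict[str, float]] = {
    "peptide": {"hbond": 1.0, "hydrophobic": 0.5, "hydrophilic": 0.8},
    "small_molecule": {"hbond": 0.7, "hydrophobic": 1.0, "hydrophilic": 0.4},
}


def rescore(terms: Dict[str, float], ligand_class: str) -> float:
    """Weighted sum of interaction terms (assumption: higher is better)."""
    weights = WEIGHTS[ligand_class]
    return sum(weights[name] * value for name, value in terms.items())


def rerank(poses: List[Dict[str, float]], ligand_class: str) -> List[int]:
    """Return pose indices ordered from best to worst rescored value."""
    scores = [rescore(pose, ligand_class) for pose in poses]
    return sorted(range(len(poses)), key=lambda i: scores[i], reverse=True)


poses = [
    {"hbond": 1.0, "hydrophobic": 4.0, "hydrophilic": 0.0},
    {"hbond": 4.0, "hydrophobic": 1.0, "hydrophilic": 0.0},
]
peptide_order = rerank(poses, "peptide")
small_molecule_order = rerank(poses, "small_molecule")
```

With these placeholder weights the same two poses are ordered differently for the two ligand classes, which is the point of a ligand-specific filter.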

2.
Empirical scoring functions used in protein-ligand docking calculations are typically trained on a dataset of complexes with known affinities with the aim of generalizing across different docking applications. We report a novel method of scoring-function optimization that supports the use of additional information to constrain scoring function parameters, which can be used to focus a scoring function’s training towards a particular application, such as screening enrichment. The approach combines multiple instance learning, positive data in the form of ligands of protein binding sites of known and unknown affinity and binding geometry, and negative (decoy) data of ligands thought not to bind particular protein binding sites or known not to bind in particular geometries. Performance of the method for the Surflex-Dock scoring function is shown in cross-validation studies and in eight blind test cases. Tuned functions optimized with a sufficient amount of data exhibited either improved or undiminished screening performance relative to the original function across all eight complexes. Analysis of the changes to the scoring function suggests that modifications can be learned that are related to protein-specific features such as active-site mobility.

3.
A successful protein–protein docking study culminates in the identification of decoys with near‐native quaternary structures at the top ranks. However, this task remains difficult because no generalized scoring functions exist that effectively rank decoys according to their similarity to near‐native quaternary structures. Difficulties arise because of the highly irregular nature of the protein surface and the significant variation of the nonbonding and solvation energies with the chemical composition of the protein–protein interface. In this work, we describe a novel method combining an interface‐size filter, a regression model for geometric compatibility (based on two correlated surface and packing parameters), and normalized interaction energy (calculated from correlated nonbonded and solvation energies) to effectively rank decoys from a set of 10,000 decoys. Tests on 30 unbound binary protein–protein complexes show that in 16 cases we can identify at least one decoy in the top three ranks having ≤10 Å backbone root mean square deviation from the true binding geometry. Comparisons with other state‐of‐the‐art methods confirm the improved ranking power of our method without the use of any experiment‐guided restraints, evolutionary information, statistical propensities, or modified interaction energy equations. Tests on 118 less‐difficult bound binary protein–protein complexes with ≤35% sequence redundancy at the interface showed that in 77% of cases, at least 1 of the 10,000 decoys was identified with ≤5 Å backbone root mean square deviation from the true geometry at first rank. The work will promote the use of new concepts where correlations among parameters provide more robust scoring models. It will facilitate studies involving molecular interactions, including modeling of large macromolecular assemblies and protein structure prediction. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2011.

4.
We have derived, in the context of the Rigid Rotor Harmonic Approximation (RRHO), a general expression for the dissociation free energy in ligand–receptor systems that is independent of mass and of Planck's constant h, featuring an additive, systematically negative (anti‐binding) entropic term that depends on readily available ligand–receptor quantities. The proposed RRHO expression allows the absolute standard dissociation free energy to be computed straightforwardly, without resorting to expensive normal mode analysis or other dynamical matrix‐based techniques for evaluating the entropic contribution, hence providing an effective scoring function for assessing docking poses with no adjustable parameters. Our RRHO formula was tested on a set of 55 ligand–receptor systems, obtaining correlation coefficients and unsigned mean errors comparable to or better than those obtained with computationally demanding techniques for assessing the dissociation entropy. The proposed compact reformulation of the RRHO entropy term could constitute the basis for new and more effective scoring functions in molecular docking‐based high‐throughput virtual screening for drug discovery. © 2016 Wiley Periodicals, Inc.

5.
Empirical scoring functions provide estimates of the free energy of protein-ligand binding in situations when atomic-scale simulations are intractable, for example, in virtual high-throughput screening. Currently, such scoring functions are often inaccurate, and further improvements are complicated by the lack of reliable training data, the complex interplay between scoring functions and docking algorithms, and an inconsistent statistical treatment of positive and negative training data. In comparison to various other performance measures of scoring functions, "analysis of variance" provides a well-behaved objective function for optimization, which focuses on the signal-to-noise ratio of ligand-decoy discrimination. In combination with a large database of ligands and decoys, an in situ optimization of scoring function parameters was able to generate improved, target-specific scoring functions for three different proteins of pharmaceutical interest: cyclin-dependent kinase 2, the estrogen receptor, and cyclooxygenase-2. Statistical analysis of the improvements observed in "receiver-operating characteristic" curves showed that the optimized scoring functions achieved a significantly (between p < 0.0001 and p < 0.05) higher enrichment of true ligands. A scaffold dependence of the resulting binding modes was observed, which is discussed in conjunction with the rigid receptor hypothesis commonly made in protein-ligand docking. In summary, the approach described here represents a well-adapted statistical method for setting up scoring functions.

6.
A dataset of protein‐drug complexes with experimental binding energies and crystal structures was analyzed, and the performance of different docking engines and scoring functions (as well as components of these) in predicting the free energy of binding and several ligand efficiency indices was compared. The aim was not to identify the best docking method, but to determine the effect of different efficiency indices on the experimental and predicted free energy. Some ligand efficiency indices, such as ΔG/W (Wiener index), ΔG/NoC (number of carbons), and ΔG/P (partition coefficient), improve the correlation between experimental and calculated values. This effect was shown to hold across the different scoring functions and docking programs. It also removes the common bias of scoring functions in favor of larger ligands. For all scoring functions, the efficiency indices effectively normalize the free-energy-derived indices, giving values closer to experiment. Compound collection filtering can be done before or after docking, using pharmacokinetic as well as pharmacodynamic profiles. Achieving these better correlations with experiment can improve the ability of docking scoring functions to predict active molecules in virtual screening. © 2009 Wiley Periodicals, Inc. J Comput Chem 2010
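
As a small worked illustration of the normalization described in the abstract above, the Python sketch below divides a binding free energy by the number of carbon atoms (the ΔG/NoC index). The two ligands and their values are invented for illustration only and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def ligand_efficiency(delta_g: float, size: float) -> float:
    """Size-normalized free energy, e.g. delta_g / NoC (number of carbons)."""
    if size <= 0:
        raise ValueError("size descriptor must be positive")
    return delta_g / size


def rank_by(values: Dict[str, float]) -> List[str]:
    """Order ligand names from most to least negative value (best first)."""
    return sorted(values, key=lambda name: values[name])


# Hypothetical ligands: (predicted delta_g in kcal/mol, number of carbons).
ligands: Dict[str, Tuple[float, int]] = {
    "large_ligand": (-12.0, 30),
    "small_ligand": (-8.0, 10),
}

raw = {name: dg for name, (dg, _) in ligands.items()}
normalized = {name: ligand_efficiency(dg, noc) for name, (dg, noc) in ligands.items()}

raw_order = rank_by(raw)
normalized_order = rank_by(normalized)
```

In this toy case the raw score ranks the larger ligand first, while the per-carbon index ranks the smaller ligand first, which mirrors the size bias the abstract says the efficiency indices remove.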

7.
In small-molecule docking, the scoring and ranking of generated conformations is an important, though still not completely resolved, problem. Rescoring schemes often improve the quality of the obtained rankings. It is known that a local optimization is essential before a valid rescore value can be calculated. Here, we present a method that improves rescoring results obtained with the DrugScore function by means of a new optimization technique. The method implements a more sophisticated search algorithm than the classic local optimization procedures used in this context. We validated the proposed method on a set of 192 protein-ligand complexes. Results show substantial improvements over the original docking results, with success rates increased by up to 10% for top-scored solutions below 2 Å root-mean-square deviation from the native state and by up to 18% below 1 Å.
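
Success rates of the kind quoted in the abstract above are typically computed from the RMSD between the top-scored pose and the native ligand position. The Python sketch below shows one plausible way to do this; it assumes matched atom ordering, no superposition (poses are compared in the receptor frame), and no correction for ligand symmetry, none of which is specified in the abstract.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(pose: Sequence[Coord], native: Sequence[Coord]) -> float:
    """In-place heavy-atom RMSD between two poses with matched atom order."""
    if len(pose) != len(native) or not pose:
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(
        (px - nx) ** 2 + (py - ny) ** 2 + (pz - nz) ** 2
        for (px, py, pz), (nx, ny, nz) in zip(pose, native)
    )
    return math.sqrt(squared / len(pose))


def success_rate(top_pose_rmsds: List[float], cutoff: float = 2.0) -> float:
    """Fraction of complexes whose top-scored pose lies below the cutoff."""
    if not top_pose_rmsds:
        raise ValueError("at least one complex is required")
    return sum(1 for value in top_pose_rmsds if value < cutoff) / len(top_pose_rmsds)


# Toy data: a two-atom ligand shifted by 1 Å along z, and four complexes.
example_rmsd = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
example_rate = success_rate([0.5, 1.9, 2.0, 3.1])
```

The strict inequality follows the "below 2 Å" wording; a study using "at or below" would change the comparison accordingly.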

8.
This review gives an introduction to ligand-receptor docking and illustrates the basic underlying concepts. An overview of different approaches and algorithms is provided. Although the application of docking and scoring has led to some remarkable successes, there are still some major challenges ahead, which are outlined here as well. Approaches to address some of these challenges and the latest developments in the area are presented. Some aspects of the assessment of docking program performance are discussed. A number of successful applications of structure-based virtual screening are described.

9.
Docking and scoring are critical issues in virtual drug screening methods. Fast and reliable methods are required for the prediction of binding affinity, especially when applied to a large library of compounds. The implementation of receptor flexibility and refinement of scoring functions for this purpose are extremely challenging in terms of computational speed. Here we propose a knowledge-based multiple-conformation docking method that efficiently accommodates receptor flexibility, thus permitting reliable virtual screening of large compound libraries. Starting with a small number of active compounds, a preliminary docking operation is conducted on a large ensemble of receptor conformations to select the minimal subset of receptor conformations that provides a strong correlation between the experimental binding affinity (e.g., Ki, IC50) and the docking score. Only this subset is used for subsequent multiple-conformation docking of the entire data set of library (test) compounds. In conjunction with the multiple-conformation docking procedure, a two-step scoring scheme is employed by which the optimal scoring geometries obtained from the multiple-conformation docking are re-scored by a molecular mechanics energy function including desolvation terms. To demonstrate the feasibility of this approach, we applied this integrated approach to the estrogen receptor alpha (ERα) system, for which published binding affinity data were available for a series of structurally diverse chemicals. The statistical correlation between docking scores and experimental values was significantly improved from those of single-conformation dockings. This approach led to substantial enrichment of the virtual screening conducted on mixtures of active and inactive ERα compounds.

10.
Docking scoring functions are notoriously weak predictors of binding affinity. They typically assign a common set of weights to the individual energy terms that contribute to the overall energy score; however, these weights should be gene family dependent. In addition, they incorrectly assume that individual interactions contribute toward the total binding affinity in an additive manner. In reality, noncovalent interactions often depend on one another in a nonlinear manner. In this paper, we show how the use of support vector machines (SVMs), trained by associating sets of individual energy terms retrieved from molecular docking with the known binding affinity of each compound from high-throughput screening experiments, can be used to improve the correlation between known binding affinities and those predicted by the docking program eHiTS. We construct two prediction models: a regression model trained using IC50 values from BindingDB, and a classification model trained using active and decoy compounds from the Directory of Useful Decoys (DUD). Moreover, to address the issue of overrepresentation of negative data in high-throughput screening data sets, we have designed a multiple-planar SVM training procedure for the classification model. The increased performance that both SVMs give when compared with the original eHiTS scoring function highlights the potential for using nonlinear methods when deriving overall energy scores from their individual components. We apply the above methodology to train a new scoring function for direct inhibitors of Mycobacterium tuberculosis (M.tb) InhA. By combining ligand binding site comparison with the new scoring function, we propose that phosphodiesterase inhibitors can potentially be repurposed to target M.tb InhA. Our methodology may be applied to other gene families for which target structures and activity data are available, as demonstrated in the work presented here.

11.
It has been notoriously difficult to develop general all-purpose scoring functions for high-throughput docking that correlate with measured binding affinity. As a practical alternative, AutoShim uses the program Magnet to add point-pharmacophore-like "shims" to the binding site of each protein target. The pharmacophore shims are weighted by partial least-squares (PLS) regression, adjusting the all-purpose scoring function to reproduce IC50 data, much as the shims in an NMR magnet are weighted to optimize the field for a better spectrum. This dramatically improves the affinity predictions on 25% of the compounds held out at random. An iterative procedure chooses the best pose during the process of shim parametrization. This method reproducibly converges to a consistent solution, regardless of starting pose, in just 2-4 iterations, so these robust models do not overtrain. Sets of complex multifeature shims, generated by a recursive partitioning method, give the best activity predictions, but these are difficult to interpret when designing new compounds. Sets of simpler single-point pharmacophores still predict affinity reasonably well and clearly indicate the molecular interactions producing effective binding. The pharmacophore requirements are very reproducible, irrespective of the compound sets used for parametrization, lending confidence to the predictions and interpretations. The automated procedure does require a training set of experimental compounds but otherwise adds little extra time over conventional docking.

12.
For widely applied in silico screening techniques, success depends on the rational selection of an appropriate method. We herein present a fast, versatile, and robust method to construct demanding evaluation kits for objective in silico screening (DEKOIS). This automated process enables creating tailor-made decoy sets for any given set of bioactives. It facilitates a target-dependent validation of docking algorithms and scoring functions, helping to save time and resources. We have developed metrics for assessing and improving decoy set quality and employ them to investigate how decoy embedding affects docking. We demonstrate that screening performance is target-dependent and can be impaired by latent actives in the decoy set (LADS) or enhanced by poor decoy embedding. The presented method allows extending and complementing the collection of publicly available high-quality decoy sets toward new target space. All present and future DEKOIS data sets will be made accessible at www.dekois.com.

13.
To improve the performance of a single scoring function used in a protein-ligand docking program, we developed a bootstrap-based consensus scoring (BBCS) method, which is based on ensemble learning. BBCS combines multiple scorings, each of which has the same functional form but a different energy-parameter set. These multiple energy-parameter sets are generated in two steps: (1) generation of training sets by a bootstrap method and (2) optimization of an energy-parameter set against each training set by a Z-score approach, which is based on energy landscape theory as used in protein folding. In this study, we applied BBCS to the FlexX scoring function. Using a given set of 50 complexes, we generated 100 training sets and obtained 100 optimized energy-parameter sets. These parameter sets were tested against 48 complexes different from the training sets. BBCS was shown to be an improvement over single scoring when using a parameter set optimized by the same Z-score approach. Comparing BBCS with the original FlexX scoring function, we found that (1) the success rate of recognizing the crystal structure at the top relative to decoys increased from 33.3% to 52.1% and that (2) the rank of the crystal structure improved for 54.2% of the complexes and worsened for none. We also found that BBCS performed better than conventional consensus scoring (CS).
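
The two-step procedure in the abstract above (bootstrap training sets, then one parameter set per training set) can be outlined in Python as follows. This is only a sketch under stated assumptions: the scoring function is taken to be linear in its energy terms, the consensus is taken to be a simple average over parameter sets (the abstract does not say how the scorings are combined), and the Z-score optimization step is omitted entirely, with hand-written parameter sets standing in for its output. The complex names are placeholders.

```python
import random
from typing import List, Sequence


def bootstrap_sets(complexes: Sequence[str], n_sets: int, seed: int = 0) -> List[List[str]]:
    """Draw n_sets training sets by sampling complexes with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(complexes) for _ in complexes] for _ in range(n_sets)]


def linear_score(terms: Sequence[float], params: Sequence[float]) -> float:
    """Same functional form for every member: a weighted sum of energy terms."""
    return sum(weight * term for weight, term in zip(params, terms))


def consensus_score(terms: Sequence[float], parameter_sets: Sequence[Sequence[float]]) -> float:
    """Assumed consensus rule: mean score over all parameter sets."""
    scores = [linear_score(terms, params) for params in parameter_sets]
    return sum(scores) / len(scores)


# 50 placeholder complexes resampled into 100 training sets, as in the abstract.
complexes = [f"complex_{i:02d}" for i in range(50)]
training_sets = bootstrap_sets(complexes, n_sets=100)

# Stand-ins for parameter sets that a per-training-set optimization would yield.
parameter_sets = [[1.0, 1.0], [3.0, 1.0]]
example = consensus_score([1.0, 2.0], parameter_sets)
```

In a real application each entry of `training_sets` would feed an optimizer that returns one parameter set, and the consensus would then be evaluated on complexes held out from all training sets.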

14.
We present a novel scoring function for docking of small molecules to protein binding sites. The scoring function is based on a combination of two main approaches used in the field, the empirical and knowledge-based approaches. To calibrate the scoring function we used an iterative procedure in which a ligand's position and its score were determined self-consistently at each iteration. The scoring function demonstrated superiority in prediction of ligand positions in docking tests against the commonly used Dock, FlexX and Gold docking programs. It also demonstrated good accuracy of binding affinity prediction for the docked ligands.

15.
We present results of testing the ability of eleven popular scoring functions to predict native docked positions using a recently developed method (Ruvinsky and Kozintsev, J Comput Chem 2005, 26, 1089) for estimating the entropy contributions of relative motions to protein-ligand binding affinity. The method is based on the integration of the configurational integral over clusters obtained from multiple docked positions. We use a test set of 100 PDB protein-ligand complexes and ensembles of 101 docked positions generated by Wang et al. (J Med Chem 2003, 46, 2287) for each ligand in the test set. To test the suggested method we compared the averaged root-mean-square deviations (RMSD) of the top-scored ligand docked positions, accounting and not accounting for entropy contributions, relative to the experimentally determined positions. We demonstrate that the method increases docking accuracy by 10-21% when used in conjunction with the AutoDock scoring function, by 2-25% with G-Score, by 7-41% with D-Score, by 0-8% with LigScore, by 1-6% with PLP, by 0-12% with LUDI, by 2-8% with F-Score, by 7-29% with ChemScore, by 0-9% with X-Score, by 2-19% with PMF, and by 1-7% with DrugScore. We also compared the performance of the suggested method with the method based on ranking by cluster occupancy only. We analyze how the choice of a clustering RMSD and a lower bound for dense clusters affects the docking accuracy of the scoring methods. We derive optimal intervals of the clustering RMSD for the 11 scoring functions.

16.
Virtual screening by molecular docking has become a widely used approach to lead discovery in the pharmaceutical industry when a high-resolution structure of the biological target of interest is available. The performance of three widely used docking programs (Glide, GOLD, and DOCK) for virtual database screening is studied when they are applied to the same protein target and ligand set. Comparisons of the docking programs and scoring functions using a large and diverse data set of pharmaceutically interesting targets and active compounds are carried out. We focus on the problem of docking and scoring flexible compounds which are sterically capable of docking into a rigid conformation of the receptor. The Glide XP methodology is shown to consistently yield enrichments superior to the two alternative methods, while GOLD outperforms DOCK on average. The study also shows that docking into multiple receptor structures can decrease the docking error in screening a diverse set of active compounds.

17.
Scoring forms a major obstacle to the success of any docking study. In general, fast scoring functions perform poorly when used to determine the relative affinity of ligands for their receptors. In this study, the objective was not to rank compounds with confidence but simply to identify a scoring method which could provide a 4-fold hit enrichment in a screening sample over random selection. To this end, LigandFit, a fast shape matching docking algorithm, was used to dock a variety of known inhibitors of type 4 phosphodiesterase (PDE4B) into its binding site determined crystallographically for a series of pyrazolopyridine inhibitors. The success of identifying good poses with this technique was explored through RMSD comparisons with 19 known inhibitors for which crystallographic structures were available. The effectiveness of five scoring functions (PMF, JAIN, PLP2, LigScore2, and DockScore) was then evaluated through consideration of the success in enriching the top ranked fractions of nine artificial databases, constructed by seeding 1980 inactive ligands (pIC50 < 5) with 20 randomly selected inhibitors (pIC50 > 6.5). PMF and JAIN showed high average enrichment factors (greater than 4 times) in the top 5-10% of the ranked databases. Rank-based consensus scoring was then investigated, and the rational combination of 3 scoring functions resulted in more robust scoring schemes with (cScore)-DPmJ (consensus score of DockScore, PMF, and JAIN) and (cScore)-PPmJ (PLP2, PMF, and JAIN) yielding particularly good results. These cScores are believed to be of greater general application. Finally, the analysis of the behavior of the scoring functions across different chemotypes uncovered the inherent bias of the docking and scoring toward compounds in the same structural family as that employed for the crystal structure, suggesting the need to use multiple versions of the binding site for more successful virtual screening strategies.
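
The two quantities at the center of the abstract above, the enrichment factor in a top-ranked fraction and a rank-based consensus score, can be sketched in Python as follows. The enrichment factor uses the common definition (hit rate in the top fraction divided by the hit rate in the whole database). The consensus rule shown, a mean of per-function ranks, is an assumption, because the abstract does not state how the cScores are formed; the ranked list is synthetic and only reuses the 1980 + 20 database size from the abstract.

```python
from typing import List, Sequence


def enrichment_factor(ranked_is_active: Sequence[bool], fraction: float) -> float:
    """Hit rate in the top fraction divided by the hit rate overall."""
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(ranked_is_active[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)


def rank_consensus(score_lists: Sequence[Sequence[float]]) -> List[float]:
    """Mean rank per compound across scoring functions (rank 1 = best score).

    Assumes higher raw scores are better for every scoring function.
    """
    n_compounds = len(score_lists[0])
    mean_ranks = [0.0] * n_compounds
    for scores in score_lists:
        order = sorted(range(n_compounds), key=lambda i: scores[i], reverse=True)
        for rank, index in enumerate(order, start=1):
            mean_ranks[index] += rank / len(score_lists)
    return mean_ranks


# Synthetic ranking of 2000 compounds (20 actives), 4 actives in the top 100.
ranked = [True] * 4 + [False] * 96 + [True] * 16 + [False] * 1884
ef_top5 = enrichment_factor(ranked, 0.05)

# Three compounds scored by two hypothetical scoring functions.
consensus = rank_consensus([[3.0, 2.0, 1.0], [1.0, 3.0, 2.0]])
```

Here four of the 20 actives land in the top 5% (100 compounds), giving an enrichment factor of about 4, the threshold targeted in the study. Scoring functions where lower values are better would need their sign flipped before ranking.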

18.
Most recently published work in the field of docking and scoring protein/ligand complexes has focused on ranking true positives resulting from a Virtual Library Screening (VLS) through the use of a specified or consensus linear scoring function. In this work, we present a methodology to speed up the High Throughput Screening (HTS) process by enabling focused screens or hit-list triaging when a prohibitively large number of hits is identified in the primary screen; to do so, we have extended the principle of consensus scoring in a nonlinear manner using a neural network. This led us to introduce a nonlinear Generalist scoring Function, GFscore, which was trained to discriminate true positives from false positives in a data set of diverse chemical compounds. This original Generalist scoring Function is a combination of the five scoring functions found in the CScore package from Tripos Inc. GFscore eliminates up to 75% of molecules, with a confidence rate of 90%. The final result is a hit enrichment in the list of molecules to investigate during a research campaign for biologically active compounds, where the remaining 25% of molecules would be sent to in vitro screening experiments. GFscore is therefore a powerful tool for the biologist, saving both time and money.

19.
SKATE is a docking prototype that decouples systematic sampling from scoring. This novel approach removes any interdependence between sampling and scoring functions to achieve better sampling and, thus, improves docking accuracy. SKATE systematically samples a ligand's conformational, rotational and translational degrees of freedom, as constrained by a receptor pocket, to find sterically allowed poses. Efficient systematic sampling is achieved by pruning the combinatorial tree using aggregate assembly, discriminant analysis, adaptive sampling, radial sampling, and clustering. Because systematic sampling is decoupled from scoring, the poses generated by SKATE can be ranked by any published, or in‐house, scoring function. To test the performance of SKATE, ligands from the Astex/CCDC set, the Surflex set, and the Vertex set, a total of 266 complexes, were redocked to their respective receptors. The results show that SKATE was able to sample poses within 2 Å RMSD of the native structure for 98, 95, and 98% of the cases in the Astex/CCDC, Surflex, and Vertex sets, respectively. Cross‐docking accuracy of SKATE was also assessed by docking 10 ligands to thymidine kinase and 73 ligands to cyclin‐dependent kinase. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2010

20.
Docking programs are widely used to discover novel ligands efficiently and can predict protein-ligand complex structures with reasonable accuracy and speed. However, there is an emerging demand for better performance from the scoring methods. Consensus scoring (CS) methods improve the performance by compensating for the deficiencies of each scoring function. However, conventional CS and existing scoring functions have the same problems, such as a lack of protein flexibility, inadequate treatment of solvation, and the simplistic nature of the energy function used. Although there are many problems in current scoring functions, we focus our attention on the incorporation of unbound ligand conformations. To address this problem, we propose supervised consensus scoring (SCS), which takes into account the protein-ligand binding process using unbound ligand conformations with supervised learning. An evaluation of docking accuracy for 100 diverse protein-ligand complexes shows that SCS outperforms both CS and 11 scoring functions (PLP, F-Score, LigScore, DrugScore, LUDI, X-Score, AutoDock, PMF, G-Score, ChemScore, and D-Score). The success rates of SCS range from 89% to 91% in the range of RMSD < 2 Å, while those of CS range from 80% to 85%, and those of the scoring functions range from 26% to 76%. Moreover, we also introduce a method for judging whether a compound is active or inactive with the appropriate criterion for virtual screening. SCS performs quite well in docking accuracy and is presumably useful for screening large-scale compound databases before predicting binding affinity.

