Similar articles
 20 similar articles found.
1.
Since the evaluation of ligand conformations is a crucial aspect of structure-based virtual screening, scoring functions play a significant role in it. However, it is known that a scoring function does not always work well for all target proteins. When one cannot know a priori which scoring function works best against a target protein, there is no standard method for choosing one, even if the 3D structure of a target protein-ligand complex is available. Therefore, developing a method that achieves high enrichment from given scoring functions and the 3D structure of a protein-ligand complex is a crucial and challenging task. To address this problem, we applied SCS (supervised consensus scoring) to virtual screening. SCS employs a rough linear correlation between the binding free energy and the root-mean-square deviation (rmsd) of native ligand conformations, and it incorporates the protein-ligand binding process with docked ligand conformations using supervised learning. We evaluated both the docking poses and enrichments of SCS and five scoring functions (F-Score, G-Score, D-Score, ChemScore, and PMF) for three different target proteins: thymidine kinase (TK), thrombin, and peroxisome proliferator-activated receptor gamma (PPARgamma). Our enrichment studies show that SCS is competitive with or superior to the best single scoring function at the top ranks of the screened database. We found that the enrichments of SCS could be limited by the best scoring function, because SCS is obtained on the basis of the five individual scoring functions. From these results, we conclude that SCS works successfully. Moreover, from docking pose analysis, we revealed the connection between enrichment and the average centroid distance of top-scored docking poses. Since SCS requires only one 3D structure of a protein-ligand complex, SCS will be useful for identifying new ligands.

2.
The evaluation of ligand conformations is a crucial aspect of structure-based virtual screening, and scoring functions play a significant role in it. While consensus scoring (CS) generally improves enrichment by compensating for the deficiencies of each scoring function, selecting the individual scoring functions remains a challenging task when few known active compounds are available. To address this problem, we propose feature selection-based consensus scoring (FSCS), which performs supervised feature selection with docked native ligand conformations to select complementary scoring functions. We evaluated the enrichments of five scoring functions (F-Score, D-Score, PMF, G-Score, and ChemScore), FSCS, and RCS (rank-by-rank consensus scoring) for four different target proteins: acetylcholine esterase (AChE), thrombin, phosphodiesterase 5 (PDE5), and peroxisome proliferator-activated receptor gamma (PPARgamma). The results indicated that FSCS was able to select the complementary scoring functions and enhance ligand enrichments and that it outperformed RCS and the individual scoring functions for all target proteins. They also indicated that the performance of the single scoring functions was strongly dependent on the target protein. An especially favorable result with implications for practical drug screening is that FSCS performs well even if only one 3D structure of the protein-ligand complex is known. Moreover, we found that one can infer which scoring functions significantly enrich active compounds by using feature selection before actual docking and that the selected scoring functions are complementary.
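
The RCS baseline mentioned above can be illustrated with a minimal sketch. It assumes the common reading of rank-by-rank consensus scoring, in which each compound's consensus value is its average rank across the individual scoring functions; the abstract does not spell out the exact procedure. The function name `rank_by_rank`, the convention that lower scores are better, and the toy score values are all hypothetical.

```python
def rank_by_rank(scores_by_function):
    """Average rank of each compound across scoring functions.

    scores_by_function maps a scoring-function name to a list of scores,
    one per compound, in the same compound order. Assumption: a lower
    (more negative) score means a better predicted binder. Ties are
    broken by compound order rather than averaged.
    """
    n_compounds = len(next(iter(scores_by_function.values())))
    rank_sums = [0.0] * n_compounds
    for scores in scores_by_function.values():
        # Indices of compounds sorted from best (lowest) to worst score.
        order = sorted(range(n_compounds), key=lambda i: scores[i])
        for rank, idx in enumerate(order, start=1):
            rank_sums[idx] += rank
    n_functions = len(scores_by_function)
    return [total / n_functions for total in rank_sums]


# Toy scores for three compounds (hypothetical values).
toy_scores = {
    "F-Score": [-30.0, -10.0, -20.0],
    "G-Score": [-150.0, -200.0, -100.0],
    "PMF": [-40.0, -35.0, -60.0],
}
consensus = rank_by_rank(toy_scores)
best_compound = min(range(len(consensus)), key=lambda i: consensus[i])
```

Compounds are then screened in order of increasing average rank. A feature-selection step such as the one FSCS describes would decide which scoring functions enter `toy_scores` in the first place; that step is not sketched here.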

3.
Docking programs are widely used to discover novel ligands efficiently and can predict protein-ligand complex structures with reasonable accuracy and speed. However, there is an emerging demand for better performance from the scoring methods. Consensus scoring (CS) methods improve the performance by compensating for the deficiencies of each scoring function. However, conventional CS and existing scoring functions have the same problems, such as a lack of protein flexibility, inadequate treatment of solvation, and the simplistic nature of the energy function used. Although there are many problems in current scoring functions, we focus our attention on the incorporation of unbound ligand conformations. To address this problem, we propose supervised consensus scoring (SCS), which takes into account the protein-ligand binding process using unbound ligand conformations with supervised learning. An evaluation of docking accuracy for 100 diverse protein-ligand complexes shows that SCS outperforms both CS and 11 scoring functions (PLP, F-Score, LigScore, DrugScore, LUDI, X-Score, AutoDock, PMF, G-Score, ChemScore, and D-Score). The success rates of SCS range from 89% to 91% in the range of rmsd < 2 Å, while those of CS range from 80% to 85%, and those of the scoring functions range from 26% to 76%. Moreover, we also introduce a method for judging whether a compound is active or inactive with the appropriate criterion for virtual screening. SCS performs quite well in docking accuracy and is presumably useful for screening large-scale compound databases before predicting binding affinity.

4.
We have developed an iterative knowledge-based scoring function (ITScore) to describe protein-ligand interactions. Here, we assess ITScore through extensive tests on native structure identification, binding affinity prediction, and virtual database screening. Specifically, ITScore was first applied to a test set of 100 protein-ligand complexes constructed by Wang et al. (J Med Chem 2003, 46, 2287), and compared with 14 other scoring functions. The results show that ITScore yielded a high success rate of 82% on identifying native-like binding modes under the criterion of rmsd ≤ 2 Å for each top-ranked ligand conformation. The success rate increased to 98% if the top five conformations were considered for each ligand. In the case of binding affinity prediction, ITScore also obtained a good correlation for this test set (R = 0.65). Next, ITScore was used to predict binding affinities of a second diverse test set of 77 protein-ligand complexes prepared by Muegge and Martin (J Med Chem 1999, 42, 791), and compared with four other widely used knowledge-based scoring functions. ITScore yielded a high correlation of R² = 0.65 (or R = 0.81) in the affinity prediction. Finally, enrichment tests were performed with ITScore against four target proteins using the compound databases constructed by Jacobsson et al. (J Med Chem 2003, 46, 5781). The results were compared with those of eight other scoring functions. ITScore yielded high enrichments in all four database screening tests. ITScore can be easily combined with existing docking programs for use in structure-based drug design.
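
The success-rate criterion used above (a complex counts as a success if any of its top-N ranked conformations lies within 2 Å rmsd of the native pose) can be sketched as follows. The function name `success_rate` and the rmsd values are hypothetical and are not taken from the ITScore test set.

```python
def success_rate(rmsds_by_complex, top_n=1, cutoff=2.0):
    """Fraction of complexes with a native-like pose among the top_n poses.

    rmsds_by_complex holds one list per complex: the rmsd (in angstroms)
    of each docked pose to the native pose, ordered from best-scored to
    worst-scored pose.
    """
    hits = sum(
        1
        for rmsds in rmsds_by_complex
        if any(r <= cutoff for r in rmsds[:top_n])
    )
    return hits / len(rmsds_by_complex)


# Toy data: four complexes, three score-ranked poses each (hypothetical).
ranked_rmsds = [
    [0.8, 3.1, 4.0],
    [2.6, 1.5, 5.2],
    [4.4, 3.9, 6.0],
    [1.9, 0.7, 2.2],
]
top1_rate = success_rate(ranked_rmsds, top_n=1)
top5_rate = success_rate(ranked_rmsds, top_n=5)
```

Widening `top_n` can only keep or raise the rate, which matches the direction of the reported increase from the top-ranked pose to the top five.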

5.
Empirical scoring functions provide estimates of the free energy of protein-ligand binding in situations when atomic-scale simulations are intractable, for example, in virtual high-throughput screening. Currently, such scoring functions are often inaccurate, and further improvements are complicated by the lack of reliable training data, the complex interplay between scoring functions and docking algorithms, and an inconsistent statistical treatment of positive and negative training data. In comparison to various other performance measures of scoring functions, "analysis of variance" provides a well-behaved objective function for optimization, which focuses on the signal-to-noise ratio of ligand-decoy discrimination. In combination with a large database of ligands and decoys, an in situ optimization of scoring function parameters was able to generate improved, target-specific scoring functions for three different proteins of pharmaceutical interest: cyclin-dependent kinase 2, the estrogen receptor, and cyclooxygenase-2. Statistical analysis of the improvements observed in "receiver-operating characteristic" curves showed that the optimized scoring functions achieved a significantly (between p < 0.0001 and p < 0.05) higher enrichment of true ligands. A scaffold dependence of the resulting binding modes was observed, which is discussed in conjunction with the rigid receptor hypothesis commonly made in protein-ligand docking. In summary, the approach described here represents a well-adapted statistical method for setting up scoring functions.

6.
Applications in structural biology and medicinal chemistry require protein-ligand scoring functions for two distinct tasks: (i) ranking different poses of a small molecule in a protein binding site and (ii) ranking different small molecules by their complementarity to a protein site. Using probability theory, we developed two atomic distance-dependent statistical scoring functions: PoseScore was optimized for recognizing native binding geometries of ligands from other poses and RankScore was optimized for distinguishing ligands from nonbinding molecules. Both scores are based on a set of 8,885 crystallographic structures of protein-ligand complexes but differ in the values of three key parameters. Factors influencing the accuracy of scoring were investigated, including the maximal atomic distance and non-native ligand geometries used for scoring, as well as the use of protein models instead of crystallographic structures for training and testing the scoring function. For the test set of 19 targets, RankScore improved the ligand enrichment (logAUC) and early enrichment (EF(1)) scores computed by DOCK 3.6 for 13 and 14 targets, respectively. In addition, RankScore performed better at rescoring than each of seven other scoring functions tested. Accepting both the crystal structure and decoy geometries with all-atom root-mean-square errors of up to 2 Å from the crystal structure as correct binding poses, PoseScore gave the best score to a correct binding pose among 100 decoys for 88% of all cases in a benchmark set containing 100 protein-ligand complexes. PoseScore accuracy is comparable to that of DrugScore(CSD) and ITScore/SE and superior to 12 other tested scoring functions. Therefore, RankScore can facilitate ligand discovery, by ranking complexes of the target with different small molecules; PoseScore can be used for protein-ligand complex structure prediction, by ranking different conformations of a given protein-ligand pair.
The statistical potentials are available through the Integrative Modeling Platform (IMP) software package (http://salilab.org/imp) and the LigScore Web server (http://salilab.org/ligscore/).
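
An early-enrichment measure such as EF(1) can be sketched as below. The sketch assumes that EF(1) denotes the enrichment factor at the top 1% of the score-ranked database, that is, the active rate in the top 1% divided by the active rate in the whole database; the abstract does not define it. The function name `enrichment_factor` and the toy ranking are hypothetical.

```python
def enrichment_factor(ranked_is_active, fraction=0.01):
    """Enrichment factor at a given fraction of a score-ranked database.

    ranked_is_active is a list of booleans ordered from best-scored to
    worst-scored compound; True marks a known active. The database is
    assumed to contain at least one active.
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    n_top = max(1, round(n_total * fraction))
    hits_in_top = sum(ranked_is_active[:n_top])
    return (hits_in_top / n_top) / (n_actives / n_total)


# Toy ranking: 200 compounds, 10 actives, 2 of them ranked first (hypothetical).
toy_ranking = [True, True] + [False] * 190 + [True] * 8
ef1 = enrichment_factor(toy_ranking, fraction=0.01)
```

A value of 1 corresponds to random selection, and the maximum attainable value depends on the ratio of actives to database size, so enrichment factors are best compared within one database.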

7.
In today's world of high-throughput in silico screening, the development of virtual screening methodologies to prioritize small molecules as new chemical entities (NCEs) for synthesis is of current interest. Among several approaches to virtual screening, structure-based virtual screening has been considered the most effective. However, the problems associated with the ranking of potential solutions in terms of scoring functions remain one of the major bottlenecks in structure-based virtual screening technology. It has been suggested that scoring functions may be used as filters for distinguishing binders from nonbinders instead of accurately predicting their binding free energies. Subsequently, several improvements have been made in this area, which include the use of multiple rather than single scoring functions and the application of either consensus or multivariate statistical methods or both to improve the discrimination between binders and nonbinders. In view of this, the discriminative ability (distinguishing binders from nonbinders) of binary QSAR models derived using LUDI and MOE scoring functions has been compared with the models derived by Jacobsson et al. on five data sets, viz. estrogen receptor alpha mimics (ERalpha_mimics), estrogen receptor alpha toxins (ERalpha_toxins), matrix metalloprotease 3 inhibitors (MMP-3), factor Xa inhibitors (fXa), and acetylcholine esterase inhibitors (AChE). The overall analyses reveal that binary QSAR is comparable to the PLS discriminant analysis, rule-based, and Bayesian classification methods used by Jacobsson et al. Further, the scoring functions implemented in LUDI and MOE can score a wide range of protein-ligand interactions and are comparable to the scoring functions implemented in ICM and Cscore. Thus the binary QSAR models derived using LUDI and MOE scoring functions may be useful as a preliminary screening layer in a multilayered virtual screening paradigm.

8.
A central problem in de novo drug design is determining the binding affinity of a ligand with a receptor. A new scoring algorithm is presented that estimates the binding affinity of a protein-ligand complex given a three-dimensional structure. The method, LISA (Ligand Identification Scoring Algorithm), uses an empirical scoring function to describe the binding free energy. Interaction terms have been designed to account for van der Waals (VDW) contacts, hydrogen bonding, desolvation effects, and metal chelation to model the dissociation equilibrium constants using a linear model. Atom types have been introduced to differentiate the parameters for VDW, H-bonding interactions, and metal chelation between different atom pairs. A training set of 492 protein-ligand complexes was selected for the fitting process. Different test sets have been examined to evaluate its ability to predict experimentally measured binding affinities. A comparison with other well-known scoring functions shows that LISA has advantages over many existing scoring functions in simulating protein-ligand binding affinity, especially metalloprotein-ligand binding affinity. An artificial neural network (ANN) was also used to demonstrate that the energy terms in LISA are well designed and do not require extra cross terms.

9.
The performances of several two-step scoring approaches for molecular docking were assessed for their ability to predict binding geometries and free energies. Two new scoring functions designed for "step 2 discrimination" were proposed and compared to our CHARMM implementation of the linear interaction energy (LIE) approach using the Generalized-Born with Molecular Volume (GBMV) implicit solvation model. A scoring function S1 was proposed by considering only "interacting" ligand atoms as the "effective size" of the ligand and extended to an empirical regression-based pair potential S2. The S1 and S2 scoring schemes were trained and 5-fold cross-validated on a diverse set of 259 protein-ligand complexes from the Ligand Protein Database (LPDB). The regression-based parameters for S1 and S2 also demonstrated reasonable transferability in the CSARdock 2010 benchmark using a new data set (NRC HiQ) of diverse protein-ligand complexes. The ability of the scoring functions to accurately predict ligand geometry was evaluated by calculating the discriminative power (DP) of the scoring functions to identify native poses. The parameters for the LIE scoring function with the optimal discriminative power (DP) for geometry (step 1 discrimination) were found to be very similar to the best-fit parameters for binding free energy over a large number of protein-ligand complexes (step 2 discrimination). Reasonable performance of the scoring functions in enrichment of active compounds in four different protein target classes established that the parameters for S1 and S2 provided reasonable accuracy and transferability. Additional analysis was performed to definitively separate scoring function performance from molecular weight effects. This analysis included the prediction of ligand binding efficiencies for a subset of the CSARdock NRC HiQ data set where the number of ligand heavy atoms ranged from 17 to 35. 
This range of ligand heavy atoms is where improved accuracy of predicted ligand efficiencies is most relevant to real-world drug design efforts.

10.
In the context of virtual database screening, calculations of protein-ligand binding entropy of relative and overall molecular motions are challenging, owing to the inherent structural complexity of the ligand binding well in the energy landscape of protein-ligand interactions and computing time limitations. We describe a fast statistical thermodynamic method for estimating the binding entropy to address the challenges. The method is based on the integration of the configurational integral over clusters obtained from multiple docked positions. We apply the method in conjunction with 11 popular scoring functions (AutoDock, ChemScore, DrugScore, D-Score, F-Score, G-Score, LigScore, LUDI, PLP, PMF, X-Score) to evaluate the binding entropy of 100 protein-ligand complexes. The averaged values of the binding entropy contribution vary from 6.2 to 9.1 kcal/mol, showing good agreement with the literature. We calculate positional sizes and the angular volume of the native ligand wells. The averaged geometric mean of positional sizes in principal directions varies from 0.8 to 1.4 Å. The calculated range of angular volumes is 3.3-11.8 rad(2). Then we demonstrate that the averaged six-dimensional volume of the native well is larger than the volume of the most populated non-native well in energy landscapes described by all 11 scoring functions.

11.
Fourteen popular scoring functions, i.e., X-Score, DrugScore, five scoring functions in the Sybyl software (D-Score, PMF-Score, G-Score, ChemScore, and F-Score), four scoring functions in the Cerius2 software (LigScore, PLP, PMF, and LUDI), two scoring functions in the GOLD program (GoldScore and ChemScore), and HINT, were tested on the refined set of the PDBbind database, a set of 800 diverse protein-ligand complexes with high-resolution crystal structures and experimentally determined Ki or Kd values. The focus of our study was to assess the ability of these scoring functions to predict binding affinities based on the experimentally determined high-resolution crystal structures of proteins in complex with their ligands. The quantitative correlation between the binding scores produced by each scoring function and the known binding constants of the 800 complexes was computed. X-Score, DrugScore, Sybyl::ChemScore, and Cerius2::PLP provided better correlations than the other scoring functions with standard deviations of 1.8-2.0 log units. These four scoring functions were also found to be robust enough to carry out computation directly on unaltered crystal structures. To examine how well scoring functions predict the binding affinities for ligands bound to the same target protein, the performance of these 14 scoring functions was evaluated on three subsets of protein-ligand complexes from the test set: HIV-1 protease complexes (82 entries), trypsin complexes (45 entries), and carbonic anhydrase II complexes (40 entries). Although the results for the HIV-1 protease subset are less than desirable, several scoring functions are able to satisfactorily predict the binding affinities for the trypsin and the carbonic anhydrase II subsets with standard deviation as low as 1.0 log unit (corresponding to 1.3-1.4 kcal/mol at room temperature). Our results demonstrate the strengths as well as the weaknesses of current scoring functions for binding affinity prediction.
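
The stated equivalence between 1.0 log unit and 1.3-1.4 kcal/mol follows from the standard relation ΔG = RT ln(10) × Δlog K. The short sketch below works through the arithmetic; the function name `log_units_to_kcal` and the choice of 298.15 K as "room temperature" are assumptions made for illustration.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def log_units_to_kcal(delta_log_k, temperature_k=298.15):
    """Convert an error in log10(K) into a free-energy error in kcal/mol.

    Uses delta_G = R * T * ln(10) * delta_log_K.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(10) * delta_log_k


one_log_unit = log_units_to_kcal(1.0)            # at 298.15 K
cooler_room = log_units_to_kcal(1.0, 293.15)     # at 20 degrees Celsius
```

Both temperatures give values between 1.3 and 1.4 kcal/mol, consistent with the range quoted in the abstract; by the same relation, the 1.8-2.0 log unit deviations correspond to roughly 2.5-2.7 kcal/mol at 298.15 K.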

12.
Virtual screening is becoming an important tool for drug discovery. However, the application of virtual screening has been limited by the lack of accurate scoring functions. Here, we present a novel scoring function, MedusaScore, for evaluating protein-ligand binding. MedusaScore is based on models of physical interactions that include van der Waals, solvation, and hydrogen bonding energies. To ensure the best transferability of the scoring function, we do not use any protein-ligand experimental data for parameter training. We then test MedusaScore for docking decoy recognition and binding affinity prediction and find superior performance compared to other widely used scoring functions. Statistical analysis indicates that one source of inaccuracy of MedusaScore may arise from the unaccounted entropic loss upon ligand binding, which suggests avenues for further improvement of MedusaScore.

13.
To help improve the accuracy of protein-ligand docking as a useful tool for drug discovery, we developed MPSim-Dock, which ensures a comprehensive sampling of diverse families of ligand conformations in the binding region followed by an enrichment of the good energy scoring families so that the energy scores of the sampled conformations can be reliably used to select the best conformation of the ligand. This combines elements of DOCK4.0 with molecular dynamics (MD) methods available in the software, MPSim. We test here the efficacy of MPSim-Dock to predict the 64 protein-ligand combinations formed by starting with eight trypsin cocrystals, and crossdocking the other seven ligands to each protein conformation. We consider this as a model for how well the method would work for one given target protein structure. Using as a criterion that the structures within 2 kcal/mol of the top-scoring structure include a conformation within a coordinate root mean square (CRMS) of 1 Å of the crystal structure, we find that 100% of the 64 cases are predicted correctly. This indicates that MPSim-Dock can be used reliably to identify strongly binding ligands, making it useful for virtual ligand screening.

14.
Protein-ligand docking programs can generate a large number of possible binding orientations for each ligand candidate. The challenge is to identify the orientations closest to the native binding mode using a scoring method. Many different scoring functions have been developed for protein-ligand scoring, but their performance on binding mode prediction is often target-dependent. In this study, a statistical approach was employed to provide a confidence measure of scoring performance in finding close to the correct docked ligand orientations. It exploits the fact that the scores provided by an adequately performing scoring function generally improve as the ligand binding modes get closer to the correct native orientation. For such cases, the correlation coefficient of scores versus distances is expected to be highest when the most native-like orientation is used as a reference. This correlation coefficient, called the correlation-based score (CBScore), was used as an indicator of how far the docked pose was from the native orientation. The correlation between the original scores and CBScores as well as the range of CBScores were found to be good measures of scoring performance. They were combined into a single quantity, called the scoring confidence index. High values of the scoring confidence index were indicative of pronounced and relatively smooth binding energy landscapes with easily discernable global minima, resulting in reliable binding mode predictions. Low values of this index reflected rugged energy landscapes making the prediction of the correct binding mode very difficult and often unreliable. The diagnostic ability of the scoring confidence index was tested on a non-redundant set of 50 protein-ligand complexes scored with three commonly employed scoring functions: AffiScore, DrugScore and X-Score. 
Binding mode predictions were found to be three times more reliable for complexes with scoring confidence indices in the upper half than for cases with values in the lower half of the resulting range of 0–1.6. This new confidence measure of scoring performance is expected to be a valuable tool for virtual screening applications.
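
The correlation-based score described above can be sketched as follows: each docked pose is taken in turn as the reference, and its CBScore is the correlation between the original scores of all poses and their distances to that reference. This is only an illustration of the idea. It assumes that lower scores are better, that the Pearson coefficient is the correlation used, and that the reference pose itself (distance 0) is included in the correlation; the abstract confirms none of these details. The names `pearson` and `cb_scores` and the toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cb_scores(scores, distances):
    """Correlation of scores versus distances, one value per reference pose.

    scores[j] is the original score of pose j (lower = better, assumed).
    distances[i][j] is the distance (e.g. rmsd) between poses i and j.
    """
    return [pearson(distances[i], scores) for i in range(len(scores))]


# Toy landscape: four poses on a line, pose 0 being the most native-like.
positions = [0.0, 1.0, 2.0, 3.0]
toy_scores = [-10.0, -8.0, -6.0, -4.0]
toy_distances = [[abs(a - b) for b in positions] for a in positions]

cbs = cb_scores(toy_scores, toy_distances)
most_native_like = max(range(len(cbs)), key=lambda i: cbs[i])
```

In this smooth toy landscape the pose with the best score also receives the highest correlation, which is the situation the scoring confidence index is meant to detect. Combining the CBScore range and the score-CBScore correlation into that index is not sketched, since the abstract does not give the formula.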

15.
16.
The performance of automated small-molecule docking programs has conceptually been divided into docking, scoring, ranking, and screening power, which focus on the crystal pose prediction, affinity prediction, ligand ranking, and database screening capabilities of the docking program, respectively. Benchmarks show that different docking programs can excel in individual benchmarks, which suggests that the scoring function employed by a program can be optimized for a particular task. Here the scoring function of Smina is re-optimized towards enhancing the docking power using a supervised machine learning approach and a manually curated database of ligands and cross-docking receptor pairs. The optimization method does not need associated binding data for the receptor-ligand examples used in the data set and works with small training sets. The re-optimization of the weights for the scoring function results in similar docking power on a cross-docking test set. A ligand decoy-based benchmark indicates better discrimination between poses with high and low RMSD. The reported parameters for Smina are compatible with Autodock Vina and represent ready-to-use alternative parameters for researchers who aim at pose prediction rather than affinity prediction.

17.
We have derived, in the context of the Rigid Rotor Harmonic Approximation (RRHO), a general expression, independent of mass and of Planck's constant h, for the dissociation free energy in ligand–receptor systems, featuring a systematically (anti‐binding) additive negative entropic term depending on readily available ligand–receptor quantities. The proposed RRHO expression allows the absolute standard dissociation free energy to be computed straightforwardly without resorting to expensive normal mode analysis or other dynamical matrix‐based techniques for evaluating the entropic contribution, hence providing an effective scoring function for assessing docking poses with no adjustable parameters. Our RRHO formula was tested on a set of 55 ligand–receptor systems, obtaining correlation coefficients and unsigned mean errors comparable to or better than those obtained with computationally demanding techniques for the dissociation entropy assessment. The proposed compact reformulation of the RRHO entropy term could constitute the basis for new and more effective scoring functions in molecular docking‐based high‐throughput virtual screening for drug discovery. © 2016 Wiley Periodicals, Inc.

18.
We describe binding free energy calculations in the D3R Grand Challenge 2015 for blind prediction of the binding affinities of 180 ligands to Hsp90. The present D3R challenge was built around experimental datasets involving Heat shock protein (Hsp) 90, an ATP-dependent molecular chaperone which is an important anticancer drug target. The Hsp90 ATP binding site is known to be a challenging target for accurate calculations of ligand binding affinities because of the ligand-dependent conformational changes in the binding site, the presence of ordered waters and the broad chemical diversity of ligands that can bind at this site. Our primary focus here is to distinguish binders from nonbinders. Large scale absolute binding free energy calculations that cover over 3000 protein–ligand complexes were performed using the BEDAM method starting from docked structures generated by Glide docking. Although the ligand dataset in this study resembles an intermediate to late stage lead optimization project while the BEDAM method is mainly developed for early stage virtual screening of hit molecules, the BEDAM binding free energy scoring has resulted in a moderate enrichment of ligand screening against this challenging drug target. Results show that, using a statistical mechanics based free energy method like BEDAM starting from docked poses offers better enrichment than classical docking scoring functions and rescoring methods like Prime MM-GBSA for the Hsp90 data set in this blind challenge. Importantly, among the three methods tested here, only the mean value of the BEDAM binding free energy scores is able to separate the large group of binders from the small group of nonbinders with a gap of 2.4 kcal/mol. None of the three methods that we have tested provided accurate ranking of the affinities of the 147 active compounds. We discuss the possible sources of errors in the binding free energy calculations. 
The study suggests that BEDAM can be used strategically to discriminate binders from nonbinders in virtual screening and to more accurately predict the ligand binding modes prior to the more computationally expensive FEP calculations of binding affinity.

19.
Calculation of protein-ligand binding affinities continues to be a hotbed of research. Although many techniques for computing protein-ligand binding affinities have been introduced--ranging from computationally very expensive approaches, such as free energy perturbation (FEP) theory, to more approximate techniques, such as empirically derived scoring functions, which, although computationally efficient, lack a clear theoretical basis--there remains a pressing need for more robust approaches. A recently introduced technique, the displaced-solvent functional (DSF) method, was developed to bridge the gap between the high accuracy of FEP and the computational efficiency of empirically derived scoring functions. In order to develop a set of reference data to test the DSF theory for calculating absolute protein-ligand binding affinities, we have pursued FEP theory calculations of the binding free energies of a methane ligand with 13 different model hydrophobic enclosures of varying hydrophobicity. The binding free energies of the methane ligand with the various hydrophobic enclosures were then recomputed by DSF theory and compared with the FEP reference data. We find that the DSF theory, which relies on no empirically tuned parameters, shows excellent quantitative agreement with the FEP. We also explored the ability of buried solvent accessible surface area and buried molecular surface area models to describe the relevant physics, and find the buried molecular surface area model to offer superior performance over this dataset.

20.
The development and validation of a new knowledge-based scoring function (SIScoreJE) to predict binding energy between proteins and ligands is presented. SIScoreJE efficiently predicts the binding energy between a small molecule and its protein receptor. Protein-ligand atomic contact information was derived from a Non-Redundant Data set (NRD) of over 3000 X-ray crystal structures of protein-ligand complexes. This information was classified for individual "atom contact pairs" (ACP), which are used to calculate the atomic contact preferences. In addition to the two schemes generated in this study, we have assessed a number of other common atom-type classification schemes. The preferences were calculated using an information theoretic relationship of joint entropy. Among 18 different atom-type classification schemes, "ScoreJE Atom Type set2" (SATs2) was found to be the most suitable for our approach. To test the sensitivity of the method to the inclusion of solvent, Single-body Solvation Potentials (SSP) were also derived from the atomic contacts between the protein atom types and water molecules modeled using AQUARIUS2. Validation was carried out using an evaluation data set of 100 protein-ligand complexes with known binding energies to test the ability of the scoring functions to reproduce known binding affinities. In summary, it was found that a combined SSP/ScoreJE (SIScoreJE) performed significantly better than ScoreJE alone, and SIScoreJE and ScoreJE performed better than GOLD::GoldScore, GOLD::ChemScore, and XScore.
