Similar literature
 20 similar documents found (search time: 78 ms)
1.
Molecular docking is a powerful computational method that has been widely used in many biomolecular studies to predict the geometry of a protein-ligand complex. However, while its conformational search algorithms are usually able to generate the correct conformation of a ligand in the binding site, the scoring methods often fail to discriminate it among many false variants. We propose to treat this problem by applying more precise ligand-specific scoring filters to re-rank docking solutions. In this way specific features of interactions between protein and different types of compounds can be implicitly taken into account. New scoring functions were constructed including hydrogen bonds, hydrophobic and hydrophilic complementarity terms. These scoring functions also discriminate ligands by the size of the molecule, the total hydrophobicity, and the number of peptide bonds for peptide ligands. Weighting coefficients of the scoring functions were adjusted using a training set of 60 protein–ligand complexes. The proposed method was then tested on the results of docking obtained for an additional 70 complexes. In both cases the success rate was 5–8% better compared to the standard functions implemented in popular docking software.
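
The re-ranking step described above can be sketched as a weighted sum over interaction terms. This is only an illustration: the term names, the weights, and the convention that a higher value is better are assumptions, not the coefficients or functional form fitted in the study.

```python
def rescore(terms, weights):
    """Weighted sum of interaction terms for one docking pose."""
    return sum(weights[name] * terms[name] for name in weights)


def rerank(poses, weights):
    """Return poses sorted best-first (a higher rescored value is assumed better)."""
    return sorted(poses, key=lambda pose: rescore(pose["terms"], weights), reverse=True)


# Hypothetical weights for one ligand class (not values from the study).
WEIGHTS = {"hbond": 1.0, "hydrophobic": 0.5, "hydrophilic": 0.25}

# Hypothetical docking solutions with precomputed term values.
poses = [
    {"id": "pose_a", "terms": {"hbond": 2.0, "hydrophobic": 1.0, "hydrophilic": 0.0}},
    {"id": "pose_b", "terms": {"hbond": 1.0, "hydrophobic": 4.0, "hydrophilic": 2.0}},
]
ranked_ids = [pose["id"] for pose in rerank(poses, WEIGHTS)]
```

A ligand-specific filter in this spirit would keep one weight table per ligand class (for example by size or peptide-bond count) and pick the table before re-ranking.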

2.
We have developed a generic evolutionary method with an empirical scoring function for protein-ligand docking, which is a problem of paramount importance in structure-based drug design. This approach, referred to as GEMDOCK (Generic Evolutionary Method for molecular DOCKing), combines both continuous and discrete search mechanisms. We tested our approach on seven protein-ligand complexes, and the docked lowest energy structures have root-mean-square deviations (RMSD) ranging from 0.32 to 0.99 Å with respect to the corresponding crystal ligand structures. In addition, we evaluated GEMDOCK on cross-docking experiments, in which complexes sharing an identical protein were used for docking all crystallized ligands of these complexes. GEMDOCK yielded 98% docked structures with RMSD below 2.0 Å when the ligands were docked into foreign protein structures. We have reported the validation and analysis of our approach on various search spaces and scoring functions. Experimental results show that our approach is robust, and the empirical scoring function is simple and fast to recognize compounds. We found that if GEMDOCK used the RMSD scoring function, then the prediction accuracy was 100% and the docked structures had RMSD below 0.1 Å for each test system. These results suggest that GEMDOCK is a useful tool, and may systematically improve the forms and parameters of a scoring function, which is one of the major bottlenecks for molecular recognition.
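
The RMSD values quoted above compare a docked pose with the crystal ligand. A minimal sketch of that measure follows, assuming both coordinate lists use the same atom order and the same reference frame; no superposition or symmetry correction is applied, and the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two atoms, each displaced by 1 Å along z.
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
example_rmsd = rmsd(crystal, docked)
```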

3.
Docking programs are widely used to discover novel ligands efficiently and can predict protein-ligand complex structures with reasonable accuracy and speed. However, there is an emerging demand for better performance from the scoring methods. Consensus scoring (CS) methods improve the performance by compensating for the deficiencies of each scoring function. However, conventional CS and existing scoring functions have the same problems, such as a lack of protein flexibility, inadequate treatment of solvation, and the simplistic nature of the energy function used. Although there are many problems in current scoring functions, we focus our attention on the incorporation of unbound ligand conformations. To address this problem, we propose supervised consensus scoring (SCS), which takes into account the protein-ligand binding process using unbound ligand conformations with supervised learning. An evaluation of docking accuracy for 100 diverse protein-ligand complexes shows that SCS outperforms both CS and 11 scoring functions (PLP, F-Score, LigScore, DrugScore, LUDI, X-Score, AutoDock, PMF, G-Score, ChemScore, and D-score). The success rates of SCS range from 89% to 91% in the range of rmsd < 2 Å, while those of CS range from 80% to 85%, and those of the scoring functions range from 26% to 76%. Moreover, we also introduce a method for judging whether a compound is active or inactive with the appropriate criterion for virtual screening. SCS performs quite well in docking accuracy and is presumably useful for screening large-scale compound databases before predicting binding affinity.
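
For orientation, conventional consensus scoring is often implemented as rank averaging across scoring functions. The sketch below shows that generic scheme only; it is not the supervised SCS procedure of the abstract, and the function names and the lower-is-better convention are assumptions.

```python
def consensus_ranks(score_table):
    """Average the per-function rank of each pose (rank 1 = best, lower score = better)."""
    n_poses = len(next(iter(score_table.values())))
    totals = [0.0] * n_poses
    for scores in score_table.values():
        order = sorted(range(n_poses), key=lambda i: scores[i])
        for rank, pose_index in enumerate(order, start=1):
            totals[pose_index] += rank
    return [total / len(score_table) for total in totals]


# Hypothetical scores of three poses under two scoring functions.
table = {
    "function_1": [-5.0, -7.0, -1.0],
    "function_2": [-2.0, -3.0, -2.5],
}
mean_ranks = consensus_ranks(table)
```

The pose with the smallest mean rank is the consensus pick; here that is the second pose, which both functions rank first.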

4.
Empirical scoring functions provide estimates of the free energy of protein-ligand binding in situations when atomic-scale simulations are intractable, for example, in virtual high-throughput screening. Currently, such scoring functions are often inaccurate, and further improvements are complicated by the lack of reliable training data, the complex interplay between scoring functions and docking algorithms, and an inconsistent statistical treatment of positive and negative training data. In comparison to various other performance measures of scoring functions, "analysis of variance" provides a well-behaved objective function for optimization, which focuses on the signal-to-noise ratio of ligand-decoy discrimination. In combination with a large database of ligands and decoys, an in situ optimization of scoring function parameters was able to generate improved, target-specific scoring functions for three different proteins of pharmaceutical interest: cyclin-dependent kinase 2, the estrogen receptor, and cyclooxygenase-2. Statistical analysis of the improvements observed in "receiver-operating characteristic" curves showed that the optimized scoring functions achieved a significantly (between p < 0.0001 and p < 0.05) higher enrichment of true ligands. A scaffold dependence of the resulting binding modes was observed, which is discussed in conjunction with the rigid receptor hypothesis commonly made in protein-ligand docking. In summary, the approach described here represents a well-adapted statistical method for setting up scoring functions.

5.
In molecular docking, it is challenging to develop a scoring function that is accurate enough for high-throughput screening. Most scoring functions implemented in popular docking software packages were developed with many approximations for computational efficiency, which sacrifices the accuracy of prediction. With today's advanced technology and powerful computational hardware, it is feasible to use rigorous scoring functions, such as molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) and molecular mechanics/generalized Born surface area (MM/GBSA), in molecular docking studies. Here, we systematically investigated the performance of MM/PBSA and MM/GBSA in identifying the correct binding conformations and predicting the binding free energies for 98 protein-ligand complexes. Comparison studies showed that MM/GBSA (69.4%) outperformed MM/PBSA (45.5%) and many popular scoring functions in identifying the correct binding conformations. Moreover, we found that molecular dynamics simulations are necessary for some systems to identify the correct binding conformations. Based on our results, we proposed a guideline for using MM/GBSA to predict the binding conformations. We then tested the performance of MM/GBSA and MM/PBSA in reproducing the binding free energies of the 98 protein-ligand complexes. The best MM/GBSA model, with an internal dielectric constant of 2.0, produced a Spearman's correlation coefficient of 0.66, which is better than MM/PBSA (0.49) and almost all scoring functions used in molecular docking. In summary, MM/GBSA performs well for both binding pose predictions and binding free-energy estimations and is efficient for re-scoring the top-hit poses produced by other less-accurate scoring functions.
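
Spearman's correlation coefficient, used above to compare predicted and experimental binding free energies, is the Pearson correlation of the two rank vectors. A minimal standard-library sketch follows, with ties given their average rank; the function names and the sample numbers are illustrative, not data from the study.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation of two equal-length sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return covariance / (spread_x * spread_y)


# Hypothetical predicted vs. experimental free energies (kcal/mol).
predicted = [-9.1, -7.4, -6.0, -5.2]
experimental = [-10.3, -8.8, -7.9, -6.1]
rho = spearman(predicted, experimental)
```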

6.
New empirical scoring functions have been developed to estimate the binding affinity of a given protein-ligand complex with known three-dimensional structure. These scoring functions include terms accounting for van der Waals interaction, hydrogen bonding, deformation penalty, and hydrophobic effect. A special feature is that three different algorithms have been implemented to calculate the hydrophobic effect term, which results in three parallel scoring functions. All three scoring functions are calibrated through multivariate regression analysis of a set of 200 protein-ligand complexes and they reproduce the binding free energies of the entire training set with standard deviations of 2.2 kcal/mol, 2.1 kcal/mol, and 2.0 kcal/mol, respectively. These three scoring functions are further combined into a consensus scoring function, X-CSCORE. When tested on an independent set of 30 protein-ligand complexes, X-CSCORE is able to predict their binding free energies with a standard deviation of 2.2 kcal/mol. The potential application of X-CSCORE to molecular docking is also investigated. Our results show that this consensus scoring function improves the docking accuracy considerably when compared to the conventional force field computation used for molecular docking.

7.
Empirical scoring functions used in protein-ligand docking calculations are typically trained on a dataset of complexes with known affinities with the aim of generalizing across different docking applications. We report a novel method of scoring-function optimization that supports the use of additional information to constrain scoring function parameters, which can be used to focus a scoring function's training towards a particular application, such as screening enrichment. The approach combines multiple instance learning, positive data in the form of ligands of protein binding sites of known and unknown affinity and binding geometry, and negative (decoy) data of ligands thought not to bind particular protein binding sites or known not to bind in particular geometries. Performance of the method for the Surflex-Dock scoring function is shown in cross-validation studies and in eight blind test cases. Tuned functions optimized with a sufficient amount of data exhibited either improved or undiminished screening performance relative to the original function across all eight complexes. Analysis of the changes to the scoring function suggests that modifications can be learned that are related to protein-specific features such as active-site mobility.

8.
Using a novel iterative method, we have developed a knowledge-based scoring function (ITScore) to predict protein-ligand interactions. The pair potentials for ITScore were derived from a training set of 786 protein-ligand complex structures in the Protein Data Bank. Twenty-six atom types were used based on the atom type category of the SYBYL software. The iterative method circumvents the long-standing reference state problem in the derivation of knowledge-based scoring functions. The basic idea is to improve pair potentials by iteration until they correctly discriminate experimentally determined binding modes from decoy ligand poses for the ligand-protein complexes in the training set. The iterative method is efficient and normally converges within 20 iterative steps. The scoring function based on the derived potentials was tested on a diverse set of 140 protein-ligand complexes for affinity prediction, yielding a high correlation coefficient of 0.74. Because ITScore uses SYBYL-defined atom types, this scoring function is easy to use for molecular files prepared by SYBYL or converted by software such as BABEL.

9.
Since the evaluation of ligand conformations is a crucial aspect of structure-based virtual screening, scoring functions play significant roles in it. However, it is known that a scoring function does not always work well for all target proteins. When one cannot know a priori which scoring function works best against a target protein, there is no standard method for determining it, even if the 3D structure of a target protein-ligand complex is available. Therefore, development of a method to achieve high enrichments from given scoring functions and the 3D structure of a protein-ligand complex is a crucial and challenging task. To address this problem, we applied SCS (supervised consensus scoring) to virtual screening. SCS employs a rough linear correlation between the binding free energy and the root-mean-square deviation (rmsd) of native ligand conformations and incorporates the protein-ligand binding process with docked ligand conformations using supervised learning. We evaluated both the docking poses and enrichments of SCS and five scoring functions (F-Score, G-Score, D-Score, ChemScore, and PMF) for three different target proteins: thymidine kinase (TK), thrombin (thrombin), and peroxisome proliferator-activated receptor gamma (PPARgamma). Our enrichment studies show that SCS is competitive with or superior to the best single scoring function at the top ranks of the screened database. We found that the enrichments of SCS could be limited by the best scoring function, because SCS is obtained on the basis of the five individual scoring functions. Therefore, we conclude from our results that SCS works very successfully. Moreover, from docking pose analysis, we revealed the connection between enrichment and the average centroid distance of top-scored docking poses. Since SCS requires only one 3D structure of a protein-ligand complex, SCS will be useful for identifying new ligands.

10.
A new optimization model of molecular docking is proposed, and a fast flexible docking method based on an improved adaptive genetic algorithm is developed in this paper. The algorithm employs several advanced techniques, such as a multi-population genetic strategy, an entropy-based searching technique with self-adaptation, and the quasi-exact penalty. A new iteration scheme in conjunction with the above techniques is employed to speed up the optimization process and to ensure very rapid and steady convergence. The docking accuracy and efficiency of the method are evaluated by docking results on the GOLD test data set, which contains 134 protein-ligand complexes. In over 66.2% of the complexes, the docked pose was within 2.0 Å root-mean-square deviation (RMSD) of the X-ray structure. Docking time is approximately proportional to the number of rotatable bonds of the ligands.

11.
The two great challenges of the docking process are the prediction of ligand poses in a protein binding site and the scoring of the docked poses. Ligands that are composed of extended chains in their molecular structure display the most difficulties, predominantly because of the torsional flexibility. On the basis of the molecular docking program QXP-Flo+0802, we have developed a procedure particularly for ligands with a high degree of rotational freedom that allows the accurate prediction of the orientation and conformation of ligands in protein binding sites. Starting from an initial full Monte Carlo docking experiment, this was achieved by performing a series of successive multistep docking runs using a local Monte Carlo search with a restricted rotational angle, by which the conformational search space is limited. The method was established by using a highly flexible acetylcholinesterase inhibitor and has been applied to a number of challenging protein-ligand complexes known from the literature.

12.
Applications in structural biology and medicinal chemistry require protein-ligand scoring functions for two distinct tasks: (i) ranking different poses of a small molecule in a protein binding site and (ii) ranking different small molecules by their complementarity to a protein site. Using probability theory, we developed two atomic distance-dependent statistical scoring functions: PoseScore was optimized for recognizing native binding geometries of ligands from other poses and RankScore was optimized for distinguishing ligands from nonbinding molecules. Both scores are based on a set of 8,885 crystallographic structures of protein-ligand complexes but differ in the values of three key parameters. Factors influencing the accuracy of scoring were investigated, including the maximal atomic distance and non-native ligand geometries used for scoring, as well as the use of protein models instead of crystallographic structures for training and testing the scoring function. For the test set of 19 targets, RankScore improved the ligand enrichment (logAUC) and early enrichment (EF(1)) scores computed by DOCK 3.6 for 13 and 14 targets, respectively. In addition, RankScore performed better at rescoring than each of seven other scoring functions tested. Accepting both the crystal structure and decoy geometries with all-atom root-mean-square errors of up to 2 Å from the crystal structure as correct binding poses, PoseScore gave the best score to a correct binding pose among 100 decoys for 88% of all cases in a benchmark set containing 100 protein-ligand complexes. PoseScore accuracy is comparable to that of DrugScore(CSD) and ITScore/SE and superior to 12 other tested scoring functions. Therefore, RankScore can facilitate ligand discovery, by ranking complexes of the target with different small molecules; PoseScore can be used for protein-ligand complex structure prediction, by ranking different conformations of a given protein-ligand pair.
The statistical potentials are available through the Integrative Modeling Platform (IMP) software package (http://salilab.org/imp) and the LigScore Web server (http://salilab.org/ligscore/).
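
Early enrichment is commonly reported as an enrichment factor: the hit rate in the top-scoring fraction of a ranked database divided by the hit rate in the whole database. The sketch below uses that common definition, which may differ in detail from the EF(1) computed in the study; the function name and the sample ranking are hypothetical.

```python
def enrichment_factor(ranked_is_active, fraction=0.01):
    """ranked_is_active: booleans ordered from best-scored to worst-scored compound."""
    total = len(ranked_is_active)
    actives = sum(ranked_is_active)
    if total == 0 or actives == 0:
        raise ValueError("need at least one compound and at least one active")
    n_top = max(1, round(fraction * total))  # size of the top-scoring slice
    hits = sum(ranked_is_active[:n_top])
    return (hits / n_top) / (actives / total)


# Hypothetical ranking of 100 compounds containing 10 actives.
ranking = [True] * 5 + [False] * 5 + [True] * 5 + [False] * 85
ef_10_percent = enrichment_factor(ranking, fraction=0.10)
ef_1_percent = enrichment_factor(ranking, fraction=0.01)
```

A value of 1 means no better than random selection; the maximum attainable value depends on the fraction of actives in the database.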

13.
The ability to accurately predict biological affinity on the basis of in silico docking to a protein target remains a challenging goal in the CADD arena. Typically, "standard" scoring functions have been employed that use the calculated docking result and a set of empirical parameters to calculate a predicted binding affinity. To improve on this, we are exploring novel strategies for rapidly developing and tuning "customized" scoring functions tailored to a specific need. In the present work, three such customized scoring functions were developed using a set of 129 high-resolution protein-ligand crystal structures with measured Ki values. The functions were parametrized using N-PLS (N-way partial least squares), a multivariate technique well-known in the 3D quantitative structure-activity relationship field. A modest correlation between observed and calculated pKi values using a standard scoring function (r² = 0.5) could be improved to 0.8 when a customized scoring function was applied. To mimic a more realistic scenario, a second scoring function was developed, not based on crystal structures but exclusively on several binding poses generated with the Flo+ docking program. Finally, a validation study was conducted by generating a third scoring function with 99 randomly selected complexes from the 129 as a training set and predicting pKi values for a test set that comprised the remaining 30 complexes. Training and test set r² values were 0.77 and 0.78, respectively. These results indicate that, even without direct structural information, predictive customized scoring functions can be developed using N-PLS, and this approach holds significant potential as a general procedure for predicting binding affinity on the basis of in silico docking.

14.
Fourteen popular scoring functions, i.e., X-Score, DrugScore, five scoring functions in the Sybyl software (D-Score, PMF-Score, G-Score, ChemScore, and F-Score), four scoring functions in the Cerius2 software (LigScore, PLP, PMF, and LUDI), two scoring functions in the GOLD program (GoldScore and ChemScore), and HINT, were tested on the refined set of the PDBbind database, a set of 800 diverse protein-ligand complexes with high-resolution crystal structures and experimentally determined Ki or Kd values. The focus of our study was to assess the ability of these scoring functions to predict binding affinities based on the experimentally determined high-resolution crystal structures of proteins in complex with their ligands. The quantitative correlation between the binding scores produced by each scoring function and the known binding constants of the 800 complexes was computed. X-Score, DrugScore, Sybyl::ChemScore, and Cerius2::PLP provided better correlations than the other scoring functions with standard deviations of 1.8-2.0 log units. These four scoring functions were also found to be robust enough to carry out computation directly on unaltered crystal structures. To examine how well scoring functions predict the binding affinities for ligands bound to the same target protein, the performance of these 14 scoring functions was evaluated on three subsets of protein-ligand complexes from the test set: HIV-1 protease complexes (82 entries), trypsin complexes (45 entries), and carbonic anhydrase II complexes (40 entries). Although the results for the HIV-1 protease subset are less than desirable, several scoring functions are able to satisfactorily predict the binding affinities for the trypsin and the carbonic anhydrase II subsets with standard deviation as low as 1.0 log unit (corresponding to 1.3-1.4 kcal/mol at room temperature). Our results demonstrate the strengths as well as the weaknesses of current scoring functions for binding affinity prediction.
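
The correspondence between log units and kcal/mol quoted above follows from ΔG = RT ln K, so one log10 unit of a binding constant equals RT ln 10. A small sketch of the conversion, assuming room temperature means 298.15 K (the function name is illustrative):

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def log_units_to_kcal(log_units, temperature=298.15):
    """Free-energy magnitude in kcal/mol for a difference given in log10(K) units."""
    return log_units * math.log(10) * GAS_CONSTANT * temperature


one_log_unit = log_units_to_kcal(1.0)
```

At 298.15 K this gives about 1.36 kcal/mol per log unit, inside the 1.3-1.4 kcal/mol range stated in the abstract.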

15.
We have developed an iterative knowledge-based scoring function (ITScore) to describe protein-ligand interactions. Here, we assess ITScore through extensive tests on native structure identification, binding affinity prediction, and virtual database screening. Specifically, ITScore was first applied to a test set of 100 protein-ligand complexes constructed by Wang et al. (J Med Chem 2003, 46, 2287), and compared with 14 other scoring functions. The results show that ITScore yielded a high success rate of 82% on identifying native-like binding modes under the criterion of rmsd ≤ 2 Å for each top-ranked ligand conformation. The success rate increased to 98% if the top five conformations were considered for each ligand. In the case of binding affinity prediction, ITScore also obtained a good correlation for this test set (R = 0.65). Next, ITScore was used to predict binding affinities of a second diverse test set of 77 protein-ligand complexes prepared by Muegge and Martin (J Med Chem 1999, 42, 791), and compared with four other widely used knowledge-based scoring functions. ITScore yielded a high correlation of R² = 0.65 (or R = 0.81) in the affinity prediction. Finally, enrichment tests were performed with ITScore against four target proteins using the compound databases constructed by Jacobsson et al. (J Med Chem 2003, 46, 5781). The results were compared with those of eight other scoring functions. ITScore yielded high enrichments in all four database screening tests. ITScore can be easily combined with existing docking programs for use in structure-based drug design.
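
The top-ranked and top-five success rates above can be computed from per-ligand RMSD lists sorted by score. A minimal sketch under that assumption follows; the function name and the sample values are hypothetical.

```python
def top_n_success_rate(rmsd_by_ligand, n=1, cutoff=2.0):
    """Fraction of ligands with a pose of RMSD <= cutoff (in Å) among the n best-scored poses.

    rmsd_by_ligand: one list per ligand, RMSD values ordered from best to worst score.
    """
    successes = sum(
        1 for rmsds in rmsd_by_ligand if any(value <= cutoff for value in rmsds[:n])
    )
    return successes / len(rmsd_by_ligand)


# Hypothetical RMSD values (Å) for four ligands, two scored poses each.
results = [[0.8, 3.0], [4.1, 1.5], [5.0, 6.0], [2.0, 9.0]]
top1 = top_n_success_rate(results, n=1)
top2 = top_n_success_rate(results, n=2)
```

Widening n can only keep or raise the success rate, which is why the top-five figure is never lower than the top-ranked one.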

16.
We report the design and validation of a fast empirical function for scoring RNA-ligand interactions, and describe its implementation within RiboDock, a virtual screening system for automated flexible docking. Building on well-known protein-ligand scoring function foundations, features were added to describe the interactions of common RNA-binding functional groups that were not handled adequately by conventional terms, to disfavour non-complementary polar contacts, and to control non-specific charged interactions. The results of validation experiments against known structures of RNA-ligand complexes compare favourably with previously reported methods. Binding modes were well predicted in most cases and good discrimination was achieved between native and non-native ligands for each binding site, and between native and non-native binding sites for each ligand. Further evidence of the ability of the method to identify true RNA binders is provided by compound selection ('enrichment factor') experiments based around a series of HIV-1 TAR RNA-binding ligands. Significant enrichment in true binders was achieved amongst high scoring docking hits, even when selection was from a library of structurally related, positively charged molecules. Coupled with a semi-automated cavity detection algorithm for identification of putative ligand binding sites, also described here, the method is suitable for the screening of very large databases of molecules against RNA and RNA-protein interfaces, such as those presented by the bacterial ribosome.

17.
Performance of Glide was evaluated in a sequential multiple ligand docking paradigm predicting the binding modes of 129 protein-ligand complexes crystallized with clusters of 2-6 cooperative ligands. Three sampling protocols (single precision-SP, extra precision-XP, and SP without scaling ligand atom radii-SP hard) combined with three different scoring functions (GlideScore, Emodel and Glide Energy) were tested. The effects of ligand number, docking order and druglikeness of ligands and closeness of the binding site were investigated. On average 36% of all structures were reproduced with RMSDs lower than 2 Å. Correctly docked structures reached 50% when docking druglike ligands into closed binding sites by the SP hard protocol. Cooperative binding to metabolic and transport proteins can dramatically alter pharmacokinetic parameters of drugs. Analyzing the cytochrome P450 subset the SP hard protocol with Emodel ranking reproduced two-thirds of the structures well. Multiple ligand binding is also exploited by the fragment linking approach in lead discovery settings. The HSP90 subset from real life fragment optimization programs revealed that Glide is able to reproduce the positions of multiple bound fragments if conserved water molecules are considered. These case studies assess the utility of Glide in sequential multiple docking applications.

18.
We present a docking method that uses a scoring function for protein-ligand docking that is designed to maximize the docking success rate for low-resolution protein structures. We find that the resulting scoring function parameters are very different depending on whether they were optimized for high- or low-resolution protein structures. We show that this docking method can be successfully applied to predict the ligand-binding site of low-resolution structures. For a set of 25 protein-ligand complexes, in 76% of the cases, more than 50% of ligand-contacting residues are correctly predicted (using receptor crystal structures where the binding site is unspecified). Using decoys of the receptor structures having a 4 Å RMSD from the native structure, for the same set of complexes, in 72% of the cases, we obtain at least one correctly predicted ligand-contacting residue. Furthermore, using an 81-protein-ligand set described by Jain, in 76 (93.8%) cases, the algorithm correctly predicts more than 50% of the ligand-contacting residues when native protein structures are used. Using 3 Å RMSD from native decoys, in all but two cases (97.5%), the algorithm predicts at least one ligand-binding residue correctly. Finally, compared to the previously published Dolores method, for 298 protein-ligand pairs, the number of cases in which at least half of the specific contacts are correctly predicted is more than four times greater.

19.
The performances of several two-step scoring approaches for molecular docking were assessed for their ability to predict binding geometries and free energies. Two new scoring functions designed for "step 2 discrimination" were proposed and compared to our CHARMM implementation of the linear interaction energy (LIE) approach using the Generalized-Born with Molecular Volume (GBMV) implicit solvation model. A scoring function S1 was proposed by considering only "interacting" ligand atoms as the "effective size" of the ligand and extended to an empirical regression-based pair potential S2. The S1 and S2 scoring schemes were trained and 5-fold cross-validated on a diverse set of 259 protein-ligand complexes from the Ligand Protein Database (LPDB). The regression-based parameters for S1 and S2 also demonstrated reasonable transferability in the CSARdock 2010 benchmark using a new data set (NRC HiQ) of diverse protein-ligand complexes. The ability of the scoring functions to accurately predict ligand geometry was evaluated by calculating the discriminative power (DP) of the scoring functions to identify native poses. The parameters for the LIE scoring function with the optimal discriminative power (DP) for geometry (step 1 discrimination) were found to be very similar to the best-fit parameters for binding free energy over a large number of protein-ligand complexes (step 2 discrimination). Reasonable performance of the scoring functions in enrichment of active compounds in four different protein target classes established that the parameters for S1 and S2 provided reasonable accuracy and transferability. Additional analysis was performed to definitively separate scoring function performance from molecular weight effects. This analysis included the prediction of ligand binding efficiencies for a subset of the CSARdock NRC HiQ data set where the number of ligand heavy atoms ranged from 17 to 35. 
This range of ligand heavy atoms is where improved accuracy of predicted ligand efficiencies is most relevant to real-world drug design efforts.

20.
Improving the scoring functions for small molecule-protein docking is a highly challenging task in current computational drug design. Here we present a novel consensus scoring concept for the prediction of binding modes for multiple known active ligands. Similar ligands are generally believed to bind to their receptor in a similar fashion. The presumption of our approach was that the true binding modes of similar ligands should be more similar to each other than false positive binding modes are. The number of conserved (consensus) interactions between similar ligands was used as a docking score. Patterns of interactions were modeled using ligand-receptor interaction fingerprints. Our approach was evaluated for four different data sets of known cocrystal structures (CDK-2, dihydrofolate reductase, HIV-1 protease, and thrombin). Docking poses were generated with FlexX and rescored by our approach. For comparison the CScore scoring functions from Sybyl were used, and consensus scores were calculated from them. Our approach performed better than individual scoring functions and was comparable to consensus scoring. Analysis of the distribution of docking poses by self-organizing maps (SOM) and interaction fingerprints confirmed that clusters of docking poses composed of multiple ligands were preferentially observed near the native binding mode. Being conceptually unrelated to commonly used docking scoring functions, our approach provides a powerful method to complement and improve computational docking experiments.
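
One way to picture a score based on conserved interactions is to represent each pose as a set of (residue, interaction type) pairs and count the pairs it shares with poses of similar ligands. The sketch below is an assumption about how such counting could look, not the published fingerprint encoding or scoring rule; the residue labels and function name are hypothetical.

```python
def consensus_interaction_score(pose_fingerprint, other_fingerprints):
    """Sum, over similar ligands, of the interactions shared with the candidate pose."""
    return sum(len(pose_fingerprint & other) for other in other_fingerprints)


# Hypothetical fingerprints: sets of (residue, interaction type) pairs.
candidate = {("LEU83", "hbond"), ("LYS33", "ionic"), ("PHE80", "hydrophobic")}
decoy = {("GLU12", "hbond")}
similar_ligand_poses = [
    {("LEU83", "hbond"), ("PHE80", "hydrophobic")},
    {("LEU83", "hbond")},
]
candidate_score = consensus_interaction_score(candidate, similar_ligand_poses)
decoy_score = consensus_interaction_score(decoy, similar_ligand_poses)
```

Under this counting, a pose that repeats the interaction pattern of its neighbours scores higher than one that binds in an unrelated way.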
