Similar Literature
 20 similar articles found (search time: 31 ms)
1.
Docking and scoring are critical issues in virtual drug screening methods. Fast and reliable methods are required for the prediction of binding affinity, especially when applied to a large library of compounds. The implementation of receptor flexibility and refinement of scoring functions for this purpose are extremely challenging in terms of computational speed. Here we propose a knowledge-based multiple-conformation docking method that efficiently accommodates receptor flexibility, thus permitting reliable virtual screening of large compound libraries. Starting with a small number of active compounds, a preliminary docking operation is conducted on a large ensemble of receptor conformations to select the minimal subset of receptor conformations that provides a strong correlation between the experimental binding affinity (e.g., Ki, IC50) and the docking score. Only this subset is used for subsequent multiple-conformation docking of the entire data set of library (test) compounds. In conjunction with the multiple-conformation docking procedure, a two-step scoring scheme is employed by which the optimal scoring geometries obtained from the multiple-conformation docking are re-scored by a molecular mechanics energy function including desolvation terms. To demonstrate its feasibility, we applied this integrated approach to the estrogen receptor alpha (ERα) system, for which published binding affinity data were available for a series of structurally diverse chemicals. The statistical correlation between docking scores and experimental values was significantly improved over that of single-conformation dockings. This approach led to substantial enrichment in virtual screening conducted on mixtures of active and inactive ERα compounds.

2.
Drug discovery research often relies on the use of virtual screening via molecular docking to identify active hits in compound libraries. An area for improvement among many state-of-the-art docking methods is the accuracy of the scoring functions used to differentiate active from nonactive ligands. Many contemporary scoring functions are influenced by the physical properties of the docked molecule. This bias can cause molecules with certain physical properties to incorrectly score better than others. Since variation in physical properties is inevitable in large screening libraries, it is desirable to account for this bias. In this paper, we present a method of normalizing docking scores using virtually generated decoy sets with matched physical properties. First, our method generates a set of property-matched decoys for every molecule in the screening library. Each library molecule and its decoy set are docked using a state-of-the-art method, producing a set of raw docking scores. Next, the raw docking score of each library molecule is normalized against the scores of its decoys. The normalized score represents the probability that the raw docking score was drawn from the background distribution of nonactive property-matched decoys. Assuming that the distribution of scores of active molecules differs from the nonactive score distribution, we expect that the score of an active compound will have a low probability of having been drawn from the nonactive score distribution. In addition to the use of decoys in normalizing docking scores, we suggest that decoy sets may be a useful tool to evaluate, improve, or develop scoring functions. We show that by analyzing docking scores of library molecules with respect to the docking scores of their virtually generated property-matched decoys, one can gain insight into the advantages, limitations, and reliability of scoring functions.
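
The normalization step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the probability is estimated, so the empirical rank-based estimate below is an assumption, as are the function name `decoy_normalized_score`, the add-one correction, and the convention that lower docking scores are better.

```python
def decoy_normalized_score(raw_score, decoy_scores):
    """Empirical probability that a property-matched decoy scores at least
    as well as the library molecule (assumes lower scores are better)."""
    if not decoy_scores:
        raise ValueError("need at least one decoy score")
    # Count decoys that score as well as or better than the library molecule.
    as_good = sum(1 for s in decoy_scores if s <= raw_score)
    # Add-one correction (an assumption) keeps the estimate above zero.
    return (as_good + 1) / (len(decoy_scores) + 1)


# Hypothetical decoy scores for one library molecule.
decoys = [-6.1, -5.8, -7.0, -6.5, -5.2, -6.9, -6.3, -5.5, -6.0]
p_strong = decoy_normalized_score(-9.4, decoys)   # better than every decoy
p_typical = decoy_normalized_score(-6.0, decoys)  # inside the decoy range
```

A small value suggests the raw score is unlikely to come from the nonactive background; a value near 1 suggests the score is explained by the molecule's physical properties alone.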

3.
Virtual screening by molecular docking has become a widely used approach to lead discovery in the pharmaceutical industry when a high-resolution structure of the biological target of interest is available. The performance of three widely used docking programs (Glide, GOLD, and DOCK) for virtual database screening is studied when they are applied to the same protein target and ligand set. Comparisons of the docking programs and scoring functions using a large and diverse data set of pharmaceutically interesting targets and active compounds are carried out. We focus on the problem of docking and scoring flexible compounds which are sterically capable of docking into a rigid conformation of the receptor. The Glide XP methodology is shown to consistently yield enrichments superior to the two alternative methods, while GOLD outperforms DOCK on average. The study also shows that docking into multiple receptor structures can decrease the docking error in screening a diverse set of active compounds.

4.
Receptor flexibility is a critical issue in structure-based virtual screening methods. Although multiple-receptor conformation docking is an efficient way to account for receptor flexibility, it is still too slow for large molecular libraries. It was reported that fast ligand-centric, shape-based virtual screening was more consistent for hit enrichment than typical single-receptor conformation docking. Thus, we designed a "distributed docking" method that improves virtual high-throughput screening by combining a shape-matching method with multiple-receptor conformation docking. Database compounds are classified in advance based on shape similarities to one of the crystal ligands complexed with the target protein. This classification enables us to pick the appropriate receptor conformation for a single-receptor conformation docking of a given compound, thereby avoiding time-consuming multiple docking. In particular, this approach utilizes cross-docking scores of known ligands to all available receptor structures in order to optimize the algorithm. The present virtual screening method was tested for reidentification of known PPARγ and p38 MAP kinase active compounds. We demonstrate that this method improves the enrichment while maintaining the computation speed of a typical single-receptor conformation docking.

5.
A new method has been developed to design a focused library based on available active compounds using protein-compound docking simulations. This method was applied to the design of a focused library for cytochrome P450 (CYP) ligands, not only to distinguish CYP ligands from other compounds but also to identify the putative ligands for a particular CYP. Principal component analysis (PCA) was applied to the protein-compound affinity matrix, which was obtained by thorough docking calculations between a large set of protein pockets and chemical compounds. Each compound was depicted as a point in the PCA space. Compounds that were close to the known active compounds were selected as candidate hit compounds. A machine-learning technique optimized the docking scores of the protein-compound affinity matrix to maximize the database enrichment of the known active compounds, providing an optimized focused library.

6.
The low accuracy of predicted docking scores is a critical problem in in silico drug screening. In order to improve the accuracy of docking scores, we approximated the protein-compound binding free energy as a linear combination of the raw docking scores of a target compound with many different protein pockets. The coefficients of the linear combination were estimated from the similarities among proteins, simply by using the amino-acid sequence similarities or identities of the proteins. This method was applied to in silico screening of the active compounds of five target proteins, and in every case it increased the hit ratio by approximately four to five times compared to that given by the raw docking scores alone. The hit ratio also became robust against differences among target proteins.
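
The linear combination described here can be sketched as follows. The abstract states only that the coefficients are estimated from sequence similarities; using the similarities directly as weights and normalizing them to sum to one is an assumption, and `combined_score` is a hypothetical name.

```python
def combined_score(raw_scores, similarities):
    """Similarity-weighted combination of one compound's docking scores.

    raw_scores[i]   -- docking score of the compound against protein pocket i
    similarities[i] -- sequence similarity (0..1) of protein i to the target
    """
    if len(raw_scores) != len(similarities):
        raise ValueError("one similarity is required per docking score")
    total = sum(similarities)
    if total <= 0:
        raise ValueError("at least one similarity must be positive")
    # Normalizing the weights to sum to 1 is an illustrative assumption.
    return sum(w * s for w, s in zip(similarities, raw_scores)) / total


# Hypothetical scores against the target (similarity 1.0) and two other proteins.
score = combined_score([-8.0, -6.0, -4.0], [1.0, 0.5, 0.0])
```

With these weights, a protein unrelated to the target (similarity 0) contributes nothing, while a close homolog pulls the estimate toward its own docking score.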

7.
Docking and scoring is currently one of the tools used for hit finding and hit-to-lead optimization when structural information about the target is known. Docking scores have been found useful for optimizing ligand binding to reproduce experimentally observed binding modes. The question is, can docking and scoring be used reliably for hit-to-lead optimization? To illustrate the challenges of scoring for hit-to-lead optimization, the relationship between docking scores and experimentally determined IC50 values measured in-house was tested. The influences of the particular target, crystal structure, and the precision of the scoring function on the ability to differentiate between actives and inactives were analyzed by calculating the area under the receiver operating characteristic (ROC) curve for docking scores. It was found that, for the test sets considered, MW and sometimes ClogP were as useful as GlideScores, and no significant difference was observed between SP and XP scores for differentiating between actives and inactives. Interpretation by an expert is still required to successfully utilize docking and scoring in hit-to-lead optimization.
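
The area under the ROC curve used in this study equals the fraction of active/inactive pairs that the score ranks correctly. A minimal sketch of that pairwise form, assuming lower docking scores are better and counting ties as half (the function name `roc_auc` and the example scores are illustrative):

```python
def roc_auc(active_scores, inactive_scores):
    """ROC AUC as the probability that a random active outscores a random
    inactive (lower score = better); ties count as half a correct pair."""
    if not active_scores or not inactive_scores:
        raise ValueError("need at least one active and one inactive")
    correct = 0.0
    for a in active_scores:
        for i in inactive_scores:
            if a < i:
                correct += 1.0
            elif a == i:
                correct += 0.5
    return correct / (len(active_scores) * len(inactive_scores))


# Hypothetical docking scores.
auc = roc_auc([-9.0, -7.5, -6.0], [-8.0, -6.0, -5.0, -4.0])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The same function applied to a simple descriptor such as molecular weight gives the kind of baseline comparison the abstract reports.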

8.
A molecular docking method designated ADDock, an anchor-dependent molecular docking process for docking small flexible molecules into rigid protein receptors, is presented in this article. ADDock builds the bond connection lists for atoms based on anchors chosen for constructing molecular structures when docking small flexible molecules or ligands into rigid active sites of protein receptors. ADDock employs an extended version of the piecewise linear potential for scoring the docked structures. Since no translational motion of small molecules is implemented during the docking process, ADDock searches for the best docking result by systematically changing the chosen anchors, which are usually the single-edge connected nodes or terminal hydrogen atoms of ligands. ADDock uses the intact ligand structures generated during the docking process to compute the docking scores; therefore, no energy minimization is required in the evaluation phase of docking. The docking accuracy of ADDock for 92 receptor-ligand complexes is 91.3%. All these complexes have been docked by other groups using other docking methods. The receptor-ligand steric interaction energies computed by ADDock for some sets of active and inactive compounds docked into the same receptor active sites are apparently separated. These results show that, based on the steric interaction energies computed between the docked structures and receptor active sites, ADDock is able to separate active from inactive compounds docked into the same receptor.

9.
We are participating in the challenge of identifying active compounds for target proteins using structure-based virtual screening (SBVS). We use an in-house customized docking program, CONSENSUS-DOCK, which is a customized version of the DOCK4 program in which three scoring functions (DOCK4, FlexX and PMF) and consensus scoring have been implemented. This paper compares the docking calculation results obtained using CONSENSUS-DOCK and DOCK4, and demonstrates that CONSENSUS-DOCK produces better results than DOCK4 for major X-ray structures obtained from the Protein Data Bank (PDB).

10.
We present the results of a comprehensive study in which we explored how the docking procedure affects the performance of a virtual screening approach. We used four docking engines and applied 10 scoring functions to the top-ranked docking solutions of seeded databases against six target proteins. The scores of the experimental poses were placed within the total set to assess whether the scoring function required an accurate pose to provide the appropriate rank for the seeded compounds. This method allows a direct comparison of library ranking efficacy. Our results indicate that the LigandFit/Ligscore1 and LigandFit/GOLD docking/scoring combinations, and to a lesser degree FlexX/FlexX, Glide/Ligscore1, DOCK/PMF (Tripos implementation), LigandFit1/Ligscore2 and LigandFit/PMF (Tripos implementation) were able to retrieve the highest number of actives at a 10% fraction of the database when all targets were looked upon collectively. We also show that the scoring functions rank the observed binding modes higher than the inaccurate poses provided that the experimental poses are available. This finding stresses the discriminatory ability of the scoring algorithms, when better poses are available, and suggests that the number of false positives can be lowered with conformers closer to bioactive ones.

11.
We have developed a method that uses energetic analysis of structure-based fragment docking to elucidate key features for molecular recognition. This hybrid ligand- and structure-based methodology uses an atomic breakdown of the energy terms from the Glide XP scoring function to locate key pharmacophoric features from the docked fragments. First, we show that Glide accurately docks fragments, producing a root mean squared deviation (RMSD) of <1.0 Å for the top scoring pose to the native crystal structure. We then describe fragment-specific docking settings developed to generate poses that explore every pocket of a binding site while maintaining the docking accuracy of the top scoring pose. Next, we describe how the energy terms from the Glide XP scoring function are mapped onto pharmacophore sites from the docked fragments in order to rank their importance for binding. Using this energetic analysis we show that the most energetically favorable pharmacophore sites are consistent with features from known tight binding compounds. Finally, we describe a method to use the energetically selected sites from fragment docking to develop a pharmacophore hypothesis that can be used in virtual database screening to retrieve diverse compounds. We find that this method produces viable hypotheses that are consistent with known active compounds. In addition to retrieving diverse compounds that are not biased by the co-crystallized ligand, the method is able to recover known active compounds from a database screen, with an average enrichment of 8.1 in the top 1% of the database.
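
The enrichment figure quoted at the end of this abstract is usually computed as the hit rate in the top fraction of the ranked database divided by the hit rate in the whole database. The sketch below uses that common definition; the abstract does not spell out its exact formula, so treat the definition, the name `enrichment_factor`, and the rounding of the cutoff as assumptions.

```python
def enrichment_factor(ranked_is_active, fraction):
    """Hit rate in the top `fraction` of a ranked list divided by the hit
    rate in the whole list. ranked_is_active is ordered best score first."""
    n_total = len(ranked_is_active)
    hits_all = sum(ranked_is_active)
    if n_total == 0 or hits_all == 0:
        raise ValueError("need a non-empty list with at least one active")
    # Always inspect at least one compound, even for very small fractions.
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(ranked_is_active[:n_top])
    return (hits_top / n_top) / (hits_all / n_total)


# Hypothetical screen: 200 compounds, 10 actives, one active in the top 1%.
ranking = [True, False] + [True] * 9 + [False] * 189
ef_1pct = enrichment_factor(ranking, 0.01)
```

An enrichment of 1 means the ranking does no better than random selection at that cutoff.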

12.
Target-specific optimization of scoring functions for protein–ligand docking is an effective method for significantly improving the discrimination of active and inactive molecules in virtual screening applications. Its applicability, however, is limited due to the narrow focus on, e.g., single protein structures. Using an ensemble of protein kinase structures, the publicly available Directory of Useful Decoys (DUD) ligand dataset, and a novel multi-factorial optimization procedure, it is shown here that scoring functions can be tuned to multiple targets of a target class simultaneously. This leads to an improved robustness of the resulting scoring function parameters. Extensive validation experiments clearly demonstrate that (1) virtual screening performance for kinases improves significantly; (2) variations in database content affect this kind of machine-learning strategy to a lesser extent than binary QSAR models; and (3) the reweighting of interaction types is of particular importance for improved screening performance.

13.
We developed a new method to improve the accuracy of molecular interaction data using a molecular interaction matrix. This method was applied to enhance the database enrichment of in silico drug screening and in silico target protein screening using a protein-compound affinity matrix calculated by a protein-compound docking software. Our assumption was that the protein-compound binding free energy of a compound could be improved by a linear combination of its docking scores with many different proteins. We proposed two approaches to determine the coefficients of the linear combination. The first approach is based on similarity among the proteins, and the second is a machine-learning approach based on the known active compounds. These methods were applied to in silico screening of the active compounds of several target proteins and in silico target protein screening.

14.
Most recently published work in the field of docking and scoring protein/ligand complexes has focused on ranking true positives resulting from a Virtual Library Screening (VLS) through the use of a specified or consensus linear scoring function. In this work, we present a methodology to speed up the High Throughput Screening (HTS) process, by allowing focused screens or hit-list triaging when a prohibitively large number of hits is identified in the primary screen, in which we have extended the principle of consensus scoring in a nonlinear neural network manner. This led us to introduce a nonlinear Generalist scoring Function, GFscore, which was trained to discriminate true positives from false positives in a data set of diverse chemical compounds. This original Generalist scoring Function is a combination of the five scoring functions found in the CScore package from Tripos Inc. GFscore eliminates up to 75% of molecules, with a confidence rate of 90%. The final result is a hit enrichment in the list of molecules to investigate during a research campaign for biologically active compounds, where the remaining 25% of molecules would be sent to in vitro screening experiments. GFscore is therefore a powerful tool for the biologist, saving both time and money.

15.
A new method for the postprocessing of docking outputs has been developed, based on encoding putative 3D binding modes (docking solutions) as ligand-protein interactions into simple bit strings, a method analogous to the structural interaction fingerprint. Instead of employing traditional scoring functions, the method uses a series of new, knowledge-based scores derived from the similarity of the bit strings for each docking solution to that of a known reference binding mode. A GOLD docking study was carried out using the Bissantz estrogen receptor antagonist set along with the new scoring method. Superior recovery rates, with up to 2-fold enrichments, were observed when the new knowledge-based scoring was compared to the GOLD fitness score. In addition, top ranking sets of molecules (actives and potential actives or decoys) were structurally diverse with low molecular weights and structural complexities. Principal component analysis and clustering of the fingerprints permits the easy separation of active from inactive binding modes and the visualization of diverse binding modes.
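
Scoring a docking solution by the similarity of its interaction bit string to a reference binding mode can be sketched as below. The abstract does not name the similarity measure; the Tanimoto coefficient used here is an assumption, and the fingerprints and pose names are invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two equal-length bit strings like '10110'."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" and b == "1")
    either = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" or b == "1")
    # Two empty fingerprints share no interactions; score them as 0.
    return both / either if either else 0.0


# Each bit marks one ligand-protein interaction (hypothetical encoding).
reference = "1101001"
poses = {"pose_1": "1101000", "pose_2": "0010110", "pose_3": "1000001"}

# Rank docking solutions by similarity to the reference binding mode.
ranked = sorted(poses, key=lambda name: tanimoto(poses[name], reference),
                reverse=True)
```

Poses that reproduce more of the reference interactions rise to the top regardless of their conventional docking score.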

16.
Improving the scoring functions for small molecule-protein docking is a highly challenging task in current computational drug design. Here we present a novel consensus scoring concept for the prediction of binding modes for multiple known active ligands. Similar ligands are generally believed to bind to their receptor in a similar fashion. The presumption of our approach was that the true binding modes of similar ligands should be more similar to each other than false positive binding modes are. The number of conserved (consensus) interactions between similar ligands was used as a docking score. Patterns of interactions were modeled using ligand-receptor interaction fingerprints. Our approach was evaluated for four different data sets of known cocrystal structures (CDK-2, dihydrofolate reductase, HIV-1 protease, and thrombin). Docking poses were generated with FlexX and rescored by our approach. For comparison, the CScore scoring functions from Sybyl were used, and consensus scores were calculated from them. Our approach performed better than individual scoring functions and was comparable to consensus scoring. Analysis of the distribution of docking poses by self-organizing maps (SOM) and interaction fingerprints confirmed that clusters of docking poses composed of multiple ligands were preferentially observed near the native binding mode. Being conceptually unrelated to commonly used docking scoring functions, our approach provides a powerful method to complement and improve computational docking experiments.

17.
18.
We propose the hypothesis that "a model of an active compound can be provided by integrating information on compounds ranked highly by docking simulation of a random compound library". In our hypothesis, the inclusion of true active compounds among the high-ranked compounds is not necessary. We regard the high-ranked compounds as pseudo-active compounds. As a method to embody our hypothesis, we introduce a pseudo-structure-activity relationship (PSAR) model. Although the PSAR model is the same as a quantitative structure-activity relationship (QSAR) model in terms of statistical methodology, the implications of the training data are different. Known active compounds (ligands) are used as training data in the QSAR model, whereas the pseudo-active compounds are used in the PSAR model. In this study, Random Forest was used as the machine-learning algorithm. From tests on four functionally different targets, estrogen receptor antagonist (ER), thymidine kinase (TK), thrombin, and acetylcholine esterase (AChE), using five scoring functions, we obtained three conclusions: (1) the PSAR models found significantly higher percentages of known ligands than random sampling, and these results are sufficient to support our hypothesis; (2) the PSAR models found higher percentages of known ligands than normal scoring by a scoring function, and these results demonstrate the practical usefulness of the PSAR model; and (3) the PSAR model can assess compounds that failed in the docking simulation. Note that PSAR and QSAR models are used in different situations; the advantage of the PSAR model emerges when no ligand is available as training data or when one wants to find novel types of ligands, whereas the QSAR model is effective for finding compounds similar to known ligands when the ligands are already known.

19.
The results of 16 docking simulations with rigid receptor sites and flexible ligands (∼60,000 compounds in each case) are statistically analyzed and compared. Different combinations of binding sites, scoring functions, and compound collections are used in these calculations. The docking scores are not randomly distributed over the scoring range; they follow Gaussian distributions regardless of the binding sites, scoring functions, or screened compounds. If the docking sites are small, the Gaussian distributions are positively skewed. Peaks of the Gaussian distributions are populated with compounds having similar scores but different sizes and binding modes. These findings have implications for compound selection via computational docking. ©1999 John Wiley & Sons, Inc. J Comput Chem 20: 1634–1643, 1999
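
Whether a set of docking scores is positively skewed can be checked with the moment-based sample skewness. This is a generic statistic rather than the analysis used in the paper, and the name `sample_skewness` and the example scores are hypothetical.

```python
def sample_skewness(scores):
    """Moment-based skewness g1 = m3 / m2**1.5 (population moments).
    Positive values indicate a longer tail toward higher scores."""
    n = len(scores)
    if n < 3:
        raise ValueError("need at least three scores")
    mean = sum(scores) / n
    m2 = sum((s - mean) ** 2 for s in scores) / n  # second central moment
    m3 = sum((s - mean) ** 3 for s in scores) / n  # third central moment
    if m2 == 0:
        raise ValueError("all scores are identical")
    return m3 / m2 ** 1.5


symmetric = sample_skewness([-8.0, -7.0, -6.0, -5.0, -4.0])
right_tailed = sample_skewness([-9.0, -8.0, -7.0, -6.0, 0.0])
```

A value near zero is consistent with a symmetric, Gaussian-like distribution; a clearly positive value matches the skew the abstract reports for small docking sites.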

20.
A set of 32 known thrombin inhibitors representing different chemical classes has been used to evaluate the performance of two implementations of incremental construction algorithms for flexible molecular docking: DOCK 4.0 and FlexX 1.5. Both docking tools are able to dock 10–35% of our test set within 2 Å of their known, bound conformations using default sampling and scoring parameters. Although flexible docking with DOCK or FlexX is not able to reconstruct all native complexes, it does offer a significant improvement over rigid body docking of single, rule-based conformations, which is still often used for docking of large databases. Docking of sets of multiple conformers of each inhibitor, obtained with a novel protocol for diverse conformer generation and selection, yielded results comparable to those obtained by flexible docking. Chemical scoring, which is an empirically modified force field scoring method implemented in DOCK 4.0, outperforms both interaction energy scoring by DOCK and the Böhm scoring function used by FlexX in rigid and flexible docking of thrombin inhibitors. Our results indicate that for reliable docking of flexible ligands the selection of anchor fragments, conformational sampling and currently available scoring methods still require improvement.
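
The "docked within 2 Å" criterion is normally an RMSD between docked and crystallographic ligand coordinates. A minimal sketch, assuming matched atom order, no superposition (both poses are in the receptor frame), and hypothetical helper names `rmsd` and `success_rate`:

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two lists of (x, y, z) tuples
    with identical atom ordering; no superposition is applied."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def success_rate(rmsds, cutoff=2.0):
    """Fraction of ligands docked within `cutoff` Å of the known pose."""
    return sum(1 for r in rmsds if r <= cutoff) / len(rmsds)


# Hypothetical two-atom ligand shifted by 1 Å along z.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
shift = rmsd(crystal, docked)
rate = success_rate([0.8, 1.9, 2.4, 5.1])
```

Symmetric ligands need symmetry-aware atom matching before this comparison, which the sketch does not attempt.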

