Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Structure-based virtual screening (SBVS) utilizing docking algorithms has become an essential tool in the drug discovery process, and significant progress has been made in successfully applying the technique to a wide range of receptor targets. In silico validation of virtual screening protocols before application to a receptor target using a corporate or commercially available compound collection is key to establishing a successful process. Ultimately, retrieval of a set of active compounds from a database of inactives is required, and the metric of enrichment (E) is habitually used to discern the quality of separation of the two. Numerous reports have addressed the performance of docking algorithms with regard to the quality of binding mode prediction and the issue of postprocessing "hit lists" of docked ligands. However, the impact of ligand database preprocessing has yet to be examined in the context of virtual screening and prioritization of compounds for biological evaluation. We provide an insight into the implications of cheminformatic preprocessing of a validation database of compounds where multiple protonated, tautomeric, stereochemical, and conformational states have been enumerated. Several commonly used methods for the generation of ligand conformations and conformational ensembles are examined, paired with an exhaustive rigid-body algorithm for the docking of different "multimeric" compound representations to the ligand binding site of the human estrogen receptor alpha. Chemgauss, a shapegaussian scoring function with intrinsic chemical knowledge, was combined with PLP as a consensus-scoring scheme to rank output from the docking protocol and enrichment rates calculated for each screen. The overheads of CPU consumption and the effect on relative database size (disk requirement) for each of the protocols employed are considered. 
Assessment of these parameters indicates that SBVS enrichments are highly dependent on the initial cheminformatic treatment(s) used in database construction. The interplay of SMILES representations, stereochemical information, protonation state enumeration, and ligand conformation ensembles is critical in achieving optimum enrichment rates in such screening.
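The enrichment metric mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not define E, so the code below assumes the commonly used enrichment factor: the hit rate in the top-ranked fraction of the database divided by the hit rate in the whole database. The function name, its parameters, and the toy data are hypothetical.

```python
import math


def enrichment_factor(scores, is_active, fraction=0.01, higher_is_better=True):
    """Enrichment factor in the top `fraction` of a ranked database.

    Assumed definition (not taken from the abstract):
    EF = (actives in top / size of top) / (all actives / database size).
    """
    n = len(scores)
    if n == 0 or n != len(is_active):
        raise ValueError("scores and is_active must be non-empty and equal in length")
    total_actives = sum(1 for a in is_active if a)
    if total_actives == 0:
        raise ValueError("at least one active is required")
    # Number of compounds in the top-ranked subset (at least one).
    n_top = max(1, math.ceil(n * fraction))
    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0],
                    reverse=higher_is_better)
    hits = sum(1 for _, active in ranked[:n_top] if active)
    return (hits / n_top) / (total_actives / n)


# Toy example: 10 compounds, 2 actives, both ranked in the top 20%.
toy_scores = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(enrichment_factor(toy_scores, toy_labels, fraction=0.2))
```

For docking scores where more negative values are better, pass `higher_is_better=False`.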

2.
A naïve Bayes classifier, employed in conjunction with 2D pharmacophore feature triplet vectors describing the molecules, is presented and validated. Molecules are described using a vector in which each element contains the number of times a particular triplet of atom-based features separated by a set of topological distances occurs. Using the feature triplet vectors it is possible to generate naïve Bayes classifiers that predict whether molecules are likely to be active against a given target (or family of targets). Two retrospective validation experiments were performed using a range of actives from WOMBAT, the Prous Integrity database, and the Arena screening library. The performance of the classifiers was evaluated using enrichment curves, enrichment factors, and the BEDROC metric. The classifiers were found to give significant enrichments for the various test sets.
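A classifier over triplet count vectors might be sketched as follows. The abstract does not say which naïve Bayes variant was used, so this sketch assumes a multinomial model with Laplace smoothing. The feature keys (`"t1"`, `"t2"`, ...) stand in for hypothetical feature triplets, and all function names are illustrative.

```python
import math
from collections import Counter


def train_naive_bayes(vectors, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model on sparse count vectors (dicts)."""
    vocab = set()
    for vec in vectors:
        vocab.update(vec)
    model = {"vocab": vocab, "alpha": alpha, "classes": {}}
    for cls in set(labels):
        counts = Counter()
        n_members = 0
        for vec, label in zip(vectors, labels):
            if label == cls:
                counts.update(vec)  # adds the triplet counts of this molecule
                n_members += 1
        model["classes"][cls] = {
            "log_prior": math.log(n_members / len(labels)),
            "counts": counts,
            "total": sum(counts.values()),
        }
    return model


def log_score(model, vector, cls):
    """Unnormalized log posterior of class `cls` for one count vector."""
    params = model["classes"][cls]
    denom = params["total"] + model["alpha"] * len(model["vocab"])
    score = params["log_prior"]
    for feature, count in vector.items():
        if feature in model["vocab"]:  # ignore triplets never seen in training
            prob = (params["counts"][feature] + model["alpha"]) / denom
            score += count * math.log(prob)
    return score


def predict(model, vector):
    """Return the class with the highest log posterior."""
    return max(model["classes"], key=lambda cls: log_score(model, vector, cls))


# Toy training set: two "actives" rich in t1, two "inactives" rich in t3.
train_vectors = [{"t1": 3, "t2": 1}, {"t1": 2}, {"t3": 4}, {"t3": 1, "t2": 1}]
train_labels = [True, True, False, False]
nb_model = train_naive_bayes(train_vectors, train_labels)
print(predict(nb_model, {"t1": 2}))
```

Ranking a screening library by the difference between the active and inactive log scores would then give the ordered list from which enrichment curves are computed.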

3.
4.
Structure-based virtual screening plays an important role in drug discovery and complements other screening approaches. In general, protein crystal structures are prepared prior to docking in order to add hydrogen atoms, optimize hydrogen bonds, remove atomic clashes, and perform other operations that are not part of the X-ray crystal structure refinement process. In addition, ligands must be prepared to create 3-dimensional geometries, assign proper bond orders, and generate accessible tautomer and ionization states prior to virtual screening. While the need for proper system preparation is generally accepted in the field, an extensive study of the preparation steps and their effect on virtual screening enrichments has not been performed. In this work, we systematically explore each of the steps involved in preparing a system for virtual screening. We first explore a large number of parameters using the Glide validation set of 36 crystal structures and 1,000 decoys. We then apply a subset of protocols to the DUD database. We show that database enrichment is improved with proper preparation and that neglecting certain steps of the preparation process produces a systematic degradation in enrichments, which can be large for some targets. We provide examples illustrating the structural changes introduced by the preparation that impact database enrichment. While the work presented here was performed with the Protein Preparation Wizard and Glide, the insights and guidance are expected to be generalizable to structure-based virtual screening with other docking methods.

5.
The SAMPL3 fragment-based virtual screening challenge provides a valuable opportunity for researchers to test their programs, methods, and screening protocols in a blind testing environment. We participated in the SAMPL3 challenge and evaluated our virtual fragment screening protocol, which uses RosettaLigand as the core component, by screening a 500-fragment Maybridge library against bovine pancreatic trypsin. Our study reaffirmed that the real test for any virtual screening approach is a blind testing environment. The analyses presented in this paper also showed that virtual screening performance can be improved if a set of known active compounds is available and the parameters and methods that yield better enrichment are selected. Our study also highlighted that selecting an appropriate method to calculate partial charges is important for achieving accurate orientation and conformation of ligands within a binding site. Another finding is that using multiple receptor ensembles in docking does not always yield better enrichment than individual receptors. On the basis of our results and retrospective analyses from the SAMPL3 fragment screening challenge, we anticipate that the chances of success in a fragment screening process could be increased significantly with careful selection of receptor structures, consideration of protein flexibility, sufficient conformational sampling within the binding pocket, and accurate assignment of ligand and protein partial charges.

6.
Pharmacophore multiplets are useful tools for 3D database searching, with the queries used ordinarily being derived from ensembles of random conformations of active ligands. It seems reasonable to expect that their usefulness can be augmented by instead using queries derived from single ligand conformations obtained from aligned ligands. Comparisons of pharmacophore multiplet searching using random conformations with multiplet searching using single conformations derived from GALAHAD (a genetic algorithm with linear assignment for hypermolecular alignment of datasets) models do indeed show that, while query hypotheses based on random conformations are quite effective, hypotheses based on aligned conformations do a better job of discriminating between active and inactive compounds. In particular, the hypothesis created from a neuraminidase inhibitor model was more similar to half of 18 known actives than all but 0.2% of the compounds in a structurally diverse subset of the World Drug Index. Similarly, a model developed from five angiotensin II antagonists yielded hypotheses that placed 65 known antagonists within the top 0.1–1% of decoy databases. The differences in discriminating power ranged from 2 to 20-fold, depending on the protein target and the type of pharmacophore multiplet used.

7.
In conjunction with the recent American Chemical Society symposium titled "Docking and Scoring: A Review of Docking Programs" the performance of the DOCK6 program was evaluated through (1) pose reproduction and (2) database enrichment calculations on a common set of organizer-specified systems and datasets (ASTEX, DUD, WOMBAT). Representative baseline grid score results averaged over five docking runs yield a relatively high pose identification success rate of 72.5% (symmetry-corrected rmsd) and sampling rate of 91.9% for the multi-site ASTEX set (N = 147) using organizer-supplied structures. Numerous additional docking experiments showed that ligand starting conditions, symmetry, multiple binding sites, clustering, and receptor preparation protocols all affect success. Encouragingly, in some cases, use of more sophisticated scoring and sampling methods yielded results which were comparable to (Amber score ligand movable protocol) or exceeded (LMOD score) analogous baseline grid-score results. The analysis highlights the potential benefit and challenges associated with including receptor flexibility and indicates that different scoring functions have system-dependent strengths and weaknesses. Enrichment studies with the DUD database prepared using the SB2010 preparation protocol and native ligand pairings yielded individual area under the curve (AUC) values derived from receiver operating characteristic curve analysis ranging from 0.29 (bad enrichment) to 0.96 (good enrichment) with an average value of 0.60 (27/38 have AUC ≥ 0.5). Strong early enrichment was also observed in the critically important 1.0-2.0% region. Somewhat surprisingly, an alternative receptor preparation protocol yielded comparable results. As expected, semi-random pairings yielded poorer enrichments, in particular, for unrelated receptors.
Overall, the breadth and number of experiments performed provide a useful snapshot of current capabilities of DOCK6 as well as starting points to guide future development efforts to further improve sampling and scoring.
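The AUC values quoted in this abstract come from receiver operating characteristic analysis. As a minimal sketch, and not the procedure used by the DOCK6 authors, the AUC can be computed as the probability that a randomly chosen active outranks a randomly chosen decoy, with ties counted as one half. The function name and the toy data are hypothetical, and the sketch assumes that higher scores are better, so energy-like docking scores would need to be negated first.

```python
def roc_auc(scores, is_active):
    """ROC AUC via pairwise comparison of actives against decoys.

    Assumes higher score = better rank. Runs in O(actives * decoys),
    which is adequate for an illustration.
    """
    actives = [s for s, a in zip(scores, is_active) if a]
    decoys = [s for s, a in zip(scores, is_active) if not a]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a_score in actives:
        for d_score in decoys:
            if a_score > d_score:
                wins += 1.0
            elif a_score == d_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(actives) * len(decoys))


# Toy example: one active is ranked first, the other third of four.
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

An AUC of 0.5 corresponds to random ranking, which is why the abstract reports how many systems reach at least that value.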

8.
Shape Signatures, a new 3-dimensional molecular comparison method, has been adapted to rank ligands of the serotonin receptors. A set of 825 agonists and 400 antagonists together with approximately 10,000 randomly chosen compounds from the NCI database was used in this study. Both 1D and 2D Shape Signature databases were created, and enrichment studies were carried out. Results from these studies reveal that the 1D Shape Signature approach is highly efficient in separating agonists from a mixture of molecules that includes compounds randomly selected from the NCI database and taken as inactives. It is equally effective at separating agonists and antagonists from a pool of active ligands for the serotonin receptor. Parallel enrichment studies using 2D shape signatures showed high selectivity with more restricted coverage due to the high specificity of 2D signatures. The influence of conformational variation of the shape signature on enrichment was explored by docking a subset of ligands into the crystal structure of serotonin N-acetyltransferase. Enrichment studies on the resulting "docked" conformations produced only slightly improved results compared with the CORINA-generated conformations.

9.
10.
11.
A major problem in structure-based virtual screening applications is the appropriate selection of a single or even multiple protein structures to be used in the virtual screening process. A priori it is unknown which protein structure(s) will perform best in a virtual screening experiment. We investigated the performance of ensemble docking, as a function of ensemble size, for eight targets of pharmaceutical interest. Starting from single protein structure docking results, for each ensemble size up to 500,000 combinations of protein structures were generated, and, for each ensemble, pose prediction and virtual screening results were derived. Comparison of single with multiple protein structure results suggests improvements when the worst and average performance over all single protein structures is compared with the worst and average performance, respectively, over all protein ensembles of size two or greater. We identified several key factors affecting ensemble docking performance, including the sampling accuracy of the docking algorithm, the choice of the scoring function, and the similarity of database ligands to the cocrystallized ligands of ligand-bound protein structures in an ensemble. Due to these factors, the prospective selection of optimum ensembles is a challenging task, as shown by a reassessment of published ensemble selection protocols.
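Deriving ensemble results from single-structure docking results can be sketched as below. The abstract does not state how scores were merged across structures, so this sketch assumes the common convention of keeping each ligand's best score over all ensemble members. The function name, the structure labels, and the scores are hypothetical.

```python
def ensemble_scores(per_structure, lower_is_better=True):
    """Merge per-structure docking scores into one score per ligand.

    `per_structure` maps a structure label to a {ligand: score} dict.
    Assumption: the ensemble score is the best score across members.
    Only ligands docked to every structure are kept.
    """
    if not per_structure:
        raise ValueError("at least one structure is required")
    pick = min if lower_is_better else max
    tables = list(per_structure.values())
    shared = set(tables[0]).intersection(*tables[1:])
    return {ligand: pick(table[ligand] for table in tables) for ligand in shared}


# Toy example: two receptor conformations, energy-like scores.
single_results = {
    "structure_A": {"lig1": -7.0, "lig2": -5.0},
    "structure_B": {"lig1": -6.0, "lig2": -9.0, "lig3": -1.0},
}
print(ensemble_scores(single_results))
```

Because the merge only reuses stored single-structure scores, many ensemble combinations can be evaluated without rerunning the docking, for example by iterating over `itertools.combinations` of the structure labels.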

12.
Incorporating receptor flexibility is considered crucial for improvement of docking-based virtual screening. With an abundance of crystallographic structures freely available, docking with multiple crystal structures is believed to be a practical approach to cope with protein flexibility. Here we describe a successful application of the docking of multiple structures to discover novel and potent Chk1 inhibitors. Forty-six Chk1 structures were first compared in single structure docking by predicting the binding mode and recovering known ligands. Combinations of different protein structures were then compared by recovery of known ligands, and an optimal ensemble of Chk1 structures was selected. The chosen structures were used in the virtual screening of over 60,000 diverse compounds for Chk1 inhibitors. Six novel compounds ranked at the top of the hit list were tested experimentally, and two of these compounds inhibited Chk1 activity, the best with an IC50 value of 9.6 μM. Further study indicated that achieving a better enrichment and identifying more diverse compounds was more likely using multiple structures than using only a single structure, even when protein structures were randomly selected. Taking into account conformational energy difference did not help to improve enrichment in the top-ranked list.

13.
Shape-based methods for aligning and scoring ligands have proven to be valuable in the field of computer-aided drug design. Here, we describe a new shape-based flexible ligand superposition and virtual screening method, Phase Shape, which is shown to rapidly produce accurate 3D ligand alignments and efficiently enrich actives in virtual screening. We describe the methodology, which is based on the principle of atom distribution triplets to rapidly define trial alignments, followed by refinement of top alignments to maximize the volume overlap. The method can be run in a shape-only mode or it can include atom types or pharmacophore feature encoding, the latter consistently producing the best results for database screening. We apply Phase Shape to flexibly align molecules that bind to the same target and show that the method consistently produces correct alignments when compared with crystal structures. We then illustrate the effectiveness of the method for identifying active compounds in virtual screening of eleven diverse targets. Multiple parameters are explored, including atom typing, query structure conformation, and the database conformer generation protocol. We show that Phase Shape performs well in database screening calculations when compared with other shape-based methods using a common set of actives and decoys from the literature.

14.
Since the evaluation of ligand conformations is a crucial aspect of structure-based virtual screening, scoring functions play a significant role in it. However, it is known that a scoring function does not always work well for all target proteins. There is no standard way to determine a priori which scoring function works best against a given target protein, even if the 3D structure of a target protein-ligand complex is available. Therefore, developing a method that achieves high enrichments from given scoring functions and the 3D structure of a protein-ligand complex is a crucial and challenging task. To address this problem, we applied SCS (supervised consensus scoring) to virtual screening. SCS employs a rough linear correlation between the binding free energy and the root-mean-square deviation (rmsd) of native ligand conformations, and incorporates the protein-ligand binding process with docked ligand conformations using supervised learning. We evaluated both the docking poses and enrichments of SCS and five scoring functions (F-Score, G-Score, D-Score, ChemScore, and PMF) for three different target proteins: thymidine kinase (TK), thrombin, and peroxisome proliferator-activated receptor gamma (PPARgamma). Our enrichment studies show that SCS is competitive with or superior to the best single scoring function at the top ranks of the screened database. We found that the enrichments of SCS could be limited by the best scoring function, because SCS is obtained on the basis of the five individual scoring functions. We conclude from our results that SCS works very successfully. Moreover, from docking pose analysis, we revealed the connection between enrichment and the average centroid distance of top-scored docking poses. Since SCS requires only one 3D structure of a protein-ligand complex, SCS will be useful for identifying new ligands.

15.
This study unites six popular machine learning approaches to enhance the prediction of molecular binding affinity between receptors (large protein molecules) and ligands (small organic molecules). Here we examine a scheme where the affinity of ligands is predicted against a single receptor, human thrombin; thus, the models consider ligand features only. However, the suggested approach can be repurposed for other receptors. The methods include Support Vector Machine, Random Forest, CatBoost, feed-forward neural network, graph neural network, and Bidirectional Encoder Representations from Transformers. The first five methods use input features based on physico-chemical properties of molecules, while the last one is based on textual molecular representations. None of the approaches relies on atomic spatial coordinates, which avoids a potential bias from known structures and allows generalization to compounds with unknown conformations. Within each of the methods, we have trained two models that solve classification and regression tasks. Then, all models are grouped into a pipeline of two subsequent ensembles. The first ensemble aggregates six classification models which vote whether a ligand binds to a receptor or not. If a ligand is classified as active (i.e., binds), the second ensemble predicts its binding affinity in terms of the inhibition constant Ki.
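The two-ensemble pipeline described in this abstract could be sketched as follows. The abstract states that the classifiers vote and that the second ensemble predicts Ki, but it does not give the aggregation rules, so this sketch assumes a strict majority vote followed by a plain average of the regression outputs. The function name and the stand-in models are hypothetical.

```python
from statistics import mean


def two_stage_predict(ligand, classifiers, regressors):
    """Vote on activity first, then predict affinity only for predicted binders.

    `classifiers` are callables returning True/False for a ligand;
    `regressors` are callables returning an affinity value (for example pKi).
    Assumptions: strict majority vote, unweighted mean of regressors.
    Returns None when the ligand is voted inactive.
    """
    votes = [bool(clf(ligand)) for clf in classifiers]
    if sum(votes) * 2 <= len(votes):  # no strict majority for "active"
        return None
    return mean(reg(ligand) for reg in regressors)


# Stand-in models: real ones would be SVM, Random Forest, CatBoost, and so on.
toy_classifiers = [lambda lig: True, lambda lig: True, lambda lig: False]
toy_regressors = [lambda lig: 6.0, lambda lig: 8.0]
print(two_stage_predict("c1ccccc1", toy_classifiers, toy_regressors))
```

Gating the regression step on the vote means the affinity models are only asked about compounds that resemble their training domain of binders.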

16.
Protein-ligand docking programs have been used to efficiently discover novel ligands for target proteins from large-scale compound databases. However, better scoring methods are needed. Generally, scoring functions are optimized by means of various techniques that affect their fitness for reproducing X-ray structures and protein-ligand binding affinities. However, these scoring functions do not always work well for all target proteins. A scoring function should be optimized for a target protein to enhance enrichment for structure-based virtual screening. To address this problem, we propose the supervised scoring model (SSM), which takes into account the protein-ligand binding process using docked ligand conformations with supervised learning for optimizing scoring functions against a target protein. SSM employs a rough linear correlation between binding free energy and the root mean square deviation of a native ligand for predicting binding energy. We applied SSM to the FlexX scoring function, that is, F-Score, with five different target proteins: thymidine kinase (TK), estrogen receptor (ER), acetylcholine esterase (AChE), phosphodiesterase 5 (PDE5), and peroxisome proliferator-activated receptor gamma (PPARgamma). For all five proteins, SSM yielded better enrichment than F-Score, and the improvement was particularly remarkable for TK, AChE, and PPARgamma. We also demonstrated that SSM is especially good at enhancing enrichments at the top ranks of screened compounds, which is useful in practical drug screening.

17.
This paper describes the validation of a molecular docking method and its application to virtual database screening. The code flexibly docks ligand molecules into rigid receptor structures using a tabu search methodology driven by an empirically derived function for estimating the binding affinity of a protein-ligand complex. The docking method has been tested on 70 ligand-receptor complexes for which the experimental binding affinity and binding geometry are known. The lowest energy geometry produced by the docking protocol is within 2.0 Å root mean square deviation of the experimental binding mode for 79% of the complexes. The method has been applied to the problem of virtual database screening to identify known ligands for thrombin, factor Xa, and the estrogen receptor. A database of 10,000 randomly chosen "druglike" molecules has been docked into the three receptor structures. In each case known receptor ligands were included in the study. The results showed good separation between the predicted binding affinities of the known ligand set and the database subset.
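The pose-reproduction criterion in this abstract (a docked pose within 2.0 Å of the experimental binding mode) can be illustrated with a minimal sketch. It assumes both poses list the same atoms in the same order in the receptor frame, and it applies no superposition or symmetry correction. The function names and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered atom lists.

    Each argument is a sequence of (x, y, z) tuples in ångströms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def success_rate(rmsd_values, cutoff=2.0):
    """Fraction of complexes whose top-ranked pose is within `cutoff` Å."""
    if not rmsd_values:
        raise ValueError("no RMSD values supplied")
    return sum(1 for value in rmsd_values if value <= cutoff) / len(rmsd_values)


# Toy example: a two-atom ligand shifted by 1 Å along z.
docked = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(rmsd(docked, crystal))
print(success_rate([0.5, 1.9, 2.5, 3.0]))
```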

18.
Herein, we describe a method to flexibly align molecules (FLAME = FLexibly Align MolEcules). FLAME aligns two molecules by first finding maximum common pharmacophores between them using a genetic algorithm. The resulting alignments are then subjected to simultaneous optimizations of their internal energies and an alignment score. The utility of the method in pairwise alignment, multiple molecule flexible alignment, and database searching was examined. For pairwise alignment, two carboxypeptidase ligands (Protein Data Bank codes and ), two estrogen receptor ligands ( and ), and two thrombin ligands ( and ) were used as test sets. Alignments generated by FLAME starting from CONCORD structures compared very well to the X-ray structures (average root-mean-square deviation = 0.36 Å) even without further minimization in the presence of the protein. For multiple flexible alignments, five structurally diverse D3 receptor ligands were used as a test set. The FLAME alignment automatically identified three common pharmacophores: a base, a hydrogen-bond acceptor, and a hydrophobe/aromatic ring. The best alignment was then used to search the MDDR database. The search results were compared to the results using atom pair and Daylight fingerprint similarity. A similar database search comparison was also performed using estrogen receptor modulators. In both cases, hits identified by FLAME were structurally more diverse compared to those from the atom pair and Daylight fingerprint methods.

19.
It is often difficult to differentiate effectively between related G-protein coupled receptors and their subtypes when doing ligand-based drug design. GALAHAD uses a multi-objective scoring system to generate multiple alignments involving alternative trade-offs between the conflicting desires to minimize internal strain while maximizing pharmacophoric and steric (pharmacomorphic) concordance between ligands. The various overlays obtained can be associated with different subtypes by examination, even when the ligands available do not discriminate completely between receptors and when no specificity information has been used to bias the alignment process. This makes GALAHAD a potentially powerful tool for identifying discriminating models, as is illustrated here using a set of dopaminergic agonists that vary in their D1 vs. D2 receptor selectivity.

20.
HIV infection is initiated by fusion of the virus with the target cell through binding of the viral gp120 protein with the CD4 cell surface receptor protein and the CXCR4 or CCR5 co-receptors. There is currently considerable interest in developing novel ligands that can modulate the conformations of these co-receptors and, hence, ultimately block virus-cell fusion. This article describes a detailed comparison of the performance of receptor-based and ligand-based virtual screening approaches to find CXCR4 and CCR5 antagonists that could potentially serve as HIV entry inhibitors. Because no crystal structures for these proteins are available, homology models of CXCR4 and CCR5 have been built, using bovine rhodopsin as the template. For ligand-based virtual screening, several shape-based and property-based molecular comparison approaches have been compared, using high-affinity ligands as query molecules. These methods were compared by virtually screening a library assembled by us, consisting of 602 known CXCR4 and CCR5 inhibitors and some 4700 similar presumed inactive molecules. For each receptor, the library was queried using known binders, and the enrichment factors and diversity of the resulting virtual hit lists were analyzed. Overall, ligand-based shape-matching searches yielded higher enrichments than receptor-based docking, especially for CXCR4. The results obtained for CCR5 suggest the possibility that different active scaffolds bind in different ways within the CCR5 pocket.
