
1.

Robust optimization of scoring functions for a target class

Markus H. J. Seifert 《Journal of computer-aided molecular design》2009,23(9):633-644

Target-specific optimization of scoring functions for protein–ligand docking is an effective method for significantly improving the discrimination of active and inactive molecules in virtual screening applications. Its applicability, however, is limited by its narrow focus on, e.g., single protein structures. Using an ensemble of protein kinase structures, the publicly available Directory of Useful Decoys (DUD) ligand dataset, and a novel multi-factorial optimization procedure, it is shown here that scoring functions can be tuned to multiple targets of a target class simultaneously. This leads to an improved robustness of the resulting scoring function parameters. Extensive validation experiments clearly demonstrate that (1) virtual screening performance for kinases improves significantly; (2) variations in database content affect this kind of machine-learning strategy to a lesser extent than binary QSAR models; and (3) the reweighting of interaction types is of particular importance for improved screening performance.

2.

A machine learning-based method to improve docking scoring functions and its application to drug repurposing

Kinnings SL Liu N Tonge PJ Jackson RM Xie L Bourne PE 《Journal of chemical information and modeling》2011,51(2):408-419

Docking scoring functions are notoriously weak predictors of binding affinity. They typically assign a common set of weights to the individual energy terms that contribute to the overall energy score; however, these weights should be gene family dependent. In addition, they incorrectly assume that individual interactions contribute toward the total binding affinity in an additive manner. In reality, noncovalent interactions often depend on one another in a nonlinear manner. In this paper, we show how the use of support vector machines (SVMs), trained by associating sets of individual energy terms retrieved from molecular docking with the known binding affinity of each compound from high-throughput screening experiments, can be used to improve the correlation between known binding affinities and those predicted by the docking program eHiTS. We construct two prediction models: a regression model trained using IC50 values from BindingDB, and a classification model trained using active and decoy compounds from the Directory of Useful Decoys (DUD). Moreover, to address the issue of overrepresentation of negative data in high-throughput screening data sets, we have designed a multiple-planar SVM training procedure for the classification model. The increased performance that both SVMs give when compared with the original eHiTS scoring function highlights the potential for using nonlinear methods when deriving overall energy scores from their individual components. We apply the above methodology to train a new scoring function for direct inhibitors of Mycobacterium tuberculosis (M.tb) InhA. By combining ligand binding site comparison with the new scoring function, we propose that phosphodiesterase inhibitors can potentially be repurposed to target M.tb InhA. Our methodology may be applied to other gene families for which target structures and activity data are available, as demonstrated in the work presented here.

3.

Evaluation and application of multiple scoring functions for a virtual screening experiment

Xing L Hodgkin E Liu Q Sedlock D 《Journal of computer-aided molecular design》2004,18(5):333-344

In order to identify novel chemical classes of factor Xa inhibitors, five scoring functions (FlexX, DOCK, GOLD, ChemScore and PMF) were engaged to evaluate the multiple docking poses generated by FlexX. The compound collection was composed of confirmed potent factor Xa inhibitors and a subset of the LeadQuest screening compound library. Except for PMF, the four other scoring functions succeeded in reproducing the crystal complex (PDB code: 1FAX). During virtual screening the highest hit rate (80%) was demonstrated by FlexX at an energy cutoff of -40 kJ/mol, which is about 40-fold over random screening (2.06%). Limited results suggest that presenting more poses of a single molecule to the scoring functions could deteriorate their enrichment factors. A series of promising scaffolds with favorable binding scores was retrieved from LeadQuest. Consensus scoring by pair-wise intersection failed to enrich the hit rate yielded by single scorings (i.e. FlexX). We note that reported successes of consensus scoring in hit rate enrichment could be artificial because their comparisons were based on a selected subset of single scoring and a markedly reduced subset of double or triple scoring. The findings presented in this report are based upon a single biological system and support further studies.
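The "about 40-fold" figure quoted above can be checked with a few lines of arithmetic. The sketch below assumes the usual definition of an enrichment factor as the hit rate in the selected subset divided by the random (baseline) hit rate; the function name is illustrative and does not come from the paper.

```python
def enrichment_factor(hit_rate_selected: float, hit_rate_random: float) -> float:
    """Ratio of the hit rate in the selected subset to the baseline hit rate.

    Assumed definition; the abstract only quotes the two hit rates.
    """
    if hit_rate_random <= 0:
        raise ValueError("baseline hit rate must be positive")
    return hit_rate_selected / hit_rate_random


# Values quoted in the abstract: 80% at the FlexX cutoff vs. 2.06% for random screening.
ef = enrichment_factor(0.80, 0.0206)
print(f"enrichment factor: {ef:.1f}")
```

With these inputs the ratio comes out just under 39, which is consistent with the rounded "about 40-fold" in the abstract.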

4.

Development and assessment of scoring functions for protein identification using PMF data

Song Z Chen L Ganapathy A Wan XF Brechenmacher L Tao N Emerich D Stacey G Xu D 《Electrophoresis》2007,28(5):864-870

Peptide mass fingerprinting (PMF) is one of the major methods for protein identification using MS technology. It is faster and cheaper than MS/MS. Although PMF does not differentiate trypsin-digested peptides of identical mass, which makes it less informative than MS/MS, current computational methods for PMF have the potential to improve its detection accuracy by better use of the information content in PMF spectra. We developed a number of new probability-based scoring functions for PMF protein identification based on the MOWSE algorithm. We considered a detailed distribution of matching masses in a protein database and peak intensity, as well as the likelihood of peptide matches to be close to each other in a protein sequence. Our computational methods are assessed and compared with other methods using PMF data of 52 gel spots of known protein standards. The comparison shows that our new scoring schemes have higher or comparable accuracies for protein identification in comparison to the existing methods. Our software is freely available upon request. The scoring functions can be easily incorporated into other proteomics software packages.

5.

Ligand-specific scoring functions: improved ranking of docking solutions

Pyrkov TV Priestle JP Jacoby E Efremov RG 《SAR and QSAR in environmental research》2008,19(1-2):91-99

Molecular docking is a powerful computational method that has been widely used in many biomolecular studies to predict the geometry of a protein–ligand complex. However, while its conformational search algorithms are usually able to generate the correct conformation of a ligand in the binding site, the scoring methods often fail to discriminate it among many false variants. We propose to treat this problem by applying more precise ligand-specific scoring filters to re-rank docking solutions. In this way specific features of interactions between protein and different types of compounds can be implicitly taken into account. New scoring functions were constructed including hydrogen bonds, hydrophobic and hydrophilic complementarity terms. These scoring functions also discriminate ligands by the size of the molecule, the total hydrophobicity, and the number of peptide bonds for peptide ligands. Weighting coefficients of the scoring functions were adjusted using a training set of 60 protein–ligand complexes. The proposed method was then tested on the results of docking obtained for an additional 70 complexes. In both cases the success rate was 5–8% better compared to the standard functions implemented in popular docking software.

6.

Impact of scoring functions on enrichment in docking-based virtual screening: an application study on renin inhibitors

Krovat EM Langer T 《Journal of chemical information and computer sciences》2004,44(3):1123-1129

The docking program LigandFit/Cerius2 has been used to perform shape-based virtual screening of databases against the aspartic protease renin, a target of determined three-dimensional structure. The protein structure was used in the induced fit binding conformation that occurs when renin is bound to the highly active renin inhibitor 1 (IC50 = 2 nM). The scoring was calculated using several different scoring functions in order to get insight into the predictability of the magnitude of binding interactions. A database of 1000 diverse and druglike compounds, composed of 990 members of a virtual database generated by using the iLib diverse software and 10 known active renin inhibitors, was docked flexibly and scored to determine appropriate scoring functions. All seven scoring functions used (LigScore1, LigScore2, PLP1, PLP2, JAIN, PMF, LUDI) were able to retrieve at least 50% of the active compounds within the first 20% (200 molecules) of the entire test database. A hit rate of 90% in the top 1.4% resulted using the quadruple consensus scoring of LigScore2, PLP1, PLP2, and JAIN. Additionally, a focused database was created with the iLib diverse software and used for the same procedure as the test database. Docking and scoring of the 990 focused compounds and the 10 known actives were performed. A hit rate of 100% in the top 8.4% resulted with use of the triple consensus scoring of PLP1, PLP2, and PMF. As expected, a ranking of the known active compounds within the focused database compared to the test database was observed. Adequate virtual screening conditions were derived empirically. They can be used for proximate docking and scoring application of compounds with putative renin inhibiting potency.
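As a rough illustration of consensus scoring, the sketch below keeps only compounds that rank in the top fraction under every scoring function, then reports how many known actives survive. The abstract does not state the consensus rule used with LigandFit/Cerius2, so the intersection rule, the "higher score is better" convention, the function names, and the toy scores are all assumptions.

```python
def top_fraction(scores: dict[str, float], fraction: float) -> set[str]:
    """Return the compound IDs in the best-scoring fraction (higher = better, assumed)."""
    n_keep = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_keep])


def consensus_hits(score_tables: list[dict[str, float]], fraction: float) -> set[str]:
    """Keep compounds that fall in the top fraction under every scoring function."""
    selections = [top_fraction(table, fraction) for table in score_tables]
    return set.intersection(*selections)


def recovered_actives(selected: set[str], actives: set[str]) -> float:
    """Fraction of the known actives that appear in the selected set."""
    return len(selected & actives) / len(actives)


# Hypothetical scores for five compounds under two scoring functions.
scores_a = {"c1": 9.0, "c2": 8.0, "c3": 3.0, "c4": 7.5, "c5": 1.0}
scores_b = {"c1": 5.0, "c2": 1.0, "c3": 4.5, "c4": 4.8, "c5": 0.5}
known_actives = {"c1", "c2"}

hits = consensus_hits([scores_a, scores_b], fraction=0.6)
print(sorted(hits), recovered_actives(hits, known_actives))
```

Intersection makes the selection stricter with each added scoring function, which is one reason consensus lists tend to be short; other rules (rank averaging, voting) are equally plausible readings of "consensus scoring".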

7.

Knowledge-based interaction fingerprint scoring: a simple method for improving the effectiveness of fast scoring functions

Mpamhanga CP Chen B McLay IM Willett P 《Journal of chemical information and modeling》2006,46(2):686-698

A new method for the postprocessing of docking outputs has been developed, based on encoding putative 3D binding modes (docking solutions) as ligand-protein interactions into simple bit strings, a method analogous to the structural interaction fingerprint. Instead of employing traditional scoring functions, the method uses a series of new, knowledge-based scores derived from the similarity of the bit strings for each docking solution to that of a known reference binding mode. A GOLD docking study was carried out using the Bissantz estrogen receptor antagonist set along with the new scoring method. Superior recovery rates, with up to 2-fold enrichments, were observed when the new knowledge-based scoring was compared to the GOLD fitness score. In addition, top ranking sets of molecules (actives and potential actives or decoys) were structurally diverse with low molecular weights and structural complexities. Principal component analysis and clustering of the fingerprints permit the easy separation of active from inactive binding modes and the visualization of diverse binding modes.
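The idea of scoring a docking pose by the similarity of its interaction bit string to a reference binding mode can be sketched as follows. The abstract does not name the similarity measure, so the Tanimoto coefficient used here is an assumption, and the bit strings and pose names are made up for illustration.

```python
def tanimoto(fp_a: str, fp_b: str) -> float:
    """Tanimoto coefficient of two equal-length bit strings such as '1101001'."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a, b = int(fp_a, 2), int(fp_b, 2)
    common = bin(a & b).count("1")   # interactions present in both
    union = bin(a | b).count("1")    # interactions present in either
    return common / union if union else 0.0


# Hypothetical fingerprints: each bit marks one ligand-protein interaction.
reference = "1101001"
poses = {"pose_1": "1101000", "pose_2": "0010110", "pose_3": "1000001"}

# Rank docking solutions by similarity to the reference binding mode.
ranked = sorted(poses, key=lambda name: tanimoto(poses[name], reference), reverse=True)
for name in ranked:
    print(name, round(tanimoto(poses[name], reference), 2))
```

Poses that reproduce more of the reference interactions rise to the top of the ranking regardless of their conventional docking score, which is the re-ranking effect the abstract describes.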

8.

Ligand-specific scoring functions: improved ranking of docking solutions

T.V. Pyrkov J.P. Priestle E. Jacoby R.G. Efremov 《SAR and QSAR in environmental research》2013,24(1-2):91-99


9.

Unbiasing scoring functions: a new normalization and rescoring strategy

Carta G Knox AJ Lloyd DG 《Journal of chemical information and modeling》2007,47(4):1564-1571


10.

Prediction of protein loop conformations using multiscale modeling methods with physical energy scoring functions

Olson MA Feig M Brooks CL 《Journal of computational chemistry》2008,29(5):820-831


11.

Cheminformatics meets molecular mechanics: a combined application of knowledge-based pose scoring and physical force field-based hit scoring functions improves the accuracy of structure-based virtual screening

Hsieh JH Yin S Wang XS Liu S Dokholyan NV Tropsha A 《Journal of chemical information and modeling》2012,52(1):16-28

Poor performance of scoring functions is a well-known bottleneck in structure-based virtual screening (VS), which is most frequently manifested in the scoring functions' inability to discriminate between true ligands and known nonbinders (therefore designated as binding decoys). This deficiency leads to a large number of false positive hits resulting from VS. We have hypothesized that filtering out or penalizing docking poses recognized as non-native (i.e., pose decoys) should improve the performance of VS in terms of improved identification of true binders. Using several concepts from the field of cheminformatics, we have developed a novel approach to identifying pose decoys from an ensemble of poses generated by computational docking procedures. We demonstrate that the use of a target-specific pose (scoring) filter in combination with a physical force field-based scoring function (MedusaScore) leads to significant improvement of hit rates in VS studies for 12 of the 13 benchmark sets from the clustered version of the Directory of Useful Decoys (DUD). This new hybrid scoring function outperforms several conventional structure-based scoring functions, including XSCORE::HMSCORE, ChemScore, PLP, and Chemgauss3, in 6 out of 13 data sets at the early stage of VS (up to 1% decoys of the screening database). We compare our hybrid method with several novel VS methods that were recently reported to have good performances on the same DUD data sets. We find that the ligands retrieved using our method are chemically more diverse in comparison with two ligand-based methods (FieldScreen and FLAP::LBX). We also compare our method with FLAP::RBLB, a high-performance VS method that also utilizes both the receptor and the cognate ligand structures. Interestingly, we find that the top ligands retrieved using our method are highly complementary to those retrieved using FLAP::RBLB, hinting at effective directions for best VS applications. We suggest that this integrative VS approach combining cheminformatics and molecular mechanics methodologies may be applied to a broad variety of protein targets to improve the outcome of structure-based drug discovery studies.

12.

Novel, customizable scoring functions, parameterized using N-PLS, for structure-based drug discovery

Catana C Stouten PF 《Journal of chemical information and modeling》2007,47(1):85-91

The ability to accurately predict biological affinity on the basis of in silico docking to a protein target remains a challenging goal in the CADD arena. Typically, "standard" scoring functions have been employed that use the calculated docking result and a set of empirical parameters to calculate a predicted binding affinity. To improve on this, we are exploring novel strategies for rapidly developing and tuning "customized" scoring functions tailored to a specific need. In the present work, three such customized scoring functions were developed using a set of 129 high-resolution protein-ligand crystal structures with measured Ki values. The functions were parameterized using N-PLS (N-way partial least squares), a multivariate technique well-known in the 3D quantitative structure-activity relationship field. A modest correlation between observed and calculated pKi values using a standard scoring function (r² = 0.5) could be improved to 0.8 when a customized scoring function was applied. To mimic a more realistic scenario, a second scoring function was developed, not based on crystal structures but exclusively on several binding poses generated with the Flo+ docking program. Finally, a validation study was conducted by generating a third scoring function with 99 randomly selected complexes from the 129 as a training set and predicting pKi values for a test set that comprised the remaining 30 complexes. Training and test set r² values were 0.77 and 0.78, respectively. These results indicate that, even without direct structural information, predictive customized scoring functions can be developed using N-PLS, and this approach holds significant potential as a general procedure for predicting binding affinity on the basis of in silico docking.

13.

Optimization of catalysts using specific, description-based genetic algorithms

Holena M Cukic T Rodemerck U Linke D 《Journal of chemical information and modeling》2008,48(2):274-282


14.

Prediction of protein modification sites of gamma-carboxylation using position specific scoring matrices based evolutionary information

《Computational Biology and Chemistry》2013


15.

Quantitative analysis of specific target DNA oligomers using a DNA-immobilized packed-column system

Pack SP Heo TH Devarayapalli KC Makino K 《Analytical and bioanalytical chemistry》2011,401(2):667-676

A DNA-immobilized packed column (DNA-packed column) relies on sequence-dependent interactions of target DNA or mRNA (in the mobile phase) with DNA probes (on the silica particles) in a continuous flow process. Although it could be considered an alternative to DNA chip methodology for quantitative analysis of specific DNA, its performance in practice has not been satisfactory. In this study, we set up a more efficient quantitative analysis system based on a DNA-packed column by employing a temperature-gradient strategy and a DMSO-containing mobile phase. Using a temperature-gradient strategy based on the Tm values of probe/target DNA hybridizations and a mobile phase containing 5% DMSO, we succeeded in the quantitative analysis of a specific complementary target distinguishable from non-complementary DNA oligomers or other similar DNA samples. In addition, two different target DNA oligomers, even with similar Tm values, were separated and detected quantitatively by using a packed column carrying two different DNA probes.

16.

Comments on "leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets": significance for the validation of scoring functions

Ballester PJ Mitchell JB 《Journal of chemical information and modeling》2011,51(8):1739-1741


17.

Empirical scoring functions: I. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes

Matthew D. Eldridge Christopher W. Murray Timothy R. Auton Gaia V. Paolini Roger P. Mee 《Journal of computer-aided molecular design》1997,11(5):425-445

This paper describes the development of a simple empirical scoring function designed to estimate the free energy of binding for a protein–ligand complex when the 3D structure of the complex is known or can be approximated. The function uses simple contact terms to estimate lipophilic and metal–ligand binding contributions, a simple explicit form for hydrogen bonds and a term which penalises flexibility. The coefficients of each term are obtained using a regression based on 82 ligand–receptor complexes for which the binding affinity is known. The function reproduces the binding affinity of the complexes with a cross-validated error of 8.68 kJ/mol. Tests on internal consistency indicate that the coefficients obtained are stable to changes in the composition of the training set. The function is also tested on two test sets containing a further 20 and 10 complexes, respectively. The deficiencies of this type of function are discussed and it is compared to approaches by other workers.

18.

An extensive test of 14 scoring functions using the PDBbind refined set of 800 protein-ligand complexes

Wang R Lu Y Fang X Wang S 《Journal of chemical information and computer sciences》2004,44(6):2114-2125

Fourteen popular scoring functions, i.e., X-Score, DrugScore, five scoring functions in the Sybyl software (D-Score, PMF-Score, G-Score, ChemScore, and F-Score), four scoring functions in the Cerius2 software (LigScore, PLP, PMF, and LUDI), two scoring functions in the GOLD program (GoldScore and ChemScore), and HINT, were tested on the refined set of the PDBbind database, a set of 800 diverse protein-ligand complexes with high-resolution crystal structures and experimentally determined Ki or Kd values. The focus of our study was to assess the ability of these scoring functions to predict binding affinities based on the experimentally determined high-resolution crystal structures of proteins in complex with their ligands. The quantitative correlation between the binding scores produced by each scoring function and the known binding constants of the 800 complexes was computed. X-Score, DrugScore, Sybyl::ChemScore, and Cerius2::PLP provided better correlations than the other scoring functions, with standard deviations of 1.8-2.0 log units. These four scoring functions were also found to be robust enough to carry out computation directly on unaltered crystal structures. To examine how well scoring functions predict the binding affinities for ligands bound to the same target protein, the performance of these 14 scoring functions was evaluated on three subsets of protein-ligand complexes from the test set: HIV-1 protease complexes (82 entries), trypsin complexes (45 entries), and carbonic anhydrase II complexes (40 entries). Although the results for the HIV-1 protease subset are less than desirable, several scoring functions are able to satisfactorily predict the binding affinities for the trypsin and the carbonic anhydrase II subsets with standard deviations as low as 1.0 log unit (corresponding to 1.3-1.4 kcal/mol at room temperature). Our results demonstrate the strengths as well as the weaknesses of current scoring functions for binding affinity prediction.
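The conversion quoted above (1.0 log unit corresponding to 1.3-1.4 kcal/mol) follows from the standard relation between binding free energy and the binding constant, ΔG = RT ln K, so that one log10 unit in Ki or Kd equals RT ln 10. A minimal sketch, where the function name and the choice of 20 °C and 25 °C as "room temperature" are illustrative assumptions:

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def log_units_to_kcal(delta_log_k: float, temperature_k: float = 298.15) -> float:
    """Free-energy difference (kcal/mol) for a change of delta_log_k in log10(K)."""
    return R_KCAL * temperature_k * math.log(10) * delta_log_k


# One log unit at two common choices of room temperature.
for temperature in (293.15, 298.15):
    print(f"{temperature:.2f} K: {log_units_to_kcal(1.0, temperature):.2f} kcal/mol")
```

Both temperatures give values inside the 1.3-1.4 kcal/mol range stated in the abstract; by the same relation, the 1.8-2.0 log unit deviations reported for the full set correspond to roughly 2.5-2.7 kcal/mol.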

19.

Polyamide platinum anticancer complexes designed to target specific DNA sequences

Jaramillo D Wheate NJ Ralph SF Howard WA Tor Y Aldrich-Wright JR 《Inorganic chemistry》2006,45(15):6004-6013


20.

Combined application of cheminformatics- and physical force field-based scoring functions improves binding affinity prediction for CSAR data sets

Hsieh JH Yin S Liu S Sedykh A Dokholyan NV Tropsha A 《Journal of chemical information and modeling》2011,51(9):2027-2035
