
1.

Enrichment of high-throughput screening data with increasing levels of noise using support vector machines, recursive partitioning, and Laplacian-modified naive Bayesian classifiers

Glick M Jenkins JL Nettles JH Hitchings H Davies JW 《Journal of chemical information and modeling》2006,46(1):193-200

High-throughput screening (HTS) plays a pivotal role in lead discovery for the pharmaceutical industry. In tandem, cheminformatics approaches are employed to increase the probability of the identification of novel biologically active compounds by mining the HTS data. HTS data is notoriously noisy, and therefore, the selection of the optimal data mining method is important for the success of such an analysis. Here, we describe a retrospective analysis of four HTS data sets using three mining approaches: Laplacian-modified naive Bayes, recursive partitioning, and support vector machine (SVM) classifiers with increasing stochastic noise in the form of false positives and false negatives. All three of the data mining methods at hand tolerated increasing levels of false positives even when the ratio of misclassified compounds to true active compounds was 5:1 in the training set. False negatives in the ratio of 1:1 were tolerated as well. SVM outperformed the other two methods in capturing active compounds and scaffolds in the top 1%. A Murcko scaffold analysis could explain the differences in enrichments among the four data sets. This study demonstrates that data mining methods can add a true value to the screen even when the data is contaminated with a high level of stochastic noise.
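The noise experiment in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0/1 label encoding, and the definition of enrichment as "hit rate in the top fraction divided by the overall hit rate" are assumptions made for illustration. The sketch only shows how false positives might be injected at a given ratio to true actives and how enrichment in the top 1% could be measured against the clean labels.

```python
import random


def inject_false_positives(labels, ratio, rng):
    """Relabel random inactives (0) as active (1) in a copy of the labels.

    The number of flipped compounds is ratio * (number of true actives),
    e.g. ratio=5 gives the 5:1 misclassified-to-active setting.
    (Hypothetical helper; the paper's exact procedure is not given here.)
    """
    actives = [i for i, y in enumerate(labels) if y == 1]
    inactives = [i for i, y in enumerate(labels) if y == 0]
    n_flip = min(len(inactives), int(ratio * len(actives)))
    noisy = list(labels)
    for i in rng.sample(inactives, n_flip):
        noisy[i] = 1
    return noisy


def enrichment_factor(scores, true_labels, top_fraction=0.01):
    """Hit rate among the top-scored fraction divided by the overall hit rate."""
    n = len(scores)
    n_top = max(1, int(n * top_fraction))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits_top = sum(true_labels[i] for i in ranked[:n_top])
    total_hits = sum(true_labels)
    if total_hits == 0:
        return 0.0
    return (hits_top / n_top) / (total_hits / n)
```

In use, a classifier would be trained on the noisy labels, and its scores on held-out compounds would be passed to `enrichment_factor` together with the clean labels.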

2.

Combinatorial QSAR of ambergris fragrance compounds (total citations: 4; self-citations: 0; citations by others: 4)

Kovatcheva A Golbraikh A Oloff S Xiao YD Zheng W Wolschann P Buchbauer G Tropsha A 《Journal of chemical information and computer sciences》2004,44(2):582-595


3.

Prediction of glass transition temperature (T(g)) of some compounds in organic electroluminescent devices with their molecular properties

Kim YS Kim JH Kim JS No KT 《Journal of chemical information and computer sciences》2002,42(1):75-81


4.

Application 2D Descriptors and Artificial Neural Networks for Beta-Glucosidase Inhibitors Screening

Maciej Przybyłek 《Molecules (Basel, Switzerland)》2020,25(24)


5.

Classification models for predicting the antimalarial activity against Plasmodium falciparum

Q. Liu M. Liu 《SAR and QSAR in environmental research》2020,31(4):313-324


6.

Classification of diverse organic compounds that induce chromosomal aberrations in Chinese hamster cells

McElroy NR Thompson ED Jurs PC 《Journal of chemical information and computer sciences》2003,43(6):2111-2119


7.

Prediction of the Lee retention indices of polycyclic aromatic hydrocarbons by artificial neural network

Skrbić B Onjia A 《Journal of chromatography. A》2006,1108(2):279-284


8.

Quantitative Structure Retention Relationship Modeling of Retention Time for Some Organic Pollutants

《Analytical letters》2012,45(5):823-835


9.

Combinatorial QSAR modeling of P-glycoprotein substrates

de Cerqueira Lima P Golbraikh A Oloff S Xiao Y Tropsha A 《Journal of chemical information and modeling》2006,46(3):1245-1254


10.

Predictive activity profiling of drugs by topological-fragment-spectra-based support vector machines

Kawai K Fujishima S Takahashi Y 《Journal of chemical information and modeling》2008,48(6):1152-1160

Aiming at the prediction of pleiotropic effects of drugs, we have investigated the multilabel classification of drugs that have one or more of 100 different kinds of activity labels. Structural feature representation of each drug molecule was based on the topological fragment spectra method, which was proposed in our previous work. Support vector machine (SVM) was used for the classification and the prediction of their activity classes. Multilabel classification was carried out by a set of the SVM classifiers. The collective SVM classifiers were trained with a training set of 59,180 compounds and validated by another set (validation set) of 29,590 compounds. For a test set that consists of 9,864 compounds, the classifiers correctly classified 80.8% of the drugs into their own active classes. The SVM classifiers also successfully performed predictions of the activity spectra for multilabel compounds.
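The idea of handling multilabel classification with a set of binary classifiers, one per activity label, can be sketched as follows. This is an illustration only: a nearest-centroid rule stands in for the SVM so that the example needs no external library, and all function names and the toy feature vectors are hypothetical rather than taken from the paper.

```python
def train_centroid(X, y):
    """Stand-in binary learner (NOT an SVM): store the mean vector per class."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    if not pos or not neg:
        raise ValueError("each label needs positive and negative examples")

    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    return mean(pos), mean(neg)


def predict_centroid(model, x):
    """Return 1 if x is closer to the positive centroid, else 0."""
    pos_c, neg_c = model
    d_pos = sum((a - b) ** 2 for a, b in zip(x, pos_c))
    d_neg = sum((a - b) ** 2 for a, b in zip(x, neg_c))
    return 1 if d_pos < d_neg else 0


def train_per_label(X, Y, n_labels):
    """Train one independent binary model for each activity label."""
    return [train_centroid(X, [row[j] for row in Y]) for j in range(n_labels)]


def predict_labels(models, x):
    """Predicted activity spectrum: one 0/1 decision per label."""
    return [predict_centroid(m, x) for m in models]
```

Because each label gets its own model, a compound can be assigned several activity classes at once, which is what allows an activity spectrum to be predicted for multilabel compounds.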

11.

Prediction of aqueous solubility and partition coefficient optimized by a genetic algorithm based descriptor selection method (total citations: 1; self-citations: 0; citations by others: 1)

Wegner JK Zell A 《Journal of chemical information and computer sciences》2003,43(3):1077-1084


12.

Classification of time-of-flight secondary ion mass spectrometry spectra from complex Cu–Fe sulphides by principal component analysis and artificial neural networks

Yogesh Kalegowda Sarah L. Harmer 《Analytica chimica acta》2013

Artificial neural network (ANN) and a hybrid principal component analysis-artificial neural network (PCA-ANN) classifiers have been successfully implemented for classification of static time-of-flight secondary ion mass spectrometry (ToF-SIMS) mass spectra collected from complex Cu–Fe sulphides (chalcopyrite, bornite, chalcocite and pyrite) at different flotation conditions. ANNs are very good pattern classifiers because of: their ability to learn and generalise patterns that are not linearly separable; their fault and noise tolerance capability; and high parallelism. In the first approach, fragments from the whole ToF-SIMS spectrum were used as input to the ANN, the model yielded high overall correct classification rates of 100% for feed samples, 88% for conditioned feed samples and 91% for Eh modified samples. In the second approach, the hybrid pattern classifier PCA-ANN was integrated. PCA is a very effective multivariate data analysis tool applied to enhance species features and reduce data dimensionality. Principal component (PC) scores which accounted for 95% of the raw spectral data variance, were used as input to the ANN, the model yielded high overall correct classification rates of 88% for conditioned feed samples and 95% for Eh modified samples.
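The step of keeping the principal components that account for 95% of the variance can be illustrated with a small sketch. It assumes the PCA eigenvalues (per-component variances) have already been computed by some other routine; the function name and the example values are hypothetical and not from the paper.

```python
def components_for_variance(eigenvalues, target=0.95):
    """Smallest number of leading components whose cumulative share of the
    total variance reaches the target fraction."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    # Components are ranked by variance, largest first.
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= target:
            return k
    return len(eigenvalues)
```

Only the scores on the first `k` components would then be passed to the classifier, which is how the dimensionality of the spectral input is reduced.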

13.

Differential Shannon entropy analysis identifies molecular property descriptors that predict aqueous solubility of synthetic compounds with high accuracy in binary QSAR calculations

Stahura FL Godden JW Bajorath J 《Journal of chemical information and computer sciences》2002,42(3):550-558


14.

Random forest models to predict aqueous solubility

Palmer DS O'Boyle NM Glen RC Mitchell JB 《Journal of chemical information and modeling》2007,47(1):150-158


15.

Machine learning study for the prediction of transdermal peptide

Jung E Choi SH Lee NK Kang SK Choi YJ Shin JM Choi K Jung DH 《Journal of computer-aided molecular design》2011,25(4):339-347


16.

Evaluating the applicability domain in the case of classification predictive models for carcinogenicity based on the counter propagation artificial neural network

Fjodorova N Novič M Roncaglioni A Benfenati E 《Journal of computer-aided molecular design》2011,25(12):1147-1158


17.

Theoretical prediction for the half wave reduction potential of organic molecules

Hadi Noorizadeh Abbas Farmany 《Russian Journal of Electrochemistry》2014,50(6):579-586


18.

QSPR study of Setschenow constants of organic compounds using MLR, ANN, and SVM analyses

Xu J Wang L Wang L Shen X Xu W 《Journal of computational chemistry》2011,32(15):3241-3252


19.

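Support vector machine classification and regression for QSAR studies of peptides (total citations: 4; self-citations: 0; citations by others: 4)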

周鹏 曾晖 李波 周原 李志良 《化学通报》2006,69(5):342-346

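Support vector machine (SVM) techniques were applied to classification and regression studies of two peptide compound systems and were systematically compared with k-nearest neighbors, multiple linear regression, partial least squares, and artificial neural networks. The results show that for small-sample, nonlinear problems, SVM offers good stability and generalization ability and in most cases yields better models than the traditional methods. For the classification problem, SVM achieved 100% classification accuracy on both the training set and the test set; for the regression problem, although its fit to the training samples was slightly poorer than that of the artificial neural network, it showed stronger predictive ability on the external test set.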

20.

PXR ligand classification model with SFED-weighted WHIM and CoMMA descriptors

Ma SL Joung JY Lee S Cho KH No KT 《SAR and QSAR in environmental research》2012,23(5-6):485-504
