
1.

An approach to the interpretation of backpropagation neural network models in QSAR studies

I.I. Baskin A.O. Ait N.M. Halberstam V.A. Palyulin N.S. Zefirov 《SAR and QSAR in environmental research》2013,24(1):35-41

An approach to the interpretation of backpropagation neural network models for quantitative structure-activity and structure-property relationship (QSAR/QSPR) studies is proposed. The method is based on analyzing the first and second moments of the distribution of the first and second partial derivatives of the neural network outputs with respect to the inputs, calculated at the data points. These statistics make it possible not only to obtain essentially the same characteristics as traditional "interpretable" statistical methods, such as linear regression analysis, but also to reveal important additional information about the non-linear character of QSAR/QSPR relationships. The approach is illustrated by interpreting a backpropagation neural network model that predicts the position of the long-wave absorption band of cyanine dyes.
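The derivative statistics described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `derivative_moments` and `toy_model` are hypothetical names, the derivatives are approximated by central finite differences rather than computed analytically, and a simple closed-form function stands in for a trained network.

```python
from statistics import fmean, pstdev


def derivative_moments(model, points, h=1e-3):
    """Mean and standard deviation of the first and second partial
    derivatives of `model` with respect to each input, over `points`.

    Derivatives are estimated by central finite differences (an
    assumption of this sketch; a real network would usually provide
    analytic gradients).
    """
    n_inputs = len(points[0])
    summary = []
    for j in range(n_inputs):
        first, second = [], []
        for x in points:
            up = list(x)
            up[j] += h
            down = list(x)
            down[j] -= h
            f_up, f_0, f_down = model(up), model(list(x)), model(down)
            first.append((f_up - f_down) / (2 * h))
            second.append((f_up - 2 * f_0 + f_down) / h ** 2)
        summary.append({
            "first_mean": fmean(first),    # analogous to a regression coefficient
            "first_std": pstdev(first),    # spread signals non-linearity
            "second_mean": fmean(second),  # average curvature
            "second_std": pstdev(second),
        })
    return summary


def toy_model(x):
    """Hypothetical stand-in for a trained network: linear in x[0], quadratic in x[1]."""
    return 2.0 * x[0] + x[1] ** 2


# A small grid of "data points" on [-0.5, 0.5] x [-0.5, 0.5].
points = [(i / 10, j / 10) for i in range(-5, 6) for j in range(-5, 6)]
moments = derivative_moments(toy_model, points)
for j, m in enumerate(moments):
    print(j, {k: round(v, 4) for k, v in m.items()})
```

For the linear input the first derivative is constant, so its standard deviation is close to zero and its mean plays the role of a regression coefficient. For the quadratic input the first derivative varies across the data and the mean second derivative is non-zero, which is the kind of non-linearity signal the abstract refers to.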

2.

An approach to the interpretation of backpropagation neural network models in QSAR studies

Baskin II Ait AO Halberstam NM Palyulin VA Zefirov NS 《SAR and QSAR in environmental research》2002,13(1):35-41

An approach to the interpretation of backpropagation neural network models for quantitative structure-activity and structure-property relationship (QSAR/QSPR) studies is proposed. The method is based on analyzing the first and second moments of the distribution of the first and second partial derivatives of the neural network outputs with respect to the inputs, calculated at the data points. These statistics make it possible not only to obtain essentially the same characteristics as traditional "interpretable" statistical methods, such as linear regression analysis, but also to reveal important additional information about the non-linear character of QSAR/QSPR relationships. The approach is illustrated by interpreting a backpropagation neural network model that predicts the position of the long-wave absorption band of cyanine dyes.

3.

On some aspects of validation of predictive QSAR models

K Roy PP Roy JT Leonard 《Chemistry Central journal》2008,2(Z1):P9


4.

QSAR and mechanistic interpretation of estrogen receptor binding

Serafimova R Todorov M Nedelcheva D Pavlov T Akahori Y Nakai M Mekenyan O 《SAR and QSAR in environmental research》2007,18(3-4):389-421


5.

e-Statistics for deriving QSAR models

J. Devillers J.C. Doré 《SAR and QSAR in environmental research》2013,24(3-4):409-416

This paper presents some freeware, shareware, and commercial statistical tools that are available via the Internet and could be used in QSAR for deriving models. Programming environments useful in statistics, newsgroups, and FAQs are also introduced because of their interest for the discipline.

6.

e-statistics for deriving QSAR models

Devillers J Doré JC 《SAR and QSAR in environmental research》2002,13(3-4):409-416

This paper presents some freeware, shareware, and commercial statistical tools that are available via the Internet and could be used in QSAR for deriving models. Programming environments useful in statistics, newsgroups, and FAQs are also introduced because of their interest for the discipline.

7.

Three-dimensional QSAR using the k-nearest neighbor method and its interpretation

Ajmani S Jadhav K Kulkarni SA 《Journal of chemical information and modeling》2006,46(1):24-31

In this paper we report a novel three-dimensional QSAR approach, kNN-MFA, developed based on the principles of the k-nearest neighbor method combined with various variable selection procedures. The kNN-MFA approach was used to generate models for three different data sets and to predict the activity of test molecules with each of these models. The three data sets used were the standard steroid benchmark, an anti-inflammatory data set, and an anticancer data set. The study resulted in kNN-MFA models with better statistical parameters than the reported CoMFA models for all three data sets. It was also found that stochastic methods generate better models, resulting in more accurate predictions than stepwise forward selection procedures. Thus, the kNN-MFA method represents a good alternative to CoMFA-like methods.
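The k-nearest neighbor principle that kNN-MFA builds on can be sketched in a few lines. This is a generic kNN regressor under stated assumptions, not the kNN-MFA method itself: the descriptor values and activities are made up for illustration, `knn_predict` is a hypothetical name, and no variable selection or molecular field calculation is included.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict an activity as the mean activity of the k training
    compounds closest to `query` in descriptor space (Euclidean distance)."""
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Illustrative descriptors (two per compound) and activities.
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = [1.0, 2.0, 3.0, 10.0, 12.0]

low_cluster = knn_predict(train_X, train_y, (0.2, 0.2), k=3)
high_cluster = knn_predict(train_X, train_y, (5.5, 5.0), k=2)
print(low_cluster, high_cluster)
```

In a variable-selection setting such as the one the abstract describes, the distance would be computed over a chosen subset of descriptors, and that subset would be searched for by a stepwise or stochastic procedure.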

8.

On further application of r²_m as a metric for validation of QSAR models

Indrani Mitra Partha Pratim Roy Supratik Kar Probir Kumar Ojha Kunal Roy 《Journal of Chemometrics》2010,24(1):22-33

Validation is a crucial aspect of quantitative structure–activity relationship (QSAR) model development. External validation is generally considered the most conclusive proof of the predictive capacity of a QSAR model. In the absence of a truly external data set, external validation is usually performed on test set compounds, which are members of the original data set but are not used in model development. In the case of small data sets, QSAR researchers experience problems in model development because the developed models may be less reliable on account of the small number of training set compounds, and such models may also show poor external predictability because they may not have captured all the features required for the particular structure–activity relationships. The present paper attempts to show that the 'true r²_m(LOO)' statistic, calculated from the model derived from the undivided data set with the variable selection strategy applied at each cycle of leave‐one‐out (LOO) validation, may reflect the external validation characteristics of the developed model, thus obviating the need to split the data set into training and test sets. This approach may be helpful in the case of small data sets, as it uses all available data for model development and validation, thus making the resulting model more reliable. Copyright © 2009 John Wiley & Sons, Ltd.
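The idea of repeating variable selection inside every leave-one-out cycle can be sketched as below. This is a simplified illustration under several assumptions: models are restricted to a single descriptor chosen by lowest training error, the reported statistic is the ordinary cross-validated q² (1 − PRESS / total sum of squares) rather than the paper's own metric, and the function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one descriptor.
    Assumes the descriptor is not constant (sxx > 0)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def select_and_fit(rows, ys):
    """Pick the single descriptor column with the lowest training
    sum of squared errors and return (column, slope, intercept)."""
    best = None
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        slope, intercept = fit_line(col, ys)
        err = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(col, ys))
        if best is None or err < best[0]:
            best = (err, j, slope, intercept)
    return best[1:]


def loo_q2_with_selection(rows, ys):
    """Leave-one-out q2 where descriptor selection is redone in every
    cycle, so the left-out compound never influences the selection."""
    press = 0.0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_ys = ys[:i] + ys[i + 1:]
        j, slope, intercept = select_and_fit(train_rows, train_ys)
        press += (ys[i] - (slope * rows[i][j] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Column 0 determines the activity exactly; column 1 is unrelated.
rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0], [5.0, 7.0], [6.0, 2.0]]
ys = [3.0 * r[0] + 1.0 for r in rows]
q2 = loo_q2_with_selection(rows, ys)
print(round(q2, 6))
```

Selecting descriptors once on the full data set and only refitting coefficients in each cycle would let the left-out compound leak into the model, which is the optimistic bias that per-cycle selection is meant to avoid.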

9.

Robust cross-validation of linear regression QSAR models (total citations: 1; self-citations: 0; citations by others: 1)

Konovalov DA Llewellyn LE Vander Heyden Y Coomans D 《Journal of chemical information and modeling》2008,48(10):2081-2094

A quantitative structure-activity relationship (QSAR) model is typically developed to predict the biochemical activity of untested compounds from the compounds' molecular structures. "The gold standard" of model validation is the blindfold prediction, in which the model's predictive power is assessed from how well the model predicts the activity values of compounds that were not considered in any way during the model development/calibration. However, during the development of a QSAR model, it is necessary to obtain some indication of the model's predictive power. This is often done by some form of cross-validation (CV). In this study, the concepts of the predictive power and fitting ability of a multiple linear regression (MLR) QSAR model were examined in the CV context, allowing for the presence of outliers. Commonly used predictive power and fitting ability statistics were assessed via Monte Carlo cross-validation when applied to percent human intestinal absorption, blood-brain partition coefficient, and toxicity values of saxitoxin QSAR data sets, as well as three known benchmark data sets with known outlier contamination. It was found that (1) a robust version of MLR should always be preferred over the ordinary-least-squares MLR, regardless of the degree of outlier contamination, and that (2) the model's predictive power should only be assessed via robust statistics. The Matlab and Java source code used in this study is freely available from the QSAR-BENCH section of www.dmitrykonovalov.org for academic use. The Web site also contains the Java-based QSAR-BENCH program, which could be run online via Java's Web Start technology (supporting Windows, Mac OSX, Linux/Unix) to reproduce most of the reported results or apply the reported procedures to other data sets.
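A rough sketch of Monte Carlo cross-validation with a robust error statistic is given below. It is not the code distributed by the authors: the Theil-Sen estimator is swapped in here as one well-known robust regression for a single descriptor (the abstract does not say which robust MLR variant was used), the median absolute error stands in for "robust statistics", and the data, split count, and function names are all assumptions.

```python
import random
from statistics import median


def ols_fit(xs, ys):
    """Ordinary least squares for a single descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def theil_sen_fit(xs, ys):
    """Theil-Sen estimator: median of pairwise slopes, then the
    median residual intercept. Tolerates a minority of outliers."""
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i in range(len(xs))
        for j in range(i + 1, len(xs))
        if xs[j] != xs[i]
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


def monte_carlo_cv(xs, ys, fit, n_splits=200, test_fraction=0.3, seed=0):
    """Repeated random train/test splits; returns the median absolute
    prediction error pooled over all test compounds."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(1, int(len(xs) * test_fraction))
    abs_errors = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        slope, intercept = fit([xs[i] for i in train], [ys[i] for i in train])
        abs_errors.extend(abs(ys[i] - (slope * xs[i] + intercept)) for i in test)
    return median(abs_errors)


# Synthetic data: an exact line contaminated by two gross outliers.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
ys[3] += 40.0
ys[15] += 40.0

ols_error = monte_carlo_cv(xs, ys, ols_fit)
robust_error = monte_carlo_cv(xs, ys, theil_sen_fit)
print(f"OLS median abs error:       {ols_error:.3f}")
print(f"Theil-Sen median abs error: {robust_error:.3f}")
```

On this contaminated toy data the robust fit recovers the underlying line, so its pooled median error is essentially zero, while the least-squares fit is pulled toward the outliers in most splits.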

10.

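QSAR models for the enzyme-inhibitory activity of triazine-oxadiazolyl pyrazole derivatives

Yue Wei, He Hongmei, Feng Changjun 《化学通报》(Huaxue Tongbao) 2018,81(7):636-640

Based on topological chemistry theory, atom-type electrotopological state indices (Mk) were used to characterize the chemical microenvironment of 18 triazine-oxadiazolyl pyrazole derivatives. Using best-subset variable regression, quantitative structure-activity relationship (QSAR) models were built relating Mk to the inhibitory activities of these compounds against protein tyrosine phosphatase 1B (PTP1B) and cell division cycle 25 phosphatase B (Cdc25B), denoted P_t and C_d respectively. The coefficients of determination (R²) of the best three-variable QSAR models were 0.896 and 0.828, respectively, and the leave-one-out cross-validated correlation coefficients (R²_cv) were 0.830 and 0.688. According to the R²_cv, VIF, FT, and AC tests, the models have good robustness and predictive ability. Validation with the training set indicated that both models have good external predictive ability. The models show that the factors affecting P_t and C_d include both different structural groups (-CH₃, -O-, -NH₂, and aromatic -N=) and a shared factor (aromatic -C=).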

11.

Benchmarking of QSAR models for blood-brain barrier permeation

Konovalov DA Coomans D Deconinck E Heyden YV 《Journal of chemical information and modeling》2007,47(4):1648-1656


12.

Filter feature selectors in the development of binary QSAR models

G. Cerruela García J. Pérez-Parras Toledano A. de Haro García N. García-Pedrajas 《SAR and QSAR in environmental research》2019,30(5):313-345

The application of machine learning methods to the construction of quantitative structure–activity relationship models is a complex computational problem in which dimensionality reduction of the representation of the molecular structure plays a fundamental role in predicting a target activity. The feature selection pre-processing approach has been indicated to be effective in dimensionality reduction for building simpler and more understandable models. In this paper, a performance comparative study of 13 state-of-the-art feature selection filter methods is conducted. Structure–activity relationship models are constructed using three widely used classifiers and a diverse collection of datasets. The comparative study utilizes robust statistical tests to compare the algorithms. According to the experimental results, there are substantial differences in performance among the evaluated feature selection methods. The methods that exhibit the best performance are correlation-based feature selection, fast clustering-based feature selection and the set cover method.
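To make the notion of a filter method concrete, here is a minimal sketch of a univariate filter that ranks descriptors by the absolute Pearson correlation with a binary activity label, independently of any classifier. It is deliberately much simpler than the methods compared in the paper (it is not correlation-based feature selection, which scores whole subsets), and the data and function names are hypothetical.

```python
from statistics import fmean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def rank_features(rows, labels, top=2):
    """Return the indices of the `top` descriptors with the highest
    absolute correlation to the labels (ties broken by column index)."""
    scores = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        scores.append((abs(pearson(col, labels)), j))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:top]]


# Three illustrative descriptors for six compounds:
# column 0 tracks the label, column 1 is constant, column 2 is nearly unrelated.
rows = [
    [0.1, 1.0, 0.5],
    [0.2, 1.0, 0.1],
    [0.3, 1.0, 0.9],
    [0.8, 1.0, 0.4],
    [0.9, 1.0, 0.2],
    [1.0, 1.0, 0.8],
]
labels = [0, 0, 0, 1, 1, 1]  # binary activity classes
selected = rank_features(rows, labels, top=2)
print(selected)
```

Because the scores are computed before any model is trained, the same ranking can be reused with different classifiers, which is what distinguishes filter methods from wrapper approaches.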

13.

The quality of QSAR models: problems and solutions (total citations: 1; self-citations: 0; citations by others: 1)

Kolossov E Stanforth R 《SAR and QSAR in environmental research》2007,18(1-2):89-100

Assessment of the quality of goodness-of-fit and the confidence in predictivity (prediction power) are the main terms used to define the statistical quality of QSAR models. Three parts of this assessment can be defined: (1) measure of goodness-of-fit; (2) validation of model stability; (3) predictivity analysis. Currently there are no mandatory requirements for the validation methods to be used or rules for the quantitative confidence estimates. To compare the statistical quality of QSAR models, it is necessary to have an overall statistical quality index that depends on the goodness-of-fit, validation, and predictivity results together. To do so, it is necessary to define the set of mandatory parameters for all three parts of the assessment listed above and to develop an approach for overall quality estimates based on these parameters. It is also necessary to include in the overall index a penalty mechanism for parameter absence. The goal of the present study is to analyse parameters for all three parts of the QSAR model statistical quality assessment and to investigate a flexible weighting approach for developing the overall statistical quality index. Because different statistical parameters are traditionally used for the assessment of goodness-of-fit, it is necessary to create a mechanism that allows a flexible set of parameters to be used for the overall statistical quality index. Only after approval by the scientific community and regulatory boards can the final set of mandatory parameters be selected.

14.

QSAR models using a large diverse set of estrogens (total citations: 12; self-citations: 0; citations by others: 12)

Shi LM Fang H Tong W Wu J Perkins R Blair RM Branham WS Dial SL Moland CL Sheehan DM 《Journal of chemical information and computer sciences》2001,41(1):186-195

Endocrine disruptors (EDs) have a variety of adverse effects in humans and animals. About 58,000 chemicals, most having little safety data, must be tested in a group of tiered assays. As assays will take years, it is important to develop rapid methods to help in priority setting. For application to large data sets, we have developed an integrated system that contains four sequential phases to predict the ability of chemicals to bind to the estrogen receptor (ER), a prevalent mechanism for estrogenic EDs. Here we report the results of evaluating two types of QSAR models for inclusion in phase III to quantitatively predict chemical binding to the ER. Our data set for the relative binding affinities (RBAs) to the ER consists of 130 chemicals covering a wide range of structural diversity and a 6 orders of magnitude spread of RBAs. CoMFA and HQSAR models were constructed and compared for performance. The CoMFA model had r² = 0.91 and q²_LOO = 0.66. HQSAR showed reduced performance compared to CoMFA, with r² = 0.76 and q²_LOO = 0.59. A number of parameters were examined to improve the CoMFA model. Of these, a phenol indicator increased the q²_LOO to 0.71. When up to 50% of the chemicals were left out in the leave-N-out cross-validation, the q² remained significant. Finally, the models were tested by using two test sets; the q²_pred values for these were 0.71 and 0.62, a significant result which demonstrates the utility of the CoMFA model for predicting the RBAs of chemicals not included in the training set. If used in conjunction with phases I and II, which reduced the size of the data set dramatically by eliminating most inactive chemicals, the current CoMFA model (phase III) can be used to predict the RBA of chemicals with sufficient accuracy and to provide quantitative information for priority setting.

15.

Global QSAR models of skin sensitisers for regulatory purposes

Qasim Chaudhry Nadège Piclin Jane Cotterill Marco Pintore Nick R Price Jacques R Chrétien Alessandra Roncaglioni 《Chemistry Central journal》2010,4(Z1):S5

Background

The new European Regulation on chemical safety, REACH (Registration, Evaluation, Authorisation and Restriction of CHemical substances), is in the process of being implemented. Many chemicals used in industry require additional testing to comply with the REACH regulations. At the same time, EU member states are attempting to reduce the number of animals used in experiments under the 3 Rs policy (refining, reducing, and replacing the use of animals in laboratory procedures). Computational techniques such as QSAR have the potential to offer an alternative for generating REACH data. The FP6 project CAESAR was aimed at developing QSAR models for 5 key toxicological endpoints, of which skin sensitisation was one.

Results

This paper reports the development of two global QSAR models using two different computational approaches, which contribute to the hybrid model freely available online.

Conclusions

The QSAR models for assessing skin sensitisation have been developed and tested under stringent quality criteria to fulfil the principles laid down by the OECD. The final models, accessible from the CAESAR website, offer a robust and reliable method of assessing skin sensitisation for regulatory use.


16.

Could deep learning in neural networks improve the QSAR models?

G. Gini F. Zanoli A. Gamba G. Raitano E. Benfenati 《SAR and QSAR in environmental research》2019,30(9):617-642


17.

Comparison of different approaches to define the applicability domain of QSAR models

Sahigara F Mansouri K Ballabio D Mauri A Consonni V Todeschini R 《Molecules (Basel, Switzerland)》2012,17(5):4791-4810


18.

QSAR classification models for the screening of the endocrine-disrupting activity of perfluorinated compounds

Kovarich S Papa E Li J Gramatica P 《SAR and QSAR in environmental research》2012,23(3-4):207-220

Perfluorinated compounds (PFCs) are a class of emerging pollutants still widely used in different materials as non-adhesives, waterproof fabrics, fire-fighting foams, etc. Their toxic effects include potential for endocrine-disrupting activity, but the amount of experimental data available for these pollutants is limited. The use of predictive strategies such as quantitative structure-activity relationships (QSARs) is recommended under the REACH regulation, to fill data gaps and to screen and prioritize chemicals for further experimentation, with a consequent reduction of costs and number of tested animals. In this study, local classification models for PFCs were developed to predict their T4-TTR (thyroxin-transthyretin) competing potency. The best models were selected by maximizing the sensitivity and external predictive ability. These models, characterized by robustness, good predictive power and a defined applicability domain, were applied to predict the activity of 33 other PFCs of environmental concern. Finally, classification models recently published by our research group for T4-TTR binding of brominated flame retardants and for estrogenic and anti-androgenic activity were applied to the studied perfluorinated chemicals to compare results and to further evaluate the potential for these PFCs to cause endocrine disruption.

19.

Infrared spectra as chemical descriptors for QSAR models.

R Benigni A Giuliani L Passerini 《Journal of chemical information and computer sciences》2001,41(3):727-730


20.

QSAR models for predicting the toxicity of piperidine derivatives against Aedes aegypti

J. P. Doucet E. Papa A. Doucet-Panaye J. Devillers 《SAR and QSAR in environmental research》2017,28(6):451-470
