Similar articles
 20 similar articles found
1.
QSAR models have been under development for decades, but acceptance and use of model results have been slow, in part because there is no widely accepted metric for assessing their reliability. We reapply a method commonly used in quantitative epidemiology and medical decision-making for evaluating the results of screening tests to assess the reliability of a QSAR model. It quantifies the accuracy (expressed as sensitivity and specificity) of QSAR models as conditional probabilities of correct and incorrect classification of a chemical characteristic, given the true characteristic. Using Bayes' formula, these conditional probabilities are combined with prior information to generate a posterior distribution that gives the probability that a specific chemical has a particular characteristic, given a model prediction. As an example, we apply this approach to evaluate the predictive reliability of a CATABOL model and its "ready" versus "not ready" biodegradability classification. Finally, we show how the predictive capability of the model can be improved by sequential use of two models, the first with high sensitivity and the second with high specificity.
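The Bayes step described in this abstract can be illustrated with a minimal sketch. It works with point probabilities rather than the full posterior distribution the authors mention, and it assumes the two models err independently when used in sequence. The function names and all numeric values (prior, sensitivities, specificities) are hypothetical and are not taken from the paper.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """P(chemical has the trait | model predicts the trait), by Bayes' formula."""
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


def posterior_given_negative(prior, sensitivity, specificity):
    """P(chemical has the trait | model predicts no trait)."""
    false_neg = (1.0 - sensitivity) * prior
    true_neg = specificity * (1.0 - prior)
    return false_neg / (false_neg + true_neg)


# Hypothetical numbers: 30% of chemicals have the trait a priori.
prior = 0.30

# Model A: high sensitivity, modest specificity (assumed values).
after_a = posterior_given_positive(prior, sensitivity=0.95, specificity=0.60)

# Model B: high specificity (assumed values). The posterior from model A
# serves as the prior for model B, assuming independent errors.
after_b = posterior_given_positive(after_a, sensitivity=0.70, specificity=0.95)

print(f"after model A: {after_a:.3f}, after model B: {after_b:.3f}")
```

Feeding the first posterior in as the second prior is what makes the sequential scheme work: each positive prediction raises the probability further, provided the independence assumption is reasonable.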

2.
Quantitative Structure–Activity Relationship (QSAR) models are used increasingly to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. These developments emphasize the importance of rigorous model validation to ensure that the models have acceptable predictive power. Using the k nearest neighbors (kNN) variable selection QSAR method for the analysis of several datasets, we have demonstrated recently that the widely accepted leave-one-out (LOO) cross-validated R² (q²) is an inadequate characteristic to assess the predictive ability of the models [Golbraikh, A., Tropsha, A. Beware of q2! J. Mol. Graphics Mod. 20, 269-276, (2002)]. Herein, we provide additional evidence that there exists no correlation between the values of q² for the training set and the accuracy of prediction (R²) for the test set, and argue that this observation is a general property of any QSAR model developed with LOO cross-validation. We suggest that external validation using rationally selected training and test sets provides a means to establish a reliable QSAR model. We propose several approaches to the division of experimental datasets into training and test sets and apply them in QSAR studies of 48 functionalized amino acid anticonvulsants and a series of 157 epipodophyllotoxin derivatives with antitumor activity. We formulate a set of general criteria for the evaluation of the predictive power of QSAR models.
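To make the two quantities compared in this abstract concrete, here is a small sketch that computes a leave-one-out q² on a training set and an R² on a held-out test set. It swaps the authors' kNN method for a one-descriptor least-squares line purely for brevity, and it uses one common definition of each statistic (1 minus the ratio of squared prediction error to total variance); the paper's exact definitions may differ. All names are illustrative.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    my = mean(ys)
    ss_total = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_total


def external_r2(train_x, train_y, test_x, test_y):
    """R2 of a model fitted on the training set, evaluated on the test set."""
    slope, intercept = fit_line(train_x, train_y)
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(test_x, test_y))
    my = mean(test_y)
    ss_total = sum((y - my) ** 2 for y in test_y)
    return 1.0 - ss_res / ss_total
```

Computing both numbers for the same model is the comparison the abstract argues for: a high `loo_q2` on the training set says nothing by itself about `external_r2` on compounds the model has never seen.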

3.
Summary: In this work, the TOMOCOMD-CARDD approach has been applied to estimate anthelmintic activity. Total and local (both atom and atom-type) quadratic indices and linear discriminant analysis were used to obtain a quantitative model that discriminates between anthelmintic and non-anthelmintic drug-like compounds. The obtained model correctly classified 90.37% of compounds in the training set. External validation was carried out to assess the robustness and predictive power of the obtained model. The QSAR model correctly classified 88.18% of compounds in this external prediction set. A second model was developed to draw some conclusions about the possible modes of action of anthelmintic drugs. This model correctly classifies 94.52% of compounds in the training set and achieves 80.00% good global classification in the external prediction set. The developed model was then used in virtual in silico screening, and several compounds from the Merck Index, Negwer's handbook and Goodman and Gilman were identified by the models as anthelmintic. Finally, the experimental assay of one organic chemical (G-1) by an in vivo test coincides fairly well (100) with model predictions. These results suggest that the proposed method will be a good tool for studying the biological properties of drug candidates during the early stage of the drug-development process.

4.
ABSTRACT

The aryl hydrocarbon receptor (AhR) plays an important role in several biological processes such as reproduction, immunity and homoeostasis. However, little is known about the chemical-structural and physicochemical features that influence the activity of AhR antagonistic modulators. In the present report, in vitro AhR antagonistic activity evaluations, based on a chemical-activated luciferase gene expression (AhR-CALUX) bioassay, and an extensive literature review were performed with the aim of constructing a structurally diverse database of contaminants and potentially toxic chemicals. Subsequently, QSAR models based on Linear Discriminant Analysis and Logistic Regression, as well as two toxicophoric hypotheses, were proposed to model the AhR antagonistic activity of the built dataset. The QSAR models were rigorously validated, yielding satisfactory performance for all classification parameters. Likewise, the toxicophoric hypotheses were validated using a diverse set of 350 decoys, demonstrating adequate robustness and predictive power. Chemical interpretations of both the QSAR and toxicophoric models suggested that hydrophobic constraints, the presence of aromatic rings and electron-acceptor moieties are critical for AhR antagonism. It is therefore hoped that the deductions obtained in the present study will help to further elucidate the structural and physicochemical factors influencing the AhR antagonistic activity of chemical compounds.

5.
6.
7.
8.
ABSTRACT

A method for combining statistical-based QSAR predictions of two or more binary classification models is presented. It was assumed that all models were independent. This facilitated the combination of positive and negative predictions using a quantitative weight of evidence (qWoE) procedure based on Bayesian statistics and the additivity of the logarithms of the likelihood ratios. Previous studies combined more than one prediction but used arbitrary strengths for positive and negative predictions. In our approach, the combined models were validated by determining the sensitivity and specificity values, which are performance metrics that are a point of departure for obtaining values that measure the weight of evidence of positive and negative predictions. The developed method was experimentally applied in the prediction of Ames mutagenicity. The method achieved a similar accuracy to that of the experimental Ames test for this endpoint when the overall prediction was determined using a combination of the individual predictions of more than one model. Calculating the qWoE value would reduce the requirement for expert knowledge and decrease the subjectivity of the prediction. This method could be applied to other endpoints such as developmental toxicity and skin sensitisation with binary classification models.
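The additivity of log likelihood ratios that this abstract relies on can be sketched as follows. This is the generic Bayesian combination for independent binary classifiers, not the authors' exact qWoE scale, which the abstract does not spell out. The function names and the tuple layout of `predictions` are assumptions made for illustration.

```python
import math


def log_likelihood_ratio(predicted_positive, sensitivity, specificity):
    """Log likelihood ratio contributed by one model's binary prediction."""
    if predicted_positive:
        # LR+ = P(positive call | active) / P(positive call | inactive)
        return math.log(sensitivity / (1.0 - specificity))
    # LR- = P(negative call | active) / P(negative call | inactive)
    return math.log((1.0 - sensitivity) / specificity)


def combined_probability(prior, predictions):
    """Combine independent model predictions into a posterior probability.

    predictions: iterable of (predicted_positive, sensitivity, specificity).
    """
    log_odds = math.log(prior / (1.0 - prior))
    for predicted_positive, sensitivity, specificity in predictions:
        # Independence lets the log likelihood ratios simply add up.
        log_odds += log_likelihood_ratio(predicted_positive,
                                         sensitivity, specificity)
    return 1.0 / (1.0 + math.exp(-log_odds))
```

With this formulation, a positive call from a highly specific model carries more weight than one from a less specific model, which replaces the arbitrary strengths the abstract criticises in earlier work.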

9.
Abstract

Computational chemistry provides a means for the calculation or estimation of three-dimensional chemical structure, organization and analysis of chemical data, classification of industrial chemicals by structure and properties, prediction of toxicity, and identification of chemical structure. The development of the EPA National Environmental Supercomputer Center (NESC) in Bay City, Michigan, gives scientists at EPA Headquarters the ability to perform advanced QSAR modeling. This provides the means to develop and apply QSAR models for chemicals acting by a variety of molecular mechanisms. The work makes possible improved programmatic support to the Office of Pollution Prevention and Toxics under the Toxic Substances Control Act and the Pollution Prevention Act.

10.
The transport activity of a membrane protein, bilitranslocase (T.C. # 2.A.65.1.1), which acts as a transporter of bilirubin from blood to liver cells, was experimentally determined for a large set of various endogenous compounds, drugs, and purine and pyrimidine derivatives. On these grounds, structure-activity models were developed following the OECD principles for QSAR models, and their predictive ability for new chemicals was evaluated. The applicability domain of the models was estimated by Euclidean distance criteria according to the applied modeling method. The selection of the most influential structural variables was an important stage in the adopted modeling methodology. The selected variables were interpreted in order to gain insight into the mechanism of transport through the cell membrane via bilitranslocase. The optimized models were validated with a previously determined validation set. The classification model was built to separate active from inactive compounds. The resulting accuracy, sensitivity, and specificity were 0.73, 0.89, and 0.64, respectively. Only active compounds were used to develop a predictive model for bilitranslocase inhibition constants. The model showed good predictive ability: the root mean squared error of the validation set was RMS(V) = 0.29 log units.
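The abstract does not state which Euclidean distance criterion was used, so the following is only one plausible sketch of an applicability-domain check: a query compound is considered inside the domain if its distance to the centroid of the training descriptors does not exceed the mean training distance plus `k` standard deviations. The cutoff rule, the default `k = 3.0`, and the function names are all assumptions.

```python
import math
from statistics import mean, pstdev


def centroid(rows):
    """Column-wise mean of a list of descriptor vectors."""
    return [mean(column) for column in zip(*rows)]


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_domain_check(train_rows, k=3.0):
    """Return a function that tests whether a compound lies in the domain.

    The cutoff (mean + k * std of training distances to the centroid)
    is an assumed rule, not the one used in the paper.
    """
    center = centroid(train_rows)
    distances = [euclidean(row, center) for row in train_rows]
    limit = mean(distances) + k * pstdev(distances)

    def in_domain(row):
        return euclidean(row, center) <= limit

    return in_domain
```

In practice the descriptors would usually be scaled before distances are computed, so that no single variable dominates; that step is omitted here to keep the sketch short.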

11.
12.
13.
14.
15.
Abstract

In aquatic toxicology, QSAR models are generally designed for chemicals presenting the same mode of toxic action. Their proper use provides good simulation results. Problems arise when the mechanism of toxicity of a chemical is not clearly identified. In that case, the inappropriate application of a specific QSAR model can lead to a dramatic error in the toxicity estimation. With the advent of powerful computers and easy access to them, and the introduction of soft modeling and artificial intelligence in SAR and QSAR, radically different models, designed from large non-congeneric sets of chemicals, have been proposed. Some of these new QSAR models are reviewed and their originality, advantages, and limitations are stressed.

16.

A novel mechanistic modeling approach has been developed that assesses chemical biodegradability in a quantitative manner. It is an expert system that predicts the biotransformation pathway, working together with a probabilistic model that calculates the probabilities of the individual transformations. The expert system contains a library of hierarchically ordered individual transformations and a matching substructure engine. The hierarchy in the expert system was set according to the descending order of the individual transformation probabilities. The integrated principal catabolic steps are derived from the set of metabolic pathways predicted for each chemical in the training set and encompass more than one real biodegradation step to improve the speed of predictions. In the current work, we modeled O₂ yield during the OECD 302 C (MITI I) test. The MITI-I database of 532 chemicals was used as a training set. To make biodegradability predictions, the model needs only the structure of a chemical. The output is given as a percentage of theoretical biological oxygen demand (BOD). The model allows for identifying potentially persistent catabolic intermediates and their molar amounts. The data in the training set agreed well with the calculated BODs (r² = 0.90) over the entire range, i.e. a good fit was observed for readily, intermediate and difficult to degrade chemicals. After introducing 60% ThOD as a cut-off value, the model correctly predicted 98% of the ready biodegradable structures and 96% of the not ready biodegradable structures. Cross-validation, leaving out 25% of the data four times, resulted in Q² = 0.88 between observed and predicted values. The presented approach and the results obtained were used to develop CATABOL, computer software for biodegradability prediction.

17.
18.
19.
Abstract

The ability to determine the biodegradability of chemicals without resorting to expensive tests is ecologically and economically desirable. Models based on quantitative structure–activity relations (QSAR) provide some promise in this direction. However, QSAR models in the literature rarely provide uncertainty estimates in more detail than aggregated statistics such as the sensitivity and specificity of the model’s predictions. Almost never is there a means of assessing the uncertainty in an individual prediction. Without an uncertainty estimate, it is impossible to assess the trustworthiness of any particular prediction, which leaves the model with a low utility for regulatory purposes. In the present work, a QSAR model with uncertainty estimates is used to predict biodegradability for a set of substances from a publicly available data set. Separation was performed using a partial least squares discriminant analysis model, and the uncertainty was estimated using bootstrapping. The uncertainty prediction allows for confidence intervals to be assigned to any of the model’s predictions, allowing for a more complete assessment of the model than would be possible through a traditional statistical analysis. The results presented here are broadly applicable to other areas of modelling as well, because the calculation of the uncertainty will clearly demonstrate where additional tests are needed.
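Bootstrapping a confidence interval for a single prediction, as this abstract describes, can be sketched in a few lines. To stay self-contained, the example replaces the paper's partial least squares discriminant analysis with a one-descriptor least-squares line, so it illustrates only the resampling idea. The percentile interval, the function names, and the defaults (`n_boot`, `alpha`, `seed`) are assumptions.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def bootstrap_interval(xs, ys, query_x, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the prediction at query_x.

    Assumes xs contains at least two distinct values.
    """
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    while len(predictions) < n_boot:
        # Resample the training compounds with replacement.
        picks = [rng.randrange(n) for _ in range(n)]
        boot_x = [xs[i] for i in picks]
        boot_y = [ys[i] for i in picks]
        if len(set(boot_x)) < 2:
            continue  # degenerate resample: a line cannot be fitted
        slope, intercept = fit_line(boot_x, boot_y)
        predictions.append(slope * query_x + intercept)
    predictions.sort()
    lower = predictions[int((alpha / 2) * n_boot)]
    upper = predictions[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

A wide interval for a given compound signals that the training data constrain that prediction poorly, which is the per-prediction uncertainty the abstract says aggregated statistics cannot provide.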

20.

Derivation of quantitative structure-activity relationships (QSAR) usually involves computational models that relate a set of input variables, describing the structural properties of the molecules for which the activity has been measured, to the output variable representing activity. Many of the input variables may be correlated, and it is therefore often desirable to select an optimal subset of the input variables that results in the most predictive model. In this paper we describe an optimization technique for variable selection based on artificial ant colony systems. The algorithm is inspired by the behavior of real ants, which are able to find the shortest path between a food source and their nest using deposits of pheromone as a communication agent. The underlying basic self-organizing principle is exploited for the construction of parsimonious QSAR models based on neural networks for several classical QSAR data sets.
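The pheromone mechanism described in this abstract can be sketched generically. Each artificial ant picks a subset of variables with probability proportional to the pheromone on each variable; pheromone then evaporates and is reinforced in proportion to the quality of the subsets that used each variable. This is a simplified illustration, not the authors' algorithm: the update rule, the parameter defaults, and the caller-supplied `fitness` function (which in the paper would involve training a neural network) are all assumptions.

```python
import random


def ant_colony_select(n_vars, subset_size, fitness,
                      n_ants=10, n_iter=30, evaporation=0.2, seed=0):
    """Search for a variable subset that maximises fitness(subset)."""
    rng = random.Random(seed)
    pheromone = [1.0] * n_vars
    best_subset, best_score = None, float("-inf")

    for _iteration in range(n_iter):
        trails = []
        for _ant in range(n_ants):
            candidates = list(range(n_vars))
            subset = []
            for _step in range(subset_size):
                # Roulette-wheel choice weighted by current pheromone.
                weights = [pheromone[v] for v in candidates]
                pick = rng.choices(candidates, weights=weights, k=1)[0]
                candidates.remove(pick)
                subset.append(pick)
            score = fitness(subset)
            trails.append((subset, score))
            if score > best_score:
                best_subset, best_score = sorted(subset), score

        # Evaporate, then deposit pheromone in proportion to subset quality.
        pheromone = [(1.0 - evaporation) * p for p in pheromone]
        for subset, score in trails:
            for v in subset:
                pheromone[v] += max(score, 0.0)

    return best_subset, best_score
```

A toy call such as `ant_colony_select(6, 2, lambda s: len(set(s) & {1, 4}))` rewards subsets containing variables 1 and 4; in a real QSAR setting the fitness would be a validation statistic of the model built on the selected descriptors.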
