Similar articles
 20 similar articles found (search time: 62 ms)
1.
The revised general solubility equation (GSE) is compared with four other methods, Huuskonen's artificial neural network (ANN) and three multiple linear regression (MLR) methods, for estimating the aqueous solubility of a test set of 21 pharmaceutically and environmentally interesting compounds. For the selected test sets, the GSE and ANN predictions are more accurate than those of the MLR methods. The GSE has the advantages of being simple and thermodynamically sound. Its only two inputs are the Celsius melting point (MP) and the octanol-water partition coefficient (K(ow)). The GSE uses no fitted parameters and no training data, whereas the other methods use a large number of parameters and require a training set. The GSE is also applied to a test set of 413 organic nonelectrolytes that were studied by Huuskonen. Although the GSE uses only two parameters and no training set, its average absolute error (AAE) is only about 0.1 log units larger than that of the ANN, which requires many parameters and a large training set: 0.54 log units for the GSE versus 0.43 log units for Huuskonen's ANN model. This study provides evidence that the GSE is a convenient and reliable method for predicting the aqueous solubilities of organic compounds.
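As an illustration of how little input the GSE needs, here is a minimal Python sketch. It assumes the commonly cited form of the revised GSE, log S = 0.5 - 0.01 (MP - 25) - log K(ow), with S in mol/L and the melting-point term set to zero for compounds that are liquid at 25 °C. The abstract itself does not spell the equation out, so treat the function name, the form of the equation, and the example inputs as assumptions rather than as the authors' code.

```python
def gse_log_solubility(melting_point_c: float, log_kow: float) -> float:
    """Estimate log10 of molar aqueous solubility from the two GSE inputs.

    Assumed form: log S = 0.5 - 0.01 * (MP - 25) - log Kow.
    """
    # Compounds that melt at or below 25 °C are treated as liquids,
    # so the melting-point term contributes nothing for them.
    melting_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * melting_term - log_kow


# Hypothetical compound: melts at 125 °C, log Kow = 2.0.
print(round(gse_log_solubility(125.0, 2.0), 2))
```

No coefficient in the sketch is fitted to solubility data, which is the point the abstract makes when it contrasts the GSE with the ANN and MLR models.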

2.
3.
Using a training set of 191 drug-like compounds extracted from the AQUASOL database, a quantitative structure-property relationship (QSPR) study was conducted employing a set of simple structural and physicochemical properties to predict aqueous solubility. The resultant regression model comprised five parameters (ClogP, molecular weight, an indicator variable for aliphatic amine groups, number of rotatable bonds and number of aromatic rings) and demonstrated acceptable statistics (r² = 0.87, s = 0.51, F = 243.6, n = 191). The model was applied to two test sets: a set of drug-like compounds (r² = 0.80, s = 0.68, n = 174) and a set of agrochemicals (r² = 0.88, s = 0.65, n = 200). Applying the established general solubility equation (GSE) to the training set and the drug-like test set gave poorer results than the QSPR model of the current study. The agrochemical test set was predicted with equal accuracy by the GSE and the QSPR equation. The results of this study suggest that increasing molecular size, rigidity and lipophilicity decrease solubility, whereas increasing conformational flexibility and the presence of a non-conjugated amine group increase the solubility of drug-like compounds. The proposed structural parameters make physical sense and provide simple guidelines for modifying solubility during lead optimisation.

4.
5.
Molecular weight and electrotopological E-state indices were used with artificial neural networks to estimate the aqueous solubility of a diverse set of 1291 organic compounds. A neural network with 33-4-1 neurons provided highly predictive results, with r(2) = 0.91 and RMS = 0.62. The parameters used included several combinations of E-state indices with similar properties. The calculated results were similar to those published for these data by Huuskonen (2000). However, the current study used only E-state indices, without the need for the additional indices (molecular connectivity, shape, flexibility and indicator indices) that were also considered in the previous study. In addition, the present neural network contained one third as many hidden neurons. Smaller neural networks and the use of one homogeneous set of parameters provide a more robust model for predicting the aqueous solubility of chemical compounds. Limitations of the developed method for the prediction of large compounds are discussed. The developed approach is available online at http://www.lnh.unil.ch/~itetko/logp.
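To make the size comparison concrete, the following Python sketch counts the adjustable weights in a fully connected feed-forward network for the 33-4-1 architecture named here and the 30-12-1 architecture reported by Huuskonen. It assumes one bias per hidden and output neuron, which the abstracts do not state, and the function name is hypothetical.

```python
from typing import Sequence


def count_parameters(layers: Sequence[int], with_bias: bool = True) -> int:
    """Count weights (and optional biases) in a fully connected network."""
    total = 0
    # Each pair of consecutive layers contributes n_in * n_out weights,
    # plus one bias per receiving neuron if biases are assumed.
    for n_in, n_out in zip(layers, layers[1:]):
        total += n_in * n_out
        if with_bias:
            total += n_out
    return total


print(count_parameters([33, 4, 1]))   # 33-4-1 network
print(count_parameters([30, 12, 1]))  # 30-12-1 network
```

Under the bias assumption the smaller network has 141 adjustable parameters against 385 for the larger one, which is consistent with the abstract's argument that the smaller model should be more robust.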

6.
Support vector machines for the estimation of aqueous solubility   Total citations: 2, self-citations: 0, citations by others: 2
Support vector machines (SVMs) are used to estimate the aqueous solubility of organic compounds. An SVM equipped with a Tanimoto similarity kernel estimates solubility with an accuracy comparable to that of other reported methods applied to the same data sets. Complete cross-validation on a diverse data set resulted in a root-mean-squared error of 0.62 and R(2) = 0.88. The input to the machine takes the form of molecular fingerprints. No physical parameters are explicitly involved in the calculations.
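The Tanimoto similarity between two fingerprints is the number of shared "on" bits divided by the number of bits set in either fingerprint. The Python sketch below shows that calculation and builds a kernel (Gram) matrix from it. The fingerprints are represented as sets of bit indices and the example values are invented; neither is taken from the paper.

```python
from typing import AbstractSet, List, Sequence


def tanimoto(a: AbstractSet[int], b: AbstractSet[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def tanimoto_kernel(fps: Sequence[AbstractSet[int]]) -> List[List[float]]:
    """Pairwise Tanimoto similarities, usable as a precomputed kernel matrix."""
    return [[tanimoto(x, y) for y in fps] for x in fps]


# Hypothetical fingerprints for three molecules.
fingerprints = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
for row in tanimoto_kernel(fingerprints):
    print(row)
```

A matrix like this could be passed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVR` with `kernel="precomputed"`. That pairing is a suggestion for experimentation, not a description of the software used in the study.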

7.

8.
9.
This study compares the solubility predictions of the two-parameter general solubility equation (GSE) of Jain and Yalkowsky with those of the 171-parameter Klopman group contribution approach. Melting points and partition coefficients were obtained for each of the compounds in Klopman's test set. Using these two variables, the solubility of each compound was calculated by the GSE and compared to the values predicted by Klopman. Both methods give reasonable solubility predictions. The Klopman predictions produced an average absolute error (AAE) of 0.71 and a root-mean-square error (RMSE) of 0.86, while the GSE had an AAE of 0.64 and an RMSE of 0.92.
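The two error measures quoted above can rank methods differently, because the RMSE squares each residual and so weights large misses more heavily than the AAE does. A minimal Python sketch of both metrics follows; the observed and predicted log-solubility values are invented for illustration and are not data from the study.

```python
import math
from typing import Sequence


def average_absolute_error(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Mean of |predicted - observed| over all compounds."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def root_mean_square_error(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Square root of the mean squared residual."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Hypothetical log-solubility values for four compounds.
observed = [-1.0, -2.0, -3.0, -4.0]
predicted = [-1.5, -2.0, -2.0, -4.5]

print(average_absolute_error(predicted, observed))
print(round(root_mean_square_error(predicted, observed), 3))
```

One possible reading of the reported numbers is that the GSE is closer on a typical compound (lower AAE) but has a few larger outliers (higher RMSE); the abstract does not say this explicitly.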

10.
11.
12.
An accurate and generally applicable method for estimating the aqueous solubilities of a diverse set of 1297 organic compounds, based on multilinear regression and artificial neural network modeling, was developed. Molecular connectivity, shape, and atom-type electrotopological state (E-state) indices were used as structural parameters. The data set was divided into a training set of 884 compounds and a randomly chosen test set of 413 compounds. The structural parameters in a 30-12-1 artificial neural network included 24 atom-type E-state indices and six other topological indices, and for the test set a predictive r² = 0.92 and s = 0.60 were achieved. With the same parameters, the statistics for the multilinear regression were r² = 0.88 and s = 0.71, respectively.

13.
14.
15.
A reliable and generally applicable aqueous solubility estimation method for organic compounds based on a group contribution approach has been developed. Two models have been established based on two different sets of parameters. One has a higher accuracy, while the other has a more general applicability. The predictive potential of these two models has been evaluated through cross-validation experiments. For model I, the mean cross-validated r² and SD for 10 such cross-validation experiments were 0.946 and 0.503 log units, respectively. For model II, they were 0.953 and 0.546 log units, respectively. Applying our models to estimate the water solubility values for the compounds in an independent test set, we found that model I can be applied to 13 out of 21 compounds with an SD of 0.58 log units and model II can be applied to all 21 compounds with an SD of 1.25 log units. Our models compare favorably to all currently available water solubility estimation methods. A program based on this approach has been written in FORTRAN77 and is currently running on a VAX/VMS system. The program can be applied to estimate the water solubility of any organic chemical with good or fairly good accuracy, except for electrolytes. Applying our aqueous solubility estimation models to biodegradation studies, we found that although water solubility was not the sole factor controlling the rate of biodegradation, ring compounds with greater solubilities were more likely to biodegrade at a faster rate. The significance of the relationship between water solubility and biodegradation activity has been illustrated by predicting the biodegradation activity of 27 new chemicals based solely on their estimated solubility values.

16.
17.
18.
19.
20.
Several group contribution methods to estimate the aqueous solubility of organic molecules are proposed and evaluated for their ability to predict the water solubility of new molecules. The learning set consisted of 1168 organic compounds with experimental data taken from the literature after critical evaluation. The best method, based on a new fragment atom scheme, leads to a squared correlation coefficient of 0.95 and an average absolute calculation error of 0.50 log units, which is superior to other group contribution methods currently available. One of the advantages of this model is that it has upper and lower limits, so that the predicted solubilities cannot be unrealistically high or low.
