Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Molecular weight and electrotopological E-state indices were used with artificial neural networks to estimate the aqueous solubility of a diverse set of 1291 organic compounds. A neural network with a 33-4-1 architecture gave highly predictive results, with r^2 = 0.91 and RMS = 0.62. The parameters used included several combinations of E-state indices with similar properties. The calculated results were similar to those published for these data by Huuskonen (2000). However, the current study used only E-state indices, without the additional indices (molecular connectivity, shape, flexibility and indicator indices) also considered in the previous study. In addition, the present neural network contained one third as many hidden neurons. Smaller neural networks and the use of one homogeneous set of parameters provide a more robust model for predicting the aqueous solubility of chemical compounds. Limitations of the developed method for the prediction of large compounds are discussed. The developed approach is available online at http://www.lnh.unil.ch/~itetko/logp.
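To make the size of a 33-4-1 network concrete, the sketch below counts the adjustable parameters of a fully connected feed-forward network. It assumes one bias term per hidden and output neuron, which the abstract does not state; the function name `count_parameters` is illustrative only.

```python
def count_parameters(layer_sizes, with_bias=True):
    """Count weights (and optionally biases) of a fully connected network.

    layer_sizes lists the number of neurons per layer, e.g. [33, 4, 1].
    The bias-per-neuron convention is an assumption, not taken from the abstract.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # connection weights between adjacent layers
        if with_bias:
            total += n_out         # one bias per receiving neuron
    return total


n_params = count_parameters([33, 4, 1])
print(n_params)                    # adjustable parameters of the 33-4-1 network
print(1291 / n_params)             # compounds available per parameter
```

Under these assumptions the network has far fewer parameters than there are compounds in the data set, which is consistent with the abstract's argument that a smaller network gives a more robust model.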

2.
This study compares the solubility predictions of the two-parameter general solubility equation (GSE) of Jain and Yalkowsky with those of the 171-parameter Klopman group contribution approach. Melting points and partition coefficients were obtained for each of the compounds in Klopman's test set. Using these two variables, the solubility of each compound was calculated by the GSE and compared to the values predicted by Klopman. Both methods give reasonable solubility predictions. The data of Klopman produced an average absolute error (AAE) of 0.71 and a root-mean-square error (RMSE) of 0.86, while the GSE had an AAE of 0.64 and an RMSE of 0.92.
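The abstract does not reproduce the GSE itself. The sketch below uses the commonly cited form, log S = 0.5 - 0.01 (MP - 25) - log Kow, with the melting point MP in degrees Celsius and the melting-point term set to zero for compounds that are liquid at 25 °C. Treat that form, the function names and the sample inputs as assumptions. The two error metrics named in the abstract are included as well.

```python
import math


def gse_log_s(melting_point_c, log_kow):
    """Estimate log10 molar aqueous solubility from two inputs.

    Assumed form of the general solubility equation:
        log S = 0.5 - 0.01 * (MP - 25) - log Kow
    The melting-point term is dropped for liquids (MP <= 25 degrees C).
    """
    mp_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_kow


def average_absolute_error(predicted, observed):
    """Mean of |predicted - observed| (AAE)."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def root_mean_square_error(predicted, observed):
    """Square root of the mean squared difference (RMSE)."""
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)
    return math.sqrt(mse)


# Hypothetical compound: melting point 125 degrees C, log Kow = 2.0
print(gse_log_s(125.0, 2.0))
```

Because RMSE squares each residual before averaging, a method can have the lower AAE and still the higher RMSE if it makes a few large errors, which is the pattern the reported GSE figures show.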

3.
QSAR models have been under development for decades, but acceptance and utilization of model results have been slow, in part because there is no widely accepted metric for assessing their reliability. We reapply a method commonly used in quantitative epidemiology and medical decision-making for evaluating the results of screening tests to assess the reliability of a QSAR model. It quantifies the accuracy (expressed as sensitivity and specificity) of QSAR models as conditional probabilities of correct and incorrect classification of a chemical characteristic, given a true characteristic. Using Bayes' formula, these conditional probabilities are combined with prior information to generate a posterior distribution that gives the probability that a specific chemical has a particular characteristic, given a model prediction. As an example, we apply this approach to evaluate the predictive reliability of a CATABOL model and base on it a "ready" and "not ready" biodegradability classification. Finally, we show how the predictive capability of the model can be improved by sequential use of two models, the first with high sensitivity and the second with high specificity.
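The Bayes update described above can be sketched as follows. All numbers are hypothetical and are not taken from the CATABOL study. The sequential step additionally assumes that the two models err independently given the true characteristic, which the abstract does not state.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """P(chemical has the characteristic | model predicts it)."""
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


def posterior_given_negative(prior, sensitivity, specificity):
    """P(chemical has the characteristic | model predicts its absence)."""
    false_neg = (1.0 - sensitivity) * prior
    true_neg = specificity * (1.0 - prior)
    return false_neg / (false_neg + true_neg)


# Hypothetical prior and model accuracies (illustrative values only)
prior = 0.30
# First model: high sensitivity, modest specificity
p1 = posterior_given_positive(prior, sensitivity=0.95, specificity=0.60)
# Second model: high specificity, applied to chemicals the first model flagged;
# the first posterior serves as the new prior (conditional independence assumed)
p2 = posterior_given_positive(p1, sensitivity=0.70, specificity=0.95)
print(prior, p1, p2)
```

In this sketch the first model screens broadly and the second confirms, so a chemical flagged by both ends up with a higher posterior probability than either model yields alone.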

5.
6.
Support vector machines for the estimation of aqueous solubility   Total citations: 2, self-citations: 0, citations by others: 2
Support Vector Machines (SVMs) are used to estimate the aqueous solubility of organic compounds. An SVM equipped with a Tanimoto similarity kernel estimates solubility with accuracy comparable to results from other reported methods where the same data sets have been studied. Complete cross-validation on a diverse data set resulted in a root-mean-squared error of 0.62 and R^2 = 0.88. The data input to the machine is in the form of molecular fingerprints. No physical parameters are explicitly involved in the calculations.
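A minimal sketch of a Tanimoto similarity kernel on binary fingerprints follows. Each fingerprint is represented here as the set of its "on" bit positions, which is an assumed encoding. The value returned for two empty fingerprints is a convention chosen for this example, and the function names are illustrative.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity: shared on-bits divided by the union of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 1.0  # assumption: two empty fingerprints count as identical
    return len(a & b) / union


def gram_matrix(fingerprints):
    """Pairwise kernel matrix, usable by any SVM that accepts a precomputed kernel."""
    return [[tanimoto(x, y) for y in fingerprints] for x in fingerprints]


# Hypothetical fingerprints given as on-bit indices
fps = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
for row in gram_matrix(fps):
    print(row)
```

Because the kernel depends only on which bits two fingerprints share, no physical descriptors enter the calculation, matching the abstract's closing remark.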

7.
Cyclodextrins (CDs) are cyclic oligosaccharides that form inclusion complexes with lipophilic molecules through their hydrophobic central cavity. In this study, the effect of α-CD, hydroxypropyl-β-CD (HP-β-CD) and mixtures of these two CDs on the aqueous solubility of cyclosporine A (CyA) was investigated. Infrared spectroscopy and thermal analysis were used to confirm CyA-CD complex formation. CyA aqueous solubility was increased 10- and 80-fold in the presence of α-CD and HP-β-CD, respectively. The phase-solubility profile for HP-β-CD was linear, while that for α-CD showed a positive deviation from linearity. In the presence of a constant concentration of α-CD (15% w/v), the aqueous solubility of CyA was further increased upon addition of HP-β-CD up to a concentration of 20% w/v. At higher HP-β-CD concentrations, the aqueous solubility of CyA was observed to decrease. Addition of sodium acetate (up to 5% w/v) to aqueous solutions containing 20% w/v HP-β-CD and increasing concentrations of α-CD resulted in a significant reduction in CyA solubility. Complex formation between CyA and both α-CD and HP-β-CD was confirmed by differential scanning calorimetry (DSC). No significant changes were observed in the IR spectra of either CyA or CD following complex formation, suggesting that chemical interaction between CyA and the CD was unlikely. Phase-solubility studies showed that α-CD had a much greater effect on the solubility of CyA than HP-β-CD. Addition of HP-β-CD to aqueous solutions of α-CD affected the solubility of CyA in these systems. A mixture of 15% w/v α-CD and 20% w/v HP-β-CD was optimal for increasing the aqueous solubility of CyA.

8.
It is a difficult task to recognize the trends in molecular physical properties relevant to a specific chemical class and find a way to optimize potential compounds. We present here a novel hierarchical data visualization technique, named "HeiankyoView", to visualize large-scale multidimensional chemical information. HeiankyoView represents hierarchically organized data objects by mapping leaf nodes as colored square icons and nonleaf nodes as rectangular borders. In this way, data objects can be expressed as equishaped icons without overlapping one another in the two-dimensional display space. HeiankyoView has been applied to visualize aqueous solubility data for 908 compounds collected from the published literature. When the results of a recursive partitioning analysis and hierarchical clustering analysis were visualized, the trends hidden in the solubility data could be effectively displayed as intuitively understandable visual images. Most interestingly, the data visualization technique, without any statistical computations, was able to assist us in extracting from such large-scale data meaningful information establishing that ClogP and the molecular weight are critical factors in determining aqueous solubility. Thus, HeiankyoView is a powerful tool to help us understand structure-activity relationships intuitively from a large-scale data set.

9.
10.
Accurate modeling of the solubility behavior of CO2 in aqueous alkanolamine solutions is important for the design and optimization of equipment and processes. In this work, the thermodynamics of CO2 in aqueous solutions of N-methyldiethanolamine (MDEA) and piperazine (PZ) is studied with the electrolyte non-random two-liquid (NRTL) model. The chemical equilibrium constants are calculated from the Gibbs free energy of formation, and the Henry’s constants of CO2 in MDEA and PZ are regressed to revise the value in pure water. New experimental data from the literature are added to the regression process. Therefore, this model should provide a comprehensive thermodynamic representation of the quaternary system, with broader ranges and more accurate predictions than previous work. Model results are compared to experimental vapor-liquid equilibrium (VLE), speciation and heat of absorption data; the comparison shows that the model can predict the experimental data with reasonable accuracy.

11.
The solubility of drugs in water is of central importance in the process of drug discovery and development, from molecular design to pharmaceutical formulation and biopharmacy. The ability to estimate the aqueous solubility and other properties of a promising lead compound that affect its pharmacokinetics is a prerequisite to rational drug design, although it has received much less attention than the prediction of drug-receptor interactions. In this review, methods for estimating the aqueous solubility of organic compounds are described, limited to approaches that might be used in the early stages of drug design and development.

12.
Artificial neural networks have been used for the correlation and prediction of solubility data of ammonia in ionic liquids. The solubility of ammonia is highly variable across different types of ionic liquids at the same temperature and pressure, and its correlation and prediction are of special importance in the removal of ammonia from flue gases, for which effective and efficient solvents are required. Nine binary ammonia + ionic liquid mixtures were considered in the study. Solubility data (PTx) for these systems were taken from the literature (208 data points for training and 50 data points for testing). The training variables are the temperature and the pressure of the binary systems (T, P), and the target variable is the solubility of ammonia in the ionic liquid (x). The study shows that the neural network model is a good alternative method for estimating solubility in this type of mixture. Absolute average deviations were below 5.6% for each isothermal data set, and overall absolute average deviations were below 3.0%. Only in the range of low solubility (below 0.2 in mole fraction) did the predicted solubility give deviations higher than 10%.
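The abstract reports deviations in percent without defining them. The sketch below uses the usual relative form, the mean of |x_pred - x_exp| / x_exp multiplied by 100, which is an assumption here, as are the function name and the sample values.

```python
def absolute_average_deviation_percent(predicted, experimental):
    """Mean relative deviation in percent (assumed definition of the AAD)."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    relative = [abs(p - e) / abs(e) for p, e in zip(predicted, experimental)]
    return 100.0 * sum(relative) / len(relative)


# Hypothetical mole fractions of ammonia in an ionic liquid
x_exp = [0.10, 0.20, 0.50]
x_pred = [0.11, 0.18, 0.50]
print(absolute_average_deviation_percent(x_pred, x_exp))
```

With a relative measure like this, the same absolute error weighs more heavily at small mole fractions, which would be consistent with the larger deviations the abstract reports in the low-solubility range.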

13.
The indisputable importance of drug solubility from various industrial perspectives has motivated scientists to evaluate different techniques to improve it. Fenoprofen is an important nonsteroidal anti-inflammatory drug (NSAID) that is administered orally to relieve mild to moderate pain and the symptoms of osteoarthritis and rheumatoid arthritis (i.e., inflammation and stiffness). Supercritical fluids (SCFs) are fluids whose temperature and pressure are above the critical point. This property allows supercritical CO2 (CO2SCF) to simultaneously possess the characteristics of both a liquid and a gas. The main aim of this paper is to develop three predictive machine learning (ML) models to optimize the solubility of Fenoprofen in CO2SCF. Each dataset in this study contains 32 data vectors with two input features, pressure and temperature. The output target to be modeled and analyzed is solubility. The models are built with Modular ANN (MANN), Gaussian process regression (GPR), and the K-Nearest Neighbor technique (KNN). The swarm-based glowworm swarm optimization (GSO) method is used to optimize the models. The root mean squared errors (RMSE) for GSO-KNN, GSO-MANN, and GSO-GPR are 5.25E-04, 5.46E-04, and 3.01E-05, respectively. The models were also judged according to a number of other criteria, and since the GSO-GPR model was found to be the most effective by all of these standards, it is treated as the final model of this investigation. In addition, this model reduced the maximum error to 5.02E-05 and has an R^2 score of 0.999.

14.
Component control is a key step of the quality control process in gentamicin (GM) production. The near infrared (NIR) method allows the rapid analysis of various components for on-line control in the production process. In this study, we selected specific NIR spectral regions and eliminated solvent interference in constructing predictive models of GM content in aqueous solution. We found that two factors could lead to better NIR predictions: (i) the combined use of specific NIR spectral regions related to both structural characteristics and content; and (ii) the analysis of sample spectra based on solvent reference spectra to identify and extract spectral information of the components being tested. We constructed predictive models for total GM C components, GM C1a (C1a), micronomicin (MCR) and sisomicin (SISO) in aqueous solution. These models can be used for the quick content analysis of different components in various types of GM injection samples.

15.
16.
We describe the use of Bayesian regularized artificial neural networks (BRANNs) coupled with automatic relevance determination (ARD) in the development of quantitative structure-activity relationship (QSAR) models. These BRANN-ARD networks have the potential to solve a number of problems that arise in QSAR modeling, such as the following: choice of model; robustness of model; choice of validation set; size of validation effort; and optimization of network architecture. The ARD method ensures that irrelevant or highly correlated indices used in the modeling are neglected, and it shows which variables are most important in modeling the activity data. The application of the methods to QSAR of compounds active at the benzodiazepine and muscarinic receptors, as well as to some toxicological data on the effect of substituted benzenes on Tetrahymena pyriformis, is illustrated.

17.
In the present work, the Henderson-Hasselbalch (HH) equation has been employed to develop a tool for predicting the pH-dependent aqueous solubility of drugs and drug candidates. A new prediction method for the intrinsic solubility was developed, based on artificial neural networks trained on a druglike PHYSPROP subset of 4548 compounds. For the prediction of acid/base dissociation constants, the commercial tool Marvin has been used, following validation on a data set of 467 molecules from the PHYSPROP database. The best performing network for intrinsic solubility predictions has a cross-validated root mean square error (RMSE) of 0.70 log S-units, while the Marvin pKa plug-in has an RMSE of 0.71 pH-units. A data set of 27 drugs with experimentally determined pH-solubility curves was assembled from the literature for the validation of the combined pH-dependent model, giving a mean RMSE of 0.79 log S-units. Finally, the combined model has been applied to profiling the solubility space at low pH of five large vendor libraries.
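The way the HH equation combines an intrinsic solubility with a pKa can be sketched for the simplest case, a compound with a single ionizable group. The abstract does not say how multiprotic compounds are handled, so the monoprotic restriction, the function name and the sample values are all assumptions. In the described tool, the intrinsic solubility and pKa inputs would come from the neural network and from Marvin, respectively.

```python
import math


def log_s_at_ph(log_s0, pka, ph, kind):
    """log10 total solubility of a monoprotic acid or base at a given pH.

    Henderson-Hasselbalch form (monoprotic case only, an assumption):
        acid: S = S0 * (1 + 10**(pH - pKa))
        base: S = S0 * (1 + 10**(pKa - pH))
    where S0 is the intrinsic solubility of the neutral species.
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_s0 + math.log10(1.0 + 10.0 ** exponent)


# Hypothetical weak base: log S0 = -4.0, pKa = 9.0, profiled at low pH
for ph in (2.0, 5.0, 7.4):
    print(ph, log_s_at_ph(-4.0, 9.0, ph, "base"))
```

In this form, an error in the predicted pKa shifts the ionization term directly, so the uncertainty of the combined model reflects both the intrinsic solubility error and the pKa error.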

18.
19.
The solubility of oxygen in aqueous fluorocarbon emulsions has been measured directly for several perfluorocarbons and monobromo- or monoiodo-perfluorocarbons. The measured oxygen solubilities are consistent with results for the solubility of oxygen in neat liquid perfluorinated organic compounds.

20.
Aqueous solubility is recognized as a critical parameter in both early- and late-stage drug discovery. Therefore, in silico modeling of solubility has attracted extensive interest in recent years. Most previous studies have been limited to relatively small data sets with limited diversity, which in turn limits the predictive power of the derived models. In this work, we present a support vector machine model for the binary classification of solubility that takes advantage of the largest known public data set, which contains over 46,000 compounds with experimental solubility. Our model was optimized in combination with a reduction and recombination feature selection strategy. The best model demonstrated robust performance in both cross-validation and prediction of two independent test sets, indicating that it could be a practical tool for selecting soluble compounds for screening, purchasing, and synthesizing. Moreover, because it relies entirely on public resources, our work may be used for comparative evaluation of solubility classification studies.
