Similar articles
 Found 20 similar articles; search time 90 ms
1.
Multivariate curve resolution (MCR), and especially the orthogonal projection approach (OPA), can be applied to spectroscopic data and has proved suitable for process monitoring. To improve the quality of on-line monitoring of batch processes, it is desirable to acquire as many spectra as possible in a given period of time. However, hardware limitations may cap the number of spectra that can be acquired in that period. Wavelength selection could be a good way to mitigate this problem, since it decreases the size, and consequently the acquisition time, of each recorded spectrum. This paper details an industrial application of genetic algorithms (GA) coupled with a curve resolution method (OPA) for this purpose.

2.
3.
4.
5.
Simplified Molecular Input Line Entry System (SMILES) notation has been used to represent molecular structure in the construction of quantitative structure-activity relationships (QSAR) for predicting bee toxicity. Numerical parameters that are simple and fast to calculate have been obtained from the symbols used in the SMILES notation. The method has been used to develop a QSAR model to predict the toxicity of pesticides to bees. Results on a heterogeneous set of pesticides are good. The statistical characteristics of this model are: n=85, R2=0.68, s=0.82, F=180 (training set); n=20, R2=0.72, s=0.68, F=46 (test set).
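As an illustration of how simple numerical parameters can be read off the symbols of a SMILES string, here is a minimal sketch. The abstract does not say how the parameters are actually defined, so plain symbol counts are an assumption; the tokenizer and the names `smiles_symbol_counts` and `descriptor_vector` are hypothetical.

```python
import re
from collections import Counter

# Assumed tokenization: bracket atoms first, then the two-letter halogens,
# then any single character (atoms, bonds, ring-closure digits, branches).
_TOKEN = re.compile(r"\[[^\]]*\]|Br|Cl|.")


def smiles_symbol_counts(smiles: str) -> Counter:
    """Count how often each symbol occurs in a SMILES string."""
    return Counter(_TOKEN.findall(smiles))


def descriptor_vector(smiles: str, vocabulary: list[str]) -> list[int]:
    """Turn the symbol counts into a fixed-length vector for a regression model."""
    counts = smiles_symbol_counts(smiles)
    return [counts[symbol] for symbol in vocabulary]


# Chlorobenzene: six aromatic carbons, one chlorine, two ring-closure digits.
vector = descriptor_vector("c1ccccc1Cl", ["c", "Cl", "1", "N"])
```

A vector like this could then be fed to any regression method; which symbols enter the vocabulary is a modeling choice that the abstract leaves open.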

6.
7.
8.
Quantitative structure–activity relationship (QSAR) methods are urgently needed for predicting ADME/T (absorption, distribution, metabolism, excretion and toxicity) properties, both to select lead compounds for optimization at the early stage of drug discovery and to screen drug candidates for clinical trials. Use of suitable QSAR models ultimately results in a lower time cost and a lower attrition rate during drug discovery and development. Among ADME/T parameters, drug metabolism is a key determinant of metabolic stability, drug–drug interactions, and drug toxicity. QSAR models for predicting drug metabolism have undergone significant advances recently. However, most of the models in use lack sufficient interpretability and offer poor predictability for novel drugs. In this review, we describe some considerations to be taken into account when building QSAR models of drug metabolism, such as the accuracy and consistency of the entire data set, the representation and diversity of the training and test sets, and variable selection. We also describe some novel statistical techniques (ensemble methods, multivariate adaptive regression splines and graph machines) that are not yet used frequently to develop QSAR models for drug metabolism. Subsequently, rational recommendations for developing predictive and interpretable QSAR models are made. Finally, the recent advances in QSAR models for cytochrome P450-mediated drug metabolism prediction, including in vivo hepatic clearance, in vitro metabolic stability, and inhibitors and substrates of cytochrome P450 families, are briefly summarized.

9.
In this study, a new variable selection method called bootstrapping soft shrinkage (BOSS) is developed. It is derived from the ideas of weighted bootstrap sampling (WBS) and model population analysis (MPA). The weights of variables are determined from the absolute values of regression coefficients. WBS is applied according to the weights to generate sub-models, and MPA is used to analyze the sub-models and update the weights of the variables. The optimization procedure follows the rule of soft shrinkage, in which less important variables are not eliminated directly but are assigned smaller weights. The algorithm runs iteratively and terminates when the number of variables reaches one. The variable set with the lowest root mean squared error of cross-validation (RMSECV) is selected as optimal. The method was tested on three groups of near infrared (NIR) spectroscopic datasets: corn, diesel fuel and soy. Three high-performing variable selection methods, namely Monte Carlo uninformative variable elimination (MCUVE), competitive adaptive reweighted sampling (CARS) and genetic algorithm partial least squares (GA-PLS), are used for comparison. The results show that BOSS is promising, with improved prediction performance. The Matlab codes for implementing BOSS are freely available on the website: http://www.mathworks.com/matlabcentral/fileexchange/52770-boss.
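The weight-update loop described in this abstract (weighted bootstrap sampling followed by a soft-shrinkage update of the weights) might be sketched roughly as follows. This is not the authors' Matlab code: the sub-model fit is replaced by a hypothetical callback `score_subset` that returns one coefficient per sampled variable, and the ranking of sub-models by RMSECV is left out, so only the sampling and reweighting steps are shown.

```python
import random


def weighted_bootstrap(weights, n_draws, rng):
    """Draw variable indices with replacement according to the weights,
    then keep the distinct indices as the variables of one sub-model."""
    drawn = rng.choices(range(len(weights)), weights=weights, k=n_draws)
    return sorted(set(drawn))


def soft_shrinkage_step(weights, score_subset, n_models, n_draws, rng):
    """One iteration: build n_models sub-models and derive new weights from
    the absolute coefficients they assign to the sampled variables."""
    new_weights = [0.0] * len(weights)
    for _ in range(n_models):
        subset = weighted_bootstrap(weights, n_draws, rng)
        for index, coefficient in zip(subset, score_subset(subset)):
            new_weights[index] += abs(coefficient)
    total = sum(new_weights)
    # Normalize so the weights can be reused as sampling probabilities.
    return [w / total for w in new_weights]


# Stand-in for a fitted regression: fixed, made-up coefficients per variable.
made_up_coefficients = [1.0, 0.5, 0.1]
rng = random.Random(0)
weights = soft_shrinkage_step(
    [1 / 3, 1 / 3, 1 / 3],
    lambda subset: [made_up_coefficients[j] for j in subset],
    n_models=50,
    n_draws=3,
    rng=rng,
)
```

Variables with small coefficients keep a small but nonzero weight here, which is the sense in which the shrinkage is soft; repeating the step concentrates the sampling on the more important variables.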

10.
There are many pathogenic microbial species with very different susceptibility to antimicrobial drugs. In this work, we selected pairs of antifungal drugs with similar or dissimilar predicted multispecies activity profiles and represented them as a large network, which may be used to identify drugs with a similar mechanism of action. Computational prediction of biological activity based on quantitative structure-activity relationships (QSAR) substantially increases the potential of this kind of network by avoiding time- and resource-consuming experiments. Unfortunately, most QSAR models are unspecific or predict activity against only one species. To solve this problem, we developed a multispecies QSAR classification model whose outputs were the inputs of the aforementioned network. Overall model classification accuracy was 87.0% (161/185 compounds) in training, 83.4% (50/61) in validation, and 83.7% for 288 additional antifungal compounds used to extend model validation for network construction. The predicted network has 59 nodes (compounds), 648 edges (pairs of compounds with similar activity), a low coverage density d = 37.8%, and a distribution closer to normal than to exponential. These results are more characteristic of a not-overestimated random network, clustering different drug mechanisms of action, than of a less useful power law network with few mechanisms (network hubs).
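The reported density can be checked against the node and edge counts, assuming that "coverage density" means the fraction of all possible undirected edges that are present (the abstract does not define it). The function name `edge_density` is hypothetical.

```python
def edge_density(n_nodes: int, n_edges: int) -> float:
    """Fraction of the n(n-1)/2 possible undirected edges that are present."""
    possible_edges = n_nodes * (n_nodes - 1) / 2
    return n_edges / possible_edges


# 59 compounds and 648 similarity pairs, as reported in the abstract.
density = edge_density(59, 648)
```

Under this assumption the value comes out at roughly 0.3787, which agrees with the reported 37.8% up to rounding.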

11.
12.
This study analyzed the influence of rational selection of the training and test sets on the quality and predictivity of quantitative structure–activity relationship (QSAR) models. The study was carried out on three different datasets of Influenza Neuraminidase (H1N1) inhibitors. The three datasets were divided into training and test sets using three rational selection methods (k-means clustering, the Kennard–Stone algorithm, and activity-based selection), and the results were compared with random selection. A total of 31,490 mathematical models were then developed, and the models with determination coefficients r2train > 0.8, r2loo > 0.7 and r2test > 0.5, together with minimum standard deviation (SD) and minimum root-mean-square error (RMS), were selected. The selected models were validated using the internal leave-one-out method, and their predictive capacity was evaluated on the external test set. The results indicate that random selection could lead to erroneous results. In contrast, rational selection allows more reliable conclusions to be obtained. The QSAR models with the greatest predictive power were found using the k-means algorithm and selection by activity.
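Of the rational selection methods named here, the Kennard–Stone algorithm is easy to sketch: start from the two most distant compounds, then repeatedly add the compound whose nearest already-selected neighbor is farthest away. The version below is a minimal illustration assuming Euclidean distance in descriptor space; the function name and the toy coordinates are made up and do not come from the study.

```python
import math


def kennard_stone(points, n_select):
    """Return the indices of n_select points chosen by Kennard-Stone."""
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Seed the selection with the two most distant points.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]

    while len(selected) < n_select:
        # Max-min rule: take the candidate farthest from its closest selected point.
        chosen = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(chosen)
        remaining.remove(chosen)
    return selected


# Toy one-descriptor example: the selected indices form the training set,
# everything else goes to the test set.
toy_points = [(0.0, 0.0), (10.0, 0.0), (5.0, 0.0), (1.0, 0.0)]
training = kennard_stone(toy_points, 3)
test = [k for k in range(len(toy_points)) if k not in training]
```

Because the rule always fills the largest gap first, the training set spans the descriptor space evenly, which is the property that distinguishes it from a random split.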

13.
14.
A quantitative structure–activity relationship (QSAR) is a mathematical model that relates a molecular structure to a physicochemical property or a biological activity. The log P of a set of 38 2-furylethylenes, biologically active substances exhibiting a broad spectrum of antimicrobial, antiparasitic, cytotoxic, carcinogenic and mutagenic activities, was modeled using topological indices provided by the TOPOCLUJ and DRAGON software packages. The derived models showed good stability and predictability (as given by the leave-one-out, LOO, cross-validation data). The results are compared with those reported in the literature, obtained by different methodologies.

15.
16.
The identification of disease-relevant genes is a challenge in microarray-based disease diagnosis, where the sample size is often limited. Among established methods, reversible jump Markov Chain Monte Carlo (RJMCMC) methods have proven to be quite promising for variable selection. However, the design and application of an RJMCMC algorithm requires, for example, special criteria for prior distributions. Also, simulation from the joint posterior distributions of models is computationally intensive and may even be mathematically intractable. These disadvantages may limit the applications of RJMCMC algorithms. Therefore, algorithms are required that possess the advantages of RJMCMC methods and are also efficient and easy to follow for selecting disease-associated genes. Here we report an RJMCMC-like method, called random frog, that possesses the advantages of RJMCMC methods and is much easier to implement. Using the colon and the estrogen gene expression datasets, we show that random frog is effective in identifying discriminating genes. The top 2 ranked genes for colon and estrogen are Z50753 and U00968, and Y10871_at and Z22536_at, respectively. (The source codes, under GNU General Public License Version 2.0, are freely available to non-commercial users at: http://code.google.com/p/randomfrog/.)

17.
Validation is a crucial aspect of quantitative structure–activity relationship (QSAR) model development. External validation is generally considered the most conclusive proof of the predictive capacity of a QSAR model. In the absence of a truly external data set, external validation is usually performed on test set compounds, which are members of the original data set but are not used in model development. With small data sets, QSAR researchers run into problems in model development: the developed models may be less reliable because of the small number of training set compounds, and they may also show poor external predictability because they may not have captured all the features required for the particular structure–activity relationships. The present paper attempts to show that the ‘true r(LOO)’ statistic, calculated from the model derived from the undivided data set with a variable selection strategy applied at each cycle of leave‐one‐out (LOO) validation, may reflect the external validation characteristics of the developed model, thus obviating the need to split the data set into training and test sets. This approach may be helpful for small data sets, as it uses all available data for model development and validation, making the resulting model more reliable. Copyright © 2009 John Wiley & Sons, Ltd.
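The key point, that variable selection is repeated inside every leave-one-out cycle rather than done once on the full data set, can be illustrated with a small sketch. The selection rule used here (pick the single descriptor with the highest absolute correlation to the response, then fit a straight line) and the summary statistic 1 - PRESS/SS are simplifying assumptions for illustration, not the procedure or the exact statistic of the paper; all names are hypothetical.

```python
from statistics import mean


def _sums(x, y):
    """Return the centered sums of squares and cross-products of x and y."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return mx, my, sxx, syy, sxy


def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either variable is constant."""
    _, _, sxx, syy, sxy = _sums(x, y)
    return 0.0 if sxx == 0 or syy == 0 else abs(sxy) / (sxx * syy) ** 0.5


def fit_line(x, y):
    """Least-squares slope and intercept (slope 0 for a constant x)."""
    mx, my, sxx, _, sxy = _sums(x, y)
    slope = 0.0 if sxx == 0 else sxy / sxx
    return slope, my - slope * mx


def loo_with_selection(X, y):
    """Leave-one-out statistic with the descriptor re-selected in every cycle."""
    n_vars = len(X[0])
    press = 0.0
    for i in range(len(y)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        columns = [[row[j] for row in X_train] for j in range(n_vars)]
        # Selection sees only the n-1 retained compounds, never compound i.
        best = max(range(n_vars), key=lambda j: abs_corr(columns[j], y_train))
        slope, intercept = fit_line(columns[best], y_train)
        press += (y[i] - (intercept + slope * X[i][best])) ** 2
    y_mean = mean(y)
    return 1.0 - press / sum((v - y_mean) ** 2 for v in y)


# Made-up data: the first descriptor determines y exactly, the second is noise.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0], [5.0, 4.0]]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
q2 = loo_with_selection(X, y)
```

Selecting the descriptor once on all compounds and only then running leave-one-out would let the held-out compound influence the selection, which is the optimism this scheme is meant to avoid.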

18.
19.
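A study of quantitative structure-activity relationships for the toxicity of nitroaromatic compounds to the fathead minnow   Total citations: 6, self-citations: 1, citations by others: 6  
The net charges (QC, QN and Q-NO2) of 50 nitroaromatic compounds were calculated by the CNDO/2 method; for 42 of these compounds, ELUMO, EHOMO, the difference in heats of formation Δ(ΔHf) and the dipole moment μ were calculated by the MNDO method. The structure-activity relationship between these 7 quantum-chemical parameters and the 96 h-LC50 toxicity to the fathead minnow was analyzed quantitatively, and statistical analysis gave the following model: -lg LC50 = 11.35 - 1.28 ELUMO - 9.17 QN + 0.46 EHOMO - 0.12 μ, with n=35, r=0.920, s=0.298. The resulting equation and the quantum-chemical parameters are used to discuss the toxic action of the studied series of compounds in fish.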
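The regression equation in this abstract can be evaluated directly once the four descriptors are known. The sketch below only transcribes the reported coefficients; the function name is hypothetical, the abstract does not state the units of the descriptors or of LC50, and the input values in the example are made up rather than taken from the paper.

```python
def neg_log_lc50(e_lumo: float, q_n: float, e_homo: float, mu: float) -> float:
    """Evaluate -lg LC50 = 11.35 - 1.28*ELUMO - 9.17*QN + 0.46*EHOMO - 0.12*mu."""
    return 11.35 - 1.28 * e_lumo - 9.17 * q_n + 0.46 * e_homo - 0.12 * mu


# Made-up descriptor values, used only to show the call.
predicted = neg_log_lc50(e_lumo=1.0, q_n=1.0, e_homo=1.0, mu=1.0)
```

Since the equation predicts -lg LC50, a larger predicted value corresponds to a lower LC50, that is, to higher toxicity.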

20.
The calibration performance of partial least squares regression for one response (PLS1) can be improved by eliminating uninformative variables. Many variable-reduction methods are based on so-called predictor-variable properties or predictive properties, which are functions of various PLS-model parameters and which may change during the steps of the variable-reduction process. Recently, a new predictive-property-ranked variable reduction method with final complexity adapted models, denoted PPRVR-FCAM or simply FCAM, was introduced. It is a backward variable elimination method applied to the predictive-property-ranked variables. The number of variables is first reduced, with constant PLS1 model complexity A, until A variables remain; the PLS complexity is then decreased further, allowing the final selection of small numbers of variables.
