Similar articles
1.
2.
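Constructing a support vector machine-partial least squares method for modeling drug structure-activity relationships   Total citations: 6, self-citations: 0, citations by others: 6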
李剑  陈德钊  成忠  叶子青 《分析化学》2006,34(2):263-266
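While sample data are still being accumulated for the study of drug structure-activity relationships, models must be built from small samples. Overfitting then occurs easily and impairs the predictive performance and stability of the model. To address this, partial least squares (PLS) can be used to extract optimal components in pairs from the sample data, eliminating multicollinearity among the independent variables and effectively reducing dimensionality. A least squares support vector machine is then applied for nonlinear regression on the paired components, adjusted by an error-based correction strategy, so that the nonlinear relationship between the independent and dependent variables is expressed more effectively. The resulting EB-LSSVM-PLS algorithm yields models with high prediction accuracy and good stability. Applied to QSAR modeling of novel flavanone derivatives, it gave satisfactory results, and its generalization performance was better than that of other methods.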
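
The component-extraction step described above can be illustrated with a minimal sketch of the first component of single-response PLS (PLS1). This is not the EB-LSSVM-PLS algorithm itself: the function name `first_pls_component` and the toy data are assumptions made for illustration, and the LS-SVM regression and error-based correction steps are omitted.

```python
import math

def first_pls_component(X, y):
    """Return (w, t): unit weight vector and scores of the first PLS1 component."""
    n, p = len(X), len(X[0])
    # Center each column of X and the response y.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # The weight vector is proportional to X^T y, i.e. the covariance
    # of each centered variable with the centered response.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores are the projections of the centered samples onto w.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    return w, t

# Hypothetical toy data: two nearly collinear descriptors, four samples.
X = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2]]
y = [1.1, 1.9, 3.2, 3.9]
w, t = first_pls_component(X, y)
```

The two collinear descriptors collapse into a single score vector `t`, which is the dimensionality reduction the abstract refers to; a nonlinear regressor would then be fitted on such scores.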

3.
The performance of back-propagation artificial neural networks (NN) and partial least squares (PLS) regression for the calibration of linear and nonlinear systems has been investigated by using six types of synthetic data. Three PLS methods, conventional linear-PLS and two nonlinear-PLS methods, have been used in the study. In all but one of the synthetic data types, the band intensities varied nonlinearly with concentration. These five data types were designed to represent the effect of band shifts with increasing concentration, a nonlinear relationship between peak height and concentration, or a combination of both types of nonlinearities. The results showed that NNs perform better than PLS for all the nonlinear datasets. When a band shift is the major reason for the nonlinearity, the relative performance of NNs and PLS depends on the overlap of the absorption bands. If there is no band overlap, neither NN nor PLS can calibrate the data accurately, but the results could be improved by convolving the spectral features with a Gaussian broadening function. The results indicate that a combination of peak position shift and peak height change is the most difficult nonlinearity to calibrate. NN and PLS were also used to determine the concentration of CHCl3 in the pure component and in mixtures of CHCl3 and CH2Cl2 using their Fourier transform infrared (FT-IR) spectra, a dataset that has been shown to be nonlinear at high concentrations due to the nonlinear response of the detector. The best results for the experimental data were obtained by applying a one-hidden-layer NN to the mean-centered absorbance spectra.

4.
A chemometric approach based on the combined use of principal component analysis (PCA) and an artificial neural network (ANN) was developed for the multicomponent determination of caffeine (CAF), mepyramine (MEP), phenylpropanolamine (PPA) and pheniramine (PNA) in their pharmaceutical preparations without any chemical separation. The predictive ability of the ANN method was compared with the classical linear regression method Partial Least Squares 2 (PLS2). The UV spectral data between 220 and 300 nm of a training set of sixteen quaternary mixtures were processed by PCA to reduce the dimensions of the input data and eliminate instrumental noise. Several spectral ranges and different numbers of principal components (PCs) were tested to find the PCA-ANN and PLS2 models giving the best determination results. A two-layer ANN, using the first four PCs, was used with a log-sigmoid transfer function in the first hidden layer and a linear transfer function in the output layer. Standard error of prediction (SEP) was adopted to assess the predictive accuracy of the models when subjected to external validation. PCA-ANN showed better prediction ability in the determination of PPA and PNA in synthetic samples with added excipients and pharmaceutical formulations. Since both components are characterized by low absorptivity, the better performance of PCA-ANN was ascribed to its ability to take into account all non-linear information from noise or interfering excipients.

5.
Infrared emissions (IREs) of samples of pentaerythritol tetranitrate (PETN) deposited as contamination residues on various substrates were measured to generate models for the detection and discrimination of the important nitrate ester from the emissions of the substrates. Mid‐infrared emissions were generated by heating the samples remotely using laser‐induced thermal emission (LITE). Chemometrics multivariate analysis techniques such as principal component analysis (PCA), soft independent modeling by class analogy (SIMCA), partial least squares‐discriminant analysis (PLS‐DA), support vector machines (SVMs), and neural network (NN) were employed to generate the models for the classification and discrimination of PETN IREs from substrate thermal emissions. PCA exhibited less variability for the LITE spectra of PETN/substrates. SIMCA was able to predict only 44.7% of all samples, while SVM proved to be the most effective statistical analysis routine, with a discrimination performance of 95%. PLS‐DA and NN achieved prediction accuracies of 94% and 88%, respectively. High sensitivity and specificity values were achieved for five of the seven substrates investigated. Copyright © 2015 John Wiley & Sons, Ltd.

6.
Simultaneous determination of several elements (U, Ta, Mn, Zr and W) with inductively coupled plasma atomic emission spectrometry (ICP-AES) in the presence of spectral interference was performed using chemometrics methods. Artificial neural network (ANN) and partial least squares regression (PLS) were compared for simultaneous determination at different degrees of overlap. The emission spectra were recorded at the uranium analytical line (263.553 nm) with a 0.06 nm spectral window by ICP-AES. Principal component analysis was applied to the data, and the scores on 5 dominant principal components were subjected to ANN. A 5-5-5 (input, hidden and output neurons) network was used with a linear transfer function after both the hidden and output layers. The PLS model was trained with five latent variables and 20 samples in the calibration set. The relative errors of prediction (REP) in the test set were 3.75% and 3.56% for ANN and PLS, respectively.
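
The abstract reports REP values without giving a formula. One definition commonly used in multivariate calibration is shown below as an assumption; the paper's own definition may differ. Here $c_i$ is the reference concentration of test sample $i$, $\hat{c}_i$ the predicted concentration, $\bar{c}$ the mean reference concentration, and $n$ the number of test samples.

```latex
\mathrm{REP}(\%) = \frac{100}{\bar{c}}
  \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{c}_i - c_i\right)^{2}}
```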

7.
This paper introduces a technique to visualise the information content of the kernel matrix and a way to interpret the ingredients of the Support Vector Regression (SVR) model. Recently, the use of Support Vector Machines (SVM) for solving classification (SVC) and regression (SVR) problems has increased substantially in the field of chemistry and chemometrics. This is mainly due to its high generalisation performance and its ability to model non-linear relationships in a unique and global manner. Modeling of non-linear relationships is enabled by applying a kernel function. The kernel function transforms the input data, usually non-linearly related to the associated output property, into a high dimensional feature space where the non-linear relationship can be represented in a linear form. Usually, SVMs are applied as a black box technique. Hence, the model cannot be interpreted in the way that, e.g., Partial Least Squares (PLS) can. For example, the PLS scores and loadings make it possible to visualise and understand the driving force behind the optimal PLS machinery. In this study, we have investigated the possibilities to visualise and interpret the SVM model. Here, we have focused exclusively on Support Vector Regression to demonstrate these visualisation and interpretation techniques. Our observations show that we are now able to turn an SVR black box model into a transparent and interpretable regression modeling technique.
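
To make the notion of a kernel matrix concrete, here is a minimal sketch that builds one for a small data set. The abstract does not say which kernel the authors used; the Gaussian (RBF) kernel, the function name `rbf_kernel_matrix`, the `gamma` value, and the sample points are all assumptions for illustration.

```python
import math

def rbf_kernel_matrix(X, gamma):
    """Return the n x n matrix K with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared Euclidean distance between samples i and j.
            sq_dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K

# Hypothetical samples: three points in a two-dimensional input space.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = rbf_kernel_matrix(X, gamma=0.5)
```

Each entry of `K` is a similarity between two samples, so the matrix can be displayed as an image or heat map, which is one simple way to inspect what an SVR model sees.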

8.
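Application of an adaptive fuzzy partial least squares method to modeling drug structure-activity relationships   Total citations: 2, self-citations: 0, citations by others: 2
As a local approximation method, the adaptive neuro-fuzzy inference system (ANFIS) is well suited to modeling quantitative structure-activity relationships (QSAR) of drugs. The many parameters that describe drug molecular structure are often coupled, which makes modeling harder and degrades the predictive performance of the model. ANFIS is therefore combined with partial least squares (PLS): PLS first extracts components from the sample data, ANFIS then realizes the nonlinear mapping between each pair of components, and the extracted components are further corrected on the basis of the output error so that they best explain the dependent variable. The resulting method is EB-AFPLS. It has been applied successfully to QSAR modeling of HIV-1 protease inhibitors with good results, showing strong learning ability, and the predictive performance of the resulting model is also better than that of other methods.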

9.
10.
Different calibration techniques are available for spectroscopic applications that show nonlinear behavior. This study compares several nonlinear calibration techniques: kernel PLS (KPLS), support vector machines (SVM), least-squares SVM (LS-SVM), relevance vector machines (RVM), Gaussian process regression (GPR), artificial neural network (ANN), and Bayesian ANN (BANN). In this comparison, partial least squares (PLS) regression is used as a linear benchmark, while the relationship of the methods is considered in terms of traditional calibration by ridge regression (RR). The performance of the different methods is demonstrated by their practical applications using three real-life near infrared (NIR) data sets. Different aspects of the various approaches, including computational time, model interpretability, potential over-fitting when using the non-linear models on linear problems, robustness to small or medium sample sets, and robustness to pre-processing, are discussed. The results suggest that GPR and BANN are powerful and promising methods for handling linear as well as nonlinear systems, even when the data sets are moderately small. The LS-SVM is also attractive due to its good predictive performance for both linear and nonlinear calibrations.

11.
Iron, copper, zinc and selenium were determined directly in serum samples from healthy individuals (n=33) and cancer patients (n=27) by total reflection X-ray fluorescence spectrometry using the Compton peak as internal standard [L.M. Marcó P. et al., Spectrochim. Acta Part B 54 (1999) 1469–1480]. The standardized concentrations of these elements were used as input data for two-layer artificial neural networks trained with the generalized delta rule in order to classify such individuals according to their health status. Various artificial neural networks, comprising a linear function in the input layer, a hyperbolic tangent function in the hidden layer and a sigmoid function in the output layer, were evaluated for this purpose. Of the networks studied, the (4:4:1) network gave the highest estimation (98%) and prediction (94%) rates. The latter demonstrates the potential of the total reflection X-ray fluorescence spectrometry/artificial neural network approach in clinical chemistry.

12.
Near-infrared (NIR) spectroscopy, in combination with chemometrics, enables nondestructive analysis of solid samples without time-consuming sample preparation methods. A new method for the nondestructive determination of compound amoxicillin powder drug via NIR spectroscopy combined with an improved neural network model based on principal component analysis (PCA) and radial basis function (RBF) neural networks is investigated. The PCA technique is applied to extract relevant features from the large volume of spectral data in order to reduce the number of input variables of the RBF neural networks. Various optimum principal component analysis-radial basis function (PCA-RBF) network models based on conventional spectra and preprocessed spectra (standard normal variate (SNV) and multiplicative scatter correction (MSC)) have been established and compared. Principal component regression (PCR) and partial least squares (PLS) multivariate calibrations are also used and compared with the PCA-RBF neural networks. Experimental results show that the proposed PCA-RBF method is more efficient than PCR and PLS multivariate calibrations, and that the PCA-RBF approach with SNV-preprocessed spectra provides the best performance.
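
As an illustration of the SNV preprocessing mentioned above, the sketch below centers a single spectrum on its own mean and scales it by its own standard deviation. The function name `snv` and the absorbance values are hypothetical, and the use of the sample standard deviation (divisor n - 1) is an assumption, since the abstract does not state which variant was applied.

```python
import math

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation across the wavelengths of this one spectrum.
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]

# Hypothetical absorbance values at five wavelengths.
raw = [0.52, 0.61, 0.75, 0.70, 0.58]
corrected = snv(raw)

# A multiplicative and additive scatter effect on the same spectrum
# is removed by the transform.
scattered = [2.0 * v + 0.3 for v in raw]
corrected_scattered = snv(scattered)
```

Because SNV works on each spectrum independently, it needs no reference spectrum, unlike MSC.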

13.
In this study, different approaches to the multivariate calibration of the vapors of two refrigerants are reported. As the relationships between the time-resolved sensor signals and the concentrations of the analytes are nonlinear, the widely used partial least-squares regression (PLS) fails. Therefore, different methods are used that are known to be able to deal with nonlinearities present in data. First, the Box–Cox transformation, which transforms the dependent variables nonlinearly, was applied. The second approach, the implicit nonlinear PLS regression, tries to account for nonlinearities by adding squared terms of the independent variables to the original independent variables. The third approach, quadratic PLS (QPLS), uses a nonlinear quadratic inner relationship for the model instead of the linear relationship used in PLS. Tree algorithms are also used, which split a nonlinear problem into smaller subproblems that are modeled using linear methods or discrete values. Finally, neural networks are applied, which are able to model any relationship. Different special implementations, like genetic algorithms with neural networks and growing neural networks, are also used to prevent overfitting. Among the fast and simpler algorithms, QPLS shows good results. Different implementations of neural networks show excellent results. Among the different implementations, the most sophisticated and computing-intensive algorithms (growing neural networks) show the best results. Thus, the optimal method for the data set presented is a compromise between quality of calibration and complexity of the algorithm.
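
The first two approaches in the abstract are simple data transformations and can be sketched in a few lines. The function names `box_cox` and `add_squared_terms` are hypothetical, and the standard one-parameter Box–Cox form is assumed; the abstract does not state how the transformation parameter was chosen.

```python
import math

def box_cox(y, lam):
    """One-parameter Box-Cox transform of a positive response value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)           # limiting case as lam approaches 0
    return (y ** lam - 1.0) / lam

def add_squared_terms(X):
    """Append the square of every independent variable to each sample row."""
    return [row + [v * v for v in row] for row in X]

# Hypothetical data: one sample with two sensor signals.
augmented = add_squared_terms([[1.0, 2.0]])
transformed = box_cox(4.0, 0.5)
```

A linear PLS model fitted to the augmented matrix can then capture quadratic dependence on the original variables, which is the idea behind the implicit nonlinear PLS approach.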

14.
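Prediction of two-dimensional chromatographic column efficiency by an artificial neural network method   Total citations: 11, self-citations: 0, citations by others: 11
Using an artificial neural network based on a variable-step-size BP algorithm, a weight-connected topological model relating column efficiency to its influencing factors was established for a two-dimensional column chromatography system consisting of a high-efficiency micro-packed column and a capillary column. The model was used to predict the column efficiency of the two-dimensional column system under different operating conditions, with good results.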

15.
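A near-infrared spectroscopy fuzzy neuron classification method for rapid testing of drug quality   Total citations: 9, self-citations: 1, citations by others: 9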
刘雪松  程翼宇 《化学学报》2005,63(24):2216-2220
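To address the difficult problem of rapidly determining drug quality categories, which is nonlinear and has fuzzy class boundaries, near-infrared spectroscopy was combined with fuzzy neural networks to develop a near-infrared spectral fuzzy neural network classification method. The method computationally discriminates the near-infrared spectral pattern classes of drugs with complex chemical composition, such as traditional Chinese medicines, and thus rapidly assesses their quality. With Shenmai injection as a representative analyte, the method was evaluated on the pattern classification problem of identifying the manufacturer. Its classification accuracy reached 94.2%, clearly better than the classical BP neural network classification method (84.6%), and it is expected to be useful for rapid testing and evaluation of the quality categories of traditional Chinese medicine products.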

16.
Maleki N  Safavi A  Sedaghatpour F 《Talanta》2004,64(4):830-835
An artificial neural network (ANN) model is developed for simultaneous determination of Al(III) and Fe(III) in alloys by using chrome azurol S (CAS) as the chromogenic reagent and CCD camera as the detection system. All calibration, prediction and real samples data were obtained by taking a single image. Experimental conditions were established to reduce interferences and increase sensitivity and selectivity in the analysis of Al(III) and Fe(III). In this way, an artificial neural network consisting of three layers of nodes was trained by applying a back-propagation learning rule. Sigmoid transfer functions were used in the hidden and output layers to facilitate nonlinear calibration. Both Al(III) and Fe(III) can be determined in the concentration range of 0.25-4 μg ml⁻¹ with satisfactory accuracy and precision. The proposed method was also applied satisfactorily to the determination of considered metal ions in two synthetic alloys.

17.
We introduce a new nonlinear partial least squares algorithm ‘Quadratic Fuzzy PLS (QFPLS)’ that combines the outer linear Partial Least Squares (PLS) framework and the Takagi–Sugeno–Kang (TSK) fuzzy inference system. The inner relation between the input and the output PLS score vectors is modeled by a quadratic TSK fuzzy inference system. The performance of the proposed QFPLS method is tested and compared against four other well‐known partial least squares methods (Linear PLS (LPLS), Quadratic PLS (QPLS), Linear Fuzzy PLS (LFPLS), and Neural Network PLS (NNPLS)) on various different types of randomly generated test data. QFPLS outperformed competitors based on two comparison measures: the output variables cumulative per cent variance captured by the PLS latent variables and the root mean‐square error of prediction (RMSEP). Copyright © 2009 John Wiley & Sons, Ltd.

18.
In this work, a framework is provided for identifying intracranial electroencephalography (iEEG) seizures based on discrete wavelet transform (DWT) analysis of iEEG signals using forward propagation and feedback neural networks. Performance on five classification tasks, each built from a different combination of data sets, is studied using the probabilistic neural network (PNN), learning vector quantization neural network (LVQ) and Elman neural network (ENN). Different feature combinations serve as the input vectors of the classifiers to obtain the best outcomes. It has been found that PNN has a shorter running time and provides better classification accuracy (CA) than the ENN and LVQ classifiers for all five classification problems. Notably, the CA for the C-D classification task, which distinguishes the pre-ictal from the post-ictal state, has been greatly improved and reached 83.13%. Hence, pattern recognition of epilepsy iEEG signals based on DWT statistical features using the PNN classifier is more suitable for forming a reliable, automatic classification system to assist doctors in diagnosis.

19.
20.
罗明亮  李梦龙 《化学学报》2000,58(11):1409-1412
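In view of the nonlinear relationships characteristic of chemistry, a "hybrid" BP network is proposed on the basis of the conventional BP network. It contains two hidden layers and has direct connections from the input layer to the output layer. It accounts well for linear and nonlinear relationships that coexist in the data, and it performs better than multiple regression and the ordinary BP algorithm.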
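
The architecture described above can be sketched as a forward pass in which the output is the sum of a nonlinear path through two hidden layers and a linear path taken directly from the inputs. This is an illustration only: the sigmoid hidden units, the linear output unit, the function names, and all weights are assumptions, and training by back-propagation is not shown.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """Affine layer: weights[k][j] multiplies inputs[j] for output unit k."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def hybrid_forward(x, W1, b1, W2, b2, W3, b3, W_direct):
    """Forward pass with two hidden layers plus direct input-to-output links."""
    h1 = [sigmoid(z) for z in dense(x, W1, b1)]     # first hidden layer
    h2 = [sigmoid(z) for z in dense(h1, W2, b2)]    # second hidden layer
    nonlinear = dense(h2, W3, b3)                   # hidden-to-output path
    linear = dense(x, W_direct, [0.0] * len(W_direct))  # direct path
    return [a + b for a, b in zip(nonlinear, linear)]

# Hypothetical weights for a 2-2-2-1 network.
W1, b1 = [[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1]
W2, b2 = [[0.5, -0.5], [0.2, 0.7]], [0.0, 0.0]
W_direct = [[0.5, -1.0]]

# With the hidden-to-output weights set to zero, only the linear part remains.
linear_only = hybrid_forward([1.0, 2.0], W1, b1, W2, b2,
                             [[0.0, 0.0]], [0.0], W_direct)
full = hybrid_forward([1.0, 2.0], W1, b1, W2, b2,
                      [[1.0, -1.0]], [0.0], W_direct)
```

Separating the two paths in this way lets the direct weights carry the linear part of the relationship while the hidden layers model the remaining nonlinearity.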
