Similar literature
 20 similar documents found.
1.
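Constructing a support vector machine-partial least squares method for modeling drug structure-activity relationships   Total citations: 6, self-citations: 0, citations by others: 6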
李剑  陈德钊  成忠  叶子青 《分析化学》2006,34(2):263-266
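While sample data are still being accumulated for the study of drug structure-activity relationships, models must be built from small samples. Overfitting then occurs easily and degrades the predictive performance and stability of the model. To address this, the partial least squares (PLS) method can be used to extract optimal components pairwise from the sample data, eliminating multicollinearity among the independent variables and effectively reducing dimensionality. A least squares support vector machine (LSSVM) is then applied for nonlinear regression on each pair of components, and an error-correction-based strategy adjusts them so that they express the nonlinear relationship between the independent and dependent variables more effectively. The resulting algorithm, EB-LSSVM-PLS, yields models with high prediction accuracy and good stability. Applied to QSAR modeling of novel flavanone derivatives, it gave satisfactory results, and its generalization performance was better than that of other methods.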

2.
In the present study, boosting has been combined with partial least-squares discriminant analysis (PLS-DA) to develop a new pattern recognition method called boosting partial least-squares discriminant analysis (BPLS-DA). BPLS-DA is implemented by first constructing a series of PLS-DA models on various weighted versions of the original calibration set and then combining the predictions of these models by weighted majority vote to obtain the integrated result. Coupled with near infrared (NIR) spectroscopy, BPLS-DA has been applied to discriminate between tea varieties. For comparison, conventional principal component analysis, linear discriminant analysis (LDA), and PLS-DA have also been investigated. The experimental results show that inter-variety differences can be distinguished accurately and rapidly by NIR spectroscopy coupled with BPLS-DA. Moreover, the introduction of boosting drastically enhances the performance of an individual PLS-DA model, and BPLS-DA is a well-performing pattern recognition technique superior to LDA. Copyright © 2012 John Wiley & Sons, Ltd.
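The combining step described above, a weighted majority vote over the constructed PLS-DA models, can be sketched as follows. This is only an illustration: the function name `weighted_majority_vote`, the class labels, and the weights are hypothetical, and the abstract does not say how the model weights are derived.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the class label with the largest total weight.

    predictions: one predicted label per base model.
    weights: one non-negative weight per base model (hypothetical values;
             the abstract does not specify how they are computed).
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # On a tie, the tied label that was seen first wins.
    return max(totals, key=totals.get)


# Three hypothetical base models voting on one tea sample.
strong_single = weighted_majority_vote(["green", "black", "black"], [0.9, 0.3, 0.4])
weak_single = weighted_majority_vote(["green", "black", "black"], [0.5, 0.3, 0.4])
```

In the first call one heavily weighted model outvotes two lightly weighted ones, which is the difference between a weighted vote and a simple head count.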

3.
This study presents an analytical method for determining interfacial tension and relative density in insulating oils using near infrared (NIR) spectrometry. Five regression strategies were evaluated: partial least squares (PLS) with significant regression coefficients selected by the jack-knife algorithm; interval PLS (iPLS); and multiple linear regression (MLR) with variable selection by genetic algorithm (MLR/GA), by the successive projections algorithm (MLR/SPA), and by a stepwise strategy (SR/MLR). Overall, the results point to MLR/SPA as the best modeling strategy: it is simpler and uses fewer spectral variables.

4.
A new class-modeling method, referred to as partial least squares density modeling (PLS-DM), is presented. The method is based on partial least squares (PLS), using a distance-based sample density measurement as the response variable. Potential function probability density is subsequently calculated on the PLS scores and used, jointly with residual Q statistics, to develop efficient class models. The influence of adjustable model parameters on the resulting performance has been critically studied by means of cross-validation and application of the Pareto optimality criterion. The method has been applied to verify the authenticity of olives in brine from cultivar Taggiasca, based on near-infrared (NIR) spectra recorded on homogenized solid samples. Two independent test sets were used for model validation. The final optimal model showed high efficiency and a good balance between sensitivity and specificity when compared with the results of well-established class-modeling methods, such as soft independent modeling of class analogy (SIMCA) and unequal dispersed classes (UNEQ).

5.
6.
Standard regression is extended to the case of multiple regressor arrays via the Kronecker product. The method is illustrated using ordinary least squares regression (OLS) as well as the latent variable (LV) methods principal component regression (PCR) and partial least squares regression (PLS). Denoting the method applied to PLS as mrPLS, the latter was shown to explain as much or more variance for the first LV relative to the comparable L-partial least squares regression (L-PLS) model. The same relationship holds when mrPLS is compared to PLS or n-way partial least squares (N-PLS) and the response array is 2-way or 3-way, respectively, where the regressor array corresponding to the first mode of the response array is 2-way and the second mode regressor array is an identity matrix. In a comparison with N-PLS using fragrance data, mrPLS proved superior in a validation sense when model selection was used. Though the focus is on 2-way regressor arrays, the method can be applied to n-way regressors via N-PLS. Copyright © 2007 John Wiley & Sons, Ltd.
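One standard way a Kronecker product ties two regressor arrays to a single coefficient matrix is the identity vec(X1 B X2^T) = (X2 ⊗ X1) vec(B), where vec stacks columns. The sketch below checks that identity on small integer matrices. It assumes a bilinear model Y = X1 B X2^T, which the abstract does not state explicitly; the helper names and all values are illustrative only.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def kron(a, b):
    """Kronecker product a ⊗ b."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]


def vec(a):
    """Stack the columns of a into one flat list."""
    return [a[i][j] for j in range(len(a[0])) for i in range(len(a))]


# Hypothetical regressor arrays for the two modes and a coefficient matrix.
X1 = [[1, 2], [3, 4], [5, 6]]   # 3 x 2, first-mode regressors
X2 = [[1, 0], [2, 1]]           # 2 x 2, second-mode regressors
B = [[1, -1], [0, 2]]           # 2 x 2 coefficients

# Left side: response built from the assumed bilinear model.
Y = matmul(matmul(X1, B), transpose(X2))
lhs = vec(Y)

# Right side: ordinary linear model in vec(B) with a Kronecker design matrix.
design = kron(X2, X1)
rhs = [sum(d * b for d, b in zip(row, vec(B))) for row in design]
```

Because the right-hand side is an ordinary linear model in vec(B), any standard regression method can in principle be applied to the Kronecker design matrix.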

7.
武晓莉  李艳君  吴铁军 《分析化学》2006,34(8):1091-1095
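To improve the prediction accuracy of ultraviolet absorption spectral analysis of the water quality parameter total organic carbon (TOC), an iterative regression modeling algorithm based on boosting theory is proposed, together with a new iteration stopping criterion derived from statistical learning theory that effectively prevents overfitting and markedly improves prediction accuracy. To evaluate its performance, the proposed algorithm and three commonly used spectral analysis methods (partial least squares, principal component regression, and artificial neural networks) were used to model and predict a set of data measured with a UV spectral water quality analyzer developed in-house. The results show that, relative to the other three methods, the proposed algorithm has the clear advantage of producing models with higher prediction accuracy.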

8.
In the present study, a dry film-based Fourier transform infrared (FT-IR) spectroscopic technique, coupled with boosting support vector regression (BSVR), was employed for a blood glucose assay. Potassium thiocyanate (KSCN) was used in the dry-film method as an internal standard to compensate for any variation in film thickness. This technique circumvents interference from water absorption and requires only 5 µL of sample. Moving window partial least-squares regression (MWPLSR) was used for wavenumber interval selection before multivariate modeling. With the BSVR modeling technique, glucose in plasma could be determined over a concentration range of 0.4-20 mmol/L with satisfactory accuracy. The performance of BSVR was compared with that of conventional support vector regression (SVR) and partial least squares (PLS). The results demonstrated that BSVR is an effective multivariate calibration tool, providing better performance than conventional PLS and SVR.

9.
To date, few efforts have been made to take simultaneous advantage of the local nature of spectral data in both the time and frequency domains in a single regression model. We describe here a novel chemometrics algorithm that uses the wavelet transform. We call the algorithm dual-domain regression, as the regression step defines a weighted model in the time domain based on the contributions of parallel frequency-domain models made from wavelet coefficients reflecting different scales. In principle, any regression method can be used; implementations of the algorithm using partial least squares regression and principal component regression are reported here. The performance of the models produced by the algorithm is generally superior to that of regular partial least squares (PLS) or principal component regression (PCR) models applied to data restricted to a single domain. Dual-domain PLS and PCR algorithms are applied to near infrared (NIR) spectral datasets of Cargill corn samples and to sets of spectra collected on batch chemical reactions run in different reactors to illustrate the improved robustness of the modeling.

10.
In this work we evaluated the use of different variable selection techniques combined with partial least-squares regression (PLS), namely genetic algorithm PLS (GA-PLS), interval PLS (iPLS), and synergy interval PLS (siPLS), in the simultaneous determination of Cd(II), Cu(II), Pb(II) and Zn(II) by anodic stripping voltammetry at a bismuth film. Generally, variable selection improved the prediction results compared to full-voltammogram PLS. Interval-selection-based algorithms proved more adequate than the selection of discrete variables by GA. Excellent analytical performance was obtained despite the inherent complexity of the simultaneous determination.

11.
By employing the simple but effective principle of 'survival of the fittest' on which Darwin's theory of evolution is based, a novel strategy for selecting an optimal combination of key wavelengths of multi-component spectral data, named competitive adaptive reweighted sampling (CARS), is developed. Key wavelengths are defined as the wavelengths with large absolute coefficients in a multivariate linear regression model, such as partial least squares (PLS). In the present work, the absolute values of the regression coefficients of the PLS model are used as an index for evaluating the importance of each wavelength. Then, based on the importance level of each wavelength, CARS sequentially selects N subsets of wavelengths from N Monte Carlo (MC) sampling runs in an iterative and competitive manner. In each sampling run, a fixed ratio (e.g. 80%) of samples is first randomly selected to establish a calibration model. Next, based on the regression coefficients, a two-step procedure is adopted to select the key wavelengths: enforced wavelength selection based on an exponentially decreasing function (EDF), followed by competitive wavelength selection based on adaptive reweighted sampling (ARS). Finally, cross validation (CV) is applied to choose the subset with the lowest root mean square error of CV (RMSECV). The performance of the proposed procedure is evaluated using one simulated dataset together with one near infrared dataset of two properties. The results reveal an outstanding characteristic of CARS: it can usually locate an optimal combination of key wavelengths that are interpretable in terms of the chemical property of interest. Additionally, our study shows that CARS gives better prediction than full-spectrum PLS modeling, Monte Carlo uninformative variable elimination (MC-UVE) and moving window partial least squares regression (MWPLSR).
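The EDF-based enforced selection step can be illustrated with a short sketch. The abstract does not give the form of the EDF; the code assumes a commonly used formulation in which run i keeps a fraction r_i = a·exp(-k·i) of the p wavelengths, with a and k chosen so that all p wavelengths are kept in run 1 and only 2 remain in run N. The function names and the example coefficients are hypothetical, and the ARS step and the PLS fit are omitted.

```python
import math


def edf_retained_counts(n_wavelengths, n_runs):
    """Number of wavelengths kept in each of n_runs sampling runs.

    Assumed EDF: r_i = a * exp(-k * i), with r_1 = 1 and r_N = 2 / p.
    """
    if n_wavelengths < 2 or n_runs < 2:
        raise ValueError("need at least 2 wavelengths and 2 runs")
    k = math.log(n_wavelengths / 2) / (n_runs - 1)
    a = (n_wavelengths / 2) ** (1 / (n_runs - 1))
    counts = []
    for i in range(1, n_runs + 1):
        ratio = a * math.exp(-k * i)
        counts.append(max(2, round(n_wavelengths * ratio)))
    return counts


def enforce_selection(coefficients, n_keep):
    """Indices of the n_keep coefficients with the largest absolute value."""
    ranked = sorted(range(len(coefficients)),
                    key=lambda j: abs(coefficients[j]), reverse=True)
    return sorted(ranked[:n_keep])


counts = edf_retained_counts(n_wavelengths=700, n_runs=50)
kept = enforce_selection([0.1, -0.9, 0.4, 0.05], n_keep=2)
```

The exponential schedule removes many wavelengths in the early runs and few in the later ones, so the search moves from coarse elimination to fine tuning.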

12.
This work describes a home-made microelectrode array, based on reticulated vitreous carbon, which was used to record normal pulse voltammograms in order to obtain the growth curves of Escherichia coli ATCC 13706 and Pseudomonas aeruginosa ATCC 27853, chosen as test microorganisms. The electrochemical signal data were analysed with partial least squares (PLS) regression in order to highlight the useful analytical information and correlate it with the data obtained from aerobic plate counting. The PLS models obtained generally had a low root mean square error of cross-validation (RMSECV) and a cross-validated explained variance near 90%.

13.
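Application of the adaptive fuzzy partial least squares method to modeling drug structure-activity relationships   Total citations: 2, self-citations: 0, citations by others: 2
As a local approximation method, the adaptive neuro-fuzzy inference system (ANFIS) is well suited to quantitative structure-activity relationship (QSAR) modeling of drugs. The parameters describing drug molecular structure are numerous and often coupled, which makes modeling harder and degrades the predictive performance of the model. ANFIS is therefore combined with partial least squares (PLS): PLS first extracts components from the sample data, ANFIS then realizes the nonlinear mapping between each pair of components, and the extracted components are further corrected on the basis of the output error so that they best explain the dependent variable. The resulting method is termed EB-AFPLS. It has been applied successfully to QSAR modeling of HIV-1 protease inhibitors, with good results and strong learning ability, and the predictive performance of the resulting models was better than that of other methods.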

14.
Differential pulse voltammetry has been used for the simultaneous determination of cysteine, tyrosine and tryptophan at an unmodified glassy carbon electrode. In the analysis of these analytes in the same sample, the main difficulty is the high degree of overlap of the voltammograms. The relationships between the currents and the concentrations are complex and highly nonlinear. The predictive abilities of principal component regression (PCR), partial least squares regression (PLS), genetic algorithm-partial least squares regression (GA-PLS) and principal component-artificial neural networks (PC-ANNs) were examined for the simultaneous determination of the three amino acids. For a regression model, everything that does not help in constructing the model may be considered noise without further specification. PC-ANN and GA-PLS use only the significant data and show superiority over the other multivariate methods applied. The proposed method was also applied satisfactorily to the determination of the analytes in some synthetic samples.

15.
A novel near infrared (NIR) modeling method, Laplacian regularized least squares regression (LapRLSR), is presented. It can take advantage of many unlabeled spectra to improve the prediction performance of the model even if there are only a few calibration samples. Using LapRLSR modeling, NIR spectral analysis was applied to online monitoring of the concentration of salvia acid B in the column separation of Salvianolate. The results demonstrated that LapRLSR significantly outperformed partial least squares (PLS) and that NIR online analysis was applicable.

16.
Partial least squares method applied to pharmaceutical analysis   Total citations: 19, self-citations: 4, citations by others: 19
谢玉珑  梁逸曾 《分析化学》1989,17(7):588-592

17.
《Analytical letters》2012,45(10):2081-2089
The polarographic waves of lead(II) and tin(II) overlap because of their similar reduction potentials, and it is difficult to determine these two components simultaneously without pre-separation. In this paper, differential pulse polarography (DPP) combined with multivariate calibration approaches, namely classical least squares (CLS), principal component regression (PCR) and partial least squares (PLS), was successfully applied to the resolution of the overlapping polarographic waves of these two components in the concentration range of 0.05-3.50 mg L⁻¹. Satisfactory quantitative results were obtained.
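The idea behind resolving two overlapping waves with classical least squares can be sketched for the two-component case. The sketch assumes an additive model in which the measured signal is a linear combination of two known unit-concentration profiles; the profile values, the function name `cls_two_components`, and the concentrations are made up for illustration and are not taken from the paper.

```python
def cls_two_components(signal, profile_a, profile_b):
    """Least-squares estimates (c_a, c_b) for
    signal ≈ c_a * profile_a + c_b * profile_b,
    solved through the 2 x 2 normal equations.
    """
    saa = sum(a * a for a in profile_a)
    sbb = sum(b * b for b in profile_b)
    sab = sum(a * b for a, b in zip(profile_a, profile_b))
    say = sum(a * y for a, y in zip(profile_a, signal))
    sby = sum(b * y for b, y in zip(profile_b, signal))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("profiles are collinear; the waves cannot be resolved")
    c_a = (say * sbb - sby * sab) / det
    c_b = (sby * saa - say * sab) / det
    return c_a, c_b


# Hypothetical unit-concentration current profiles at five potentials.
lead_profile = [0.1, 0.6, 1.0, 0.5, 0.1]
tin_profile = [0.0, 0.3, 0.8, 1.0, 0.4]

# Synthetic overlapped wave: 2.0 units of lead plus 0.5 units of tin.
mixture = [2.0 * p + 0.5 * s for p, s in zip(lead_profile, tin_profile)]
lead_conc, tin_conc = cls_two_components(mixture, lead_profile, tin_profile)
```

The more the two profiles overlap, the closer the determinant gets to zero and the more the estimates are affected by noise. This is one reason latent-variable methods such as PCR and PLS are often tried alongside CLS.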

18.
19.
In the present study, quantitative structure-activity relationship (QSAR) modeling has been carried out for the lipid peroxidation (LPO)-inhibition potential of a set of 27 flavonoids, using structural and topological parameters. Three methods were used to develop the models: (1) stepwise regression, (2) factor analysis followed by multiple linear regression (FA-MLR) and (3) partial least squares (PLS) analysis. The best equation, judged by the leave-one-out prediction statistics, was obtained from stepwise regression analysis (Q2 = 0.626).
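The leave-one-out statistic Q2 is conventionally computed as 1 - PRESS/SS, where PRESS sums the squared errors of predictions made for each compound by a model fitted without it, and SS is the total sum of squares of the observed activities about their mean. The sketch below assumes that definition and uses a one-descriptor least-squares line as a stand-in for the paper's models; the data are invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out Q2 = 1 - PRESS / SS_total (assumed definition)."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Invented descriptor values and activities for five compounds.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
q2 = loo_q2(descriptor, activity)
```

Unlike the fitted R2, Q2 can be negative when the model predicts held-out compounds worse than the mean activity would.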

20.
Microcrystalline naphthalene extraction has been used for the preconcentration of p-benzoquinone and tetrachloro-p-benzoquinone (chloranil), after their reaction with aniline, followed by simultaneous spectrophotometric analysis with genetic algorithm-partial least squares (GA-PLS) calibration. The chemical variables affecting the analytical performance of the methodology were studied and optimized. Under the optimum conditions, i.e. [aniline] = 0.05 M and [naphthalene] = 2.2% (w/v), preconcentration of 25 mL of sample solution permitted the detection of 0.32 and 0.23 µg mL⁻¹ of p-benzoquinone and chloranil, respectively. The predictive abilities of partial least squares regression (PLS) and GA-PLS were examined for the simultaneous determination of the two quinones. GA-PLS was superior to the other PLS methods because wavelength selection by the genetic algorithm in PLS calibration, achieved without loss of prediction capacity, provides useful information about the chemical system.
