Similar literature
10 similar documents found.
1.
Regression from high-dimensional observation vectors is particularly difficult when training data are limited. Partial least squares (PLS) partly solves the high-dimensional regression problem by projecting the data onto a latent variable space. The key issue in PLS is the computation of the weight vector, which describes the covariance between the responses and the observations. For small-sample-size, high-dimensional regression problems, the covariance estimate is usually inaccurate, and correlated components in the predictors distort the PLS weight. In this paper, we propose a sparse matrix transform (SMT) based PLS (SMT-PLS) method for high-dimensional spectroscopy regression. In SMT-PLS, the observation data are first decorrelated by SMT. Then, in the decorrelated data space, the PLS loading weight is computed by least squares regression. The SMT technique provides an accurate estimate of the data covariance, which can overcome the effect of small sample size and benefits both the PLS weight computation and the subsequent regression prediction. The proposed SMT-PLS method is compared, in terms of root mean square error of prediction, to PLS, Power PLS and PLS with orthogonal scatter correction on four real spectroscopic data sets. Experimental results demonstrate the effectiveness of the proposed method.

2.
Extension of standard regression to the case of multiple regressor arrays is given via the Kronecker product. The method is illustrated using ordinary least squares regression (OLS) as well as the latent variable (LV) methods principal component regression (PCR) and partial least squares regression (PLS). Denoting the method applied to PLS as mrPLS, the latter was shown to explain as much or more variance for the first LV relative to the comparable L‐partial least squares regression (L‐PLS) model. The same relationship holds when mrPLS is compared to PLS or n‐way partial least squares (N‐PLS) and the response array is 2‐way or 3‐way, respectively, where the regressor array corresponding to the first mode of the response array is 2‐way and the second mode regressor array is an identity matrix. In a comparison with N‐PLS using fragrance data, mrPLS proved superior in a validation sense when model selection was used. Though the focus is on 2‐way regressor arrays, the method can be applied to n‐way regressors via N‐PLS. Copyright © 2007 John Wiley & Sons, Ltd.
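
One way to read "multiple regressor arrays via the Kronecker product" is as a bilinear model with one regressor array per mode of the response. The notation below ($X_1$, $X_2$, $B$, $E$) is assumed here for illustration and is not taken from the paper; the block only writes out the standard vec/Kronecker identity for the OLS case.

```latex
% Assumed bilinear model: Y is n1 x n2, X_1 is n1 x p1, X_2 is n2 x p2, B is p1 x p2.
\[
  Y = X_1 B X_2^{\top} + E
  \quad\Longleftrightarrow\quad
  \operatorname{vec}(Y) = (X_2 \otimes X_1)\,\operatorname{vec}(B) + \operatorname{vec}(E)
\]
% OLS estimate, assuming X_1 and X_2 both have full column rank:
\[
  \operatorname{vec}(\hat{B})
  = \bigl[(X_2^{\top}X_2)^{-1}X_2^{\top} \otimes (X_1^{\top}X_1)^{-1}X_1^{\top}\bigr]\operatorname{vec}(Y)
  \quad\Longleftrightarrow\quad
  \hat{B} = (X_1^{\top}X_1)^{-1}X_1^{\top}\, Y\, X_2\,(X_2^{\top}X_2)^{-1}
\]
```

With $X_2 = I$ the estimate reduces to $\hat{B} = (X_1^{\top}X_1)^{-1}X_1^{\top}Y$, the ordinary single-regressor case, which is consistent with the identity-matrix special case mentioned in the abstract.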

3.
This study compares the performance of partial least squares (PLS) regression analysis and artificial neural networks (ANN) for the prediction of total anthocyanin concentration in red-grape homogenates from their visible-near-infrared (Vis-NIR) spectra. The PLS prediction of anthocyanin concentrations for new-season samples from Vis-NIR spectra was characterised by regression non-linearity and prediction bias. In practice, this usually requires the inclusion of some samples from the new vintage to improve the prediction. The use of WinISI LOCAL partly alleviated these problems but still resulted in increased error at the high and low extremes of the anthocyanin concentration range. ANN regression was investigated as an alternative to PLS because of its inherent advantages for modelling non-linear systems. The method proposed here combines the data reduction capabilities of PLS regression with the non-linear modelling capabilities of ANN. With PLS scores as inputs for ANN regression, the model was quicker and easier to train than with raw full-spectrum data. The ANN calibration for prediction of new-vintage grape data, using PLS scores as inputs, was more linear and accurate than global and LOCAL PLS models and appears to reduce the need for refreshing the calibration with new-season samples. ANN with PLS scores required fewer inputs and was less prone to overfitting than ANN with PCA scores. A variation of the ANN method, using carefully selected spectral frequencies as inputs, gave prediction accuracy comparable to that obtained with PLS scores but, as with PCA inputs, was also prone to overfitting with redundant wavelengths.

4.
Partial least-squares regression: a tutorial   Cited by: 5 (self-citations: 0, citations by others: 5)
A tutorial on the partial least-squares (PLS) regression method is provided. Weak points in some other regression methods are outlined and PLS is developed as a remedy for those weaknesses. An algorithm for a predictive PLS and some practical hints for its use are given.
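
For orientation, a predictive PLS algorithm for a single response (often called PLS1) can be sketched as below. This is a minimal illustration of the commonly described NIPALS-style procedure, not the tutorial's own listing; the function names `pls1_fit` and `pls1_predict` are hypothetical, and only the Python standard library is assumed.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by sequential deflation.

    X is a list of rows, y a list of responses. Returns the column means
    of X, the mean of y, and one (weight, loading, q) triple per component.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [yi - y_mean for yi in y]                              # centered y
    components = []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance between X and y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [wj / norm for wj in w]
        t = [dot(row, w) for row in E]  # scores
        tt = dot(t, t)
        load = [dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q = dot(f, t) / tt              # regression of y on the scores
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one new sample x."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]  # same deflation as in fit
    return y_hat
```

Prediction repeats the training deflation on the new sample, which avoids forming an explicit coefficient vector. When the number of components equals the rank of the centered X, the fit coincides with ordinary least squares.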

5.
Analytical Letters, 2012, 45(9): 2073-2083
Abstract

A consensus regression approach based on partial least squares (PLS) regression, named cPLS, was investigated for calibrating NIR data. In this approach, multiple independent PLS models were developed and integrated into a single consensus model. The utility and merits of the cPLS method were demonstrated by comparing its results with those of regular PLS in predicting the moisture, oil, protein, and starch contents of corn samples from NIR spectral data. cPLS was found to be superior to regular PLS with respect to prediction accuracy and robustness.
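
The abstract does not say how the member models are built or combined, so the following is only a sketch of one common consensus scheme: train each member on a random subset of the calibration samples and average the member predictions. The base learner here is a univariate least-squares line used purely as a stand-in for PLS, and all function names and defaults are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Stand-in base learner (NOT PLS): univariate least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def consensus_fit(xs, ys, n_models=20, subset_size=None, seed=0):
    """Train n_models members, each on a random subset drawn without replacement."""
    rng = random.Random(seed)
    n = len(xs)
    k = subset_size or max(2, (2 * n) // 3)  # assumed default: two thirds of the data
    models = []
    for _ in range(n_models):
        idx = rng.sample(range(n), k)
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def consensus_predict(models, x):
    """Consensus prediction: unweighted mean of the member predictions."""
    preds = [slope * x + intercept for slope, intercept in models]
    return sum(preds) / len(preds)
```

Swapping `fit_line` for a PLS fit would leave the subset-and-average logic unchanged; unweighted averaging is the simplest combination rule, and weighted schemes are also possible.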

6.
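Shao Xueguang, Chen Da, Xu Heng, Liu Zhichao, Cai Wensheng. Chinese Journal of Chemistry (《中国化学》), 2009, 27(7): 1328-1332
Partial least squares (PLS) plays an important role in quantitative near-infrared (NIR) spectral analysis, but its predictions are easily affected by factors such as sample partitioning and outliers, so its robustness is limited. Ensemble PLS (EPLS) improves model robustness but cannot identify outliers among the samples. To improve both prediction accuracy and robustness, this paper proposes an ensemble PLS method that resamples according to sampling probabilities, called robust consensus PLS (RE-PLS). The method uses iteratively reweighted PLS (IRPLS) to compute the regression residuals of the samples and derives from them a sampling probability for each calibration sample. Training subsets are then selected according to these probabilities to build multiple PLS models, and the predictions of all PLS models are averaged to give the final prediction. The method was applied to NIR modeling of two different kinds of plant samples and compared with conventional PLS and EPLS. The results show that it effectively avoids the influence of outliers in the calibration set on the model while improving prediction accuracy and robustness. For real tobacco samples with complex NIR spectra and many outliers, modeling with plain PLS or EPLS does not predict well, whereas RE-PLS, with its particular advantages, is expected to find wide application in the quantitative analysis of such complex spectra.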
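
The resampling step described above can be sketched as follows. The abstract does not give the formula that maps residuals to sampling probabilities, so the weight function below (a Cauchy-type weight scaled by the median absolute residual) is an assumption chosen only to illustrate the idea that large residuals should be sampled less often. The function names are hypothetical, and the residuals are taken as given rather than computed by IRPLS.

```python
import random


def sampling_probabilities(residuals):
    """Map regression residuals to sampling probabilities that sum to 1.

    Assumed weight: 1 / (1 + (|r| / s)^2), with s the median absolute
    residual, so samples with large residuals receive small probabilities.
    """
    abs_r = sorted(abs(r) for r in residuals)
    n = len(abs_r)
    mid = n // 2
    median = abs_r[mid] if n % 2 else 0.5 * (abs_r[mid - 1] + abs_r[mid])
    scale = median if median > 0 else 1.0  # guard against a zero median
    weights = [1.0 / (1.0 + (abs(r) / scale) ** 2) for r in residuals]
    total = sum(weights)
    return [w / total for w in weights]


def draw_subsets(probs, n_subsets, subset_size, seed=0):
    """Draw index subsets (with replacement) according to the probabilities."""
    rng = random.Random(seed)
    indices = range(len(probs))
    return [rng.choices(indices, weights=probs, k=subset_size)
            for _ in range(n_subsets)]
```

Each drawn subset would then train one PLS member model, with the member predictions averaged as the final result. Sampling with replacement is a simplification made here; the paper may draw its subsets differently.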

7.
8.
This study presents an analytical method for determining interfacial tension and relative density in insulating oils using near-infrared (NIR) spectrometry. Five regression strategies were evaluated: partial least squares (PLS) with significant regression coefficients selected by the jack-knife algorithm; interval PLS (iPLS); and multiple linear regression (MLR) with variable selection by genetic algorithm (MLR/GA), successive projections algorithm (MLR/SPA) and stepwise strategy (SR/MLR). The overall results point to MLR/SPA as the best modeling strategy: it is simpler and uses fewer spectral variables.

9.
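Partial least squares applied to pharmaceutical analysis   Cited by: 19 (self-citations: 4, citations by others: 19)
Xie Yulong, Liang Yizeng. Chinese Journal of Analytical Chemistry (《分析化学》), 1989, 17(7): 588-592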

10.
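Constructing a support vector machine-partial least squares method for modeling drug structure-activity relationships   Cited by: 6 (self-citations: 0, citations by others: 6)
Li Jian, Chen Dezhao, Cheng Zhong, Ye Ziqing. Chinese Journal of Analytical Chemistry (《分析化学》), 2006, 34(2): 263-266
While sample data are still being accumulated for the study of drug structure-activity relationships, models have to be built from small samples. This easily leads to overfitting, which degrades the predictive performance and stability of the model. To address this, the partial least squares (PLS) method can be used to extract optimal components in pairs from the sample data, eliminating multicollinearity among the independent variables and effectively reducing the dimensionality. A least squares support vector machine is then applied to perform nonlinear regression on the paired components, with an adjustment strategy based on error correction, so that the nonlinear relationship between the independent and dependent variables is expressed more effectively. The resulting EB-LSSVM-PLS algorithm yields models with high prediction accuracy and good stability. Applied to QSAR modeling of novel flavanone derivatives, it gave satisfactory results, and its generalization performance was better than that of other methods.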
