1.

Robust partial least squares regression: Part I, algorithmic developments

Uwe Kruger Yan Zhou Xun Wang David Rooney Jillian Thompson 《Journal of Chemometrics》2008,22(1):1-13

The work summarized in this paper presents the first part of a three‐paper series on robust partial least squares (RPLS) regression. Motivated by recent research activities in this area, this part provides a detailed algorithmic analysis of associated techniques, showing that existing work (i) may not represent a true robust formulation of partial least squares (PLS), (ii) may lead to convergence problems or (iii) may be insensitive to a certain type of outlier. On the basis of this analysis, Part I introduces a new conceptual RPLS algorithm that overcomes the deficiencies of existing work. The second part of this work details this new RPLS technique, compares its performance with existing RPLS methods and provides an analysis of the computational efficiency and sensitivity of these algorithms. Whilst the first two parts of this work discuss algorithmic developments of RPLS, the final part concentrates on practical issues of RPLS implementations. This third part is devoted to practitioners of chemistry and chemical engineering, covering a wide range of applications involving a calibration experiment, the analysis of recorded data from an industrial debutanizer process and data from a number of Raman spectroscopy experiments. Copyright © 2007 John Wiley & Sons, Ltd.

2.

Beyond linear least squares regression

Ildiko E. Frank 《Trends in analytical chemistry : TRAC》1987,6(10):271-275

Regression is a collection of statistical methods that are used to study relationships among predictor and response variables. In addition to the most popular linear model, solved by least squares, several other techniques have found an application in analytical chemistry. Biased methods such as stepwise regression, ridge regression, principal components regression, and partial least squares regression are especially useful in cases of poorly or underdetermined systems with collinearity. When structural and/or distributional assumptions associated with linear least squares are violated, nonlinear regression, robust regression or generalized least squares estimators may offer potential remedies.

3.

Distance algorithm based procedure for non‐negative least squares

Róbert Rajkó Yu Zheng 《Journal of Chemometrics》2014,28(9):691-695

In chemistry and many other scientific disciplines, non‐negativity‐constrained estimation of models is of practical importance. The time required for estimating true least squares non‐negativity‐constrained models is typically many times longer than that for estimating unconstrained models, which is why faster non‐negative least squares (NNLS) algorithms are needed. Very recently, the distance algorithm has been developed, and this algorithm can be adapted to solve the NNLS regression task faster (in some cases) than the conventional algorithms. In simulation studies, DA_NNLS was the fastest for small‐sized and medium‐sized linear regression tasks. The visualization (geometry) of the NNLS task being solved by the new algorithm is discussed as well. Besides linear algebra, the authors suggest investigating, using, and developing convex geometrical concepts and tools in chemometrics to exploit the geometry of chemometry. Copyright © 2014 John Wiley & Sons, Ltd.
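To make the NNLS task described in this abstract concrete, here is a minimal pure-Python sketch. It is not the distance algorithm (DA_NNLS) of the paper, whose details the abstract does not give. It swaps in plain cyclic coordinate descent with clipping at zero, and the function name and toy data are illustrative assumptions.

```python
def nnls_coordinate_descent(A, b, sweeps=200):
    """Minimise ||A x - b||^2 subject to x >= 0 by cyclic coordinate descent.

    Illustrative stand-in only; NOT the distance algorithm from the paper.
    A is a list of rows, b a list of observations.
    """
    m, n = len(A), len(A[0])
    # Normal-equation quantities: H = A^T A and f = A^T b.
    H = [[sum(A[i][j] * A[i][k] for i in range(m)) for k in range(n)]
         for j in range(n)]
    f = [sum(A[i][j] * b[i] for i in range(m)) for j in range(n)]
    x = [0.0] * n
    for _ in range(sweeps):
        for j in range(n):
            if H[j][j] == 0.0:
                continue  # all-zero column: leave the coefficient at zero
            grad_j = sum(H[j][k] * x[k] for k in range(n)) - f[j]
            # Exact one-dimensional minimiser, clipped to the feasible region.
            x[j] = max(0.0, x[j] - grad_j / H[j][j])
    return x


# Toy problem (hypothetical data): the unconstrained least squares solution
# has a negative second coefficient, so the constraint is active.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, -1.0, 0.5]
x_nnls = nnls_coordinate_descent(A, b)
```

For this toy problem the second coefficient is pinned at zero and the first is refitted on its own, which is the behaviour that separates NNLS from an unconstrained fit followed by truncation.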

4.

Robustified least squares support vector classification

Michiel Debruyne Sven Serneels Tim Verdonck 《Journal of Chemometrics》2009,23(9):479-486

Support vector machine (SVM) algorithms are a popular class of techniques to perform classification. However, outliers in the data can result in bad global misclassification percentages. In this paper, we propose a method to identify such outliers in the SVM framework. A specific robust classification algorithm is proposed that adjusts the least squares SVM (LS‐SVM). This yields better classification performance for heavy‐tailed data and data containing outliers. Copyright © 2009 John Wiley & Sons, Ltd.

5.

A heuristic and parallel simulated annealing algorithm for variable selection in near‐infrared spectroscopy analysis

Jiyong Shi Xuetao Hu Xiaobo Zou Jiewen Zhao Wen Zhang Xiaowei Huang Yaodi Zhu Zhihua Li Yiwei Xu 《Journal of Chemometrics》2016,30(8):442-450

A new heuristic and parallel simulated annealing algorithm was proposed for variable selection in near‐infrared spectroscopy analysis. The algorithm employs a parallel mechanism to enhance the search efficiency, a heuristic mechanism to generate high‐quality candidate solutions, and the Metropolis criterion to assess the accuracy of the candidate solutions. Several near‐infrared datasets were evaluated with the proposed algorithm, with partial least squares leading to improved analytical figures of merit upon wavelength selection. More robust and more predictive regression models were obtained with the new algorithm. The method could also be helpful in other chemometric activities such as classification or quantitative structure‐activity relationship problems.

6.

Robustness properties of a robust partial least squares regression method

K. Vanden Branden 《Analytica chimica acta》2004,515(1):229-241

The presence of multicollinearity in regression data is no exception in real life examples. Instead of applying ordinary regression methods, biased regression techniques such as principal component regression and ridge regression have been developed to cope with such datasets. In this paper, we consider partial least squares (PLS) regression by means of the SIMPLS algorithm. Because the SIMPLS algorithm is based on the empirical variance-covariance matrix of the data and on least squares regression, outliers have a damaging effect on the estimates. To reduce this pernicious effect of outliers, we propose to replace the empirical variance-covariance matrix in SIMPLS by a robust covariance estimator. We derive the influence function of the resulting PLS weight vectors and the regression estimates, and conclude that they will be bounded if the robust covariance estimator has a bounded influence function. Also the breakdown value is inherited from the robust estimator. We illustrate the results using the MCD estimator and the reweighted MCD estimator (RMCD) for low-dimensional datasets. Also some empirical properties are provided for a high-dimensional dataset.

7.

Linear calibrations in chromatography: The incorrect use of ordinary least squares for determinations at low levels, and the need to redefine the limit of quantification with this regression model

Juan M. Sanchez 《Journal of separation science》2020,43(13):2708-2717

Ordinary least squares is widely applied as the standard regression method for analytical calibrations, and it is usually accepted that this regression method can be used for quantification starting at the limit of quantification. However, it requires the calibration to be homoscedastic, and this is not common. Different calibrations have been evaluated to assess whether ordinary least squares is adequate to quantify estimates at low levels. All calibrations evaluated were linear and heteroscedastic. Although acceptable values for precision at limit of quantification levels were obtained, ordinary least squares fitting resulted in significant and unacceptable bias at low levels. When weighted least squares regression was applied, the bias at low levels was eliminated and accurate estimates were obtained. With heteroscedastic calibrations, limit values determined by conventional methods are only appropriate if weighted least squares is used. A “practical limit of quantification” can be determined with ordinary least squares in heteroscedastic calibrations, which should be fixed at a minimum of 20 times the value calculated with conventional methods. Biases obtained above this “practical limit” were acceptable when applying ordinary least squares, and no significant differences were obtained between the estimates measured using weighted and ordinary least squares when analyzing real‐world samples.
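The low-level bias described in this abstract can be illustrated with a small pure-Python sketch. The calibration data below are invented for illustration (signal = 2 × concentration with a 1% error of alternating sign), and the 1/x² weighting is a common choice assumed here, not one stated in the abstract. The helper names are hypothetical.

```python
def fit_line(x, y, w=None):
    """Straight-line fit y = a + b*x by (weighted) least squares.

    With w=None every point has weight 1, i.e. ordinary least squares.
    Returns (intercept, slope).
    """
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ym - slope * xm, slope


def back_calculate(signal, intercept, slope):
    """Invert the calibration line to estimate a concentration."""
    return (signal - intercept) / slope


# Hypothetical heteroscedastic calibration: the absolute error grows with
# concentration (constant 1% relative error).
conc = [1.0, 2.0, 5.0, 10.0, 50.0, 100.0]
signal = [2.02, 3.96, 10.1, 19.8, 101.0, 198.0]

ols_a, ols_b = fit_line(conc, signal)
wls_a, wls_b = fit_line(conc, signal, [1.0 / c ** 2 for c in conc])  # assumed 1/x^2 weights

# Quantify a sample at the lowest level (true concentration 1.0, signal 2.0).
ols_low = back_calculate(2.0, ols_a, ols_b)
wls_low = back_calculate(2.0, wls_a, wls_b)
```

In the unweighted fit the two highest standards dominate the intercept, so the estimate at the lowest level is noticeably biased. The weighted fit gives the low standards a say in proportion to their precision and recovers the low-level concentration much more closely.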

8.

The least squares method for nonlinear fitting in analytical chemistry and its comparison with the genetic algorithm

张小吐 《分析化学 (Chinese Journal of Analytical Chemistry)》1996,24(8):947-950

9.

Unsupervised forward selection: a method for eliminating redundant variables

Whitley DC Ford MG Livingstone DJ 《Journal of chemical information and computer sciences》2000,40(5):1160-1168

10.

A first‐order system least‐squares finite element method for the Poisson‐Boltzmann equation

Stephen D. Bond Jehanzeb Hameed Chaudhry Eric C. Cyr Luke N. Olson 《Journal of computational chemistry》2010,31(8):1625-1635

The Poisson‐Boltzmann equation is an important tool in modeling solvent in biomolecular systems. In this article, we focus on numerical approximations to the electrostatic potential expressed in the regularized linear Poisson‐Boltzmann equation. We expose the flux directly through a first‐order system form of the equation. Using this formulation, we propose a system that yields a tractable least‐squares finite element formulation and establish theory to support this approach. The least‐squares finite element approximation naturally provides an a posteriori error estimator and we present numerical evidence in support of the method. The computational results highlight optimality in the case of adaptive mesh refinement for a variety of molecular configurations. In particular, we show promising performance for the Born ion, Fasciculin 1, methanol, and a dipole, which highlights robustness of our approach. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010

11.

Investigating the relationship between analytical precision and concentration by robust linear regression

高志 何锡文 张贵珠 李一峻 李玉环 伍孝余 《分析化学 (Chinese Journal of Analytical Chemistry)》1999,27(6):644-647

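Studying the relationship between analytical precision and concentration is important for quality control of analytical measurements. Taking trace elements in soils and stream sediments as an example, this paper studies the relationship between interlaboratory analytical precision and the concentration of the measured component. Iteratively reweighted least squares (IRLS) was successfully applied to fit a linear relationship between the standard deviation of the analytical results and the content, and the linear relationship was good. Comparison with classical least squares shows that IRLS is highly robust, is little affected by outliers, and gives regression results that agree better with reality.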
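The IRLS idea mentioned in this abstract can be sketched in a few lines of standard-library Python. The abstract does not say which weight function or scale estimate the authors used, so Huber-type weights with a median-based scale are assumed here. The data (standard deviation against concentration, with one gross outlier) and the function names are hypothetical.

```python
from statistics import median


def weighted_line(x, y, w):
    """Weighted least squares straight line; returns (intercept, slope)."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ym - slope * xm, slope


def irls_line(x, y, c=1.345, max_iter=50, tol=1e-10):
    """Iteratively reweighted least squares with Huber weights (an assumption).

    Returns (intercept, slope, weights from the last reweighting step).
    """
    w = [1.0] * len(x)
    a, b = weighted_line(x, y, w)  # start from the ordinary least squares fit
    for _ in range(max_iter):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for normal errors.
        scale = median(abs(r) for r in resid) / 0.6745
        if scale < 1e-12:
            break  # the majority of points are fitted exactly
        # Huber weights: 1 for small residuals, shrinking for large ones.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in resid]
        a_new, b_new = weighted_line(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b, w


# Hypothetical data: sd = 0.05 + 0.1 * conc plus small errors; the last point
# is a gross outlier (3.0 instead of about 1.05).
conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
sd = [0.16, 0.23, 0.365, 0.44, 0.57, 0.635, 0.76, 0.83, 0.965, 3.0]

ols_a, ols_b = weighted_line(conc, sd, [1.0] * len(conc))
irls_a, irls_b, irls_w = irls_line(conc, sd)
```

The outlier ends up with the smallest weight, so the IRLS slope stays closer to the underlying 0.1 than the classical least squares slope does. That is the qualitative behaviour the abstract reports.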

12.

A piecewise orthogonal signal correction method and its application to the analysis of wheat near-infrared spectral data

成忠 诸爱士 《分析化学 (Chinese Journal of Analytical Chemistry)》2008,36(6):788-792

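Spectral data typically show broad peaks, pronounced local effects, noise, a large number of variables, and often severe multicollinearity among those variables. To address these problems, a local correction method for spectral data was improved and designed: piecewise orthogonal signal correction based on window smoothing. It is combined with partial least squares regression to carry out both the preprocessing and the quantitative analysis of spectral data. The orthogonal components to be filtered out are initialized with the NIPALS algorithm, and orthogonal signal correction is then performed wavelength point by wavelength point over neighboring segments. The denoised spectral matrix is used as the new predictor matrix, and a calibration model between it and the property variables is built by partial least squares regression. Application to wheat near-infrared diffuse reflectance spectra shows that the method estimates the orthogonal components stably, removes noise markedly, predicts better than other methods, and needs fewer PLS components, giving a simpler model.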

13.

New approach by Kriging models to problems in QSAR

Fang KT Yin H Liang YZ 《Journal of chemical information and computer sciences》2004,44(6):2106-2113

Most models in quantitative structure and activity relationship (QSAR) research, proposed by various techniques such as ordinary least squares regression, principal components regression, partial least squares regression, and multivariate adaptive regression splines, involve a linear parametric part and a random error part. The random errors in those models are assumed to be independent and identically distributed. However, the independence assumption is not reasonable in many cases. Some dependence among the errors should be considered, as in Kriging, which has been used successfully for modeling in computer experiments. The aim of this paper is to apply Kriging models to QSAR. Our experiments show that the Kriging models can significantly improve the performance of the models obtained by many existing methods.

14.

Accurate discrimination of Gastrodia elata from different geographical origins using high‐performance liquid chromatography fingerprint combined with boosting partial least‐squares discriminant analysis

Shanshan Sun Yancheng Li Lijun Zhu Haiyan Ma Lupan Li Yufeng Liu 《Journal of separation science》2019,42(17):2875-2882

Gastrodia elata from different geographical origins varies in quality and pharmacological activity. This study focused on the classification and identification of Gastrodia elata from six producing areas using high‐performance liquid chromatography fingerprint combined with boosting partial least‐squares discriminant analysis. Before recognition analysis, a principal component analysis was applied to ascertain the discrimination possibility with high‐performance liquid chromatography fingerprints. Then, boosting partial least‐squares discriminant analysis and conventional partial least‐squares discriminant analysis were applied. Experimental results indicated that the adaptive iteratively reweighted penalized least‐squares algorithm could eliminate the baseline drift of high‐performance liquid chromatography chromatograms effectively. Compared with partial least‐squares discriminant analysis, the total recognition rates using high‐performance liquid chromatography fingerprint combined with boosting partial least‐squares discriminant analysis for the calibration sets and prediction sets were improved from 94 to 100% and 86 to 97%, respectively. In conclusion, high‐performance liquid chromatography combined with boosting partial least‐squares discriminant analysis, which is effective, specific, accurate, and non‐polluting, has an edge in discriminating traditional Chinese medicine from different geographical origins. The proposed methodology is a useful tool to classify and identify Gastrodia elata from different geographical origins.

15.

A robust iterative algorithm for weighted least squares support vector machines and its application in spectral analysis

包鑫 戴连奎 《化学学报 (Acta Chimica Sinica)》2009,67(10):1081-1086

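To overcome the influence of abnormal training samples in spectral analysis, a robust iterative algorithm for the weighted least squares support vector machine (WLS-SVM) is proposed. To address the shortcomings of the original WLS-SVM in convergence and robustness, a new way of computing the regression errors is proposed, which fundamentally resolves the convergence problem of WLS-SVM. The weight-computation step of the original algorithm is also modified to use the median of the regression errors as the reference for computing the weights, which greatly improves the robustness of WLS-SVM. The algorithm was applied to quantitative spectral analysis, and the experimental results show that the method converges and has a breakdown point of about 35%, making it an effective robust modeling method.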

16.

Statistical modeling of a ligand knowledge base

Mansson RA Welsh AH Fey N Orpen AG 《Journal of chemical information and modeling》2006,46(6):2591-2600

17.

Fluorescence spectroscopic determination of triglyceride in human serum with window genetic algorithm partial least squares

Xiangzhen Kong Weihua Zhu Zhimin Zhao Xiangyan Li Hui Wang Ran Chen Chuchu Chen Feng Zhu Xiaoying Guo 《Journal of Chemometrics》2012,26(1-2):25-33

The fluorescence spectrum, together with its first and second derivative spectra in the region of 220–900 nm, was used to determine the concentration of triglyceride in human serum. Nonlinear partial least squares regression with cubic B‐spline‐function‐based nonlinear transformation was employed as the chemometric method. Window genetic algorithms partial least squares (WGAPLS) was proposed as a new wavelength selection method to find the optimized combination of spectral wavelengths. The study shows that when WGAPLS is applied within the optimized regions ascertained by changeable size moving window partial least squares (CSMWPLS) or searching combination moving window partial least squares (SCMWPLS), the calibration and prediction performance of the model can be further improved at a reasonable latent variable number. SCMWPLS should start from the sub‐region found by CSMWPLS with the smallest root mean squares error of calibration (RMSEC). In addition, WGAPLS should be utilized within the region of smallest RMSEC, whether it is the sub‐region found by CSMWPLS or the region combination found by SCMWPLS. Moreover, the prediction ability of the nonlinear models was significantly better than that of the linear models. The prediction performance of the three spectra was in the following order: second derivative spectrum < original spectrum < first derivative spectrum. Wavelengths within the regions of 300–367 nm and 386–392 nm in the first derivative of the original fluorescence spectrum were the optimized wavelength combination for the prediction model. Copyright © 2012 John Wiley & Sons, Ltd.

18.

Resolution of Differential Pulse Voltammetric Peaks Using Genetic Algorithm Based Variable Selection‐Partial Least Squares and Principal Component‐Artificial Neural Networks

Mir Reza Majidi Karim Asadpour‐Zeynali 《中国化学会会志 (Journal of the Chinese Chemical Society)》2005,52(1):21-28

Differential pulse voltammetry has been used for the simultaneous determination of cysteine, tyrosine and tryptophan on an unmodified glassy carbon electrode. In the analysis of these analytes in the same samples, the main difficulty is the high degree of overlapping of the voltammograms. The relationships between the currents and the concentrations are complex and highly nonlinear. The predictive abilities of principal component regression (PCR), partial least squares regression (PLS), genetic algorithm‐partial least squares regression (GA‐PLS) and principal component‐artificial neural networks (PC‐ANNs) were examined for the simultaneous determination of the three amino acids. For a regression model, everything that does not help in constructing the model may be considered as noise without further specification. PC‐ANN and GA‐PLS use significant data and show superiority over the other applied multivariate methods. The proposed method was also applied satisfactorily to the determination of the analytes in some synthetic samples.

19.

Predicting the Activity of Peptides Based on Amino Acid Information

Xiao‐Yu Wang Juan Wang Yong Hu Yong Lin Mao Shu Li Wang Xiao‐Ming Cheng Zhi‐Hua Lin 《中国化学会会志 (Journal of the Chinese Chemical Society)》2011,58(7):877-883

20.

A robust strategy for partial least squares regression in near-infrared spectral analysis

邵学广 陈达 徐恒 刘智超 蔡文生 《中国化学 (Chinese Journal of Chemistry)》2009,27(7):1328-1332

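Partial least squares (PLS) plays an important role in quantitative near-infrared (NIR) spectral analysis, but its predictions are easily affected by factors such as sample partitioning and outlying samples, so its robustness is limited. Multi-model (ensemble) PLS (EPLS) improves model robustness, but it cannot identify outlying samples. To improve both the prediction accuracy and the robustness of the model, this paper proposes a multi-model PLS method that resamples according to sampling probabilities, termed robust consensus PLS (RE-PLS). The method uses iteratively reweighted partial least squares (IRPLS) to compute the regression residuals of the samples and so obtain a sampling probability for each calibration sample. It then selects training subsets according to these probabilities to build multiple PLS models, and finally averages the predictions of all PLS models to give the final prediction. The method was applied to NIR modeling of two different plant samples and compared with conventional PLS and EPLS. The results show that the method effectively avoids the influence of outlying calibration samples on the model and improves both prediction accuracy and robustness. For real tobacco samples with complex NIR spectra and many outlying samples, simple PLS or EPLS modeling does not predict well, whereas RE-PLS, with its particular advantages, is expected to find wide application in the quantitative analysis of such complex spectra.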