Similar literature
 19 similar articles found (search time: 156 ms)
1.
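High-dimensional, small-sample mass spectrometry data are prone to overfitting during modeling, show severe collinearity among variables, and relate nonlinearly to the properties of interest. To address these problems, a new technique is adopted that integrates kernel sliced inverse regression (KSIR) feature extraction with linear discriminant analysis (LDA). KSIR first performs nonlinear feature extraction on the mass spectral data; a linear discriminant function for the sample classes is then constructed in the low-dimensional space spanned by the new feature vectors and used to assign each sample to a class. Applied to the classification of mass spectral data from soft drinks, the KSIR-LDA method not only accommodates the nonlinear relationship between the mass spectral data and the properties, but also achieves higher classification accuracy with fewer, more interpretable feature variables, and allows the data to be interpreted and visualized in the low-dimensional feature space.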

2.
成忠  诸爱士 《分析化学》2008,36(6):788-792
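Spectral data have broad peaks, pronounced local effects, noise, and many variables that are often severely multicollinear. To address these problems, a local correction method for spectral data is designed: piecewise orthogonal signal correction based on window smoothing, combined with partial least squares regression for spectral preprocessing and quantitative analysis. The orthogonal components to be filtered out are initialized by the NIPALS algorithm, and orthogonal signal correction is carried out wavelength point by wavelength point using neighboring segments. The denoised spectral matrix then serves as the new predictor matrix, and a calibration model relating it to the property variables is built by partial least squares regression. Application to near-infrared diffuse reflectance spectra of wheat shows that the method estimates the orthogonal components stably and removes noise effectively; the model predicts better than other methods, needs fewer PLS components, and is more parsimonious.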

3.
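Wavelet transform is combined with multi-way partial least squares to build quantitative calibration models for near-infrared spectra. The original spectra are first decomposed by wavelet transform into a series of wavelet detail coefficients. A set of wavelet coefficients that is little affected by external factors and rich in information is selected to form a three-way spectral array, and a calibration model is then built by multi-way partial least squares. The experimental results show that the near-infrared calibration model built in this way has stronger predictive ability and is more robust.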
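As a rough illustration of the decomposition step described above, the sketch below applies a multilevel Haar wavelet transform to one spectrum and collects the detail coefficients at each level. The Haar wavelet, the function names, and the toy spectrum are assumptions made for illustration; the abstract does not say which wavelet or which coefficient-selection rule the authors used.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_details(signal, levels):
    """Detail coefficients for each level (finest first) and the final approximation."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return details, approx

# Hypothetical 8-point "spectrum" with a single peak.
spectrum = [0.10, 0.12, 0.20, 0.45, 0.80, 0.50, 0.22, 0.11]
details, approx = haar_details(spectrum, levels=2)
```

Because this transform halves the length at every level, the detail bands have different lengths. Stacking them into a samples x levels x wavelengths array would first require bringing each band to a common length, which this sketch does not attempt.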

4.
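Wavelet transform is combined with multi-way partial least squares to build quantitative calibration models for near-infrared spectra. The original spectra are first decomposed by wavelet transform into a series of wavelet detail coefficients. A set of wavelet coefficients that is little affected by external factors and rich in information is selected to form a three-way spectral array, and a calibration model is then built by multi-way partial least squares. The experimental results show that the near-infrared calibration model built in this way has stronger predictive ability and is more robust.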

5.
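To improve the prediction accuracy and modeling efficiency of quantitative near-infrared spectral analysis, a wavelength selection method based on interactive self-modeling mixture analysis is proposed. Informative wavelength variables are selected according to the purity value and standard deviation of each wavelength variable, and a correlation weight function is introduced to deal with collinearity among variables. Quantitative calibration models are built from the variables selected in successive iterations, and the optimal number of wavelength variables is determined by the root mean square error of cross-validation (RMSECV). The method was applied to two near-infrared data sets with different glucose contents (a four-component aqueous glucose solution experiment and a human plasma experiment). In each data set only 0.3% of all variables were selected to build the calibration model, and the root mean square error of prediction (RMSEP) for glucose concentration in the validation sets was reduced to 669 and 15 mg/L, respectively. Compared with calibration models built on the full spectrum or on optimized wavebands, the method minimizes redundant information through wavelength variable selection and improves prediction accuracy and modeling efficiency.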
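To make the RMSECV criterion concrete, the sketch below computes a leave-one-out RMSECV and uses it to choose between candidate models. The one-variable least-squares fit is only a stand-in for the authors' calibration model, and the function names and toy data are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit y ~ a + b*x for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def rmsecv_loo(x, y):
    """Leave-one-out RMSECV: each sample is predicted by a model fitted without it."""
    preds = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a + b * x[i])
    return rmse(y, preds)

def pick_by_rmsecv(columns, y):
    """Return (index, RMSECV) of the candidate column with the lowest RMSECV."""
    scores = [rmsecv_loo(col, y) for col in columns]
    best = min(range(len(columns)), key=scores.__getitem__)
    return best, scores[best]

# Toy data: column 0 tracks y closely, column 1 is unrelated to y.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
columns = [
    [0.11, 0.19, 0.31, 0.40, 0.52, 0.59],
    [0.50, 0.10, 0.40, 0.20, 0.60, 0.30],
]
best, score = pick_by_rmsecv(columns, y)
```

RMSEP is the same `rmse` calculation applied to an independent validation set rather than to held-out calibration samples.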

6.
张婉洁  刘蓉  徐可欣 《化学学报》2013,71(9):1281-1286
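In noninvasive blood glucose measurement by near-infrared spectroscopy, one cause of low prediction accuracy is that changes in the sample background make the measurement system of the prediction-set samples inconsistent with that of the calibration-set samples. An analytical method is proposed that introduces the matrix background as a variable in regression modeling and combines the spectral information of samples under each matrix background into a three-way spectral array, in order to improve the robustness of the calibration model. Parallel factor analysis (PARAFAC) combined with multiple linear regression (MLR) was validated on Monte Carlo simulations of a three-layer human skin model and on in vitro experiments with aqueous glucose solutions and their mixtures. The results show that, compared with a conventional partial least squares model built under a single matrix background, the calibration model built by PARAFAC-MLR with the matrix background as a modeling element has better predictive ability and robustness.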

7.
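Application of an ensemble partial least squares regression method to quantitative near-infrared spectral analysis   Total citations: 3, self-citations: 1, citations by others: 3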
成忠  诸爱士  陈德钊 《分析化学》2007,35(7):978-982
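Near-infrared spectral data show pronounced local effects and have many variables, which are often severely multicollinear and mostly related nonlinearly to the content of sample components. To address this, an ensemble nonlinear partial least squares regression method (E-S-QPLSR) is constructed. It uses sampling without replacement (subagging) to generate several subsamples from the training samples. A sub-model is built on each subsample by quadratic polynomial partial least squares regression (QPLSR) and used to predict the response variable of the training samples; these predictions are then passed to a linear PLS algorithm to compute the combination weights of the sub-models. Applied to modeling the quantitative relationship between the water content of 80 corn samples and their near-infrared spectra, the method works well and shows strong learning ability, and the predictive performance of the resulting model is better than that of other methods.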
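The subagging idea (subsample aggregating, drawing each subsample without replacement) can be sketched as follows. Two simplifications are assumed here and are not the authors' method: the base learner is a one-variable straight-line fit instead of QPLSR, and the sub-models are combined by a plain average instead of PLS-derived weights. All names and data are hypothetical.

```python
import random

def fit_line(x, y):
    """Ordinary least squares fit y ~ a + b*x for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def subagging_fit(x, y, n_models=10, frac=0.7, seed=0):
    """Fit one sub-model per subsample drawn WITHOUT replacement."""
    rng = random.Random(seed)
    m = max(2, int(frac * len(x)))
    models = []
    for _ in range(n_models):
        idx = rng.sample(range(len(x)), m)  # no index appears twice
        models.append(fit_line([x[i] for i in idx], [y[i] for i in idx]))
    return models

def subagging_predict(models, x_new):
    """Combine sub-model predictions by a simple average (stand-in for PLS weights)."""
    return sum(a + b * x_new for a, b in models) / len(models)

# Toy data: roughly y = 2x + 1 with small deterministic perturbations.
x = [float(i) for i in range(10)]
y = [2.0 * v + 1.0 + (0.1 if i % 2 else -0.1) for i, v in enumerate(x)]
models = subagging_fit(x, y)
prediction = subagging_predict(models, 4.5)
```

Unlike bagging, which resamples with replacement, every subsample here contains distinct training samples, so each sub-model sees a genuinely smaller data set.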

8.
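Because the quality of the calibration-set samples determines the quality of the calibration model, detecting outlier samples in the calibration set is very important in multivariate calibration modeling. This study establishes a method for detecting outlier samples in the calibration set when building multivariate calibration models for near-infrared spectra. Based on the definition of an outlier sample and the principle of partial least squares, the method examines the contribution of each calibration sample to the model in each factor (or principal component) and identifies as outliers the samples that behave differently from the majority. Near-infrared spectral data from 218 orange juice samples were analyzed. The results show that the calibration set contains 6 outlier samples; after their removal, the root mean square error of cross-validation of the calibration set decreased from 16.870 to 4.809, and the root mean square error of the prediction set decreased from 3.688 to 3.332.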

9.
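Modeling with the competitive adaptive reweighted sampling (CARS) variable selection method significantly improved the prediction accuracy of near-infrared models for protein and fat in liquid milk. Outlier samples were first removed by Monte Carlo sampling, the spectra were then mean-centered and denoised with a Karl Norris filter, variables closely related to the sample properties were selected by CARS, and partial least squares (PLS) calibration models for protein and fat content were built and compared with PLS models without variable selection. The optimal modeling conditions for protein and fat were determined using the calibration-set correlation coefficient (r2), the root mean square error of cross-validation (RMSECV), and the root mean square error of prediction (RMSEP) as criteria. For the protein and fat calibration models, the correlation coefficients were 0.9750 and 0.9951, the RMSECV values were 0.1948 and 0.1363, and the RMSEP values were 0.1133 and 0.1401, respectively. The predictions were better than those of the PLS model without variable selection and of other variable selection methods, and the model was effectively simplified, making the approach suitable for rapid, nondestructive determination of fat and protein in liquid milk.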

10.
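Near-infrared spectroscopy was used for rapid quantitative determination of trans fatty acids (TFA) in edible vegetable oils, and the TFA prediction model was optimized through waveband selection, preprocessing, variable selection, and choice of modeling method. Near-infrared transmission spectra of 98 edible vegetable oil samples were collected over 4000–10000 cm⁻¹ with an Antaris II Fourier transform near-infrared spectrometer, and the true TFA content was then determined by gas chromatography. First, the waveband and preprocessing method were optimized on the raw spectra. On this basis, competitive adaptive reweighted sampling (CARS) was used to select the important TFA-related variables. Finally, principal component regression, partial least squares, and least squares support vector machine were each used to build prediction models for TFA content. The results show that determining TFA in edible vegetable oils by near-infrared spectroscopy is feasible. For the best optimized model, R2 was 0.992 for the calibration set and 0.989 for the prediction set, and RMSEC and RMSEP were 0.071% and 0.075%, respectively. The best model used only 26 variables, 0.854% of the full-band variables. Compared with the full-band partial least squares model, the prediction-set R2 rose from 0.904 to 0.989 and RMSEP fell from 0.230% to 0.075%. Model optimization is therefore necessary: CARS effectively selects the important TFA-related variables and greatly reduces the number of modeling variables, which simplifies the prediction model and substantially improves its accuracy and stability.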
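The abstract gives no formulas for CARS, so the following is only a sketch of one ingredient as it is commonly described: an exponentially decreasing schedule for how many variables survive each sampling run, followed by keeping the variables with the largest absolute weights. The schedule form, the function names, and the example weights are assumptions; the PLS fitting and the adaptive reweighted sampling step are omitted.

```python
import math

def cars_keep_counts(n_vars, n_runs):
    """Number of variables retained after each of n_runs sampling runs.

    Assumed schedule: ratio r_i = a * exp(-k * i), with a and k chosen so that
    r_1 = 1 (all variables kept) and r_N = 2 / n_vars (two variables kept).
    """
    if n_vars < 2 or n_runs < 2:
        raise ValueError("need at least 2 variables and 2 runs")
    a = (n_vars / 2.0) ** (1.0 / (n_runs - 1))
    k = math.log(n_vars / 2.0) / (n_runs - 1)
    return [round(n_vars * a * math.exp(-k * i)) for i in range(1, n_runs + 1)]

def keep_largest(weights, n_keep):
    """Sorted indices of the n_keep variables with the largest absolute weights."""
    order = sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)
    return sorted(order[:n_keep])

counts = cars_keep_counts(n_vars=100, n_runs=10)
# Hypothetical regression coefficients for four variables.
selected = keep_largest([0.1, -0.9, 0.3, 0.05], n_keep=2)
```

In a full implementation the weights would come from a PLS model refitted on a random subset of samples at every run, and the subset of variables with the lowest RMSECV across runs would be kept.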

11.
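Constructing a support vector machine-partial least squares method for modeling drug structure-activity relationships   Total citations: 6, self-citations: 0, citations by others: 6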
李剑  陈德钊  成忠  叶子青 《分析化学》2006,34(2):263-266
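While sample data are still being accumulated for studying drug structure-activity relationships, models must be built from small samples. Overfitting then occurs easily and harms the predictive performance and stability of the model. To address this, partial least squares (PLS) can be used to extract optimal components from the sample data in pairs, which removes the multicollinearity among predictors and reduces dimensionality effectively. A least squares support vector machine then performs nonlinear regression on each pair of components, adjusted by an error-correction-based strategy so that it expresses the nonlinear relationship between predictors and responses more effectively. The resulting EB-LSSVM-PLS algorithm yields models with high prediction accuracy and good stability. Applied to QSAR modeling of novel flavanone derivatives, it gave satisfactory results, and its generalization performance was better than that of other methods.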

12.
The fluorescence spectrum, as well as its first and second derivative spectra in the region of 220–900 nm, was utilized to determine the concentration of triglyceride in human serum. Nonlinear partial least squares regression with cubic B‐spline‐function‐based nonlinear transformation was employed as the chemometric method. Window genetic algorithms partial least squares (WGAPLS) was proposed as a new wavelength selection method to find the optimized combination of spectral wavelengths. The study shows that when WGAPLS is applied within the optimized regions ascertained by changeable size moving window partial least squares (CSMWPLS) or searching combination moving window partial least squares (SCMWPLS), the calibration and prediction performance of the model can be further improved at a reasonable number of latent variables. SCMWPLS should start from the sub‐region found by CSMWPLS with the smallest root mean square error of calibration (RMSEC). In addition, WGAPLS should be utilized within the region of smallest RMSEC, whether it is the sub‐region found by CSMWPLS or the region combination found by SCMWPLS. Moreover, the prediction ability of the nonlinear models was significantly better than that of the linear models. The prediction performance of the three spectra was in the following order: second derivative spectrum < original spectrum < first derivative spectrum. Wavelengths within the regions of 300–367 nm and 386–392 nm in the first derivative of the original fluorescence spectrum were the optimized wavelength combination for the prediction model. Copyright © 2012 John Wiley & Sons, Ltd.

13.
Regression is a collection of statistical methods that are used to study relationships among predictor and response variables. In addition to the most popular linear model, solved by least squares, several other techniques have found an application in analytical chemistry. Biased methods such as stepwise regression, ridge regression, principal components regression, and partial least squares regression are especially useful in cases of poorly or underdetermined systems with collinearity. When structural and/or distributional assumptions associated with linear least squares are violated, nonlinear regression, robust regression or generalized least squares estimators may offer potential remedies.
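A small numerical sketch may help show why a biased method such as ridge regression is useful under collinearity. It solves (X'X + lam*I) beta = X'y for two nearly identical predictors, with lam = 0 giving ordinary least squares. The data, the no-intercept model, and the function names are assumptions chosen for illustration.

```python
import math

def ridge_2col(x1, x2, y, lam=0.0):
    """Solve (X'X + lam*I) beta = X'y for a two-column X without intercept."""
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # close to zero when the columns are collinear
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det

# Two nearly collinear predictors; y is roughly x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.01, 1.98, 3.02, 3.99, 5.01]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ols = ridge_2col(x1, x2, y, lam=0.0)    # unstable: large coefficients of opposite sign
ridge = ridge_2col(x1, x2, y, lam=1.0)  # shrunken: both coefficients close to 1

ols_norm = math.hypot(*ols)
ridge_norm = math.hypot(*ridge)
```

The least squares solution fits the noise by giving the two near-duplicate predictors large opposing coefficients, while a modest penalty trades a little bias for far more stable estimates.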

14.
Differential pulse voltammetry has been used for the simultaneous determination of cysteine, tyrosine and tryptophan on an unmodified glassy carbon electrode. In the analysis of these analytes in the same samples, the main difficulty is the high degree of overlap of the voltammograms. The relationships between the currents and the concentrations are complex and highly nonlinear. The predictive abilities of principal component regression (PCR), partial least squares regression (PLS), genetic algorithm‐partial least squares regression (GA‐PLS) and principal component‐artificial neural networks (PC‐ANNs) were examined for the simultaneous determination of the three amino acids. For a regression model, everything that does not help in constructing the model may be considered as noise without further specification. PC‐ANN and GA‐PLS use significant data and show superiority over the other applied multivariate methods. The proposed method was also applied satisfactorily to the determination of the analytes in some synthetic samples.

15.
With the aim of developing a nonlinear tool for near-infrared spectral (NIRS) calibration, an applicable algorithm, called MIKPLS, is designed based on the combination of two different strategies, i.e. mutual information (MI) for interval selection and kernel partial least squares (KPLS) for modeling. Because mutual information can capture linear and nonlinear dependencies between variables simultaneously, the mutual information between each candidate variable and the target is calculated and employed to induce a continuous wavelength interval, which is subsequently used by kernel partial least squares to build a parsimonious calibration model for future use. The experiments on two datasets suggest that MI-induced interval selection, followed by KPLS, forms a very simple and practical tool, allowing a prediction model to be constructed using a much-reduced set of neighboring variables, without any loss of generalization and with improved prediction performance instead.
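The claim that mutual information captures nonlinear dependence can be checked with a minimal histogram estimator. In the sketch below, y = x^2 on a symmetric grid has essentially zero Pearson correlation with x but clearly positive mutual information. The equal-width binning, the bin count, and the simple "best contiguous window" rule are assumptions; the abstract does not state how MIKPLS estimates MI or induces the interval.

```python
import math

def bin_index(value, lo, hi, n_bins):
    """Equal-width bin index, with the maximum value placed in the last bin."""
    if hi == lo:
        return 0
    return min(int((value - lo) / (hi - lo) * n_bins), n_bins - 1)

def mutual_information(x, y, n_bins=8):
    """Histogram estimate of I(X;Y) in nats."""
    n = len(x)
    lox, hix, loy, hiy = min(x), max(x), min(y), max(y)
    joint, px, py = {}, [0] * n_bins, [0] * n_bins
    for xi, yi in zip(x, y):
        i = bin_index(xi, lox, hix, n_bins)
        j = bin_index(yi, loy, hiy, n_bins)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] += 1
        py[j] += 1
    # Sum p(i,j) * log(p(i,j) / (p(i) * p(j))) over the occupied cells.
    return sum((c / n) * math.log(c * n / (px[i] * py[j])) for (i, j), c in joint.items())

def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_interval(scores, width):
    """Start index of the contiguous window of given width with the highest total score."""
    return max(range(len(scores) - width + 1), key=lambda s: sum(scores[s:s + width]))

x = [-1 + 2 * i / 199 for i in range(200)]
y = [v * v for v in x]           # purely nonlinear dependence on x
r = pearson(x, y)                # approximately zero
mi = mutual_information(x, y)    # clearly positive
```

Scoring every wavelength against the target with `mutual_information` and passing the scores to `best_interval` gives one simple way to pick a contiguous interval.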

16.
This paper proposes the use of the least-squares support vector machine (LS-SVM) as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants (starch, whey or sucrose) found in powdered milk samples, using near-infrared spectroscopy with direct measurements by diffuse reflectance. Due to the spectral differences of the three adulterants, nonlinear behavior is present when all groups of adulterants are in the same data set, making the use of linear methods such as partial least squares regression (PLSR) difficult. Excellent models were built using LS-SVM, with low prediction errors and superior performance in relation to PLSR. These results show that it is possible to build robust models to quantify some common adulterants in powdered milk using near-infrared spectroscopy and LS-SVM as a nonlinear multivariate calibration procedure.
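For readers unfamiliar with LS-SVM regression, the sketch below shows its standard form: training reduces to solving one linear system, [[0, 1'], [1, K + I/gamma]] [b; alpha] = [0; y], and predictions are f(x) = sum_i alpha_i K(x, x_i) + b. The RBF kernel, the hyperparameter values, and the sine-curve toy data are assumptions; the paper's actual settings are not given in the abstract.

```python
import math

def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Gaussian elimination with partial pivoting; returns x such that A x = rhs."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lssvm_fit(X, y, gamma=1000.0, sigma=1.0):
    """Return (bias b, dual weights alpha) of an LS-SVM regressor."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge-like term from the squared-error loss
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, x_new, sigma=1.0):
    """Evaluate f(x_new) = b + sum_i alpha_i * K(x_new, x_i)."""
    return b + sum(a * rbf(x_new, xi, sigma) for a, xi in zip(alpha, X_train))

# Toy nonlinear calibration problem: one feature, response sin(x).
x_train = [[0.3 * i] for i in range(21)]
y_train = [math.sin(v[0]) for v in x_train]
b, alpha = lssvm_fit(x_train, y_train)
y_hat = lssvm_predict(x_train, b, alpha, [1.65])
```

Unlike a classical SVM, every training sample receives a nonzero weight, so the model is not sparse, but fitting needs no quadratic programming.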

17.
18.
Two‐way and three‐way calibration models were applied to ultra high performance liquid chromatography with photodiode array data with coeluted peaks in the same wavelength and time regions for the simultaneous quantitation of ciprofloxacin and ornidazole in tablets. The chromatographic data cube (tensor) was obtained by recording chromatographic spectra of the standard and sample solutions containing ciprofloxacin and ornidazole with sulfadiazine as an internal standard as a function of time and wavelength. Parallel factor analysis and trilinear partial least squares were used as three‐way calibrations for the decomposition of the tensor, whereas three‐way unfolded partial least squares was applied as a two‐way calibration to the unfolded dataset obtained from the data array of ultra high performance liquid chromatography with photodiode array detection. The validity and ability of two‐way and three‐way analysis methods were tested by analyzing validation samples: synthetic mixture, interday and intraday samples, and standard addition samples. Results obtained from two‐way and three‐way calibrations were compared to those provided by traditional ultra high performance liquid chromatography. The proposed methods, parallel factor analysis, trilinear partial least squares, unfolded partial least squares, and traditional ultra high performance liquid chromatography were successfully applied to the quantitative estimation of the solid dosage form containing ciprofloxacin and ornidazole.
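The unfolding step that turns the data cube into a two-way matrix can be sketched in a few lines. The axis order (samples x time points x wavelengths) and the toy values are assumptions for illustration; the abstract does not specify how the authors arranged the unfolded array.

```python
def unfold(tensor):
    """Unfold a samples x times x wavelengths array into samples x (times * wavelengths)."""
    return [[v for time_slice in sample for v in time_slice] for sample in tensor]

# Toy cube: 2 samples, 3 time points, 4 wavelengths.
# Each value encodes its own position as 100*sample + 10*time + wavelength.
tensor = [[[100 * s + 10 * t + w for w in range(4)] for t in range(3)] for s in range(2)]
matrix = unfold(tensor)
```

Each row of `matrix` holds one sample's full time-by-wavelength landscape, so an ordinary two-way method such as PLS can be applied to it, whereas PARAFAC and trilinear PLS work on the cube directly.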

19.
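Application of an adaptive fuzzy partial least squares method to modeling drug structure-activity relationships   Total citations: 2, self-citations: 0, citations by others: 2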
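As a local approximation method, the adaptive neuro-fuzzy inference system (ANFIS) is suited to modeling quantitative structure-activity relationships (QSAR) of drugs. The many parameters that describe the molecular structure of a drug are often coupled, which makes modeling harder and degrades predictive performance. ANFIS is therefore combined with partial least squares (PLS): PLS first extracts components from the sample data, ANFIS then realizes the nonlinear mapping between each pair of components, and the extracted components are further corrected on the basis of the output error so that they explain the response variable optimally. This yields the EB-AFPLS method. It has been applied successfully to QSAR modeling of HIV-1 protease inhibitors with good results, showing strong learning ability, and the predictive performance of the resulting model is better than that of other methods.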
