Similar literature
Found 20 similar articles (search time: 171 ms)
1.
陈昭, 吴志生, 史新元, 徐冰, 赵娜, 乔延江. 《分析化学》, 2014, (11): 1679-1686
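The aim was to establish a robust near-infrared spectroscopy (NIR) quantitative model for the ethanol precipitation process of honeysuckle (金银花) and so provide a method for rapid evaluation of that process. Based on NIR data for chlorogenic acid during honeysuckle ethanol precipitation, Bagging partial least squares (Bagging-PLS), Boosting partial least squares (Boosting-PLS), and partial least squares (PLS) models were built and their performance compared. On this basis, synergy interval partial least squares (siPLS) and competitive adaptive reweighted sampling (CARS) were each used to select spectral variables, models were built, and their predictive performance was examined. The results showed that Bagging-PLS and Boosting-PLS (with the number of latent variables set to 10) both predicted better than the PLS model. With siPLS, the bands selected for the first batch were 820 to 1029.5 nm and 1030 to 1239.5 nm, and those for the second batch were 820 to 959.5 nm and 960 to 1099.5 nm. With CARS, 5-fold and 10-fold cross-validation were chosen for the two batches respectively, and the subset with the smallest root mean square error of cross-validation (RMSECV) was taken as the final selection. After variable selection, the root mean square error of prediction (RMSEP) of the Bagging-PLS and Boosting-PLS models for chlorogenic acid content in the two batches decreased by 0.02 to 0.04 g/L, and the prediction correlation coefficient increased by 4% to 5%. In summary, the Bagging-PLS and Boosting-PLS algorithms can serve as rapid prediction methods for NIR quantitative models of the honeysuckle ethanol precipitation process.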
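The bagging idea named in the abstract above (fit member models on bootstrap resamples, then average their predictions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the base learner is a one-variable ordinary least squares line standing in for PLS, and `fit_line`, `bagging_predict`, `n_models`, and `seed` are hypothetical names.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in for PLS)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def bagging_predict(xs, ys, x_new, n_models=25, seed=0):
    """Average the predictions of member models fitted to bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    while len(predictions) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:  # degenerate resample: a line cannot be fitted
            continue
        slope, intercept = fit_line(bx, [ys[i] for i in idx])
        predictions.append(slope * x_new + intercept)
    return sum(predictions) / n_models

# Made-up, exactly linear data (y = 2x + 1), for illustration only
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
print(bagging_predict(xs, ys, 5.0))
```

With a real spectral data set the base learner would be a PLS regression with a fixed number of latent variables; only the resample-and-average loop is meant to carry over.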

2.
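IR and NIR spectroscopy combined with the soft independent modeling of class analogy (SIMCA) recognition method were used to classify and identify vegetable oils, and a method for identifying binary and ternary blended vegetable oils was established. Using the SIMCA technique in the NIRCal5.2 software, IR and NIR recognition models were built for the prepared blended oils, and spectral and data processing methods for improving classification performance were discussed. With the IR and NIR spectra of the blended oils as variables, 2/3 of the samples were drawn at random as the training set to build a principal component analysis (PCA) model for each blended oil; the remaining 1/3 served as the validation set for verifying the models. Cluster analysis-principal component analysis (CLU-PCA) was used to examine the principal component distribution of the IR and NIR spectral information of the blended oils and their pure oils. The results showed that, over the 4000 to 10000 cm⁻¹ range, SIMCA could cluster and identify the NIR spectra of 15 binary and 2 ternary blended oils, and the IR spectra of 10 binary and 2 ternary blended oils. The IR analysis was based on the absorbances at four wavenumbers (1099, 1119, 1746, and 2855 cm⁻¹), with different numbers of principal components and data pretreatment methods selected. The classification accuracy of the SIMCA analysis was 100% for all oils, the validation recognition accuracy for the blended oils was 100%, the lowest identifiable proportion was 1%, and IR showed higher recognition sensitivity than NIR.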

3.
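Protein content is an important indicator of fish meal quality. In this work, near-infrared (NIR) spectroscopy combined with feature selection was used to build a rapid quantitative model for fish meal protein content. Interval partial least squares (iPLS) was combined with a differential evolution (DE) algorithm using a binary mutation strategy to form an iPLS-DE wavelength selection scheme, which was applied to select characteristic wavelengths from the fish meal NIR spectra. iPLS-DE first tunes the number of equal-width subintervals in iPLS to pick 9 optimal characteristic bands, then uses the binary-mutation DE algorithm to select combinations of discrete characteristic wavelengths within those bands, and finally determines the optimal iPLS-DE model from the model evaluation metrics and compares it with the optimal iPLS model. The results showed that, when the full fish meal spectrum was divided equally into 5 subintervals, the optimal model built on 50 discrete characteristic wavelengths selected by iPLS-DE gave a root mean square error of prediction of 1.033% and a ratio of performance to deviation (RPD, 相对分析误差) of 4.058 on the test set, whereas the optimal iPLS model gave 1.131% and 3.855. This indicates that iPLS-DE can effectively improve the predictive ability of NIR models for quantitative determination of fish meal protein.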
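The two figures of merit quoted in the abstract above can be computed as in the following sketch. It assumes the common definitions (RMSEP as the root mean squared prediction error on the test set, and RPD as the sample standard deviation of the reference values divided by RMSEP); the abstract does not state which exact formulas the authors used, and the data below are made up.

```python
import math

def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a test set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def rpd(y_true, y_pred):
    """Sample standard deviation of the reference values divided by RMSEP."""
    n = len(y_true)
    mean = sum(y_true) / n
    sd = math.sqrt(sum((t - mean) ** 2 for t in y_true) / (n - 1))
    return sd / rmsep(y_true, y_pred)

# Made-up protein contents (%), for illustration only
reference = [60.0, 62.0, 64.0, 66.0, 68.0]
predicted = [61.0, 61.0, 65.0, 65.0, 69.0]
print(rmsep(reference, predicted), rpd(reference, predicted))
```

A lower RMSEP and a higher RPD both indicate a better model, which matches the direction of the comparison reported for iPLS-DE against iPLS.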

4.
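A near-infrared (NIR) spectral database of raw materials and excipients for oral solid preparations of traditional Chinese medicine was established, and pattern recognition methods were used to study the application of NIR spectral data to material classification and physical property prediction. A portable NIR spectrometer was used to rapidly measure the NIR diffuse reflectance spectra of 149 batches of raw material and excipient powders, which were entered into the iTCM database. Principal component analysis (PCA) was used to explore how well the NIR data classify materials of known structure, and partial least squares (PLS) was used to study how well NIR spectra predict the physical property parameters of the materials and the performance of directly compressed tablets. NIR spectra processed by standard normal variate (SNV) transformation + Savitzky-Golay (SG) smoothing + first derivative distinguished well among five classes of excipients: microcrystalline cellulose, lactose, ethyl cellulose, crospovidone, and hydroxypropyl methylcellulose. The NIR data correlated strongly with the particle size, density, and hygroscopicity of the powders. As a supplement to the physical properties of the materials, NIR spectral information can improve models that predict the performance of directly compressed tablets. NIR spectral data complement the physical property parameters in the iTCM database, and combining the two characterizes the materials more comprehensively.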
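The SNV + SG smoothing + first derivative chain mentioned above can be sketched as follows. The abstract gives no window width, polynomial order, or derivative scheme, so the 5-point quadratic smoothing weights and the central-difference derivative used here are assumptions, and all function names are hypothetical.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / sd for x in spectrum]

SG5 = (-3, 12, 17, 12, -3)  # 5-point quadratic Savitzky-Golay smoothing weights (sum 35)

def sg_smooth(spectrum):
    """5-point quadratic SG smoothing; the two points at each edge are left unsmoothed."""
    out = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        out[i] = sum(w * spectrum[i + j - 2] for j, w in enumerate(SG5)) / 35.0
    return out

def first_derivative(spectrum):
    """Central-difference first derivative per index step; output is 2 points shorter."""
    return [(spectrum[i + 1] - spectrum[i - 1]) / 2.0 for i in range(1, len(spectrum) - 1)]

def preprocess(spectrum):
    """Apply SNV, then SG smoothing, then the first derivative, in that order."""
    return first_derivative(sg_smooth(snv(spectrum)))

# Made-up absorbance values, for illustration only
print(preprocess([0.10, 0.12, 0.15, 0.21, 0.30, 0.33, 0.31, 0.28]))
```

SNV is applied per spectrum (row-wise), so it needs no reference to the rest of the data set; that is why it can be written as a function of a single list.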

5.
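A Monte Carlo partial least squares regression coefficient method is proposed for variable selection in near-infrared spectra. The method consists of the following steps: (1) build multiple subsets by Monte Carlo sampling; (2) build a model on each subset, compute its regression coefficients, and rank the variables in each sub-model by the absolute value of their regression coefficients; (3) rank the wavelengths by frequency counting; (4) add the ranked wavelengths one at a time, cumulatively, to the candidate variable subset and use cross-validation to choose the best variable set. The method was applied to variable selection for the NIR spectra of biological sample solutions and tobacco samples. From the original 1234 and 1557 variables, 27 and 68 characteristic variables were selected, respectively, and the RMSEP on independent test sets fell from 0.02716 and 0.06411 with full-spectrum variables to 0.02372 and 0.03977. The method can effectively select variables in near-infrared spectra.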

6.
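Modeling with the competitive adaptive reweighted sampling (CARS) variable selection method significantly improved the prediction accuracy of NIR models for protein and fat in liquid milk. Outlier samples were first removed by Monte Carlo sampling, the spectra were then mean-centered and denoised with Karl Norris filtering, and CARS was used to select the variables closely related to the sample properties. Partial least squares (PLS) calibration models for protein and fat content were built and compared with PLS models without variable selection. Using the calibration-set correlation coefficient (r²), the root mean square error of cross-validation (RMSECV), and the root mean square error of prediction (RMSEP) as criteria, the best modeling conditions for protein and fat were determined. For the protein and fat calibration models, the correlation coefficients were 0.9750 and 0.9951, the RMSECV values were 0.1948 and 0.1363, and the RMSEP values were 0.1133 and 0.1401, respectively. The predictions were better than those of the PLS model without variable selection and of other variable selection methods. The approach effectively simplified the model and is suitable for rapid, non-destructive determination of fat and protein in liquid milk.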
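One step of CARS can be illustrated compactly: the schedule that decides how many variables survive each sampling run. The sketch below assumes the exponentially decreasing function usually given for CARS, r_i = a·exp(-k·i) with a = (p/2)^(1/(N-1)) and k = ln(p/2)/(N-1), which keeps all p variables in the first run and 2 in the last. This formula is not stated in the abstract, and the function name and parameters are hypothetical; the adaptive reweighted sampling step and the PLS fitting are omitted.

```python
import math

def cars_retained_counts(n_variables, n_runs):
    """Number of variables kept after each of n_runs sampling runs under the EDF."""
    a = (n_variables / 2.0) ** (1.0 / (n_runs - 1))
    k = math.log(n_variables / 2.0) / (n_runs - 1)
    return [
        max(2, round(n_variables * a * math.exp(-k * i)))
        for i in range(1, n_runs + 1)
    ]

# Hypothetical spectrum with 1000 wavelengths and 50 sampling runs
counts = cars_retained_counts(n_variables=1000, n_runs=50)
print(counts[0], counts[len(counts) // 2], counts[-1])
```

The schedule removes variables quickly at first and slowly later; in the full method the subset with the lowest RMSECV across the runs is the one retained.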

7.
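Near-infrared (NIR) transmission spectroscopy was used for quantitative analysis of the components of ethanol blended fuel, with ethanol volume fractions of 84.5% to 98.2% and gasoline volume fractions of 0 to 15%. Models were built by partial least squares (PLS). For the ethanol NIR model, the calibration-set coefficient of determination (R²) was 0.9969, and the calibration-set standard error (SEE) and prediction-set standard error (SEP) were 0.23 and 0.38, respectively. For the gasoline NIR model, the calibration-set coefficient of determination was 0.9939, and the calibration-set and prediction-set standard errors were 0.38 and 0.39, respectively. Predictions for acetone, an interfering substance present at low content, were also satisfactory. NIR combined with multivariate calibration can serve as a simple, rapid method for determining component contents in ethanol blended fuel.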

8.
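Gas chromatography-mass spectrometry combined with chemometrics was used to build a high-performance early detection model for glutaric acidemia type Ⅰ (GA-Ⅰ) from high-dimensional, small-sample disease metabolomic profiles. Building on the strengths of partial least squares discriminant analysis (PLS-DA) in handling collinearity and interpreting data, bootstrap resampling perturbs the data to pool the variable selection ability of multiple models, and the variables that are selected consistently are retained, giving robust feature selection (BS-PLSDA). For the urine metabolomic profiles of GA-Ⅰ, under two split ratios (7:3 and 6:4) that progressively increase the sample differences between training sets, BS-PLSDA with each of three information vectors, namely loadings (LW), variable importance in projection (VIP), and significance multivariate correlation (sMC), gave more robust feature selection than the corresponding single PLS-DA model. At a split ratio of 7:3, the Kuncheva index of BS-VIP-PLSDA reached 0.8075. The robust feature variables selected agree with the diagnostic markers reported in the literature; they genuinely explain the differences between groups and are closely related to the metabolic mechanism of GA-Ⅰ. BS-LW-PLSDA, BS-VIP-PLSDA, and BS-sMC-PLSDA also showed good predictive performance, with mean areas under the receiver operating characteristic curve of 0.7739, 0.8548, and 0.8471, and mean Matthews correlation coefficients of 0.6719, 0.783 ...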
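Two of the metrics named above have short closed forms. The sketch below assumes the standard definitions: the Kuncheva index for two feature subsets of equal size k drawn from n features, (r·n - k²) / (k·(n - k)) with r the size of their intersection, and the Matthews correlation coefficient from a binary confusion matrix. How the authors averaged these over resampling runs is not stated in the abstract, and the function names are hypothetical.

```python
import math

def kuncheva_index(subset_a, subset_b, n_features):
    """Kuncheva consistency between two equal-size feature subsets."""
    a, b = set(subset_a), set(subset_b)
    k = len(a)
    if len(b) != k or not 0 < k < n_features:
        raise ValueError("subsets must have equal size k with 0 < k < n_features")
    r = len(a & b)  # number of features selected in both runs
    return (r * n_features - k * k) / (k * (n_features - k))

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Made-up selections from 20 candidate features, for illustration only
print(kuncheva_index([0, 1, 2, 3, 4], [3, 4, 5, 6, 7], n_features=20))
print(matthews_corrcoef(tp=4, tn=3, fp=1, fn=2))
```

The Kuncheva index corrects the raw overlap for the overlap expected by chance, so identical subsets score 1 and chance-level agreement scores about 0.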

9.
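The stability competitive adaptive reweighted sampling feature selection algorithm was applied to qualitative analysis with support vector machines (Support vector machine-stability competitive adaptive reweighted sampling, SVM-SCARS). The algorithm computes a stability value for each variable by repeatedly sampling the data and building models; because the stability value evaluates a variable's role in modeling more objectively and accurately, it can serve as the measure of variable importance. Variables are screened step by step through iteration using adaptive reweighted sampling. An SVM model is then built on the variable subset from each iteration, and the correct classification rate of cross validation (CCRCV) is used to judge the subsets and determine the optimal feature subset. Combined with diffuse reflectance NIR spectroscopy, the algorithm was used to build a species identification model for woods commonly used in pulping and papermaking, achieving rapid identification of 4 eucalyptus species and 2 acacia species. In total, 15 characteristic variables were selected for the classification model, whose classification accuracy across species reached 97.9%. Compared with the full-spectrum model and the recursive feature elimination SVM model, SVM-SCARS selects fewer characteristic variables and gives better predictive performance and stability. The results indicate that SVM-SCARS can effectively optimize spectral feature variables and improve the robustness and applicability of on-line NIR models in wood property analysis.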

10.
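A new method for process endpoint analysis and control in traditional Chinese medicine manufacturing was established, based on process analytical technology (PAT) and the quality by design (QbD) design space. With near-infrared (NIR) spectroscopy as the PAT tool, NIR spectra of multiple batches were collected under normal operating conditions. Principal component analysis combined with moving block relative standard deviation (PCA-MBRSD) was used to determine the desired endpoint samples (DEPs) of each batch, and the spectral information of the DEPs from multiple batches formed the process endpoint design space. Within the range defined by this design space, a multivariate statistical process control (MSPC) model was built, and multivariate Hotelling T² and SPE control charts were used to judge the process endpoint. The method was applied to endpoint detection for the ethanol addition step of honeysuckle (金银花) ethanol precipitation. The results show that the method is sensitive and accurate and is suitable for endpoint detection in traditional Chinese medicine manufacturing.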
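The moving block relative standard deviation idea can be sketched on a single monitored value. This is an illustration only: the abstract applies MBRSD after PCA, whereas the sketch takes a plain positive-valued series, and the block size, threshold, and function names are hypothetical. The block mean must be non-zero for the RSD to be defined.

```python
import math

def moving_block_rsd(values, block_size):
    """RSD (%) of each block of block_size consecutive values."""
    out = []
    for start in range(len(values) - block_size + 1):
        block = values[start:start + block_size]
        mean = sum(block) / block_size
        sd = math.sqrt(sum((v - mean) ** 2 for v in block) / (block_size - 1))
        out.append(100.0 * sd / abs(mean))
    return out

def first_stable_index(values, block_size, threshold_pct):
    """Index of the last sample in the first block whose RSD is below the threshold."""
    for start, rsd in enumerate(moving_block_rsd(values, block_size)):
        if rsd < threshold_pct:
            return start + block_size - 1
    return None  # the series never settled within the threshold

# Made-up process trend that levels off, for illustration only
trend = [10.0, 8.0, 6.5, 5.5, 5.1, 5.0, 5.0, 5.0]
print(first_stable_index(trend, block_size=3, threshold_pct=1.0))
```

The endpoint is declared when consecutive measurements stop changing relative to their own level, which is why a relative rather than absolute spread is used.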

11.
Consensus methods have emerged as promising tools for improving the reliability of quantitative models in near-infrared (NIR) spectroscopic analysis. A strategy for improving the performance of consensus methods in multivariate calibration of NIR spectra is proposed. In the approach, a subset of non-collinear variables is generated using the successive projections algorithm (SPA) for each variable in the spectra reduced by uninformative variables elimination (UVE). Sub-models are then built using the variable subsets and the calibration subsets determined by Monte Carlo (MC) re-sampling, and the sub-model that produces the minimal cross-validation error is selected as a member model. By repeating the MC re-sampling, a series of member models is built, and a consensus model is obtained by averaging all the member models. Since member models are built with the best variable subset and a randomly selected calibration subset, both the quality and the diversity of the member models are ensured for the consensus model. Two NIR spectral datasets of tobacco lamina are used to investigate the proposed method. The superiority of the method in both accuracy and reliability is demonstrated.

12.
This paper proposes an analytical method for simultaneous near-infrared (NIR) spectrometric determination of α-linolenic and linoleic acid in eight types of edible vegetable oils and their blends. For this purpose, a combination of spectral wavelength selection by wavelet transform (WT) and elimination of uninformative variables (UVE) was proposed to obtain simple partial least squares (PLS) models based on a small subset of wavelengths. WT was first used to compress the full NIR spectra, which contain 1413 redundant variables, and 42 wavelet approximation coefficients were obtained. UVE was then carried out to further select the informative variables. Finally, 27 and 19 wavelet approximation coefficients were selected by UVE for α-linolenic and linoleic acid, respectively. The selected variables were used as inputs to the PLS model. Because the original spectra were compressed and irrelevant variables were eliminated, a more parsimonious and efficient model based on WT-UVE was obtained compared with the conventional PLS model using full-spectrum data. For α-linolenic acid, the coefficient of determination (r²) and root mean square error of prediction (RMSEP) for the prediction set were 0.9345 and 0.0123 with the WT-UVE-PLS model. For linoleic acid, the r² and RMSEP were 0.9054 and 0.0437. This good performance shows the potential of WT-UVE for selecting effective NIR variables. WT-UVE can both speed up the calculation and improve the predicted results. The results indicate that it is feasible to rapidly determine α-linolenic acid and linoleic acid content in edible oils using NIR spectroscopy.

13.
The successive projections algorithm (SPA) is widely used to select variables for multiple linear regression (MLR) modeling. However, SPA used only once may not capture all the useful information in the full spectra, because the number of selected variables cannot exceed the number of calibration samples in the SPA algorithm. Therefore, the SPA-MLR method risks losing useful information. To make full use of the useful information in the spectra, a new method named "consensus SPA-MLR" (C-SPA-MLR) is proposed herein. This method combines the consensus strategy with the SPA-MLR method. In the C-SPA-MLR method, SPA-MLR is used to construct member models with different subsets of variables, which are selected iteratively from the remaining variables. A consensus prediction is obtained by combining the predictions of the member models. The proposed method is evaluated by analyzing the near infrared (NIR) spectra of corn and diesel. The C-SPA-MLR method showed better prediction performance than the SPA-MLR and full-spectrum PLS methods. Moreover, these results could serve as a reference for combining the consensus strategy with other variable selection methods when analyzing NIR spectra and data from other spectroscopic techniques.

14.
Sample selection is often used to improve the cost-effectiveness of near-infrared (NIR) spectral analysis. When raw NIR spectra are used, however, it is not easy to select appropriate samples, because of background interference and noise. In this paper, a novel adaptive strategy based on selection of representative NIR spectra in the continuous wavelet transform (CWT) domain is described. After pretreatment with the CWT, an extension of the Kennard–Stone (EKS) algorithm was used to adaptively select the most representative NIR spectra, which were then submitted to expensive chemical measurement and multivariate calibration. With the samples selected, a PLS model was finally built for prediction. Notably, selecting representative samples in the CWT domain, rather than from raw spectra, not only effectively eliminates background interference and noise but also further reduces the number of samples required for a good calibration, resulting in a high-quality regression model that is similar to the model obtained by use of all the samples. The results indicate that the proposed method can effectively enhance the cost-effectiveness of NIR spectral analysis. The strategy proposed here can also be applied to different analytical data for multivariate calibration.
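For orientation, the classic Kennard-Stone rule that the EKS algorithm extends can be sketched as below. This is the textbook max-min distance selection, not the adaptive extension described in the abstract, and it works on raw feature vectors rather than CWT coefficients; the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the classic Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(samples)")
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: squared_distance(samples[pair[0]], samples[pair[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select:
        # Pick the candidate whose nearest selected sample is farthest away.
        best = max(
            remaining,
            key=lambda i: min(squared_distance(samples[i], samples[s]) for s in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected

# Made-up one-dimensional "spectra", for illustration only
print(kennard_stone([[0.0], [1.0], [2.0], [10.0], [5.0]], n_select=3))
```

The rule spreads the chosen samples evenly over the measured space, so only those samples need the expensive reference measurement.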

15.
Near-infrared (NIR) spectrometry is now widely used in various fields, and great attention is paid to applying it to complex problems, which creates a need to calibrate systems that fail to exhibit a satisfactory linear relationship between input and output data. In this work we present a novel method for building a multivariate calibration model for NIR spectra, i.e. a genetic algorithm-radial basis function network in the wavelet domain (WT-GA-RBFN), which combines the advantages of the wavelet transform and the genetic algorithm. Variable selection is accomplished in two stages in the wavelet domain: in the first stage, the variables are pre-selected (compressed) by variance, and in the second stage the variables are further reduced by a specially designed GA. The proposed method is illustrated by applying it to three NIR data sets from different fields and comparing it with the PLS model.

16.
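Near-infrared spectroscopy was used for rapid quantitative determination of trans fatty acids (TFA) in edible vegetable oils, and the TFA prediction model was optimized through band selection, pretreatment, variable selection, and choice of modeling method. NIR transmission spectra of 98 edible vegetable oil samples were collected with an Antaris Ⅱ Fourier transform NIR spectrometer over the 4000 to 10000 cm⁻¹ range, and the true TFA content was then measured by gas chromatography. The band and pretreatment method were first optimized on the raw spectra. On this basis, competitive adaptive reweighted sampling (CARS) was used to select the important TFA-related variables. Finally, principal component regression, partial least squares, and least squares support vector machine were each used to build prediction models for TFA content in edible vegetable oils. The results show that NIR determination of TFA in edible vegetable oils is feasible. For the optimized best model, R² for the calibration and prediction sets was 0.992 and 0.989, and RMSEC and RMSEP were 0.071% and 0.075%, respectively. The best model used only 26 variables, 0.854% of the full-band variables. Compared with the full-band partial least squares model, the prediction-set R² rose from 0.904 to 0.989 and the RMSEP fell from 0.230% to 0.075%. This shows that model optimization is necessary: CARS effectively selects the important TFA-related variables and greatly reduces the number of modeling variables, which simplifies the prediction model and substantially improves its accuracy and stability.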

17.
A PLS model for prediction of somatic cell count (SCC) based on near-infrared (NIR) spectra of unhomogenized milk is presented in this study. Samples of raw milk were collected from cows in the early lactation period (from the 7th to the 29th day after parturition). The NIR spectra were measured in the region 400–1100 nm. A fluoro-opto-electronic method was applied as the reference method. Different preprocessing methods were investigated. The robust version of PLS regression was applied to handle outliers present in the dataset, and the uninformative variable elimination–partial least squares (UVE–PLS) method was used to eliminate uninformative variables. The final model is acceptable for prediction of SCC in raw milk.

18.
Metronidazole is a widely used antibacterial and amoebicide drug. The feasibility of classifying metronidazole samples by brand was investigated using near-infrared (NIR) spectroscopy along with chemometrics. A total of 92 samples from different lots and four brands were collected for measurement. First, principal component analysis was conducted to visualize the differences between metronidazole samples of different brands. Then, based on an effective classifier-independent method, i.e., joint mutual information, only the 30 most important variables were selected for modeling. On the independent test set, the partial least-squares discriminant analysis model based on the reduced variable set was compared with the corresponding full-spectrum model using all variables, and the model based on the reduced variable set outperformed the full-spectrum model. It appears that the combination of NIR spectroscopy, joint mutual information, and partial least-squares discriminant analysis is a potential method for classifying metronidazole from different brands and can, therefore, be used in the screening of counterfeit pharmaceutical products.

19.
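Principal component analysis-support vector regression modeling method and its application (total citations: 14; self-citations: 5; citations by others: 14)
Principal component analysis (PCA) was used for feature extraction from near-infrared spectra and combined with support vector regression (SVR), giving a principal component analysis-support vector regression (PCA-SVR) modeling method for quantitative NIR analysis. Compared with SVR alone, the method improved both the computation speed and the prediction accuracy of the model. PCA-SVR was applied to the determination of total sugar and total volatile base content in tobacco samples; the root mean square errors of prediction were 1.323 and 0.0477, and the recoveries were 91.8% to 112.6% and 88.9% to 120.2%, respectively.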

20.
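Wavelet transform-piecewise direct standardization for calibration transfer of near-infrared spectral models (total citations: 7; self-citations: 0; citations by others: 7)
A new transfer algorithm, wavelet transform-piecewise direct standardization (WT-PDS), is proposed, and the model transfer parameters and transfer results are discussed in detail. The spectra are first compressed by wavelet transform, the PDS algorithm is then used to remove the differences between the compressed data from different instruments, and the corrected compressed data are finally used for analysis, achieving model transfer. The method removes most of the differences between instruments and greatly improves analytical accuracy. The accuracy of the transferred model is closely tied to the robustness of the master-instrument model: if the master-instrument model is robust, it can be shared among different instruments. With this method, a total of 10 master-instrument models (cetane number, solidifying point, and distillation temperature of 0# light diesel; cetane number and solidifying point of -10# light diesel; and solidifying point and distillation temperature of -10# military diesel) were shared among 5 instruments, reducing the cost of modeling. Compared with conventional PDS, WT-PDS uses fewer transfer and modeling variables, runs faster, and corrects spectra better, while its model accuracy is essentially the same as that of conventional PDS.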
