Similar literature
19 similar articles found; search time: 78 ms
1.
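Grey wolf optimization (GWO), a swarm-intelligence algorithm, has few parameters, a simple structure, and is easy to implement, yet it has rarely been applied in spectroscopy. This study introduces GWO into variable selection for near-infrared (NIR) spectra. Using a corn dataset as an example, it examines the wolf pack performance, the number of iterations, the pack size, and the computational efficiency of GWO, and builds partial least squares (PLS) models to determine the protein, fat, moisture, and starch content of corn samples. The results show that GWO is computationally efficient. After parameter tuning, the PLS models retained 19, 19, 14, and 34 variables for protein, fat, moisture, and starch, respectively, and the root mean square error of prediction (RMSEP) fell from 0.2458, 0.1224, 0.3398, and 1.1058 with full-spectrum PLS to 0.1477, 0.0801, 0.1762, and 0.7398, reductions of 40%, 35%, 48%, and 33%, respectively; the correlation coefficients rose accordingly. GWO therefore optimizes quickly, selects few variables, and markedly improves the prediction accuracy of PLS models, making it an effective variable selection method for NIR spectroscopy.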
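As a quick consistency check on the figures quoted in this abstract, the sketch below recomputes the relative RMSEP reductions from the reported before and after values. The function and variable names are illustrative and do not come from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


# RMSEP values quoted in the abstract, in the order
# protein, fat, moisture, starch.
rmsep_full_pls = [0.2458, 0.1224, 0.3398, 1.1058]
rmsep_gwo_pls = [0.1477, 0.0801, 0.1762, 0.7398]

# Round to whole percentages, as the abstract does.
reductions = [
    round(relative_reduction(before, after))
    for before, after in zip(rmsep_full_pls, rmsep_gwo_pls)
]
print(reductions)  # [40, 35, 48, 33]
```

The recomputed values agree with the 40%, 35%, 48%, and 33% stated in the abstract.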

2.
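To address the poor predictive performance of NIR calibration models caused by redundant variables, this paper proposes an iteratively shrinking-window bootstrapping soft shrinkage (ISWBOSS) algorithm. The method partitions the variables into windows, randomly draws windows and builds sub-models from the variables they contain, normalizes the regression coefficients of the variables in each window, and uses the normalized values as weights for further weighted sampling, so that the variable space undergoes a gradual soft shrinkage. The window size is reduced over the iterations so that informative variables are searched more precisely. The method was validated on the corn dataset and compared with PLS models built with the full spectrum, the genetic algorithm, competitive adaptive reweighted sampling, and bootstrapping soft shrinkage; the new method showed clear advantages in both accuracy and stability. For corn protein prediction, ISWBOSS lowered the RMSEP from 0.0418 (bootstrapping soft shrinkage) to 0.0103, needed fewer iterations to reach the best model, and ran more efficiently. The method offers guidance for improving the performance of NIR calibration models.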

3.
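程介虹, 陈争光. 《分析化学》 (Chinese Journal of Analytical Chemistry), 2021, 49(8): 1402-1409
The successive projections algorithm (SPA) is a wavelength selection algorithm used in quantitative NIR analysis to reduce model complexity and improve prediction accuracy. By its principle, SPA only guarantees low redundancy between the two wavelengths selected in adjacent projections; it does not guarantee that the selected variables are informative, so the subset chosen by SPA may contain uninformative or even interfering variables. Therefore, by iteratively retaining informative variables (...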

4.
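To improve the prediction accuracy and modeling efficiency of quantitative NIR analysis, a wavelength selection method based on interactive self-modeling mixture analysis is proposed. Wavelength variables carrying useful information are selected according to the purity value and standard deviation of each wavelength variable, and a correlation weight function is introduced to handle collinearity among variables. Quantitative calibration models are built from the variables selected in successive iterations, and the optimal number of wavelength variables is determined by the root mean square error of cross-validation (RMSECV). The method was applied to two sets of NIR data with different glucose concentrations (a four-component aqueous glucose solution experiment and a human plasma experiment). In each case only 0.3% of all variables were selected to build the calibration model, and the root mean square error of prediction (RMSEP) for glucose concentration in the validation sets was reduced to 669 and 15 mg/L, respectively. Compared with calibration models built on the full spectrum and on optimized wavebands, the method minimizes redundant information through wavelength selection and improves prediction accuracy and modeling efficiency.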

5.
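Variable selection is often used to optimize linear calibration models for NIR spectra, removing redundant information and improving the accuracy and interpretability of the regression. This paper designs a Monte Carlo based method to assess how close to optimal different linear calibration methods can get within the subspaces produced by variable selection, and thus to find the limit to which variable selection can optimize a linear calibration model. The method obtains the distribution of a validation metric, the root mean square error of prediction (RMSEP), to reveal the optimization effect and optimization limit of variable selection on a dataset. Applied to NIR modeling of three sample sets, it gave an optimizable rate of about 24.98% with a 15.2% reduction in RMSEP on the tobacco-pectin dataset, about 13.90% with a 9.5% reduction on the wheat-protein dataset, and about 14.05% with a 57.1% reduction on the corn-starch dataset. The method quickly yields the optimization limit of variable selection for a model, providing a reference for the design, application, and evaluation of variable selection methods.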

6.
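陈笑, 宦克为, 赵环, 范恒晔, 韩雪艳. 《分析化学》 (Chinese Journal of Analytical Chemistry), 2021, 49(10): 1743-1749
NIR spectroscopy is widely used in food inspection, quantitative analysis, and related fields. Variable selection is a key step in NIR modeling and plays an important role in improving model stability and predictive performance. This study proposes an NIR variable selection method, variable frequency weighted bootstrap sampling (FWBS), ...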

7.
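Color correction of color cameras is essential for consistent color reproduction in imaging. Conventional camera color correction mostly uses polynomial regression on the measured data to determine the color calibration coefficients, which limits accuracy. This paper therefore proposes a LASSO-based high-order polynomial regression for fitting the measured data. By exploiting LASSO's coefficient shrinkage, it improves the correction accuracy of the regression model while keeping computational complexity under control. A ColorChecker 24-patch chart was imaged under a D65 standard illuminant, and the correction was evaluated with the CIELAB color-difference formula. The results show that the new method clearly outperforms conventional linear regression and second-order polynomial regression, reaching an average color difference of 5 CIELAB units.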
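The evaluation metric in this abstract can be illustrated with a small sketch. The abstract does not say which CIELAB color-difference formula was used; the code below assumes the simplest one, CIE76, which is the Euclidean distance between two (L*, a*, b*) triples. The function names and the sample values are hypothetical.

```python
import math


def delta_e_cie76(lab_1, lab_2):
    """CIE76 color difference: Euclidean distance in (L*, a*, b*) space."""
    return math.dist(lab_1, lab_2)


def mean_delta_e(corrected, reference):
    """Average CIE76 difference over paired patches (e.g. the 24 chart patches)."""
    diffs = [delta_e_cie76(c, r) for c, r in zip(corrected, reference)]
    return sum(diffs) / len(diffs)


# Made-up values for two patches, only to show the call pattern.
reference_patches = [(50.0, 0.0, 0.0), (70.0, 10.0, -10.0)]
corrected_patches = [(50.0, 3.0, 4.0), (70.0, 10.0, -10.0)]
print(mean_delta_e(corrected_patches, reference_patches))
```

If the paper used a later formula such as CIEDE2000, the averaging step would stay the same and only `delta_e_cie76` would change.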

8.
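Dual-pulse laser-induced breakdown spectroscopy (LIBS) was used to quantify fenthion in solution. A two-channel high-precision spectrometer recorded LIBS spectra of fenthion samples at different concentrations over the 206.28 to 481.77 nm range. The spectra were preprocessed with multiplicative scatter correction (MSC), standard normal variate (SNV) transformation, and 3-point smoothing, and the best preprocessing method was chosen on the basis of partial least squares (PLS) modeling. Competitive adaptive reweighted sampling (CARS) was then used to select the important variables related to fenthion, and PLS regression was applied to build a quantitative model for fenthion in solution, which was compared with a univariate model and a PLS model without variable selection. The CARS-PLS model performed better than both: its coefficient of determination and mean relative error were 0.9694 and 15.537% for the calibration set and 0.9959 and 5.016% for the prediction set. Moreover, compared with the raw-spectrum PLS model, the CARS-PLS model used only 1.9% of the wavelength variables, yet the mean error on the prediction set fell from 9.829% to 5.016%. LIBS is therefore feasible for detecting fenthion in solution, and CARS simplifies the quantitative model and improves its prediction accuracy.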
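Of the preprocessing steps named in this abstract, SNV is the easiest to show in a few lines. The sketch below uses the usual textbook definition (center each spectrum by its own mean and scale by its own standard deviation); using the sample standard deviation is an assumption here, and the intensity values are invented for illustration.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: per-spectrum centering and scaling."""
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(value - mean) / std for value in spectrum]


# One hypothetical spectrum with five intensity readings.
raw = [1.0, 2.0, 3.0, 4.0, 5.0]
transformed = snv(raw)
print(transformed)
```

After the transform each spectrum has zero mean and unit standard deviation, which removes additive offsets and multiplicative scaling that differ from one spectrum to the next.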

9.
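汪若馨, 闫广河, 刘鹏, 张妍, 卞希慧. 《分析化学》 (Chinese Journal of Analytical Chemistry), 2024, 52(11): 1717-1725
NIR spectroscopy is simple, fast, and nondestructive, and has become a widely used method for qualitative and quantitative analysis of complex systems. However, NIR spectra usually contain many redundant wavelengths unrelated to the target component, which degrades the predictive performance of the model, so spectral variables must be selected before modeling. This study is the first to discretize the mayfly algorithm (MA) and use it for quantitative NIR analysis. MA simulates the courtship and mating behavior of mayflies. Equal numbers of female and male mayflies are initialized, and their positions are updated and discretized. Males attract females, and offspring are produced through mating between well-matched pairs and through mutation, with the number of offspring fixed at 20. The offspring are added to the original population, and the best individuals are retained up to the population size, so that the population size stays constant after each iteration; the new generation then enters the next iteration. This process repeats until the maximum number of iterations is reached. The performance of MA was verified with NIR data of corn and of adulterated vegetable oil. Three parameters of MA were optimized: the gravity coefficient, the number of iterations, and the population size. Partial least squares (PLS) models were built from the MA-selected variables and the content of the component to be analyzed, and compared with full-spectrum PLS models. For the corn dataset, the root mean square error of prediction (RMSEP) of the MA-PLS models for oil, moisture, protein, and starch was 30.59%, 40.24%, 36.96%, and 27.93% lower than that of the PLS models, respectively; for the adulterated vegetable oil dataset, the RMSEP for perilla seed oil, soybean oil, corn oil, and cottonseed oil was 83.85%, 90.90%, 81.60%, and 92.18% lower, respectively. In addition, MA-PLS used significantly fewer variables than the PLS model. MA therefore effectively reduces the complexity of PLS models and improves their prediction accuracy.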
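The survivor-selection step described in this abstract (merge the offspring into the population, then keep the best individuals so the population size stays fixed) can be sketched as follows. This is only an illustration of that one step under stated assumptions: individuals are represented as binary wavelength masks, and the fitness is a value to minimize, such as a cross-validation error. The stand-in fitness used in the example (number of selected variables) is purely hypothetical and is not the paper's criterion.

```python
def keep_best(population, offspring, fitness, size):
    """Merge offspring into the population and keep the `size` fittest.

    `fitness` maps an individual to a score where lower is better
    (an assumption made for this sketch).
    """
    merged = population + offspring
    return sorted(merged, key=fitness)[:size]


# Binary masks over three wavelengths; 1 means the wavelength is selected.
population = [(1, 0, 1), (1, 1, 1)]
offspring = [(0, 0, 1)]

# Hypothetical stand-in fitness: fewer selected variables is better.
survivors = keep_best(population, offspring, fitness=sum, size=len(population))
print(survivors)
```

In a full implementation the fitness would come from fitting and validating a PLS model on the wavelengths each mask selects.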

10.
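With large numbers of molecular descriptors being used in QSAR/QSPR, selecting a descriptor set with good stability and predictive ability has become a pressing bottleneck. After preliminary pre-screening of 1664 descriptors for 63 organic compounds, partial least squares (PLS) was used for variable selection, yielding 42 important descriptors. Forty-three randomly selected compounds were used to train a model of permeability through polyethylene film, giving a model with good estimation ability and stability (A=6, r2=0.9647, RMSE=0.213, q2=0.8364, RMSV=0.467). Predictions for 20 external compounds showed that the model has good predictive ability (rp2=0.9306, RMSP=0.326). PLS-based variable selection can quickly and effectively identify the important descriptors closely related to activity and thus build QSAR models with good stability and predictive ability.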

11.
In the past decade, there has been an increase in the use of sparse multivariate calibration methods in chemometrics. Sparsity describes a parsimonious state of model complexity and can be defined in terms of a subset of samples or covariates (e.g., wavelengths) that are used to define the calibration model. With respect to their classical counterparts such as principal component regression or partial least squares, sparse models are more easily interpretable and have been shown to exhibit non‐inferior prediction performance. However, sparse methods are still not as fast as the classical methods in spite of recent numerical advances. In addition, for many chemometricians, sparse methods are still “black‐box” algorithms whose internal workings are not well understood. In this paper, we describe a simple framework whereby classical multivariate calibration methods can be iteratively used to generate sparse models. Moreover, this approach allows for either wavelength or sample sparsity. We demonstrate the effectiveness of this approach on two spectroscopic data sets. Copyright © 2013 John Wiley & Sons, Ltd.

12.
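张进, 胡芸, 周罗雄, 李博岩. 《分析测试学报》 (Journal of Instrumental Analysis), 2020, 39(10): 1196-1203
NIR spectroscopy is a green, fast analytical technique widely used in scientific research, industrial production, and routine testing. Chemometric algorithms have played an important role in its development. Chemometric methods look for correlations among measured variables, build mathematical models, quantify differences between samples, and uncover the underlying patterns of change, enabling reasonably accurate prediction of unknowns. This is also a key element and central aim of the "big data" strategy. NIR absorption signals are weak, peaks overlap heavily, and measurements are easily disturbed by background, noise, uninformative variables, and external environmental factors, all of which degrade the qualitative and quantitative models that relate spectra to the target of study. Addressing these problems, this paper reviews new chemometric methods proposed in recent years for NIR spectroscopy, including spectral preprocessing, variable selection, multivariate calibration, and model transfer, and discusses from different angles their role in removing interfering factors from NIR models and improving model reliability, prediction accuracy, and applicability.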

13.
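Hyperspectral imaging was used to rapidly identify and classify bacterial colonies (Escherichia coli, Listeria, and Staphylococcus aureus) on culture media. Hyperspectral reflectance images (390 to 1040 nm) of colonies on agar were acquired, colony spectra were extracted automatically after Otsu threshold segmentation of the band-difference image, and full-wavelength and simplified partial least squares discriminant analysis (PLS-DA) models were built for classification. The full-wavelength model achieved a classification accuracy of 100% and a confidence prediction classification accuracy of 95.9% on the prediction set. In addition, competitive adaptive reweighted sampling (CARS), a genetic algorithm (GA), and least angle regression (LARS-Lasso) were used to select wavelengths and build the corresponding simplified models. The CARS simplified model outperformed the GA and LARS-Lasso simplified models in precision, stability, and classification accuracy, reaching a classification accuracy of 100% and a confidence prediction classification accuracy of 98.0% on the prediction set. The study shows that hyperspectral imaging is an effective method for high-precision, rapid, nondestructive identification of bacterial colonies. The wavelengths selected in the simplified models can provide a theoretical basis for developing low-cost detection instruments.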
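The segmentation step mentioned in this abstract relies on Otsu thresholding, which picks the gray level that maximizes the between-class variance of the two resulting pixel groups. The sketch below is a plain implementation of that general technique on a flat list of integer gray levels; how the paper builds its band-difference image is not described in the abstract, so the input here is a made-up example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, the rest the other.
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    count_low, sum_low = 0, 0
    for t in range(levels):
        count_low += hist[t]
        sum_low += t * hist[t]
        if count_low == 0:
            continue  # lower class still empty
        count_high = total - count_low
        if count_high == 0:
            break  # upper class empty from here on
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        between_var = count_low * count_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# Hypothetical image: dark background (10) and bright colony pixels (200).
image = [10] * 50 + [200] * 50
threshold = otsu_threshold(image)
colony_mask = [value > threshold for value in image]
print(threshold, sum(colony_mask))
```

Spectra would then be extracted only from the pixels where the mask is true.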

14.
    
Audio magnetic tapes manufactured using polyester urethane are known to become nonplayable over time due to the degradation of the magnetic layer. Attempting to play degraded tapes to digitize them can cause extensive damage to the tape as well as to the playback device. For this reason, most of the magnetic tapes in cultural heritage institutions are in a critical state. The purpose of our study is to preserve historical recordings on magnetic tapes by developing a nondestructive technique to determine degradation status. Our approach is to combine attenuated total reflectance Fourier transform infrared spectroscopy (ATR FT-IR) with chemometric techniques, especially neural networks and the least absolute shrinkage and selection operator (Lasso). The neural network model classified tapes as playable or nonplayable with 97% to 98% accuracy when similar tape brands/models were in the training and test sets. With different brands/models in the test set, the neural network model performed poorly. Lasso, however, showed 95.5% accuracy for similar brands/models and 80.5% accuracy for different tape brands/models. This suggests that Lasso is the better technique for determining whether a tape is degraded.

15.
Flow injection analysis (FIA) with multiwavelength scanning of the FIA peaks using a diode array detector (DAD) has been combined with a multivariate calibration approach that applies the partial least squares (PLS) method for data evaluation. In this way, various side effects such as dilution of the reagent, a high blank, absorbance changes due to the pH gradient throughout the peak, and other interferences can be accounted for. Thus, even with simple FIA manifold instrumentation, satisfactory results are obtained in multicomponent analysis. The method was checked on the analysis of binary (Ca and Mg) and ternary (Ca, Mg and Cu) mixtures with pyridylazo resorcinol (PAR) as reagent and applied to the rapid determination of calcium and magnesium in dialysis liquids and waters.

16.
This article studies calibration maintenance and transfer to build a statistical model that is able to predict analyte concentrations by a set of spectra. Noticing that the wavelength atoms are naturally ordered in a meaningful way, we propose a novel robust fused LASSO (RFL) based on high‐dimensional sparsity techniques and a recent Θ‐IPOD technique for robustification. This new approach can attain simultaneous wavelength selection and grouping as well as outlier identification, without any human intervention. An efficient and scalable algorithm is developed on the basis of the alternating direction method of multipliers. The obtained RFL model is sparse and shows improved prediction performance over the LASSO and ridge regression. Our results reveal that wavelengths can be combined into blocks, in a smart manner, to enhance the interpretability and reliability for super‐resolution spectral analysis. Copyright © 2013 John Wiley & Sons, Ltd.
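To make the "selection and grouping" idea in this abstract concrete, the sketch below evaluates the standard fused LASSO penalty on an ordered coefficient vector: one term penalizes nonzero coefficients, the other penalizes differences between neighboring coefficients. This is the textbook (non-robust) penalty only; the robustification via Θ‐IPOD and the exact objective of the paper are not reproduced, and the parameter names `lam_sparse` and `lam_fuse` are illustrative.

```python
def fused_lasso_penalty(beta, lam_sparse, lam_fuse):
    """Standard fused LASSO penalty for ordered coefficients.

    lam_sparse * sum |beta_j|  +  lam_fuse * sum |beta_j - beta_{j-1}|
    """
    sparsity = sum(abs(b) for b in beta)
    fusion = sum(abs(b1 - b0) for b0, b1 in zip(beta, beta[1:]))
    return lam_sparse * sparsity + lam_fuse * fusion


# Same total magnitude, arranged as one block versus scattered.
blocked = [0.0, 0.0, 2.0, 2.0, 2.0, 0.0]
scattered = [2.0, 0.0, 2.0, 0.0, 2.0, 0.0]
print(fused_lasso_penalty(blocked, 1.0, 0.5))
print(fused_lasso_penalty(scattered, 1.0, 0.5))
```

The blocked vector incurs the smaller penalty, which is why a fused penalty tends to merge adjacent wavelengths into contiguous blocks.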

17.
18.
The application of Raman spectroscopic techniques combined with multivariate chemometric signal processing promises new means for the rapid, non‐destructive multidimensional analysis of metabolites, with little or no sample preparation and little sensitivity to water. However, Rayleigh scattering, fluorescence and uncontrolled variance present substantial challenges for the accurate quantitative analysis of metabolites at physiological levels in biologically varying samples. Effective strategies include the application of chemometric pretreatments for reducing Raman spectral interference. However, the arbitrary application of individual or combined pretreatment procedures can significantly alter the outcome of a measurement, thereby complicating spectral analysis. This paper evaluates and compares six signal pretreatment methods for correcting baseline variance, together with three variable selection methods for eliminating uninformative variables, all within the context of multivariate calibration models based on partial least squares (PLS) regression. Raman spectra of 90 artificial bio‐fluid samples with eight urine metabolites at near‐physiological concentrations were used to test these models. The combination of multiplicative scatter correction (MSC), continuous wavelet transform (CWT), randomization test (RT) and PLS modeling gave the best performance for all the metabolites. The correlation coefficient (R) between predicted and prepared concentrations reached as high as 0.96.

19.
Consensus methods are promising tools for improving the reliability of quantitative models in near-infrared (NIR) spectroscopic analysis. A strategy for improving the performance of consensus methods in multivariate calibration of NIR spectra is proposed. In the approach, a subset of non-collinear variables is generated using the successive projections algorithm (SPA) for each variable in the spectra reduced by uninformative variable elimination (UVE). Sub-models are then built using the variable subsets and the calibration subsets determined by Monte Carlo (MC) re-sampling, and the sub-model that produces the minimal error in cross validation is selected as a member model. With repetition of the MC re-sampling, a series of member models is built, and a consensus model is obtained by averaging all the member models. Since member models are built with the best variable subset and a randomly selected calibration subset, both the quality and the diversity of the member models are ensured for the consensus model. Two NIR spectral datasets of tobacco lamina are used to investigate the proposed method. The superiority of the method in both accuracy and reliability is demonstrated.
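The two mechanical pieces of this strategy, Monte Carlo re-sampling of calibration subsets and averaging of member models, can be sketched briefly. The code below is a minimal illustration under stated assumptions: member models are represented as plain callables, the SPA/UVE variable selection and cross-validation steps are omitted, and all names are hypothetical.

```python
import random


def monte_carlo_subsets(n_samples, subset_size, n_models, seed=0):
    """Draw one random calibration subset (sample indices) per member model."""
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(n_samples), subset_size))
        for _ in range(n_models)
    ]


def consensus_predict(member_models, x):
    """Consensus prediction: the average of all member-model predictions."""
    predictions = [model(x) for model in member_models]
    return sum(predictions) / len(predictions)


# Three toy member models standing in for fitted sub-models.
members = [lambda x: 2 * x, lambda x: 2 * x + 1, lambda x: 2 * x - 1]
subsets = monte_carlo_subsets(n_samples=10, subset_size=4, n_models=3)
print(subsets)
print(consensus_predict(members, 3.0))
```

Averaging cancels part of the error that individual members make in different directions, which is the motivation for requiring both accurate and diverse member models.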
