
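Flavor Style Discrimination by Fusion of Mid-Infrared and Near-Infrared Data
Cite this article: SHA Yun-fei, HUANG Wen, WANG Liang, LIU Tai-ang, YUE Bao-hua, LI Min-jie, YOU Jing-lin, GE Jiong, XIE Wen-yan. Flavor style discrimination by fusion of mid-infrared and near-infrared data [J]. Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2021, 41(2): 473-477.
Authors: SHA Yun-fei  HUANG Wen  WANG Liang  LIU Tai-ang  YUE Bao-hua  LI Min-jie  YOU Jing-lin  GE Jiong  XIE Wen-yan
Affiliations: Technology Center, Shanghai Tobacco Group Co., Ltd., Shanghai 200082, China; Department of Chemistry, Shanghai University, Shanghai 200444, China
Funding: Major Science and Technology Project of China National Tobacco Corporation (中烟办[2016]259); National Natural Science Foundation of China, Young Scientists Fund (21706156).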
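Abstract: Discriminating the flavor type of flue-cured tobacco has long been a focus of the tobacco industry. Mid-infrared (MIR) and near-infrared (NIR) spectroscopy were used to analyze 189 tobacco leaf samples of different flavor types. Absorbance values at 21 characteristic wavenumbers in the MIR spectra and at 13 characteristic wavenumbers in the NIR spectra were extracted as variables. Principal component analysis (PCA) was applied separately to the selected MIR and NIR data for a qualitative analysis of three flavor styles: fen-flavor, medium-flavor, and robust-flavor. In the PCA projections based on the MIR or NIR data alone, the three flavor styles overlapped severely and the class boundaries were unclear. The MIR and NIR data were then fused: the absorbance values at all 34 characteristic wavenumbers were submitted to PCA together, and the resulting projection clearly separated tobacco leaves of different flavor types. Backward elimination and a genetic algorithm were then used for variable selection on the 34 fused absorbance values; backward elimination selected 24 variables and the genetic algorithm selected 19. A comparison of the PCA projections of the three flavor styles built from 34, 24, and 19 variables showed that the genetic algorithm, although it selected fewer variables, still classified the leaves accurately. The genetic algorithm was therefore used for variable selection on the fused data, removing the factors with little influence on flavor classification. Finally, a support vector machine was used to build a classification model for the fen-flavor, medium-flavor, and robust-flavor styles. The modeling (training) accuracy was 92.72%, with class accuracies of 93.75%, 92.11%, and 91.84% for the fen-flavor, medium-flavor, and robust-flavor styles, respectively. The leave-one-out internal cross-validation accuracy was 88.74%, with class accuracies of 90.63%, 86.84%, and 87.76%. The prediction accuracy on unknown samples was 86.84%, with class accuracies of 88.24%, 85.71%, and 85.71%. The modeling, leave-one-out, and prediction accuracies were all above 85%. The results indicate that fusing MIR and NIR data provides more characteristic information, which can be used to build a classification model for tobacco flavor styles and to support their rapid identification.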

Keywords: Mid-infrared spectroscopy  Near-infrared spectroscopy  Flue-cured tobacco  Data fusion
Received: 2020-02-17
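
The fusion step described in the abstract, in which 21 MIR and 13 NIR absorbance values are joined into one 34-variable row per sample, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names `autoscale` and `fuse` are hypothetical, the absorbance values are synthetic placeholders, and scaling each block to zero mean and unit variance before joining is an assumption that the abstract does not state.

```python
import random
import statistics


def autoscale(rows):
    """Z-score each column (mean 0, sample standard deviation 1)."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.stdev(col) for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stdevs)]
        for row in rows
    ]


def fuse(mir_rows, nir_rows):
    """Low-level fusion: scale each block, then join the blocks sample by sample."""
    if len(mir_rows) != len(nir_rows):
        raise ValueError("MIR and NIR blocks must describe the same samples")
    mir_scaled = autoscale(mir_rows)  # assumption: per-block autoscaling
    nir_scaled = autoscale(nir_rows)
    return [m + n for m, n in zip(mir_scaled, nir_scaled)]


# Synthetic placeholder absorbances, NOT data from the paper.
rng = random.Random(0)
n_samples = 189
mir = [[rng.uniform(0.0, 1.0) for _ in range(21)] for _ in range(n_samples)]
nir = [[rng.uniform(0.0, 2.0) for _ in range(13)] for _ in range(n_samples)]

fused = fuse(mir, nir)
print(len(fused), len(fused[0]))  # 189 34
```

The fused rows could then be passed to PCA or to a classifier; scaling first keeps the block with larger raw absorbances from dominating the projection.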

Merging MIR and NIR Spectral Data for Flavor Style Determination
SHA Yun-fei,HUANG Wen,WANG Liang,LIU Tai-ang,YUE Bao-hua,LI Min-jie,YOU Jing-lin,GE Jiong,XIE Wen-yan.Merging MIR and NIR Spectral Data for Flavor Style Determination[J].Spectroscopy and Spectral Analysis,2021,41(2):473-477.
Authors:SHA Yun-fei  HUANG Wen  WANG Liang  LIU Tai-ang  YUE Bao-hua  LI Min-jie  YOU Jing-lin  GE Jiong  XIE Wen-yan
Institution:1. Technology Center of Shanghai Tobacco Group Co., Ltd., Shanghai 200082, China 2. Department of Chemistry, Shanghai University, Shanghai 200444, China
Abstract: Determining the flavor type of tobacco is an important topic in the tobacco industry. In this work, 189 tobacco samples of different flavor types were measured by mid-infrared (MIR) and near-infrared (NIR) spectroscopy. Absorbance values at 21 characteristic wavenumbers in the MIR spectra and at 13 characteristic wavenumbers in the NIR spectra were selected as variables. The characteristic data extracted from the MIR and NIR spectra were first submitted to principal component analysis (PCA) separately. The PCA projections showed poor classification when the MIR or NIR data were used alone. The MIR and NIR variables were then submitted to PCA as merged data, and the resulting projection showed good classification: the three flavor styles (fen-flavor, medium-flavor, and robust-flavor) were clearly separated into their categories. After the PCA analysis, two variable-selection methods, backward elimination and a genetic algorithm, were applied to the 34 variables used in the PCA model; backward elimination selected 24 variables and the genetic algorithm selected 19. A comparison of the projections obtained with the different variable sets showed that, although the genetic algorithm kept the fewest variables, its classification was as good as that obtained with all 34 variables or with the backward-elimination set. The genetic algorithm was therefore chosen for variable selection on the merged MIR and NIR data. Finally, a support vector classification (SVC) model was built on the variables selected by the genetic algorithm to discriminate the tobacco flavor styles. The modeling accuracy was 92.72%, with accuracies of 93.75%, 92.11%, and 91.84% for the fen-flavor, medium-flavor, and robust-flavor styles, respectively. The model was also tested by leave-one-out cross-validation (LOOCV); the LOOCV accuracy was 88.74%, with class accuracies of 90.63%, 86.84%, and 87.76%. The prediction accuracy on unknown samples was 86.84%, with class accuracies of 88.24%, 85.71%, and 85.71%. The accuracy was above 85% in the model test, the LOOCV test, and the prediction of unknown samples. The results show that merging MIR and NIR spectral data provides more information for model building and offers an efficient route to fast discrimination of tobacco flavor styles.
Keywords:Middle infrared spectrum  Near infrared spectrum  Tobacco flavor  Data fusion
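
The leave-one-out cross-validation reported in the abstract, with one overall accuracy and one accuracy per flavor style, can be illustrated with a short sketch. To stay dependency-free it swaps in a nearest-centroid classifier in place of the paper's support vector classifier, so it shows the validation loop only, not the authors' model. The function names, class sizes, and data are all hypothetical and synthetic.

```python
import math
import random
from collections import defaultdict


def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x (stand-in for SVC)."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(train_x, train_y):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [s + v for s, v in zip(sums[label], row)]
        counts[label] += 1
    best_label, best_dist = None, math.inf
    for label, total in sums.items():
        centroid = [s / counts[label] for s in total]
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(xs, ys):
    """Hold out each sample once; return overall and per-class accuracy."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predicted = nearest_centroid_predict(train_x, train_y, xs[i])
        totals[ys[i]] += 1
        hits[ys[i]] += int(predicted == ys[i])
    per_class = {label: hits[label] / totals[label] for label in totals}
    overall = sum(hits.values()) / len(xs)
    return overall, per_class


# Synthetic, well-separated placeholder data with 19 variables; NOT from the paper.
rng = random.Random(1)
styles = ["fen", "medium", "robust"]
xs, ys = [], []
for shift, style in enumerate(styles):
    for _ in range(20):  # hypothetical class size
        xs.append([rng.gauss(3.0 * shift, 1.0) for _ in range(19)])
        ys.append(style)

overall, per_class = leave_one_out(xs, ys)
print(f"overall: {overall:.2%}")
for style in styles:
    print(f"{style}: {per_class[style]:.2%}")
```

The overall figure is the count-weighted average of the per-class figures, which is why an overall accuracy always lies between the lowest and highest class accuracy.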
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.