Research on Genetic Algorithm Based on Mutual Information in the Spectrum Selection
KONG Qing-qing, GONG Hui-li, DING Xiang-qian, LIU Ming. Research on Genetic Algorithm Based on Mutual Information in the Spectrum Selection[J]. Spectroscopy and Spectral Analysis, 2018, 38(1): 31-35. DOI: 10.3964/j.issn.1000-0593(2018)01-0031-05
Authors: KONG Qing-qing  GONG Hui-li  DING Xiang-qian  LIU Ming
Affiliation: College of Information Science and Engineering, Ocean University of China, Qingdao 266100, Shandong, China
Funding: Supported by a project of the National Key Technology R&D Program of China (2015BAF12B01)
Received: 2017-02-28
Abstract: Establishing an accurate and robust quantitative model is vital in near-infrared spectroscopy. Full-spectrum modeling increases the time needed for modeling and prediction and reduces the robustness and prediction accuracy of the model, so an effective variable selection method is essential for model construction. To address this problem, this paper proposes a genetic algorithm based on mutual information (GAs-MI) for feature selection. Mutual information first filters out a large amount of irrelevant and redundant information, and the genetic algorithm then selects the highly discriminative features. The Shapley value method is introduced into the mutation step of the genetic algorithm to reduce the randomness of manually set parameters. To validate the algorithm, 273 representative tobacco leaf samples were used as experimental material: 182 randomly selected samples were used to build a PLS quantitative model of total nicotine in tobacco, and the remaining 91 samples served as the test set. The correlation coefficient (R), the root mean square error of cross-validation (RMSECV), and the root mean square error of prediction (RMSEP) were used as model evaluation indices. The experimental results show that the model built on the wavelengths selected by this method is simpler and has stronger predictive ability.
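The mutual-information pre-filtering stage described in the abstract can be illustrated with a minimal sketch. The abstract does not state which mutual information estimator or cut-off the authors used, so the equal-width histogram estimator, the top-`keep` rule, and all function names below (`discretize`, `mutual_information`, `mi_prefilter`) are assumptions made for illustration. The data are synthetic, not the tobacco spectra, and the genetic algorithm and Shapley value steps are not shown.

```python
import math
import random
from collections import Counter


def discretize(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: everything falls in one bin
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=8):
    """Histogram estimate of I(X; Y) in nats for two equal-length sequences."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(x)
    count_x, count_y = Counter(bx), Counter(by)
    count_xy = Counter(zip(bx, by))
    mi = 0.0
    for (i, j), c in count_xy.items():
        # p(i, j) * log( p(i, j) / (p(i) * p(j)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[i] * count_y[j]))
    return mi


def mi_prefilter(spectra, target, keep, n_bins=8):
    """Return the indices of the `keep` wavelengths with the highest MI
    to the target, together with the MI score of every wavelength."""
    n_wavelengths = len(spectra[0])
    scores = []
    for w in range(n_wavelengths):
        column = [row[w] for row in spectra]
        scores.append(mutual_information(column, target, n_bins))
    ranked = sorted(range(n_wavelengths), key=lambda w: scores[w], reverse=True)
    return sorted(ranked[:keep]), scores


# Synthetic stand-in for a calibration set: 182 samples, 20 "wavelengths".
# Only columns 3 and 7 carry information about the target (an assumption
# of this toy example, chosen so the filter has something to find).
random.seed(0)
n_samples, n_wavelengths = 182, 20
spectra = [[random.random() for _ in range(n_wavelengths)]
           for _ in range(n_samples)]
target = [row[3] + row[7] + random.gauss(0, 0.01) for row in spectra]

selected, scores = mi_prefilter(spectra, target, keep=5)
print("wavelength indices kept by the MI filter:", selected)
```

In a pipeline of the kind the abstract outlines, the indices in `selected` would form the reduced search space handed to the genetic algorithm. A histogram estimator is biased upward for small samples, so the bin count and the number of wavelengths kept would need tuning on real spectra.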
Keywords: Near-infrared spectrum  Mutual information  Shapley value  Genetic algorithm  Wavelength selection