
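A Classification Method of Coal and Gangue Based on XGBoost and Visible-Near Infrared Spectroscopy
Cite this article: LI Rui, LI Bo, WANG Xue-wen, LIU Tao, LI Lian-jie, FAN Shu-xiang. A Classification Method of Coal and Gangue Based on XGBoost and Visible-Near Infrared Spectroscopy[J]. Spectroscopy and Spectral Analysis, 2022, 42(9): 2947-2955.
Authors: LI Rui  LI Bo  WANG Xue-wen  LIU Tao  LI Lian-jie  FAN Shu-xiang
Affiliations: 1. College of Mechanical and Vehicle Engineering, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China
2. Beijing Research Center of Intelligent Equipment for Agriculture, Beijing 100097, China
Funding: National Natural Science Foundation of China (51804207, 51875386); Shanxi Province "1331" Project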
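Abstract: Intelligent recognition of coal and gangue is a new technology that urgently needs to be developed to make fully mechanized caving mining intelligent. Visible-near infrared spectroscopy is environmentally friendly and works in real time, so it meets the requirements of intelligent coal and gangue separation. To address coal and gangue recognition based on visible-near infrared spectra, the extreme gradient boosting tree (XGBoost) algorithm, which has performed well in data science competitions, was introduced. A visible-near infrared spectroscopy experimental platform was built to collect reflectance spectra, in the 370-1049 nm band, of lump coal and gangue samples from the Ximing (Shanxi), Shenmu (Shaanxi), and Balongtu (Inner Mongolia) coal mines. The raw spectra were preprocessed with black-and-white correction, removal of the start and end bands, Savitzky-Golay (SG) convolution smoothing, and standard normal variate (SNV) transformation to reduce the effects of uneven illumination, noise, and optical path differences. The samples were divided into an experimental group and a test group according to the differences between the coal and gangue reflectance spectra of the three mines. The differences in the experimental group are slight, so it was used to compare the performance of different models and select the best algorithm; the differences in the test group are more pronounced, so it was used to test how the best algorithm performs on other mines and to check its applicability across mines. In the experimental-group experiments, a coal and gangue classification model was first built with XGBoost and compared against three commonly used machine learning classifiers: k-nearest neighbors (KNN), random forest (RF), and support vector machine (SVM). XGBoost performed best, with an average 10-fold cross-validation accuracy (ACC10), classification accuracy (ACC), and AUC of 0.9572, 0.9705, and 0.9716, respectively, indicating strong stability and classification ability. Next, to reduce the data dimension and the computational load of the model, recursive feature elimination (RFE), the successive projections algorithm (SPA), and competitive adaptive reweighted sampling (CARS) were each used to select characteristic wavelengths and were combined with the four classifiers above to build simplified classification models. The simplified model combining RFE and XGBoost performed best, with ACC10, ACC, and AUC of 0.9657, 0.9803, and 0.9803, respectively, and the data dimension reduced to 9; it improved the stability and classification ability of the model while reducing the data dimension. In the test-group experiments, the models built with the selected XGBoost and RFE-XGB algorithms also recognized coal and gangue from the other mines stably and accurately, and the simplified model again performed better, consistent with the experimental-group results.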

Keywords: XGBoost  Visible-near infrared spectroscopy  Coal and gangue separation  Black background  Nondestructive detection
Received: 2021-10-19
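
The preprocessing chain named in the abstract (black-and-white correction, removal of the start and end bands, SG smoothing, SNV) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `preprocess`, the number of trimmed bands, the SG window and polynomial order, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np
from scipy.signal import savgol_filter


def preprocess(raw, white, dark, trim=20, window=11, polyorder=2):
    """Return preprocessed spectra, one row per sample.

    All default parameter values are illustrative assumptions.
    `trim` must be at least 1 for the slicing below to work.
    """
    raw = np.asarray(raw, dtype=float)
    # Black/white correction: relative reflectance from dark and white references.
    refl = (raw - dark) / (white - dark)
    # Drop the first and last `trim` bands, where sensor noise is usually highest.
    refl = refl[:, trim:-trim]
    # Savitzky-Golay smoothing along the wavelength axis.
    smooth = savgol_filter(refl, window_length=window, polyorder=polyorder, axis=1)
    # SNV: centre and scale each spectrum by its own mean and standard deviation.
    mean = smooth.mean(axis=1, keepdims=True)
    std = smooth.std(axis=1, ddof=1, keepdims=True)
    return (smooth - mean) / std


# Synthetic stand-in for measured spectra (8 samples, 256 bands).
rng = np.random.default_rng(0)
n_samples, n_bands = 8, 256
dark = np.full(n_bands, 100.0)
white = np.full(n_bands, 4000.0)
raw = dark + (white - dark) * rng.uniform(0.05, 0.6, size=(n_samples, n_bands))

spectra = preprocess(raw, white, dark)
print(spectra.shape)  # (8, 216)
```

After SNV every spectrum has zero mean and unit standard deviation, which is what removes multiplicative scaling between samples before a classifier sees them.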

A Classification Method of Coal and Gangue Based on XGBoost and Visible-Near Infrared Spectroscopy
LI Rui, LI Bo, WANG Xue-wen, LIU Tao, LI Lian-jie, FAN Shu-xiang. A Classification Method of Coal and Gangue Based on XGBoost and Visible-Near Infrared Spectroscopy[J]. Spectroscopy and Spectral Analysis, 2022, 42(9): 2947-2955.
Authors:LI Rui  LI Bo  WANG Xue-wen  LIU Tao  LI Lian-jie  FAN Shu-xiang
Institution:1. College of Mechanical and Vehicle Engineering, Taiyuan University of Technology, Taiyuan 030024, China
2. Beijing Research Center of Intelligent Equipment for Agriculture, Beijing 100097, China
Abstract:Intelligent recognition of coal and gangue is a new technology that urgently needs to be developed to make fully mechanized caving mining intelligent. Visible-near infrared spectroscopy is environmentally friendly and works in real time, so it meets the requirements of intelligent coal and gangue separation. The extreme gradient boosting tree (XGBoost) algorithm, which has performed well in data science competitions, was introduced to recognize coal and gangue from visible-near infrared spectra. First, a visible-near infrared spectroscopy experimental platform was built to collect the reflectance spectra of lump coal and gangue samples from the Ximing (Shanxi), Shenmu (Shaanxi), and Balongtu (Inner Mongolia) coal mines in the range of 370-1049 nm. The raw spectra were preprocessed with black-and-white correction, removal of the start and end bands, Savitzky-Golay (SG) smoothing, and standard normal variate (SNV) transformation to reduce the effects of uneven illumination, noise, and optical path differences. Second, the samples were divided into an experimental group and a test group according to the differences between the coal and gangue reflectance spectra of the three mines. The differences in the experimental group were minor, so it was used to compare the performance of different models and select the best algorithm; the differences in the test group were more obvious, so it was used to test the best algorithm on other coal mines and verify its applicability across mines. In the experimental-group experiments, a coal and gangue classification model was built with XGBoost, and the commonly used machine learning classifiers k-nearest neighbors (KNN), random forest (RF), and support vector machine (SVM) were introduced for comparison. XGBoost performed best: the average accuracy of 10-fold cross-validation (ACC10), the classification accuracy (ACC), and the AUC reached 0.9572, 0.9705, and 0.9716, respectively, showing strong stability and classification ability. Then, to reduce the data dimension and the amount of computation, recursive feature elimination (RFE), the successive projections algorithm (SPA), and competitive adaptive reweighted sampling (CARS) were each used to select characteristic wavelengths and were combined with the four classification algorithms above to build simplified classification models. The simplified model combining RFE and XGBoost (RFE-XGB) performed best in testing, with ACC10, ACC, and AUC of 0.9657, 0.9803, and 0.9803, respectively, and the data dimension reduced to 9. The simplified model thus improved stability and classification ability while reducing the data dimension. In the test-group experiments, the models based on XGBoost and RFE-XGB also achieved stable and accurate recognition of coal and gangue from the other coal mines, and the simplified model again performed better, consistent with the results of the experimental group.
Keywords:XGBoost  Visible and near-infrared  Coal and gangue separation  Black background  Nondestructive detection  
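
The model comparison described in the abstract (10-fold cross-validated accuracy, hold-out accuracy, AUC, and an RFE step that keeps 9 features) can be sketched as below. This is not the authors' code. To stay within scikit-learn, `GradientBoostingClassifier` is used here as a stand-in for XGBoost; `xgboost.XGBClassifier` follows the same estimator interface and could be substituted in `boosted()`. The synthetic data, the hyperparameters, the 70/30 split, and the RFE step size are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for preprocessed spectra: 300 samples, 40 "bands".
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def boosted():
    # Stand-in for XGBoost (assumption); hyperparameters are illustrative.
    return GradientBoostingClassifier(n_estimators=50, random_state=0)


models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "GB": boosted(),
    # RFE ranks features by the booster's importances and keeps 9 of them;
    # a second booster is then trained on the reduced feature set.
    "RFE-GB": make_pipeline(
        RFE(boosted(), n_features_to_select=9, step=5), boosted()
    ),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    # ACC10: mean accuracy over 10 stratified folds of the training set.
    acc10 = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy").mean()
    model.fit(X_train, y_train)
    # ACC and AUC on the held-out split.
    acc = accuracy_score(y_test, model.predict(X_test))
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    results[name] = {"ACC10": acc10, "ACC": acc, "AUC": auc}

for name, scores in results.items():
    print(name, {k: round(v, 4) for k, v in scores.items()})

n_selected = models["RFE-GB"].named_steps["rfe"].n_features_
print("features kept by RFE:", n_selected)
```

Placing RFE inside the pipeline means the wavelength selection is refit within every cross-validation fold, so the ACC10 estimate is not inflated by selecting features on data that is later used for validation.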