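NIR Model Optimization Study of Larch Wood Density Based on IFSR Abnormal Sample Elimination
Cite this article: ZHANG Zhe-yu, LI Yao-xiang, WANG Zhi-yuan, LI Chun-xu. NIR Model Optimization Study of Larch Wood Density Based on IFSR Abnormal Sample Elimination[J]. Spectroscopy and Spectral Analysis, 2022, 42(11): 3395-3402.
Authors: ZHANG Zhe-yu  LI Yao-xiang  WANG Zhi-yuan  LI Chun-xu
Institution: College of Engineering and Technology, Northeast Forestry University, Harbin 150040, Heilongjiang, China
Funding: Supported by the National Key Research and Development Program of China (2017YFC0504103), the Double First-Class Special Fund, Innovative Talent Training Project (000/41113102), and the Heilongjiang Applied Technology Research and Development Program (GA19C006, GA21C030)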
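Abstract: Wood density is an important physical property of wood and reflects several others, including shrinkage and compressive and tensile strength. Near-infrared (NIR) spectroscopy allows rapid prediction of wood density and avoids the labor, material, and time costs of traditional measurement, but the resulting models are often degraded by abnormal samples. To identify and remove such samples accurately, this study proposes IFSR, a method that combines isolation forest with studentized residuals. It builds on the ensemble nature of isolation forest while also accounting for each sample's influence on the model, so that abnormal samples and high-influence samples are detected together. NIR spectra and room-temperature air-dry density were measured for 181 larch wood samples. After comparing several preprocessing and feature selection methods, the spectra were preprocessed with standard normal variate transformation (SNV) + detrending (DT) + mean centering (MC) + autoscaling (Auto), and characteristic bands were selected with competitive adaptive reweighted sampling (CARS). This removes the effect of noise and irrelevant information on the algorithm, simplifies the dataset, and improves the accuracy of abnormal-sample removal. To verify IFSR's ability to remove abnormal samples, it was compared with six other anomaly detection methods, including Monte Carlo cross-validation (MCCV) and Mahalanobis distance (MD), with partial least squares (PLS) models built to evaluate detection performance. On this basis, NIR prediction models of larch wood density were built with particle swarm optimization-support vector regression (PSO-SVR), a BP neural network (BPNN), and PLS. The results show that the model optimized with IFSR combined with PSO-SVR has the strongest predictive ability, and that IFSR effectively removes abnormal samples and improves model accuracy.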
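The preprocessing chain named in the abstract (SNV, detrending, mean centering, autoscaling) can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function names, the second-order polynomial used for detrending, and the use of the sample standard deviation (ddof=1) are assumptions, since the abstract does not specify them. Spectra are assumed to be stored as the rows of a 2-D array.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: scale each spectrum (row) by its own mean and std."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def detrend(spectra, order=2):
    """Subtract a least-squares polynomial baseline from each spectrum.

    The polynomial order is an assumption; the abstract does not state it.
    """
    x = np.linspace(-1.0, 1.0, spectra.shape[1])   # scaled wavelength index
    V = np.vander(x, order + 1)                    # (n_points, order + 1)
    coefs, *_ = np.linalg.lstsq(V, spectra.T, rcond=None)
    return spectra - (V @ coefs).T

def mean_center(spectra):
    """Subtract the mean of each wavelength (column) across samples."""
    return spectra - spectra.mean(axis=0, keepdims=True)

def autoscale(spectra):
    """Center each wavelength and divide by its standard deviation across samples."""
    return mean_center(spectra) / spectra.std(axis=0, ddof=1, keepdims=True)

def preprocess(spectra):
    """Apply SNV, detrending, mean centering, and autoscaling in the listed order.

    Autoscaling re-centers the columns, so the explicit centering step does not
    change the result; it is kept only to mirror the chain given in the abstract.
    """
    return autoscale(mean_center(detrend(snv(spectra))))
```

In a calibration and prediction workflow, the column means and standard deviations would normally be estimated on the calibration set and reused for new spectra; the sketch computes them on whatever array it is given.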

Keywords: Near-infrared  Wood density  Outlier detection  Isolation forest algorithm  Support vector regression
Received: 2021-09-23

NIR Model Optimization Study of Larch Wood Density Based on IFSR Abnormal Sample Elimination
ZHANG Zhe-yu, LI Yao-xiang, WANG Zhi-yuan, LI Chun-xu. NIR Model Optimization Study of Larch Wood Density Based on IFSR Abnormal Sample Elimination[J]. Spectroscopy and Spectral Analysis, 2022, 42(11): 3395-3402.
Authors: ZHANG Zhe-yu  LI Yao-xiang  WANG Zhi-yuan  LI Chun-xu
Institution: College of Engineering and Technology, Northeast Forestry University, Harbin 150040, China
Abstract: Wood density is an important physical property of wood that reflects several others, such as shrinkage and compressive and tensile strength. Near-infrared (NIR) spectroscopy can predict wood density rapidly and avoids the labor, material, and time costs of traditional testing methods. However, the modeling results are often affected by abnormal samples. To identify and eliminate abnormal samples accurately, a method combining isolation forest with studentized residuals (IFSR) is proposed. It builds on the ensemble character of isolation forest and additionally considers each sample's influence on the model, so abnormal samples and high-influence samples can be detected simultaneously. The NIR spectra of 181 larch wood samples and their air-dry density at room temperature were measured. After comparing a variety of preprocessing and feature selection methods, standard normal variate transformation (SNV) + detrending (DT) + mean centering (MC) + autoscaling (Auto) was adopted for preprocessing, and competitive adaptive reweighted sampling (CARS) was adopted for feature wavelength selection. These steps reduce the influence of noise and irrelevant information on the algorithm, simplify the dataset, and improve the accuracy of abnormal sample removal. To verify the ability of IFSR to eliminate abnormal samples, it was compared with six other anomaly detection methods, including Monte Carlo cross-validation (MCCV) and Mahalanobis distance (MD), and partial least squares (PLS) models were established to evaluate anomaly detection performance. On this basis, particle swarm optimization-support vector regression (PSO-SVR), a BP neural network (BPNN), and PLS were each used to establish NIR prediction models of larch wood density. The results show that the optimized model obtained by IFSR combined with PSO-SVR has the strongest predictive ability, and that IFSR can effectively eliminate abnormal samples and improve model accuracy.
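As a rough illustration of the IFSR idea, the sketch below scores samples with a small isolation forest and with studentized residuals, then flags a sample if either criterion fires. Several choices are assumptions rather than the authors' method: the abstract does not say how the two criteria are combined, so a simple union is used; ordinary least squares stands in for the PLS model; the residuals are internally studentized; and all function names and the cutoffs `score_cut` and `resid_cut` are hypothetical. The forest draws a fresh random tree for every sample in every round instead of storing trees, which leaves the expected path length unchanged but is a simplification of the usual implementation.

```python
import numpy as np

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (np.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def path_length(x, data, rng, depth, limit):
    """Follow x through random axis-parallel splits of data until it is isolated."""
    n = data.shape[0]
    if n <= 1 or depth >= limit:
        return depth + c_factor(n)
    j = rng.integers(data.shape[1])              # random feature
    lo, hi = data[:, j].min(), data[:, j].max()
    if lo == hi:                                 # cannot split on this feature
        return depth + c_factor(n)
    split = rng.uniform(lo, hi)                  # random split value
    left = data[:, j] < split
    branch = data[left] if x[j] < split else data[~left]
    return path_length(x, branch, rng, depth + 1, limit)

def isolation_scores(X, n_trees=100, subsample=256, seed=0):
    """Anomaly score in (0, 1]; values near 1 mean the sample is isolated quickly."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    psi = min(subsample, n)
    limit = int(np.ceil(np.log2(psi)))           # usual height limit
    depth_sum = np.zeros(n)
    for _ in range(n_trees):
        sample = X[rng.choice(n, size=psi, replace=False)]
        depth_sum += np.array([path_length(x, sample, rng, 0, limit) for x in X])
    return 2.0 ** (-(depth_sum / n_trees) / c_factor(psi))

def studentized_residuals(X, y):
    """Internally studentized residuals of an OLS fit with intercept (stand-in for PLS)."""
    A = np.column_stack([np.ones(len(y)), X])
    hat = A @ np.linalg.pinv(A)                  # projection ("hat") matrix
    resid = y - hat @ y
    dof = len(y) - np.linalg.matrix_rank(A)
    sigma2 = resid @ resid / dof
    return resid / np.sqrt(sigma2 * (1.0 - np.diag(hat)))

def ifsr_flags(X, y, score_cut=0.7, resid_cut=3.0):
    """Flag a sample if it is isolated in X OR has a large studentized residual.

    Both cutoffs are hypothetical defaults, not values from the paper.
    """
    iso = isolation_scores(X)
    t = studentized_residuals(X, y)
    return (iso > score_cut) | (np.abs(t) > resid_cut)
```

Given a feature matrix `X` (for example, the CARS-selected wavelengths) and measured densities `y`, computing `keep = ~ifsr_flags(X, y)` and then taking `X[keep]` and `y[keep]` would give the retained calibration set under these assumptions.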
Keywords: Near-infrared spectrum  Wood density  Outlier detection  Isolation forest algorithm  Support vector regression