
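Near-Infrared Spectroscopy Combined with the Random Forest Algorithm: A Fast and Effective Strategy for Origin Traceability of Fuzi
Cite this article:GONG Sheng, ZHU Ya-ning, ZENG Chen-juan, MA Xiu-ying, PENG Cheng, GUO Li. Near-Infrared Spectroscopy Combined with the Random Forest Algorithm: A Fast and Effective Strategy for Origin Traceability of Fuzi[J]. Spectroscopy and Spectral Analysis, 2022, 42(12): 3823-3829.
Authors:GONG Sheng  ZHU Ya-ning  ZENG Chen-juan  MA Xiu-ying  PENG Cheng  GUO Li
Affiliations:1. State Key Laboratory of Southwestern Chinese Medicine Resources, School of Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, Sichuan
2. Yaan Sanjiu Pharmaceutical Co., Ltd., Yaan 625000, Sichuan
3. Sichuan Jianengda Panxi Pharmaceuticals Industry Co., Ltd., Butuo 616350, Sichuan
Funding:Supported by the National Key R&D Program of China (2017YFC1701804) and the Major Program of the National Natural Science Foundation of China (81891012)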
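Abstract:Reliable origin-certification methods are essential for protecting high-value Chinese medicinal materials from designated regions (for example, geo-authentic medicinal materials and geographical indication products). Fuzi, a well-known traditional Chinese medicine and a geo-authentic medicinal material of Sichuan, has marked efficacy and wide clinical use, and is in great demand in domestic and international markets. The efficacy and price of Fuzi differ by origin, and it is difficult for the public to identify the origin accurately by traditional experience. Mass spectrometry under a plant-metabolomics approach requires tedious and lengthy sample preparation, complicated operation and long detection times, and its reproducibility is relatively low. Near-infrared spectroscopy, a mature, fast and non-destructive detection technique, offers a new route to online quality supervision and control of Chinese medicinal materials when integrated with machine learning. A non-destructive model for identifying Fuzi of different origins was built on near-infrared spectroscopy combined with the random forest algorithm. A total of 255 Fuzi samples were collected from the main cultivation regions of Sichuan, Shaanxi and Yunnan, and diffuse reflectance spectra of all samples were acquired by Fourier transform near-infrared spectroscopy. Single and combined spectral preprocessing methods were applied to remove multiple interferences in the spectra; the best preprocessing method was selected and its output used as the input to the random forest model. The overall performance of the model was evaluated by sensitivity, specificity and balanced accuracy. The results showed that Savitzky-Golay smoothing plus multiplicative scatter correction was the best preprocessing method. Using only the full-wavelength data, the RF model predicted the three provincial groups with an accuracy above 90%, and the accuracy reached 98.39% after preprocessing. For samples at the city/county level, the RF model also discriminated well, with an accuracy above 75%. For samples from cultivation areas surrounding the geo-authentic production area, the recognition rate reached 100%. After the top 100 feature wavenumbers were selected and the model re-optimized, the recognition accuracy for every city/county-level region exceeded 85%, with a marked improvement for some samples from plateau areas. The study adopted an environmentally friendly traceability strategy with faster analysis, less sample loss and higher accuracy, providing a new mode for fast and efficient identification of Fuzi of different origins and a reference for the identification and traceability of Fuzi and its processed products.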

Keywords:Fuzi  Origin  Traceability  Near-infrared spectroscopy  Machine learning  Random forest
Received:2021-11-14

Near-Infrared Spectroscopy Combined With Random Forest Algorithm: A Fast and Effective Strategy for Origin Traceability of Fuzi
GONG Sheng, ZHU Ya-ning, ZENG Chen-juan, MA Xiu-ying, PENG Cheng, GUO Li. Near-Infrared Spectroscopy Combined With Random Forest Algorithm: A Fast and Effective Strategy for Origin Traceability of Fuzi[J]. Spectroscopy and Spectral Analysis, 2022, 42(12): 3823-3829.
Authors:GONG Sheng  ZHU Ya-ning  ZENG Chen-juan  MA Xiu-ying  PENG Cheng  GUO Li
Institution:1. State Key Laboratory of Southwestern Chinese Medicine Resources, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
2. Yaan Sanjiu Pharmaceutical Co., Ltd., Yaan 625000, China
3. Sichuan Jianengda Panxi Pharmaceuticals Industry Co., Ltd., Butuo 616350, China
Abstract:Effective and reliable methods of origin certification are essential for protecting high-value Chinese medicinal materials from designated regions (e.g., geo-authentic Chinese medicinal materials and geographical indication products). As a famous traditional Chinese medicine and a geo-authentic Chinese medicinal material produced in Sichuan Province, Aconiti Lateralis Radix Praeparata (Fuzi) has a remarkable curative effect and wide clinical application, and is in great demand in domestic and international markets. The efficacy and price of Fuzi vary with origin, and it is difficult for the public to identify the origin accurately through traditional experience. Mass spectrometry based on plant metabolomics involves tedious and lengthy sample preparation, complicated operation and long detection times, and has low reproducibility. Near-infrared (NIR) spectroscopy, a mature, fast and non-destructive detection technique, opens new routes for online quality supervision and control of Chinese medicinal materials when integrated with machine learning. Therefore, a non-destructive identification model based on NIR spectroscopy combined with a random forest (RF) algorithm was developed for Fuzi of different origins. A total of 255 samples of Fuzi were collected from the major cultivation regions of Sichuan, Shaanxi and Yunnan, and the diffuse reflectance spectra of all samples were obtained using Fourier transform NIR spectroscopy. Single and combined spectral preprocessing methods were used to eliminate multiple interferences in the spectra, and the best preprocessing method was selected and its output used as the input for building the RF model. The comprehensive performance of the RF model was evaluated using sensitivity, specificity and balanced accuracy.
The results showed that Savitzky-Golay 11-point smoothing combined with multiplicative scatter correction (MSC) was the best preprocessing method. Using only the full-wavelength data, the prediction accuracy of the RF model for the three provincial groups of samples exceeded 90%, and the prediction accuracy after preprocessing reached 98.39%. For the city/county-level samples, the RF model also showed excellent discriminative ability, with an accuracy greater than 75%. The RF model achieved a 100% recognition rate for samples from cultivation areas around the traditional production areas. After the top 100 feature wavenumbers were selected and the model was re-optimized, the recognition accuracy for each city/county-level region exceeded 85%, and the recognition of some samples from the highlands in particular improved significantly. In this study, an environment-friendly traceability strategy with faster analysis, less sample loss and higher accuracy was adopted, providing a new model for the rapid and efficient identification of Fuzi of different origins and a reference for the subsequent identification and traceability of Fuzi and its related processed products.
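The preprocessing combination reported as best, Savitzky-Golay 11-point smoothing followed by MSC scatter correction, can be sketched as below. This is a minimal illustration and not the authors' code: the abstract does not give the polynomial order or the MSC reference spectrum, so the quadratic fit (`polyorder=2`) and the mean-spectrum reference are assumptions. MSC is written in its common form (multiplicative scatter correction), and the spectra are synthetic stand-ins rather than Fuzi measurements.

```python
import numpy as np
from scipy.signal import savgol_filter


def sg_smooth(spectra, window_length=11, polyorder=2):
    """Savitzky-Golay smoothing along the wavenumber axis (one spectrum per row).

    window_length=11 follows the abstract; polyorder=2 is an assumption.
    """
    return savgol_filter(spectra, window_length=window_length,
                         polyorder=polyorder, axis=1)


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum x is regressed on the reference r (x ~ a + b * r) and
    replaced by (x - a) / b. The mean spectrum is used when no reference
    is supplied (an assumption; the abstract does not state the reference).
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, x in enumerate(spectra):
        slope, intercept = np.polyfit(ref, x, deg=1)
        corrected[i] = (x - intercept) / slope
    return corrected, ref


# Synthetic stand-in data: one band shape distorted by per-sample gain,
# offset and a little noise, which is the kind of variation MSC targets.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(4000, 10000, 300)
base = np.exp(-((wavenumbers - 7000) / 800) ** 2)
gains = np.array([[0.8], [0.9], [1.0], [1.1], [1.2]])
offsets = np.array([[-0.10], [-0.05], [0.0], [0.05], [0.10]])
raw = offsets + gains * base + rng.normal(0, 0.002, size=(5, 300))

smoothed = sg_smooth(raw)
corrected, ref = msc(smoothed)
```

When a model is later applied to new samples, the `ref` returned from the training spectra would normally be passed back in as `reference`, so that all spectra are corrected against the same baseline.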
Keywords:Fuzi  Origin  Traceability  Near-infrared spectroscopy  Machine learning  Random forest  
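The modelling steps named in the abstract (a random forest classifier, per-class sensitivity, specificity and balanced accuracy, and a refit on the top 100 feature wavenumbers) can be sketched with scikit-learn as follows. This is a hedged illustration, not the published pipeline: the data are synthetic, the region labels are placeholders, the hyperparameters and the 70/30 split are arbitrary, and ranking wavenumbers by the forest's impurity-based `feature_importances_` is an assumption, since the abstract does not say how the top 100 were chosen.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split


def per_class_metrics(y_true, y_pred, labels):
    """One-vs-rest sensitivity, specificity and balanced accuracy per class."""
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    results = {}
    for k, label in enumerate(labels):
        tp = cm[k, k]
        fn = cm[k, :].sum() - tp
        fp = cm[:, k].sum() - tp
        tn = cm.sum() - tp - fn - fp
        sensitivity = float(tp / (tp + fn))
        specificity = float(tn / (tn + fp))
        results[label] = {
            "sensitivity": sensitivity,
            "specificity": specificity,
            "balanced_accuracy": (sensitivity + specificity) / 2,
        }
    return results


# Synthetic stand-in for preprocessed spectra: 3 placeholder regions,
# 300 "wavenumbers", each region shifted in its own narrow band.
rng = np.random.default_rng(0)
n_per_class, n_features = 50, 300
labels = ["region_A", "region_B", "region_C"]  # hypothetical labels
X = rng.normal(0.0, 1.0, size=(3 * n_per_class, n_features))
y = np.repeat(labels, n_per_class)
for k in range(3):
    X[k * n_per_class:(k + 1) * n_per_class, 20 * k:20 * k + 10] += 3.0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Model on the full spectrum.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
full_metrics = per_class_metrics(y_test, rf.predict(X_test), labels)

# Keep the 100 highest-ranked features and refit on that subset.
top_k = 100
top_idx = np.argsort(rf.feature_importances_)[::-1][:top_k]
rf_top = RandomForestClassifier(n_estimators=200, random_state=0)
rf_top.fit(X_train[:, top_idx], y_train)
top_metrics = per_class_metrics(y_test, rf_top.predict(X_test[:, top_idx]), labels)
```

Balanced accuracy is taken here as the mean of sensitivity and specificity for each class in a one-vs-rest view, which keeps a large class from masking poor recognition of a small one. The importances are computed on the training split only, so the held-out samples play no part in choosing the wavenumbers.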