
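Identification of Bacteria in Water by Multi-Wavelength Transmission Spectral Feature Extraction Combined with a Support Vector Machine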
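Author affiliations: Key Laboratory of Environmental Optics and Technology, Anhui Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Hefei 230031, Anhui, China; University of Science and Technology of China, Hefei 230026, Anhui, China; Key Laboratory of Optical Monitoring Technology for Environment, Anhui Province, Hefei 230031, Anhui, China; Key Laboratory of Environmental Optics and Technology, Anhui Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Hefei 230031, Anhui, China; Key Laboratory of Optical Monitoring Technology for Environment, Anhui Province, Hefei 230031, Anhui, China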
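Funding: Supported by the National Natural Science Foundation of China (61875254, 61705237, 61805254) and the Key Research and Development Program of Anhui Province (1804a0802192)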
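Abstract: Rapid identification and detection of waterborne pathogenic bacteria is of practical importance for preventing and controlling large-scale disease outbreaks caused by microbial contamination of water. Conventional bacterial detection methods such as biochemical identification and nucleic acid testing are time-consuming and require precision laboratory instruments, so they cannot meet the need for rapid, real-time online monitoring of bacteria in water. Because the multi-wavelength transmission spectrum of bacteria carries rich characteristic information, and because this spectral technique is fast, simple, non-contact, and non-polluting, it has become a focus of bacterial detection research in recent years. Taking Klebsiella pneumoniae, Staphylococcus aureus, Salmonella typhimurium, Pseudomonas aeruginosa, and Escherichia coli as the study objects, the bacterial spectra were normalized and subjected to analysis of variance to find the characteristic wavelength interval in which the spectra vary most significantly. Within that interval, the absorbance at 200 nm and the slopes of the short-wavelength bands were extracted as spectral feature values and combined with a support vector machine to predict the bacterial species. The results show that normalization of the multi-wavelength transmission spectra effectively removes the influence of concentration while retaining the complete original spectral information. Analysis of variance gives 200-300 nm as the characteristic wavelength interval. The feature values extracted in this interval from the normalized spectral trend curves of the five bacteria are, respectively: absorbance at 200 nm of 0.0065, 0.0051, 0.0075, 0.0075, and 0.0085; slopes over the 200-245 nm band of -62.45, -35.94, -81.30, -82.67, and -103.49; slopes over the 250-275 nm band of -15.48, -14.82, -20.91, -13.92, and -26.21; and slopes over the 280-300 nm band of -29.96, -24.62, -33.71, -36.09, and -30.88. Feature values were extracted from the samples, which were then randomly divided into a training set and a test set. The support vector machine used the penalty-factor model with a linear kernel function, and an optimization algorithm determined the best penalty parameter c and kernel parameter g. Applied to the test set, the model identified all five bacterial species with a prediction accuracy of 100.0%. In summary, with suitable data preprocessing, clearly distinct spectral feature values can be extracted from the multi-wavelength transmission spectra of waterborne pathogenic bacteria, and these feature values combined with a support vector machine can effectively identify different bacterial species. The method provides important technical support for rapid identification and real-time online monitoring of bacteria in water.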
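The normalization and analysis-of-variance step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say which normalization was used, so scaling each spectrum to unit sum is an assumption, and the function names `normalize`, `anova_f`, and `f_by_wavelength` are hypothetical. The idea is that wavelengths with a large one-way ANOVA F statistic are those where species differ most relative to within-species scatter.

```python
from statistics import fmean


def normalize(spectrum):
    """Scale a spectrum so its values sum to 1 (assumed normalization scheme)."""
    total = sum(spectrum)
    if total == 0:
        raise ValueError("spectrum sums to zero")
    return [a / total for a in spectrum]


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of scalar observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = fmean([x for g in groups for x in g])
    means = [fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        # No within-group scatter: F is undefined; report 0 if groups also agree.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def f_by_wavelength(spectra_by_species):
    """Return the F statistic at each wavelength index.

    spectra_by_species maps a species name to a list of spectra,
    all sampled on the same wavelength grid.
    """
    groups = list(spectra_by_species.values())
    n_points = len(groups[0][0])
    return [
        anova_f([[s[i] for s in spectra] for spectra in groups])
        for i in range(n_points)
    ]
```

Under these assumptions, the characteristic interval would be chosen as the contiguous wavelength range where the F values are largest; the abstract reports 200-300 nm for the authors' data.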

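Keywords: Multi-wavelength transmission spectrum  Bacteria  Feature extraction  Support vector machine  Classification and identification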
Received: 2020-09-10

Study on Multi-Wavelength Transmission Spectral Feature Extraction Combined With Support Vector Machine for Bacteria Identification
Authors:FENG Chun  ZHAO Nan-jing  YIN Gao-fang  GAN Ting-ting  CHEN Xiao-wei  CHEN Min  HUA Hui  DUAN Jing-bo  LIU Jian-guo
Institution:1. Key Laboratory of Environmental Optics and Technology, Anhui Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Hefei 230031, China 2. University of Science and Technology of China, Hefei 230026, China 3. Key Laboratory of Optical Monitoring Technology for Environment, Anhui Province, Hefei 230031, China
Abstract: Rapid identification of pathogenic bacteria is of practical importance for preventing large-scale disease outbreaks caused by microbial pollution of water bodies. Conventional bacterial detection methods such as biochemical identification and nucleic acid detection are time-consuming and require precision laboratory equipment, so they are inadequate for rapid, real-time online monitoring of bacteria. Because the multi-wavelength transmission spectrum of bacteria contains abundant characteristic information, and because this spectral detection technique is fast, simple, non-contact, and non-polluting, it has become a hot spot in bacterial detection research in recent years. This article takes Klebsiella pneumoniae, Staphylococcus aureus, Salmonella typhimurium, Pseudomonas aeruginosa, and Escherichia coli as research objects. The characteristic wavelength range with the most significant spectral change is obtained by normalization followed by analysis of variance; the absorbance at 200 nm and the slopes of the short-wavelength bands are extracted from this range as spectral feature values, and a support vector machine (SVM) is used to predict the bacterial species. The results show that normalization of the multi-wavelength transmission spectrum effectively eliminates the concentration effect and retains the complete original spectral information. The characteristic wavelength range of 200-300 nm is obtained by analysis of variance. The feature values extracted in this range from the normalized spectral trend curves of the five bacteria are, respectively: absorbance at 200 nm of 0.0065, 0.0051, 0.0075, 0.0075, and 0.0085; slopes over the 200-245 nm band of -62.45, -35.94, -81.30, -82.67, and -103.49; slopes over the 250-275 nm band of -15.48, -14.82, -20.91, -13.92, and -26.21; and slopes over the 280-300 nm band of -29.96, -24.62, -33.71, -36.09, and -30.88. Feature values were extracted from the samples, which were randomly divided into a training set and a test set. The penalty-factor model and the linear kernel function were selected for the SVM, and the best penalty parameter c and kernel parameter g were determined by an optimization algorithm. The prediction accuracy for each of the five bacterial species reached 100.0%. In summary, clearly distinct spectral feature values can be extracted from the multi-wavelength transmission spectra of bacteria through proper data preprocessing, and these feature values combined with a support vector machine can be used effectively to identify different bacterial species. This method provides important technical support for rapid identification and real-time online monitoring of bacteria in water.
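The feature extraction described in the abstract (absorbance at 200 nm plus the slopes over three short-wavelength bands) might be sketched as below. This is only an illustration under stated assumptions: the abstract does not say how the slopes were computed or scaled, so an ordinary least-squares line fit per band is assumed here, and the sketch makes no attempt to reproduce the reported values. The band limits come from the abstract; the names `BANDS`, `band_slope`, and `extract_features` are hypothetical.

```python
# Band limits in nm, taken from the abstract.
BANDS = [(200, 245), (250, 275), (280, 300)]


def band_slope(wavelengths, absorbance, lo, hi):
    """Least-squares slope of absorbance versus wavelength within [lo, hi] nm."""
    pts = [(w, a) for w, a in zip(wavelengths, absorbance) if lo <= w <= hi]
    if len(pts) < 2:
        raise ValueError("need at least two points in the band")
    n = len(pts)
    mean_w = sum(w for w, _ in pts) / n
    mean_a = sum(a for _, a in pts) / n
    sxx = sum((w - mean_w) ** 2 for w, _ in pts)
    sxy = sum((w - mean_w) * (a - mean_a) for w, a in pts)
    return sxy / sxx


def extract_features(wavelengths, absorbance):
    """Return [A(200 nm), slope 200-245, slope 250-275, slope 280-300]."""
    # Use the sample closest to 200 nm in case the grid does not hit it exactly.
    i200 = min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - 200))
    slopes = [band_slope(wavelengths, absorbance, lo, hi) for lo, hi in BANDS]
    return [absorbance[i200]] + slopes
```

Each normalized spectrum then yields a four-element feature vector, which is the kind of input the abstract says was passed to the support vector machine.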
Keywords: Multi-wavelength transmission spectrum  Bacteria  Feature extraction  Support vector machine  Classification and identification
This article is indexed in CNKI, Wanfang Data, and other databases.
Source journal: 《光谱学与光谱分析》 (Spectroscopy and Spectral Analysis)