
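Fast Generation Method for 2D-Haar Acoustic Feature Super Vectors
Cite this article: XIE Er-man, LUO Sen-lin, PAN Li-min. Fast Generation Method for 2D-Haar Acoustic Feature Super Vectors[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2016, 36(3): 295-301.
Authors: XIE Er-man, LUO Sen-lin, PAN Li-min
Affiliation: Information System and Security and Countermeasures Experimental Center, Beijing Institute of Technology, Beijing 100081, China
Funding: National "242" Program Project (2005C48); Beijing Institute of Technology Science and Technology Innovation Program Project (2011CX01015)
Abstract: To support high-speed processing of large volumes of audio, a fast method for generating acoustic feature super vectors is proposed that improves both the speed and the accuracy of audio recognition systems. The method first assembles common acoustic features from several consecutive audio frames into an acoustic feature image, then uses low-complexity operations to quickly extract Haar-like acoustic features, whose dimensionality reaches several hundred thousand, from that image. The AdaBoost.MH algorithm is then used to select a combination of highly representative Haar-like acoustic feature patterns, which form the acoustic feature super vector. A Random AdaBoost feature selection method is further proposed to speed up feature selection. Experimental results show that in three settings (audio event recognition, speaker recognition, and speaker gender recognition), Haar-like acoustic features allow recognition algorithms such as SVM, C5.0, and AdaBoost to achieve higher recognition accuracy than common acoustic features such as MFCC, PLP, and LPCC, while also yielding a 7- to 20-fold speedup in training and a 5- to 10-fold speedup in recognition.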

Keywords: audio processing; audio recognition; 2D-Haar acoustic feature super vector; Haar-like acoustic feature; AdaBoost.MH
Received: 2013/12/11
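
The abstract above says that Haar-like features are extracted from the acoustic feature image with low-complexity operations but does not name the technique. A common choice for Haar-like features is the integral image (summed-area table), so the sketch below assumes that approach. The function names, the 8 x 12 image size, and the two-rectangle pattern are hypothetical and chosen only for illustration.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def haar_two_rect_horizontal(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half (width even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))

# Hypothetical feature image: 8 consecutive frames (rows), each with
# 12 acoustic coefficients (columns). Random values stand in for real features.
rng = np.random.default_rng(0)
feature_image = rng.normal(size=(8, 12))

ii = integral_image(feature_image)
value = haar_two_rect_horizontal(ii, top=2, left=4, height=3, width=6)

# Cross-check against the direct (slower) computation on the raw image.
direct = feature_image[2:5, 4:7].sum() - feature_image[2:5, 7:10].sum()
assert np.isclose(value, direct)
```

Once the table is built, each rectangle sum costs four lookups regardless of rectangle size, which is what would make evaluating a very large pool of candidate patterns practical under this assumption.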

2D-Haar Acoustic Super Feature Vector Fast Generation Method
XIE Er-man, LUO Sen-lin and PAN Li-min. 2D-Haar Acoustic Super Feature Vector Fast Generation Method[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2016, 36(3): 295-301.
Authors: XIE Er-man, LUO Sen-lin and PAN Li-min
Institution: Information System and Security and Countermeasures Experimental Center, Beijing Institute of Technology, Beijing 100081, China
Abstract: A fast and efficient method for generating acoustic feature super vectors is proposed to improve on the recognition accuracy and speed obtained with traditional frame-based acoustic features. The paper makes three contributions. First, acoustic feature vectors extracted from a number of consecutive audio frames are combined into an acoustic feature image. Second, the AdaBoost.MH algorithm is used to select highly representative 2D-Haar pattern combinations, from which super feature vectors are constructed. Third, a random feature selection method is proposed to further improve processing speed. Experimental results show that in three audio recognition settings (audio event recognition, speaker recognition, and speaker gender recognition), the 2D-Haar acoustic feature super vector allows the SVM, C5.0, and AdaBoost algorithms to achieve higher recognition accuracy than MFCC, PLP, LPCC, and other traditional acoustic features, while making training 7~20 times faster and recognition 5~10 times faster.
Keywords: audio processing; audio recognition; 2D-Haar feature super vector; 2D-Haar acoustic feature; AdaBoost.MH
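
The abstract mentions a random feature selection method but does not describe how it works. One plausible reading, shown here purely as an assumption, is that each boosting round searches only a random subset of the candidate features instead of the full pool. The sketch uses plain binary AdaBoost with decision stumps rather than AdaBoost.MH, and every name in it (`best_stump`, `random_subset_adaboost`, `subset_size`) is hypothetical.

```python
import numpy as np

def stump_predict(X, feature, threshold, polarity):
    """Decision stump: +1 where polarity * (x - threshold) >= 0, else -1."""
    return np.where(polarity * (X[:, feature] - threshold) >= 0, 1, -1)

def best_stump(X, y, w, feature_ids):
    """Lowest weighted-error stump among the given candidate features."""
    best = (np.inf, None, None, None)
    for j in feature_ids:
        for thr in np.unique(X[:, j]):
            for polarity in (1, -1):
                err = w[stump_predict(X, j, thr, polarity) != y].sum()
                if err < best[0]:
                    best = (err, int(j), float(thr), polarity)
    return best

def random_subset_adaboost(X, y, rounds, subset_size, seed=0):
    """Binary AdaBoost (labels in {-1, +1}) that searches only a random
    subset of features each round. This is an assumed variant, not the
    paper's Random AdaBoost."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    model = []
    for _ in range(rounds):
        feature_ids = rng.choice(d, size=min(subset_size, d), replace=False)
        err, j, thr, pol = best_stump(X, y, w, feature_ids)
        err = np.clip(err, 1e-10, 1 - 1e-10)   # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        w = w * np.exp(-alpha * y * stump_predict(X, j, thr, pol))
        w = w / w.sum()
        model.append((j, thr, pol, float(alpha)))
    return model

def predict(model, X):
    """Sign of the alpha-weighted vote of all selected stumps."""
    score = sum(a * stump_predict(X, j, t, p) for j, t, p, a in model)
    return np.where(score >= 0, 1, -1)

# Toy data (hypothetical): 40 samples, 6 candidate features, label set by feature 3.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 6))
y = np.where(X[:, 3] >= 0, 1, -1)
model = random_subset_adaboost(X, y, rounds=5, subset_size=2)
selected_features = sorted({j for j, _, _, _ in model})
```

Under this reading, the per-round search cost scales with `subset_size` rather than with the full number of candidate patterns, and the features appearing in `model` would be the ones kept for the super vector.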
This article is indexed in 万方数据 (Wanfang Data) and other databases.
