
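Small-sample speech emotion recognition based on global features and a weak metric fusion strategy
Cite as: Huang Yongming, Zhang Guobao, Li Xiong, Da Feipeng. Small-sample speech emotion recognition based on global features and a weak metric fusion strategy [J]. Acta Acustica (声学学报), 2012, 37(3): 330-338.
Authors: Huang Yongming, Zhang Guobao, Li Xiong, Da Feipeng
Affiliation: 1. School of Automation, Southeast University, Nanjing 210096;
Funding: Supported by the National 863 Program and the National Natural Science Foundation of China.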
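Abstract: Speech is a short-time stationary time-frequency signal, so most researchers extract emotional features after dividing the signal into frames. However, features extracted after framing are local features and cannot accurately reflect the dynamic characteristics of emotional speech, so a robust emotion recognition system often cannot be built from local features alone. To address this problem, utterance-level global features are first extracted from the unframed speech signal by multi-scale optimal wavelet packet decomposition; 384-dimensional utterance-level local features are then extracted after framing, and the Fisher criterion is used for dimensionality reduction; finally, a weak metric fusion strategy is proposed to fuse the two kinds of utterance-level features, and an SVM is used for emotion classification. Experimental results on the Berlin emotional speech database show that the proposed method raises the final recognition rate by 4.2% to 13.8% over using utterance-level local features alone, and that the recognition rate fluctuates less, particularly when training samples are few.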
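
As a rough illustration of what an utterance-level global feature from wavelet packet decomposition can look like, the sketch below decomposes a whole (unframed) signal into sub-bands and returns each sub-band's share of the total energy. It is a minimal sketch under stated assumptions, not the paper's method: it uses a fixed-depth Haar wavelet packet tree rather than the multi-scale optimal decomposition named in the abstract, and the function names `haar_step` and `wavelet_packet_energies` are hypothetical.

```python
import math


def haar_step(signal):
    """Split an even-length sequence into Haar approximation and detail halves."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet_energies(signal, levels):
    """Relative energy of each of the 2**levels leaf sub-bands.

    Assumption: a full Haar wavelet packet tree of fixed depth, with leaves
    in natural (filter-bank) order, not sorted by frequency.
    """
    if len(signal) == 0 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a non-zero multiple of 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            # Unlike a plain wavelet transform, both halves are split again.
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    if total == 0.0:
        return [0.0] * len(energies)
    return [e / total for e in energies]
```

Calling `wavelet_packet_energies(samples, 3)` on one utterance would give an 8-dimensional vector that sums to 1, regardless of the utterance length, which is what makes it usable as a fixed-size global descriptor.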

Received: 2010-12-08

Small sample size speech emotion recognition based on global features and weak metric learning
Affiliation: 1. School of Automation, Southeast University, Nanjing 210096; 2. Institute of Image Processing & Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240
Abstract: Emotional speech is a non-stationary time-frequency signal, and local features extracted from each frame have been shown to contribute greatly to speech emotion recognition. However, local features alone are inadequate for building a robust speech emotion classification system, because features extracted from framed speech cannot accurately reflect the dynamic characteristics of the emotional speech signal. In this paper, utterance-level global features, extracted from the unframed emotional speech by multi-scale optimal wavelet packet decomposition, and 384-dimensional utterance-level local features are used together to improve the robustness and recognition rate of the classification system. Given few training samples, the dimensionality of the feature vectors is reduced by the Fisher discriminant, and a fusion strategy with metric learning, called weak metric learning in this work, is adopted to fuse the global and local utterance-level features. Experimental results with LIBSVM show that the method improves the recognition rate by about 4.2% to 13.8% compared with using local utterance-level features alone, and that the speech emotion recognition rate fluctuates less, especially for small sample sizes.
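
The abstract does not say how the Fisher discriminant is applied. One common reading is to score each feature by its Fisher ratio (between-class scatter divided by within-class scatter) and keep the highest-scoring features. The sketch below shows that reading only; the paper may instead use a discriminant projection. The names `fisher_scores` and `select_top_features` are hypothetical.

```python
from collections import defaultdict


def fisher_scores(samples, labels):
    """Per-feature Fisher ratio: between-class scatter / within-class scatter."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    n_total = len(samples)
    scores = []
    for j in range(len(samples[0])):
        overall_mean = sum(x[j] for x in samples) / n_total
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            class_mean = sum(x[j] for x in rows) / len(rows)
            between += len(rows) * (class_mean - overall_mean) ** 2
            within += sum((x[j] - class_mean) ** 2 for x in rows)
        if within > 0.0:
            scores.append(between / within)
        elif between > 0.0:
            scores.append(float("inf"))  # perfectly separated, zero spread
        else:
            scores.append(0.0)  # constant feature carries no information
    return scores


def select_top_features(samples, labels, k):
    """Indices of the k features with the largest Fisher ratio, in index order."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])
```

With a 384-dimensional local feature vector per utterance, `select_top_features(samples, labels, k)` would return the `k` column indices to keep before fusion and SVM classification; the value of `k` is not given in the abstract.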
Keywords: