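An asymmetric attenuated speech enhancement approach for improving intelligibility of noisy whisper
Cite this article: ZHOU Jian, ZHENG Wenming, WANG Qingyun, ZHAO Li. An asymmetric attenuated speech enhancement approach for improving intelligibility of noisy whisper [J]. Acta Acustica (声学学报), 2014, 39(4): 501-508.
Authors: ZHOU Jian, ZHENG Wenming, WANG Qingyun, ZHAO Li
Affiliation: 1. Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230031;
Funding: National Natural Science Foundation of China (61301295, 61231002, 61273266, 61003131); Natural Science Foundation of Anhui Province (1308085QF100, 1408085MF113); Anhui University Doctoral Research Start-up Fund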
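Abstract: Two whispered speech enhancement algorithms based on asymmetric cost functions are proposed; they treat the amplification distortion and the attenuation distortion introduced during enhancement differently. The Modified Itakura-Saito (MIS) algorithm penalizes amplification distortion more heavily, whereas the Kullback-Leibler (KL) algorithm penalizes attenuation distortion more heavily. Experimental results show that at low signal-to-noise ratios (SNRs) below -6 dB, whispered speech enhanced by the MIS algorithm is significantly more intelligible than speech enhanced by conventional algorithms, while the KL algorithm yields an intelligibility improvement similar to that of the minimum mean square error (MMSE) speech enhancement algorithm. This confirms that amplification distortion and attenuation distortion affect whispered speech intelligibility differently: at low SNR, larger attenuation distortion helps improve intelligibility, whereas at high SNR attenuation distortion has little effect on it.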

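Keywords: whispered speech; intelligibility; speech enhancement; asymmetric; amplification distortion; cost function; noise spectrum; signal-to-noise ratio; Gaussian noise; minimum mean square error
Received: 2012-12-28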

An asymmetric attenuated speech enhancement approach for improving intelligibility of noisy whisper
Institution: 1. Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230031; 2. Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing 210096; 3. Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096
Abstract: Two whispered speech enhancement methods based on asymmetric cost functions are proposed. Both methods assign different costs to amplification distortion and attenuation distortion. The Modified Itakura-Saito (MIS) distance function penalizes speech amplification distortion more heavily, while the Kullback-Leibler (KL) divergence function penalizes speech attenuation distortion more heavily. The experimental results show that, at low Signal to Noise Ratios (SNRs) below -6 dB, the MIS method yields a larger intelligibility improvement for whispered speech than conventional speech enhancement algorithms, and that the KL method achieves an intelligibility improvement similar to that of the Minimum Mean Square Error (MMSE) speech enhancement method. The results confirm that amplification distortion and attenuation distortion have different effects on the intelligibility of the enhanced whisper. Specifically, larger attenuation distortion can improve speech intelligibility at low SNR, and it has little influence on speech intelligibility at high SNR.
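
As a rough illustration of what an asymmetric cost function means here, the sketch below compares two penalties on a clean magnitude x and its estimate x_hat. Both formulas are assumptions chosen for illustration and are not taken from the paper: `exp_asymmetric_cost` is a LINEX-style exponential cost standing in for an amplification-penalizing measure, and `generalized_kl_cost` is one common form of the generalized KL divergence. The paper's actual MIS and KL definitions may differ.

```python
import math


def exp_asymmetric_cost(x, x_hat):
    """LINEX-style cost (illustrative assumption, not the paper's MIS formula).

    The error e = x_hat - x is positive when the estimate is amplified,
    and the exponential term makes positive errors cost more than
    negative errors of the same size.
    """
    e = x_hat - x
    return math.exp(e) - e - 1.0


def generalized_kl_cost(x, x_hat):
    """Generalized KL divergence D(x || x_hat) for positive magnitudes.

    For the same absolute error, an attenuated estimate (x_hat < x)
    costs more than an amplified one.
    """
    return x * math.log(x / x_hat) - x + x_hat


if __name__ == "__main__":
    x = 1.0  # hypothetical clean spectral magnitude
    for x_hat in (0.5, 1.5):  # attenuated vs. amplified estimate
        kind = "attenuation" if x_hat < x else "amplification"
        print(kind,
              round(exp_asymmetric_cost(x, x_hat), 4),
              round(generalized_kl_cost(x, x_hat), 4))
```

With an equal absolute error of 0.5 on either side of x = 1, the exponential cost is larger for the amplified estimate and the generalized KL cost is larger for the attenuated one, which is the kind of asymmetry the abstract describes.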
This article is indexed in CNKI and other databases.