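Unsupervised speech denoising via perceptually motivated robust principal component analysis
Cite this article: Min Gang, Zou Xia, Han Wei, Zhang Xiongwei, Tan Wei. Unsupervised speech denoising via perceptually motivated robust principal component analysis [J]. Acta Acustica (声学学报), 2017, 42(2): 246-256.
Authors: Min Gang, Zou Xia, Han Wei, Zhang Xiongwei, Tan Wei
Affiliation: 1. College of Command Information Systems, PLA University of Science and Technology, Nanjing 210007;
Funding: Supported by the National Natural Science Foundation of China (61471394, 61402519) and the Natural Science Foundation of Jiangsu Province (BK20140071, BK20140074)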
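Abstract: Existing sparse and low-rank decomposition methods for speech denoising make insufficient use of the perceptual properties of human hearing, so the speech distortion they introduce is easily perceived. To address this, an auditory-perception-based robust principal component analysis method for speech denoising is proposed. Because the basilar membrane of the cochlea perceives frequency nonlinearly, the method uses the cochleagram as the basis for separating speech from noise. In addition, the Itakura-Saito distance measure, which is consistent with human auditory perception, is chosen as the optimization objective function; nonnegativity constraints are introduced into the sparse and low-rank modeling so that the decomposed components better match their physical meaning; and an iterative optimization algorithm with closed-form updates is derived within the alternating direction method of multipliers framework. The method is completely unsupervised when denoising speech: no speech or noise model needs to be trained in advance. Simulation experiments with multiple noise types and different signal-to-noise ratios verify the effectiveness of the method: its noise suppression is more pronounced than that of current comparable algorithms, and the intelligibility and overall quality of the denoised speech are improved or at least comparable.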
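
For reference, the two standard ingredients named in the abstract can be written as follows. This is only a sketch of textbook definitions, not the paper's exact objective, which the abstract does not state; the symbols $M$ (input cochleagram), $L$ (low-rank term), $S$ (sparse term) and $\lambda$ (trade-off weight) are notation assumed here.

```latex
% Itakura-Saito divergence between two positive scalars x and y
d_{\mathrm{IS}}(x \mid y) = \frac{x}{y} - \log\frac{x}{y} - 1

% Classical robust PCA (principal component pursuit), shown for comparison
\min_{L,\,S} \; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{s.t.} \quad M = L + S
```

Per the abstract, the proposed method uses the Itakura-Saito measure as its objective and additionally imposes nonnegativity constraints on the decomposed terms; how these pieces are combined is specified in the paper itself.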

Keywords: speech denoising; auditory perception; robust principal component analysis; cochleagram
Received: 2015-11-26

Unsupervised speech denoising via perceptually motivated robust principal component analysis
Institution: 1. College of Command Information Systems, PLA University of Science and Technology, Nanjing 210007; 2. Xi'an Communications Institute, Xi'an 710106
Abstract: Existing sparse and low-rank speech denoising methods do not fully exploit auditory perceptual properties, and the speech distortion they introduce is easily perceived. To overcome these shortcomings, a perceptually motivated robust principal component analysis (ISNRPCA) method is presented. To reflect the nonlinear frequency perception of the basilar membrane, the cochleagram is used as the input to ISNRPCA. ISNRPCA uses the perceptually meaningful Itakura-Saito measure as its optimization objective function. Moreover, nonnegative constraints are imposed to regularize the decomposed terms with respect to their physical meaning. An alternating direction method of multipliers (ADMM) algorithm is proposed to solve the ISNRPCA optimization problem. ISNRPCA is totally unsupervised: neither the speech model nor the noise model needs to be trained beforehand. Experimental results under various noise types and different SNRs demonstrate that ISNRPCA shows promising results for speech denoising. Compared to state-of-the-art baselines, the method achieves better noise suppression and at least comparable intelligibility and overall speech quality.
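
To make the sparse plus low-rank idea concrete, the sketch below implements classical robust PCA (principal component pursuit) with ADMM, together with a helper that evaluates the Itakura-Saito divergence. It is not ISNRPCA: it uses the standard equality-constrained model without nonnegativity constraints, and the Itakura-Saito helper is shown only as a measure, not as the objective being minimized. All function names (`is_divergence`, `soft_threshold`, `svd_threshold`, `rpca_admm`) are hypothetical, and the default `lam` and `mu` are common choices from the RPCA literature, assumed here rather than taken from the paper.

```python
import numpy as np


def is_divergence(x, y, eps=1e-12):
    """Itakura-Saito divergence, summed over all entries of nonnegative arrays."""
    r = (np.asarray(x, dtype=float) + eps) / (np.asarray(y, dtype=float) + eps)
    return float(np.sum(r - np.log(r) - 1.0))


def soft_threshold(a, tau):
    """Elementwise shrinkage: proximal operator of tau * ||.||_1."""
    return np.sign(a) * np.maximum(np.abs(a) - tau, 0.0)


def svd_threshold(a, tau):
    """Singular value shrinkage: proximal operator of tau * nuclear norm."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca_admm(m, lam=None, mu=None, n_iter=200, tol=1e-7):
    """Classical RPCA: min ||L||_* + lam * ||S||_1  s.t.  M = L + S.

    Assumption: this is the textbook Euclidean model, not the
    Itakura-Saito, nonnegative variant described in the abstract.
    """
    m = np.asarray(m, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(m.shape))
    if mu is None:
        mu = 0.25 * m.size / (np.abs(m).sum() + 1e-12)
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    norm_m = np.linalg.norm(m) + 1e-12
    for _ in range(n_iter):
        # Each ADMM sub-problem has a closed-form solution.
        low = svd_threshold(m - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + dual / mu, lam / mu)
        resid = m - low - sparse
        dual += mu * resid  # dual ascent on the constraint M = L + S
        if np.linalg.norm(resid) / norm_m < tol:
            break
    return low, sparse
```

Calling `low, sparse = rpca_admm(M)` on a magnitude time-frequency matrix `M` returns the two components. In RPCA-based denoising the low-rank term is usually read as structured noise and the sparse term as speech, but that assignment is an assumption here rather than something the abstract states.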