
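A generalization algorithm from binary time-frequency masking to ratio masking based on noise tracking
Cite this article: LIANG Shan, LIU Wenju, JIANG Wei. A generalization algorithm from binary time-frequency masking to ratio masking based on noise tracking[J]. 声学学报 (Acta Acustica), 2013, 38(5): 632-637. DOI: 10.15949/j.cnki.0371-0025.2013.05.013
Authors: LIANG Shan  LIU Wenju  JIANG Wei
Affiliation: 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
Funding: Supported by the National Natural Science Foundation of China (91120303, 61273267, 90820011)
Abstract: Although ratio masking separates speech better than binary masking, existing speech separation systems usually take the estimated ideal binary mask as their computational goal, because the ideal ratio mask is difficult to estimate directly. We propose an algorithm that generalizes a binary mask to a ratio mask. Since the key to ratio mask estimation is tracking the noise energy, we first use an exponential distribution to model the conditional distribution of the noise energy, with the mixture energy and the binary mask as observations. Second, we use a Gaussian Markov conditional random field to model the correlation of the noise estimates across several consecutive frames. Finally, we use Markov chain Monte Carlo to compute the minimum mean square error estimate of the noise energy and, from it, the ratio mask. Experiments show that, compared with a conventional algorithm based on binary mask estimation, the proposed algorithm improves significantly in both SNR gain and objective perceptual quality.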

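Keywords: masking; background noise; noise energy; tracking algorithm; time-frequency decomposition; SNR gain; conditional distribution; speech; exponential distribution; perceptual quality
Received: 2012-06-05
Revised: 2012-09-27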

A generalization algorithm from binary time-frequency masking to ratio masking based on noise-tracking
LIANG Shan, LIU Wenju, JIANG Wei. A generalization algorithm from binary time-frequency masking to ratio masking based on noise-tracking[J]. ACTA ACUSTICA, 2013, 38(5): 632-637. DOI: 10.15949/j.cnki.0371-0025.2013.05.013
Authors:LIANG Shan  LIU Wenju  JIANG Wei
Affiliation: 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
Abstract: Although ratio masking can achieve better speech separation than binary masking, existing speech separation systems usually take the Ideal Binary Mask (IBM) as the computational goal, because the Ideal Ratio Mask (IRM) is very difficult to estimate directly. In this paper, a generalization algorithm from the binary mask to the ratio mask is proposed. Since the key issue in IRM estimation is noise tracking, we first use an exponential distribution to model the noise power conditioned on the binary mask and the mixture power. Then, we use a Gaussian Markov Random Field (GMRF) to model the correlation of the noise estimates between adjacent units. Finally, we apply a Markov Chain Monte Carlo method to compute the minimum mean square error estimate of the noise power and, from it, the ratio mask. Systematic experiments show that the proposed algorithm outperforms a common method based on binary masking in terms of SNR gain and PESQ scores.
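To make the difference between the two masks concrete, the following is a minimal Python sketch, not the authors' implementation. It uses common textbook definitions that the abstract does not spell out: a binary mask that keeps a time-frequency unit when its local SNR exceeds a threshold (`lc_db`, an assumed parameter name), and a ratio mask equal to speech power over speech-plus-noise power. The function `ratio_mask_from_noise_estimate` is a hypothetical helper showing why a noise power estimate is enough to obtain a ratio mask, under the assumption that mixture power is approximately speech power plus noise power. The noise tracking itself (exponential model, GMRF, MCMC) is not reproduced here.

```python
import math


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Return 1 for units whose local SNR in dB exceeds lc_db, else 0."""
    mask = []
    for s, n in zip(speech_power, noise_power):
        if s <= 0.0:
            mask.append(0)          # no speech energy in this unit
        elif n <= 0.0:
            mask.append(1)          # speech present, no noise
        else:
            snr_db = 10.0 * math.log10(s / n)
            mask.append(1 if snr_db > lc_db else 0)
    return mask


def ideal_ratio_mask(speech_power, noise_power):
    """Return speech / (speech + noise) per unit, a value in [0, 1]."""
    mask = []
    for s, n in zip(speech_power, noise_power):
        total = s + n
        mask.append(s / total if total > 0.0 else 0.0)
    return mask


def ratio_mask_from_noise_estimate(mixture_power, noise_estimate):
    """Approximate the ratio mask as 1 - noise/mixture, clipped to [0, 1].

    Assumes mixture power ~= speech power + noise power (illustrative only).
    """
    mask = []
    for y, n_hat in zip(mixture_power, noise_estimate):
        if y <= 0.0:
            mask.append(0.0)
        else:
            mask.append(min(1.0, max(0.0, 1.0 - n_hat / y)))
    return mask


# Toy powers for three time-frequency units (made-up illustrative values).
speech = [4.0, 1.0, 0.25]
noise = [1.0, 1.0, 1.0]
mixture = [s + n for s, n in zip(speech, noise)]

ibm = ideal_binary_mask(speech, noise)
irm = ideal_ratio_mask(speech, noise)
est = ratio_mask_from_noise_estimate(mixture, noise)
```

With these toy values the binary mask keeps only the first unit, while the ratio mask assigns each unit a graded weight; when the noise estimate is exact, the estimate-based mask coincides with the ratio mask up to floating-point rounding.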
Keywords:
This article is indexed in CNKI and other databases.