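Speech separation algorithm using sparse non-negative matrix factorization and a deep attractor network under noisy conditions
Cite this article: GE Wanying (葛宛营), ZHANG Tianqi (张天骐), FAN Congcong (范聪聪), ZHANG Tian (张天). Speech separation algorithm using sparse non-negative matrix factorization and a deep attractor network under noisy conditions[J]. Acta Acustica (声学学报), 2021, 46(1): 55-66.
Authors: GE Wanying  ZHANG Tianqi  FAN Congcong  ZHANG Tian
Affiliation: School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
Funding: Supported by the National Natural Science Foundation of China (61671095, 61371164, 61702065, 61701067, 61771085); the Construction Project of the Chongqing Municipal Key Laboratory of Signal and Information Processing (CSTC2009CA2003); the Chongqing Graduate Research and Innovation Project (CYS17219); and the Research Projects of the Chongqing Municipal Education Commission (KJ130524, KJ1600427, KJ1600429).
Abstract: To separate speech under noisy conditions, a single-channel speech separation algorithm using sparse non-negative matrix factorization and a deep attractor network is proposed. First, dictionary matrices for speech and noise are obtained by training and used as prior information to separate the coefficient matrices of speech and noise from the noisy speech mixture. Next, because different source components in the speech coefficient matrix differ in their similarity within the embedding space, a deep attractor network separates that matrix into one coefficient matrix per speech source. Finally, the separated speech coefficient matrices and the speech dictionary matrix are used to reconstruct clean separated speech. Experimental results under different noise conditions show that the proposed algorithm suppresses background noise while improving the overall quality of the separated speech, and that it outperforms the comparison algorithms that combine speech-noise and speech separation models.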
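The first step described in the abstract (using pretrained speech and noise dictionaries as priors to extract coefficient matrices from the noisy mixture) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the cost function or update rule, so the Euclidean cost with an L1 penalty, the multiplicative update, and the names `sparse_nmf_activations`, `split_coefficients`, and `sparsity` are all assumptions made for illustration.

```python
import numpy as np

def sparse_nmf_activations(V, W, sparsity=0.1, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative coefficients H with the dictionary W held fixed.

    Assumed objective: 0.5 * ||V - W H||_F^2 + sparsity * sum(H).
    V: (freq, time) magnitude spectrogram, W: (freq, atoms) dictionary.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps  # strictly positive start
    WtV = W.T @ V  # constant across iterations because W is fixed
    WtW = W.T @ W
    for _ in range(n_iter):
        # Multiplicative update: keeps H non-negative; the L1 penalty
        # appears as the constant `sparsity` in the denominator.
        H *= WtV / (WtW @ H + sparsity + eps)
    return H

def split_coefficients(V, W_speech, W_noise, **kwargs):
    """Decompose V on the concatenated dictionary [W_speech, W_noise]
    and return the speech and noise coefficient matrices separately."""
    W = np.hstack([W_speech, W_noise])
    H = sparse_nmf_activations(V, W, **kwargs)
    k = W_speech.shape[1]
    return H[:k], H[k:]
```

With dictionaries of 5 speech atoms and 4 noise atoms, `split_coefficients(V, W_speech, W_noise)` returns a 5-row and a 4-row coefficient matrix; only the speech part would be passed on to the separation stage.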

Keywords: speech separation    non-negative matrix factorization    coefficient matrix    deep attractor network
Received: 2019-04-29

Monaural noisy speech separation combining sparse non-negative matrix factorization and deep attractor network
GE Wanying, ZHANG Tianqi, FAN Congcong, ZHANG Tian. Monaural noisy speech separation combining sparse non-negative matrix factorization and deep attractor network[J]. Acta Acustica, 2021, 46(1): 55-66.
Authors: GE Wanying  ZHANG Tianqi  FAN Congcong  ZHANG Tian
Institution:School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
Abstract: The performance of monaural speech separation methods is limited when the speech mixture is corrupted by background noise. To obtain enhanced separated speech from a noisy mixture, a monaural noisy speech separation method combining Sparse Non-negative Matrix Factorization (SNMF) and a Deep Attractor Network (DANet) is proposed. The method first decomposes the noisy mixture into coefficients of the speech and noise signals. The speech coefficients are then projected into a high-dimensional embedding space, and a DANet is trained to force the embeddings to form separate clusters. The attractor points are used to separate the speech coefficients by masking, and finally the enhanced separated speech signals are reconstructed from the speech basis and their corresponding coefficients. Experimental results in various background noise environments show that, compared with different baseline methods, the proposed algorithm effectively suppresses noise without degrading the quality of the reconstructed speech.
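The masking and reconstruction steps in the abstract can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the published model: the embedding network is omitted, so `emb` stands in for its output; attractors are computed as assignment-weighted means of the embeddings and masks as a softmax over embedding-attractor similarities, which follows the general deep attractor network formulation rather than anything specified in the abstract. All function names are hypothetical.

```python
import numpy as np

def attractors_from_assignments(emb, assign):
    """emb: (N, D) embeddings, one per coefficient bin.
    assign: (N, C) source assignments (one-hot or soft).
    Returns (C, D) attractors as assignment-weighted means."""
    return (assign.T @ emb) / (assign.sum(axis=0)[:, None] + 1e-8)

def softmax_masks(emb, attractors):
    """Soft masks from embedding-attractor similarity, shape (N, C)."""
    logits = emb @ attractors.T
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    expd = np.exp(logits)
    return expd / expd.sum(axis=1, keepdims=True)

def separate_and_reconstruct(H_speech, W_speech, emb, assign):
    """Mask the speech coefficients per source, then rebuild each
    source spectrogram from the shared speech dictionary.

    H_speech: (K, T) coefficients; emb rows follow the row-major
    flattening of H_speech, so N must equal K * T."""
    K, T = H_speech.shape
    A = attractors_from_assignments(emb, assign)
    masks = softmax_masks(emb, A)
    outputs = []
    for c in range(masks.shape[1]):
        H_c = masks[:, c].reshape(K, T) * H_speech  # per-source coefficients
        outputs.append(W_speech @ H_c)              # per-source spectrogram
    return outputs
```

Because the masks sum to one across sources at every bin, the reconstructed source spectrograms add up to `W_speech @ H_speech`, that is, to the denoised speech estimate before separation.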
This article is indexed in CNKI, VIP (维普), and other databases.