A speech separation algorithm using sparse non-negative matrix factorization and a deep attractor network under noisy conditions

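Cite this article:	GE Wanying, ZHANG Tianqi, FAN Congcong, ZHANG Tian. A speech separation algorithm using sparse non-negative matrix factorization and a deep attractor network under noisy conditions[J]. Acta Acustica, 2021, 46(1): 55-66.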

Authors:	GE Wanying, ZHANG Tianqi, FAN Congcong, ZHANG Tian

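Affiliation:	School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065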

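Funding:	Supported by the National Natural Science Foundation of China (61671095, 61371164, 61702065, 61701067, 61771085); the Construction Project of the Chongqing Municipal Key Laboratory of Signal and Information Processing (CSTC2009CA2003); the Chongqing Graduate Research and Innovation Project (CYS17219); and the Scientific Research Project of the Chongqing Municipal Education Commission (KJ130524, KJ1600427, KJ1600429).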

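Abstract:	To separate speech under noisy conditions, a single-channel speech separation algorithm using sparse non-negative matrix factorization and a deep attractor network is proposed. First, dictionary matrices for speech and noise are obtained by training and used as prior information to extract the speech and noise coefficient matrices from the noisy speech mixture. Then, because the components of different sources in the speech coefficient matrix differ in their similarity in the embedding space, a deep attractor network separates the speech coefficient matrix into one coefficient matrix per source. Finally, the separated coefficient matrices and the speech dictionary matrix are used to reconstruct the clean separated speech. Experimental results under different noise conditions show that the proposed algorithm improves the overall quality of the separated speech while suppressing background noise, and that it outperforms the comparison algorithms that combine a speech-noise separation model with a speech separation model.
Keywords:	speech separation; non-negative matrix factorization; coefficient matrix; deep attractor network
Received:	2019-04-29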
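The first step in the abstract, extracting speech and noise coefficient matrices from a noisy mixture using pre-trained dictionaries, can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the cost function (squared error plus an L1 penalty on the coefficients), the multiplicative update rule, the function name snmf_coefficients, the dictionary sizes, and the random toy data that stands in for trained dictionaries and a magnitude spectrogram.

```python
import numpy as np

def snmf_coefficients(V, W, sparsity=0.1, n_iter=200, eps=1e-12, seed=0):
    """Estimate H >= 0 such that V ~= W @ H while keeping W fixed.

    Assumed objective: 0.5 * ||V - W H||_F^2 + sparsity * sum(H),
    minimized with the standard multiplicative update for H.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps  # positive initial guess
    WtV = W.T @ V   # constant across iterations because W is fixed
    WtW = W.T @ W
    for _ in range(n_iter):
        # The sparsity term in the denominator shrinks H toward zero.
        H *= WtV / (WtW @ H + sparsity + eps)
    return H

# Toy data: random non-negative matrices stand in for trained dictionaries.
rng = np.random.default_rng(1)
n_freq, n_frames, k_speech, k_noise = 64, 40, 8, 4
W_speech = rng.random((n_freq, k_speech))   # hypothetical speech dictionary
W_noise = rng.random((n_freq, k_noise))     # hypothetical noise dictionary
W = np.hstack([W_speech, W_noise])          # joint dictionary used as prior
H_true = rng.random((k_speech + k_noise, n_frames))
V = W @ H_true                              # synthetic "noisy mixture"

H = snmf_coefficients(V, W)
H_speech, H_noise = H[:k_speech], H[k_speech:]  # split the coefficients
V_speech_hat = W_speech @ H_speech              # speech-only estimate
```

Keeping the dictionary fixed and updating only the coefficients is what lets the trained speech and noise bases act as prior information: the rows of H that belong to the speech dictionary carry the speech part of the mixture, and the remaining rows absorb the noise.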
Monaural noisy speech separation combining sparse non-negative matrix factorization and deep attractor network

GE Wanying, ZHANG Tianqi, FAN Congcong, ZHANG Tian. Monaural noisy speech separation combining sparse non-negative matrix factorization and deep attractor network[J]. Acta Acustica, 2021, 46(1): 55-66.

Authors:	GE Wanying, ZHANG Tianqi, FAN Congcong, ZHANG Tian

Institution:	School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065

Abstract:	The performance of monaural speech separation methods is limited when the speech mixture is corrupted by background noise. To obtain enhanced separated speech from a noisy mixture, a monaural noisy speech separation method combining Sparse Non-negative Matrix Factorization (SNMF) and a Deep Attractor Network (DANet) is proposed. The method first decomposes the noisy mixture into speech and noise coefficients. The speech coefficients are then projected into a high-dimensional embedding space, and a DANet is trained to force the embeddings into separate clusters. The attractor points are used to separate the speech coefficients by masking, and finally the enhanced separated speech signals are reconstructed from the speech basis and their corresponding coefficients. Experimental results in various background noise environments show that, compared with different baseline methods, the proposed algorithm effectively suppresses noise without degrading the quality of the reconstructed speech.
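The attractor and masking steps described in the abstract can be sketched as follows. This is a hedged illustration, not the authors' implementation: the embeddings are synthetic clustered vectors standing in for the output of a trained network, the attractors are computed as per-source means of the embeddings using known assignments (as is commonly done during training of attractor networks), and all names and sizes (attractors_from_assignments, soft_masks, emb_dim, W_speech) are hypothetical.

```python
import numpy as np

def attractors_from_assignments(E, Y, eps=1e-8):
    """Mean embedding per source. E: (N, D) embeddings, Y: (N, C) one-hot."""
    return (Y.T @ E) / (Y.sum(axis=0)[:, None] + eps)

def soft_masks(E, A):
    """Softmax over sources of the embedding-attractor inner products."""
    logits = E @ A.T                              # (N, C) similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    expo = np.exp(logits)
    return expo / expo.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_freq, k_speech, n_frames, emb_dim, n_src = 64, 8, 40, 20, 2
W_speech = rng.random((n_freq, k_speech))   # hypothetical speech dictionary
H_speech = rng.random((k_speech, n_frames)) # hypothetical speech coefficients
N = k_speech * n_frames                     # one embedding per coefficient

# Synthetic stand-in for network output: each coefficient entry gets an
# embedding near the center of the source assumed to dominate it.
labels = rng.integers(0, n_src, size=N)
Y = np.eye(n_src)[labels]                   # (N, C) one-hot assignments
centers = 3.0 * rng.normal(size=(n_src, emb_dim))
E = centers[labels] + 0.1 * rng.normal(size=(N, emb_dim))

A = attractors_from_assignments(E, Y)       # (C, D) attractor points
M = soft_masks(E, A)                        # (N, C) masks, rows sum to 1

# Mask the coefficient matrix per source, then rebuild each source from
# the shared speech dictionary.
H_sources = [M[:, c].reshape(k_speech, n_frames) * H_speech
             for c in range(n_src)]
S_hat = [W_speech @ H_c for H_c in H_sources]
```

Because the masks sum to one across sources, the masked coefficient matrices add back up to the original speech coefficient matrix, so the per-source reconstructions partition the denoised speech estimate rather than creating or discarding energy.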

This article is indexed in CNKI, VIP, and other databases.