Keyword-dependent monaural speech enhancement for open-vocabulary keyword spotting

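Cite this article:	LIU Zuozhen, WU Chou, LI Ta, ZHAO Qingwei. Keyword-dependent monaural speech enhancement for open-vocabulary keyword spotting[J]. 声学学报 (Acta Acustica), 2023, 48(2): 415-424.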

Authors:	LIU Zuozhen (刘作桢), WU Chou (吴愁), LI Ta (黎塔), ZHAO Qingwei (赵庆卫)

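Author affiliation:	1. Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190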

Funding:	Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDC08010300)

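Abstract:	A monaural speech enhancement method for open-vocabulary keyword spotting is proposed. The method stores the keyword's phoneme information in a text encoding matrix in advance and adds an attention-based phoneme bias module to a conventional speech enhancement model. The module uses intermediate features of the enhancement model to retrieve the phoneme information of the current frame from the text encoding matrix and feeds it into the model's subsequent computation, which improves the enhancement of keyword-related phonemes. Experiments under different noise conditions show that the method suppresses noise in the keyword segments more effectively. Compared with a conventional speech enhancement method and with other text-dependent speech enhancement methods, the proposed method achieves relative improvements of 14.3% and 7.6%, respectively, in open-vocabulary keyword spotting performance.
Keywords:	speech enhancement; keyword spotting; keyword-dependent; deep learning
Received:	2022-01-18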
Keyword-dependent monaural speech enhancement for open-vocabulary keyword spotting

Affiliation:	1. Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190; 2. University of Chinese Academy of Sciences, Beijing 100049

Abstract:	A monaural speech enhancement algorithm for open-vocabulary keyword spotting is proposed. The algorithm stores the keyword's phoneme information in a text encoding matrix in advance and adds an attention-based phoneme bias module to a conventional speech enhancement model. This module uses the intermediate features of the speech enhancement model to retrieve the phoneme information of the current frame from the text encoding matrix and integrates it into the model's subsequent computation, so that the model achieves better enhancement performance on the specified keywords. Experimental results in different noise environments show that the proposed method suppresses noise in the keyword segments more effectively and recovers speech details better. In open-vocabulary keyword spotting, the proposed method achieves a 14.3% relative improvement over a conventional speech enhancement method and a 7.6% relative improvement over other text-dependent speech enhancement methods.
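
The abstract gives no equations for the phoneme bias module, so the following is only a minimal sketch of the general idea: one frame's intermediate feature acts as a query over the rows of a text encoding matrix, and the attention-weighted phoneme vector is passed on to later layers. Scaled dot-product scoring, fusion by concatenation, the absence of learned projections, and all names (`phoneme_bias`, `fuse`, `text_encoding`, `frame_feature`) are assumptions made for illustration, not details taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def phoneme_bias(frame_feature, text_encoding):
    """Attend from one frame feature (query) over phoneme embeddings.

    text_encoding holds one row per keyword phoneme; each row serves as
    both key and value (an assumption; a real model would likely use
    learned projections). Returns (attention weights, bias vector).
    """
    dim = len(frame_feature)
    # Scaled dot-product score between the frame and every phoneme row.
    scores = [
        sum(q * k for q, k in zip(frame_feature, row)) / math.sqrt(dim)
        for row in text_encoding
    ]
    weights = softmax(scores)
    # Weighted sum of phoneme rows: the phoneme information for this frame.
    bias = [
        sum(w * row[d] for w, row in zip(weights, text_encoding))
        for d in range(dim)
    ]
    return weights, bias


def fuse(frame_feature, bias):
    """Concatenate feature and bias (hypothetical fusion choice)."""
    return list(frame_feature) + list(bias)


# Toy example: three phonemes with 3-dimensional embeddings.
text_encoding = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
frame_feature = [0.1, 2.0, 0.1]  # most similar to the second phoneme

weights, bias = phoneme_bias(frame_feature, text_encoding)
fused = fuse(frame_feature, bias)
print(weights)
print(fused)
```

In this toy setup the second phoneme receives the largest attention weight because the frame feature is closest to its embedding; in a trained model the matrix would be built from the user-defined keyword's phoneme sequence before inference.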
