Sub-band power normalized perceptual linear predictive coefficients for robust automatic speech recognition


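Cite as:	Cai Shang, Jin Xin, Gao Shengxiang, Pan Jielin, Yan Yonghong. Sub-band power normalized perceptual linear predictive coefficients for robust automatic speech recognition [J]. Acta Acustica (声学学报), 2012, 37(6): 667-672.

Authors:	Cai Shang (蔡尚), Jin Xin (金鑫), Gao Shengxiang (高圣翔), Pan Jielin (潘接林), Yan Yonghong (颜永红)

Affiliation:	1 Key Laboratory of Speech Acoustics and Content Understanding (Institute of Acoustics), Chinese Academy of Sciences, Beijing 100190;

Funding:	Supported by the National Natural Science Foundation of China (10925419, 90920302, 10874203, 60875014, 61072124, 11074275)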

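Abstract (Chinese version):	To improve the recognition performance of perceptual linear predictive (PLP) coefficients in noisy environments, sub-band power normalized perceptual linear predictive (SPNPLP) coefficients are proposed, based on power bias subtraction. PLP effectively concentrates the useful information in speech, and automatic speech recognition systems using PLP achieve good recognition rates in quiet environments; in noisy environments, however, their recognition performance drops sharply. SPNPLP normalizes the sub-band power of PLP by power bias subtraction, which suppresses background noise excitation and makes automatic speech recognition systems more robust in noise. Tests were run on an isolated-word recognition task with a grammar size of 501 and on a large vocabulary continuous speech recognition task. Compared with PLP, SPNPLP improved Chinese character recognition accuracy on these two tasks by 11.26% and 9.2% absolute, respectively. The experimental results show that SPNPLP is more robust to noise than PLP.
Received:	2011-11-30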
Sub-band power normalized perceptual linear predictive coefficients for robust automatic speech recognition

Institution:	1 Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190; 2 National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029

Abstract:	To improve the noise robustness of perceptual linear predictive (PLP) coefficients, this work presents sub-band power normalized perceptual linear predictive (SPNPLP) coefficients, a feature built on power bias subtraction. PLP captures the most useful information in speech and fits well with the assumptions used in hidden Markov models. Automatic speech recognition (ASR) systems using PLP perform satisfactorily in quiet environments, but their performance drops dramatically in noisy environments. Here, power bias subtraction, which suppresses background excitation, is used to normalize the sub-band power of PLP, and the resulting SPNPLP increases the robustness of ASR against additive background noise. Recognition performance is evaluated on an isolated-word recognition task with 501 items and on a large vocabulary continuous speech recognition (LVCSR) task. Relative to standard PLP, the average improvements in Chinese character accuracy are 11.26% and 9.2% absolute on these two tasks, respectively. The experimental results show that the proposed SPNPLP is consistently more robust than PLP.
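The idea of normalizing sub-band power by subtracting a bias can be illustrated with a minimal Python sketch. The abstract does not say how the bias is estimated, so this sketch swaps in a simple stand-in: the bias of each sub-band is taken as the minimum power observed in that band, on the assumption that the quietest frame contains mostly background noise. The function name `subtract_power_bias`, the parameter `floor_fraction`, and the flooring rule are all hypothetical and are not taken from the paper.

```python
from typing import List


def subtract_power_bias(
    band_power: List[List[float]],
    floor_fraction: float = 0.01,
) -> List[List[float]]:
    """Subtract a per-band power bias and floor the result.

    band_power[t][b] is the power of frame t in sub-band b.
    ASSUMPTION: the bias of a band is its minimum power over all frames;
    the paper's own bias estimator is not described in the abstract.
    """
    if not band_power:
        return []
    n_frames = len(band_power)
    n_bands = len(band_power[0])
    normalized = [[0.0] * n_bands for _ in range(n_frames)]
    for b in range(n_bands):
        column = [frame[b] for frame in band_power]
        bias = min(column)                      # stand-in noise estimate
        mean_power = sum(column) / n_frames
        floor = floor_fraction * mean_power     # keeps power positive for a later log or root
        for t in range(n_frames):
            normalized[t][b] = max(column[t] - bias, floor)
    return normalized


if __name__ == "__main__":
    # Four frames, two sub-bands; band 0 is "speech" [0, 4, 0, 8] plus a constant noise power of 2.
    noisy = [[2.0, 5.0], [6.0, 5.0], [2.0, 9.0], [10.0, 7.0]]
    for frame in subtract_power_bias(noisy):
        print(frame)
```

In a PLP-style front end, a step like this would sit between the critical-band filterbank and the later compression and linear-prediction stages; where exactly SPNPLP applies it is a detail of the full paper, not of this sketch.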
