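Optimization of acoustic modeling method with connectionist temporal classification criterion
Cite this article: WANG Zhichao, ZHANG Pengyuan, PAN Jielin, YAN Yonghong. Optimization of acoustic modeling method with connectionist temporal classification criterion[J]. Acta Acustica (声学学报), 2018, 43(6): 984-990.
Authors: WANG Zhichao (王智超), ZHANG Pengyuan (张鹏远), PAN Jielin (潘接林), YAN Yonghong (颜永红)
Affiliation: 1. Key Laboratory of Speech Acoustics and Content Understanding, Chinese Academy of Sciences, Beijing 100190;
Funding: Supported by the National Key Research and Development Program of China, Key Special Project (2016YFB0801203, 2016YFB0801200)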
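Abstract: This paper studies and optimizes end-to-end acoustic modeling based on the connectionist temporal classification (CTC) criterion. It analyzes how different acoustic features, modeling units, and neural network architectures affect the performance of CTC acoustic models. To address the modeling defect caused by sharing a single blank symbol across the CTC model, a modeling-unit-dependent non-shared blank method is proposed. A model initialization method that incorporates association information between modeling units is also introduced to further improve CTC model performance. Experiments on the 300-hour standard English Switchboard dataset show that, by combining non-shared blanks, time-delay neural networks, and the initialization method incorporating association information between modeling units, the CTC acoustic model achieves an absolute 1.1% reduction in word error rate relative to the baseline system, together with a 3.3-fold increase in training speed. These results demonstrate that the proposed optimizations for end-to-end acoustic modeling are effective.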

Keywords: speech recognition; acoustic model; deep neural network; connectionist temporal classification
Received: 2017-01-04

Optimization of acoustic modeling method with connectionist temporal classification criterion
Affiliation: 1. The Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190; 2. Institute of Acoustics, Chinese Academy of Sciences, Beijing 10019
Abstract: This paper studies and optimizes the end-to-end acoustic modeling method based on the connectionist temporal classification (CTC) criterion. We examine the performance of CTC acoustic models with different acoustic features, modeling units, and network architectures. A modeling-unit-dependent non-shared blank method is proposed to remedy the modeling defect caused by sharing the blank symbol in the CTC model. In addition, a model initialization method that incorporates the association information between modeling units into the neural network is introduced to further improve the performance of the CTC model. Experiments were carried out on the 300-hour Switchboard dataset. The results show that the proposed CTC model, trained with non-shared blanks, time-delay neural networks, and the initialization method with association information between modeling units, achieves an absolute 1.1% reduction in word error rate and a 3.3-fold training speedup over the baseline system. These results show that the proposed method is effective for end-to-end acoustic modeling.
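As a rough illustration of the blank-sharing issue mentioned in the abstract, the sketch below shows the standard CTC collapse rule (merge repeated symbols, then drop blanks) applied to a frame-level path with one shared blank, and to a path in which every modeling unit has its own blank. The symbol names (`<b>`, `<b:k>`), the example phone sequence, and the helper functions are assumptions made for this illustration; the abstract does not specify how the paper defines or labels its non-shared blanks.

```python
from itertools import groupby

SHARED_BLANK = "<b>"  # assumed name for the single blank used in standard CTC


def ctc_collapse(path, is_blank):
    """Map a frame-level path to labels: merge repeats, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return [symbol for symbol in merged if not is_blank(symbol)]


def is_shared_blank(symbol):
    # Standard CTC: one blank output shared by all modeling units.
    return symbol == SHARED_BLANK


def is_unit_blank(symbol):
    # Hypothetical convention: "<b:k>" is a blank tied to the unit "k".
    return symbol.startswith("<b:")


# Frame-level paths for the illustrative unit sequence k, ae, t.
shared_path = ["k", "k", "<b>", "ae", "<b>", "<b>", "t"]
unit_path = ["k", "k", "<b:k>", "ae", "<b:ae>", "<b:ae>", "t"]

shared_labels = ctc_collapse(shared_path, is_shared_blank)
unit_labels = ctc_collapse(unit_path, is_unit_blank)

print(shared_labels)
print(unit_labels)
```

Both paths collapse to the same label sequence. The only difference is that the second path tells the network which unit each blank frame follows, which is the kind of distinction a shared blank cannot express.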
This article is indexed in CNKI and other databases.