An algorithm for voice conversion with limited corpus



Cite this article:	Gu Dong, Jian Zhihua. An algorithm for voice conversion with limited corpus[J]. 声学学报 (Acta Acustica), 2018, 43(5): 864-872.

Authors:	Gu Dong, Jian Zhihua

Affiliation:	School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018

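Funding:	Supported by the National Natural Science Foundation of China (61201301, 61772166) and the Zhejiang Provincial Natural Science Foundation (LY16F010012, LY16F020016)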

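Abstract:	To address the case where the target speaker's corpus may be insufficient, this paper proposes a voice conversion algorithm based on a unified tensor dictionary under limited corpus. N speakers are selected from the corpus as the base speakers of the speech tensor dictionary, and the parallel speech segments of these N speakers are aligned with a multi-sequence dynamic time warping algorithm, which yields a tensor dictionary composed of N two-dimensional base dictionaries. In the conversion stage, the speech dictionaries of the source and target speakers are each constructed as a linear combination of the base dictionaries in the tensor dictionary, and this realizes the voice conversion. Experimental results show that when the number of base speakers reaches 14, only a very small amount of target-speaker corpus is needed to obtain conversion performance comparable to that of the traditional conversion algorithm based on non-negative matrix factorization, which greatly facilitates the application of voice conversion systems.
Keywords:	voice conversion; conversion algorithm; corpus; unified tensor; dynamic time warping; non-negative matrix factorization; speaker; linear combination
Received:	2017-04-07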
An algorithm for voice conversion with limited corpus

Institution:	School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018

Abstract:	To address the case where the target speaker's corpus is limited, this paper proposes a voice conversion algorithm based on a unified tensor dictionary. First, parallel speech from N speakers is selected at random from the speech corpus to form the basis of the tensor dictionary. The chosen speech is then aligned by multi-series dynamic time warping, yielding N two-dimensional basic dictionaries that together constitute the unified tensor dictionary. In the conversion stage, the dictionaries of the source and target speakers are each built as a linear combination of the N basic dictionaries, using speech from the two speakers. Experimental results show that when the number of basic speakers is 14, the algorithm achieves performance comparable to that of the traditional NMF-based method with only a small target-speaker corpus, which greatly facilitates the application of voice conversion systems.
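
The conversion stage described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the combination weights are estimated from the limited speech, so the weights below are simply assumed to be known, and the activation step uses the standard Euclidean multiplicative update for NMF with a fixed dictionary as a stand-in. All function names and toy values (`combine_dictionaries`, `estimate_activations`, `convert_frame`, the 2x2 dictionaries) are hypothetical.

```python
def combine_dictionaries(base_dicts, weights):
    """Build one speaker dictionary as a weighted sum of N base dictionaries.

    base_dicts: list of N matrices (features x atoms), all the same shape.
    weights:    list of N non-negative combination weights (assumed given).
    """
    rows, cols = len(base_dicts[0]), len(base_dicts[0][0])
    return [[sum(w * d[i][j] for w, d in zip(weights, base_dicts))
             for j in range(cols)]
            for i in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def estimate_activations(dictionary, frame, iters=200, eps=1e-12):
    """Non-negative activations h such that dictionary @ h approximates frame.

    Uses the Euclidean multiplicative update h <- h * (D^T x) / (D^T D h),
    which keeps h non-negative when D and x are non-negative.
    """
    d_t = [list(col) for col in zip(*dictionary)]  # transpose of D
    dt_x = matvec(d_t, frame)
    h = [1.0] * len(dictionary[0])
    for _ in range(iters):
        dt_d_h = matvec(d_t, matvec(dictionary, h))
        h = [h_k * num / (den + eps) for h_k, num, den in zip(h, dt_x, dt_d_h)]
    return h


def convert_frame(base_dicts, w_src, w_tgt, frame):
    """Convert one source frame by sharing activations across dictionaries."""
    d_src = combine_dictionaries(base_dicts, w_src)
    d_tgt = combine_dictionaries(base_dicts, w_tgt)
    h = estimate_activations(d_src, frame)
    return matvec(d_tgt, h)


# Toy data (hypothetical): N = 2 base dictionaries, 2 features x 2 atoms.
base = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.5], [0.5, 1.0]],
]
source_weights = [0.5, 0.5]
target_weights = [0.2, 0.8]
source_frame = [2.25, 1.5]
converted = convert_frame(base, source_weights, target_weights, source_frame)
print(converted)
```

Because the base dictionaries are time-aligned, the same activation vector can be reused with the target dictionary, which is what lets the sketch get by with only a weight vector per speaker rather than a full parallel corpus.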

This article is indexed in CNKI and other databases.
