
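A prosody conversion method for emotional voice conversion
Cite this article: 李贤 (LI Xian), 於俊 (WU Jun), 汪增福 (WANG Zengfu). A prosody conversion method for emotional voice conversion[J]. 声学学报 (ACTA ACUSTICA), 2014, 39(4): 509-516. DOI: 10.15949/j.cnki.0371-0025.2014.04.015
Authors: 李贤 (LI Xian), 於俊 (WU Jun), 汪增福 (WANG Zengfu)
Affiliation: 1. Department of Automation, University of Science and Technology of China, Hefei 230027
Funding: Supported by the Anhui Province Science and Technology Key Project, Speech Special Program (11010202192); the National Natural Science Foundation of China (61303150); the Anhui Province Independent Innovation Special Fund, Intelligent Speech Technology R&D and Industrialization Program (13Z02008); and the China Postdoctoral Science Foundation (2012M521248).
Abstract: This paper proposes a prosody conversion method for emotional voice conversion. The method has two parts: F0 conversion and duration conversion. For F0, the contour is parameterized with the discrete cosine transform (DCT); following the hierarchical structure of F0, it is decomposed into a phrase level and a syllable level, and each level is converted separately with a Gaussian mixture model (GMM) based method. For duration, a classification and regression tree (CART) based method converts durations using initials and finals as the basic units. A corpus containing three basic emotions was used for training and testing. Objective and subjective evaluations show that the method converts emotional prosody effectively; the sad emotion reached an accuracy of nearly 100% in the subjective test.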
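The two-level DCT parameterization of F0 described in the abstract can be illustrated with a minimal sketch. The abstract does not give the number of coefficients, the DCT normalization, or how the two levels are separated, so everything below is an assumption: an orthonormal DCT-II, a phrase level taken as the first `n_phrase` coefficients of the whole log-F0 contour, and a syllable level taken as the first `n_syllable` coefficients of the residual inside each syllable. The function names and the `syllable_bounds` argument are hypothetical.

```python
import numpy as np

def dct_basis(n_frames, n_coefs):
    """Orthonormal DCT-II basis, shape (n_coefs, n_frames)."""
    n = np.arange(n_frames)
    k = np.arange(n_coefs)[:, None]
    basis = np.cos(np.pi * (n + 0.5) * k / n_frames) * np.sqrt(2.0 / n_frames)
    basis[0] /= np.sqrt(2.0)  # scale the constant row so rows are orthonormal
    return basis

def dct_encode(contour, n_coefs):
    """Keep the first n_coefs DCT coefficients of a contour."""
    n_coefs = min(n_coefs, len(contour))
    return dct_basis(len(contour), n_coefs) @ contour

def dct_decode(coefs, n_frames):
    """Rebuild a smooth contour of n_frames from truncated coefficients."""
    return dct_basis(n_frames, len(coefs)).T @ coefs

def decompose(log_f0, syllable_bounds, n_phrase=3, n_syllable=4):
    """Split a log-F0 contour into phrase-level and syllable-level parameters.

    Assumed scheme (not confirmed by the abstract): the phrase level is a
    low-order DCT fit of the whole contour, and the syllable level is a
    low-order DCT fit of what remains inside each (start, end) frame range.
    """
    phrase_coefs = dct_encode(log_f0, n_phrase)
    residual = log_f0 - dct_decode(phrase_coefs, len(log_f0))
    syllable_coefs = [dct_encode(residual[s:e], n_syllable)
                      for s, e in syllable_bounds]
    return phrase_coefs, syllable_coefs

def reconstruct(phrase_coefs, syllable_coefs, syllable_bounds, n_frames):
    """Add the syllable-level contours back onto the phrase-level contour."""
    contour = dct_decode(phrase_coefs, n_frames)
    for coefs, (s, e) in zip(syllable_coefs, syllable_bounds):
        contour[s:e] += dct_decode(coefs, e - s)
    return contour
```

Under this scheme each level is a short fixed-length vector, which is what makes a per-level statistical mapping such as a GMM practical; converted coefficients would be passed to `reconstruct` to obtain the target contour.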

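Keywords: Gaussian mixture model; discrete cosine transform; test subjects; feature vector; random order; time region; log domain; root mean square error; Gaussian distribution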
Received: 2013-03-15
Revised: 2013-06-30

Prosody conversion for Mandarin emotional voice conversion
LI Xian, WU Jun, WANG Zengfu. Prosody conversion for Mandarin emotional voice conversion[J]. ACTA ACUSTICA, 2014, 39(4): 509-516. DOI: 10.15949/j.cnki.0371-0025.2014.04.015
Authors:LI Xian  WU Jun  WANG Zengfu
Affiliation: 1. Dept. of Automation, University of Science & Technology of China, Hefei 230027; 2. National Engineering Laboratory of Speech and Language Information Processing, Hefei 230027; 3. Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei 230027
Abstract: A prosody conversion method is proposed for transforming neutral speech into a required target emotion. F0 is modeled by the DCT and converted by a GMM-based method at both the phrase level and the syllable level, while duration is converted by a CART-based method at the phoneme level. A corpus consisting of three basic emotions was used for training and testing. Objective evaluation and listening test results show that the method converts emotional prosody effectively; the sad emotion conversion achieved an accuracy of nearly 100% in the listening test.
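As an illustration of what a GMM-based conversion step can look like, here is a minimal sketch of the widely used joint-density formulation, in which a GMM is fitted on stacked source and target vectors and the converted vector is the conditional mean of the target given the source. The abstract does not say which GMM variant the authors use, so this formulation, the function names, and the parameter layout are all assumptions; the GMM parameters are taken as given rather than trained here.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian at x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(mean) * np.log(2.0 * np.pi) + logdet + maha)

def gmm_convert(x, weights, means, covs, dx):
    """Map a source vector x to the expected target vector E[y | x].

    Assumed layout (hypothetical): each component models the joint vector
    [x, y], so means has shape (M, dx + dy) and covs has shape
    (M, dx + dy, dx + dy); the first dx entries belong to the source.
    """
    # Posterior probability of each component given the source vector only.
    log_post = np.array([
        np.log(w) + gaussian_logpdf(x, m[:dx], c[:dx, :dx])
        for w, m, c in zip(weights, means, covs)
    ])
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    # Posterior-weighted sum of per-component linear regressions.
    y = np.zeros(means.shape[1] - dx)
    for p, m, c in zip(post, means, covs):
        gain = c[dx:, :dx] @ np.linalg.solve(c[:dx, :dx], x - m[:dx])
        y += p * (m[dx:] + gain)
    return y
```

In a setting like the one the abstract describes, `x` would hold the DCT coefficients of a neutral phrase or syllable and the result would be the predicted coefficients for the target emotion, with one model per level.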
This article is indexed in CNKI and other databases.