
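Automatic extraction of the vocal main melody
Cite this article: LU Xiong, XIA Xiuyu, CAI Liang, SUN Wenhui. Automatic extraction of the vocal main melody [J]. Journal of Terahertz Science and Electronic Information Technology (太赫兹科学与电子信息学报), 2019, 17(3): 482-488.
Authors: LU Xiong, XIA Xiuyu, CAI Liang, SUN Wenhui
Affiliation (all four authors): College of Electronic and Information Engineering, Sichuan University, Chengdu, Sichuan 610065, China
Abstract: A vocal main-melody extraction algorithm is proposed, based on multi-candidate fundamental frequency extraction and singing-voice fundamental frequency discrimination. The algorithm effectively reduces the voicing false alarm rate and improves overall accuracy. The distance-metric (DIS) algorithm is used to segment the music into notes, and a variance method is used to detect voiced segments. The Pitch Estimation Filter with Amplitude Compression (PEFAC) multi-pitch extraction technique then extracts several candidate fundamental frequencies for each voiced frame by computing pitch salience. Finally, the Viterbi algorithm tracks the dominant fundamental frequency trajectory of each voiced segment, and a fundamental frequency discrimination model decides whether that trajectory belongs to the singing-voice main melody. Experiments on the MIR-1K dataset show that, at signal-to-interference ratios of 5 dB and 0 dB, the overall accuracy of the vocal main melody extracted by the proposed algorithm reaches 86.22% and 77.4%, respectively, at least 3.79% and 2.01% higher than the other algorithms compared.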

Keywords: main melody; note segmentation; Viterbi algorithm; fundamental frequency discrimination model
Received: 2018-04-23
Revised: 2018-07-15

Automatic extraction of vocal music theme
LU Xiong, XIA Xiuyu, CAI Liang and SUN Wenhui. Automatic extraction of vocal music theme [J]. Journal of Terahertz Science and Electronic Information Technology, 2019, 17(3): 482-488.
Authors: LU Xiong, XIA Xiuyu, CAI Liang and SUN Wenhui
Abstract: This paper presents a vocal theme (main melody) extraction algorithm based on multi-candidate fundamental frequency extraction and singing-voice fundamental frequency discrimination. The algorithm effectively reduces the voicing false alarm rate and improves the overall accuracy. First, the distance-metric (DIS) algorithm is used to segment the music into notes, and the variance method is used to detect voiced segments. Then the Pitch Estimation Filter with Amplitude Compression (PEFAC) multi-fundamental-frequency extraction technique is used to extract multiple candidate fundamental frequencies for each voiced frame by calculating the pitch salience. Finally, the dominant fundamental frequency trajectory of each voiced segment is tracked by the Viterbi algorithm, and a fundamental frequency discrimination model determines whether it belongs to the main melody of the singing voice. Experiments conducted on the MIR-1K dataset show that, at signal-to-interference ratios of 5 dB and 0 dB, the overall accuracy of the vocal themes extracted by the proposed algorithm reaches 86.22% and 77.4%, respectively, at least 3.79% and 2.01% higher than the other algorithms compared.
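The tracking step named in the abstract, choosing one dominant fundamental frequency per voiced frame from several candidates with the Viterbi algorithm, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the cost functions, so the log-salience emission score, the octave-distance transition cost, the `jump_penalty` weight and the toy candidate values below are all assumptions made for illustration. The candidates themselves would come from a multi-pitch front end such as PEFAC, which is not reproduced here.

```python
import numpy as np

def viterbi_f0_track(cand_f0, cand_sal, jump_penalty=1.0):
    """Pick one F0 candidate per frame with the Viterbi algorithm.

    cand_f0      : (T, K) candidate fundamental frequencies in Hz
    cand_sal     : (T, K) positive salience of each candidate
    jump_penalty : assumed cost per octave of pitch jump between frames
    """
    cand_f0 = np.asarray(cand_f0, dtype=float)
    cand_sal = np.asarray(cand_sal, dtype=float)
    T, K = cand_f0.shape
    log_f0 = np.log2(cand_f0)            # pitch on an octave scale
    emit = np.log(cand_sal + 1e-12)      # assumed emission score
    score = np.empty((T, K))
    back = np.zeros((T, K), dtype=int)
    score[0] = emit[0]
    for t in range(1, T):
        # trans[i, j]: cost of moving from candidate i (frame t-1) to j (frame t)
        trans = -jump_penalty * np.abs(log_f0[t - 1][:, None] - log_f0[t][None, :])
        total = score[t - 1][:, None] + trans
        back[t] = np.argmax(total, axis=0)
        score[t] = total[back[t], np.arange(K)] + emit[t]
    # Backtrack from the best final candidate
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(score[-1])
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return cand_f0[np.arange(T), path]

# Toy data (hypothetical): two candidates per frame over four voiced frames.
toy_f0 = np.array([[220.0, 440.0], [222.0, 110.0], [221.0, 330.0], [223.0, 446.0]])
toy_sal = np.array([[0.60, 0.40], [0.45, 0.55], [0.50, 0.50], [0.60, 0.40]])

smooth = viterbi_f0_track(toy_f0, toy_sal, jump_penalty=2.0)
greedy = viterbi_f0_track(toy_f0, toy_sal, jump_penalty=0.0)
print(smooth, greedy)
```

In the toy data the second frame's most salient candidate is an octave below its neighbours. With a non-zero `jump_penalty` the tracker keeps the smooth trajectory near 220 Hz, while with the penalty set to zero it reduces to choosing the most salient candidate in each frame independently.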