
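Automatic Music Genre Classification Based on Auditory Images
Cite this article: Li Qiang, Li Qiuying, Guan Xin. Automatic Music Genre Classification Based on Auditory Images[J]. Journal of Tianjin University (Science and Technology), 2013(1): 67-72.
Authors: Li Qiang  Li Qiuying  Guan Xin
Affiliation: School of Electronic Information Engineering, Tianjin University
Funding: National Natural Science Foundation of China (61101225, 60802049); Tianjin University Independent Innovation Fund (60302015)
Abstract: Automatic music genre classification is an important component of music information retrieval systems. This paper introduces auditory images into music genre classification. An auditory image model simulates the structure of the human cochlea and, on the GTZAN database commonly used in genre classification research, converts one-dimensional audio signals into two-dimensional auditory images. The scale-invariant feature transform (SIFT) and spatial pyramid matching (SPM) are then applied to the auditory images to extract texture features from the local to the global level. Finally, a support vector machine with a linear kernel from LibSVM classifies the music genres. Experimental results show that, compared with a genre classification method based on Mel-frequency cepstral coefficients (MFCC), which are likewise motivated by the structure of the human cochlea, the auditory-image-based method improves classification accuracy by 15%.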

Keywords: music genre classification  auditory image  scale-invariant feature transform  spatial pyramid matching

Automatic Music Genre Classification Based on Auditory Images
Li Qiang, Li Qiuying, Guan Xin. Automatic Music Genre Classification Based on Auditory Images[J]. Journal of Tianjin University (Science and Technology), 2013(1): 67-72.
Authors: Li Qiang  Li Qiuying  Guan Xin
Institution: School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China
Abstract: Automatic music genre classification is an important part of music information retrieval systems. This paper introduces the concept of the "auditory image" into music genre classification. An auditory image model (AIM), which simulates the cochlear structure of the human ear, converts the one-dimensional audio signals of the commonly used GTZAN database into two-dimensional auditory images. The scale-invariant feature transform (SIFT) and spatial pyramid matching (SPM) are then used to extract image features from the local to the global level. Because the resulting features are high-dimensional, a support vector machine with a linear kernel is chosen for classification. Experimental results show that genre classification accuracy based on auditory images can be 15% higher than that of a method based on Mel-frequency cepstral coefficients (MFCC), which are also motivated by the cochlear structure of the human ear.
Keywords: music genre classification  auditory image  scale-invariant feature transform (SIFT)  spatial pyramid matching (SPM)
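
The spatial pyramid step named in the abstract can be illustrated with a minimal sketch. It assumes SIFT descriptors have already been extracted from an auditory image and quantized to visual-word indices. The abstract does not state the vocabulary size, the pyramid depth, or the level weights, so the values below are hypothetical and the weights follow the commonly used SPM formulation rather than the authors' reported settings. The function names are illustrative only.

```python
from typing import List, Sequence, Tuple

# (x, y, visual_word_index) for one quantized local descriptor (assumed input).
Keypoint = Tuple[float, float, int]


def pyramid_dimension(vocab_size: int, levels: int) -> int:
    """Length of the pyramid feature: vocab_size * (1 + 4 + ... + 4**levels)."""
    return vocab_size * (4 ** (levels + 1) - 1) // 3


def spatial_pyramid_histogram(
    keypoints: Sequence[Keypoint],
    width: float,
    height: float,
    vocab_size: int,
    levels: int = 2,
) -> List[float]:
    """Concatenate weighted visual-word histograms over 1x1, 2x2, 4x4, ... grids."""
    feature: List[float] = []
    for level in range(levels + 1):
        cells = 2 ** level  # grid is cells x cells at this level
        # Common SPM weighting (an assumption here): finer levels count more.
        if level == 0:
            weight = 1.0 / 2 ** levels
        else:
            weight = 1.0 / 2 ** (levels - level + 1)
        hists = [[0.0] * vocab_size for _ in range(cells * cells)]
        for x, y, word in keypoints:
            # Clamp so points on the right or bottom edge stay in the last cell.
            col = min(int(x / width * cells), cells - 1)
            row = min(int(y / height * cells), cells - 1)
            hists[row * cells + col][word] += weight
        for cell_hist in hists:
            feature.extend(cell_hist)
    return feature


# Toy example: two descriptors in a 100 x 100 image, vocabulary of 3 words.
toy_keypoints = [(10.0, 10.0, 0), (90.0, 90.0, 2)]
toy_feature = spatial_pyramid_histogram(toy_keypoints, 100.0, 100.0, vocab_size=3)
```

With a hypothetical vocabulary of 200 words and three pyramid levels, `pyramid_dimension(200, 2)` gives 4200 entries per image, which illustrates why the abstract describes the features as high-dimensional and pairs them with a linear kernel.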
This article is indexed in CNKI, VIP, and other databases.