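Syllable-lattice-based Chinese speech retrieval and an index redundancy removal method
Cite this article: ZHENG Tieran, HAN Jiqing. Syllable lattice based Chinese speech retrieval techniques and removing redundancy method from indices[J]. Acta Acustica (声学学报), 2008, 33(6): 526-533. DOI: 10.15949/j.cnki.0371-0025.2008.06.012
Authors: ZHENG Tieran, HAN Jiqing
Affiliation: 1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
Funding: National Natural Science Foundation of China; National Key Basic Research Program of China (973 Program)
Abstract: With the growing volume of speech data on the Internet, there is an urgent need for fast and accurate speech retrieval based on semantic content. In this study of syllable-lattice-based Chinese speech retrieval, a retrieval method built on word spotting is proposed to address the shortcomings of the traditional vector space model approach. To address the information redundancy present in lattice indices, an index redundancy removal method based on a syllable posterior probability histogram is also proposed. Experimental results show that the proposed retrieval method clearly outperforms the vector space model method, and that the proposed redundancy removal method substantially reduces index size and speeds up retrieval.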

Received: 2007-01-23
Revised: 2008-04-15

Syllable lattice based Chinese speech retrieval techniques and removing redundancy method from indices
ZHENG Tieran, HAN Jiqing. Syllable lattice based Chinese speech retrieval techniques and removing redundancy method from indices[J]. ACTA ACUSTICA, 2008, 33(6): 526-533. DOI: 10.15949/j.cnki.0371-0025.2008.06.012
Authors:ZHENG Tieran  HAN Jiqing
Affiliation: 1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
Abstract: The amount of spoken data on the Internet is growing rapidly, so fast, accurate, content-based speech retrieval techniques are needed. In this study of syllable-lattice-based Chinese speech retrieval, a retrieval method based on keyword spotting is presented as an alternative to the method based on the vector space model. A redundancy removal method is also proposed, which uses a syllable posterior probability histogram to distinguish useful information from redundant information and then removes the redundancy from the lattice indices. Experiments show that the proposed retrieval method performs much better than the vector space model method. Moreover, the redundancy removal method yields smaller indices and faster search.
This article is indexed in 万方数据 (Wanfang Data) and other databases.