
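Keyphrase Extraction Based on Decision Trees
Cite this article: LIU Ling-ling, LIANG Ying-hong, ZHANG Yong-gang, HAN Yan, YAO Jian-min. Keyphrase extraction based on decision trees[J]. Journal of Southern Yangtze University (Natural Science Edition), 2010, 9(1): 71-74.
Authors: LIU Ling-ling  LIANG Ying-hong  ZHANG Yong-gang  HAN Yan  YAO Jian-min
Affiliations: 1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006
2. Jiangsu Province Support Software Engineering R&D Center for Modern Information Technology Application in Enterprise, Suzhou, Jiangsu 215104
Funding: Project of the Jiangsu Province Support Software Engineering R&D Center for Modern Information Technology Application in Enterprise (SX200907)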
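Abstract: Keyphrase extraction can be cast as a classification problem, so a decision tree is used to build a classifier that extracts keyphrases. Guided by statistical analysis, the term-frequency factor, first position, and part of speech of each word in a document are used as the classification features of the decision tree, and the feature values are adjusted according to where the word appears in the document. Bagging resampling further improves the extraction performance of the decision tree, bringing the F-measure to 21.50% for exact matching and 54.49% for partial matching.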
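The abstract names Bagging over decision trees but gives no details of the learner or the voting scheme. The following is only a minimal sketch under stated assumptions: each base model is a one-level decision tree (a stump), each feature vector is a hypothetical `(frequency_factor, first_position)` pair, labels are 1 for a keyphrase word and 0 otherwise, and predictions are combined by majority vote. None of the function names come from the paper.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-level decision tree on (feature_vector, label) pairs.

    Tries every observed value of every feature as a threshold and keeps
    the split with the fewest training errors.
    Returns (feature_index, threshold, label_if_leq, label_if_gt).
    """
    best = None
    n_features = len(samples[0][0])
    for f in range(n_features):
        for thr in sorted({x[f] for x, _ in samples}):
            left = [y for x, y in samples if x[f] <= thr]
            right = [y for x, y in samples if x[f] > thr]
            left_label = Counter(left).most_common(1)[0][0]
            # An empty right branch falls back to the left branch's label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, thr, left_label, right_label)
    return best[1:]


def predict_stump(stump, x):
    f, thr, left_label, right_label = stump
    return left_label if x[f] <= thr else right_label


def bagging_fit(samples, n_models=15, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(samples) items with replacement.
        boot = [rng.choice(samples) for _ in samples]
        models.append(train_stump(boot))
    return models


def bagging_predict(models, x):
    """Combine the stumps by majority vote."""
    votes = Counter(predict_stump(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data (invented for illustration): keyphrase words are frequent and early.
toy = [
    ((0.60, 0.05), 1), ((0.70, 0.10), 1), ((0.50, 0.02), 1), ((0.80, 0.08), 1),
    ((0.10, 0.70), 0), ((0.15, 0.90), 0), ((0.20, 0.60), 0), ((0.10, 0.80), 0),
]
ensemble = bagging_fit(toy)
print(bagging_predict(ensemble, (0.90, 0.05)))
```

Resampling with replacement gives each stump a slightly different view of the training set, which is what lets the vote smooth out the instability of a single tree.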

Keywords: keyphrase; extraction; feature; decision tree

Keyphrases Extraction Based on the Decision Tree
LIU Ling-ling, LIANG Ying-hong, ZHANG Yong-gang, HAN Yan, YAO Jian-min. Keyphrases Extraction Based on the Decision Tree[J]. Journal of Southern Yangtze University: Natural Science Edition, 2010, 9(1): 71-74.
Authors: LIU Ling-ling  LIANG Ying-hong  ZHANG Yong-gang  HAN Yan  YAO Jian-min
Institution: 1. School of Computer Science and Technology, Soochow University, Suzhou 215006, China; 2. Jiangsu Province Support Software Engineering R and D Center for Modern Information Technology Application in Enterprise, Suzhou 215104, China
Abstract: In this paper, we use a decision tree to solve the keyphrase extraction problem, since it can be treated as a classification problem. Based on an analysis of scientific and technical literature, the selected features are the frequency factor, the first position, and the part of speech (POS) of each word; the feature values are adjusted according to the position at which the word appears in the document. Finally, extraction performance is further improved by means of the Bagging resampling ...
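The abstract lists the features but does not define them. As a rough illustration only, the sketch below assumes that the frequency factor is a word's count divided by the document length and that the first position is the index of the word's first occurrence divided by the document length. The function name `word_features` is hypothetical, and the POS feature is left out because it would come from an external tagger that the abstract does not specify.

```python
def word_features(tokens):
    """Return {word: (frequency_factor, first_position)} for a tokenised document.

    frequency_factor: occurrences of the word / number of tokens (assumed definition).
    first_position:   index of the first occurrence / number of tokens (assumed definition).
    """
    n = len(tokens)
    counts = {}
    first = {}
    for i, tok in enumerate(tokens):
        w = tok.lower()
        counts[w] = counts.get(w, 0) + 1
        first.setdefault(w, i)  # keep only the earliest index
    return {w: (counts[w] / n, first[w] / n) for w in counts}


tokens = "decision tree extracts keyphrases with a decision tree".split()
features = word_features(tokens)
print(features["decision"])    # (0.25, 0.0)
print(features["keyphrases"])  # (0.125, 0.375)
```

Normalising both values by document length keeps them comparable across documents of different sizes, which matters when the same tree is applied to a whole corpus.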
Keywords:keyphrases  extraction  feature  decision tree  
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.