
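Spam Filtering Based on the Word-Sequence Kernel
Cite this article:CHEN Pu, XIE Xiaoyao, XU Yang. Spam Filtering Based on the Word-Sequence Kernel[J]. Journal of Wuhan University (Natural Science Edition), 2011, 57(5).
Authors:CHEN Pu  XIE Xiaoyao  XU Yang
Affiliation:Key Laboratory of Information and Computing Science of Guizhou Province, Guizhou Normal University, Guiyang 550001, Guizhou, China
Funding:Guizhou Province Industrial Key Research Project (黔科合GY字[2008]3009); Guizhou Provincial Science and Technology Fund (黔科合J字[2011]2213); Doctoral Research Project funded by Guizhou Normal University
Abstract:Traditional spam filtering algorithms that apply kernel methods to word-frequency feature vectors ignore the sequence information between words; the resulting information loss reduces filtering precision. To address this problem, this paper combines the word-sequence kernel with the SVM (support vector machine) algorithm to filter spam email. The corresponding experiments show that the method improves recall, accuracy, and precision, and thereby improves filtering precision.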

Keywords:word-sequence kernel  spam SMS filtering  kernel methods  SVM

The Spam Filtering Based on Word-Sequence Kernel
CHEN Pu, XIE Xiaoyao, XU Yang. The Spam Filtering Based on Word-Sequence Kernel[J]. Journal of Wuhan University: Natural Science Edition, 2011, 57(5).
Authors:CHEN Pu  XIE Xiaoyao  XU Yang
Institution:Key Laboratory of Information and Computing Science of Guizhou Province, Guizhou Normal University, Guiyang 550001, Guizhou, China
Abstract:In spam filtering, traditional kernel methods based on word-frequency feature vectors ignore the sequence information between words, which causes information loss and ultimately reduces filtering precision. To address this problem, this article combines the word-sequence kernel with the SVM algorithm and applies them to spam filtering. The experimental results show that the method can greatly improve filtering precision.
Keywords:word-sequence kernel  spam filter  kernel algorithm  SVM  
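
The abstract does not give the kernel's formulation, so the following is only a minimal sketch of one common reading: a gap-weighted subsequence kernel applied to word tokens rather than characters. The function names, the subsequence length `n`, the decay factor `lam`, the whitespace tokenizer, and the toy messages are all assumptions made for illustration, not details taken from the paper.

```python
import math


def tokenize(text):
    """Split a message into lowercase word tokens (a deliberately naive tokenizer)."""
    return text.lower().split()


def word_sequence_kernel(s, t, n=2, lam=0.5):
    """Gap-weighted subsequence kernel over two word lists.

    Counts word subsequences of length n shared by s and t, where each
    occurrence is down-weighted by lam raised to the span it covers, so
    contiguous matches count more than matches with gaps.
    """
    ls, lt = len(s), len(t)
    if min(ls, lt) < n:
        return 0.0
    # kp[i][a][b] is the auxiliary kernel K'_i on the prefixes s[:a] and t[:b].
    kp = [[[0.0] * (lt + 1) for _ in range(ls + 1)] for _ in range(n)]
    for a in range(ls + 1):
        for b in range(lt + 1):
            kp[0][a][b] = 1.0
    for i in range(1, n):
        for a in range(1, ls + 1):
            kpp = 0.0  # running value of K''_i for the current row
            for b in range(1, lt + 1):
                match = lam * lam * kp[i - 1][a - 1][b - 1] if s[a - 1] == t[b - 1] else 0.0
                kpp = lam * kpp + match
                kp[i][a][b] = lam * kp[i][a - 1][b] + kpp
    total = 0.0
    for a in range(1, ls + 1):
        for b in range(1, lt + 1):
            if s[a - 1] == t[b - 1]:
                total += lam * lam * kp[n - 1][a - 1][b - 1]
    return total


def normalized_kernel(s, t, n=2, lam=0.5):
    """Length-normalized kernel value in [0, 1]; 0.0 if either self-kernel is zero."""
    kss = word_sequence_kernel(s, s, n, lam)
    ktt = word_sequence_kernel(t, t, n, lam)
    if kss == 0.0 or ktt == 0.0:
        return 0.0
    return word_sequence_kernel(s, t, n, lam) / math.sqrt(kss * ktt)


def gram_matrix(docs, n=2, lam=0.5):
    """Pairwise normalized kernel values for a list of raw message strings."""
    toks = [tokenize(d) for d in docs]
    return [[normalized_kernel(a, b, n, lam) for b in toks] for a in toks]


if __name__ == "__main__":
    # Toy messages invented for this sketch.
    messages = [
        "win a free prize now",
        "claim your free prize now",
        "meeting moved to noon tomorrow",
    ]
    for row in gram_matrix(messages):
        print(["%.3f" % v for v in row])
```

A Gram matrix built this way could then be handed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. Unlike a word-frequency vector, this kernel gives a higher score to messages that share words in the same order.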
This article is indexed in CNKI, Wanfang Data, and other databases.