基于N-gram统计模型的搜索引擎中文纠错 (Chinese Spelling Correction in Search Engines Based on N-gram Model)



Cite this article:	陈智鹏, 吕玉琴, 刘华生, 刘刚, 屠辉. 基于N-gram统计模型的搜索引擎中文纠错[J]. 中国电子科学研究院学报, 2009, 4(3).

Authors:	陈智鹏, 吕玉琴, 刘华生, 刘刚, 屠辉

Institution:	School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876

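Abstract:	Keyword correction in search engines is an important auxiliary function for improving retrieval efficiency. This paper proposes a method that relies entirely on analyzing contextual statistical information. Taking the characteristics of the Chinese language into account, it builds N-gram statistical models, analyzes and compares them, and then computes TF/IDF weights to obtain the optimal correction result. Experiments verify that the method achieves automatic checking and correction of keywords entered into a search engine.
Keywords:	search engine input correction; N-gram model; TF/IDF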
Chinese Spelling Correction in Search Engines Based on N-gram Model

CHEN Zhi-peng, LV Yu-qin, LIU Hua-sheng, LIU Gang, TU Hui. Chinese Spelling Correction in Search Engines Based on N-gram Model[J]. Journal of China Academy of Electronics and Information Technology, 2009, 4(3).

Authors:	CHEN Zhi-peng, LV Yu-qin, LIU Hua-sheng, LIU Gang, TU Hui

Institution:	School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract:	Keyword spelling correction plays an important part in improving the efficiency of a search engine. In this article, a method that analyzes only context-sensitive statistics is discussed. In keeping with the characteristics of the Chinese language, the method builds N-gram models, analyzes and compares them, and then calculates TF/IDF weights to obtain the best correction. The correction model is tested in actual practice and is proved effec...

Keywords:	search engine spelling correction; N-gram model; TF/IDF weight
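
The abstract outlines a two-stage idea: score candidate corrections with an N-gram model, then use TF/IDF weights to pick the best one. This record does not give the paper's actual procedure, so the following is only a minimal sketch of that general idea. Everything in it is an assumption made for illustration: the toy corpus `DOCS`, the confusion sets in `CONFUSION`, the function names, the use of character bigrams with add-one smoothing, the top-`k` shortlist, and the smoothed IDF formula.

```python
import math
from collections import Counter
from itertools import product

# Toy indexed corpus and confusion sets: both invented for this sketch.
DOCS = ["北京邮电大学", "北京大学图书馆", "邮电大学电子工程学院", "搜索引擎关键词纠错"]
CONFUSION = {"油": ["邮", "游"], "点": ["电", "店"]}  # hypothetical easily confused characters


def bigrams(text):
    """Character bigrams of text, padded with start (^) and end ($) markers."""
    padded = "^" + text + "$"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


DOC_TF = [Counter(bigrams(d)) for d in DOCS]              # per-document bigram counts
BIGRAMS = sum(DOC_TF, Counter())                          # corpus-wide bigram counts
CONTEXTS = Counter(ch for d in DOCS for ch in "^" + d)    # how often each char starts a bigram
DF = Counter(bg for tf in DOC_TF for bg in tf)            # number of documents containing each bigram
VOCAB = len(set("".join(DOCS)) | {"^", "$"})


def ngram_logprob(text):
    """Add-one smoothed bigram log-probability of text under the toy corpus."""
    return sum(
        math.log((BIGRAMS[bg] + 1) / (CONTEXTS[bg[0]] + VOCAB))
        for bg in bigrams(text)
    )


def tfidf_weight(text):
    """TF-IDF score of text against its best-matching document (smoothed IDF)."""
    def idf(bg):
        return math.log((len(DOCS) + 1) / (DF[bg] + 1)) + 1
    return max(sum(tf[bg] * idf(bg) for bg in bigrams(text)) for tf in DOC_TF)


def candidates(query):
    """The query plus every combination of substitutions allowed by CONFUSION."""
    options = [[ch] + CONFUSION.get(ch, []) for ch in query]
    return ["".join(chars) for chars in product(*options)]


def correct(query, k=3):
    """Shortlist the k most probable candidates by bigram score, then pick by TF-IDF."""
    shortlist = sorted(candidates(query), key=ngram_logprob, reverse=True)[:k]
    return max(shortlist, key=tfidf_weight)


if __name__ == "__main__":
    print(correct("北京油点大学"))  # 北京邮电大学
```

In this sketch the bigram model prunes implausible character sequences, and the TF-IDF step favors the candidate that best matches an indexed document. A real system would need a large corpus, word segmentation, and confusion sets derived from pinyin or glyph similarity, none of which are specified here.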
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.
