Natural scene text detection algorithm based on improved Faster R-CNN



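Cite this article:	YANG Hongzhi, PANG Yu, WANG Huiqian. Natural scene text detection algorithm based on improved Faster R-CNN[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2019, 31(6): 876-884.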

Authors:	YANG Hongzhi, PANG Yu, WANG Huiqian

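Affiliation:	Photoelectronic Information Sensing and Transmission Technology Laboratory, Chongqing University of Posts and Telecommunications, Chongqing 400065, China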

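Funding:	National Natural Science Foundation of China (61301124, 61471075, 61671091); Natural Science Foundation of the Chongqing Science and Technology Commission (cstc2016jcyjA0347); Chongqing University Innovation Team Building Program (Smart Medical Systems and Core Technologies)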

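Abstract:	Text in natural scenes is affected by illumination, smudges, and small character size, which makes it difficult to detect, and traditional detection methods perform poorly. Building on the object detection method Faster R-CNN, this paper proposes an improved method aimed at natural scene text. The improved model consists of three parts: a convolutional neural network feature extraction module, a nested LSTM (nested long short-term memory, NLSTM) module, and a region proposal network (RPN) module. The main improvements are as follows. The feature extraction module adds spatial feature fusion across different convolutional layers, so it can extract multi-level features. The added nested LSTM module learns the sequence features of long text sequences, which helps detect text sequences of variable length. The RPN module uses anchors with a fixed width of 8 pixels and a variable height to extract a series of candidate proposal boxes, which works well for small text. Comparative experiments on standard datasets (ICDAR 2013, Multilingual) show that the proposed algorithm clearly outperforms the original algorithm in both accuracy and efficiency. Tests on real examples show that the improved model also detects small text better.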
Keywords:	region proposal network; spatial features; long-sequence text; proposal boxes; accuracy
Received:	2018/10/11
Revised:	2019/10/8
Natural scene text detection algorithm based on improved Faster R-CNN

YANG Hongzhi, PANG Yu and WANG Huiqian. Natural scene text detection algorithm based on improved Faster R-CNN[J]. Journal of Chongqing University of Posts and Telecommunications, 2019, 31(6): 876-884.

Authors:	YANG Hongzhi, PANG Yu and WANG Huiqian

Institution:	Photoelectronic Information Sensing and Transmission Technology Laboratory, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China

Abstract:	Text in natural scenes is affected by illumination, smudges, and small character size, so it is difficult to detect, and traditional detection methods are not effective. Building on the object detection method Faster R-CNN, this paper proposes an improved method for natural scene text. The improved model consists of a convolutional neural network feature extraction module, a nested long short-term memory (NLSTM) module, and a region proposal network (RPN) module. The main improvements are these: the feature extraction module adds spatial feature fusion across different convolutional layers and can therefore extract multi-level features; the nested LSTM module learns the sequence features of long text sequences, which makes it easier to detect text of variable length; and the RPN module uses anchors with a width of 8 pixels and a variable height to extract a series of candidate proposal boxes, which works better for small text. Comparative experiments on standard datasets (ICDAR 2013, Multilingual) show that the proposed algorithm outperforms the original algorithm in both accuracy and efficiency. Tests on real examples show that the improved model also detects small text better.

Keywords:	region proposal network; spatial features; long-sequence text; proposal boxes; accuracy
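
The abstract describes RPN anchors with a fixed width of 8 pixels and a variable height. The sketch below shows one way such anchors could be laid out over a feature map; it is an illustration under stated assumptions, not the authors' implementation. The function name `make_anchors`, the feature stride of 8, and the list of candidate heights are all hypothetical, since the abstract gives only the anchor width.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels

ANCHOR_WIDTH = 8  # fixed anchor width in pixels, as stated in the abstract


def make_anchors(feat_h: int, feat_w: int, stride: int = 8,
                 heights: Tuple[int, ...] = (8, 16, 32, 64)) -> List[Box]:
    """Return fixed-width, variable-height anchors for every feature-map cell.

    `stride` and `heights` are illustrative assumptions, not values
    taken from the paper.
    """
    anchors: List[Box] = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Centre of the image patch covered by this feature-map cell.
            cx = col * stride + stride / 2
            cy = row * stride + stride / 2
            # One anchor per candidate height, all sharing the same width.
            for h in heights:
                anchors.append((cx - ANCHOR_WIDTH / 2, cy - h / 2,
                                cx + ANCHOR_WIDTH / 2, cy + h / 2))
    return anchors


anchors = make_anchors(feat_h=2, feat_w=3)
print(len(anchors))  # 2 * 3 cells * 4 heights = 24
print(anchors[0])    # (0.0, 0.0, 8.0, 8.0)
```

Because every anchor is only 8 pixels wide, a line of text is covered by many narrow proposals rather than one wide box, which is consistent with the abstract's claim that the design suits small text and text of variable length.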
This article is indexed in Wanfang Data and other databases.