Depth estimation of monocular infrared images based on attention mechanism and graph convolutional neural network

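Cite this article:	ZHU Simin, ZHAO Haitao. Depth estimation of monocular infrared images based on attention mechanism and graph convolutional neural network[J]. Journal of Applied Optics (应用光学), 2021, 42(1): 49-56.

Authors:	ZHU Simin, ZHAO Haitao

Affiliation:	School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China

Funding:	National Natural Science Foundation of China, General Program (61375007).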

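Abstract:	Estimating the depth of objects in a scene is a key problem in autonomous driving, and infrared images make depth estimation feasible under poor lighting. Because infrared images have unclear texture and sparse edge information, this work combines an attention mechanism with a graph convolutional neural network to estimate depth from monocular infrared images. First, the depth of each pixel is related not only to the depths of its neighbouring pixels but also to the depths of pixels over a much larger range; an attention mechanism addresses this by extracting pixel-level global depth associations from the image. Second, the features obtained from these depth associations can be treated as non-Euclidean data, so a graph convolutional neural network (GCN) is used for further reasoning. Finally, during training, the continuous depth regression problem is converted into a classification problem, which stabilises training and reduces the learning difficulty of the network. Experiments show that the method performs well on the infrared data set NUST-SR: for the threshold metric below 1.25³, accuracy improves by 1.2%, an advantage over other methods.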
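The abstract does not give the network details, so the following is only a minimal, dependency-free sketch of the general idea it describes: pairwise attention scores between pixel features serve as a dense graph adjacency, and one graph-convolution step propagates features over that graph. The scaled dot-product affinity, the single ReLU layer, the function names, and all numbers are assumptions made for illustration, not the authors' architecture.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def attention_adjacency(features):
    """Dense affinity between every pair of pixels (graph nodes).

    features: N x C list of per-pixel feature vectors.
    Returns an N x N matrix whose rows each sum to 1.
    Assumption: scaled dot-product similarity followed by softmax.
    """
    scale = math.sqrt(len(features[0]))
    transposed = [list(col) for col in zip(*features)]
    scores = matmul(features, transposed)
    return softmax_rows([[s / scale for s in row] for row in scores])

def graph_conv(adjacency, features, weight):
    """One propagation step over the graph: ReLU(A @ X @ W)."""
    mixed = matmul(matmul(adjacency, features), weight)
    return [[max(0.0, v) for v in row] for row in mixed]

# Toy input: 4 pixels with 2-channel features (made-up numbers).
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
W = [[1.0, -1.0], [0.5, 0.5]]  # hypothetical layer weights
A = attention_adjacency(X)     # every pixel attends to every other pixel
H = graph_conv(A, X, W)        # features after one round of graph reasoning
```

Because the adjacency is dense, each output feature in `H` mixes information from all pixels, not just a local window, which is the property the abstract attributes to the attention step.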
Keywords:	infrared images; depth estimation; attention mechanism; graph convolutional neural network
Received:	2020-09-04
ZHU Simin, ZHAO Haitao. Depth estimation of monocular infrared images based on attention mechanism and graph convolutional neural network[J]. Journal of Applied Optics, 2021, 42(1): 49-56.

Authors:	ZHU Simin, ZHAO Haitao

Institution:	School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China

Abstract:	Depth estimation of objects in a scene is a key issue in the field of autonomous driving, and infrared images help to solve the depth estimation problem under poor lighting conditions. To address the unclear texture and insufficient edge information of infrared images, a combination of an attention mechanism and a graph convolutional neural network is proposed for monocular infrared image depth estimation. First, the depth of each pixel in the image is related not only to the depths of its surrounding pixels but also to the depths of pixels over a larger range; the attention mechanism can effectively extract this pixel-level global depth association. Second, the features obtained from the depth association can be regarded as non-Euclidean data, and a graph convolutional neural network (GCN) is then used for reasoning. Finally, in the training phase, the continuous depth regression problem is transformed into a classification problem, which makes the training process more stable and reduces the learning difficulty of the network. The experimental results show that the proposed method obtains good results on the infrared data set NUST-SR. When the threshold index is less than 1.25³, the accuracy is improved by 1.2%, which is an advantage over other methods.
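Two steps in the abstract lend themselves to a short illustration: turning continuous depth into class labels, and scoring predictions with a threshold accuracy. The sketch below is hedged throughout. The abstract does not state how the depth range is divided, so the log-spaced bins, the depth range, and the bin count are assumptions; the accuracy function assumes the definition commonly used in depth estimation, the fraction of pixels with max(pred/gt, gt/pred) below the threshold. All function names and numbers are hypothetical.

```python
import math

def depth_to_class(depth, d_min, d_max, num_bins):
    """Map a continuous depth to a bin index (assumed log-spaced bins)."""
    depth = min(max(depth, d_min), d_max)  # clamp into the valid range
    ratio = math.log(depth / d_min) / math.log(d_max / d_min)
    return min(int(ratio * num_bins), num_bins - 1)

def class_to_depth(index, d_min, d_max, num_bins):
    """Decode a bin index back to the geometric centre of that bin."""
    ratio = (index + 0.5) / num_bins
    return d_min * (d_max / d_min) ** ratio

def threshold_accuracy(pred, truth, threshold):
    """Fraction of pixels whose depth ratio max(p/t, t/p) is below threshold."""
    hits = sum(1 for p, t in zip(pred, truth) if max(p / t, t / p) < threshold)
    return hits / len(truth)

# Made-up depths in metres, for illustration only.
truth = [10.0, 10.0, 10.0, 10.0]
pred = [10.0, 12.0, 18.0, 30.0]

label = depth_to_class(20.0, d_min=1.0, d_max=100.0, num_bins=10)  # -> 6
acc_1 = threshold_accuracy(pred, truth, 1.25)       # -> 0.5
acc_3 = threshold_accuracy(pred, truth, 1.25 ** 3)  # -> 0.75
```

With class labels in place of raw depths, the network can be trained with a classification loss, and a predicted class is decoded back to a depth with `class_to_depth` before the threshold accuracy is computed.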

Keywords:	infrared images; depth estimation; attention mechanism; graph convolutional neural network
This article is indexed in CNKI, VIP, and other databases.