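Infrared and Visible Multi-Resolution Image Fusion Based on a Multi-Task Convolutional Neural Network
Cite this article:ZHU Wen-qing,ZHANG Ning,LI Zheng,LIU Peng,TANG Xin-yi.Infrared and Visible Multi-Resolution Image Fusion Based on a Multi-Task Convolutional Neural Network[J].Spectroscopy and Spectral Analysis,2023,43(1):289-296.
Authors:ZHU Wen-qing  ZHANG Ning  LI Zheng  LIU Peng  TANG Xin-yi
Institution:1. Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China
2. University of Chinese Academy of Sciences, Beijing 100049, China
3. Key Laboratory of Infrared System Detection and Imaging Technology, Chinese Academy of Sciences, Shanghai 200083, China
Funding:Supported by the National "13th Five-Year" Pre-Research Fund Project (104040402)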
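Abstract:Infrared and visible image fusion has long been a research hotspot in the image field. Fusion can compensate for the deficiencies of a single sensor and provide a good imaging foundation for image understanding and analysis. Owing to limits on manufacturing technology and cost, the resolution of infrared detectors is much lower than that of visible detectors, and this difference in source-image resolution hinders practical application to some extent. To address the inconsistent resolution of infrared and visible images, a multi-task convolutional network framework for infrared image super-resolution reconstruction and fusion is proposed and applied to multi-resolution image fusion. In terms of network structure, first, a dual-channel network is designed to extract infrared and visible features separately, so that the algorithm is not limited by the resolution of the source images; second, a feature up-sampling module is proposed, which first increases the number of pixels by bilinear interpolation and then uses a multilayer perceptron to refine the mapping between the pixel smooth space and the high-frequency space, so that infrared images can be up-sampled at an arbitrary scale without retraining the model; third, linear attention is introduced into the network to learn the nonlinear relationships between spatial positions in the features, suppressing irrelevant information and enhancing the network's expression of global information. In terms of the loss function, a gradient loss is proposed that retains the filter responses with the larger absolute value from the infrared and visible images and computes the Frobenius norm between these values and the filter response of the reconstructed fused image, so that fused images can be generated without an ideal fused image serving as ground truth to supervise network learning; in addition, the multi-task model is optimized under the combined action of the gradient loss and the pixel loss, so that the fused image and the high-resolution infrared image can be reconstructed simultaneously...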
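The first step of the feature up-sampling module, bilinear interpolation to an arbitrary output size, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `bilinear_resize`, the corner-aligned sampling convention, and the use of plain Python lists (chosen only to keep the sketch dependency-free) are all assumptions, and the multilayer-perceptron refinement that follows interpolation in the paper is not reproduced here.

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2-D grid (list of rows) to out_h x out_w by bilinear interpolation.

    Assumption: corner-aligned sampling, i.e. the four corner pixels of the
    input map exactly onto the four corner pixels of the output.
    """
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Fractional source row for output row i.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            # Fractional source column for output column j.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = (1 - wx) * img[y0][x0] + wx * img[y0][x1]
            bottom = (1 - wx) * img[y1][x0] + wx * img[y1][x1]
            row.append((1 - wy) * top + wy * bottom)
        out.append(row)
    return out


# The output size is free, so non-integer scale factors need no special case.
low_res = [[0.0, 2.0], [4.0, 6.0]]
upsampled = bilinear_resize(low_res, 3, 5)
```

Because the output size is just a pair of integers, the same routine serves any scale factor, which is the property the abstract relies on for arbitrary-scale infrared up-sampling.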

Keywords:Infrared and visible image fusion  Multi-resolution image fusion  Linear attention  Gradient loss  Infrared image super-resolution
Received:2021-12-10

A Multi-Task Convolutional Neural Network for Infrared and Visible Multi-Resolution Image Fusion
ZHU Wen-qing,ZHANG Ning,LI Zheng,LIU Peng,TANG Xin-yi.A Multi-Task Convolutional Neural Network for Infrared and Visible Multi-Resolution Image Fusion[J].Spectroscopy and Spectral Analysis,2023,43(1):289-296.
Authors:ZHU Wen-qing  ZHANG Ning  LI Zheng  LIU Peng  TANG Xin-yi
Institution:1. Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China
2. University of Chinese Academy of Sciences, Beijing 100049, China
3. Key Laboratory of Infrared System Detection and Imaging Technology, Chinese Academy of Sciences, Shanghai 200083, China
Abstract:Infrared and visible image fusion has long been a research hotspot in the image field. Fusion can compensate for the deficiencies of a single sensor and provide a good imaging foundation for image understanding and analysis. Owing to limits on production technology and cost, the resolution of infrared detectors is much lower than that of visible detectors, and this difference in source-image resolution hinders practical use to some extent. To address the problem, a multi-task convolutional neural network framework combining infrared super-resolution and image fusion is proposed and applied to infrared and visible multi-resolution image fusion. In terms of network structure, first, a dual-channel network is designed to extract infrared and visible features separately, so that the algorithm is not limited by the resolution of the source images. Second, a feature up-sampling block is proposed: bilinear interpolation increases the number of pixels, and a multilayer perceptron then refines the mapping between the pixel smooth space and the high-frequency space. As a result, infrared images can be up-sampled at an arbitrary scale without retraining the model. Third, a linear attention mechanism is introduced into the network to learn the nonlinear relationships between spatial positions in the features, suppress irrelevant information and enhance the expression of global information. In terms of the loss function, a gradient loss is proposed that retains the filter responses with the larger absolute value from the infrared and visible images and computes the Frobenius norm between these values and the filter response of the reconstructed fused image. Fused images can thus be generated without an ideal fused image serving as ground truth to supervise network learning. Finally, the multi-task model is optimized under the combined action of the gradient loss and the pixel loss, so that the fused image and the high-resolution infrared image are reconstructed simultaneously.
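The gradient loss described above can be illustrated with a small sketch. Several details are assumptions rather than facts from the abstract: the filter is taken to be a 3x3 Laplacian (the abstract does not name the filter), the three images are assumed to have the same size (that is, the infrared image has already been up-sampled), the plain unsquared Frobenius norm is used without any normalization, and the names `filter_response` and `gradient_loss` are hypothetical. Plain Python lists are used only to keep the sketch dependency-free.

```python
import math

# Assumed filter: a 3x3 Laplacian. The abstract does not specify the filter.
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def filter_response(img, kernel=LAPLACIAN):
    """Valid-mode 3x3 correlation of a 2-D grid (list of rows) with kernel."""
    h, w = len(img), len(img[0])
    return [[sum(kernel[a][b] * img[i + a][j + b]
                 for a in range(3) for b in range(3))
             for j in range(w - 2)]
            for i in range(h - 2)]


def gradient_loss(ir, vis, fused):
    """Frobenius norm between the fused image's filter response and a target
    built by keeping, per pixel, whichever source response has the larger
    absolute value."""
    g_ir = filter_response(ir)
    g_vis = filter_response(vis)
    g_fused = filter_response(fused)
    total = 0.0
    for row_ir, row_vis, row_f in zip(g_ir, g_vis, g_fused):
        for a, b, f in zip(row_ir, row_vis, row_f):
            target = a if abs(a) >= abs(b) else b  # keep the stronger response
            total += (target - f) ** 2
    return math.sqrt(total)


# Toy 3x3 images: the infrared image has the stronger edge response here.
ir = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
vis = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
loss_if_fused_equals_ir = gradient_loss(ir, vis, ir)
loss_if_fused_equals_vis = gradient_loss(ir, vis, vis)
```

The target is built from the two source images alone, which is why no ideal fused image is needed as ground truth: the loss falls to zero when the fused image reproduces the stronger edge response at every pixel.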
The proposed approach is trained on the RoadScene dataset and compared with four other related algorithms on the TNO dataset. Subjectively, the proposed method accepts source images of arbitrary resolution, and the fused images show prominent infrared targets and rich visible details. Even when the resolutions of the source images differ greatly, the method still reconstructs high-resolution infrared images with clear features, which indicates robust generalization. Objectively, it performs excellently on multiple evaluation metrics, including entropy, the sum of the correlations of differences and spatial frequency. The experimental results show that the fused images carry a large amount of information and have a high information conversion rate and high clarity, which verifies the effectiveness of the proposed method.
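Two of the objective metrics named above, entropy and spatial frequency, have widely used textbook definitions that can be sketched briefly. The code below follows those common definitions and is not taken from the paper: the function names are hypothetical, the spatial-frequency terms are normalized by the total pixel count (one common convention among several), and the sum of the correlations of differences is left out because it also needs both source images. Plain Python lists are used only to keep the sketch dependency-free.

```python
import math
from collections import Counter


def entropy(img):
    """Shannon entropy, in bits, of the gray-level histogram of a 2-D grid."""
    pixels = [p for row in img for p in row]
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())


def spatial_frequency(img):
    """sqrt(RF^2 + CF^2), where RF and CF are the root-mean-square horizontal
    and vertical first differences (normalized by the pixel count h * w)."""
    h, w = len(img), len(img[0])
    rf2 = sum((img[i][j] - img[i][j - 1]) ** 2
              for i in range(h) for j in range(1, w)) / (h * w)
    cf2 = sum((img[i][j] - img[i - 1][j]) ** 2
              for i in range(1, h) for j in range(w)) / (h * w)
    return math.sqrt(rf2 + cf2)


# Toy image with one vertical edge: two equally likely gray levels.
edge = [[0, 255], [0, 255]]
en = entropy(edge)
sf = spatial_frequency(edge)
```

Under these definitions a higher entropy indicates that the fused image carries more information, and a higher spatial frequency indicates sharper detail, which matches the way the abstract interprets the two metrics.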
Keywords:Infrared and visible image fusion  Multi-resolution image fusion  Linear attention  Gradient loss  Infrared image super-resolution  