Image quality assessment based on ViT and multi-task self-supervised learning
Cite this article: WANG Huacheng, SANG Qingbing, HU Cong. Image quality assessment based on ViT and multi-task self-supervised learning[J]. Journal of Optoelectronics·Laser, 2024, 35(8): 785-792.
Authors: WANG Huacheng, SANG Qingbing, HU Cong
Institution: School of Artificial Intelligence and Computer, Jiangnan University, Wuxi, Jiangsu 214122, China
Funding: Supported by the National Natural Science Foundation of China (62006097) and the Natural Science Foundation of Jiangsu Province (BK20200593)
Abstract: Existing deep-learning-based image quality assessment methods suffer from overfitting and limited generalization because labeled data are scarce. To address this, an image quality assessment method based on multi-task self-supervised learning is proposed. First, distorted images covering 17 distortion types are synthesized algorithmically, and the full-reference mean deviation similarity index (MDSI) score and the distortion type are used as the two labels of each synthesized image. Next, a vision transformer (ViT) is trained with multi-task self-supervised learning to predict both the MDSI score and the distortion type. Finally, the trained model is fine-tuned on the downstream task, so that the semantic features learned in the upstream task are transferred to the downstream task. The proposed method is compared extensively with mainstream no-reference image quality assessment (NR-IQA) methods on several public image quality assessment datasets. On LIVE, CSIQ, TID2013, CID2013 and other datasets, the test results improve by 1 to 2 percentage points over the best-performing competing algorithm, which indicates that the proposed algorithm outperforms most mainstream NR-IQA algorithms.
Keywords: image quality assessment; no-reference; multi-task learning; self-supervised learning; vision transformer (ViT)
Received: 2022-12-25
Revised: 2023-03-31
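
The pre-training step described in the abstract pairs a regression target (the MDSI score) with a classification target (one of 17 distortion types). The abstract does not state which loss functions or task weighting the authors use, so the following is only a minimal sketch of one common way to combine two such objectives for a single sample. The squared-error term, the cross-entropy term, the `score_weight` parameter, and the function names are all assumptions made for illustration, not the paper's formulation.

```python
import math

NUM_DISTORTION_TYPES = 17  # number of synthetic distortion types given in the abstract


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy for one sample, using the log-sum-exp trick for stability."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target_index]


def multitask_loss(pred_score, true_score, type_logits, true_type, score_weight=1.0):
    """Hypothetical joint objective: weighted squared error on the MDSI score
    plus cross-entropy on the distortion type (both choices are assumptions)."""
    if len(type_logits) != NUM_DISTORTION_TYPES:
        raise ValueError("expected one logit per distortion type")
    regression = (pred_score - true_score) ** 2
    classification = softmax_cross_entropy(type_logits, true_type)
    return score_weight * regression + classification


if __name__ == "__main__":
    # Toy values only: a perfect score prediction with uninformative class logits.
    uniform_logits = [0.0] * NUM_DISTORTION_TYPES
    print(multitask_loss(0.4, 0.4, uniform_logits, true_type=3))
```

In a real training loop the two predictions would come from two output heads on a shared ViT backbone, and the weighting between the tasks would need to be tuned; neither detail is specified in the abstract.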