1.
Saliency patterns at small, moderate, and large scales are all valuable to extract in saliency detection. We observe that small- and large-scale saliency patterns appear in datasets with lower probability than moderate-scale ones. As a result, a deep saliency model trained on such datasets converges toward moderate-scale saliency patterns and struggles to infer small- and large-scale patterns, which are encoded inefficiently in the model because of their low probability. This paper therefore presents a novel yet simple saliency detection method using cross-scale deep inference. Moreover, a new network architecture, in which the attention mechanism is exploited at multiple layers, is proposed to improve the receptive fields for saliency patterns of various scales in images of different scales. The presented cross-scale deep inference efficiently improves the representation power of small- and large-scale saliency patterns encoded in multi-scale images. Quantitative and qualitative evaluation demonstrates that our deep model achieves promising results across a wide range of metrics.
2.
This paper proposes AMEA-GAN, an attention-mechanism-enhanced, cycle-consistency-based generative adversarial network for single image dehazing. It follows the mechanism of the human retina and to a great extent preserves the color authenticity of enhanced images. To address the color distortion and fog artifacts that most image dehazing methods produce on real-world images, we take inspiration from human visual neurons and use an attention mechanism resembling the horizontal and amacrine cells of the retina to improve the structure of the generative adversarial network. With the proposed attention mechanism, haze removal becomes more natural and leaves no artifacts, especially in dense fog areas. We also use an improved symmetrical structure of FUNIE-GAN to improve the visual color perception, i.e., the color authenticity, of the enhanced image and to produce a better visual effect. Experimental results show that the proposed model generates satisfactory results: the output of AMEA-GAN bears a strong sense of reality. Compared with state-of-the-art methods, AMEA-GAN not only dehazes images taken in daytime scenes but can also enhance images taken at night and even optical remote sensing imagery.
3.
Existing image inpainting methods fail to make full use of the intact region to predict features of the missing region when object features are severely missing, resulting in discontinuous features and blurred texture details. To address this, a fine inpainting method for incomplete images based on feature fusion and two-step inpainting (FFTI) is proposed in this paper. First, dynamic memory networks (DMN+) fuse the external and internal features of the incomplete image to generate an optimized incomplete image. Second, a generative adversarial network with gradient penalty constraints is constructed to guide the generator in coarsely repairing the optimized incomplete image, yielding a coarse repair map of the target region. Finally, the coarse repair map is further refined using the coherence of relevant features to obtain the final fine repair map. The method is verified by simulation on three image datasets of different complexity and compared with existing mainstream repair models in terms of visual effect and objective metrics. The experimental results show that the proposed model produces more reasonable texture structure, outperforms other models both visually and on objective metrics, and on the most challenging underwater target dataset achieves a peak signal-to-noise ratio of 27.01 and a highest structural similarity index of 0.949.
4.
“Composition” determines the vividness of an image and its narrative power. Current research on image aesthetics considers simple composition rules only implicitly; no reliable composition classification or image optimization method considers composition rules explicitly, and existing composition classification models are not suitable for snapshots. We propose a composition classification model based on spatially invariant convolutional neural networks (RSTN) with translation and rotation invariance, which improves generalization to snapshots and skewed images. The RSTN model improves accuracy by 3% over the baseline, to 90.8762%, and improves rotation consistency by 16.015%. Furthermore, we classify images into three categories by their sensitivity to editing: skew-sensitive, translation-sensitive, and non-space-sensitive, and design a set of composition optimization strategies for each category that effectively adjust the composition to beautify the image.
5.
This paper explores computer multimedia image classification techniques based on deep reinforcement learning. First, it explains the principles of deep reinforcement learning, including agents, environments, states, actions, and reward mechanisms, as well as the use of deep neural networks to approximate value or policy functions. Next, it introduces methods for extracting color, texture, and shape features from multimedia images and the role of convolutional neural networks therein. It then describes the overall model architecture in detail, comprising a convolutional-neural-network perception module and a decision module containing a policy network and a value network, together with the design of the reward mechanism. Finally, it covers model training and optimization from three aspects: training-dataset preparation, training-parameter settings, and optimization with reinforcement learning strategies, providing a comprehensive technical solution for multimedia image classification.
6.
Ultrasound image denoising is essential both for improving the visual quality of ultrasound images and for downstream computer vision tasks. Because feature information in ultrasound images closely resembles the speckle noise signal, existing denoising methods tend to destroy texture features when applied to ultrasound images, which seriously interferes with the accuracy of clinical diagnosis. Therefore, when removing speckle noise, edge and texture information should be preserved as much as possible. This paper proposes RED-SENet, a channel-adaptive denoising model based on a residual encoder-decoder, which effectively removes speckle noise from ultrasound images. An attention deconvolution residual block is introduced into the decoder, enabling the model to learn and exploit global information, selectively emphasizing the content features of key channels while suppressing useless ones, thereby improving denoising performance. Qualitative and quantitative evaluations on two private and two public datasets show that, compared with several state-of-the-art methods, the model achieves significantly better denoising performance, with good noise suppression and structure preservation.
7.
Understanding fish species composition, abundance, and spatiotemporal distribution at the global scale will aid biodiversity conservation. Underwater image collection is one of the main survey methods for obtaining fish species diversity data, but analyzing the image information is time-consuming and labor-intensive. Since 2015, a series of advances have been made in updating marine fish image datasets and optimizing deep learning algorithms, yet fine-grained classification performance remains insufficient and practical application of the research results is relatively weak. This paper first analyzes the demand of marine-related industries for automated fish image classification, then reviews fish image datasets and applications of deep learning algorithms, and analyzes the main challenges, such as fine-grained analysis with small samples, along with corresponding solutions. Finally, it discusses the importance and prospects of deep-learning-based automated marine fish image classification for image-processing research and application platforms in ecological monitoring and other marine industries. The aim is to provide researchers with a quick overview of the background, progress, and future directions of this field.
8.
9.
Existing unsupervised feature learning algorithms usually extract features in the RGB color space, whereas image and video compression standards widely adopt the YUV color space. To exploit the characteristics of human vision and avoid the computation consumed by color-space conversion, this paper proposes an unsupervised feature learning method based on a sparse autoencoder in the YUV color space. Image patches are first sampled randomly in YUV space and whitened, and a sparse autoencoder then learns local features without supervision. In the preprocessing stage, exploiting the mutual independence of the luminance and chrominance channels in YUV space, a whitening scheme that separates luminance from chrominance is proposed. Finally, the learned local features are convolved over large images to obtain global features, which are fed into an image classification system for performance testing. Experimental results show that, with appropriate whitening of the luminance component, unsupervised feature learning in YUV space achieves color image classification performance comparable to or even better than that in RGB space.
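The luminance/chrominance-separated whitening described above can be sketched as follows. This is an illustrative numpy sketch, not the paper's implementation: `zca_whiten`, the patch shapes, and the `eps` value are assumptions, and the Y and UV patches are simply whitened independently.

```python
import numpy as np

def zca_whiten(patches, eps=1e-5):
    """ZCA-whiten flattened patches (n_samples, n_dims): decorrelate the
    dimensions and rescale each principal component to unit variance."""
    centered = patches - patches.mean(axis=0)
    cov = centered.T @ centered / len(centered)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Rotate, rescale to unit variance, rotate back (ZCA keeps orientation).
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ W

def whiten_yuv_patches(y_patches, uv_patches):
    """Whiten luminance and chrominance separately, exploiting the
    (approximate) independence of the Y and UV channels in YUV space."""
    return zca_whiten(y_patches), zca_whiten(uv_patches)
```

After whitening, the patches would be fed to the sparse autoencoder for local feature learning.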
10.
To address the low sign language recognition rate under small-sample conditions, a simple and effective static gesture recognition method based on an attention mechanism is proposed. The method enhances the features of both the details and the subject of the gesture image. Its input is the intermediate feature map generated by the original network. The proposed convolutional module is lightweight and general: it can be seamlessly integrated into any CNN (convolutional neural network) architecture and achieves significant performance gains with minimal overhead. Experiments on two different datasets show that the proposed method is effective, improving the sign language recognition accuracy of the benchmark model and outperforming existing methods.
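A plug-in attention module operating on an intermediate feature map, as described above, might look like the following SE-style channel-attention sketch. This is an assumed form for illustration (the weight shapes `w1`, `w2` and the bottleneck design are not from the abstract): pool each channel, pass through a small MLP, and gate the channels with the result.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Recalibrate an intermediate feature map of shape (C, H, W).
    w1: (C, C//r) and w2: (C//r, C) form a small bottleneck MLP."""
    squeeze = feat.mean(axis=(1, 2))                     # global average pool -> (C,)
    excite = sigmoid(np.maximum(squeeze @ w1, 0) @ w2)   # per-channel gates in (0, 1)
    return feat * excite[:, None, None]                  # rescale each channel
```

Because it only reads and rescales the feature map, such a module can be dropped between any two layers of an existing CNN.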
11.
Image quality is crucial in all image-based applications, especially in computer vision. Hence, a lightweight, high-performance image super-resolution scheme that enhances the quality of acquired images is essential for such applications to function satisfactorily. Most image super-resolution schemes ignore the extraction and processing of negative-valued image features. In this paper, a novel lightweight residual block that efficiently extracts and processes both positive- and negative-valued features is proposed. This residual block produces a richer set of features, improving the super-resolution performance of a network built from such blocks. The network using the new residual blocks is shown to outperform existing lightweight super-resolution networks that use other types of residual blocks.
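One common way to keep negative-valued features, which a plain ReLU discards, is a CReLU-style split. The abstract does not give the block's exact design, so the following is a hedged sketch of the general idea only:

```python
import numpy as np

def pos_neg_features(x):
    """Split activations into positive and negative parts (CReLU-style),
    doubling the channel count so negative-valued responses survive
    instead of being zeroed out by a plain ReLU. x: (C, ...)."""
    return np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)], axis=0)
```

The original signal is recoverable as the difference of the two halves, so no information is lost at the activation.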
12.
Image completion is a challenging task that aims to fill missing or masked regions in images with plausibly synthesized content. In this paper, we focus on face image inpainting: reconstructing missing or damaged regions of an incomplete face image given the context information. We design a U-Net architecture specifically for this problem. The proposed U-Net-based method combines hybrid dilated convolution (HDC) and spectral normalization to fill missing regions of any shape with sharp structures and finely detailed textures. We perform both qualitative and quantitative evaluation on two challenging face datasets. Experimental results demonstrate that our method outperforms previous learning-based inpainting methods and generates realistic, semantically plausible images.
13.
Prior image deraining work has two main problems: (1) it does not generalize well across datasets; (2) too much detail is lost in the heavy-rain areas of the image. To overcome these problems, we propose a new two-stage Adversarial Residual Refinement Network (ARRN) for heavy rain images. For the first problem, we introduce a new implicit rain model that represents a rainy image as the composition of a background image and a residual image; based on this model, ARRN consists of an image decomposition stage and an image refinement stage. For the second problem, a new attention Wasserstein GAN (WGAN) loss in the refinement stage forces the network to focus on refining heavily degraded areas. Comprehensive experiments demonstrate the effectiveness of the proposed approach.
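Reading the rain model as additive (a common interpretation; the abstract calls it "implicit" and does not state the exact composition operator), the decomposition stage can be sketched as:

```python
import numpy as np

def decompose(rainy, background_estimate):
    """Stage 1 (assumed additive model O = B + R): the residual (rain)
    image is what remains after subtracting the estimated background."""
    return rainy - background_estimate

def recompose(background, residual):
    """The model is self-consistent: B + R reconstructs the rainy input,
    which is what lets a refinement stage operate on the residual alone."""
    return background + residual
```

In ARRN the background estimate would come from a learned network rather than this direct subtraction; the sketch only shows the model's consistency constraint.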
14.
To improve image quality assessment (IQA) methods, it is widely believed that we must extract image features that are highly representative of human visual perception. In this paper, we propose a novel IQA algorithm that leverages an optimized convolutional neural network architecture designed to automatically extract discriminative image quality features. The algorithm uses local luminance coefficient normalization, dropout, and other advanced techniques to further improve the network's learning ability. The proposed IQA algorithm is implemented on a Field Programmable Gate Array (FPGA) and evaluated on two public databases. Extensive experimental results show that our method outperforms many existing IQA algorithms in both accuracy and speed.
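Local luminance normalization in IQA is typically a divisive normalization (as in MSCN coefficients): subtract the local mean and divide by the local standard deviation. The window size, box filter, and stabilizing constant below are assumptions for illustration, not the paper's exact settings:

```python
import numpy as np

def local_luminance_normalize(img, k=3, c=1.0):
    """Divisive normalization of a grayscale image: subtract the local mean
    and divide by the local standard deviation (plus constant c for
    stability), over a (2k+1)x(2k+1) box-filter neighborhood."""
    pad = np.pad(img, k, mode="reflect")
    win = 2 * k + 1
    # Stack all shifted views of the padded image; their mean/var per pixel
    # are the local window statistics.
    views = np.stack([pad[i:i + img.shape[0], j:j + img.shape[1]]
                      for i in range(win) for j in range(win)])
    mu = views.mean(axis=0)
    sigma = np.sqrt(np.maximum(views.var(axis=0), 0.0))
    return (img - mu) / (sigma + c)
```

The resulting coefficients are roughly decorrelated and centered, which is why they make good low-level inputs for a quality-prediction network.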
15.
Attention mechanisms have proven effective for human gaze estimation, and the attentiveness and diversity of the learned features are two important aspects of such mechanisms. However, the traditional attention mechanism used in existing gaze models tends to exploit first-order information that is attentive but not diverse. Although existing bilinear pooling-based attention overcomes this shortcoming, it is limited in extracting high-order contextual information. We therefore introduce a novel bilinear pooling-based attention mechanism that extracts second-order contextual information through interactions between local deep features. To make the gaze-related features robust to spatial misalignment, we further propose an attention-in-attention method consisting of global average pooling and an inner attention over the second-order features. For gaze estimation, a new bilinear pooling-based attention network with attention-in-attention is then proposed. Extensive evaluation shows that our method surpasses the state of the art by a large margin.
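The second-order interaction between local features that bilinear pooling captures can be sketched in its simplest form: an averaged outer product of local feature vectors. This is the generic operation, not the paper's specific attention variant:

```python
import numpy as np

def bilinear_pool(feats):
    """Second-order (bilinear) pooling: average the outer products of local
    feature vectors, capturing pairwise channel interactions.
    feats: (n_locations, d) local deep features -> (d, d) Gram-like matrix."""
    return feats.T @ feats / len(feats)
```

The resulting matrix is symmetric and positive semi-definite; the paper's attention-in-attention would then pool and re-weight entries of this second-order representation.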
16.
Image quality assessment is indispensable in computer vision applications such as image classification and image parsing. With the development of the Internet, image data acquisition has become more convenient; however, image distortion is inevitable due to imperfect acquisition systems, transmission media, and recording equipment. Traditional image quality assessment algorithms focus only on low-level visual features such as color or texture and cannot encode high-level features effectively. CNN-based methods have shown satisfactory results, but existing methods still suffer from incomplete feature extraction, partial image block distortion, and inability to determine scores. In this paper, we therefore propose a novel deep learning framework for image quality assessment that incorporates both low-level visual features and high-level semantic features to better describe images, analyzing image quality in a parallel processing mode. Experiments conducted on the LIVE and TID2008 datasets demonstrate that the proposed model predicts the quality of distorted images well, with both SROCC and PLCC reaching 0.92 or higher.
17.
To address the large intra-class scatter and inter-class similarity that commonly limit video classification performance, this paper proposes a video classification method based on deep metric learning. The method designs a deep network consisting of three parts: feature learning, a similarity metric based on deep metric learning, and classification. The similarity metric works as follows: first, the Euclidean distance between features is computed as the semantic distance between samples; second, a margin-allocation function is designed to dynamically assign a semantic margin according to the semantic distance; finally, the error is computed from the samples' semantic margins and back-propagated, so that the network learns the differences in semantic distance between samples and automatically focuses on hard samples in order to learn their features fully. The network is trained with multi-task learning, learning the similarity metric and the classification task jointly to reach a global optimum. Experimental results on UCF101 and HMDB51 show that, compared with existing methods, the proposed method effectively improves video classification accuracy.
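The distance computation and margin allocation described above might take a form like the following. The abstract does not give the margin-allocation function, so `dynamic_margin` (its base, scale, and cap, and even its monotone-increasing shape) is purely a hypothetical illustration of a margin that varies with semantic distance:

```python
import numpy as np

def euclidean_distance(a, b):
    """Semantic distance between two feature vectors."""
    return np.linalg.norm(a - b)

def dynamic_margin(distance, base=0.2, scale=0.5, max_margin=1.0):
    """Hypothetical margin-allocation function: the margin varies with the
    semantic (Euclidean) distance and saturates at max_margin. All
    constants here are illustrative, not the paper's."""
    return min(base + scale * distance, max_margin)
```

In training, the margin produced per sample pair would enter a metric-learning loss, whose gradient then emphasizes hard pairs.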
18.
19.
Nowadays, more and more people keep pets such as cats and dogs. Many pet owners find it hard to look after their pets at all times because of business trips or travel, so the pets' health cannot be guaranteed. How to care for pets conveniently and efficiently without disrupting normal life and work has become a problem that urgently needs solving. Against this background, this paper designs an intelligent feeding system for pet cats and dogs with a Raspberry Pi 3B+ as the core device, applying advanced technologies such as deep learning and image recognition. Based on an infrared sensing system and a convolutional neural network model, combined with digital image processing and image classification, the system accurately detects and identifies the pet's category and feeds it accordingly, making the whole feeding process more intelligent and simpler. This not only reduces the pets' dependence on their owners but is also of great significance for improving people's quality of life and the ecological environment.
20.
《Digital Communications & Networks》2022,8(6):942-954
Nowadays, short texts are widely found in various social data related to the 5G-enabled Internet of Things (IoT). Short text classification is a challenging task because of sparsity and the lack of context. Previous studies mainly tackle these problems by enhancing either the semantic information or the statistical information individually; however, the improvement achievable with a single type of information is limited, while fusing several kinds of information can improve classification accuracy more effectively. To fuse various information for short text classification, this article proposes a feature fusion method that integrates the statistical feature and a comprehensive semantic feature using a weighting mechanism and deep learning models. In the proposed method, we apply Bidirectional Encoder Representations from Transformers (BERT) to generate sentence-level word vectors automatically, then obtain the statistical feature, the local semantic feature, and the overall semantic feature using the Term Frequency-Inverse Document Frequency (TF-IDF) weighting approach, a Convolutional Neural Network (CNN), and a Bidirectional Gated Recurrent Unit (BiGRU), respectively. The fused feature is then used for classification. Experiments conducted on five popular short text classification datasets and a 5G-enabled IoT social dataset show that the proposed method effectively improves classification performance.
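The weighted fusion of the three feature views can be sketched as a weighted concatenation. The weights below are illustrative placeholders, not the paper's learned or reported values, and the feature vectors are assumed to already be extracted:

```python
import numpy as np

def fuse_features(statistical, local_semantic, overall_semantic,
                  weights=(0.2, 0.4, 0.4)):
    """Weighted concatenation of the three feature views described above
    (TF-IDF statistical, CNN local semantic, BiGRU overall semantic).
    The weights are hypothetical, purely for illustration."""
    parts = [w * np.asarray(f) for w, f in
             zip(weights, (statistical, local_semantic, overall_semantic))]
    return np.concatenate(parts)
```

The fused vector would then feed a standard classifier head; concatenation keeps each view's dimensions separate while the weights control their relative influence.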