Similar literature
20 similar documents found.
1.
施丽红 《光学技术》2020,(6):750-756
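To address the low accuracy of dynamic gesture recognition in complex environments, a dynamic gesture recognition algorithm based on a long short-term memory (LSTM) network and a convolutional neural network (CNN) is proposed. The LSTM network learns the weight of each filter and predicts a filter bank related to the shape of the human body; the CNN extracts the trajectory map of the target gesture and creates a color trajectory image; the trajectory image is then fed into an attention-based CNN for training, and the network recognizes gestures in complex environments. Experimental results show that the algorithm accurately detects and tracks the dynamic changes of gestures and achieves good gesture recognition accuracy.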

2.
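Hyperspectral image classification based on a dilated-convolution attention neural network   Total citations: 1, self-citations: 0, citations by others: 1
To address the low classification accuracy of hyperspectral images when training samples are limited, a cascaded 3D-2D convolutional neural network model combining dilated convolution with an attention mechanism is proposed. First, the model uses a cascaded 3D-2D CNN as its basic structure: 3D convolution extracts the spectral-spatial features of the hyperspectral image simultaneously, and 2D convolution further extracts high-level spatial semantic information. Then, dilated convolution is introduced to enlarge the receptive field of the convolution kernels, constructing a multi-scale feature extraction...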

3.
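Cucumber leaf disease recognition based on the visible spectrum and support vector machines   Total citations: 1, self-citations: 0, citations by others: 1
Taking cucumber leaf diseases as the research object, this work recognizes the diseases from differences in visible-spectrum reflectance and studies an SVM-based recognition and prediction model. The data are preprocessed with a wavelet transform. Three segmentation methods (Otsu, edge-based segmentation, and K-means clustering) are applied to lesion segmentation and compared by misclassification rate and running time; K-means clustering proves better suited to segmenting cucumber leaf lesions. Texture, color, and shape features are extracted, 15 feature parameters in total. The optimal parameters c and g are selected by cross-validation to optimize the kernel parameters, and the correct recognition rates of SVMs with linear, polynomial, and RBF kernels are compared; the RBF-kernel SVM recognizes cucumber leaf diseases most accurately. The SVM-based method is also compared with two other common methods for cucumber leaf disease recognition, a BP neural network and fuzzy clustering. The results show that the SVM-based model achieves a correct recognition rate of 95% for downy mildew and 90% for both powdery mildew and brown spot, with an average diagnostic accuracy of 92%. This method gives the best recognition performance and the shortest running time, and provides a reference for visible-spectrum-based cucumber disease recognition models.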
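
The abstract above names Otsu thresholding as one of the segmentation candidates but gives no implementation details. As a minimal sketch of the general technique only (not the authors' code), the function below picks the gray level that maximizes between-class variance; the 256-level range and the flat list of integer pixel values are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    `pixels` is a flat list of integer gray values in [0, levels).
    Pixels <= t form one class, pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of gray values in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue      # lower class still empty
        if w1 == 0:
            break         # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Hypothetical two-mode "image": dark lesion pixels and bright leaf pixels.
sample = [10] * 50 + [200] * 50
threshold = otsu_threshold(sample)
lesion_mask = [p <= threshold for p in sample]
```

Any threshold between the two modes separates this toy sample equally well; the function returns the first such level it encounters.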

4.
梁联晖  李军  张绍泉 《光子学报》2021,50(9):276-288
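In hyperspectral image classification, conventional convolutional neural network models carry a large amount of redundant spatial feature information in the spatial dimensions of the generated feature maps. They also treat the spectral band data of a single pixel as an unordered high-dimensional vector, which does not match the nature of spectral data and severely degrades running efficiency and classification performance. To address this problem, a hyperspectral image classification method combining 3D Octave convolution with a bidirectional recurrent neural network attention network is proposed. First, ...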

5.
施宗晗  赵海涛 《应用光学》2022,43(5):893-903
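With the development of computer technology, deep-learning-based object tracking has become an important research direction in computer vision; however, complex and changing tracking environments mean that tracking algorithms still face major challenges such as background interference and similar colors. Compared with conventional color images, hyperspectral images contain rich radiometric, spatial, and spectral information and can effectively improve tracking accuracy. A method combining an attention mechanism with additive angular margin loss (AAML) is proposed for object tracking in hyperspectral images. Multi-domain neural networks are fused to extract features from different band combinations, and a fused attention model is designed so that similar features from different band combinations are integrated and reinforced; when target and background colors are similar, the network attends more to the target object, making the tracking result more accurate. On this basis, to make the separation of target and background more discriminative, the network uses additive angular margin loss as its loss function, which during training effectively reduces the intra-class distance of same-class samples and increases the inter-class distance between positive and negative samples, improving the accuracy and stability of the network. Experimental results show that the proposed method improves the two tracking accuracy metrics, precision and success rate, by 1.3% and 0.3% respectively, giving it an advantage over other methods.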
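
The abstract above does not spell out how the additive angular margin loss is computed. The sketch below shows the commonly used form of the idea: add a margin m to the angle between the feature and its own class weight, then scale and apply softmax cross-entropy. The function names and the values `margin=0.5` and `scale=30.0` are illustrative assumptions, not settings from the paper, and the sketch ignores the edge case where the angle plus the margin exceeds pi.

```python
import math


def aaml_logits(cosines, target, margin=0.5, scale=30.0):
    """Apply an additive angular margin to the target-class cosine.

    `cosines[j]` is the cosine similarity between the normalized feature
    and the normalized weight vector of class j (assumed precomputed).
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard against rounding outside [-1, 1]
        if j == target:
            # Penalize the true class: cos(theta + m) <= cos(theta)
            # as long as theta + m stays within [0, pi].
            c = math.cos(math.acos(c) + margin)
        logits.append(scale * c)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


# Hypothetical two-class case (target vs. background).
plain = aaml_logits([0.8, 0.1], target=0, margin=0.0)
with_margin = aaml_logits([0.8, 0.1], target=0)
loss_plain = cross_entropy(plain, 0)
loss_margin = cross_entropy(with_margin, 0)
```

Because the margin lowers only the true-class logit, the same sample incurs a larger loss, which pushes training toward tighter intra-class angles and wider inter-class angles.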

6.
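Cell images acquired with phase-contrast microscopes have uneven brightness and low contrast between cells and background. To address this, a cell segmentation model that uses U-Net as its basic framework and combines residual blocks with an attention mechanism is proposed. First, a U-Net with an encoder-decoder structure performs an initial segmentation of the cell image; then, residual blocks are introduced into the U-Net to strengthen feature propagation and extract more cell detail; finally, an attention mechanism is used to increase the weight of cell regions...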

7.
曹阳  张祖鹏  彭小峰 《光子学报》2022,(12):117-130
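In free-space optical communication, atmospheric turbulence distorts the phase of vortex beams and degrades communication system performance. To address this, an adaptive-optics wavefront restoration method based on a residual attention network is proposed. To prevent network degradation, a residual network is first adopted as the backbone, on which a multi-scale residual hybrid attention network structure is built; convolution operations convert the intensity image into feature maps that are passed on to subsequent layers. Next, convolution kernels of different scales extract features in a distributed manner, and an attention mechanism raises the network's recognition rate for damaged light-spot features, strengthening its ability to represent intensity-image features. Finally, a network loss function incorporating practical evaluation metrics is designed to obtain Zernike coefficients consistent with the actual wavefront aberration. Simulations under different atmospheric turbulence strengths show that the residual attention network reconstructs the turbulent phase quickly and accurately: the peak-to-valley value of the restored residual aberration lies between 0.05 and 0.3 rad, and the root mean square lies between 0.01 and 0.07 rad.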
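
The two figures of merit quoted above, peak-to-valley (PV) and root mean square (RMS) of the residual aberration, can be computed from sampled residual phase values as sketched below. This is a generic illustration: the sample values are invented, and removing the mean (piston) before taking the RMS is an assumption, since the abstract does not say how the authors define it.

```python
import math


def peak_to_valley(phase):
    """Difference between the largest and smallest residual phase, in rad."""
    return max(phase) - min(phase)


def rms(phase):
    """RMS of the residual phase about its mean (piston removed), in rad."""
    mean = sum(phase) / len(phase)
    return math.sqrt(sum((p - mean) ** 2 for p in phase) / len(phase))


# Hypothetical residual wavefront samples in radians.
residual = [0.0, 0.1, -0.1, 0.0]
pv_value = peak_to_valley(residual)
rms_value = rms(residual)
```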

8.
关世豪  杨桄  卢珊  付严宇 《光学学报》2020,(21):181-190
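The attention mechanism of neural networks can extract key information from data; applying this property to hyperspectral band selection helps to fully learn the interdependence and nonlinear relationships among bands and to extract the more important bands. A multi-objective optimization hyperspectral band selection algorithm based on the attention mechanism is proposed. First, a network is built from an attention module and an autoencoder. Then, one-dimensional spectral data are used as the network input, and the input data are trained with two loss functions combined with a multi-objective optimization method, so that the attention module embedded in the network fully learns the nonlinear relationships among bands and assigns larger weights to bands that are informative and easy to classify, thereby achieving band selection. Finally, a support vector machine classifier and the average spectral divergence are used to verify the performance of the band subset. Experimental results show that, compared with other algorithms, the band subsets extracted by the proposed algorithm on the Botswana and Indian Pines datasets yield higher classification accuracy and carry more information, demonstrating the effectiveness of the proposed algorithm for hyperspectral band selection.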

9.
朱应俊  周文君  朱川  马建敏 《应用声学》2023,42(5):1090-1098
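To help machines better understand human emotion and improve human-computer interaction, speech features and classification networks can be fused to improve emotion recognition performance. From the perspective of network fusion, this paper uses two front-end networks: a 2D convolutional neural network based on Mel-frequency cepstral coefficients and inverted Mel-frequency cepstral coefficients, and a long short-term memory network based on scattering convolution network coefficients. The intermediate layers of the front-end networks are taken as utterance-level feature representations, a squeeze-and-excitation (SE) channel attention mechanism adjusts the weights of these intermediate layers and fuses them, and a deep neural network back-end classifier then outputs the emotion classification. Comparative experiments with five-fold cross-validation on a Chinese emotion dataset show that network fusion based on the SE channel attention mechanism effectively exploits the strengths of the different front-end networks in speech emotion recognition and improves recognition accuracy.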
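
The squeeze-and-excitation step mentioned above can be sketched in a few lines. This is a generic illustration of SE channel gating (squeeze by global averaging, excite through two small fully connected layers, then rescale each channel), not the fusion network from the paper; the weight matrices `w1` and `w2` are hypothetical, and biases are omitted for brevity.

```python
import math


def se_reweight(channels, w1, w2):
    """Rescale each channel by a learned gate in (0, 1).

    channels: list of C channels, each a flat list of activations.
    w1: hidden x C weight matrix (bottleneck layer, followed by ReLU).
    w2: C x hidden weight matrix (followed by a sigmoid).
    Returns (rescaled_channels, gates).
    """
    # Squeeze: one summary value per channel (global average pooling).
    squeezed = [sum(ch) / len(ch) for ch in channels]
    # Excitation, layer 1: linear + ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, layer 2: linear + sigmoid gives one gate per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every activation in a channel by that channel's gate.
    rescaled = [[g * v for v in ch] for g, ch in zip(gates, channels)]
    return rescaled, gates


# Hypothetical two-channel feature map and weights.
features = [[1.0, 3.0], [2.0, 2.0]]
w1 = [[1.0, -1.0]]     # 1 hidden unit, 2 input channels
w2 = [[1.0], [-1.0]]   # 2 output gates, 1 hidden unit
rescaled, gates = se_reweight(features, w1, w2)
```

With these toy weights both channel means are equal, so the hidden activation is zero and both gates come out at 0.5.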

10.
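Biometric recognition plays an important role in information security. As a newer biometric modality, palmprint recognition offers low distortion, non-invasiveness, and high uniqueness. Most conventional palmprint studies acquire images in grayscale with natural-light imaging systems, which makes further gains in recognition accuracy difficult. To obtain more identity-discriminating information, multispectral palmprint images are proposed as a replacement for natural-light palmprint images. Existing palmprint recognition algorithms ignore the characteristics of different spectra, which leads to loss of texture detail and low recognition accuracy; to address this, a palmprint recognition algorithm based on multispectral image fusion is proposed. The method applies fast and adaptive bidimensional empirical mode decomposition (FABEMD) to palmprint images under different spectra, decomposing each multispectral palmprint image into a series of bidimensional intrinsic mode functions (BIMFs) ordered from high to low frequency plus a residual component, which can be regarded as a preliminary estimate of the low-frequency information of that spectral image. Illumination is difficult to keep stable during image acquisition, and near-infrared images are sensitive to illumination changes under FABEMD, which tends to leave excessively redundant background information in the resulting BIMFs. The decomposed near-infrared palmprint images therefore undergo background reconstruction and feature refinement, which smooths redundant background information while effectively enhancing the representation of high-frequency information. To avoid the overexposure caused by fusing the processed images directly, the near-infrared features are compressed before fusion. In addition, an improved residual network with an attention mechanism (IRCANet) is proposed for classifying the fused palmprint images. A staged residual structure is introduced into the network to alleviate network degradation and effectively reduce information loss during learning. For fused multispectral palmprint images, the staged residual structure passes image information stably through the network but does not distinguish high- and low-frequency information strongly enough; so that the network attends to more discriminative features, a channel attention mechanism that exploits the interdependence among feature channels is incorporated into the staged residual structure. Finally, comprehensive experiments on the Hong Kong Polytechnic University (PolyU) multispectral palmprint dataset show that the method achieves good results, with a recognition accuracy of up to 99.67% and good real-time performance.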

11.
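A dual-channel residual convolutional neural network with independent weights is built to extract features from target images in the visible and infrared bands, generating groups of multi-scale composite-band feature maps. The saliency of the dual-band feature maps is computed from the Euclidean distance between image points, and the maps are fused adaptively according to the feature contribution of the target in each imaging band. Using a heat-source energy pooling kernel and a visual attention mechanism, logical masks of the target's regions of interest in the two bands are generated and superimposed on the fused image, highlighting target features and suppressing...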

12.
    
With the emergence of Optical Coherence Tomography (OCT) technology and the rapid development of computer hardware, researchers have been attempting to use computer-aided methods to identify breast tissue OCT images. This study proposes a Convolutional Neural Network (CNN) model based on an enhanced residual network for auxiliary diagnosis of OCT breast tissue images. The proposed method uses the ResNet-18 framework as its basis to prevent vanishing or exploding gradients. In addition, to improve the efficiency and regularization of the model, the original 7×7 convolutional layer is replaced with three cascaded 3×3 layers. This design choice allows for a nonlinear decomposition of the original 7×7 layer while preserving the same receptive field size. It therefore has two advantages: it reduces the computational cost associated with model parameters, and it acts as an implicit regularization technique. Subsequently, a Convolutional Block Attention Module (CBAM) is introduced after each set of cascaded small convolutional layers and after the final convolutional layer. This module integrates a spatial attention module, which focuses on capturing spatial dependencies, and a channel attention module, which emphasizes informative channels, thereby serving as the first filtering stage and enhancing the discriminative capability of the network. At the same time, Octave Convolution (OctConv) is used to replace the 3×3 convolutional layers in the original model. The convolutional kernel of OctConv can partition the input image sample data into four parts based on the low-frequency dimension ratio parameter within its structure. This allows the network to dynamically balance high- and low-frequency components while extracting image features, and serves as the second filtering stage. 
After this, a Global Average Pooling (GAP) layer is used instead of the fully connected layer to reduce the computation of network parameters, and the structure of the network is regularized to prevent overfitting. Ultimately, a residual network model, the "Double Filtering" ResNet (DF-ResNet), is constructed. The "double filtering" structure not only reduces the overall parameter computation of the model but also focuses on high-frequency components with rich structural information during feature extraction. Decreasing the ratio of low-frequency components reduces redundant information, which improves the model's ability to classify and recognize images with similar structures. The proposed DF-ResNet model is used to train on and classify an OCT image dataset of three breast tissue types. With multiple tuning tests and optimization techniques such as data augmentation and batch normalization, it achieves an overall classification accuracy of 96.88%. Three comparative experiments validate the performance of the DF-ResNet model. In the first experiment, the DF-ResNet model was compared with the ResNet-28 model with an equivalent number of layers. The experimental results showed that replacing some convolutional layers with OctConv allows the model to focus on high-frequency components during image feature extraction. This improved the model's ability to classify and recognize images with similar structures and ultimately increased overall classification accuracy. Additionally, the use of OctConv did not negatively impact the overall convergence speed of the model. In the second experiment, the DF-ResNet model was compared with the Oct_ResNet-28 model, which incorporates OctConv as a modification to improve its performance. 
The experimental results validated that the DF-ResNet model effectively filtered out low-frequency information. In the third experiment, the performance of the DF-ResNet model was assessed against established CNN models such as DenseNet-169 and VGG-19. The DF-ResNet model not only has fewer parameters but also demonstrated higher classification accuracy than several traditional CNN models. For the classification of OCT images of breast tissue, the DF-ResNet model displayed exceptional performance, robustness, and real-time processing capabilities. As a result, it is well suited to providing technical support for real-time margin diagnosis in clinical applications for breast cancer.
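
The claim above that three cascaded 3×3 layers keep the receptive field of one 7×7 layer while using fewer weights can be checked with simple arithmetic. The sketch below assumes stride 1, no dilation, no biases, and the same number of channels in and out of every layer; the channel count of 64 is an illustrative assumption, not a figure from the abstract.

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1, undilated convolutions applied in sequence."""
    field = 1
    for k in kernel_sizes:
        field += k - 1  # each layer widens the field by (k - 1) pixels
    return field


def conv_weights(kernel, c_in, c_out):
    """Number of weights in one k x k convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


channels = 64  # hypothetical width, identical for every layer
rf_single = stacked_receptive_field([7])
rf_stack = stacked_receptive_field([3, 3, 3])
weights_single = conv_weights(7, channels, channels)
weights_stack = 3 * conv_weights(3, channels, channels)
```

Under these assumptions both designs see a 7×7 window, and the stack needs 27/49 of the weights of the single layer, with two extra nonlinearities in between.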

13.
    
To improve the accuracy of manipulator operation, a tactile sensor must be installed on the manipulator to obtain tactile information and classify a target accurately. However, as tactile sensing data become more uncertain and complex and tactile sensors continue to develop, typical machine-learning algorithms often cannot solve the problem of target classification from purely tactile data. Here, we propose a new model, named ResNet10-v1, that combines a convolutional neural network with a residual network. We optimized the convolutional kernel, hyperparameters, and loss function of the model, and further improved target classification accuracy with the K-means clustering method. We verified the feasibility and effectiveness of the proposed method through a large number of experiments. We expect to further improve the generalization ability of this method and to provide an important reference for research on tactile perception classification.

14.
Hardware acceleration of image recognition through a visual cortex model   Total citations: 1, self-citations: 0, citations by others: 1
Recent findings in neuroscience have led to the development of several new models describing the processes in the neocortex. These models excel at cognitive applications such as image analysis and movement control. This paper presents a hardware architecture to speed up image content recognition through a recently proposed model of the visual cortex. The system is based on a set of parallel computation nodes implemented in an FPGA. The design was optimized for hardware by reducing the data storage requirements and removing the need for multiplications and divisions. The reconfigurable logic hardware implementation running at 121 MHz provided a speedup of 148 times over a 2 GHz AMD Opteron processor. The results indicate the feasibility of specialized hardware to accelerate larger, biological-scale implementations of the model.

15.
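In ground-based visual recognition of airborne unmanned aerial vehicles (UAVs), the flight speed and angle of the UAV vary nonlinearly, so the captured candidate images suffer from blurred and degraded features. Conventional pattern recognition methods cannot extract the principal features of UAV images, which greatly reduces the recognition probability. A UAV recognition algorithm that introduces feature subdivision with spherical harmonic basis images is proposed: a spherical harmonic basis image recognition model is built, and the approximation rate of the UAV image by spherical harmonic basis images is used to recognize the differential features of blurred images one by one. Experimental results show that the differential-feature recognition model for blurred UAV images built with the improved algorithm has certain advantages and improves UAV recognition accuracy.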

16.
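In fish fry farming, larger fry in the same pond attack smaller ones, which are then injured or even die, causing economic loss. Pond allocation and sale price of fry depend mainly on body length, so fry of different sizes need to be separated. Fry sorting relies mainly on mesh sieves of different sizes, which is time-consuming and laborious and easily injures the fry. To address the low efficiency and lack of scientific guidance of conventional manual separation, this paper studies a visible-spectrum-based method for estimating fry body length, which computes the length from fry images and classifies the fry accordingly. To obtain fry body length accurately and non-destructively, a body length estimation method based on a transfer-learning ResNet50 model is proposed. Images of fry of different lengths are first captured from the same height, and the actual length of each fry is measured by hand as the dataset label. Four transfer-learning models (AlexNet, VGG16, GoogLeNet, and ResNet50) are used to estimate body length and are compared on validation accuracy, test accuracy, and running time. AlexNet reaches a validation accuracy of 90.04% and a test accuracy of 89.82% with a running time of 52 min 3 s; VGG16 reaches 91.01% and 91.17% in 131 min 37 s; GoogLeNet reaches 88.02% and 88.39% in 45 min 2 s; ResNet50 reaches 91.92% and 91.09% in 99 min 17 s. ResNet50 is selected. This model has a 50-layer Residual Network architecture; transfer learning passes the convolutional-layer parameters trained on ImageNet to the model used for training, and the softmax layer is adjusted to fit the present problem. Experiments on a fry dataset of 6,677 samples covering 10 different lengths show that the method can be used effectively for fry classification. The number of transferred layers, number of iterations, learning rate, and mini-batch size of the ResNet50 model are fine-tuned to optimize the model. The results show that the method performs best with 30 transferred layers, 6 iterations, a learning rate of 0.001, and a mini-batch size of 10, giving a validation accuracy of 94.31% and a test accuracy of 93.93%. Compared with conventional image processing methods, the algorithm improves body length estimation accuracy by about 2%. In future production settings, the method can be embedded in fry length-sorting equipment, putting the research into real production, reducing fry injury, and laying a foundation for future unmanned fish farms.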

17.
Light detection and ranging (LiDAR), an active remote sensing technology, provides high-precision geographical location information. In this study, we further explored its capability for image classification over a suburban area. First, full waveforms of small-footprint airborne LiDAR were decomposed into discrete point clouds. During the decomposition, six parameters describing the physical interaction between the laser pulse and the targets were calculated: amplitude, pulse width, central position, range, backscatter cross-section, and backscatter coefficient. Second, the point clouds were interpolated into rasters, producing six high-spatial-resolution images (0.5 m). Three classification models, namely decision tree (DT), maximum likelihood (ML), and support vector machine (SVM), were established based on these images. The objects of interest were classified into buildings, trees, bare soil, and crop land. Results showed that all three models yielded high overall accuracy and kappa coefficients. SVM performed best, with the highest overall accuracy (87.85%) and kappa coefficient (83.29%). We therefore conclude that classification models can achieve satisfactory accuracy on LiDAR images, as they do on common remotely sensed images. In addition, our study showed that physical information derived from waveform LiDAR has good potential for classification.
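
Overall accuracy and the kappa coefficient reported above are both derived from a confusion matrix. The sketch below uses the standard definitions (observed agreement, and Cohen's kappa correcting it for chance agreement); the matrix values are invented for illustration and are unrelated to the study's data.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: sum over classes of (row total * column total) / N^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix.
matrix = [[40, 10],
          [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(matrix)
```

Kappa is lower than overall accuracy whenever some agreement is expected by chance, which is why the two figures are usually reported together.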

18.
How can we complete massive images that have missing regions? In this paper, we present a new image completion algorithm. Our goal is to fill in the missing region of an incomplete image more creatively and seamlessly. In other words, the filled region should be semantically valid, and the completed image should be seamless and consistent.

19.
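Recognizing standard-protocol underwater signals from non-cooperative third parties is of considerable research significance in underwater acoustic communication signal recognition. Feature extraction for shallow-water JANUS signals is easily affected by complex underwater acoustic conditions such as impulsive noise and multipath effects, which leads to low recognition rates; to address this, a transfer-learning recognition method combining a fractional lower-order time-frequency spectrum with ResNet18 (Residual Network 18) is proposed. First, the fixed JANUS preamble is chosen as the recognition target, and a fractional lower-order Fourier synchrosqueezing transform (FLOFSST) is designed, in which the fractional lower-order operation suppresses impulsive noise and the time-frequency reassignment property improves time-frequency concentration. Second, a ResNet18 model pretrained on ImageNet is fine-tuned and transferred to a time-frequency image set of JANUS signals and common underwater acoustic signals. Simulations show that the proposed algorithm achieves a JANUS recognition rate of 96.15% at a signal-to-noise ratio of -10 dB, effectively suppresses impulsive noise and reduces the influence of multipath effects, and outperforms conventional algorithms. In sea trials the JANUS recognition rate reached 90.00%, demonstrating the high recognition accuracy of the algorithm and the good generalization of the network.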
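
The abstract above does not define its fractional lower-order operation. The sketch below shows the form commonly used for real-valued signals in the impulsive-noise literature, which keeps the sign of each sample and raises its magnitude to a power p between 0 and 1; whether FLOFSST uses exactly this form is an assumption, as are the function names and the value p = 0.5.

```python
import math


def fractional_lower_order(x, p=0.5):
    """Compress the magnitude of a real sample: sign(x) * |x|**p, with 0 < p < 1."""
    return math.copysign(abs(x) ** p, x)


def flo_signal(samples, p=0.5):
    """Apply the fractional lower-order operation to every sample."""
    return [fractional_lower_order(x, p) for x in samples]


# Hypothetical signal with one impulsive spike of amplitude 100.
signal = [1.0, -1.0, 100.0, 0.5]
compressed = flo_signal(signal)
```

The spike that was 100 times larger than the unit-amplitude samples is only 10 times larger after the operation, which is the sense in which large impulses are suppressed before the time-frequency transform.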

20.
杨壮  颜永红  黄志华 《应用声学》2024,43(3):498-504
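Accent recognition is the task of identifying different regional accents within the same language. To improve accent recognition accuracy, we adopted several methods and obtained clear gains. First, to address the problem that key features are not weighted prominently among the acoustic features, effective attention mechanisms were introduced, and several attention mechanisms were compared and analyzed. Letting the model adaptively learn different weights along the channel and spatial dimensions improved accent recognition performance. Experiments on the Common Voice English accent dataset show that introducing the CBAM attention module is effective: recognition accuracy improved by a relative 12.7%, and precision and F1 score improved by a relative 17.9%. Next, we proposed a tree-structured classification method to mitigate the long-tail effect in the dataset, which improved recognition accuracy by up to a relative 5.2%. Inspired by domain adversarial training (DAT), we also tried using adversarial learning to remove redundant information from accent features, which improved accuracy by up to a relative 3.4% and recall by up to a relative 16.9%.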
