Human motion recognition based on improved 3D convolution network


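Cite this article:	GAO Hailing, WANG Xiaodong, ZHANG Lianjun, ZHAO Shenhao, JIN Jianguo. Human motion recognition based on improved 3D convolution network[J]. Journal of Ningbo University (Natural Science and Engineering Edition), 2023, 0(3): 16-21.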

Authors:	GAO Hailing, WANG Xiaodong, ZHANG Lianjun, ZHAO Shenhao, JIN Jianguo

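Institution:	1. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, Zhejiang, China; 2. Zhejiang DTCT Co., Ltd., Ningbo 315048, Zhejiang, China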

Funding:	Natural Science Foundation of Zhejiang Province (LY20F010005)

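Abstract:	Most existing 3D convolution methods for video human motion recognition cannot distinguish important from unimportant features along each dimension of the information. To address this, a method is proposed that builds a spatio-temporal feature processing network from a gated recurrent unit (GRU) and a spatial attention enhancement module. The network is constructed with multi-level feature fusion and multi-group channel attention feature selection, improving the base network model ResNet3D for video human motion recognition. The improved model reaches accuracies of 96.42% and 71.08% on the two public datasets UCF101 and HMDB51, respectively, which is higher than network models such as C3D and Two-stream.
Keywords:	deep learning; human motion recognition; 3D convolution; attention mechanism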
Human motion recognition based on improved 3D convolution network

GAO Hailing, WANG Xiaodong, ZHANG Lianjun, ZHAO Shenhao, JIN Jianguo. Human motion recognition based on improved 3D convolution network[J]. Journal of Ningbo University (Natural Science and Engineering Edition), 2023, 0(3): 16-21.

Authors:	GAO Hailing, WANG Xiaodong, ZHANG Lianjun, ZHAO Shenhao, JIN Jianguo

Institution:	1. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China; 2. Zhejiang DTCT Co., Ltd., Ningbo 315048, China

Abstract:	Video human motion recognition has broad application potential, but modeling quality is strongly affected by factors such as movement type and environmental differences. Most 3D convolution methods for video human motion recognition cannot distinguish important from unimportant features along each dimension of the information. To address this problem, a gated recurrent unit (GRU) and a spatial attention enhancement module are used to build a spatio-temporal feature processing network, and the network is constructed with multi-level feature fusion and multi-group channel attention feature selection, improving the base network model ResNet3D for video human motion recognition. The model achieves recognition accuracies of 96.42% and 71.08% on the public datasets UCF101 and HMDB51, respectively. These accuracies are higher than those of C3D, Two-stream and other generic network models.

Keywords:	deep learning; human motion recognition; 3D convolution; attention mechanism
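
The abstract names channel attention and spatial attention applied to 3D-convolution features but gives no layer definitions. The sketch below is only a generic illustration of the two ideas, not the authors' modules: it assumes a single feature map of shape (C, T, H, W), a squeeze-and-excitation-style channel gate with hypothetical weights `w1` and `w2`, and a simplified, parameter-free spatial gate. All names, shapes and the reduction ratio `r` are assumptions made for the example.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """SE-style channel gate for a feature map of shape (C, T, H, W).

    w1 (C//r, C) and w2 (C, C//r) are hypothetical bottleneck weights.
    """
    squeezed = feat.mean(axis=(1, 2, 3))      # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)   # ReLU bottleneck -> (C//r,)
    gate = sigmoid(w2 @ hidden)               # per-channel weight -> (C,)
    return feat * gate[:, None, None, None], gate


def spatial_attention(feat):
    """Simplified spatial gate: pool over channels, then squash.

    Assumption: the mean and max maps are summed directly instead of
    being passed through a learned convolution.
    """
    avg_map = feat.mean(axis=0)               # (T, H, W)
    max_map = feat.max(axis=0)                # (T, H, W)
    gate = sigmoid(avg_map + max_map)         # per-position weight
    return feat * gate[None], gate


# Toy input: 8 channels, 4 frames, 6x6 spatial grid, reduction ratio 2.
rng = np.random.default_rng(0)
C, T, H, W, r = 8, 4, 6, 6, 2
feat = rng.standard_normal((C, T, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

out, ch_gate = channel_attention(feat, w1, w2)
out, sp_gate = spatial_attention(out)
print(out.shape, ch_gate.shape, sp_gate.shape)
```

Both gates keep the feature map's shape and only rescale it, which is why modules of this kind can be dropped between existing 3D-convolution blocks without changing the rest of the network.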
