
Details

Spatial-Temporal Hypergraph Based on Dual-Stage Attention Network for Multi-View Data Lightweight Action Recognition (EI indexed)

Document type: Journal article

English title: Spatial-Temporal Hypergraph Based on Dual-Stage Attention Network for Multi-View Data Lightweight Action Recognition

Authors: Wu, Zhixuan[1]; Ma, Nan[2,3]; Wang, Cheng[1]; Xu, Cheng[1]; Xu, Genbao[2,3]; Li, Mingxing[1]

First author: Wu, Zhixuan

Affiliations: [1] Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing, 100101, China; [2] Faculty of Information and Technology, Beijing University of Technology, Beijing, 100124, China; [3] Engineering Research Center of Intelligence Perception and Autonomous Control, Ministry of Education, Beijing University of Technology, Beijing, 100124, China

First affiliation: Beijing Key Laboratory of Information Service Engineering, Beijing Union University

Year: 2023

Journal: SSRN

Indexed by: EI (Accession No. 20230214504)

Language: English

Keywords: Dynamics - Graph theory

Abstract: To address the problems of irrelevant frames and high model complexity in action recognition, this paper proposes a Spatial-Temporal Hypergraph based on Dual-Stage Attention Network (STHG-DAN) for multi-view data lightweight action recognition. It comprises two stages: a Temporal Attention Mechanism based on Trainable Threshold (TAM-TT) and Hypergraph Convolution based on Dynamic Spatial-Temporal Attention Mechanism (HG-DSTAM). In the first stage, TAM-TT uses a learnable threshold to extract key frames from multi-view videos. In the second stage, HG-DSTAM divides the human joints into three parts (trunk, hands, and legs) to build spatial hypergraphs, extracts higher-order features from the multi-view spatial hypergraphs of human joints, and feeds them into a dynamic spatial-temporal attention mechanism, which learns the intra-frame correlations of multi-view data between the joint features of body parts and thereby localizes the salient regions of an action. Multi-scale convolution operations and a depthwise separable network enable efficient action recognition with few trainable parameters. Experiments on the NTU-RGB+D dataset and an imitating-traffic-police-gesture dataset show that the model outperforms existing algorithms in performance and accuracy, effectively improving the cognitive ability of machines in human body language interaction. © 2023, The Authors. All rights reserved.
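The abstract attributes the model's small parameter budget partly to its depthwise separable convolutions. The sketch below is an illustrative parameter-count comparison only (the layer sizes are assumptions, not taken from the paper): it shows why replacing a standard k×k convolution with a depthwise convolution plus a 1×1 pointwise convolution sharply reduces trainable parameters.

```python
# Compare trainable-parameter counts (ignoring biases) for a standard
# 2D convolution versus a depthwise separable convolution
# (per-channel depthwise filtering followed by 1x1 pointwise mixing).
# Layer sizes below are illustrative, not from the paper.

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """A k x k convolution mapping c_in channels to c_out channels."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """A k x k depthwise conv (one k x k filter per input channel)
    followed by a 1x1 pointwise conv that mixes channels."""
    depthwise = k * k * c_in   # spatial filtering, per channel
    pointwise = c_in * c_out   # cross-channel mixing
    return depthwise + pointwise

if __name__ == "__main__":
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)        # 73728
    sep = depthwise_separable_params(c_in, c_out, k)  # 8768
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.1f}x")
```

For these example sizes the separable variant needs roughly 8x fewer parameters; the saving grows with kernel size and output-channel count, which is the usual motivation for using such layers in lightweight recognition networks.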

