Improved Spatio-Temporal Convolutional Neural Networks for Traffic Police Gestures Recognition (indexed in CPCI-S and EI)
Document type: Conference paper
English title: Improved Spatio-Temporal Convolutional Neural Networks for Traffic Police Gestures Recognition
Authors: Wu, Zhixuan[1]; Ma, Nan[1]; Cheung, Yiu-ming[2]; Li, Jiahong[3]; He, Qin[4]; Yao, Yongqiang[1]; Zhang, Guoping[1]
First author: Wu, Zhixuan
Corresponding author: Ma, N[1]
Affiliations: [1] Beijing Union Univ, Coll Robot, Beijing Key Lab Informat Serv Engn, Beijing 100101, Peoples R China; [2] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong 999077, Peoples R China; [3] Beijing Union Univ, Coll Robot, Beijing 100101, Peoples R China; [4] Beijing Union Univ, Beijing 100101, Peoples R China
First institution: Beijing Key Laboratory of Information Service Engineering, Beijing Union University | College of Robotics, Beijing Union University
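Corresponding institution: [1] (corresponding author), Beijing Union Univ, Coll Robot, Beijing Key Lab Informat Serv Engn, Beijing 100101, Peoples R China. | [1141739] College of Robotics, Beijing Union University; [11417] Beijing Union University; [11417103] Beijing Key Laboratory of Information Service Engineering, Beijing Union University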
Conference proceedings: 16th International Conference on Computational Intelligence and Security (CIS)
Conference date: NOV 27-30, 2020
Conference location: Nanning, PEOPLES R CHINA
Language: English
Keywords (English): Artificial intelligence; Human action recognition; Spatio-Temporal feature; Traffic police gesture; Improved LSTM network
Abstract: In the era of artificial intelligence, human action recognition is a hot spot in vision research and makes interaction between humans and machines possible. Many intelligent applications benefit from human action recognition. Traditional traffic police gesture recognition methods often ignore spatial and temporal information, so their timeliness in human-computer interaction is limited. We propose a Spatio-Temporal Convolutional Neural Network (ST-CNN) method that can detect and identify traffic police gestures by using the correlation between spatial and temporal information. Specifically, we use a convolutional neural network for feature extraction, taking into account both the spatial and temporal characteristics of human actions. After the spatial and temporal features are extracted, an improved LSTM network is used to fuse, classify and recognize the various features, so as to achieve human action recognition. The method makes full use of the spatial and temporal information in the video and selects effective features to reduce the computational load of the network. Extensive experiments on the Chinese traffic police gesture dataset show the superiority of our method.