Details

Gaze Estimation via Strip Pooling and Multi-Criss-Cross Attention Networks (indexed in SCI-EXPANDED)

Document type: Journal article

English title: Gaze Estimation via Strip Pooling and Multi-Criss-Cross Attention Networks

Authors: Yan, Chao[1,2]; Pan, Weiguo[1,2]; Xu, Cheng[1,2]; Dai, Songyin[1,2]; Li, Xuewei[1,2]

First author: Yan, Chao

Corresponding author: Dai, SY[1]; Dai, SY[2]

Affiliations: [1] Beijing Union Univ, Beijing Key Lab Informat Serv Engn, Beijing 100101, Peoples R China; [2] Beijing Union Univ, Inst Brain & Cognit Sci, Coll Robot, Beijing 100101, Peoples R China

First affiliation: Beijing Key Laboratory of Information Service Engineering, Beijing Union University

Corresponding affiliations: [1] (corresponding author), Beijing Union Univ, Beijing Key Lab Informat Serv Engn, Beijing 100101, Peoples R China; [2] (corresponding author), Beijing Union Univ, Inst Brain & Cognit Sci, Coll Robot, Beijing 100101, Peoples R China. | [1141739] College of Robotics, Beijing Union University; [11417] Beijing Union University; [11417103] Beijing Key Laboratory of Information Service Engineering, Beijing Union University

Year: 2023

Volume: 13

Issue: 10

Journal: APPLIED SCIENCES-BASEL

Indexed in: Scopus (accession no. 2-s2.0-85160672994); WOS: SCI-EXPANDED (accession no. WOS:000995539700001)

Funding: This research was funded by the Beijing Natural Science Foundation (4232026), the National Natural Science Foundation of China (Grant No. 61906017, 62102033, 62171042, 61871028, 62272049, 62006020), the Beijing Municipal Commission of Education Project (No. KM201911417001, KM202111417001), the Project of Construction and Support for high-level Innovative Teams of Beijing Municipal Institutions (No. BPHR20220121), the Beijing Advanced Talents Great Wall Scholar Training Program (CIT&TCD20190313), the R & D Program of the Beijing Municipal Education Commission (KZ202211417048), and the Collaborative Innovation Center of Chaoyang (Grant No. CYXC2203). Scientific research projects of Beijing Union University (ZK10202202, BPHR2020DZ02, ZK40202101, ZK120202104).

Language: English

Keywords: gaze estimation; deep learning; strip pooling; multi-criss-cross attention

Abstract: Deep learning techniques for gaze estimation usually determine gaze direction directly from images of the face. These algorithms achieve good performance because face images contain more feature information than eye images. However, these image classes contain a substantial amount of redundant information that may interfere with gaze prediction and may represent a bottleneck for performance improvement. To address these issues, we model long-distance dependencies between the eyes via Strip Pooling and Multi-Criss-Cross Attention Networks (SPMCCA-Net), which consist of two newly designed network modules. One module is represented by a feature enhancement bottleneck block based on fringe pooling. By incorporating strip pooling, this residual module not only enlarges its receptive fields to capture long-distance dependence between the eyes but also increases weights on important features and reduces the interference of redundant information unrelated to gaze. The other module is a multi-criss-cross attention network. This module exploits a cross-attention mechanism to further enhance long-range dependence between the eyes by incorporating the distribution of eye-gaze features and providing more gaze cues for improving estimation accuracy. Network training relies on the multi-loss function, combined with smooth L1 loss and cross entropy loss. This approach speeds up training convergence while increasing gaze estimation precision. Extensive experiments demonstrate that SPMCCA-Net outperforms several state-of-the-art methods, achieving mean angular error values of 10.13 degrees on the Gaze360 dataset and 6.61 degrees on the RT-gene dataset.
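
The strip pooling idea named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SPMCCA-Net module: the function name `strip_pool_gate`, the plain sigmoid gate, and the omission of the learned convolutions that a real strip pooling block would apply are all assumptions made for illustration. The sketch only shows why averaging along full rows and full columns lets one position be influenced by a distant position in the same row or column.

```python
import math


def strip_pool_gate(feature_map):
    """Reweight a 2D feature map using horizontal and vertical strip averages.

    Hypothetical, simplified illustration: a trained block would pass the
    pooled strips through learned convolutions before building the gate.
    """
    height = len(feature_map)
    width = len(feature_map[0])

    # Horizontal strips: average every row over its full width (H values).
    row_means = [sum(row) / width for row in feature_map]
    # Vertical strips: average every column over its full height (W values).
    col_means = [
        sum(feature_map[i][j] for i in range(height)) / height
        for j in range(width)
    ]

    gated = []
    for i in range(height):
        out_row = []
        for j in range(width):
            # The gate at (i, j) depends on the whole row i and whole column j,
            # so it carries long-range context along both axes.
            gate = 1.0 / (1.0 + math.exp(-(row_means[i] + col_means[j])))
            out_row.append(feature_map[i][j] * gate)
        gated.append(out_row)
    return gated


# Toy 3x5 map: two strong activations far apart in the same row.
features = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [4.0, 0.0, 0.0, 0.0, 4.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
gated = strip_pool_gate(features)
```

In the toy input, the two nonzero entries stand in for the left and right eye regions. Because they share a row, the row average used in each one's gate includes the other's activation, which is the kind of long-distance dependence the abstract refers to.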
