Details
Soften the Mask: Adaptive Temporal Soft Mask for Efficient Dynamic Facial Expression Recognition (EI indexed)
Document type: Journal article
English title: Soften the Mask: Adaptive Temporal Soft Mask for Efficient Dynamic Facial Expression Recognition
Authors: Li, Mengzhu[1]; Zha, Quanxing[2]; Wu, Hongjun[3]
First author: Li, Mengzhu
Affiliations: [1] Beijing Key Laboratory of Information Service Engineering, Beijing Union University, China; [2] Huaqiao University, China; [3] Beijing University of Posts and Telecommunications, China
First affiliation: Beijing Key Laboratory of Information Service Engineering, Beijing Union University
Year: 2025
Journal title: arXiv
Indexed in: EI (accession number: 20250116726)
Language: English
Keywords: Benchmarking
Abstract: Dynamic Facial Expression Recognition (DFER) facilitates the understanding of psychological intentions through nonverbal communication. Existing methods struggle to manage irrelevant information, such as background noise and redundant semantics, which impacts both efficiency and effectiveness. In this work, we propose a novel supervised temporal soft masked autoencoder network for DFER, namely AdaTosk, which integrates a parallel supervised classification branch with the self-supervised reconstruction branch. The self-supervised reconstruction branch applies a random binary hard mask to generate diverse training samples, encouraging meaningful feature representations in visible tokens. Meanwhile, the classification branch employs an adaptive temporal soft mask to flexibly mask visible tokens based on their temporal significance. Its two key components, the class-agnostic and class-semantic soft masks, respectively serve to enhance critical expression moments and reduce semantic redundancy over time. Extensive experiments conducted on widely used benchmarks demonstrate that our AdaTosk remarkably reduces computational costs compared with current state-of-the-art methods while still maintaining competitive performance. © 2025, CC0.
