Multi-granularity Facial Emotional Representation with Unlabeled Data and Textual Supervision

Facial expressions (FEs) and action units (AUs) are facial emotional representations at different levels of granularity. Their recognition has usually been treated as two separate tasks. Some methods use knowledge of one to aid recognition of the other, but unified models that recognize both FEs and AUs simultaneously remain rare. In this paper, we construct a unified model with strong generalization capability to jointly perform facial expres