Unified multi‐stage fusion network for affective video content analysis

Bibliographic Details
Published in: Electronics Letters
Main Authors: Yun Yi, Hanli Wang, Pengjie Tang
Format: Article
Language: English
Published: Wiley 2022-10-01
Online Access: https://doi.org/10.1049/ell2.12605
Description
Summary: Affective video content analysis is an active topic in affective computing. Affective video content is generally described by feature vectors from multiple modalities, so fusing this information effectively is important. In this work, a novel framework is designed to fuse information from multiple stages in a unified manner. In particular, a unified fusion layer is devised to combine output tensors from multiple stages of the proposed neural network. Building on the unified fusion layer, a bidirectional residual recurrent fusion block is devised to model the information of each modality. The proposed method, the unified multi‐stage fusion network (UMFN), achieves state‐of‐the‐art performance on two challenging datasets: an accuracy of 55.8% on the VideoEmotion dataset, and MSE values of 0.464 and 0.176 on the two domains of EIMT16, respectively. The code of UMFN is available at: https://github.com/yunyi9/UMFN.
ISSN: 0013-5194, 1350-911X