A Novel Two-Stream Transformer-Based Framework for Multi-Modality Human Action Recognition

Due to the great success of the Vision Transformer (ViT) in image classification tasks, many pure Transformer architectures for human action recognition have been proposed. However, very few works have attempted to use Transformers for bimodal action recognition, i.e., both skeleton and RGB modali...


Bibliographic Details
Published in: Applied Sciences
Main Authors: Jing Shi, Yuanyuan Zhang, Weihang Wang, Bin Xing, Dasha Hu, Liangyin Chen
Format: Article
Language: English
Published: MDPI AG 2023-02-01
Online Access: https://www.mdpi.com/2076-3417/13/4/2058