A Novel Two-Stream Transformer-Based Framework for Multi-Modality Human Action Recognition
Due to the great success of Vision Transformer (ViT) in image classification tasks, many pure Transformer architectures for human action recognition have been proposed. However, very few works have attempted to use Transformer to conduct bimodal action recognition, i.e., both skeleton and RGB modali...
| Published in: | Applied Sciences |
|---|---|
| Main Authors: | , , , , , |
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2023-02-01 |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/13/4/2058 |
