Cross-View Action Recognition Based on Hierarchical View-Shared Dictionary Learning
Recognizing human actions across different views is challenging, since observations of the same action often vary greatly with viewpoint. To solve this problem, most existing methods explore the cross-view feature transfer relationship at the video level only, ignoring the sequential composition of action segments therein. In this paper, we propose a novel hierarchical transfer framework based on an action temporal-structure model that encodes the sequential relationships between action segments at multiple timescales, and can thus exploit the view invariance of those relationships during segment-level transfer. Additionally, we observe that the original feature distributions under different views differ greatly, leading to view-dependent representations irrelevant to the intrinsic structure of actions. Thus, at each level of the proposed framework, we transform the original feature spaces of the different views into a view-shared low-dimensional feature space and jointly learn a dictionary in this space for all views. This view-shared dictionary captures the common structure of action data across views and represents action segments in a way that is robust to view changes. Moreover, the proposed method can easily be kernelized and operates in both unsupervised and supervised cross-view scenarios. Extensive experiments on the IXMAS and WVU datasets demonstrate the superiority of the proposed method over state-of-the-art methods.

Main Authors: | Chengkun Zhang, Huicheng Zheng, Jianhuang Lai |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2018-01-01 |
Series: | IEEE Access |
Subjects: | Cross-view action recognition; hierarchical transfer learning; feature space transformation; dictionary learning |
Online Access: | https://ieeexplore.ieee.org/document/8315006/ |
id: | doaj-b68c5579ea1441ec9f4c5723958f7033 |
---|---|
DOI: | 10.1109/ACCESS.2018.2815611 |
ISSN: | 2169-3536 |
Authors: | Chengkun Zhang; Huicheng Zheng (ORCID: 0000-0002-6729-4176); Jianhuang Lai (ORCID: 0000-0003-3883-2024) |
Affiliation: | School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China |
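The view-shared dictionary idea described in the abstract can be illustrated with a toy sketch: features from two camera views are first mapped into a common low-dimensional space, and a single dictionary is then learned on the pooled data by alternating sparse coding and dictionary updates. This is not the authors' algorithm — the view maps `A1`/`A2`, all dimensions, and the ISTA-style sparse coder below are illustrative assumptions only (the paper learns the view transforms jointly, which is not reproduced here).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic segment features: the same sparse latent action codes observed
# under two different views (linear view maps A1, A2 of different sizes).
n_atoms, dim_shared, n_samples = 8, 5, 40
latent = rng.standard_normal((n_atoms, n_samples)) * (rng.random((n_atoms, n_samples)) < 0.3)
D_true = rng.standard_normal((dim_shared, n_atoms))
X_shared = D_true @ latent                    # "ideal" shared-space features
A1 = rng.standard_normal((12, dim_shared))    # view-1 observation map (assumed)
A2 = rng.standard_normal((15, dim_shared))    # view-2 observation map (assumed)
X1 = A1 @ X_shared[:, :20]                    # samples seen from view 1
X2 = A2 @ X_shared[:, 20:]                    # samples seen from view 2

# Step 1: map each view back into a common low-dimensional space and pool.
# Here we simply use pseudo-inverses of the (assumed known) view maps.
Z = np.hstack([np.linalg.pinv(A1) @ X1, np.linalg.pinv(A2) @ X2])

# Step 2: learn ONE dictionary D for the pooled data by alternating
# sparse coding (ISTA: gradient step + soft threshold) and a least-squares
# dictionary update with atom renormalisation.
D = rng.standard_normal((dim_shared, n_atoms))
D /= np.linalg.norm(D, axis=0)
for _ in range(50):
    C = np.zeros((n_atoms, Z.shape[1]))
    step = 1.0 / np.linalg.norm(D.T @ D, 2)   # 1/Lipschitz constant
    for _ in range(30):
        C = C - step * (D.T @ (D @ C - Z))
        C = np.sign(C) * np.maximum(np.abs(C) - 0.1 * step, 0.0)
    D = Z @ np.linalg.pinv(C)                 # least-squares dictionary update
    norms = np.linalg.norm(D, axis=0) + 1e-12
    D /= norms                                # unit-norm atoms ...
    C *= norms[:, None]                       # ... with codes rescaled to match

recon_err = np.linalg.norm(Z - D @ C) / np.linalg.norm(Z)
print(f"relative reconstruction error: {recon_err:.3f}")
```

Because both views are coded against the same atoms, their sparse codes live in one representation space, which is the property the abstract attributes to the view-shared dictionary.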