Cross-View Action Recognition Based on Hierarchical View-Shared Dictionary Learning

Recognizing human actions across different views is challenging, since observations of the same action often vary greatly with viewpoint. Most existing methods explore the cross-view feature transfer relationship at the video level only, ignoring the sequential composition of action segments within each video. In this paper, we propose a novel hierarchical transfer framework built on an action temporal-structure model that captures the sequential relationships between action segments at multiple timescales, so that the view invariance of these sequential relationships can be exploited for segment-level transfer. Additionally, we observe that the original feature distributions differ greatly across views, leading to view-dependent representations that are irrelevant to the intrinsic structure of actions. At each level of the proposed framework, we therefore transform the original feature spaces of the different views into a view-shared low-dimensional feature space and jointly learn a dictionary in this space for all views. This view-shared dictionary captures the common structure of action data across views and represents action segments in a way that is robust to view changes. Moreover, the proposed method can easily be kernelized and operates in both unsupervised and supervised cross-view scenarios. Extensive experiments on the IXMAS and WVU datasets demonstrate the superiority of the proposed method over state-of-the-art methods.
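
As a rough illustration of the view-shared dictionary idea in the abstract, the sketch below projects features from two views into a common low-dimensional space and learns a single dictionary over the pooled samples. This is a minimal stand-in under invented data and dimensions, not the authors' algorithm: the paper learns the per-view transforms and the dictionary jointly (and hierarchically over temporal segments), whereas here PCA and scikit-learn's DictionaryLearning merely approximate those two steps.

```python
# Minimal sketch of a "view-shared dictionary": map each view's features
# into a common low-dimensional space, then learn ONE dictionary over the
# pooled projected samples. PCA stands in for the learned per-view
# transforms; all sizes below are arbitrary illustrative choices.
import numpy as np
from sklearn.decomposition import PCA, DictionaryLearning

rng = np.random.default_rng(0)

# Toy stand-ins for segment-level descriptors from two camera views
# (n_segments x feature_dim); real descriptors would come from the videos.
X_view1 = rng.standard_normal((200, 96))
X_view2 = rng.standard_normal((200, 128))

shared_dim = 32   # dimension of the view-shared space (assumed)
n_atoms = 64      # number of dictionary atoms (assumed)

# Per-view transforms into the common low-dimensional space.
Z = np.vstack([
    PCA(n_components=shared_dim).fit_transform(X_view1),
    PCA(n_components=shared_dim).fit_transform(X_view2),
])

# One dictionary learned jointly over samples from both views; its sparse
# codes act as view-robust representations of the action segments.
dico = DictionaryLearning(n_components=n_atoms, alpha=1.0, max_iter=50,
                          transform_algorithm="lasso_lars", random_state=0)
codes = dico.fit_transform(Z)                # (400, 64) sparse codes
print(codes.shape, dico.components_.shape)   # dictionary: (64, 32)
```

In the paper's framework, such codes would be computed at every level of the temporal hierarchy and fed into the segment-level transfer model, rather than used directly for classification.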

Bibliographic Details
Main Authors: Chengkun Zhang, Huicheng Zheng, Jianhuang Lai
Format: Article
Language: English
Published: IEEE, 2018-01-01
Series: IEEE Access, Vol. 6, pp. 16855-16868
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2018.2815611
Subjects: Cross-view; action recognition; hierarchical transfer learning; feature space transformation; dictionary learning
Online Access: https://ieeexplore.ieee.org/document/8315006/

Author Details
Chengkun Zhang: School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
Huicheng Zheng (ORCID: 0000-0002-6729-4176): School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
Jianhuang Lai (ORCID: 0000-0003-3883-2024): School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China