Modeling Two-Person Segmentation and Locomotion for Stereoscopic Action Identification: A Sustainable Video Surveillance System

Due to the constantly increasing demand for automatic tracking and recognition systems, there is a need for more proficient, intelligent and sustainable human activity tracking. The main purpose of this study is to develop an accurate and sustainable human action tracking system that is capable of error-free identification of human movements irrespective of the environment in which those actions are performed.

Bibliographic Details
Main Authors:	Nida Khalid, Munkhjargal Gochoo, Ahmad Jalal, Kibum Kim
Format:	Article
Language:	English
Published:	MDPI AG 2021-01-01
Series:	Sustainability
Subjects:	geodesic distance, human action recognition, human locomotion, neuro-fuzzy classifier, particle swarm optimization, RGB-D sensors
Online Access:	https://www.mdpi.com/2071-1050/13/2/970

id	doaj-f29356ec05d9418e975c94aa492b528b
record_format	Article
spelling	doaj-f29356ec05d9418e975c94aa492b528b | 2021-01-20T00:02:11Z | eng | MDPI AG | Sustainability | 2071-1050 | 2021-01-01 | 13 | 970 | 970 | 10.3390/su13020970 | Modeling Two-Person Segmentation and Locomotion for Stereoscopic Action Identification: A Sustainable Video Surveillance System | Nida Khalid (Department of Computer Science, Air University, Islamabad 44000, Pakistan); Munkhjargal Gochoo (Department of Computer Science and Software Engineering, United Arab Emirates University, Al Ain 15551, UAE); Ahmad Jalal (Department of Computer Science, Air University, Islamabad 44000, Pakistan); Kibum Kim (Department of Human-Computer Interaction, Hanyang University, Ansan 15588, Korea)
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Nida Khalid, Munkhjargal Gochoo, Ahmad Jalal, Kibum Kim
author_facet	Nida Khalid, Munkhjargal Gochoo, Ahmad Jalal, Kibum Kim
author_sort	Nida Khalid
title	Modeling Two-Person Segmentation and Locomotion for Stereoscopic Action Identification: A Sustainable Video Surveillance System
publisher	MDPI AG
series	Sustainability
issn	2071-1050
publishDate	2021-01-01
description	Due to the constantly increasing demand for automatic tracking and recognition systems, there is a need for more proficient, intelligent and sustainable human activity tracking. The main purpose of this study is to develop an accurate and sustainable human action tracking system that is capable of error-free identification of human movements irrespective of the environment in which those actions are performed. Therefore, in this paper we propose a stereoscopic Human Action Recognition (HAR) system based on the fusion of RGB (red, green, blue) and depth sensors. These sensors give an extra depth of information which enables the three-dimensional (3D) tracking of each and every movement performed by humans. Human actions are tracked according to four features, namely, (1) geodesic distance; (2) 3D Cartesian-plane features; (3) joints Motion Capture (MOCAP) features and (4) way-points trajectory generation. In order to represent these features in an optimized form, Particle Swarm Optimization (PSO) is applied. After optimization, a neuro-fuzzy classifier is used for classification and recognition. Extensive experimentation is performed on three challenging datasets: A Nanyang Technological University (NTU) RGB+D dataset; a UoL (University of Lincoln) 3D social activity dataset and a Collective Activity Dataset (CAD). Evaluation experiments on the proposed system proved that a fusion of vision sensors along with our unique features is an efficient approach towards developing a robust HAR system, having achieved a mean accuracy of 93.5% with the NTU RGB+D dataset, 92.2% with the UoL dataset and 89.6% with the Collective Activity dataset. The developed system can play a significant role in many computer vision-based applications, such as intelligent homes, offices and hospitals, and surveillance systems.
topic	geodesic distance, human action recognition, human locomotion, neuro-fuzzy classifier, particle swarm optimization, RGB-D sensors
url	https://www.mdpi.com/2071-1050/13/2/970
