Statistiques Supervisées pour la Reconnaissance d'Actions Humaines dans les Vidéos

This thesis addresses the problem of human action recognition in realistic video data, such as movies and online videos. Automatic and accurate recognition of human actions in video is a fascinating capability. The potential applications range from surveillance and robotics to medical diagnosis, con...


Bibliographic Details
Main Author:	Muneeb Ullah, Muhammad
Language:	English
Published:	Université Européenne de Bretagne 2012
Subjects:	[INFO:INFO_CV] Computer Science/Computer Vision and Pattern Recognition; [INFO:INFO_CV] Informatique/Vision par ordinateur et reconnaissance de formes; computer vision; action recognition
Online Access:	http://tel.archives-ouvertes.fr/tel-01063349 http://tel.archives-ouvertes.fr/docs/01/06/33/49/PDF/2012thesisUllah.pdf

id	ndltd-CCSD-oai-tel.archives-ouvertes.fr-tel-01063349
record_format	oai_dc
spelling	ndltd-CCSD-oai-tel.archives-ouvertes.fr-tel-01063349 2014-09-13T03:26:23Z http://tel.archives-ouvertes.fr/tel-01063349 http://tel.archives-ouvertes.fr/docs/01/06/33/49/PDF/2012thesisUllah.pdf Statistiques Supervisées pour la Reconnaissance d'Actions Humaines dans les Vidéos Muneeb Ullah, Muhammad [INFO:INFO_CV] Computer Science/Computer Vision and Pattern Recognition [INFO:INFO_CV] Informatique/Vision par ordinateur et reconnaissance de formes computer vision action recognition This thesis addresses the problem of human action recognition in realistic video data, such as movies and online videos. Automatic and accurate recognition of human actions in video is a fascinating capability. The potential applications range from surveillance and robotics to medical diagnosis, content-based video retrieval, and intelligent human-computer interfaces. The task is highly challenging due to the large variations in person appearances, dynamic backgrounds, view-point changes, lighting conditions, action styles and other factors. Statistical video representations based on local space-time features have recently been shown to be successful for action recognition in realistic scenarios. Their success can be attributed to the mild assumptions about the data and robustness to several variations in the video. Such representations, however, often encode videos as a disordered collection of low-level primitives. This thesis extends current methods by developing more discriminative features and integrating additional supervision into Bag-of-Features based video representations, aiming to improve action recognition in unconstrained and challenging video data. We start by evaluating a range of available local space-time feature detectors and descriptors under the standard Bag-of-Features framework. We then propose to improve the basic Bag-of-Features model by integrating additional supervision in the form of non-local region-level information. We further investigate an attribute-based representation, wherein the attributes range from objects (e.g., car, chair, table) to human poses and actions. We demonstrate that such a representation captures high-level information in video, and provides complementary information to the low-level features. We finally propose a novel local representation for human action recognition in video, denoted as Actlets. Actlets are body part detectors undergoing characteristic motion patterns. We train Actlets using a large synthetic video dataset of rendered avatars and demonstrate the advantages of Actlets for action recognition in realistic data. All methods proposed and developed in this thesis represent alternative ways of constructing supervised video representations and demonstrate improvements in human action recognition in realistic settings. 2012-10-23 eng PhD thesis Université Européenne de Bretagne
collection	NDLTD
language	English
sources	NDLTD
topic	[INFO:INFO_CV] Computer Science/Computer Vision and Pattern Recognition; [INFO:INFO_CV] Informatique/Vision par ordinateur et reconnaissance de formes; computer vision; action recognition
spellingShingle	[INFO:INFO_CV] Computer Science/Computer Vision and Pattern Recognition [INFO:INFO_CV] Informatique/Vision par ordinateur et reconnaissance de formes computer vision action recognition Muneeb Ullah, Muhammad Statistiques Supervisées pour la Reconnaissance d'Actions Humaines dans les Vidéos
description	This thesis addresses the problem of human action recognition in realistic video data, such as movies and online videos. Automatic and accurate recognition of human actions in video is a fascinating capability. The potential applications range from surveillance and robotics to medical diagnosis, content-based video retrieval, and intelligent human-computer interfaces. The task is highly challenging due to the large variations in person appearances, dynamic backgrounds, view-point changes, lighting conditions, action styles and other factors. Statistical video representations based on local space-time features have recently been shown to be successful for action recognition in realistic scenarios. Their success can be attributed to the mild assumptions about the data and robustness to several variations in the video. Such representations, however, often encode videos as a disordered collection of low-level primitives. This thesis extends current methods by developing more discriminative features and integrating additional supervision into Bag-of-Features based video representations, aiming to improve action recognition in unconstrained and challenging video data. We start by evaluating a range of available local space-time feature detectors and descriptors under the standard Bag-of-Features framework. We then propose to improve the basic Bag-of-Features model by integrating additional supervision in the form of non-local region-level information. We further investigate an attribute-based representation, wherein the attributes range from objects (e.g., car, chair, table) to human poses and actions. We demonstrate that such a representation captures high-level information in video, and provides complementary information to the low-level features. We finally propose a novel local representation for human action recognition in video, denoted as Actlets. Actlets are body part detectors undergoing characteristic motion patterns. We train Actlets using a large synthetic video dataset of rendered avatars and demonstrate the advantages of Actlets for action recognition in realistic data. All methods proposed and developed in this thesis represent alternative ways of constructing supervised video representations and demonstrate improvements in human action recognition in realistic settings.
author	Muneeb Ullah, Muhammad
author_facet	Muneeb Ullah, Muhammad
author_sort	Muneeb Ullah, Muhammad
title	Statistiques Supervisées pour la Reconnaissance d'Actions Humaines dans les Vidéos
title_short	Statistiques Supervisées pour la Reconnaissance d'Actions Humaines dans les Vidéos
title_full	Statistiques Supervisées pour la Reconnaissance d'Actions Humaines dans les Vidéos
title_fullStr	Statistiques Supervisées pour la Reconnaissance d'Actions Humaines dans les Vidéos
title_full_unstemmed	Statistiques Supervisées pour la Reconnaissance d'Actions Humaines dans les Vidéos
title_sort	statistiques supervisées pour la reconnaissance d'actions humaines dans les vidéos
publisher	Université Européenne de Bretagne
publishDate	2012
url	http://tel.archives-ouvertes.fr/tel-01063349 http://tel.archives-ouvertes.fr/docs/01/06/33/49/PDF/2012thesisUllah.pdf
work_keys_str_mv	AT muneebullahmuhammad statistiquessuperviseespourlareconnaissancedactionshumainesdanslesvideos
_version_	1716713806578581504


Similar Items