3D hand pose regression with variants of decision forests

3D hand pose regression is a fundamental component in many modern human-computer interaction applications such as sign language recognition, virtual object manipulation and game control. This thesis focuses on 3D pose regression for a single hand from depth data. The problem has many...


Bibliographic Details
Main Author: Tang, Danhang
Other Authors: Kim, Tae-Kyun
Published: Imperial College London 2015
Subjects: 006.6
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.684328
id ndltd-bl.uk-oai-ethos.bl.uk-684328
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-684328 2017-08-30T03:18:54Z 3D hand pose regression with variants of decision forests Tang, Danhang Kim, Tae-Kyun 2015 006.6 Imperial College London http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.684328 http://hdl.handle.net/10044/1/31531 Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 006.6
description 3D hand pose regression is a fundamental component in many modern human-computer interaction applications such as sign language recognition, virtual object manipulation and game control. This thesis focuses on 3D pose regression for a single hand from depth data. The problem has many challenges, including high degrees of freedom, severe viewpoint changes, self-occlusion and sensor noise. The main contribution of this work is a series of decision forest-based methods, proposed in a progressive manner so that each improves upon the previous one and state-of-the-art performance is achieved in the end. The thesis first introduces a novel algorithm called semi-supervised transductive regression (STR) forest, which combines transductive learning and semi-supervised learning to bridge the gap between synthetically generated, noise-free training data and real noisy data. Moreover, it incorporates a coarse-to-fine training quality function to handle viewpoint changes more efficiently. As a patch-based method, STR forest has high complexity during inference. To address this, the thesis proposes latent regression forest (LRF), a method that models the pose estimation problem as a coarse-to-fine search. This inherently combines the efficiency of a holistic method with the flexibility of a patch-based method, and thus runs at 62.5 FPS without CPU/GPU optimisation. Targeting the drawbacks of LRF, a new algorithm called hierarchical sampling forests is proposed to model the problem as a progressive search guided by kinematic structure. Hence the intermediate results (partial poses) can be verified by a new efficient energy function, and consequently the method can produce more accurate full poses. All these methods are thoroughly described, compared and published. The conclusion discusses and analyses their differences, limitations and usage scenarios, and then proposes a few ideas for future work.
author2 Kim, Tae-Kyun
author Tang, Danhang
title 3D hand pose regression with variants of decision forests
publisher Imperial College London
publishDate 2015
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.684328