Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks

This research investigated real-time fingertip detection in frames captured from the increasingly popular wearable device, smart glasses. The egocentric-view fingertip detection and character recognition can be used to create a novel way of inputting texts. We first employed Unity3D to build a synth...

Bibliographic Details
Main Authors: Yung-Han Chen, Chi-Hsuan Huang, Sin-Wun Syu, Tien-Ying Kuo, Po-Chyi Su
Format: Article
Language: English
Published: MDPI AG 2021-06-01
Series: Sensors
Subjects: air-writing, fingertip detection, region-based convolutional neural network, smart glasses
Online Access: https://www.mdpi.com/1424-8220/21/13/4382
id doaj-4889cac2a24f44b0b41413dd4c9b7384
record_format Article
spelling doaj-4889cac2a24f44b0b41413dd4c9b7384
2021-07-15T15:45:17Z
eng
MDPI AG
Sensors
1424-8220
2021-06-01
21
4382
4382
10.3390/s21134382
Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks
Yung-Han Chen, Department of Computer Science and Information Engineering, National Central University, Taoyuan City 32001, Taiwan
Chi-Hsuan Huang, Department of Computer Science and Information Engineering, National Central University, Taoyuan City 32001, Taiwan
Sin-Wun Syu, Department of Computer Science and Information Engineering, National Central University, Taoyuan City 32001, Taiwan
Tien-Ying Kuo, Department of Electrical Engineering, National Taipei University of Technology, Taipei 10608, Taiwan
Po-Chyi Su, Department of Computer Science and Information Engineering, National Central University, Taoyuan City 32001, Taiwan
This research investigated real-time fingertip detection in frames captured from the increasingly popular wearable device, smart glasses. The egocentric-view fingertip detection and character recognition can be used to create a novel way of inputting texts. We first employed Unity3D to build a synthetic dataset with pointing gestures from the first-person perspective. The obvious benefits of using synthetic data are that they eliminate the need for time-consuming and error-prone manual labeling and they provide a large and high-quality dataset for a wide range of purposes. Following that, a modified Mask Regional Convolutional Neural Network (Mask R-CNN) is proposed, consisting of a region-based CNN for finger detection and a three-layer CNN for fingertip location. The process can be completed in 25 ms per frame for 640 × 480 RGB images, with an average error of 8.3 pixels. The speed is high enough to enable real-time “air-writing”, where users are able to write characters in the air to input texts or commands while wearing smart glasses. The characters can be recognized by a ResNet-based CNN from the fingertip trajectories. Experimental results demonstrate the feasibility of this novel methodology.
https://www.mdpi.com/1424-8220/21/13/4382
air-writing
fingertip detection
region-based convolutional neural network
smart glasses
collection DOAJ
language English
format Article
sources DOAJ
author Yung-Han Chen
Chi-Hsuan Huang
Sin-Wun Syu
Tien-Ying Kuo
Po-Chyi Su
spellingShingle Yung-Han Chen
Chi-Hsuan Huang
Sin-Wun Syu
Tien-Ying Kuo
Po-Chyi Su
Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks
Sensors
air-writing
fingertip detection
region-based convolutional neural network
smart glasses
author_facet Yung-Han Chen
Chi-Hsuan Huang
Sin-Wun Syu
Tien-Ying Kuo
Po-Chyi Su
author_sort Yung-Han Chen
title Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks
title_short Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks
title_full Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks
title_fullStr Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks
title_full_unstemmed Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks
title_sort egocentric-view fingertip detection for air writing based on convolutional neural networks
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2021-06-01
description This research investigated real-time fingertip detection in frames captured from the increasingly popular wearable device, smart glasses. The egocentric-view fingertip detection and character recognition can be used to create a novel way of inputting texts. We first employed Unity3D to build a synthetic dataset with pointing gestures from the first-person perspective. The obvious benefits of using synthetic data are that they eliminate the need for time-consuming and error-prone manual labeling and they provide a large and high-quality dataset for a wide range of purposes. Following that, a modified Mask Regional Convolutional Neural Network (Mask R-CNN) is proposed, consisting of a region-based CNN for finger detection and a three-layer CNN for fingertip location. The process can be completed in 25 ms per frame for 640 × 480 RGB images, with an average error of 8.3 pixels. The speed is high enough to enable real-time “air-writing”, where users are able to write characters in the air to input texts or commands while wearing smart glasses. The characters can be recognized by a ResNet-based CNN from the fingertip trajectories. Experimental results demonstrate the feasibility of this novel methodology.
topic air-writing
fingertip detection
region-based convolutional neural network
smart glasses
url https://www.mdpi.com/1424-8220/21/13/4382
work_keys_str_mv AT yunghanchen egocentricviewfingertipdetectionforairwritingbasedonconvolutionalneuralnetworks
AT chihsuanhuang egocentricviewfingertipdetectionforairwritingbasedonconvolutionalneuralnetworks
AT sinwunsyu egocentricviewfingertipdetectionforairwritingbasedonconvolutionalneuralnetworks
AT tienyingkuo egocentricviewfingertipdetectionforairwritingbasedonconvolutionalneuralnetworks
AT pochyisu egocentricviewfingertipdetectionforairwritingbasedonconvolutionalneuralnetworks
_version_ 1721298506578132992