CNN Deep Learning with Wavelet Image Fusion of CCD RGB-IR and Depth-Grayscale Sensor Data for Hand Gesture Intention Recognition

Pixel-based images captured by a charge-coupled device (CCD) with infrared (IR) LEDs around the image sensor are the well-known CCD Red–Green–Blue IR (the so-called CCD RGB-IR) data. The CCD RGB-IR data are generally acquired for video surveillance applications. Currently, CCD RGB-IR information has...

Full description

Bibliographic Details
Published in: Sensors
Main Authors: Ing-Jr Ding, Nai-Wei Zheng
Format: Article
Language: English
Published: MDPI AG 2022-01-01
Subjects:
Online Access: https://www.mdpi.com/1424-8220/22/3/803
_version_ 1850758753175994368
author Ing-Jr Ding
Nai-Wei Zheng
collection DOAJ
container_title Sensors
description Pixel-based images captured by a charge-coupled device (CCD) with infrared (IR) LEDs around the image sensor are known as CCD Red–Green–Blue IR (CCD RGB-IR) data. CCD RGB-IR data are generally acquired for video surveillance applications, and have more recently been used for human gesture recognition in surveillance settings. Gesture recognition, including hand gesture intention recognition, is attracting considerable attention in deep neural network (DNN) research. To improve on conventional DNN-based CCD RGB-IR gesture recognition, this work proposes a deep learning framework in which a convolutional neural network (CNN) is combined with wavelet image fusion of CCD RGB-IR images and additional depth-grayscale images (captured by the depth sensors of the Microsoft Kinect device) for gesture intention recognition. The wavelet image fusion uses a five-level discrete wavelet transform (DWT) with three wavelet decomposition merge strategies, namely max-min, min-max and mean-mean; the Visual Geometry Group (VGG)-16 CNN is then used to learn and recognize the wavelet-fused gesture images. Experiments on classifying ten hand gesture intention actions (defined for a laboratory interaction scenario) show that adding depth-grayscale data to CCD RGB-IR gesture recognition raises the average recognition accuracy to 83.88% for the VGG-16 CNN with min-max wavelet image fusion of the CCD RGB-IR and depth-grayscale data, clearly higher than the 75.33% obtained by the VGG-16 CNN with CCD RGB-IR alone.
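The multi-level DWT fusion described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration and not the authors' implementation: the record does not name the wavelet family (a Haar wavelet is used here), and it does not define the merge strategies, so a name such as "min-max" is read as "min rule for the approximation coefficients, max rule for the detail coefficients". The rules are plain element-wise max, min and mean; some toolboxes compare coefficient magnitudes instead. The function names (`haar_dwt2`, `fuse`, and so on) are hypothetical, and images are plain lists of lists whose height and width are divisible by 2 to the power of the number of levels.

```python
# Illustrative sketch only: Haar wavelet, rule naming, and all function
# names are assumptions, not taken from the paper.

RULES = {"max": max, "min": min, "mean": lambda x, y: (x + y) / 2.0}


def haar_dwt2(img):
    """One level of an orthonormal 2-D Haar DWT. Returns (LL, (LH, HL, HH))."""
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        ll_r, lh_r, hl_r, hh_r = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            ll_r.append((a + b + d + e) / 2.0)  # local average (approximation)
            lh_r.append((a + b - d - e) / 2.0)  # top rows minus bottom rows
            hl_r.append((a - b + d - e) / 2.0)  # left columns minus right columns
            hh_r.append((a - b - d + e) / 2.0)  # diagonal difference
        ll.append(ll_r); lh.append(lh_r); hl.append(hl_r); hh.append(hh_r)
    return ll, (lh, hl, hh)


def haar_idwt2(ll, details):
    """Exact inverse of haar_dwt2."""
    lh, hl, hh = details
    out = []
    for i in range(len(ll)):
        top, bottom = [], []
        for j in range(len(ll[0])):
            s, v, h, g = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            top += [(s + v + h + g) / 2.0, (s + v - h - g) / 2.0]
            bottom += [(s - v + h - g) / 2.0, (s - v - h + g) / 2.0]
        out.append(top)
        out.append(bottom)
    return out


def wavedec2(img, levels):
    """Multi-level decomposition; details[0] holds the finest level."""
    ll, details = img, []
    for _ in range(levels):
        ll, d = haar_dwt2(ll)
        details.append(d)
    return ll, details


def waverec2(ll, details):
    """Multi-level reconstruction, coarsest level first."""
    for d in reversed(details):
        ll = haar_idwt2(ll, d)
    return ll


def merge(x, y, rule):
    """Element-wise merge of two equally sized coefficient arrays."""
    f = RULES[rule]
    return [[f(p, q) for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def fuse(img_a, img_b, levels, approx_rule, detail_rule):
    """Fuse two same-sized grayscale images in the wavelet domain."""
    ll_a, det_a = wavedec2(img_a, levels)
    ll_b, det_b = wavedec2(img_b, levels)
    ll = merge(ll_a, ll_b, approx_rule)
    det = [tuple(merge(p, q, detail_rule) for p, q in zip(da, db))
           for da, db in zip(det_a, det_b)]
    return waverec2(ll, det)


if __name__ == "__main__":
    # Two synthetic 32x32 "images" standing in for an RGB-IR channel and a
    # depth-grayscale frame; 32 is the smallest size that allows five levels.
    ir = [[float((r * 7 + c * 3) % 11) for c in range(32)] for r in range(32)]
    depth = [[float((r + 2 * c) % 5) for c in range(32)] for r in range(32)]
    fused = fuse(ir, depth, levels=5, approx_rule="min", detail_rule="max")
    print(len(fused), len(fused[0]))
```

Under this reading, the three strategies in the abstract correspond to `fuse(a, b, 5, "max", "min")`, `fuse(a, b, 5, "min", "max")` and `fuse(a, b, 5, "mean", "mean")`. In a pipeline like the one described, the fused image would then be passed to the CNN classifier; that step is not sketched here.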
format Article
id doaj-art-8e75162c3faa467f970cd1d0fd0a7898
institution Directory of Open Access Journals
issn 1424-8220
language English
publishDate 2022-01-01
publisher MDPI AG
record_format Article
spelling doaj-art-8e75162c3faa467f970cd1d0fd0a7898 | 2025-08-19T22:34:32Z | eng | MDPI AG | Sensors | 1424-8220 | 2022-01-01 | 22 | 3 | 803 | 10.3390/s22030803 | CNN Deep Learning with Wavelet Image Fusion of CCD RGB-IR and Depth-Grayscale Sensor Data for Hand Gesture Intention Recognition | Ing-Jr Ding (Department of Electrical Engineering, National Formosa University, Huwei, Yunlin 632, Taiwan) | Nai-Wei Zheng (Department of Electrical Engineering, National Formosa University, Huwei, Yunlin 632, Taiwan) | https://www.mdpi.com/1424-8220/22/3/803 | CCD RGB-IR | depth-grayscale | wavelet image fusion | DWT | CNN
title CNN Deep Learning with Wavelet Image Fusion of CCD RGB-IR and Depth-Grayscale Sensor Data for Hand Gesture Intention Recognition
topic CCD RGB-IR
depth-grayscale
wavelet image fusion
DWT
CNN
url https://www.mdpi.com/1424-8220/22/3/803