Transition Motion Synthesis for Video-Based Text to ASL

Bibliographic Details
Main Authors: Yulia, 潘秀蓮
Other Authors: Chuan-Kai Yang
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/4jw5ru
Description
Summary: Master's === National Taiwan University of Science and Technology === Department of Information Management === 107 === This research describes a novel approach to providing text-to-ASL media: a Video-Based Text to ASL system. The hearing impaired, also referred to as the Deaf, are accustomed to communicating in sign language. When confronted with spoken language, they have difficulty reading the spoken words as fast as hearing people can. The availability of a public dataset, the ASL Lexicon Dataset, presents the challenge of building a video-based interpreter for the Deaf. The problem lies in the transition from one word to the next, since such transitions do not exist in the original dataset. Our focus is therefore on producing a better transition from one word to another than an abrupt cut (a "blink"). After the dataset has been preprocessed, it is fed to the OpenPose library, which extracts the signers' skeletons and saves them as JSON files. The user enters glosses as text, and the system retrieves the JSON files and videos for the corresponding glosses. The complete sequences of the original videos are also fed into the system to serve as transition pools. The frames of the requested glosses are then combined with the transition pools to construct the sequences of transition frames. Once the sequences have been obtained, a smoothing algorithm is applied to make the motion smoother. Because the algorithm depends entirely on the transition pools, it is limited in its ability to produce a good transition: if the frames needed for a logically and visually correct motion are not available, the result will not be optimal. As long as the needed frames are available, however, the system can generate logically and visually correct transitions.
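
The pipeline outlined in the summary (skeleton JSON, transition pool, smoothing) can be illustrated with a minimal Python sketch. The abstract does not specify the selection or smoothing algorithms, so everything below is an assumption made for illustration: the greedy nearest-pose search, the moving-average filter over keypoints, and the function names load_pose, pose_distance, pick_transition, and smooth are all hypothetical. The only detail taken from OpenPose itself is its per-frame JSON layout, in which each person carries a flat pose_keypoints_2d list of (x, y, confidence) triples.

```python
import json
import math


def load_pose(openpose_json):
    """Parse one OpenPose per-frame JSON string into (x, y) keypoints.

    Only the first detected person is used (an assumption: one signer per video).
    """
    data = json.loads(openpose_json)
    flat = data["people"][0]["pose_keypoints_2d"]  # x, y, confidence triples
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 3)]


def pose_distance(a, b):
    """Mean Euclidean distance between corresponding keypoints of two poses."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def pick_transition(start, end, pool, max_frames=10):
    """Greedy transition search (hypothetical, not the thesis's algorithm).

    Starting from the last pose of one gloss, repeatedly pick the pool frame
    nearest to the current pose among those strictly closer to the first pose
    of the next gloss. Returns the chosen pool indices in playback order.
    """
    path = []
    current = start
    for _ in range(max_frames):
        remaining = pose_distance(current, end)
        candidates = [i for i, pose in enumerate(pool)
                      if pose_distance(pose, end) < remaining]
        if not candidates:
            break  # the pool has no frame that moves closer to the target
        best = min(candidates, key=lambda i: pose_distance(pool[i], current))
        path.append(best)
        current = pool[best]
    return path


def smooth(frames, window=3):
    """Centered moving average over each keypoint's trajectory.

    The window shrinks at the sequence boundaries.
    """
    half = window // 2
    out = []
    for t in range(len(frames)):
        lo, hi = max(0, t - half), min(len(frames), t + half + 1)
        n = hi - lo
        out.append([
            (sum(frames[k][j][0] for k in range(lo, hi)) / n,
             sum(frames[k][j][1] for k in range(lo, hi)) / n)
            for j in range(len(frames[t]))
        ])
    return out
```

The early exit in pick_transition mirrors the limitation stated in the summary: when the pool contains no frame that brings the pose closer to the next gloss, the search stops and the transition stays incomplete.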