Summary: | Master's Thesis === National Taiwan University of Science and Technology === Department of Information Management === 107 === This research describes a novel approach to providing a text-to-ASL medium: a video-based text-to-ASL interpreter. The hearing impaired, also referred to as the Deaf, usually communicate using sign language. When faced with a spoken language, they have difficulty reading the written words as quickly as hearing people do.
The availability of a public dataset named the ASL Lexicon Dataset poses the challenge of building a video-based interpreter for the Deaf. The problem lies in the transition from one word to the next, since such transitions do not exist in the original dataset. Accordingly, our focus is on how to produce a better transition between words, rather than an abrupt jump.
After the dataset has been preprocessed, the videos are fed to the OpenPose library to extract the signers' skeletons, which are saved as JSON files. The system asks the user to input glosses as text, then finds the JSON files and videos for the corresponding glosses. The full sequences of the original videos are also fed into the system to serve as a transition pool. The frames of the corresponding glosses are then combined with the transition pool to construct the sequences of transition frames. After the sequences are obtained, a smoothing algorithm is applied to enhance the smoothness of the motion.
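The pipeline above can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the JSON layout, the endpoint-distance heuristic for picking transition frames from the pool, and the moving-average smoother are all assumptions introduced here for clarity.

```python
import json
import numpy as np

def load_keypoints(path):
    """Load skeleton keypoints from a JSON file (hypothetical OpenPose-like
    layout: each frame is a flat list [x0, y0, c0, x1, y1, c1, ...])."""
    with open(path) as f:
        data = json.load(f)
    return [np.array(frame).reshape(-1, 3)[:, :2] for frame in data["frames"]]

def pose_distance(a, b):
    """Mean Euclidean distance between two skeletons of matching joint count."""
    return float(np.mean(np.linalg.norm(a - b, axis=1)))

def pick_transition(end_pose, start_pose, pool, k=5):
    """Pick k frames from the transition pool that bridge the last pose of one
    gloss and the first pose of the next, by minimizing the summed distance
    to both endpoints (a simplified stand-in for the selection step)."""
    scored = sorted(
        pool,
        key=lambda p: pose_distance(p, end_pose) + pose_distance(p, start_pose),
    )
    return scored[:k]

def smooth(frames, window=3):
    """Moving-average smoothing of each joint's trajectory over time,
    standing in for the smoothing algorithm applied to the final sequence."""
    arr = np.stack(frames)                      # shape (T, joints, 2)
    kernel = np.ones(window) / window
    out = np.empty_like(arr)
    for j in range(arr.shape[1]):
        for c in range(2):
            out[:, j, c] = np.convolve(arr[:, j, c], kernel, mode="same")
    return list(out)
```

In this sketch, concatenating gloss frames, the picked transition frames, and then smoothing the joined keypoint sequence mirrors the described flow: gloss lookup, transition-pool selection, and motion smoothing.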
Since this algorithm fully depends on the transition pool, there are some limitations on producing a good transition. If the transition frames needed to form a logically and visually correct motion are not available, the result will not be optimal. But as long as the required frames are available, the system can generate logically and visually correct transitions.
|