Design a Robust Taiwanese Tonal Phoneme Database for Taiwanese Text-To-Speech System

Bibliographic Details
Main Authors: Ming-Chun Hsu, 許銘峻
Other Authors: Kao-Chi Chung
Format: Others
Language: zh-TW
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/57058739168476114109
Description
Summary: Master's === National Cheng Kung University === Institute of Biomedical Engineering (Master's and Doctoral Program) === 97 === Taiwanese is one of the most commonly used languages in Taiwan, especially among middle-aged people, senior citizens, and residents of central and southern Taiwan. However, Taiwan's medical services, clinical education and training, and medical multimedia applications rely mainly on Mandarin speech. Taiwanese-speaking patients often have great difficulty communicating with medical staff, which leads to problems in diagnosis and treatment. Moreover, the number of elderly people and of people with hearing and/or speech impairments in Taiwan has been increasing year by year, and the elderly are at especially high risk of hearing impairment. Among assistive technologies, Augmentative and Alternative Communication (AAC) systems and hearing aids play an important role in helping people with disabilities interact with the outside world. Text-To-Speech (TTS) AAC systems can produce fluent, natural-sounding speech on the user's behalf. However, most research and development on communication and hearing assistive technology in Taiwan has focused on Mandarin and Western-language systems, leaving a lack of Taiwanese communication and hearing assistive technology. It is therefore important to develop Taiwanese speech technology for use in medical services, medical multimedia, and assistive technology. The purpose of this research is to design and establish a robust Taiwanese tonal phoneme database and then to develop a Taiwanese TTS system. More specifically, this research aims to: (1) design a training algorithm for Taiwanese balanced sentences and establish a speech database of Modern Literal Taiwanese (MLT) balanced sentences, (2) analyze the robustness of the tonal phoneme models by statistical methods and then establish a robust Taiwanese tonal phoneme database, and (3) develop a Taiwanese TTS system by combining an HMM-based TTS system with the Taiwanese tonal phoneme database.
The materials and methods are as follows: (1) MLT subsyllables are analyzed on the basis of Taiwanese phonetics and phonology to establish tonal phoneme models, (2) a training algorithm is designed for the Taiwanese balanced speech database, (3) the HMM Toolkit (HTK) is applied to recognize tonal phonemes, and the robust Taiwanese tonal phoneme set is validated through a Bayes screening test, and (4) the robust tonal phoneme models are combined with an HMM-based TTS system to develop the Taiwanese TTS system. The collected text corpus consists of 8,905 MLT sentences and one hundred thousand syllables from MLT books. A Taiwanese balanced-sentence speech database of 869 MLT sentences was established using a training and analysis system developed as a Windows program, and a further 218 sentences containing rare phoneme units were generated and added to the database. A phonetic set of 156 Taiwanese tonal phonemes was derived from the HTK recognition results, and the robustness of the set was validated statistically through sensitivity, specificity, and receiver operating characteristic (ROC) curve analysis. The HMM-based Taiwanese TTS system was successfully developed on the Linux and Windows operating systems, and the synthetic speech received a mean opinion score (MOS) of 4, indicating a relatively high level of naturalness. The results of this research can provide fundamental information and techniques for the development of indigenous clinical speech technology and Taiwanese computational linguistics. The outcomes of this study are expected to be applicable to fields such as Taiwan's medical services, clinical education and training, medical multimedia, augmentative and alternative communication (AAC), and rehabilitation technology.
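The sensitivity and specificity mentioned above can be illustrated with a minimal Python sketch. It assumes the recognizer output has already been aligned into (reference phoneme, recognized phoneme) pairs and treats each phoneme one-vs-rest; the function name, the phoneme labels, and the sample data are hypothetical and are not taken from the thesis or from HTK.

```python
def one_vs_rest_rates(pairs, label):
    """Return (sensitivity, specificity) for one phoneme label.

    pairs: iterable of (reference, recognized) phoneme labels.
    The target label is the positive class; all others are negative.
    """
    tp = fn = fp = tn = 0
    for ref, hyp in pairs:
        if ref == label and hyp == label:
            tp += 1          # target phoneme recognized correctly
        elif ref == label:
            fn += 1          # target phoneme missed
        elif hyp == label:
            fp += 1          # another phoneme mistaken for the target
        else:
            tn += 1          # other phoneme, not confused with the target
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical aligned recognition results (labels are illustrative only).
pairs = [("a1", "a1"), ("a1", "a1"), ("a1", "a2"), ("a2", "a2"),
         ("a2", "a1"), ("i1", "i1"), ("i1", "i1"), ("i1", "i1")]

sens_a1, spec_a1 = one_vs_rest_rates(pairs, "a1")
print(f"a1: sensitivity={sens_a1:.3f}, specificity={spec_a1:.3f}")
```

Repeating the calculation for every phoneme in the set, and for several decision thresholds where recognizer scores are available, would give the points of an ROC curve. How the thesis itself applied the screening test is not specified in this summary.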
Further research is recommended in the following areas: (1) analyzing and comparing the fundamental frequencies of the Taiwanese tones using the Taiwanese balanced-sentence database, (2) investigating the training effect of a greedy algorithm on balanced-sentence selection, (3) recording female speech so that the balanced-sentence database can serve a wider range of applications, (4) developing a consistent training and recognition protocol as a Windows program, (5) adding a Chinese-character input interface to the Taiwanese TTS system for general and community applications, and (6) constructing additional syntactic rules for Taiwanese sentences to determine the prosody of the synthetic speech, which should yield more colloquial output.
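As a rough illustration of greedy selection of balanced sentences, the Python sketch below repeatedly picks the sentence that covers the most phoneme units not yet covered. This is a generic greedy set-cover heuristic, shown under the assumption that each sentence has already been converted to a set of units; it is not the algorithm used in the thesis, and the function name, sentence ids, and unit labels are hypothetical.

```python
def greedy_select(sentences):
    """Pick sentences until every phoneme unit in the corpus is covered.

    sentences: dict mapping sentence id -> set of phoneme units it contains.
    Returns the chosen sentence ids in selection order.
    """
    uncovered = set().union(*sentences.values())
    chosen = []
    while uncovered:
        # Sentence adding the most new units; ties go to the smallest id.
        best = max(sorted(sentences),
                   key=lambda sid: len(sentences[sid] & uncovered))
        gain = sentences[best] & uncovered
        if not gain:         # defensive: nothing left that can be covered
            break
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical toy corpus (unit labels are illustrative only).
corpus = {
    "s1": {"a1", "i1", "k"},
    "s2": {"a1", "a2"},
    "s3": {"i1", "k"},
    "s4": {"a2", "u3", "k"},
}

selected = greedy_select(corpus)
print(selected)
```

A practical variant might weight rare units more heavily or require each unit to appear a minimum number of times, which would correspond to adding sentences for rare phoneme units.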