A Design of Bilingual Character and Speech Recognition System for Chinese and Hindi


Bibliographic Details
Main Authors: Wei-ting Yang, 楊為珽
Other Authors: Chih-Chien Chen
Format: Others
Language: zh-TW
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/42634151892082126651
Description
Summary: Master's === National Sun Yat-sen University === Department of Electrical Engineering === 101 === Chinese has more than 1.2 billion native speakers, ranking first among all languages in the world. Moreover, China has been the world's second-largest economy since 2010, with a gross domestic product (GDP) lower than that of the first-ranked U.S. but well over twice that of third-ranked Japan. The importance of Chinese is therefore clear. India has also made significant recent progress, especially in telecommunications and information software. Many foreign companies outsource their information management and customer service systems to India because of its lower labor costs and English fluency. This made India the eighth-largest economy in the world in 2013. Hindi, the most widely spoken language and the first official language of India, has more than 195 million native speakers. China and India, both members of the BRICS, offer tremendous market and business opportunities, and investment from countries around the world, including Taiwan, has been flourishing. Together the two languages have more than 1.395 billion speakers, about one fifth of the world's population. We hope that a character and speech recognition system for Chinese and Hindi will help people learn the languages, widen their perspectives, and promote the economy. In this thesis, both character and speech recognition systems are designed and implemented for Chinese and Hindi. The two-dimensional Fourier transform and the Karhunen-Loeve transform are used to extract character features, and a two-pattern strategy is then applied in the training process. On a PC with a 1.3 GHz Intel Core i5 running Windows 7, correct character recognition rates of 94.5% and 99.04% are reached for the 4,000-word Chinese and 5,000-word Hindi databases, respectively. Mel-frequency cepstral coefficients and linear predictive cepstral coefficients are used for speech feature extraction, and three-pattern training is then used to tune the template. On a PC with a 2.2 GHz AMD Athlon XP 2800+ running Ubuntu 9.04, correct speech recognition rates of 91.6% and 92% are obtained for the 7,000-phrase Chinese and 6,000-phrase Hindi databases, respectively.
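
The summary names the two-dimensional Fourier transform and the Karhunen-Loeve transform as the character feature extractors but gives no implementation details. The following is only a minimal sketch of how such a pipeline might look, not the thesis's actual method. It assumes NumPy, uses random arrays as stand-ins for character bitmaps, and all function names, the 16x16 image size, the 4x4 low-frequency block, and the 8 retained components are hypothetical choices made for illustration.

```python
import numpy as np


def fourier_features(image, k=4):
    """Return magnitudes of the k x k lowest non-negative-frequency 2-D DFT terms.

    The block size k is an illustrative assumption, not a value from the thesis.
    """
    spectrum = np.fft.fft2(np.asarray(image, dtype=float))
    return np.abs(spectrum[:k, :k]).ravel()


def fit_klt(samples, n_components):
    """Estimate a Karhunen-Loeve basis from row-wise feature vectors.

    The basis consists of the eigenvectors of the sample covariance matrix
    with the largest eigenvalues, ordered from largest to smallest.
    """
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]


def klt_project(samples, mean, basis):
    """Project centered feature vectors onto the Karhunen-Loeve basis."""
    return (samples - mean) @ basis


# Synthetic stand-ins for character images (assumption: 50 bitmaps of 16 x 16).
rng = np.random.default_rng(0)
images = rng.random((50, 16, 16))

features = np.array([fourier_features(img) for img in images])
mean, basis = fit_klt(features, n_components=8)
codes = klt_project(features, mean, basis)
print(codes.shape)  # (50, 8)
```

In a sketch like this, the Fourier step reduces each image to a short frequency-domain descriptor, and the Karhunen-Loeve step decorrelates those descriptors and keeps the directions of largest variance. The resulting vectors could then be compared against stored templates, although how the thesis performs that matching is not stated in the summary.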