Integrating Blind Source Separation and Subspace Speech Enhancement for Ubiquitous Voice Control System


Bibliographic Details
Main Authors: Hung-Jen Kao, 高宏仁
Other Authors: Jhing-Fa Wang
Format: Others
Language: en_US
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/80163538613932712839
Description
Summary: Master's thesis, National Cheng Kung University, Department of Electrical Engineering (Master's and Doctoral Program), academic year 96 (ROC calendar). Speech is the easiest and most common way for people to communicate, so voice control has long been a sought-after interface. If 3C products (computers, communications, and consumer electronics) can be controlled by voice in the future, they will become more convenient and friendly in daily life. Most voice-controlled products record the command voice at close range, because long distance and interference significantly degrade recognition performance. We therefore propose two far-field ubiquitous voice-control systems that improve noise reduction and recognition rate. The first system targets stationary environments. A mixer first combines the multi-channel signals into a single-channel signal, which widens the recording coverage and speeds up computation. A single-channel subspace speech enhancement method is then applied to reduce the background noise. Finally, the speech segments are retained by end-point detection and recognized by an HMM-based Mandarin Chinese keyword recognizer. For the microphone setup in this system, we use uniformly distributed microphone locations instead of a microphone array, which reduces the number of microphones and the cost of the associated recording devices. The second system targets nonstationary environments with a low signal-to-noise ratio (SNR). For it we propose a novel architecture that combines convolutive blind source separation with subspace speech enhancement to suppress babble noise and background noise using a microphone-array setup. We set up two noisy environments containing different noises, and two microphone setups. The experimental results show that superior recognition rates can be obtained with both systems.
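The front end of the first system (mixer, single-channel subspace enhancement, end-point detection) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the abstract does not specify the mixer, the subspace estimator, or the end-point detector, so the sketch assumes a plain channel average as the mixer, white noise with a variance estimated from a leading noise-only segment, a simple eigenvalue-domain Wiener-like gain, and an energy threshold for end-point detection. All function names and parameters (`mix_channels`, `subspace_enhance`, `detect_endpoints`, `run_demo`, `dim`, `threshold_ratio`) are hypothetical, and the HMM keyword recognizer is omitted.

```python
import numpy as np

def mix_channels(multichannel):
    """Average an (n_channels, n_samples) array into one channel (assumed mixer)."""
    return np.mean(np.asarray(multichannel, dtype=float), axis=0)

def subspace_enhance(noisy, noise_var, dim=32):
    """Simplified signal-subspace enhancement, assuming white noise of variance noise_var."""
    n_vec = len(noisy) // dim
    frames = noisy[: n_vec * dim].reshape(n_vec, dim)  # non-overlapping vectors
    cov = frames.T @ frames / n_vec                    # sample covariance of the noisy signal
    eigvals, eigvecs = np.linalg.eigh(cov)             # eigenvectors form the KLT basis
    clean_vals = np.maximum(eigvals - noise_var, 0.0)  # estimated clean-signal eigenvalues
    gains = clean_vals / np.maximum(eigvals, 1e-12)    # Wiener-like gain per component
    coeffs = frames @ eigvecs                          # project onto the basis
    return ((coeffs * gains) @ eigvecs.T).reshape(-1)  # attenuate and reconstruct

def detect_endpoints(signal, frame_len=160, threshold_ratio=0.1):
    """Return (start, end) sample indices of the span of high-energy frames, or None."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.sum(frames ** 2, axis=1)
    active = np.flatnonzero(energy > threshold_ratio * energy.max())
    if active.size == 0:
        return None
    return int(active[0]) * frame_len, (int(active[-1]) + 1) * frame_len

def run_demo(seed=0):
    """Synthetic check: a 200 Hz tone burst recorded by four noisy channels."""
    rng = np.random.default_rng(seed)
    fs, n = 8000, 8000
    t = np.arange(n) / fs
    clean = np.zeros(n)
    clean[2000:6000] = np.sin(2 * np.pi * 200 * t[2000:6000])
    channels = clean + 0.3 * rng.standard_normal((4, n))  # independent noise per channel
    mixed = mix_channels(channels)
    noise_var = np.var(mixed[:1600])                       # leading segment holds noise only
    enhanced = subspace_enhance(mixed, noise_var)
    return {
        "mse_before": float(np.mean((mixed - clean) ** 2)),
        "mse_after": float(np.mean((enhanced - clean) ** 2)),
        "endpoints": detect_endpoints(enhanced),
    }

print(run_demo())
```

In this toy setting the tone occupies a two-dimensional subspace of each 32-sample vector, so components whose eigenvalues sit near the noise variance are attenuated while the tone passes almost unchanged. Real speech and colored or nonstationary noise would need frame-wise covariance estimates and noise prewhitening, which this sketch does not attempt.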