Study of Modulation Spectrum Normalization Techniques for Robust Speech Recognition

Bibliographic Details
Main Authors: Chieh-cheng Wang, 王致程
Other Authors: Jeih-weih Hung
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/19625906694881317221
Description
Summary: Master's thesis === National Chi Nan University === Department of Electrical Engineering === 96 === The performance of an automatic speech recognition system is often degraded by noise embedded in the processed speech signal. A variety of techniques have been proposed to deal with this problem, and one category of these techniques aims to normalize the temporal statistics of the speech features, which is the main direction of the new approaches proposed here. In this thesis, we propose a series of noise-robustness approaches, all of which attempt to normalize the modulation spectrum of speech features. They include equi-ripple temporal filtering (ERTF), least-squares spectrum fitting (LSSF), and magnitude spectrum interpolation (MSI). With these approaches, the mismatch between the modulation spectra of clean and noise-corrupted speech features is reduced, and thus the resulting new features are expected to be more noise-robust. Recognition experiments conducted on the Aurora-2 digit database show that the three new approaches effectively improve recognition accuracy under a wide range of noise-corrupted environments. Moreover, it is also shown that they can be successfully combined with other noise-robustness approaches, such as CMVN and MVA, to achieve even better recognition performance.
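
The general idea of normalizing a modulation spectrum can be illustrated with a minimal sketch. It is not the thesis's ERTF, LSSF, or MSI algorithm, whose details this record does not give. It assumes the modulation spectrum is the DFT of one feature coefficient's trajectory over time, and it simply replaces the magnitude with a reference magnitude while keeping the original phase. The function names and the toy trajectories are hypothetical.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft_real(spec):
    """Inverse DFT, returning only the real part of each sample."""
    n = len(spec)
    return [
        sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def normalize_modulation_magnitude(trajectory, ref_magnitude):
    """Hypothetical helper: force the modulation-spectrum magnitude of a
    feature trajectory to match ref_magnitude, keeping the original phase.

    Assumption: ref_magnitude has the same length as trajectory and comes
    from a real-valued reference sequence, so it is symmetric in frequency.
    """
    spec = dft(trajectory)
    reshaped = [
        mag * cmath.exp(1j * cmath.phase(s)) for s, mag in zip(spec, ref_magnitude)
    ]
    return idft_real(reshaped)


# Toy data (made up for illustration): one feature coefficient over 8 frames.
clean = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
noisy = [0.5, 1.2, 0.4, -0.3, 0.6, 1.1, 0.2, -0.6]

reference = [abs(c) for c in dft(clean)]
normalized = normalize_modulation_magnitude(noisy, reference)
normalized_magnitude = [abs(c) for c in dft(normalized)]
```

After this step, the magnitude of the normalized trajectory's modulation spectrum matches the reference, which is the sense in which the mismatch between clean and noise-corrupted features is reduced. In practice the reference would be estimated from training data rather than from a single clean utterance.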