Modulation Spectrum Normalization for Robust Speech Recognition

Master's === National Chi Nan University === Department of Electrical Engineering === 101 === Demand for technology products has grown steadily. In the past, many everyday tasks relied on remote controls, keyboards, mice, and other input devices. Recent mobile communication, wireless networks, smart phones, and...

Bibliographic Details
Main Authors: Yu-Hung Yang, 楊玉鴻
Other Authors: Gin-Der Wu
Format: Others
Language: en_US
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/04376581319653589762
id ndltd-TW-101NCNU0442049
record_format oai_dc
spelling ndltd-TW-101NCNU0442049 2016-03-23T04:14:10Z http://ndltd.ncl.edu.tw/handle/04376581319653589762 Modulation Spectrum Normalization for Robust Speech Recognition 調變頻譜正規化法之強健性語音辨識 Yu-Hung Yang 楊玉鴻 Master's National Chi Nan University Department of Electrical Engineering 101 Gin-Der Wu 吳俊德 2013 thesis 42 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chi Nan University === Department of Electrical Engineering === 101 === Demand for technology products has grown steadily. In the past, many everyday tasks relied on remote controls, keyboards, mice, and other input devices. As mobile communication, wireless networks, and smart phones have become more sophisticated, communication between people and machines can take a more humane and natural form. In this thesis, we present two schemes to improve the noise robustness of features in speech recognition. Using cepstral mean and variance normalization (CMVN) and cepstral gain normalization (CGN), the processed temporal-domain feature sequence is first converted into the modulation spectral domain. The magnitude part of the modulation spectrum is decomposed into overlapping non-uniform sub-band segments, and each sub-band segment is then processed individually by the normalization methods. Recognition experiments on a speech database show that the two methods, like CMVN and CGN, effectively improve recognition across noisy environments and achieve better recognition performance.
author2 Gin-Der Wu
author_facet Gin-Der Wu
Yu-Hung Yang
楊玉鴻
author Yu-Hung Yang
楊玉鴻
spellingShingle Yu-Hung Yang
楊玉鴻
Modulation Spectrum Normalization for Robust Speech Recognition
author_sort Yu-Hung Yang
title Modulation Spectrum Normalization for Robust Speech Recognition
title_short Modulation Spectrum Normalization for Robust Speech Recognition
title_full Modulation Spectrum Normalization for Robust Speech Recognition
title_fullStr Modulation Spectrum Normalization for Robust Speech Recognition
title_full_unstemmed Modulation Spectrum Normalization for Robust Speech Recognition
title_sort modulation spectrum normalization for robust speech recognition
publishDate 2013
url http://ndltd.ncl.edu.tw/handle/04376581319653589762
work_keys_str_mv AT yuhungyang modulationspectrumnormalizationforrobustspeechrecognition
AT yángyùhóng modulationspectrumnormalizationforrobustspeechrecognition
AT yuhungyang diàobiànpínpǔzhèngguīhuàfǎzhīqiángjiànxìngyǔyīnbiànshí
AT yángyùhóng diàobiànpínpǔzhèngguīhuàfǎzhīqiángjiànxìngyǔyīnbiànshí
_version_ 1718211540594720768
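
The processing chain outlined in the description can be sketched roughly as follows. This is an illustrative reading of the abstract, not the thesis implementation. Assumptions: the modulation spectrum is taken as the DFT of each feature dimension along the time axis; CGN is taken as mean subtraction followed by division by the dynamic range (max minus min); each sub-band of the magnitude spectrum is mean/variance-matched to reference statistics from a clean sequence; overlapping bins are averaged; the original phase is reused for reconstruction. The band edges in EDGES and the helper names (band_stats, normalize_subbands, modulation_normalize) are hypothetical.

```python
import numpy as np

def cmvn(x, eps=1e-8):
    """Mean and variance normalization of a (frames, dims) sequence along time."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + eps)

def cgn(x, eps=1e-8):
    """Gain normalization (assumed form): subtract mean, divide by max - min."""
    return (x - x.mean(axis=0)) / (x.max(axis=0) - x.min(axis=0) + eps)

def band_stats(mag, edges):
    """Per-band (mean, std) of a (bins, dims) magnitude spectrum."""
    return [(mag[a:b].mean(axis=0), mag[a:b].std(axis=0)) for a, b in edges]

def normalize_subbands(mag, edges, refs, eps=1e-8):
    """Match each (possibly overlapping) sub-band to its reference mean/std."""
    acc = np.zeros_like(mag)
    cnt = np.zeros(mag.shape[0])
    for (a, b), (ref_mean, ref_std) in zip(edges, refs):
        seg = mag[a:b]
        z = (seg - seg.mean(axis=0)) / (seg.std(axis=0) + eps)
        # Magnitudes cannot be negative, so clip after rescaling.
        acc[a:b] += np.clip(z * ref_std + ref_mean, 0.0, None)
        cnt[a:b] += 1
    out = mag.copy()                       # bins outside every band stay as-is
    covered = cnt > 0
    out[covered] = acc[covered] / cnt[covered, None]  # average the overlaps
    return out

def modulation_normalize(x, edges, refs):
    """Normalize the modulation magnitude spectrum, keep phase, go back to time."""
    spec = np.fft.rfft(x, axis=0)
    mag, phase = np.abs(spec), np.angle(spec)
    new_mag = normalize_subbands(mag, edges, refs)
    return np.fft.irfft(new_mag * np.exp(1j * phase), n=x.shape[0], axis=0)

# Hypothetical overlapping, non-uniform bands for 64 frames (33 rfft bins).
EDGES = [(0, 3), (2, 6), (4, 12), (8, 20), (16, 33)]

rng = np.random.default_rng(0)
clean = rng.standard_normal((64, 13))                 # stand-in for clean features
noisy = clean + 0.5 * rng.standard_normal((64, 13))   # stand-in for noisy features

refs = band_stats(np.abs(np.fft.rfft(cmvn(clean), axis=0)), EDGES)
out = modulation_normalize(cmvn(noisy), EDGES, refs)
print(out.shape)
```

In this sketch cmvn could be swapped for cgn as the temporal-domain step. In practice the reference statistics would presumably come from clean training data rather than a single sequence, and the choice of band edges is a design decision the abstract does not specify.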