A Study of Threshold Denoising on Modulation Spectrum for Robust Speech Recognition

Master's === National Chi Nan University === Department of Electrical Engineering === 102 === This paper presents a novel noise-robustness algorithm that enhances speech features for noisy speech recognition. In the presented algorithm, the temporal speech feature sequence is first converted to its spectrum via the discrete cosine transform (DCT) or discrete Fo...
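
The three steps named in the full abstract (transform the temporal feature sequence, threshold its spectrum, transform back) can be sketched roughly as below. This is only an illustration under stated assumptions, not the thesis's implementation: the record does not say which thresholding function or threshold-selection criterion is used, so soft thresholding and a hand-picked threshold are assumed here, only the DCT variant is shown, and all function names are hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a real sequence (the 'modulation spectrum' here)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of the orthonormal DCT-II above (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def soft_threshold(coeffs, thr):
    """ASSUMPTION: soft thresholding. Magnitudes shrink by thr; those below thr become 0."""
    return [math.copysign(max(abs(v) - thr, 0.0), v) for v in coeffs]


def denoise_feature_stream(x, thr):
    """Transform one temporal feature sequence, shrink small spectral values, transform back."""
    spectrum = dct(x)
    compensated = soft_threshold(spectrum, thr)
    return idct(compensated)


if __name__ == "__main__":
    # Hypothetical values standing in for one cepstral coefficient tracked over 8 frames.
    stream = [0.5, -1.2, 0.3, 0.9, -0.4, 0.1, 1.1, -0.7]
    print(denoise_feature_stream(stream, thr=0.3))
```

In this sketch a threshold of zero returns the input unchanged, and larger thresholds remove progressively more of the low-magnitude spectral content. How the threshold is actually optimized is left to the thesis itself.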


Bibliographic Details
Main Authors:	Yen-Chih Cheng, 程彥誌
Other Authors:	Gin-Der Wu
Format:	Others
Language:	zh-TW
Published:	2014
Online Access:	http://ndltd.ncl.edu.tw/handle/54555410084305100242

id	ndltd-TW-102NCNU0442075
record_format	oai_dc
spelling	ndltd-TW-102NCNU0442075 2015-10-13T23:38:01Z http://ndltd.ncl.edu.tw/handle/54555410084305100242 A Study of Threshold Denoising on Modulation Spectrum for Robust Speech Recognition 門檻值去噪法於調變頻譜之強健性語音辨識研究 Yen-Chih Cheng 程彥誌 Master's, National Chi Nan University, Department of Electrical Engineering, 102 This paper presents a novel noise-robustness algorithm that enhances speech features for noisy speech recognition. In the presented algorithm, the temporal speech feature sequence is first converted to its spectrum via the discrete cosine transform (DCT) or discrete Fourier transform (DFT), and the DCT- or DFT-based spectrum is then compensated by a thresholding function that further shrinks the smaller portion of the spectrum. Finally, the updated spectrum is converted back to the temporal domain to obtain the new feature sequence. The method has two advantages. First, the overall compensation process is unsupervised, so no information about the noise in the speech signals is required. Second, the threshold can be chosen flexibly according to various optimization criteria. Experiments on the Aurora-2 connected-digit database and task show that the presented methods significantly improve recognition accuracy for speech features pre-processed by any of the statistics normalization algorithms, including cepstral mean and variance normalization (CMVN), CMVN plus ARMA filtering (MVA), cepstral gain normalization (CGN), and histogram equalization (HEQ). The DFT-based thresholding methods outperform the DCT-based ones, but we further show that, with the DCT-based methods, compensating only the low-frequency portion performs on a par with compensating the entire frequency band. As a result, both the DCT- and DFT-based compensation methods are quite effective in enhancing the noise robustness of speech features. Gin-Der Wu Jeih-Weih Hung 吳俊德 洪志偉 2014 thesis 75 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Chi Nan University === Department of Electrical Engineering === 102 === This paper presents a novel noise-robustness algorithm that enhances speech features for noisy speech recognition. In the presented algorithm, the temporal speech feature sequence is first converted to its spectrum via the discrete cosine transform (DCT) or discrete Fourier transform (DFT), and the DCT- or DFT-based spectrum is then compensated by a thresholding function that further shrinks the smaller portion of the spectrum. Finally, the updated spectrum is converted back to the temporal domain to obtain the new feature sequence. The method has two advantages. First, the overall compensation process is unsupervised, so no information about the noise in the speech signals is required. Second, the threshold can be chosen flexibly according to various optimization criteria. Experiments on the Aurora-2 connected-digit database and task show that the presented methods significantly improve recognition accuracy for speech features pre-processed by any of the statistics normalization algorithms, including cepstral mean and variance normalization (CMVN), CMVN plus ARMA filtering (MVA), cepstral gain normalization (CGN), and histogram equalization (HEQ). The DFT-based thresholding methods outperform the DCT-based ones, but we further show that, with the DCT-based methods, compensating only the low-frequency portion performs on a par with compensating the entire frequency band. As a result, both the DCT- and DFT-based compensation methods are quite effective in enhancing the noise robustness of speech features.
author2	Gin-Der Wu
author_facet	Gin-Der Wu Yen-Chih Cheng 程彥誌
author	Yen-Chih Cheng 程彥誌
spellingShingle	Yen-Chih Cheng 程彥誌 A Study of Threshold Denoising on Modulation Spectrum for Robust Speech Recognition
author_sort	Yen-Chih Cheng
title	A Study of Threshold Denoising on Modulation Spectrum for Robust Speech Recognition
title_short	A Study of Threshold Denoising on Modulation Spectrum for Robust Speech Recognition
title_full	A Study of Threshold Denoising on Modulation Spectrum for Robust Speech Recognition
title_fullStr	A Study of Threshold Denoising on Modulation Spectrum for Robust Speech Recognition
title_full_unstemmed	A Study of Threshold Denoising on Modulation Spectrum for Robust Speech Recognition
title_sort	study of threshold denoising on modulation spectrum for robust speech recognition
publishDate	2014
url	http://ndltd.ncl.edu.tw/handle/54555410084305100242
