Robust Speech Recognition with Two-dimensional Frame-and-feature Weighting and Modulation Spectrum Normalization

Master's thesis === National Taiwan University (國立臺灣大學) === Graduate Institute of Communication Engineering (電信工程學研究所) === 100 === In this paper we propose a new approach to robust speech recognition: two-dimensional frame-and-feature weighted Viterbi decoding performed at the recognizer back-end. The frame weighting is based on a Support Vector Machine (SVM) classifier considering the...

Bibliographic Details
Main Authors: Yang Chang, 張暘
Other Authors: 李琳山
Format: Others
Language: zh-TW
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/80127356350852988068
id ndltd-TW-100NTU05435053
record_format oai_dc
spelling ndltd-TW-100NTU054350532015-10-13T21:50:16Z http://ndltd.ncl.edu.tw/handle/80127356350852988068 Robust Speech Recognition with Two-dimensional Frame-and-feature Weighting and Modulation Spectrum Normalization 使用二維特徵音框權重法及調變頻譜正規化之強健型語音辨識 Yang Chang 張暘 碩士 國立臺灣大學 電信工程學研究所 100 In this paper we propose a new approach to robust speech recognition: two-dimensional frame-and-feature weighted Viterbi decoding performed at the recognizer back-end. The frame weighting is based on a Support Vector Machine (SVM) classifier considering the energy distribution and cross-correlation spectrum of the frame. The basic idea is that voiced frames with higher harmonicity are in general more reliable than other frames in noisy speech and should therefore be weighted higher. The feature weighting is based on an entropy measure that accounts for confusion between phoneme classes. The basic idea is that scores obtained with more discriminating features, which cause less confusion between phonemes, should be weighted higher. These two weighting schemes, one along each dimension (frames and features), are then integrated in Viterbi decoding. Significant improvements were achieved in extensive experiments in the Aurora 4 testing environment for all types of noise and all SNR values. 李琳山 2012 學位論文 ; thesis 87 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Taiwan University (國立臺灣大學) === Graduate Institute of Communication Engineering (電信工程學研究所) === 100 === In this paper we propose a new approach to robust speech recognition: two-dimensional frame-and-feature weighted Viterbi decoding performed at the recognizer back-end. The frame weighting is based on a Support Vector Machine (SVM) classifier considering the energy distribution and cross-correlation spectrum of the frame. The basic idea is that voiced frames with higher harmonicity are in general more reliable than other frames in noisy speech and should therefore be weighted higher. The feature weighting is based on an entropy measure that accounts for confusion between phoneme classes. The basic idea is that scores obtained with more discriminating features, which cause less confusion between phonemes, should be weighted higher. These two weighting schemes, one along each dimension (frames and features), are then integrated in Viterbi decoding. Significant improvements were achieved in extensive experiments in the Aurora 4 testing environment for all types of noise and all SNR values.
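The abstract above does not give the thesis's actual formulas, so the following is only an illustrative sketch of how per-frame and per-feature weights might enter Viterbi decoding. It assumes diagonal-Gaussian state emissions and assumes the weights simply scale the per-dimension log-likelihoods. The function name `weighted_viterbi` and all parameters are hypothetical, and the weights are passed in directly rather than derived from an SVM or an entropy measure.

```python
import math


def weighted_viterbi(obs, means, variances, log_trans, log_init, frame_w, feat_w):
    """Viterbi decoding with weighted emission scores (illustrative assumption).

    obs:        T frames, each a list of D feature values
    means:      S states x D per-dimension Gaussian means
    variances:  S states x D per-dimension Gaussian variances
    log_trans:  S x S log transition probabilities, log_trans[i][j] = i -> j
    log_init:   S log initial-state probabilities
    frame_w:    T frame weights (assumed given, e.g. a reliability score)
    feat_w:     D feature weights (assumed given)
    """
    n_states = len(means)

    def emission(t, s):
        # Weighted sum of per-dimension Gaussian log-densities,
        # then scaled by the weight of frame t.
        total = 0.0
        for d, x in enumerate(obs[t]):
            var = variances[s][d]
            log_pdf = -0.5 * (math.log(2 * math.pi * var)
                              + (x - means[s][d]) ** 2 / var)
            total += feat_w[d] * log_pdf
        return frame_w[t] * total

    # Initialization with the first frame.
    delta = [log_init[s] + emission(0, s) for s in range(n_states)]
    back = []

    # Recursion: keep the best predecessor for every state at every frame.
    for t in range(1, len(obs)):
        new_delta, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log_trans[i][j])
            pointers.append(best_i)
            new_delta.append(delta[best_i] + log_trans[best_i][j]
                             + emission(t, j))
        delta = new_delta
        back.append(pointers)

    # Backtrace from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, max(delta)


# Toy two-state model; the middle frame looks like state 1.
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
log_init = [math.log(0.5), math.log(0.5)]
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
obs = [[0.0, 0.0], [5.0, 5.0], [0.0, 0.0]]

uniform_path, _ = weighted_viterbi(obs, means, variances, log_trans, log_init,
                                   [1.0, 1.0, 1.0], [1.0, 1.0])
masked_path, _ = weighted_viterbi(obs, means, variances, log_trans, log_init,
                                  [1.0, 0.0, 1.0], [1.0, 1.0])
print(uniform_path, masked_path)
```

With all weights set to 1 this reduces to ordinary Viterbi decoding. In the toy example, giving the middle frame a weight of 0 (treating it as unreliable) lets the transition probabilities alone decide that frame, so the decoded path stays in state 0 instead of switching to state 1 and back.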
author2 李琳山
author_facet 李琳山
Yang Chang
張暘
author Yang Chang
張暘
spellingShingle Yang Chang
張暘
Robust Speech Recognition with Two-dimensional Frame-and-feature Weighting and Modulation Spectrum Normalization
author_sort Yang Chang
title Robust Speech Recognition with Two-dimensional Frame-and-feature Weighting and Modulation Spectrum Normalization
publishDate 2012
url http://ndltd.ncl.edu.tw/handle/80127356350852988068