Disfluency Correction of Spontaneous Speech using Conditional Random Fields with Variable Length Features

Master's === National Cheng Kung University === Department of Computer Science and Information Engineering (Master's and Doctoral Program) === 94 === Speech recognition technologies have recently approached maturity. However, edit disfluency in spontaneous speech remains an important issue for practical applications. Most research on edit disfluency either focuses on a specific edit disflue...

Bibliographic Details
Main Authors: Wei-Yen Wu, 吳維彥
Other Authors: Chung-Hsien Wu
Format: Others
Language: zh-TW
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/50827960864536693806
id ndltd-TW-094NCKU5392011
record_format oai_dc
spelling ndltd-TW-094NCKU5392011 2016-05-30T04:21:58Z http://ndltd.ncl.edu.tw/handle/50827960864536693806 Disfluency Correction of Spontaneous Speech using Conditional Random Fields with Variable Length Features 應用不定長度特徵之條件隨機域於口語不流暢語流修正模型 Wei-Yen Wu 吳維彥 Master's National Cheng Kung University Department of Computer Science and Information Engineering (Master's and Doctoral Program) 94 Chung-Hsien Wu 吳宗憲 2006 thesis 69 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Cheng Kung University === Department of Computer Science and Information Engineering (Master's and Doctoral Program) === 94 === Speech recognition technologies have recently approached maturity. However, edit disfluency in spontaneous speech remains an important issue for practical applications. Most research on edit disfluency either focuses on a specific edit disfluency type or does not integrate multiple knowledge sources jointly. The central question is therefore how to detect and correct the three categories of edit disfluency using multiple knowledge sources. In this thesis we propose a conditional random field (CRF) model with variable-length features to detect and correct edit disfluency. The model is composed of state transition feature functions and observation feature functions. The observation feature functions consist of context-related, disfluency-related, and pattern-related features. Three variable-length units (word, chunk, and sentence) serve as the states of the state transition feature functions. Chunks are extracted with the Apriori algorithm according to word co-occurrence and term frequency. Sentences are identified according to the verb and its required arguments. Finally, the improved iterative scaling (IIS) algorithm is adopted to estimate the feature weights. To evaluate the proposed method, the Mandarin Conversational Dialogue Corpus (MCDC) is used as the spontaneous speech corpus. The edit-word detection error rate is 17.3%. Compared with DF-GRAM, Maximum Entropy, and the approach combining a language model with an alignment model, the proposed approach achieved improvements of 11.7%, 8%, and 3.9%, respectively. The experimental results show that the proposed model outperforms the other methods and efficiently detects and corrects edit disfluency in spontaneous speech.
author2 Chung-Hsien Wu
author_facet Chung-Hsien Wu
Wei-Yen Wu
吳維彥
author Wei-Yen Wu
吳維彥
spellingShingle Wei-Yen Wu
吳維彥
Disfluency Correction of Spontaneous Speech using Conditional Random Fields with Variable Length Features
author_sort Wei-Yen Wu
title Disfluency Correction of Spontaneous Speech using Conditional Random Fields with Variable Length Features
publishDate 2006
url http://ndltd.ncl.edu.tw/handle/50827960864536693806
_version_ 1718284964539138048
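
The description above says that chunks are extracted with the Apriori algorithm according to word co-occurrence and term frequency, but this record does not give the details. The following is only a minimal sketch of the general idea, assuming that a chunk is a contiguous word sequence and that a simple occurrence-count threshold stands in for the thesis's criteria. The function name frequent_chunks and the parameters min_support and max_len are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def frequent_chunks(sentences, min_support=2, max_len=3):
    """Return contiguous word n-grams (length 2..max_len) that occur
    at least min_support times, using Apriori-style pruning.

    Illustrative assumption only: the thesis's actual chunk criteria
    (word co-occurrence and term frequency) are not specified here.
    """
    # Level 1: words that are frequent on their own.
    word_counts = Counter(w for s in sentences for w in s)
    frequent = {(w,) for w, c in word_counts.items() if c >= min_support}

    chunks = {}
    for n in range(2, max_len + 1):
        level = Counter()
        for s in sentences:
            for i in range(len(s) - n + 1):
                gram = tuple(s[i:i + n])
                # Apriori pruning: a candidate is counted only if both of
                # its (n-1)-word sub-sequences were already frequent.
                if gram[:-1] in frequent and gram[1:] in frequent:
                    level[gram] += 1
        frequent = {g for g, c in level.items() if c >= min_support}
        if not frequent:
            break
        chunks.update({g: level[g] for g in frequent})
    return chunks


if __name__ == "__main__":
    toy = [
        ["i", "want", "to", "go"],
        ["i", "want", "to", "eat"],
        ["you", "want", "it"],
    ]
    for chunk, count in sorted(frequent_chunks(toy).items()):
        print(" ".join(chunk), count)
```

The pruning step is what makes this Apriori-like: longer candidates are only counted when their shorter parts already passed the threshold, so rare words never generate candidate chunks.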