Study on Semantic Objects Extraction using Information Structure Combined Prosodic Attribute Detection for Conversational Speech

Master's === National Chiayi University === Department of Computer Science and Information Engineering === 99 === Extracting keywords from conversational speech is one of the most essential issues in understanding speakers' utterances. This thesis addresses keyword spotting in spontaneous speech for semantic object detection. We propose an information-structure...

Full description

Bibliographic Details
Main Authors: Yin-Wei Chung, 鐘尹蔚
Other Authors: Jui-Feng Yeh
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/63699585816942165778
id ndltd-TW-099NCYU5392022
record_format oai_dc
spelling ndltd-TW-099NCYU53920222015-10-19T04:03:43Z http://ndltd.ncl.edu.tw/handle/63699585816942165778 Study on Semantic Objects Extraction using Information Structure Combined Prosodic Attribute Detection for Conversational Speech 以資訊結構結合音韻屬性偵測擷取對話語音之語意物件之研究 Yin-Wei Chung 鐘尹蔚 Master's National Chiayi University Department of Computer Science and Information Engineering 99 Extracting keywords from conversational speech is one of the most essential issues in understanding speakers' utterances. This thesis addresses keyword spotting in spontaneous speech for semantic object detection. We propose an information-structure-based approach that uses prosodic features for semantic object detection. Prosodic words are segmented from the speaker's utterance according to a pre-trained decision tree, and a support vector machine is then used as the classifier to judge whether each prosodic word is a semantic object. The thesis consists of three main parts: information structure, prosodic word segmentation, and semantic object detection. We first describe information structure, which originates in cognitive psychology. Instead of syntactic analysis, a pragmatic viewpoint is used to examine the content of the conversation: the content of an utterance is divided into focus and topic parts. For semantic object detection in ungrammatical spontaneous speech, information structure is more robust than syntactic analysis. The second part illustrates the decision-tree-based prosodic word boundary segmentation algorithm; besides data-driven features, knowledge obtained from corpus observation is integrated into the decision tree. Finally, the semantic objects in the focus part are extracted from prosodic features by a support vector machine (SVM). The experimental results show that the proposed method outperforms the phone verification approach, especially in recall and accuracy, demonstrating that the approach is effective for semantic object detection.
Jui-Feng Yeh 葉瑞峰 2011 學位論文 ; thesis 0 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chiayi University === Department of Computer Science and Information Engineering === 99 === Extracting keywords from conversational speech is one of the most essential issues in understanding speakers' utterances. This thesis addresses keyword spotting in spontaneous speech for semantic object detection. We propose an information-structure-based approach that uses prosodic features for semantic object detection. Prosodic words are segmented from the speaker's utterance according to a pre-trained decision tree, and a support vector machine is then used as the classifier to judge whether each prosodic word is a semantic object. The thesis consists of three main parts: information structure, prosodic word segmentation, and semantic object detection. We first describe information structure, which originates in cognitive psychology. Instead of syntactic analysis, a pragmatic viewpoint is used to examine the content of the conversation: the content of an utterance is divided into focus and topic parts. For semantic object detection in ungrammatical spontaneous speech, information structure is more robust than syntactic analysis. The second part illustrates the decision-tree-based prosodic word boundary segmentation algorithm; besides data-driven features, knowledge obtained from corpus observation is integrated into the decision tree. Finally, the semantic objects in the focus part are extracted from prosodic features by a support vector machine (SVM). The experimental results show that the proposed method outperforms the phone verification approach, especially in recall and accuracy, demonstrating that the approach is effective for semantic object detection.
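The abstract describes a two-stage cascade: a decision tree segments prosodic word boundaries from prosodic cues, then an SVM labels each prosodic word as a semantic object or not. The thesis's actual features and corpus are not given here, so the following scikit-learn sketch uses invented feature names (pause duration, F0 reset, word-level F0/energy/duration) and synthetic data purely to illustrate the shape of such a pipeline, not the author's implementation.

```python
# Hypothetical sketch of the decision-tree + SVM cascade described in the
# abstract. All feature names and data are invented for illustration.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stage 1: prosodic word boundary segmentation. Each row is a syllable
# juncture described by hypothetical prosodic cues (e.g. pause duration,
# F0 reset, final lengthening); label 1 marks a prosodic-word boundary.
X_junctures = rng.normal(size=(200, 3))
y_boundary = (X_junctures[:, 0] + 0.5 * X_junctures[:, 1] > 0).astype(int)
segmenter = DecisionTreeClassifier(max_depth=4, random_state=0)
segmenter.fit(X_junctures, y_boundary)

# Stage 2: semantic object detection. Each row is a segmented prosodic
# word described by word-level prosodic features (e.g. mean F0, energy,
# duration); label 1 marks a semantic object in the focus part.
X_words = rng.normal(size=(200, 3))
y_semantic = (X_words[:, 2] - X_words[:, 0] > 0).astype(int)
detector = SVC(kernel="rbf", gamma="scale")
detector.fit(X_words, y_semantic)

# Apply the cascade to unseen data: segment boundaries first, then
# classify the resulting prosodic words.
boundaries = segmenter.predict(rng.normal(size=(10, 3)))
labels = detector.predict(rng.normal(size=(10, 3)))
print(boundaries.shape, labels.shape)
```

In a real system the Stage-2 inputs would be built from the Stage-1 segmentation rather than sampled independently; the sketch keeps them separate only to show each classifier's interface.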
author2 Jui-Feng Yeh
author_facet Jui-Feng Yeh
Yin-Wei Chung
鐘尹蔚
author Yin-Wei Chung
鐘尹蔚
spellingShingle Yin-Wei Chung
鐘尹蔚
Study on Semantic Objects Extraction using Information Structure Combined Prosodic Attribute Detection for Conversational Speech
author_sort Yin-Wei Chung
title Study on Semantic Objects Extraction using Information Structure Combined Prosodic Attribute Detection for Conversational Speech
title_short Study on Semantic Objects Extraction using Information Structure Combined Prosodic Attribute Detection for Conversational Speech
title_full Study on Semantic Objects Extraction using Information Structure Combined Prosodic Attribute Detection for Conversational Speech
title_fullStr Study on Semantic Objects Extraction using Information Structure Combined Prosodic Attribute Detection for Conversational Speech
title_full_unstemmed Study on Semantic Objects Extraction using Information Structure Combined Prosodic Attribute Detection for Conversational Speech
title_sort study on semantic objects extraction using information structure combined prosodic attribute detection for conversational speech
publishDate 2011
url http://ndltd.ncl.edu.tw/handle/63699585816942165778
work_keys_str_mv AT yinweichung studyonsemanticobjectsextractionusinginformationstructurecombinedprosodicattributedetectionforconversationalspeech
AT zhōngyǐnwèi studyonsemanticobjectsextractionusinginformationstructurecombinedprosodicattributedetectionforconversationalspeech
AT yinweichung yǐzīxùnjiégòujiéhéyīnyùnshǔxìngzhēncèxiéqǔduìhuàyǔyīnzhīyǔyìwùjiànzhīyánjiū
AT zhōngyǐnwèi yǐzīxùnjiégòujiéhéyīnyùnshǔxìngzhēncèxiéqǔduìhuàyǔyīnzhīyǔyìwùjiànzhīyánjiū
_version_ 1718094487751753728