Summary: | Doctoral === National Cheng Kung University === Department of Computer Science and Information Engineering (Master's and Doctoral Program) === 94 === In recent decades, research on dialog systems has advanced significantly toward practical use. However, the variability of spontaneous speech sharply reduces the performance of acoustic speech recognition, and the lack of applicable semantic interpretation hinders dialog system development. This dissertation focuses on semantic extraction and matching in conversational dialog systems. It addresses several component problems: ontology alignment and domain ontology extraction, edit disfluency detection and correction, speech act identification, and query formulation.
We propose an island-driven algorithm for ontology alignment and domain ontology extraction from two existing knowledge bases, WordNet and HowNet, based on the co-occurrence of words in a bilingual parallel corpus. The resulting bilingual ontology covers more semantic information by integrating the two complementary knowledge bases. A domain-specific ontology is then extracted from the bilingual ontology for domain applications. Finally, domain-dependent terminology, together with the axioms relating domain terms as defined in a medical encyclopedia, is integrated into the domain-specific ontology.
Misunderstanding in computer processing of conversational speech usually results from edit disfluencies. Detecting and correcting edit disfluencies in spontaneous speech is therefore an important issue in dialog systems. Hypothesis testing on acoustic features is first adopted to detect potential interruption points (IPs) in the input speech. The word order of the utterance is then cleaned up around the potential IPs using a class-based cleanup language model. The deletable region and the correction are aligned using an alignment model. Finally, log-linear weighting is applied to optimize the result of edit disfluency detection and correction.
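The log-linear weighting step can be illustrated with a minimal sketch. It assumes each candidate correction carries one probability per model (for example, the cleanup language model and the alignment model); the function names, dictionary layout, weights, and probabilities are all hypothetical, and the sketch only applies fixed weights rather than optimizing them as the dissertation does.

```python
import math

def log_linear_score(probs, weights):
    """Weighted sum of log-probabilities: sum_i w_i * log(p_i).

    probs and weights are parallel lists; a small floor avoids log(0).
    """
    return sum(w * math.log(max(p, 1e-12)) for p, w in zip(probs, weights))

def best_hypothesis(hypotheses, weights):
    """Return the candidate with the highest log-linear score."""
    return max(hypotheses, key=lambda h: log_linear_score(h["probs"], weights))

# Hypothetical candidates: probs = [language-model prob, alignment-model prob]
candidates = [
    {"text": "candidate A", "probs": [0.5, 0.5]},
    {"text": "candidate B", "probs": [0.9, 0.1]},
]

equal = best_hypothesis(candidates, [1.0, 1.0])    # both models count equally
lm_only = best_hypothesis(candidates, [1.0, 0.0])  # alignment model ignored
```

Changing the weights changes which candidate wins, which is why the weights themselves are worth tuning on held-out data.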
Two approaches to speech act identification are proposed in this dissertation: a partial pattern tree that models the sentence pattern, and a semantic dependency graph. The first approach extracts the semantic words/concepts using latent semantic analysis (LSA). Based on the extracted semantic words and the domain ontology, a partial pattern tree is constructed to model the speech act of a spoken utterance. The partial pattern tree is used to deal with the ill-formed sentence problem in a spoken dialog system. Concept expansion based on the domain ontology is also adopted to improve system performance. The second approach models the discourse of a spoken dialog using semantic dependency graphs. By characterizing the discourse as a sequence of speech acts, discourse modeling becomes the identification of the speech act sequence. A statistical approach is adopted to model the relations between words in the user’s utterance using the semantic dependency graphs. Dependency relations between the headword and the other words in a sentence are detected using the semantic dependency grammar.
Two approaches to FAQ mining are investigated: one based on semantic segments, and the other based on words using independent aspects. The first presents a novel approach to semantic segment extraction and matching for retrieving information with natural language queries. A semantic matching method is presented to estimate the similarity between the semantic segments of the query and the questions in the FAQ collection. The second presents an approach to domain-specific FAQ (frequently asked question) retrieval using independent aspects. For semantic representation of the aspects, a domain-specific ontology is used as the domain knowledge representation. A probabilistic mixture model is then used to interpret the query and QA pairs based on independent aspects. The expectation-maximization (EM) algorithm is employed to estimate the optimal mixing weights in the probabilistic mixture model.
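As a rough illustration of estimating mixing weights with EM, the sketch below assumes the per-aspect likelihoods are already computed and held fixed, so only the weights are updated. The function name, input layout, and toy numbers are assumptions for illustration and are not taken from the dissertation.

```python
def em_mixing_weights(likelihoods, n_iter=100, tol=1e-9):
    """Estimate mixture weights by EM with fixed component likelihoods.

    likelihoods[i][k] is p(x_i | component k) for sample i.
    """
    n_components = len(likelihoods[0])
    weights = [1.0 / n_components] * n_components  # uniform start
    for _ in range(n_iter):
        # E-step: accumulate each component's responsibility per sample.
        totals = [0.0] * n_components
        for row in likelihoods:
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # sample explained by no component; skip it
            for k, value in enumerate(joint):
                totals[k] += value / norm
        used = sum(totals)
        if used == 0.0:
            break
        # M-step: new weight is the average responsibility.
        new_weights = [t / used for t in totals]
        converged = max(abs(a - b) for a, b in zip(weights, new_weights)) < tol
        weights = new_weights
        if converged:
            break
    return weights

# Toy data: three samples fit only component 0, one fits only component 1.
toy = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
estimated = em_mixing_weights(toy)
```

In a retrieval setting of the kind described above, each component would correspond to one independent aspect, and the learned weights would indicate how much each aspect contributes when scoring a query against a QA pair.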
Finally, this dissertation describes a multi-service dialog system in the medical domain, built to evaluate the proposed methods. The experimental results show that the proposed methods improve semantic extraction and matching for dialog systems.
|