Summary: | Master's === National Chiao Tung University === Department of Computer Science and Information Engineering === 86 === This paper considers the design of a general keyword spotter for
Mandarin speech and utterance verification. The keyword spotting
design is general in two respects. First, we consider various
vocabulary sizes. Second, all contents of Mandarin speech are
treated as possible extraneous speech. We
establish the baseline keyword spotting systems according to the
framework of Huang et al. All extraneous speech can be modeled
under this framework. The framework can also be readily
integrated with the tree-trellis search algorithm to achieve
efficient search in large vocabulary tasks. This paper
considers three varieties of filler model structures for the
framework, based on the subsyllabic grammar of Mandarin speech. On
the basis of these three structures, we infer the problems of
this framework through three arguments. This paper then
presents two methods that modify the spotting mechanism according
to these arguments. The best top-1 inclusion rates of the
baseline system are 86.8%, 68.6%, and 52.8% for 500-, 5000-, and
25000-word systems, respectively. After our two proposed methods
are applied, the top-1 inclusion rates improve significantly, by
about 7%, 12%, and 13% for the 500-, 5000-, and 25000-word
systems, respectively. When the best result of each filler
structure for the 500-word vocabulary is taken as the keyword
spotting system and utterance verification is applied, the
recognition rates can be further improved at a reasonable false
rejection rate.
|