Summary: | Master's === National Chiao Tung University === Institute of Communication Engineering === 83 === In this thesis, several techniques to improve the initial-final
based HMM method for continuous Mandarin speech recognition are
proposed. The baseline system uses 100 right-context-dependent
initial HMM models and 39 context-independent final HMM models.
First, the technique of bounded state duration is employed to
model the temporal structure of speech signals and incorporated
into the recognition process. A syllable penalty is then applied
to reduce the high rate of insertion errors.
We then employ the technique of signal normalization to improve
the system. The performance of the recognizer is then further
improved by using gender-dependent HMM models. The effectiveness
of these techniques was confirmed by simulations on a
speaker-independent task: recognizing continuous Mandarin speech
over a telephone channel. The syllable recognition rate rose
from 30.86% to 42.14%. Finally, an RNN-based
finite state machine is proposed to pre-segment the input
signal into four states: initial, final, silence, and
transient. State-dependent constraints are then set to
restrict the search for the optimal path, reducing the
computational load of the one-stage recognition process.
Experimental results showed that about half of the computations
can be saved with only a minor loss in recognition rate.
|
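
The bounded state duration idea mentioned in the summary can be illustrated with a small dynamic-programming sketch. This is not the thesis's implementation. It assumes a strictly left-to-right model with no state skips, precomputed per-frame log-likelihoods, and minimum durations of at least one frame. The function name `bounded_duration_align` and the parameters `loglik`, `dmin`, and `dmax` are hypothetical names chosen for this example.

```python
import math


def bounded_duration_align(loglik, dmin, dmax):
    """Align T frames to N left-to-right states under duration bounds.

    loglik[t][i] is the log-likelihood of frame t under state i (assumed given).
    State i must occupy between dmin[i] and dmax[i] consecutive frames,
    with dmin[i] >= 1. Returns (best_score, durations), or (-inf, None)
    when no alignment satisfies the bounds.
    """
    T, N = len(loglik), len(dmin)
    NEG = -math.inf

    # prefix[i][t] = sum of loglik[0..t-1][i], so any segment score is O(1).
    prefix = [[0.0] * (T + 1) for _ in range(N)]
    for i in range(N):
        for t in range(T):
            prefix[i][t + 1] = prefix[i][t] + loglik[t][i]

    # best[i][t]: best score when the first i states cover exactly frames 0..t-1.
    best = [[NEG] * (T + 1) for _ in range(N + 1)]
    back = [[0] * (T + 1) for _ in range(N + 1)]
    best[0][0] = 0.0

    for i in range(1, N + 1):
        s = i - 1  # index of the state that ends at frame t
        for t in range(1, T + 1):
            # Only durations inside [dmin, dmax] are ever considered.
            for d in range(dmin[s], min(dmax[s], t) + 1):
                prev = best[i - 1][t - d]
                if prev == NEG:
                    continue
                score = prev + prefix[s][t] - prefix[s][t - d]
                if score > best[i][t]:
                    best[i][t] = score
                    back[i][t] = d

    if best[N][T] == NEG:
        return NEG, None

    # Backtrack to recover how many frames each state occupied.
    durations = []
    t = T
    for i in range(N, 0, -1):
        d = back[i][t]
        durations.append(d)
        t -= d
    durations.reverse()
    return best[N][T], durations


# Toy data: frame 0 fits state 0, frames 1..5 fit state 1.
loglik = [[-1.0, -5.0]] + [[-5.0, -1.0]] * 5
print(bounded_duration_align(loglik, [2, 2], [4, 4]))  # (-10.0, [2, 4])
```

Without bounds, the toy input would place a single frame in the first state. The minimum duration of two frames forces a longer first segment at a lower score, which shows how duration bounds rule out implausibly short segments during the search.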