Classification Error-based Linear Discriminative Feature Transformation for Large Vocabulary Continuous Speech Recognition


Bibliographic Details
Main Authors: Hung-Shin Lee, 李鴻欣
Other Authors: Berlin Chen
Format: Others
Language: zh-TW
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/6w7r9s
Summary: Master's Thesis === National Taiwan Normal University === Graduate Institute of Computer Science and Information Engineering === 97 === The goal of linear discriminant analysis (LDA) is to seek a linear transformation that projects an original data set into a lower-dimensional feature subspace while simultaneously retaining geometrical class separability. However, LDA cannot always guarantee better classification accuracy. One possible reason is that its criterion is not directly associated with the classification error rate, so it does not necessarily accommodate the allocation rule governed by a given classifier, such as that employed in automatic speech recognition (ASR). In this thesis, we extend classical LDA by leveraging the relationship between the empirical phone classification error rate and the Mahalanobis distance for each phone class pair. To this end, we modify the original between-class scatter from a measure of Euclidean distance to one based on the pairwise empirical classification accuracy for each class pair, while preserving the lightweight solvability and distribution-free nature of LDA. Furthermore, we present a novel discriminative linear feature transformation, named generalized likelihood ratio discriminant analysis (GLRDA), based on the likelihood ratio test (LRT). It seeks a lower-dimensional feature subspace by making the most confusable situation, described by the null hypothesis, as unlikely as possible, without the homoscedasticity assumption on class distributions. We also show that classical linear discriminant analysis (LDA) and its well-known extension, heteroscedastic linear discriminant analysis (HLDA), are two special cases of our proposed method. The empirical class confusion information can be further incorporated into GLRDA for better recognition performance.
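To make the between-class scatter modification concrete, the following is a minimal sketch of LDA written with a pairwise (class-pair) form of the between-class scatter. With uniform weights this reduces to classical LDA; the `weights` argument marks the slot where, in the spirit of the thesis, an empirical per-class-pair confusion or accuracy term could be substituted for the plain Euclidean weighting. The function name and interface are illustrative assumptions, not the thesis implementation.

```python
import numpy as np
from scipy.linalg import eigh

def weighted_lda(X, y, p, weights=None):
    """Project d-dim rows of X to p dims; y holds integer class labels.

    weights, if given, maps a class pair (ci, cj) with ci < cj to a
    scalar w_ij re-weighting that pair's contribution to the
    between-class scatter (uniform w_ij = 1 recovers classical LDA).
    """
    classes = np.unique(y)
    n, d = X.shape
    mu = {c: X[y == c].mean(axis=0) for c in classes}
    pri = {c: (y == c).mean() for c in classes}  # class priors n_c / n

    # Within-class scatter: pooled scatter of class-centered samples.
    Sw = np.zeros((d, d))
    for c in classes:
        Z = X[y == c] - mu[c]
        Sw += Z.T @ Z / n

    # Pairwise between-class scatter: each class-pair mean difference
    # contributes an outer product, scaled by the priors and w_ij.
    Sb = np.zeros((d, d))
    for i, ci in enumerate(classes):
        for cj in classes[i + 1:]:
            w = 1.0 if weights is None else weights[(ci, cj)]
            diff = (mu[ci] - mu[cj])[:, None]
            Sb += w * pri[ci] * pri[cj] * (diff @ diff.T)

    # Generalized symmetric eigenproblem Sb v = lambda Sw v;
    # keep the p eigenvectors with the largest eigenvalues.
    vals, vecs = eigh(Sb, Sw)
    return vecs[:, ::-1][:, :p]  # d x p projection matrix
```

Because only the class-pair weights change, the criterion stays a generalized eigenvalue problem and retains LDA's closed-form solvability, which is the lightweight-solvability property the abstract refers to.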
Experimental results demonstrate that our approaches yield moderate improvements over LDA and other existing methods, such as HLDA, on a Chinese large vocabulary continuous speech recognition (LVCSR) task.