A Study on Identifying the Most Effective Speech Features for Speech Emotion Recognition

Bibliographic Details
Main Authors: Ching-yi Lin, 林靜宜
Other Authors: Tsang-long Pao
Format: Others
Language: en_US
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/77453967950782385836
Description
Summary: Master's thesis === Tatung University === Department of Computer Science and Engineering === 97 === Humans express emotion in many ways, for instance through speech, attitude, or writing. Human speech conveys not only syntax but also the speaker's feeling at the moment of speaking. Emotions therefore play an important role in speech communication, and recognizing human emotion in the speech signal has attracted considerable attention. In emotion recognition, the classifier and the features used in the system both influence the recognition rate. The purpose of this study is to find the most effective feature set for a specific classifier used in speech emotion recognition. There are three main focuses: the classifiers, the emotion corpus combinations, and the features to be analyzed. In this thesis, 78 speech features are extracted, including Formant, Shimmer, Jitter, Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), Mel-Frequency Cepstral Coefficients (MFCC), first derivative of MFCC (D-MFCC), second derivative of MFCC (DD-MFCC), Log Frequency Power Coefficients (LFPC), Perceptual Linear Prediction (PLP), RelAtive SpecTrAl PLP (RastaPLP), Log-Energy, and Zero Crossing Rate (ZCR), as well as their mean, standard deviation, minimum, maximum, and range. The effect of the features is analyzed using sequential forward selection (SFS). Experimental results indicate that, for five emotions, the most effective feature set with the WD-KNN classifier achieves the highest recognition accuracy, 90%, using 13 features. The results also show that the most effective feature among all extracted features for emotion recognition is Linear Predictive Coefficients (LPC): it appears in the most effective feature set of every classifier tested.
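
The sequential forward selection (SFS) procedure named in the summary can be illustrated with a minimal sketch. This is not the thesis's implementation: the classifier below is a plain majority-vote k-NN standing in for WD-KNN, the scoring uses leave-one-out accuracy, and the toy data, function names, and parameters (`make_toy_data`, `loo_accuracy`, `k`, `max_features`) are all assumptions made for illustration. The idea is simply to start from an empty set and greedily add whichever feature most improves accuracy, stopping when no candidate helps.

```python
import random


def knn_predict(train_X, train_y, x, feats, k=3):
    """Majority-vote k-NN restricted to the feature indices in `feats`.

    A plain k-NN is used here as a stand-in; it is NOT the WD-KNN
    classifier mentioned in the summary.
    """
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in feats), label)
        for row, label in zip(train_X, train_y)
    )
    votes = {}
    for _, label in dists[:k]:
        votes[label] = votes.get(label, 0) + 1
    # Iterate labels in sorted order so ties resolve deterministically.
    return max(sorted(votes), key=votes.get)


def loo_accuracy(X, y, feats, k=3):
    """Leave-one-out accuracy of k-NN on the chosen feature subset."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], feats, k) == y[i]:
            correct += 1
    return correct / len(X)


def sequential_forward_selection(X, y, max_features, k=3):
    """Greedy SFS: repeatedly add the single feature that helps most."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [(loo_accuracy(X, y, selected + [f], k), f) for f in remaining]
        # Highest score wins; on ties prefer the lower feature index.
        score, feat = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break  # no candidate improves accuracy, so stop early
        selected.append(feat)
        remaining.remove(feat)
        best_score = score
    return selected, best_score


def make_toy_data(n_per_class=20, seed=0):
    """Hypothetical data: feature 0 separates the classes, 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 0.5), rng.gauss(0, 1), rng.gauss(0, 1)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    selected, score = sequential_forward_selection(X, y, max_features=2)
    print("selected feature indices:", selected)
    print("leave-one-out accuracy:", score)
```

In a setting like the one described, each row of `X` would hold the extracted speech features for one utterance and `y` the emotion labels; the wrapper-style search is classifier-specific, which is consistent with the summary's aim of finding the best feature set for a given classifier.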