Speaker Identification by Empirical Mode Decomposition
Master's === National Taiwan University === Graduate Institute of Electrical Engineering === 94 === Timbre is a main feature by which one verifies who is speaking. It is information hidden inside the acoustic properties of speech. Using differences in timbre features for speaker identification has been an open issue over the years. In the literature, most spe...
Main Authors: | Yi-Huan Lai, 賴亦桓 |
---|---|
Other Authors: | Yung-Yaw Chen |
Format: | Others |
Language: | en_US |
Published: | 2006 |
Online Access: | http://ndltd.ncl.edu.tw/handle/20500645615568765596 |
id |
ndltd-TW-094NTU05442148 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-094NTU05442148 2015-12-16T04:38:40Z http://ndltd.ncl.edu.tw/handle/20500645615568765596 Speaker Identification by Empirical Mode Decomposition 運用經驗模態分解法於語者辨識 Yi-Huan Lai 賴亦桓 Master's National Taiwan University Graduate Institute of Electrical Engineering 94 Timbre is a main feature by which one verifies who is speaking. It is information hidden inside the acoustic properties of speech. Using differences in timbre features for speaker identification has been an open issue over the years. In the literature, most speaker identification systems use LPC-derived Cepstral Coefficients (LPCC) or Mel Frequency Cepstral Coefficients (MFCC) as timbre models. The linearity and stationarity assumptions of these techniques limit identification performance. In this thesis, we apply an adaptive time-frequency distribution, the Hilbert-Huang transform. By empirically decomposing the original signal into simple oscillation modes, we obtain meaningful instantaneous frequencies. These instantaneous frequencies are taken as the input pattern to train a neural network classifier. Using these timbre features, the proposed system achieves good identification accuracy. Yung-Yaw Chen 陳永耀 2006 學位論文 ; thesis 60 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Electrical Engineering === 94 === Timbre is a main feature by which one verifies who is speaking. It is information hidden inside the acoustic properties of speech. Using differences in timbre features for speaker identification has been an open issue over the years. In the literature, most speaker identification systems use LPC-derived Cepstral Coefficients (LPCC) or Mel Frequency Cepstral Coefficients (MFCC) as timbre models. The linearity and stationarity assumptions of these techniques limit identification performance.
In this thesis, we apply an adaptive time-frequency distribution, the Hilbert-Huang transform. By empirically decomposing the original signal into simple oscillation modes, we obtain meaningful instantaneous frequencies. These instantaneous frequencies are taken as the input pattern to train a neural network classifier. Using these timbre features, the proposed system achieves good identification accuracy.
|
author2 |
Yung-Yaw Chen |
author_facet |
Yung-Yaw Chen Yi-Huan Lai 賴亦桓 |
author |
Yi-Huan Lai 賴亦桓 |
spellingShingle |
Yi-Huan Lai 賴亦桓 Speaker Identification by Empirical Mode Decomposition |
author_sort |
Yi-Huan Lai |
title |
Speaker Identification by Empirical Mode Decomposition |
title_short |
Speaker Identification by Empirical Mode Decomposition |
title_full |
Speaker Identification by Empirical Mode Decomposition |
title_fullStr |
Speaker Identification by Empirical Mode Decomposition |
title_full_unstemmed |
Speaker Identification by Empirical Mode Decomposition |
title_sort |
speaker identification by empirical mode decomposition |
publishDate |
2006 |
url |
http://ndltd.ncl.edu.tw/handle/20500645615568765596 |
work_keys_str_mv |
AT yihuanlai speakeridentificationbyempiricalmodedecomposition AT làiyìhuán speakeridentificationbyempiricalmodedecomposition AT yihuanlai yùnyòngjīngyànmótàifēnjiěfǎyúyǔzhěbiànshí AT làiyìhuán yùnyòngjīngyànmótàifēnjiěfǎyúyǔzhěbiànshí |
_version_ |
1718151164946546688 |