Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data
Master's thesis === National Central University === Graduate Institute of Electrical Engineering === 99 === In this thesis we propose a singer classification approach that automatically identifies the singer of unknown MP3 or AAC audio data. Unlike previous research on singer identification in the MP3 compressed domain, we use Mel-Frequency Cepstral Coefficients...
Main Authors: | Yu-siang Huang, 黃昱翔 |
---|---|
Other Authors: | Tsung-han Tsai |
Format: | Others |
Language: | en_US |
Published: | 2010 |
Online Access: | http://ndltd.ncl.edu.tw/handle/58508874162434220784 |
id |
ndltd-TW-099NCU05442014 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-099NCU05442014 2015-10-30T04:10:15Z http://ndltd.ncl.edu.tw/handle/58508874162434220784 Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data 實現於音訊壓縮域之內涵式歌者分類法 Yu-siang Huang 黃昱翔 Master's thesis National Central University Graduate Institute of Electrical Engineering 99 In this thesis we propose a singer classification approach that automatically identifies the singer of unknown MP3 or AAC audio data. Unlike previous research on singer identification in the MP3 compressed domain, we use Mel-Frequency Cepstral Coefficients (MFCC) as the feature instead of modified discrete cosine transform (MDCT) coefficients. Although MFCC is often used in music classification and speaker recognition, it cannot be obtained directly from compressed music data such as MP3 and AAC. We therefore introduce a modified method for calculating the MFCC vector in the MP3 and AAC compressed domains. A Gaussian mixture model (GMM) describes the distribution of MFCC vectors in the MFCC feature space. To find the nearest singer, we use maximum likelihood classification (MLC) to assign each input MFCC vector to its nearest group. Finally, we implement our approach on two embedded platforms, Socle CDK and ITRI PAC Duo. Besides the two embedded platforms, two operating system configurations are adopted: Linux and Android. The experimental results verify the feasibility of the proposed approach. Tsung-han Tsai 蔡宗漢 2010 學位論文 ; thesis 50 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Central University === Graduate Institute of Electrical Engineering === 99 === In this thesis we propose a singer classification approach that automatically identifies the singer of unknown MP3 or AAC audio data. Unlike previous research on singer identification in the MP3 compressed domain, we use Mel-Frequency Cepstral Coefficients (MFCC) as the feature instead of modified discrete cosine transform (MDCT) coefficients. Although MFCC is often used in music classification and speaker recognition, it cannot be obtained directly from compressed music data such as MP3 and AAC. We therefore introduce a modified method for calculating the MFCC vector in the MP3 and AAC compressed domains. A Gaussian mixture model (GMM) describes the distribution of MFCC vectors in the MFCC feature space. To find the nearest singer, we use maximum likelihood classification (MLC) to assign each input MFCC vector to its nearest group. Finally, we implement our approach on two embedded platforms, Socle CDK and ITRI PAC Duo. Besides the two embedded platforms, two operating system configurations are adopted: Linux and Android. The experimental results verify the feasibility of the proposed approach. |
author2 |
Tsung-han Tsai |
author_facet |
Tsung-han Tsai Yu-siang Huang 黃昱翔 |
author |
Yu-siang Huang 黃昱翔 |
spellingShingle |
Yu-siang Huang 黃昱翔 Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data |
author_sort |
Yu-siang Huang |
title |
Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data |
title_short |
Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data |
title_full |
Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data |
title_fullStr |
Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data |
title_full_unstemmed |
Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data |
title_sort |
design and implementation for content-based singer classification on compressed domain audio data |
publishDate |
2010 |
url |
http://ndltd.ncl.edu.tw/handle/58508874162434220784 |
work_keys_str_mv |
AT yusianghuang designandimplementationforcontentbasedsingerclassificationoncompresseddomainaudiodata AT huángyùxiáng designandimplementationforcontentbasedsingerclassificationoncompresseddomainaudiodata AT yusianghuang shíxiànyúyīnxùnyāsuōyùzhīnèihánshìgēzhěfēnlèifǎ AT huángyùxiáng shíxiànyúyīnxùnyāsuōyùzhīnèihánshìgēzhěfēnlèifǎ |
_version_ |
1718116448217333760 |
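
Illustrative note on the abstract: the classification stage it describes (one GMM per singer over MFCC vectors, followed by maximum likelihood classification) might look roughly like the Python sketch below. This is not the thesis implementation. The diagonal covariances, the fixed number of EM iterations, the summing of log-likelihoods over all frames of a clip, and every name used (`fit_gmm`, `classify`, `fake_frames`, `singer_a`, `singer_b`) are assumptions made for illustration, and the synthetic vectors merely stand in for MFCC vectors that the thesis extracts from the MP3/AAC compressed domain.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def component_logs(x, weights, means, variances):
    """Per-component log(weight * density) for one feature vector."""
    return [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]


def fit_gmm(data, n_components=2, n_iter=20, seed=0, var_floor=1e-3):
    """Fit a diagonal GMM with plain EM (fixed iteration count, assumed)."""
    rng = random.Random(seed)
    dim = len(data[0])
    means = [list(x) for x in rng.sample(data, n_components)]
    variances = [[1.0] * dim for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each vector.
        resp = []
        for x in data:
            logs = component_logs(x, weights, means, variances)
            norm = logsumexp(logs)
            resp.append([math.exp(value - norm) for value in logs])
        # M-step: re-estimate weights, means, and floored variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-12
            weights[k] = nk / len(data)
            means[k] = [sum(r[k] * x[d] for r, x in zip(resp, data)) / nk
                        for d in range(dim)]
            variances[k] = [
                max(var_floor,
                    sum(r[k] * (x[d] - means[k][d]) ** 2
                        for r, x in zip(resp, data)) / nk)
                for d in range(dim)
            ]
    return {"weights": weights, "means": means, "vars": variances}


def gmm_log_likelihood(x, gmm):
    """Log p(x | singer model) for one feature vector."""
    return logsumexp(
        component_logs(x, gmm["weights"], gmm["means"], gmm["vars"]))


def classify(frames, models):
    """Pick the singer whose GMM gives the highest summed log-likelihood."""
    scores = {name: sum(gmm_log_likelihood(x, gmm) for x in frames)
              for name, gmm in models.items()}
    return max(scores, key=scores.get), scores


def fake_frames(center, n, rng):
    """Synthetic stand-in for MFCC vectors (hypothetical test data)."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(1)
    train = {
        "singer_a": fake_frames([0.0, 0.0, 0.0], 200, rng),
        "singer_b": fake_frames([5.0, 5.0, 5.0], 200, rng),
    }
    models = {name: fit_gmm(frames) for name, frames in train.items()}
    unknown = fake_frames([5.0, 5.0, 5.0], 50, rng)
    best, _ = classify(unknown, models)
    print(best)
```

Summing log-likelihoods over a clip is one common way to turn per-vector scores into a single decision; a per-vector assignment followed by a majority vote would be an equally plausible reading of the abstract.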