Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data

Master's === National Central University === Graduate Institute of Electrical Engineering === 99 === In this thesis we propose a singer classification approach that automatically identifies the singer of an unknown MP3 or AAC audio file. Unlike previous research on singer identification in the MP3 compressed domain, we use Mel-Frequency Cepstral Coefficients...


Bibliographic Details
Main Authors: Yu-siang Huang, 黃昱翔
Other Authors: Tsung-han Tsai
Format: Others
Language: en_US
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/58508874162434220784
id ndltd-TW-099NCU05442014
record_format oai_dc
spelling ndltd-TW-099NCU05442014 2015-10-30T04:10:15Z http://ndltd.ncl.edu.tw/handle/58508874162434220784 Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data 實現於音訊壓縮域之內涵式歌者分類法 Yu-siang Huang 黃昱翔 Master's National Central University Graduate Institute of Electrical Engineering 99 Tsung-han Tsai 蔡宗漢 2010 thesis 50 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Central University === Graduate Institute of Electrical Engineering === 99 === In this thesis we propose a singer classification approach that automatically identifies the singer of an unknown MP3 or AAC audio file. Unlike previous research on singer identification in the MP3 compressed domain, we use Mel-Frequency Cepstral Coefficients (MFCC) as the feature instead of modified discrete cosine transform (MDCT) coefficients. Although MFCC is widely used in music classification and speaker recognition, it cannot be obtained directly from compressed music data such as MP3 and AAC. We therefore introduce a modified method for calculating MFCC vectors in the MP3 and AAC compressed domains. A Gaussian mixture model (GMM) describes the distribution of the MFCC vectors in the MFCC feature space, and maximum likelihood classification (MLC) assigns each input MFCC vector to its nearest group in order to find the nearest singer. Finally, we implement our approach on two embedded platforms, Socle CDK and ITRI PAC Duo. In addition to the two embedded platforms, two operating system configurations, Linux and Android, are adopted. The experimental results verify the feasibility of the proposed approach.
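The GMM and maximum likelihood classification steps named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: the diagonal covariances, the two-dimensional toy vectors, the model parameters, the names `SINGER_MODELS`, `classify_frame`, and `classify_clip`, and the majority vote over frames are all assumptions made for the example. The compressed-domain MFCC extraction itself is not shown.

```python
import math
from collections import Counter


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) triples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def classify_frame(x, singer_models):
    """Assign one feature vector to the singer whose GMM scores it highest."""
    return max(
        singer_models,
        key=lambda name: gmm_log_likelihood(x, singer_models[name]),
    )


def classify_clip(frames, singer_models):
    """Majority vote over per-frame decisions (the vote is an assumption)."""
    votes = Counter(classify_frame(x, singer_models) for x in frames)
    return votes.most_common(1)[0][0]


# Hypothetical 2-dimensional models. Real MFCC vectors have more dimensions,
# and the parameters would come from training (for example EM), not shown here.
SINGER_MODELS = {
    "singer_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "singer_b": [(0.6, [5.0, 5.0], [1.0, 1.0]), (0.4, [6.0, 4.0], [1.0, 1.0])],
}

frames = [[0.2, -0.1], [0.9, 1.2], [5.1, 4.8]]
print(classify_clip(frames, SINGER_MODELS))
```

Each singer is represented by one mixture model, and a feature vector goes to the singer whose model gives it the highest likelihood. How the per-vector decisions are combined into a decision for a whole song is not stated in this record, so the vote above is only one plausible choice.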
author2 Tsung-han Tsai
author_facet Tsung-han Tsai
Yu-siang Huang
黃昱翔
author Yu-siang Huang
黃昱翔
author_sort Yu-siang Huang
title Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data
publishDate 2010
url http://ndltd.ncl.edu.tw/handle/58508874162434220784