Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data

Master's === National Central University === Graduate Institute of Electrical Engineering === 99 === In this thesis, we propose a singer classification approach that automatically identifies the singer of an unknown MP3 or AAC audio file. Unlike previous research on singer identification in the MP3 compressed domain, we use Mel-Frequency Cepstral Coefficients...


Bibliographic Details
Main Authors:	Yu-siang Huang, 黃昱翔
Other Authors:	Tsung-han Tsai
Format:	Others
Language:	en_US
Published:	2010
Online Access:	http://ndltd.ncl.edu.tw/handle/58508874162434220784

id	ndltd-TW-099NCU05442014
record_format	oai_dc
spelling	ndltd-TW-099NCU05442014 2015-10-30T04:10:15Z http://ndltd.ncl.edu.tw/handle/58508874162434220784 Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data 實現於音訊壓縮域之內涵式歌者分類法 Yu-siang Huang 黃昱翔 Tsung-han Tsai 蔡宗漢 2010 學位論文 ; thesis 50 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Central University === Graduate Institute of Electrical Engineering === 99 === In this thesis, we propose a singer classification approach that automatically identifies the singer of an unknown MP3 or AAC audio file. Unlike previous research on singer identification in the MP3 compressed domain, we use Mel-Frequency Cepstral Coefficients (MFCC) as the feature instead of modified discrete cosine transform (MDCT) coefficients. Although MFCC is widely used in music classification and speaker recognition, it cannot be obtained directly from compressed music data such as MP3 and AAC. We therefore introduce a modified method for calculating the MFCC vector in the MP3 and AAC compressed domains. A Gaussian mixture model (GMM) is used to describe the distribution of MFCC vectors in the MFCC feature space. To find the nearest singer, we use maximum likelihood classification (MLC) to assign each input MFCC vector to its nearest group. Finally, we implement our approach on two embedded platforms, Socle CDK and ITRI PAC Duo. In addition to the two embedded platforms, two operating system configurations, Linux and Android, are adopted. The experimental results verify the feasibility of the proposed approach.
author2	Tsung-han Tsai
author_facet	Tsung-han Tsai Yu-siang Huang 黃昱翔
author	Yu-siang Huang 黃昱翔
author_sort	Yu-siang Huang
title	Design and Implementation for Content-based Singer Classification on Compressed Domain Audio Data
publishDate	2010
url	http://ndltd.ncl.edu.tw/handle/58508874162434220784
_version_	1718116448217333760
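
The classification stage summarized in the description (one GMM per singer over MFCC vectors, followed by maximum likelihood selection) can be illustrated with a minimal sketch. This is not the thesis implementation: the two-dimensional vectors, the model parameters in `SINGER_MODELS`, and the function names are hypothetical, GMM training and the compressed-domain MFCC extraction are omitted, and summing log-likelihoods over all frames of a clip is an assumption about how per-vector decisions might be combined.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log p(x | GMM), using log-sum-exp over the mixture components."""
    terms = [
        math.log(weight) + log_gauss_diag(x, mean, var)
        for weight, mean, var in gmm
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def classify(frames, singer_models):
    """Pick the singer whose GMM gives the highest total log-likelihood."""
    scores = {
        name: sum(gmm_log_likelihood(x, gmm) for x in frames)
        for name, gmm in singer_models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical models: each component is (weight, mean, variance).
# Real MFCC vectors would have more dimensions (assumption: 2 here for brevity).
SINGER_MODELS = {
    "singer_a": [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [1.0, 1.0], [0.5, 0.5])],
    "singer_b": [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 3.0], [0.5, 0.5])],
}

# Made-up feature vectors standing in for the MFCC frames of one clip.
frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.3]]
best, scores = classify(frames, SINGER_MODELS)
print(best)
```

Working in the log domain avoids numerical underflow when many frames are combined, which may matter on embedded targets with limited floating-point range.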


Similar Items