Speech Feature Compression Using SVD and Optimal SQ in Distributed Speech Recognition Systems

Master's thesis === National Taipei University of Technology === Graduate Institute of Electrical Engineering === 105 === In distributed speech recognition (DSR) systems, speech recognition is split into two major components: feature extraction is performed at the client end, and the extracted speech features are then transmitted through communication channels to the back-end...


Bibliographic Details
Main Authors:	Ting-En Wu, 吳廷恩
Other Authors:	簡福榮
Format:	Others
Online Access:	http://ndltd.ncl.edu.tw/handle/79sb94

id	ndltd-TW-105TIT05442058
record_format	oai_dc
spelling	ndltd-TW-105TIT05442058 2019-05-15T23:53:23Z 在分散式語音辨識系統中結合奇異值分解及最佳化純量量化做語音特徵參數壓縮 學位論文 ; thesis
collection	NDLTD
format	Others
sources	NDLTD
description	Master's thesis === National Taipei University of Technology === Graduate Institute of Electrical Engineering === 105 === In distributed speech recognition (DSR) systems, speech recognition is split into two major components: feature extraction is performed at the client end, and the extracted speech features are then transmitted through communication channels to the back-end server for recognition. This thesis investigates two speech features: mel-frequency cepstral coefficients (MFCC) and perceptual linear predictive cepstral coefficients (PLPCC). To achieve low bit-rate transmission, singular value decomposition (SVD) and optimal scalar quantization (OSQ) are employed for feature compression. In the proposed SVD_OSQ scheme, the speech features are grouped into non-overlapping blocks of nine or ten successive frames. Half-frame-rate (1/2) and one-third-frame-rate (1/3) data, restored at the back end by feature interpolation reconstruction (FIR), serve as the baseline systems for comparison. The experimental results show that the proposed SVD_OSQ scheme achieves better recognition performance than the FIR scheme when MFCC features are used. Furthermore, with multi-condition training models, PLPCC features yield better recognition performance than MFCC features.
author2	簡福榮
author_facet	簡福榮 Ting-En Wu 吳廷恩
author	Ting-En Wu 吳廷恩
author_sort	Ting-En Wu
title	Speech Feature Compression Using SVD and Optimal SQ in Distributed Speech Recognition Systems
url	http://ndltd.ncl.edu.tw/handle/79sb94
_version_	1719156332434554880
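
The SVD_OSQ idea summarized in the description can be illustrated with a minimal sketch. The abstract does not say which rank is retained, which quantities are quantized, or how bits are allocated, so everything below is an assumption made for illustration: the helper names (`lloyd_max`, `quantize`, `svd_truncate`, `compress_block`, `decompress_block`) are hypothetical, "optimal scalar quantization" is assumed to mean a Lloyd-Max quantizer, and random synthetic data stands in for real MFCC or PLPCC features.

```python
import numpy as np


def lloyd_max(samples, levels, iters=50):
    """Design a scalar codebook with Lloyd-Max iterations (assumed form of OSQ)."""
    samples = np.asarray(samples, dtype=float).ravel()
    # Start from evenly spaced quantiles so every cell is initially populated.
    codebook = np.quantile(samples, (np.arange(levels) + 0.5) / levels)
    for _ in range(iters):
        # Nearest-neighbour rule: decision thresholds are codeword midpoints.
        thresholds = (codebook[:-1] + codebook[1:]) / 2
        idx = np.searchsorted(thresholds, samples)
        # Centroid rule: move each codeword to the mean of its cell.
        for k in range(levels):
            cell = samples[idx == k]
            if cell.size:
                codebook[k] = cell.mean()
    return codebook


def quantize(values, codebook):
    """Map each value to the index of its nearest codeword."""
    thresholds = (codebook[:-1] + codebook[1:]) / 2
    return np.searchsorted(thresholds, values)


def svd_truncate(block, rank):
    """Factor a (frames x coeffs) block and keep the top `rank` components."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank]


def compress_block(block, rank, cb_left, cb_right):
    """Client side: return codeword indices for the two truncated factors."""
    left, right = svd_truncate(block, rank)
    return quantize(left, cb_left), quantize(right, cb_right)


def decompress_block(idx_left, idx_right, cb_left, cb_right):
    """Server side: look up codewords and multiply the factors back together."""
    return cb_left[idx_left] @ cb_right[idx_right]


rng = np.random.default_rng(0)
# Synthetic stand-in for cepstral features: 90 frames x 13 coefficients,
# accumulated along time so that neighbouring frames are correlated.
features = np.cumsum(rng.standard_normal((90, 13)), axis=0) * 0.1
blocks = features.reshape(10, 9, 13)  # non-overlapping 9-frame blocks

rank, levels = 3, 16  # illustrative choices, not taken from the thesis
factors = [svd_truncate(b, rank) for b in blocks]
cb_left = lloyd_max(np.concatenate([l.ravel() for l, _ in factors]), levels)
cb_right = lloyd_max(np.concatenate([r.ravel() for _, r in factors]), levels)

decoded = np.stack([
    decompress_block(*compress_block(b, rank, cb_left, cb_right), cb_left, cb_right)
    for b in blocks
])
mse = np.mean((decoded - blocks) ** 2)
print(f"reconstruction MSE: {mse:.4f}")
```

With these illustrative settings, each 9-frame block is sent as (9·3 + 3·13) = 66 four-bit indices, or 264 bits, instead of 117 floating-point values. The actual rank, codebook sizes, and resulting bit rates used in the thesis are not given in this record.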


Similar Items