Speech Feature Compression Using SVD and Optimal SQ in Distributed Speech Recognition Systems
Master's === National Taipei University of Technology === Graduate Institute of Electrical Engineering === 105 === In distributed speech recognition (DSR) systems, speech recognition is split into two major components: feature extraction is performed at the client end, and the extracted speech features are then transmitted over communication channels to the back-end...
Main Authors: | Ting-En Wu, 吳廷恩 |
---|---|
Other Authors: | 簡福榮 |
Format: | Others |
Online Access: | http://ndltd.ncl.edu.tw/handle/79sb94 |
id |
ndltd-TW-105TIT05442058 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-105TIT05442058 2019-05-15T23:53:23Z http://ndltd.ncl.edu.tw/handle/79sb94 Speech Feature Compression Using SVD and Optimal SQ in Distributed Speech Recognition Systems 在分散式語音辨識系統中結合奇異值分解及最佳化純量量化做語音特徵參數壓縮 Ting-En Wu 吳廷恩 Master's National Taipei University of Technology Graduate Institute of Electrical Engineering 105 In distributed speech recognition (DSR) systems, speech recognition is split into two major components: feature extraction is performed at the client end, and the extracted speech features are then transmitted over communication channels to the back-end server for recognition. This thesis investigates two speech features: mel-frequency cepstral coefficients (MFCC) and perceptual linear predictive cepstral coefficients (PLPCC). To achieve low bit-rate transmission, we employ singular value decomposition (SVD) and optimal scalar quantization (OSQ) for feature compression. In the proposed SVD_OSQ scheme, the speech features are grouped into non-overlapping blocks of nine or ten successive frames. Half-frame-rate (1/2) and one-third-frame-rate (1/3) data, reconstructed at the back end by feature interpolation reconstruction (FIR), serve as the baseline systems for comparison. The experimental results show that the proposed SVD_OSQ scheme achieves better recognition performance than the FIR scheme when MFCC features are used. Furthermore, with a multi-condition training model, PLPCC features yield better recognition performance than MFCC features. 簡福榮 thesis 0 |
collection |
NDLTD |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taipei University of Technology === Graduate Institute of Electrical Engineering === 105 === In distributed speech recognition (DSR) systems, speech recognition is split into two major components: feature extraction is performed at the client end, and the extracted speech features are then transmitted over communication channels to the back-end server for recognition. This thesis investigates two speech features: mel-frequency cepstral coefficients (MFCC) and perceptual linear predictive cepstral coefficients (PLPCC). To achieve low bit-rate transmission, we employ singular value decomposition (SVD) and optimal scalar quantization (OSQ) for feature compression.
In the proposed SVD_OSQ scheme, the speech features are grouped into non-overlapping blocks of nine or ten successive frames. Half-frame-rate (1/2) and one-third-frame-rate (1/3) data, reconstructed at the back end by feature interpolation reconstruction (FIR), serve as the baseline systems for comparison. The experimental results show that the proposed SVD_OSQ scheme achieves better recognition performance than the FIR scheme when MFCC features are used. Furthermore, with a multi-condition training model, PLPCC features yield better recognition performance than MFCC features.
|
author2 |
簡福榮 |
author_facet |
簡福榮 Ting-En Wu 吳廷恩 |
author |
Ting-En Wu 吳廷恩 |
author_sort |
Ting-En Wu |
title |
Speech Feature Compression Using SVD and Optimal SQ in Distributed Speech Recognition Systems |
url |
http://ndltd.ncl.edu.tw/handle/79sb94 |
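
The abstract describes grouping features into non-overlapping blocks, applying SVD, and scalar-quantizing the result. The record does not say which SVD factors are quantized, what rank is kept, or how the quantizer is designed, so the following is only a minimal sketch under stated assumptions: synthetic 13-dimensional features stand in for MFCC vectors, a rank-3 truncation is used, both scaled factors are quantized, and a one-dimensional Lloyd iteration stands in for "optimal" scalar quantization. All function names and settings are hypothetical and are not taken from the thesis.

```python
import numpy as np


def factorize(block, rank):
    """Rank-limited SVD of one (frames x coefficients) block.

    Returns a = U_r * S_r and b = V_r^T so that a @ b approximates the block.
    """
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank]


def train_codebook(samples, levels, iters=30):
    """One-dimensional Lloyd iteration (assumed stand-in for OSQ design)."""
    samples = np.asarray(samples, dtype=float).ravel()
    # Start from evenly spaced quantiles of the training values.
    codebook = np.quantile(samples, (np.arange(levels) + 0.5) / levels)
    for _ in range(iters):
        nearest = np.abs(samples[:, None] - codebook[None, :]).argmin(axis=1)
        for k in range(levels):
            members = samples[nearest == k]
            if members.size:  # leave empty cells unchanged
                codebook[k] = members.mean()
    return np.sort(codebook)


def quantize(values, codebook):
    """Map each value to the index of its nearest codeword."""
    return np.abs(np.asarray(values)[..., None] - codebook).argmin(axis=-1)


def encode_block(block, rank, cb_a, cb_b):
    """Client side: factorize a block and quantize both factors."""
    a, b = factorize(block, rank)
    return quantize(a, cb_a), quantize(b, cb_b)


def decode_block(idx_a, idx_b, cb_a, cb_b):
    """Server side: look up codewords and multiply the factors back."""
    return cb_a[idx_a] @ cb_b[idx_b]


# Synthetic stand-in for 13-dimensional cepstral features (not real speech).
rng = np.random.default_rng(0)
features = rng.standard_normal((90, 13)).cumsum(axis=0)
blocks = features.reshape(-1, 9, 13)  # ten non-overlapping blocks of nine frames

rank, levels = 3, 16  # hypothetical settings: rank 3, 4 bits per value
factors = [factorize(blk, rank) for blk in blocks]
cb_a = train_codebook(np.concatenate([a.ravel() for a, _ in factors]), levels)
cb_b = train_codebook(np.concatenate([b.ravel() for _, b in factors]), levels)

idx_a, idx_b = encode_block(blocks[0], rank, cb_a, cb_b)
restored = decode_block(idx_a, idx_b, cb_a, cb_b)
bits_per_block = (idx_a.size + idx_b.size) * int(np.log2(levels))
rel_error = np.linalg.norm(blocks[0] - restored) / np.linalg.norm(blocks[0])
print("bits per block:", bits_per_block)
print("relative reconstruction error:", rel_error)
```

In this sketch the codebooks are trained on the same data they encode, which is convenient for a demonstration but would overstate quality in practice; a real system would presumably train them on held-out data and evaluate recognition accuracy rather than reconstruction error.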