Speech Feature Compression Using SVD and Optimal SQ in Distributed Speech Recognition Systems

Master's === National Taipei University of Technology (國立臺北科技大學) === Graduate Institute of Electrical Engineering === 105 === In distributed speech recognition (DSR) systems, speech recognition is split into two major components: feature extraction is performed at the client end, and the extracted speech features are then transmitted over a communication channel to the back-end...

Bibliographic Details
Main Authors: Ting-En Wu, 吳廷恩
Other Authors: 簡福榮
Format: Others
Online Access: http://ndltd.ncl.edu.tw/handle/79sb94
id ndltd-TW-105TIT05442058
record_format oai_dc
spelling ndltd-TW-105TIT05442058 2019-05-15T23:53:23Z http://ndltd.ncl.edu.tw/handle/79sb94 Speech Feature Compression Using SVD and Optimal SQ in Distributed Speech Recognition Systems 在分散式語音辨識系統中結合奇異值分解及最佳化純量量化做語音特徵參數壓縮 Ting-En Wu 吳廷恩 碩士 國立臺北科技大學 電機工程研究所 105 簡福榮 學位論文 ; thesis 0
collection NDLTD
format Others
sources NDLTD
description Master's === National Taipei University of Technology (國立臺北科技大學) === Graduate Institute of Electrical Engineering === 105 === In distributed speech recognition (DSR) systems, speech recognition is split into two major components: feature extraction is performed at the client end, and the extracted speech features are then transmitted over a communication channel to the back-end server for recognition. This thesis investigates two speech features: mel-frequency cepstral coefficients (MFCC) and perceptual linear predictive cepstral coefficients (PLPCC). To achieve low bit-rate transmission, we employ singular value decomposition (SVD) and optimal scalar quantization (OSQ) for feature compression. In the proposed SVD_OSQ scheme, the speech features are grouped into non-overlapping blocks of nine or ten successive frames. Half-frame-rate (1/2) and one-third-frame-rate (1/3) data, restored at the back end by feature interpolation reconstruction (FIR), serve as the baseline systems for comparison. The experimental results show that the proposed SVD_OSQ scheme achieves better recognition performance than the FIR scheme when MFCC features are used. Furthermore, with a multi-condition training model, PLPCC features yield better recognition performance than MFCC features.
author2 簡福榮
author_facet 簡福榮
Ting-En Wu
吳廷恩
author Ting-En Wu
吳廷恩
spellingShingle Ting-En Wu
吳廷恩
Speech Feature Compression Using SVD and Optimal SQ in Distributed Speech Recognition Systems
author_sort Ting-En Wu
title Speech Feature Compression Using SVD and Optimal SQ in Distributed Speech Recognition Systems
url http://ndltd.ncl.edu.tw/handle/79sb94
_version_ 1719156332434554880
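
The abstract in this record describes block-wise SVD followed by optimal scalar quantization but gives no implementation details. The following is a minimal Python sketch of that general idea, not the thesis's actual method. It assumes NumPy, takes "optimal scalar quantization" to mean a Lloyd-Max quantizer trained on the data, and quantizes the truncated SVD factors directly. The function names, the 13 coefficients per frame, the rank of 3, and the 16 quantization levels are all hypothetical choices made for illustration; only the non-overlapping blocks of nine frames come from the abstract.

```python
import numpy as np


def truncated_svd(block, rank):
    """Return the leading `rank` SVD factors of a (frames x coeffs) block."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    return u[:, :rank], s[:rank], vt[:rank]


def reconstruct(u, s, vt):
    """Rebuild a block from its (possibly truncated) SVD factors."""
    return (u * s) @ vt


def lloyd_max(samples, levels, iters=50):
    """Train a scalar codebook with Lloyd's algorithm (1-D k-means)."""
    samples = np.asarray(samples, dtype=float).ravel()
    # Start from evenly spaced quantiles of the training data.
    codebook = np.quantile(samples, (np.arange(levels) + 0.5) / levels)
    for _ in range(iters):
        # Assign every sample to its nearest codeword.
        idx = np.abs(samples[:, None] - codebook[None, :]).argmin(axis=1)
        # Move each codeword to the mean of the samples assigned to it.
        for k in range(levels):
            members = samples[idx == k]
            if members.size:
                codebook[k] = members.mean()
    return np.sort(codebook)


def quantize(values, codebook):
    """Map each value to the index of its nearest codeword."""
    values = np.asarray(values, dtype=float)
    return np.abs(values[..., None] - codebook).argmin(axis=-1)


def demo(seed=0):
    """Compress random stand-in features and return the mean squared error."""
    rng = np.random.default_rng(seed)
    feats = rng.standard_normal((90, 13))   # stand-in for real MFCC frames
    blocks = feats.reshape(10, 9, 13)       # non-overlapping nine-frame blocks
    factors = [truncated_svd(b, rank=3) for b in blocks]
    left = np.stack([u * s for u, s, _ in factors])    # shape (10, 9, 3)
    right = np.stack([vt for _, _, vt in factors])     # shape (10, 3, 13)
    # One codebook per factor type (an assumption of this sketch).
    cb_left = lloyd_max(left, levels=16)
    cb_right = lloyd_max(right, levels=16)
    # Only the codeword indices would need to be transmitted.
    left_q = cb_left[quantize(left, cb_left)]
    right_q = cb_right[quantize(right, cb_right)]
    decoded = (left_q @ right_q).reshape(90, 13)
    return float(np.mean((decoded - feats) ** 2))
```

Calling `demo()` returns the reconstruction error on the synthetic data. In a real DSR front end the random matrix would be replaced by extracted MFCC or PLPCC frames, and the codebooks would be trained offline and shared between client and server.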