Model-based Non-Intrusive Objective Speech Quality Measurement Using Perceptual Parameters

Master's === National Chiao Tung University === Department of Communication Engineering === 96 === Assessing speech quality is an important issue in modern communication systems. Early subjective speech quality measurements required considerable human effort and expense, creating the need for objective speech quality measurement. In addition, origi...

Full description

Bibliographic Details
Main Authors: Shang-Ju Yu, 余尚儒
Other Authors: Tai-Shih Chi
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/70275781211050224354
id ndltd-TW-096NCTU5435111
record_format oai_dc
spelling ndltd-TW-096NCTU54351112015-10-13T12:18:06Z http://ndltd.ncl.edu.tw/handle/70275781211050224354 Model-based Non-Intrusive Objective Speech Quality Measurement Using Perceptual Parameters 感知訊號非侵入式客觀語音品質測量 Shang-Ju Yu 余尚儒 Master's National Chiao Tung University Department of Communication Engineering 96 Assessing speech quality is an important issue in modern communication systems. Early subjective speech quality measurements required considerable human effort and expense, creating the need for objective speech quality measurement. In addition, the original speech signal is not always available when measuring speech quality in practice. Many non-intrusive methods, which do not require the original signal to judge speech quality, have been developed to meet this requirement. Such non-intrusive methods require little human effort and can be used for efficient real-time quality testing. The main theme of this work is to extract perceptual parameters from an auditory model, which mimics the signal processing principles of the human auditory pathway, and to build an objective speech quality measurement that needs no reference signal. First, we propose a voice activity detection (VAD) algorithm that uses the perceptual parameters from the auditory model. This VAD algorithm classifies speech signals into three basic categories: voiced, unvoiced, and inactive. Next, we adopt the auditory cepstral coefficients (ACC) as the non-intrusive quality-judging parameters. A Gaussian mixture model (GMM) is used to build a statistical template of clean speech, which stands in for the absent reference signal. When measuring the quality of speech from different channels and codecs, the VAD is first used to classify the distorted speech into the three categories. Then, ACC parameters are extracted and compared with the statistical templates of clean speech. The log-probability density function (log-pdf) represents the distance between the clean and degraded speech signals. Finally, a regression function maps the overall distances from the three categories to subjective quality scores. The correlation between our objective measures and the subjective measures is examined to validate our approach. Tai-Shih Chi 冀泰石 2008 Thesis 61 zh-TW
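The distance computation described in the abstract (a GMM template of clean speech, a log-pdf "distance" for degraded speech, and a regression onto quality scores) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the diagonal-covariance GMM, the helper names, and the linear form of the regression are all assumptions for the sketch; the actual work uses ACC features from an auditory model and fits the regression to subjective scores.

```python
import numpy as np

def gmm_logpdf(x, weights, means, variances):
    """Log-probability of feature vectors x under a diagonal-covariance GMM.
    x: (n, d); weights: (k,); means, variances: (k, d)."""
    diff = x[:, None, :] - means[None, :, :]                  # (n, k, d)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)  # Mahalanobis term
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    # weighted log-sum-exp over the k mixture components
    log_wc = log_comp + np.log(weights)
    m = np.max(log_wc, axis=1, keepdims=True)
    return m.squeeze(1) + np.log(np.sum(np.exp(log_wc - m), axis=1))

def distance_to_template(features, weights, means, variances):
    """Average negative log-pdf: how far degraded features sit
    from the clean-speech statistical template."""
    return -np.mean(gmm_logpdf(features, weights, means, variances))

def map_to_score(distances, coeffs, intercept):
    """Toy linear regression mapping per-category distances
    (e.g. voiced/unvoiced/inactive) to a subjective quality score."""
    return float(coeffs @ distances + intercept)
```

In practice one GMM template would be trained per VAD category on clean-speech ACC features, and the regression would be fitted against listening-test scores.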
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chiao Tung University === Department of Communication Engineering === 96 === Assessing speech quality is an important issue in modern communication systems. Early subjective speech quality measurements required considerable human effort and expense, creating the need for objective speech quality measurement. In addition, the original speech signal is not always available when measuring speech quality in practice. Many non-intrusive methods, which do not require the original signal to judge speech quality, have been developed to meet this requirement. Such non-intrusive methods require little human effort and can be used for efficient real-time quality testing. The main theme of this work is to extract perceptual parameters from an auditory model, which mimics the signal processing principles of the human auditory pathway, and to build an objective speech quality measurement that needs no reference signal. First, we propose a voice activity detection (VAD) algorithm that uses the perceptual parameters from the auditory model. This VAD algorithm classifies speech signals into three basic categories: voiced, unvoiced, and inactive. Next, we adopt the auditory cepstral coefficients (ACC) as the non-intrusive quality-judging parameters. A Gaussian mixture model (GMM) is used to build a statistical template of clean speech, which stands in for the absent reference signal. When measuring the quality of speech from different channels and codecs, the VAD is first used to classify the distorted speech into the three categories. Then, ACC parameters are extracted and compared with the statistical templates of clean speech. The log-probability density function (log-pdf) represents the distance between the clean and degraded speech signals. Finally, a regression function maps the overall distances from the three categories to subjective quality scores. The correlation between our objective measures and the subjective measures is examined to validate our approach.
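The three-way voiced/unvoiced/inactive classification that the abstract assigns to the VAD stage can be illustrated with a toy frame classifier. This is only a stand-in: the thesis derives its VAD from auditory-model perceptual parameters, whereas the sketch below uses classic frame energy and zero-crossing rate, and the frame length and thresholds are assumed values.

```python
import numpy as np

def classify_frames(signal, frame_len=256, energy_thresh=0.01, zcr_thresh=0.25):
    """Label each frame 'voiced', 'unvoiced', or 'inactive'.
    Toy stand-in: low energy -> inactive; high zero-crossing rate
    (noise-like) -> unvoiced; otherwise (periodic) -> voiced."""
    labels = []
    for i in range(len(signal) // frame_len):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        energy = np.mean(frame ** 2)
        # fraction of adjacent sample pairs whose sign flips
        zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
        if energy < energy_thresh:
            labels.append("inactive")
        elif zcr > zcr_thresh:
            labels.append("unvoiced")
        else:
            labels.append("voiced")
    return labels
```

In the measurement pipeline, each category's frames would then be scored against that category's clean-speech template before the regression stage.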
author2 Tai-Shih Chi
author_facet Tai-Shih Chi
Shang-Ju Yu
余尚儒
author Shang-Ju Yu
余尚儒
spellingShingle Shang-Ju Yu
余尚儒
Model-based Non-Intrusive Objective Speech Quality Measurement Using Perceptual Parameters
author_sort Shang-Ju Yu
title Model-based Non-Intrusive Objective Speech Quality Measurement Using Perceptual Parameters
title_short Model-based Non-Intrusive Objective Speech Quality Measurement Using Perceptual Parameters
title_full Model-based Non-Intrusive Objective Speech Quality Measurement Using Perceptual Parameters
title_fullStr Model-based Non-Intrusive Objective Speech Quality Measurement Using Perceptual Parameters
title_full_unstemmed Model-based Non-Intrusive Objective Speech Quality Measurement Using Perceptual Parameters
title_sort model-based non-intrusive objective speech quality measurement using perceptual parameters
publishDate 2008
url http://ndltd.ncl.edu.tw/handle/70275781211050224354
work_keys_str_mv AT shangjuyu modelbasednonintrusiveobjectivespeechqualitymeasurementusingperceptualparameters
AT yúshàngrú modelbasednonintrusiveobjectivespeechqualitymeasurementusingperceptualparameters
AT shangjuyu gǎnzhīxùnhàofēiqīnrùshìkèguānyǔyīnpǐnzhìcèliàng
AT yúshàngrú gǎnzhīxùnhàofēiqīnrùshìkèguānyǔyīnpǐnzhìcèliàng
_version_ 1716857851070119936