Advanced pre-and-post processing techniques for speech coding

Advances in digital technology in the last decade have motivated the development of very efficient and high quality speech compression algorithms. While in the early low bit rate coding systems, the main target was the production of intelligible speech at low bit rates, expansion of new applications...

Bibliographic Details
Main Author: Farsi, Hassan
Published: University of Surrey 2003
Subjects:
006
Online Access: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.274156
id ndltd-bl.uk-oai-ethos.bl.uk-274156
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-274156 2018-09-11T03:17:53Z Advanced pre-and-post processing techniques for speech coding Farsi, Hassan 2003 006 Pattern recognition & image processing University of Surrey https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.274156 http://epubs.surrey.ac.uk/844491/ Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 006
Pattern recognition & image processing
description Advances in digital technology in the last decade have motivated the development of highly efficient, high-quality speech compression algorithms. While the main target of early low bit rate coding systems was the production of intelligible speech, the expansion of new applications such as mobile satellite systems increased the demand for reduced transmission bandwidth and higher speech quality. This resulted in the development of efficient parametric models of the speech production system. These models were the basis of powerful speech compression algorithms such as CELP, MBE, MELP and WI. The performance of a speech coder depends not only on the speech production model employed but also on the accurate estimation of the speech parameters. Periodicity, also known as pitch, is one of the speech parameters that greatly affect the synthesised speech quality. Thus, pitch determination has attracted much research in the area of low bit rate coding. These studies assume that for a short segment of speech, called a frame, the pitch is fixed or smoothly evolving. Pitch estimation algorithms generally fail to determine irregular variations, which can occur in onset and offset speech segments. To overcome this problem, a novel preprocessing method is proposed, which detects irregular pitch variations and modifies the speech signal so as to improve the accuracy of the pitch estimation. This method results in more regular speech while maintaining perceptual speech quality. The perceptual quality of the synthesised speech may also be improved using postfiltering techniques. Conventional postfiltering methods generally enhance the whole speech spectrum. This may broaden the first formant, which increases the quantisation noise for this formant. A new postfiltering technique, based on factorising the linear prediction synthesis filter, is proposed. This provides more control over the formant bandwidth and the attenuation of the spectral valleys of speech. Key words: Pitch smoothing, speech pre-processor, postfiltering.
author Farsi, Hassan
title Advanced pre-and-post processing techniques for speech coding
publisher University of Surrey
publishDate 2003
url https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.274156