Speech Enhancement Based on the Integration of Fully Convolutional Network, Temporal Lowpass Filtering and Spectrogram Masking
Master's === National Chi Nan University === Department of Electrical Engineering === 107 === In this study, we focus on the issue of noise distortion in speech signals, and develop two novel unsupervised speech enhancement algorithms including temporal lowpass filtering (TLF) and relative-to-maximum masking (RMM). Both of these two algorithms are condu...
Main Authors: | LIU,KUAN-YI, 劉冠毅 |
---|---|
Other Authors: | HUNG,JEIH-WEHI |
Format: | Others |
Language: | en_US |
Published: | 2019 |
Online Access: | http://ndltd.ncl.edu.tw/handle/f6vwk9 |
id |
ndltd-TW-107NCNU0442024 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-107NCNU0442024 2019-09-20T03:25:53Z http://ndltd.ncl.edu.tw/handle/f6vwk9 Speech Enhancement Based on the Integration of Fully Convolutional Network, Temporal Lowpass Filtering and Spectrogram Masking 整合全卷積神經網路、時序低通濾波與時頻遮罩法之語音強化 LIU,KUAN-YI 劉冠毅 Master's National Chi Nan University Department of Electrical Engineering 107 This study addresses noise distortion in speech signals and develops two novel unsupervised speech enhancement algorithms: temporal lowpass filtering (TLF) and relative-to-maximum masking (RMM). Both algorithms operate on the magnitude spectrogram of the speech signal. TLF uses a simple moving-average filter to emphasize the low modulation frequencies of speech, which are believed to carry richer linguistic information and exhibit higher signal-to-noise ratios (SNR). In RMM, by contrast, a mask is multiplied point-wise with the speech spectrogram, and the mask value is directly proportional to the magnitude of each time-frequency (T-F) point in the spectrogram. Preliminary experiments on a subset of the TIMIT database show that the two methods significantly improve the quality of noise-corrupted speech, and that both can be integrated with a well-known supervised speech enhancement method, the fully convolutional network, to achieve even better perceptual speech quality scores. HUNG,JEIH-WEHI 洪志偉 2019 學位論文 ; thesis 50 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chi Nan University === Department of Electrical Engineering === 107 === This study addresses noise distortion in speech signals and develops two novel unsupervised speech enhancement algorithms: temporal lowpass filtering (TLF) and relative-to-maximum masking (RMM). Both algorithms operate on the magnitude spectrogram of the speech signal. TLF uses a simple moving-average filter to emphasize the low modulation frequencies of speech, which are believed to carry richer linguistic information and exhibit higher signal-to-noise ratios (SNR). In RMM, by contrast, a mask is multiplied point-wise with the speech spectrogram, and the mask value is directly proportional to the magnitude of each time-frequency (T-F) point in the spectrogram. Preliminary experiments on a subset of the TIMIT database show that the two methods significantly improve the quality of noise-corrupted speech, and that both can be integrated with a well-known supervised speech enhancement method, the fully convolutional network, to achieve even better perceptual speech quality scores.
|
author2 |
HUNG,JEIH-WEHI |
author_facet |
HUNG,JEIH-WEHI LIU,KUAN-YI 劉冠毅 |
author |
LIU,KUAN-YI 劉冠毅 |
spellingShingle |
LIU,KUAN-YI 劉冠毅 Speech Enhancement Based on the Integration of Fully Convolutional Network, Temporal Lowpass Filtering and Spectrogram Masking |
author_sort |
LIU,KUAN-YI |
title |
Speech Enhancement Based on the Integration of Fully Convolutional Network, Temporal Lowpass Filtering and Spectrogram Masking |
title_short |
Speech Enhancement Based on the Integration of Fully Convolutional Network, Temporal Lowpass Filtering and Spectrogram Masking |
title_full |
Speech Enhancement Based on the Integration of Fully Convolutional Network, Temporal Lowpass Filtering and Spectrogram Masking |
title_fullStr |
Speech Enhancement Based on the Integration of Fully Convolutional Network, Temporal Lowpass Filtering and Spectrogram Masking |
title_full_unstemmed |
Speech Enhancement Based on the Integration of Fully Convolutional Network, Temporal Lowpass Filtering and Spectrogram Masking |
title_sort |
speech enhancement based on the integration of fully convolutional network, temporal lowpass filtering and spectrogram masking |
publishDate |
2019 |
url |
http://ndltd.ncl.edu.tw/handle/f6vwk9 |
work_keys_str_mv |
AT liukuanyi speechenhancementbasedontheintegrationoffullyconvolutionalnetworktemporallowpassfilteringandspectrogrammasking AT liúguānyì speechenhancementbasedontheintegrationoffullyconvolutionalnetworktemporallowpassfilteringandspectrogrammasking AT liukuanyi zhěnghéquánjuǎnjīshénjīngwǎnglùshíxùdītōnglǜbōyǔshípínzhēzhàofǎzhīyǔyīnqiánghuà AT liúguānyì zhěnghéquánjuǎnjīshénjīngwǎnglùshíxùdītōnglǜbōyǔshípínzhēzhàofǎzhīyǔyīnqiánghuà |
_version_ |
1719252755653066752 |
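
The two spectrogram-domain operations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function names, the window length, the edge-replication padding, and the choice of a single global maximum for normalizing the mask are all assumptions made for illustration, since the record only states that TLF is a moving-average filter over time and that the RMM mask value is proportional to each T-F magnitude.

```python
import numpy as np


def temporal_lowpass_filter(mag, window=3):
    """Moving-average smoothing along the time axis of a magnitude spectrogram.

    mag    : 2-D array of shape (freq_bins, frames), non-negative magnitudes.
    window : number of frames averaged (hypothetical default, not from the thesis).
    """
    kernel = np.ones(window) / window
    # Replicate edge frames so the output keeps the same number of frames
    # (an assumption; the record does not say how boundaries are handled).
    pad_left = (window - 1) // 2
    pad_right = window - 1 - pad_left
    padded = np.pad(mag, ((0, 0), (pad_left, pad_right)), mode="edge")
    # Filter each frequency bin's temporal trajectory independently.
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, padded
    )


def relative_to_maximum_mask(mag, eps=1e-12):
    """Point-wise masking with a mask proportional to each T-F magnitude.

    The mask is mag / max(mag), using one global maximum over the whole
    spectrogram (an assumption; a per-frame or per-bin maximum is equally
    consistent with the abstract).
    """
    mask = mag / (mag.max() + eps)
    return mask * mag


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for |STFT| of a noisy utterance: 257 bins x 100 frames.
    noisy_mag = np.abs(rng.normal(size=(257, 100)))
    smoothed = temporal_lowpass_filter(noisy_mag, window=5)
    masked = relative_to_maximum_mask(noisy_mag)
    print(smoothed.shape, masked.shape)
```

In a full pipeline, the processed magnitude would typically be recombined with the noisy phase and inverted back to a waveform; that step, and how either operation is combined with the fully convolutional network, is not described in this record and is left out here.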