A Maximum Likelihood Estimation of Vocal-Tract-Related Filter Characteristics for Single Channel Speech Separation

We present a new technique for separating two speech signals from a single recording. The proposed method bridges the gap between underdetermined blind source separation techniques and those techniques that model the human auditory system, that is, ...

Bibliographic Details
Main Authors:	Dansereau Richard M, Radfar Mohammad H, Sayadiyan Abolghasem
Format:	Article
Language:	English
Published:	SpringerOpen 2007-01-01
Series:	EURASIP Journal on Audio, Speech, and Music Processing
Online Access:	http://asmp.eurasipjournals.com/content/2007/084186

id	doaj-551d202f879647b6bbacb99a2ad0a47b
record_format	Article
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Dansereau Richard M, Radfar Mohammad H, Sayadiyan Abolghasem
title	A Maximum Likelihood Estimation of Vocal-Tract-Related Filter Characteristics for Single Channel Speech Separation
publisher	SpringerOpen
series	EURASIP Journal on Audio, Speech, and Music Processing
issn	1687-4714, 1687-4722
publishDate	2007-01-01
description	We present a new technique for separating two speech signals from a single recording. The proposed method bridges the gap between underdetermined blind source separation techniques and those techniques that model the human auditory system, that is, computational auditory scene analysis (CASA). For this purpose, we decompose the speech signal into the excitation signal and the vocal-tract-related filter and then estimate the components from the mixed speech using a hybrid model. We first express the probability density function (PDF) of the mixed speech's log spectral vectors in terms of the PDFs of the underlying speech signal's vocal-tract-related filters. Then, the mean vectors of PDFs of the vocal-tract-related filters are obtained using a maximum likelihood estimator given the mixed signal. Finally, the estimated vocal-tract-related filters along with the extracted fundamental frequencies are used to reconstruct estimates of the individual speech signals. The proposed technique effectively adds vocal-tract-related filter characteristics as a new cue to CASA models using a new grouping technique based on an underdetermined blind source separation. We compare our model with both an underdetermined blind source separation and a CASA method. The experimental results show that our model outperforms both techniques in terms of SNR improvement and the percentage of crosstalk suppression.
url	http://asmp.eurasipjournals.com/content/2007/084186

Similar Items