Application of sound source separation methods to advanced spatial audio systems

This thesis is related to the field of Sound Source Separation (SSS). It addresses the development and evaluation of these techniques for their application in the resynthesis of high-realism sound scenes by means of Wave Field Synthesis (WFS). Because the vast majority of audio recordings are pres...


Bibliographic Details
Main Author:	Cobos Serrano, Máximo
Other Authors:	López Monfort, José Javier
Format:	Doctoral Thesis
Language:	English
Published:	Universitat Politècnica de València 2010
Subjects:	Wave field synthesis; Source separation; Time frequency processing; Direction of arrival; Spatial audio; TEORIA DE LA SEÑAL Y COMUNICACIONES
Online Access:	http://hdl.handle.net/10251/8969

id	ndltd-upv.es-oai-riunet.upv.es-10251-8969
record_format	oai_dc
spelling	ndltd-upv.es-oai-riunet.upv.es-10251-8969 2020-12-02T20:21:27Z Application of sound source separation methods to advanced spatial audio systems Cobos Serrano, Máximo López Monfort, José Javier Universitat Politècnica de València. Departamento de Comunicaciones - Departament de Comunicacions Wave field synthesis Source separation Time frequency processing Direction of arrival Spatial audio TEORIA DE LA SEÑAL Y COMUNICACIONES This thesis is related to the field of Sound Source Separation (SSS). It addresses the development and evaluation of these techniques for their application in the resynthesis of high-realism sound scenes by means of Wave Field Synthesis (WFS). Because the vast majority of audio recordings are preserved in two-channel stereo format, special up-converters are required to use advanced spatial audio reproduction formats, such as WFS. This is due to the fact that WFS needs the original source signals to be available in order to accurately synthesize the acoustic field inside an extended listening area. Thus, object-based mixing is required. Source separation problems in digital signal processing are those in which several signals have been mixed together and the objective is to find out what the original signals were. Therefore, SSS algorithms can be applied to existing two-channel mixtures to extract the different objects that compose the stereo scene. Unfortunately, most stereo mixtures are underdetermined, i.e., there are more sound sources than audio channels. This condition makes the SSS problem especially difficult, and stronger assumptions have to be made, often related to the sparsity of the sources under some signal transformation. This thesis is focused on the application of SSS techniques to the spatial sound reproduction field. As a result, its contributions can be categorized within these two areas. First, two underdetermined SSS methods are proposed to deal efficiently with the separation of stereo sound mixtures. These techniques are based on a multi-level thresholding segmentation approach, which enables fast, unsupervised separation of sound sources in the time-frequency domain. Although both techniques rely on the same clustering type, the features considered by each of them are related to different localization cues that enable the separation of either instantaneous or real mixtures. Additionally, two post-processing techniques aimed at improving the isolation of the separated sources are proposed. The performance achieved by several SSS methods in the resynthesis of WFS sound scenes is then evaluated by means of listening tests, paying special attention to the change observed in the perceived spatial attributes. Although the estimated sources are distorted versions of the original ones, the masking effects involved in their spatial remixing make artifacts less perceptible, which improves the overall assessed quality. Finally, some novel developments related to the application of time-frequency processing to source localization and enhanced sound reproduction are presented. Cobos Serrano, M. (2009). Application of sound source separation methods to advanced spatial audio systems [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8969 Palancia 2010-12-03 info:eu-repo/semantics/doctoralThesis info:eu-repo/semantics/acceptedVersion http://hdl.handle.net/10251/8969 10.4995/Thesis/10251/8969 eng http://rightsstatements.org/vocab/InC/1.0/ info:eu-repo/semantics/openAccess Universitat Politècnica de València Riunet
collection	NDLTD
language	English
format	Doctoral Thesis
sources	NDLTD
topic	Wave field synthesis; Source separation; Time frequency processing; Direction of arrival; Spatial audio; TEORIA DE LA SEÑAL Y COMUNICACIONES
spellingShingle	Wave field synthesis Source separation Time frequency processing Direction of arrival Spatial audio TEORIA DE LA SEÑAL Y COMUNICACIONES Cobos Serrano, Máximo Application of sound source separation methods to advanced spatial audio systems
description	This thesis is related to the field of Sound Source Separation (SSS). It addresses the development and evaluation of these techniques for their application in the resynthesis of high-realism sound scenes by means of Wave Field Synthesis (WFS). Because the vast majority of audio recordings are preserved in two-channel stereo format, special up-converters are required to use advanced spatial audio reproduction formats, such as WFS. This is due to the fact that WFS needs the original source signals to be available in order to accurately synthesize the acoustic field inside an extended listening area. Thus, object-based mixing is required. Source separation problems in digital signal processing are those in which several signals have been mixed together and the objective is to find out what the original signals were. Therefore, SSS algorithms can be applied to existing two-channel mixtures to extract the different objects that compose the stereo scene. Unfortunately, most stereo mixtures are underdetermined, i.e., there are more sound sources than audio channels. This condition makes the SSS problem especially difficult, and stronger assumptions have to be made, often related to the sparsity of the sources under some signal transformation. This thesis is focused on the application of SSS techniques to the spatial sound reproduction field. As a result, its contributions can be categorized within these two areas. First, two underdetermined SSS methods are proposed to deal efficiently with the separation of stereo sound mixtures. These techniques are based on a multi-level thresholding segmentation approach, which enables fast, unsupervised separation of sound sources in the time-frequency domain. Although both techniques rely on the same clustering type, the features considered by each of them are related to different localization cues that enable the separation of either instantaneous or real mixtures. Additionally, two post-processing techniques aimed at improving the isolation of the separated sources are proposed. The performance achieved by several SSS methods in the resynthesis of WFS sound scenes is then evaluated by means of listening tests, paying special attention to the change observed in the perceived spatial attributes. Although the estimated sources are distorted versions of the original ones, the masking effects involved in their spatial remixing make artifacts less perceptible, which improves the overall assessed quality. Finally, some novel developments related to the application of time-frequency processing to source localization and enhanced sound reproduction are presented. === Cobos Serrano, M. (2009). Application of sound source separation methods to advanced spatial audio systems [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8969 === Palancia
author2	López Monfort, José Javier
author_facet	López Monfort, José Javier Cobos Serrano, Máximo
author	Cobos Serrano, Máximo
author_sort	Cobos Serrano, Máximo
title	Application of sound source separation methods to advanced spatial audio systems
title_short	Application of sound source separation methods to advanced spatial audio systems
title_full	Application of sound source separation methods to advanced spatial audio systems
title_fullStr	Application of sound source separation methods to advanced spatial audio systems
title_full_unstemmed	Application of sound source separation methods to advanced spatial audio systems
title_sort	application of sound source separation methods to advanced spatial audio systems
publisher	Universitat Politècnica de València
publishDate	2010
url	http://hdl.handle.net/10251/8969
work_keys_str_mv	AT cobosserranomaximo applicationofsoundsourceseparationmethodstoadvancedspatialaudiosystems
_version_	1719367156543520768
