Robust EEG-Based Decoding of Auditory Attention With High-RMS-Level Speech Segments in Noisy Conditions

The attended speech stream can be detected robustly, even in adverse auditory scenarios with auditory attentional modulation, and can be decoded using electroencephalographic (EEG) data. Speech segmentation based on the relative root-mean-square (RMS) intensity can be used to estimate segmental cont...

Bibliographic Details
Main Authors:	Lei Wang, Ed X. Wu, Fei Chen
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2020-10-01
Series:	Frontiers in Human Neuroscience
Subjects:	EEG temporal response function (TRF) auditory attention decoding speech RMS-level segments signal-to-noise ratio
Online Access:	https://www.frontiersin.org/article/10.3389/fnhum.2020.557534/full

id	doaj-2ce3b78ce4fb4376a7d7c6827d023d2e
record_format	Article
spelling	doaj-2ce3b78ce4fb4376a7d7c6827d023d2e; 2020-11-25T03:40:14Z; eng; Frontiers Media S.A.; Frontiers in Human Neuroscience; ISSN 1662-5161; 2020-10-01; volume 14; DOI 10.3389/fnhum.2020.557534; article 557534; Robust EEG-Based Decoding of Auditory Attention With High-RMS-Level Speech Segments in Noisy Conditions; Lei Wang (Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China; Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, Hong Kong); Ed X. Wu (Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, Hong Kong); Fei Chen (Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China)
collection	DOAJ
issn	1662-5161
description	The attended speech stream can be detected robustly, even in adverse auditory scenarios with auditory attentional modulation, and can be decoded using electroencephalographic (EEG) data. Speech segmentation based on the relative root-mean-square (RMS) intensity can be used to estimate segmental contributions to perception in noisy conditions. High-RMS-level segments contain crucial information for speech perception. Hence, this study aimed to investigate the effect of high-RMS-level speech segments on auditory attention decoding performance under various signal-to-noise ratio (SNR) conditions. Scalp EEG signals were recorded when subjects listened to the attended speech stream in the mixed speech narrated concurrently by two Mandarin speakers. The temporal response function was used to identify the attended speech from EEG responses of tracking to the temporal envelopes of intact speech and high-RMS-level speech segments alone, respectively. Auditory decoding performance was then analyzed under various SNR conditions by comparing EEG correlations to the attended and ignored speech streams. The accuracy of auditory attention decoding based on the temporal envelope with high-RMS-level speech segments was not inferior to that based on the temporal envelope of intact speech. Cortical activity correlated more strongly with attended than with ignored speech under different SNR conditions. These results suggest that EEG recordings corresponding to high-RMS-level speech segments carry crucial information for the identification and tracking of attended speech in the presence of background noise. This study also showed that with the modulation of auditory attention, attended speech can be decoded more robustly from neural activity than from behavioral measures under a wide range of SNR.
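The two signal-processing steps named in the abstract (selecting high-RMS-level speech segments, then comparing correlations with the attended and ignored streams) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not state the frame length or the level threshold, so the 16 ms frames and the 0 dB threshold relative to the overall RMS are assumptions, as are the function names `high_rms_mask` and `decode_attention`. The temporal response function itself is not modelled here; `reconstructed` stands in for whatever envelope estimate a TRF-based decoder would produce from the EEG.

```python
import numpy as np


def high_rms_mask(signal, fs, frame_ms=16.0, threshold_db=0.0):
    """Mark samples that fall in frames whose RMS level, relative to the
    overall RMS of the signal, is at or above threshold_db.

    frame_ms and threshold_db are illustrative assumptions, not values
    taken from the catalog record.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = max(1, int(round(fs * frame_ms / 1000.0)))
    overall_rms = np.sqrt(np.mean(signal ** 2))
    mask = np.zeros(signal.shape[0], dtype=bool)
    if overall_rms == 0.0:
        return mask  # silent input: no frame can exceed the reference level
    for start in range(0, signal.shape[0], frame_len):
        frame = signal[start:start + frame_len]
        frame_rms = np.sqrt(np.mean(frame ** 2))
        # Relative level in dB; the small floor avoids log10(0).
        rel_db = 20.0 * np.log10(max(frame_rms, 1e-12) / overall_rms)
        mask[start:start + frame_len] = rel_db >= threshold_db
    return mask


def decode_attention(reconstructed, envelope_a, envelope_b):
    """Pick the stream whose envelope correlates more strongly (Pearson r)
    with the envelope reconstructed from EEG."""
    r_a = np.corrcoef(reconstructed, envelope_a)[0, 1]
    r_b = np.corrcoef(reconstructed, envelope_b)[0, 1]
    label = "A" if r_a > r_b else "B"
    return label, r_a, r_b
```

Under these assumptions, an envelope restricted to high-RMS-level segments could be formed as `envelope * high_rms_mask(speech, fs)` before it is passed to `decode_attention`.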
