Accent Recognition with Hybrid Phonetic Features

The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a high-level abstract feature that has a profound relationship with l...

Bibliographic Details
Main Authors: Zhan Zhang, Yuehai Wang, Jianyi Yang
Format: Article
Language: English
Published: MDPI AG 2021-09-01
Series: Sensors
Subjects: accent recognition; audio classification; accented English speech recognition
Online Access: https://www.mdpi.com/1424-8220/21/18/6258
id doaj-ec427c78bb1646058cf443c2646a705b
record_format Article
spelling doaj-ec427c78bb1646058cf443c2646a705b
2021-09-26T01:23:56Z
eng
MDPI AG
Sensors
1424-8220
2021-09-01
21
6258
6258
10.3390/s21186258
Accent Recognition with Hybrid Phonetic Features
Zhan Zhang 0
Yuehai Wang 1
Jianyi Yang 2
Department of Information and Electronic Engineering, Zhejiang University, Hangzhou 310007, China
Department of Information and Electronic Engineering, Zhejiang University, Hangzhou 310007, China
Department of Information and Electronic Engineering, Zhejiang University, Hangzhou 310007, China
The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a high-level abstract feature that has a profound relationship with language knowledge, AR is more challenging than other language-agnostic audio classification tasks. In this paper, we use an auxiliary automatic speech recognition (ASR) task to extract language-related phonetic features. Furthermore, we propose a hybrid structure that incorporates the embeddings of both a fixed acoustic model and a trainable acoustic model, making the language-related acoustic feature more robust. We conduct several experiments on the AESRC dataset. The results demonstrate that our approach can obtain an 8.02% relative improvement compared with the Transformer baseline, showing the merits of the proposed method.
https://www.mdpi.com/1424-8220/21/18/6258
accent recognition
audio classification
accented English speech recognition
collection DOAJ
language English
format Article
sources DOAJ
author Zhan Zhang
Yuehai Wang
Jianyi Yang
spellingShingle Zhan Zhang
Yuehai Wang
Jianyi Yang
Accent Recognition with Hybrid Phonetic Features
Sensors
accent recognition
audio classification
accented English speech recognition
author_facet Zhan Zhang
Yuehai Wang
Jianyi Yang
author_sort Zhan Zhang
title Accent Recognition with Hybrid Phonetic Features
title_short Accent Recognition with Hybrid Phonetic Features
title_full Accent Recognition with Hybrid Phonetic Features
title_fullStr Accent Recognition with Hybrid Phonetic Features
title_full_unstemmed Accent Recognition with Hybrid Phonetic Features
title_sort accent recognition with hybrid phonetic features
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2021-09-01
description The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a high-level abstract feature that has a profound relationship with language knowledge, AR is more challenging than other language-agnostic audio classification tasks. In this paper, we use an auxiliary automatic speech recognition (ASR) task to extract language-related phonetic features. Furthermore, we propose a hybrid structure that incorporates the embeddings of both a fixed acoustic model and a trainable acoustic model, making the language-related acoustic feature more robust. We conduct several experiments on the AESRC dataset. The results demonstrate that our approach can obtain an 8.02% relative improvement compared with the Transformer baseline, showing the merits of the proposed method.
topic accent recognition
audio classification
accented English speech recognition
url https://www.mdpi.com/1424-8220/21/18/6258