Accent Recognition with Hybrid Phonetic Features
The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a high-level abstract feature that has a profound relationship with l...
Main Authors: | Zhan Zhang, Yuehai Wang, Jianyi Yang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-09-01 |
Series: | Sensors |
Subjects: | accent recognition; audio classification; accented English speech recognition |
Online Access: | https://www.mdpi.com/1424-8220/21/18/6258 |
id |
doaj-ec427c78bb1646058cf443c2646a705b |
---|---|
record_format |
Article |
spelling |
doaj-ec427c78bb1646058cf443c2646a705b; 2021-09-26T01:23:56Z; eng; MDPI AG; Sensors; 1424-8220; 2021-09-01; 21; 6258; 6258; 10.3390/s21186258; Accent Recognition with Hybrid Phonetic Features; Zhan Zhang, Yuehai Wang, Jianyi Yang (all three: Department of Information and Electronic Engineering, Zhejiang University, Hangzhou 310007, China); The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a high-level abstract feature that has a profound relationship with language knowledge, AR is more challenging than other language-agnostic audio classification tasks. In this paper, we use an auxiliary automatic speech recognition (ASR) task to extract language-related phonetic features. Furthermore, we propose a hybrid structure that incorporates the embeddings of both a fixed acoustic model and a trainable acoustic model, making the language-related acoustic feature more robust. We conduct several experiments on the AESRC dataset. The results demonstrate that our approach can obtain an 8.02% relative improvement compared with the Transformer baseline, showing the merits of the proposed method.; https://www.mdpi.com/1424-8220/21/18/6258; accent recognition; audio classification; accented English speech recognition |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Zhan Zhang Yuehai Wang Jianyi Yang |
title |
Accent Recognition with Hybrid Phonetic Features |
publisher |
MDPI AG |
series |
Sensors |
issn |
1424-8220 |
publishDate |
2021-09-01 |
description |
The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a high-level abstract feature that has a profound relationship with language knowledge, AR is more challenging than other language-agnostic audio classification tasks. In this paper, we use an auxiliary automatic speech recognition (ASR) task to extract language-related phonetic features. Furthermore, we propose a hybrid structure that incorporates the embeddings of both a fixed acoustic model and a trainable acoustic model, making the language-related acoustic feature more robust. We conduct several experiments on the AESRC dataset. The results demonstrate that our approach can obtain an 8.02% relative improvement compared with the Transformer baseline, showing the merits of the proposed method. |
topic |
accent recognition audio classification accented English speech recognition |
url |
https://www.mdpi.com/1424-8220/21/18/6258 |