Automatic proficiency assessment of Korean speech read aloud by non‐natives using bidirectional LSTM‐based speech recognition

This paper presents an automatic proficiency assessment method for a non‐native Korean read utterance using bidirectional long short–term memory (BLSTM)–based acoustic models (AMs) and speech data augmentation techniques. Specifically, the proposed method considers two scenarios, with and without pr...


Bibliographic Details
Main Authors: Yoo Rhee Oh, Kiyoung Park, Hyung‐Bae Jeon, Jeon Gue Park
Format: Article
Language: English
Published: Electronics and Telecommunications Research Institute (ETRI) 2020-04-01
Series: ETRI Journal
Subjects:
Online Access: https://doi.org/10.4218/etrij.2019-0400
id doaj-7e8d91bc10274eebaf16e305f909c383
record_format Article
spelling doaj-7e8d91bc10274eebaf16e305f909c383; 2021-01-05T05:15:53Z; eng; Electronics and Telecommunications Research Institute (ETRI); ETRI Journal; ISSN 1225-6463; 2020-04-01; vol. 42, no. 5, pp. 764-775; doi:10.4218/etrij.2019-0400
collection DOAJ
language English
format Article
sources DOAJ
author Yoo Rhee Oh
Kiyoung Park
Hyung‐Bae Jeon
Jeon Gue Park
title Automatic proficiency assessment of Korean speech read aloud by non‐natives using bidirectional LSTM‐based speech recognition
publisher Electronics and Telecommunications Research Institute (ETRI)
series ETRI Journal
issn 1225-6463
publishDate 2020-04-01
description This paper presents an automatic proficiency assessment method for a non‐native Korean read utterance using bidirectional long short–term memory (BLSTM)–based acoustic models (AMs) and speech data augmentation techniques. Specifically, the proposed method considers two scenarios, with and without prompted text. The proposed method with the prompted text performs (a) a speech feature extraction step, (b) a forced‐alignment step using a native AM and non‐native AM, and (c) a linear regression–based proficiency scoring step for the five proficiency scores. Meanwhile, the proposed method without the prompted text additionally performs Korean speech recognition and a subword un‐segmentation for the missing text. The experimental results indicate that the proposed method with prompted text improves the performance for all scores when compared to a method employing conventional AMs. In addition, the proposed method without the prompted text has a fluency score performance comparable to that of the method with prompted text.
topic automatic speech recognition (asr) for a non‐native korean utterance
bidirectional long short–term memory (blstm)–based acoustic models (ams)
speech data augmentation
spoken computer‐assisted language learning (call)
spoken proficiency assessment
url https://doi.org/10.4218/etrij.2019-0400