A Hybrid Machine-Learning-Based Method for Analytic Representation of the Vocal Fold Edges during Connected Speech

Investigating the phonatory processes in connected speech from high-speed videoendoscopy (HSV) demands the accurate detection of the vocal fold edges during vibration. The present paper proposes a new spatio-temporal technique to automatically segment vocal fold edges in HSV data during running spee...

Bibliographic Details
Main Authors: Ahmed M. Yousef, Dimitar D. Deliyski, Stephanie R. C. Zacharias, Alessandro de Alarcon, Robert F. Orlikoff, Maryam Naghibolhosseini
Format: Article
Language: English
Published: MDPI AG 2021-01-01
Series: Applied Sciences
Subjects: high-speed videoendoscopy; connected speech; automated machine-learning-based edge detection
Online Access: https://www.mdpi.com/2076-3417/11/3/1179
id doaj-73e0aecc3dfa4678a2c64b32aa58a341
record_format Article
spelling doaj-73e0aecc3dfa4678a2c64b32aa58a341
Record timestamp: 2021-01-28T00:05:47Z
Language: eng
Publisher: MDPI AG
Journal: Applied Sciences (ISSN 2076-3417)
Published: 2021-01-01, volume 11, article 1179
DOI: 10.3390/app11031179
Authors and affiliations:
Ahmed M. Yousef: Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI 48824, USA
Dimitar D. Deliyski: Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI 48824, USA
Stephanie R. C. Zacharias: Mayo Clinic, Scottsdale, AZ 85259, USA
Alessandro de Alarcon: Division of Pediatric Otolaryngology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
Robert F. Orlikoff: College of Allied Health Sciences, East Carolina University, Greenville, NC 27834, USA
Maryam Naghibolhosseini: Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI 48824, USA
collection DOAJ
language English
format Article
sources DOAJ
author Ahmed M. Yousef
Dimitar D. Deliyski
Stephanie R. C. Zacharias
Alessandro de Alarcon
Robert F. Orlikoff
Maryam Naghibolhosseini
title A Hybrid Machine-Learning-Based Method for Analytic Representation of the Vocal Fold Edges during Connected Speech
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2021-01-01
description Investigating the phonatory processes in connected speech from high-speed videoendoscopy (HSV) demands the accurate detection of the vocal fold edges during vibration. The present paper proposes a new spatio-temporal technique to automatically segment vocal fold edges in HSV data during running speech. The HSV data were recorded from a vocally normal adult during a reading of the “Rainbow Passage.” The introduced technique was based on an unsupervised machine-learning (ML) approach combined with an active contour modeling (ACM) technique (also known as a hybrid approach). The hybrid method was implemented to capture the edges of vocal folds on different HSV kymograms, extracted at various cross-sections of vocal folds during vibration. The k-means clustering method, an ML approach, was first applied to cluster the kymograms to identify the clustered glottal area and consequently provided an initialized contour for the ACM. The ACM algorithm was then used to precisely detect the glottal edges of the vibrating vocal folds. The developed algorithm was able to accurately track the vocal fold edges across frames with low computational cost and high robustness against image noise. This algorithm offers a fully automated tool for analyzing the vibratory features of vocal folds in connected speech.
topic high-speed videoendoscopy
connected speech
automated machine-learning-based edge detection
url https://www.mdpi.com/2076-3417/11/3/1179
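The description above says that k-means clustering is applied to each kymogram to find the glottal area, which then supplies an initial contour for the active contour model. The following is a minimal sketch of that initialization step only, not the authors' implementation. It assumes a grayscale kymogram stored as a list of rows, two intensity clusters, and that the glottal area is the darkest cluster. The function names (`kmeans_1d`, `initial_glottal_edges`, `synthetic_kymogram`) and the synthetic test image are hypothetical, and the ACM refinement is not shown.

```python
import math


def kmeans_1d(values, k=2, iters=50):
    """Plain Lloyd's k-means on scalar intensities; returns sorted centroids."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the centroids evenly over the intensity range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        sums, counts = [0.0] * k, [0] * k
        for v in values:
            j = min(range(k), key=lambda c: abs(v - centroids[c]))
            sums[j] += v
            counts[j] += 1
        # Keep the old centroid if a cluster received no pixels.
        updated = [sums[j] / counts[j] if counts[j] else centroids[j]
                   for j in range(k)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def initial_glottal_edges(kymogram, k=2):
    """Return (upper_row, lower_row) of the dark cluster per time column.

    kymogram[row][col]: rows run across the glottis, columns are frames.
    A column with no dark pixels (closed glottis) yields None.
    """
    pixels = [p for row in kymogram for p in row]
    centroids = kmeans_1d(pixels, k)
    dark = centroids[0]  # assumption: the glottal area is the darkest cluster
    edges = []
    for c in range(len(kymogram[0])):
        rows = [r for r in range(len(kymogram))
                if min(centroids, key=lambda m: abs(kymogram[r][c] - m)) == dark]
        edges.append((rows[0], rows[-1]) if rows else None)
    return edges


def synthetic_kymogram(n_rows=21, n_cols=16, period=8):
    """Toy kymogram: a dark gap that opens and closes sinusoidally over time."""
    mid = n_rows // 2
    kymo = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            half_width = round(4 * max(0.0, math.sin(2 * math.pi * c / period)))
            row.append(20 if abs(r - mid) < half_width else 200)
        kymo.append(row)
    return kymo


edges = initial_glottal_edges(synthetic_kymogram())
```

In this sketch, each entry of `edges` is a rough upper and lower edge position for one frame. In a hybrid scheme of the kind the description outlines, such positions would serve only as the starting contour that an active contour algorithm then refines.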