Spoken Language Identification from Processing and Pattern Analysis of Spectrograms

Prior speech and linguistics research has focused on recognizing phonemes in speech, and on assembling them into recognizable words, to identify the spoken language. Some languages have additional phonemes, which can help identify a language; however, most of the phonemes a...

Bibliographic Details
Main Author: Ford, George Harold
Format: Others
Published: NSUWorks 2014
Subjects:
Online Access: http://nsuworks.nova.edu/gscis_etd/152
http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1151&context=gscis_etd
id ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1151
record_format oai_dc
spelling ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1151 2016-04-25T19:35:08Z Spoken Language Identification from Processing and Pattern Analysis of Spectrograms Ford, George Harold 2014-01-01T08:00:00Z text application/pdf http://nsuworks.nova.edu/gscis_etd/152 http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1151&context=gscis_etd CEC Theses and Dissertations NSUWorks Identification Language Matching Pattern Speaker Speech Computer Sciences
collection NDLTD
format Others
sources NDLTD
topic Identification
Language
Matching
Pattern
Speaker
Speech
Computer Sciences
description Prior speech and linguistics research has focused on recognizing phonemes in speech, and on assembling them into recognizable words, to identify the spoken language. Some languages have additional phonemes, which can help identify a language; however, most phonemes are common to a wide variety of languages. Legacy approaches recognize strings of phonemes as syllables and use dictionary queries to find a word that uniquely identifies a language. This dissertation research considers an alternative means of identifying the language of speech data, based solely on analysis of frequency-domain data. Three techniques for spoken language identification are compared analytically. First, a character-based pattern analysis is performed using the Rix and Forster algorithm to replicate their research on language identification. Second, phoneme recognition techniques, and the relative pattern of phoneme occurrence in speech samples, are measured for their language identification performance using the Rix and Forster approach. Finally, an experiment using statistical analysis of time-ensemble frequency spectrum data is assessed for its ability to establish spectral patterns for language identification, and for its performance. This novel approach applies pattern analysis techniques to spectrogram audio data: it applies the Rix and Forster method to the ensemble of spectral frequencies used over the duration of a speech waveform. It is compared with the applications of the Rix and Forster algorithm to character-based and phoneme symbols on the basis of statistical accuracy, processing time requirements, and spatial processing resource needs. The audio spectrum analysis also demonstrates the ability to perform speaker identification using the same techniques.
The results of this research demonstrate the efficacy of audio frequency-domain pattern analysis applied to speech waveform data. The approach provides an efficient technique for language identification without reliance upon linguistic approaches using phonemes or word derivations. This work also demonstrates a quick, automated means by which information gatherers, travelers, and diplomatic officials might obtain rapid language identification, supporting time-critical determination of appropriate translator resource needs.
author Ford, George Harold
title Spoken Language Identification from Processing and Pattern Analysis of Spectrograms
publisher NSUWorks
publishDate 2014
url http://nsuworks.nova.edu/gscis_etd/152
http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1151&context=gscis_etd
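
Illustrative sketch (not part of the catalog record): the description mentions statistical analysis of time-ensemble frequency spectrum data. The Python below is a minimal sketch of one way such a profile could be built, assuming a Hann-windowed short-time DFT whose magnitude spectra are averaged over time and normalized. The function names, the frame sizes, and the nearest-profile comparison are all hypothetical choices made for illustration; in particular, the comparison step is a simple Euclidean-distance stand-in and is not the Rix and Forster algorithm, which this record does not specify.

```python
import cmath
import math


def frame_spectrum(frame):
    """Magnitude spectrum of one frame via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def ensemble_profile(samples, frame_len=64, hop=32):
    """Average per-frame magnitude spectra over time; normalize to sum 1.

    frame_len and hop are illustrative defaults, not values from the source.
    """
    # Periodic Hann window to reduce spectral leakage between bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len)
              for t in range(frame_len)]
    total = [0.0] * (frame_len // 2 + 1)
    count = 0
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = samples[start:start + frame_len]
        frame = [s * w for s, w in zip(chunk, window)]
        for k, mag in enumerate(frame_spectrum(frame)):
            total[k] += mag
        count += 1
    if count == 0:
        raise ValueError("signal is shorter than one frame")
    norm = sum(total) or 1.0
    return [v / norm for v in total]


def closest_label(profile, references):
    """Label of the reference profile nearest in Euclidean distance.

    This is a placeholder comparison, not the Rix and Forster method.
    """
    return min(references,
               key=lambda label: math.dist(profile, references[label]))


def tone(cycles_per_frame, length=512, frame_len=64):
    """Synthetic test signal: a sine with the given cycles per frame."""
    return [math.sin(2 * math.pi * cycles_per_frame * t / frame_len)
            for t in range(length)]


# Hypothetical reference profiles; real use would build one per language
# from recorded speech rather than from synthetic tones.
references = {
    "low": ensemble_profile(tone(8)),
    "high": ensemble_profile(tone(20)),
}
query = ensemble_profile(tone(9))
print(closest_label(query, references))
```

The synthetic tones only exercise the pipeline; they say nothing about how well averaged spectra separate real languages, which is the question the dissertation itself evaluates.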