A Statistical Language Model for Pre-Trained Sequence Labeling: A Case Study on Vietnamese

By defining the computable word segmentation unit and studying its probability characteristics, we establish an unsupervised statistical language model (SLM) for a new pre-trained sequence labeling framework in this article. The proposed SLM is an optimization model, and its objective is to maximize...

Bibliographic Details
Main Authors:	Chen, L. (Author), Huang, Y. (Author), Liao, X. (Author), Yang, P. (Author)
Format:	Article
Language:	English
Published:	Association for Computing Machinery 2022
Subjects:	Binding forces; Case-studies; Computational linguistics; Condition; Dynamic programming; Natural language processing systems; Optimization models; Performance; sequence labeling; Sequence Labeling; Speech recognition; statistical language model; Statistical language modelling; Statistical tests; Unsupervised; Vietnamese; Word segmentation
Online Access:	View Fulltext in Publisher

Similar Items

Infants generalize representations of statistically segmented words
by: Katharine Graf Estes
Published: (2012-10-01)

Understanding Patterns in Infant-Directed Speech in Context: An Investigation of Statistical Cues to Word Boundaries
by: Hartman, Rose
Published: (2017)

Hierarchical sequence labeling for extracting BEL statements from biomedical literature
by: Suwen Liu, et al.
Published: (2019-04-01)

Cross-Linguistic Analysis of Vietnamese and English with Implications for Vietnamese Language Acquisition and Maintenance in the United States
by: Giang Tang
Published: (2007-01-01)

The Meta-Science of Adult Statistical Word Segmentation: Part 1
by: Joshua K. Hartshorne, et al.
Published: (2019-01-01)

Statistical learning of a tonal language: The influence of bilingualism and previous linguistic experience
by: Tianlin Wang, et al.
Published: (2014-09-01)

Coverbs and case in Vietnamese
by: Clark, Marybeth
Published: (2009)

Extensive Experimental Evaluation of Self-Organizing Maps for Automatic Classification of a Multi-Class Multi-Label Corpus
by: Eleni Giannopoulou, et al.
Published: (2018-01-01)

Do we need statistics when we have linguistics?
by: Cantos Gómez Pascual
Published: (2002-01-01)

Unsupervised Topic Labeling of Text Based on Wikipedia Categorization
by: Tetyana Loskutova
Published: (2019-08-01)

Toddlers’ Ability to Leverage Statistical Information to Support Word Learning
by: Erica M. Ellis, et al.
Published: (2021-04-01)

The naïve language expert: Introduction to the Special Topic
by: Jutta L Mueller, et al.
Published: (2013-08-01)

The Influence of Different Prosodic Cues on Word Segmentation
by: Theresa Matzinger, et al.
Published: (2021-03-01)

A Neurophysiologically-Inspired Statistical Language Model
by: Dehdari, Jonathan
Published: (2014)

Musical expertise and statistical learning of musical and linguistic structures
by: Clément Francois, et al.
Published: (2011-07-01)

Discovering Words in Fluent Speech: The Contribution of Two Kinds of Statistical Information
by: Erik D Thiessen, et al.
Published: (2013-01-01)

Statistical Bootstrapping of Speech Segmentation Cues
by: Planet, Nicolas O.
Published: (2010)

Speaking clearly improves speech segmentation by statistical learning under optimal listening conditions
by: Zhe-chen Guo, et al.
Published: (2021-07-01)

The Class of Indefinites in Vietnamese
by: Michaelis, Laura A
Published: (1989-01-01)

Analysis of Morph-Based Language Modeling and Speech Recognition in Slovak
by: Jan Stas, et al.
Published: (2012-01-01)

Enhancing ASR Systems for Under-Resourced Languages through a Novel Unsupervised Acoustic Model Training Technique
by: Cucu, H., et al.
Published: (2015-02-01)

Development of Speech Recognition Threshold and Word Recognition Materials for Native Vietnamese Speakers
by: Hanson, Claire
Published: (2014)

On the Use of Parsing for Named Entity Recognition
by: Miguel A. Alonso, et al.
Published: (2021-01-01)

The First Vietnamese FOSD-Tacotron-2-based Text-to-Speech Model Dataset
by: Duc Chung Tran
Published: (2020-08-01)

Family member information extraction via neural sequence labeling models with different tag schemes
by: Hong-Jie Dai
Published: (2019-12-01)

Information Theory and Language
by: Łukasz Dębowski, et al.
Published: (2020-04-01)

HTLinker: A Head-to-Tail Linker for Nested Named Entity Recognition
by: Xiang Li, et al.
Published: (2021-08-01)

ϕ-statistically quasi Cauchy sequences
by: Bipan Hazarika
Published: (2016-04-01)

Temporal attention as a Scaffold for Language Development
by: Ruth De Diego-Balaguer, et al.
Published: (2016-02-01)

Numbering the streaks of the tulip? Reflections on a Challenge to the Use of Statistical Methods in Computational Stylistics
by: J F Burrows
Published: (2005-08-01)

A New Method of Time-Series Event Prediction Based on Sequence Labeling
by: Lv, S., et al.
Published: (2023)

SEGMENTATION OF ANATOMICAL STRUCTURES BY CONNECTED STATISTICAL MODELS
by: Marko Bukovec, et al.
Published: (2011-06-01)

A multi-objective programming perspective to statistical learning problems
by: Yaman, Sibel
Published: (2009)

Statistical learning in late talkers and normal peers
by: Fatemeh Karimian, et al.
Published: (2020-01-01)

Lecture transcription systems in resource-scarce environments / Pieter Theunis de Villiers
by: De Villiers, Pieter Theunis
Published: (2014)

Continuous space models with neural networks in natural language processing
by: Le, Hai Son
Published: (2012)

Ronny Meyer - Renate Richter: Language Use in Ethiopia from a Network Perspective
by: Olga Kapeliuk
Published: (2012-10-01)

Methods library of embedded R functions at Statistics Norway
by: Øyvind Langsrud
Published: (2017-11-01)

Advancing our understanding of the link between statistical learning and language acquisition: The need for longitudinal data
by: Joanne Arciuli, et al.
Published: (2012-08-01)