Classifying cuneiform symbols using machine learning algorithms with unigram features on a balanced dataset

Identifying languages written in cuneiform script is difficult because little data is available and tokenization is challenging. The Cuneiform Language Identification (CLI) dataset addresses seven cuneiform languages and dialects, including Sumerian...


Bibliographic Details
Published in: Journal of Intelligent Systems
Main Authors: Mahmood Maha, Jasem Farah Maath, Mukhlif Abdulrahman Abbas, AL-Khateeb Belal
Format: Article
Language: English
Published: De Gruyter 2023-09-01
Online Access: https://doi.org/10.1515/jisys-2023-0087