Semantic markup of nouns and adjectives for the Electronic corpus of texts in Tuvan language

Bibliographic Details
Main Authors: Bajlak Ch. Oorzhak, Arzhaana B. Khertek, Marija A. Kuzhuget, Valentina S. Ondar
Format: Article
Language: Russian
Published: Novye Issledovaniâ Tuvy 2016-12-01
Series: Novye Issledovaniâ Tuvy
Online Access: https://nit.tuva.asia/nit/article/view/615
Description
Summary: The article reports on the progress of semantic markup for the Electronic corpus of texts in Tuvan language (ECTTL), a further stage in the work of adding Tuvan texts to the database and marking up the corpus. ECTTL is a collaborative project by researchers from Tuvan State University (Research and Education Center of Turkic Studies and Department of Information Technologies). The semantic markup of the Tuvan lexicon will serve as a search and reference system that helps users find text snippets in ECTTL containing words with the desired meanings. The first stage of this process is building databases of the basic lexemes of the Tuvan language. All meaningful lexemes were classified into the following semantic groups: humans, animals, objects, natural objects and phenomena, and abstract concepts. All Tuvan object nouns, as well as both descriptive and relative adjectives, were assigned to one of these lexico-semantic classes. Each class, sub-class and descriptor is tagged in Tuvan, Russian and English; these tags, in turn, will help automate searching. The databases of meaningful lexemes will also record the lexical combinations these lexemes enter into. The automated system will contain information on the semantic combinations of adjectives with nouns, adverbs with verbs, and nouns with verbs, as well as on combinations that are semantically incompatible.
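Illustration: the class-and-tag scheme described in the summary can be pictured with a small sketch. This is not the ECTTL implementation. The names `CLASS_LABELS`, `Lexeme`, `resolve_class` and `find_by_class` are hypothetical, and so are the class ids, the label strings and the sample entries. The project's actual Tuvan, Russian and English tags are not listed in this record, so Tuvan labels are left out. Matching is on exact lemmas only, with no morphological analysis.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Top-level semantic groups named in the summary. Ids and label strings are
# illustrative assumptions; the project's real tags are not given in the record.
CLASS_LABELS = {
    "human": {"eng": "humans", "rus": "люди"},
    "animal": {"eng": "animals", "rus": "животные"},
    "object": {"eng": "objects", "rus": "предметы"},
    "nature": {"eng": "natural objects and phenomena",
               "rus": "природные объекты и явления"},
    "abstract": {"eng": "abstract concepts", "rus": "абстрактные понятия"},
}


@dataclass(frozen=True)
class Lexeme:
    lemma: str           # Tuvan lemma
    pos: str             # part of speech, e.g. "noun" or "adj"
    semantic_class: str  # a key of CLASS_LABELS
    gloss: str           # English gloss, for readability only


# Hypothetical sample entries, not taken from the ECTTL databases.
LEXICON = [
    Lexeme("кижи", "noun", "human", "person"),
    Lexeme("аът", "noun", "animal", "horse"),
    Lexeme("ном", "noun", "object", "book"),
    Lexeme("даг", "noun", "nature", "mountain"),
]


def resolve_class(tag: str) -> Optional[str]:
    """Map a class id or a label in any listed language to its class id."""
    needle = tag.strip().lower()
    for class_id, labels in CLASS_LABELS.items():
        if needle == class_id or needle in (v.lower() for v in labels.values()):
            return class_id
    return None


def find_by_class(snippets: List[List[str]], tag: str) -> List[Tuple[int, str]]:
    """Return (snippet index, lemma) pairs for lemmas of the requested class.

    Each snippet is assumed to be a list of already lemmatized tokens.
    """
    class_id = resolve_class(tag)
    if class_id is None:
        return []
    wanted = {lx.lemma for lx in LEXICON if lx.semantic_class == class_id}
    return [(i, tok) for i, snippet in enumerate(snippets)
            for tok in snippet if tok in wanted]
```

For example, `find_by_class([["аът", "даг"], ["ном"]], "животные")` resolves the Russian label to the `animal` class and reports the match in the first snippet. Keeping one language-neutral class id behind the multilingual labels means a query in any of the tag languages reaches the same set of lexemes.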
ISSN: 2079-8482