On Digitization of Romanian Cyrillic Printings of the 17th-18th Centuries

Bibliographic Details
Main Authors: Svetlana Cojocaru, Alexandru Colesnicov, Ludmila Malahov, Tudor Bumbu, Ștefan Ungur
Format: Article
Language: English
Published: Institute of Mathematics and Computer Science of the Academy of Sciences of Moldova 2017-08-01
Series: Computer Science Journal of Moldova
Online Access: http://www.math.md/files/csjm/v25-n2/v25-n2-(pp217-225).pdf
Description
Summary: The paper describes in detail the recognition of Romanian texts of the 17th-18th centuries printed in the Cyrillic script and their conversion to the modern Latin script. The challenges are discussed, and solutions are proposed. The developed technology and tool pack include historical alphabets, sets of recognition patterns, and spelling dictionaries in the corresponding orthographies for ABBYY FineReader. In addition, virtual keyboards, fonts, a transliteration utility, and a user manual were developed. Together these permit successful recognition of old Romanian texts in the Cyrillic script. Transliteration to the Latin script gives barrier-free access to historical documents.
ISSN: 1561-4042