Aligning document layouts extracted with different OCR engines with clustering approach

Layout analysis is an essential step in information extraction from scanned document images. In this paper, we propose an algorithm for aligning layouts generated with different OCR engines. The main requirement is to always generate the same layout for the given document image regardless of OCR engine...

Bibliographic Details
Main Authors: S. Tomovic, K. Pavlovic, M. Bajceta
Format: Article
Language: English
Published: Elsevier 2021-09-01
Series: Egyptian Informatics Journal
Subjects: OCR
Online Access: http://www.sciencedirect.com/science/article/pii/S1110866520301638