RNTR-Net: A Robust Natural Text Recognition Network

In this work, a novel robust natural text recognition network (RNTR-Net) is proposed that combines a convolutional neural network (CNN) for feature extraction with a recurrent neural network (RNN) for sequence recognition. The pipeline design comprises an improved block of residual lear...


Bibliographic Details
Main Authors: Qiaokang Liang, Shao Xiang, Yaonan Wang, Wei Sun, Dan Zhang
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: CNN
Online Access: https://ieeexplore.ieee.org/document/8950043/
id doaj-647cd6203f8a469d99bdb3a63ad5f01a
record_format Article
spelling doaj-647cd6203f8a469d99bdb3a63ad5f01a
2021-03-30T01:18:37Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
8
7719
7730
10.1109/ACCESS.2020.2964148
8950043
RNTR-Net: A Robust Natural Text Recognition Network
Qiaokang Liang 0 https://orcid.org/0000-0002-5504-9966
Shao Xiang 1 https://orcid.org/0000-0002-2797-1937
Yaonan Wang 2 https://orcid.org/0000-0002-8578-2165
Wei Sun 3 https://orcid.org/0000-0001-9744-4238
Dan Zhang 4 https://orcid.org/0000-0002-7295-4837
College of Electrical and Information Engineering, Hunan University, Changsha, China
College of Electrical and Information Engineering, Hunan University, Changsha, China
College of Electrical and Information Engineering, Hunan University, Changsha, China
College of Electrical and Information Engineering, Hunan University, Changsha, China
Department of Mechanical Engineering, York University, Toronto, ON, Canada
In this work, a novel robust natural text recognition network (RNTR-Net) is proposed that combines a convolutional neural network (CNN) for feature extraction with a recurrent neural network (RNN) for sequence recognition. The pipeline design comprises an improved residual learning block combined with a general residual block to extract feature maps. Two bidirectional Long Short-Term Memory (LSTM) networks are used for sequence recognition, and a transcription layer is used for decoding. The proposed network can handle text images suffering from distortion or other degradations. Compared with previous algorithms, we achieve superior results on general datasets, including the IIIT-5K, Street View Text, and ICDAR datasets. Moreover, the presented network is highly competitive or even state-of-the-art on the highly challenging SVT-Perspective and CUTE80 datasets. We obtain a performance of 84.7% and 62.6% on the lexicon-free IIIT-5K and CUTE80 datasets, respectively. The experimental results demonstrate the effectiveness of our network.
https://ieeexplore.ieee.org/document/8950043/
Robust natural text recognition network
CNN
residual learning
bidirectional LSTMs
collection DOAJ
language English
format Article
sources DOAJ
author Qiaokang Liang
Shao Xiang
Yaonan Wang
Wei Sun
Dan Zhang
title RNTR-Net: A Robust Natural Text Recognition Network
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description In this work, a novel robust natural text recognition network (RNTR-Net) is proposed that combines a convolutional neural network (CNN) for feature extraction with a recurrent neural network (RNN) for sequence recognition. The pipeline design comprises an improved residual learning block combined with a general residual block to extract feature maps. Two bidirectional Long Short-Term Memory (LSTM) networks are used for sequence recognition, and a transcription layer is used for decoding. The proposed network can handle text images suffering from distortion or other degradations. Compared with previous algorithms, we achieve superior results on general datasets, including the IIIT-5K, Street View Text, and ICDAR datasets. Moreover, the presented network is highly competitive or even state-of-the-art on the highly challenging SVT-Perspective and CUTE80 datasets. We obtain a performance of 84.7% and 62.6% on the lexicon-free IIIT-5K and CUTE80 datasets, respectively. The experimental results demonstrate the effectiveness of our network.
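The description mentions a transcription layer that decodes the recurrent network's per-frame outputs into a text string, but this record does not say how that layer works. The sketch below illustrates one common choice, greedy best-path decoding with a blank symbol (CTC-style), purely as an assumption; the alphabet, the blank index, and the function names `greedy_decode` and `one_hot` are hypothetical and are not taken from the article.

```python
from typing import List, Sequence

# Assumed conventions (not from the article): index 0 is the blank symbol,
# shown here as "-", followed by lowercase letters and digits.
BLANK = 0
ALPHABET = "-abcdefghijklmnopqrstuvwxyz0123456789"


def greedy_decode(frame_scores: Sequence[Sequence[float]],
                  alphabet: str = ALPHABET,
                  blank: int = BLANK) -> str:
    """Best-path decoding: take the top class per frame, collapse
    consecutive repeats, then drop blanks."""
    # Index of the highest score in each frame.
    best = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    chars: List[str] = []
    prev = None
    for idx in best:
        # Emit a character only when the class changes and is not blank.
        if idx != prev and idx != blank:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)


def one_hot(idx: int, size: int = len(ALPHABET)) -> List[float]:
    """Hypothetical helper: a frame whose score mass sits on one class."""
    return [1.0 if i == idx else 0.0 for i in range(size)]


# Seven frames predicting t, t, blank, o, o, blank, o.
# The blank between the two runs of "o" keeps the doubled letter.
frames = [one_hot(ALPHABET.index(c)) for c in "tt-oo-o"]
print(greedy_decode(frames))  # too
```

In a full recognizer of the kind described, `frame_scores` would come from the bidirectional LSTM outputs, one score vector per horizontal position of the feature map; here they are hand-built so the example runs on its own.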
topic Robust natural text recognition network
CNN
residual learning
bidirectional LSTMs
url https://ieeexplore.ieee.org/document/8950043/