ECRU: An Encoder-Decoder Based Convolution Neural Network (CNN) for Road-Scene Understanding

This research presents the idea of a novel fully-Convolutional Neural Network (CNN)-based model for probabilistic pixel-wise segmentation, titled Encoder-decoder-based CNN for Road-Scene Understanding (ECRU). Lately, scene understanding has become an evolving research area, and semantic segmentation...

Full description

Bibliographic Details
Main Author: Robail Yasrab
Format: Article
Language:English
Published: MDPI AG 2018-10-01
Series:Journal of Imaging
Subjects:
Online Access:http://www.mdpi.com/2313-433X/4/10/116
id doaj-66a1d0508f2246a5a8410a719963e03d
record_format Article
spelling doaj-66a1d0508f2246a5a8410a719963e03d2020-11-24T21:08:45ZengMDPI AGJournal of Imaging2313-433X2018-10-0141011610.3390/jimaging4100116jimaging4100116ECRU: An Encoder-Decoder Based Convolution Neural Network (CNN) for Road-Scene UnderstandingRobail Yasrab0Computer Vision Laboratory, School of Computer Science, University of Nottingham, Nottingham NG8 1BB, UKThis research presents the idea of a novel fully-Convolutional Neural Network (CNN)-based model for probabilistic pixel-wise segmentation, titled Encoder-decoder-based CNN for Road-Scene Understanding (ECRU). Lately, scene understanding has become an evolving research area, and semantic segmentation is the most recent method for visual recognition. Among vision-based smart systems, the driving assistance system turns out to be a much preferred research topic. The proposed model is an encoder-decoder that performs pixel-wise class predictions. The encoder network is composed of a VGG-19 layer model, while the decoder network uses 16 upsampling and deconvolution units. The encoder of the network has a very flexible architecture that can be altered and trained for any size and resolution of images. The decoder network upsamples and maps the low-resolution encoder’s features. Consequently, there is a substantial reduction in the trainable parameters, as the network recycles the encoder’s pooling indices for pixel-wise classification and segmentation. The proposed model is intended to offer a simplified CNN model with less overhead and higher performance. The network is trained and tested on the famous road scenes dataset CamVid and offers outstanding outcomes in comparison to similar early approaches like FCN and VGG16 in terms of performance vs. trainable parameters.http://www.mdpi.com/2313-433X/4/10/116convolutional neural network (CNN)ReLUencoder-decoderCamVidpoolingsemantic segmentationVGG-19ADAS
collection DOAJ
language English
format Article
sources DOAJ
author Robail Yasrab
spellingShingle Robail Yasrab
ECRU: An Encoder-Decoder Based Convolution Neural Network (CNN) for Road-Scene Understanding
Journal of Imaging
convolutional neural network (CNN)
ReLU
encoder-decoder
CamVid
pooling
semantic segmentation
VGG-19
ADAS
author_facet Robail Yasrab
author_sort Robail Yasrab
title ECRU: An Encoder-Decoder Based Convolution Neural Network (CNN) for Road-Scene Understanding
title_short ECRU: An Encoder-Decoder Based Convolution Neural Network (CNN) for Road-Scene Understanding
title_full ECRU: An Encoder-Decoder Based Convolution Neural Network (CNN) for Road-Scene Understanding
title_fullStr ECRU: An Encoder-Decoder Based Convolution Neural Network (CNN) for Road-Scene Understanding
title_full_unstemmed ECRU: An Encoder-Decoder Based Convolution Neural Network (CNN) for Road-Scene Understanding
title_sort ecru: an encoder-decoder based convolution neural network (cnn) for road-scene understanding
publisher MDPI AG
series Journal of Imaging
issn 2313-433X
publishDate 2018-10-01
description This research presents the idea of a novel fully-Convolutional Neural Network (CNN)-based model for probabilistic pixel-wise segmentation, titled Encoder-decoder-based CNN for Road-Scene Understanding (ECRU). Lately, scene understanding has become an evolving research area, and semantic segmentation is the most recent method for visual recognition. Among vision-based smart systems, the driving assistance system turns out to be a much preferred research topic. The proposed model is an encoder-decoder that performs pixel-wise class predictions. The encoder network is composed of a VGG-19 layer model, while the decoder network uses 16 upsampling and deconvolution units. The encoder of the network has a very flexible architecture that can be altered and trained for any size and resolution of images. The decoder network upsamples and maps the low-resolution encoder’s features. Consequently, there is a substantial reduction in the trainable parameters, as the network recycles the encoder’s pooling indices for pixel-wise classification and segmentation. The proposed model is intended to offer a simplified CNN model with less overhead and higher performance. The network is trained and tested on the famous road scenes dataset CamVid and offers outstanding outcomes in comparison to similar early approaches like FCN and VGG16 in terms of performance vs. trainable parameters.
topic convolutional neural network (CNN)
ReLU
encoder-decoder
CamVid
pooling
semantic segmentation
VGG-19
ADAS
url http://www.mdpi.com/2313-433X/4/10/116
work_keys_str_mv AT robailyasrab ecruanencoderdecoderbasedconvolutionneuralnetworkcnnforroadsceneunderstanding
_version_ 1716759526507544576