Learnable Gated Convolutional Neural Network for Semantic Segmentation in Remote-Sensing Images

Semantic segmentation in high-resolution remote-sensing (RS) images is a fundamental task for RS-based urban understanding and planning. However, various types of artificial objects in urban areas make this task quite challenging. Recently, the use of Deep Convolutional Neural Networks (DCNNs) with...

Bibliographic Details
Main Authors:	Shichen Guo, Qizhao Jin, Hongzhen Wang, Xuezhi Wang, Yangang Wang, Shiming Xiang
Format:	Article
Language:	English
Published:	MDPI AG 2019-08-01
Series:	Remote Sensing
Subjects:	semantic segmentation CNN deep learning remote sensing gate function multiscale feature fusion
Online Access:	https://www.mdpi.com/2072-4292/11/16/1922

id	doaj-1aba6007f7cf4ea9b6938e7c266183e6
record_format	Article
spelling	doaj-1aba6007f7cf4ea9b6938e7c266183e62020-11-25T01:23:19ZengMDPI AGRemote Sensing2072-42922019-08-011116192210.3390/rs11161922rs11161922Learnable Gated Convolutional Neural Network for Semantic Segmentation in Remote-Sensing ImagesShichen Guo0Qizhao Jin1Hongzhen Wang2Xuezhi Wang3Yangang Wang4Shiming Xiang5Computer Network Information Center, Chinese Academy of Sciences, 4 Zhongguancun Nansi Street, Beijing 100190, ChinaNational Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 95# Zhongguancun East Road, Beijing 100190, ChinaNational Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 95# Zhongguancun East Road, Beijing 100190, ChinaComputer Network Information Center, Chinese Academy of Sciences, 4 Zhongguancun Nansi Street, Beijing 100190, ChinaComputer Network Information Center, Chinese Academy of Sciences, 4 Zhongguancun Nansi Street, Beijing 100190, ChinaNational Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 95# Zhongguancun East Road, Beijing 100190, ChinaSemantic segmentation in high-resolution remote-sensing (RS) images is a fundamental task for RS-based urban understanding and planning. However, various types of artificial objects in urban areas make this task quite challenging. Recently, the use of Deep Convolutional Neural Networks (DCNNs) with multiscale information fusion has demonstrated great potential in enhancing performance. Technically, however, existing fusions are usually implemented by summing or concatenating feature maps in a straightforward way. Seldom do works consider the spatial importance for global-to-local context-information aggregation. This paper proposes a Learnable-Gated CNN (L-GCNN) to address this issue. 
Methodologically, the Taylor expression of the information-entropy function is first parameterized to design the gate function, which is employed to generate pixelwise weights for coarse-to-fine refinement in the L-GCNN. Accordingly, a Parameterized Gate Module (PGM) was designed to achieve this goal. Then, the single PGM and its densely connected extension were embedded into different levels of the encoder in the L-GCNN to help identify the discriminative feature maps at different scales. With the above designs, the L-GCNN is finally organized as a self-cascaded end-to-end architecture that is able to sequentially aggregate context information for fine segmentation. The proposed model was evaluated on two challenging public benchmarks, the ISPRS 2D semantic segmentation challenge Potsdam dataset and the Massachusetts building dataset. The experimental results demonstrate that the proposed method exhibited significant improvement compared with several related segmentation networks, including the FCN, SegNet, RefineNet, PSPNet, DeepLab and GSN. For example, on the Potsdam dataset, our method achieved a 93.65% F1 score and 88.06% IoU score for the segmentation of tiny cars in high-resolution RS images. In conclusion, the proposed model showed potential for segmenting buildings, impervious surfaces, low vegetation, trees and cars from RS images of urban settings, objects which vary widely in size and have confusing appearances.https://www.mdpi.com/2072-4292/11/16/1922semantic segmentationCNNdeep learningremote sensinggate functionmultiscale feature fusion
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Shichen Guo Qizhao Jin Hongzhen Wang Xuezhi Wang Yangang Wang Shiming Xiang
spellingShingle	Shichen Guo Qizhao Jin Hongzhen Wang Xuezhi Wang Yangang Wang Shiming Xiang Learnable Gated Convolutional Neural Network for Semantic Segmentation in Remote-Sensing Images Remote Sensing semantic segmentation CNN deep learning remote sensing gate function multiscale feature fusion
author_facet	Shichen Guo Qizhao Jin Hongzhen Wang Xuezhi Wang Yangang Wang Shiming Xiang
author_sort	Shichen Guo
title	Learnable Gated Convolutional Neural Network for Semantic Segmentation in Remote-Sensing Images
title_short	Learnable Gated Convolutional Neural Network for Semantic Segmentation in Remote-Sensing Images
title_full	Learnable Gated Convolutional Neural Network for Semantic Segmentation in Remote-Sensing Images
title_fullStr	Learnable Gated Convolutional Neural Network for Semantic Segmentation in Remote-Sensing Images
title_full_unstemmed	Learnable Gated Convolutional Neural Network for Semantic Segmentation in Remote-Sensing Images
title_sort	learnable gated convolutional neural network for semantic segmentation in remote-sensing images
publisher	MDPI AG
series	Remote Sensing
issn	2072-4292
publishDate	2019-08-01
description	Semantic segmentation in high-resolution remote-sensing (RS) images is a fundamental task for RS-based urban understanding and planning. However, various types of artificial objects in urban areas make this task quite challenging. Recently, the use of Deep Convolutional Neural Networks (DCNNs) with multiscale information fusion has demonstrated great potential in enhancing performance. Technically, however, existing fusions are usually implemented by summing or concatenating feature maps in a straightforward way. Seldom do works consider the spatial importance for global-to-local context-information aggregation. This paper proposes a Learnable-Gated CNN (L-GCNN) to address this issue. Methodologically, the Taylor expression of the information-entropy function is first parameterized to design the gate function, which is employed to generate pixelwise weights for coarse-to-fine refinement in the L-GCNN. Accordingly, a Parameterized Gate Module (PGM) was designed to achieve this goal. Then, the single PGM and its densely connected extension were embedded into different levels of the encoder in the L-GCNN to help identify the discriminative feature maps at different scales. With the above designs, the L-GCNN is finally organized as a self-cascaded end-to-end architecture that is able to sequentially aggregate context information for fine segmentation. The proposed model was evaluated on two challenging public benchmarks, the ISPRS 2D semantic segmentation challenge Potsdam dataset and the Massachusetts building dataset. The experimental results demonstrate that the proposed method exhibited significant improvement compared with several related segmentation networks, including the FCN, SegNet, RefineNet, PSPNet, DeepLab and GSN. For example, on the Potsdam dataset, our method achieved a 93.65% F1 score and 88.06% IoU score for the segmentation of tiny cars in high-resolution RS images. In conclusion, the proposed model showed potential for segmenting buildings, impervious surfaces, low vegetation, trees and cars from RS images of urban settings, objects which vary widely in size and have confusing appearances.
topic	semantic segmentation CNN deep learning remote sensing gate function multiscale feature fusion
url	https://www.mdpi.com/2072-4292/11/16/1922
work_keys_str_mv	AT shichenguo learnablegatedconvolutionalneuralnetworkforsemanticsegmentationinremotesensingimages AT qizhaojin learnablegatedconvolutionalneuralnetworkforsemanticsegmentationinremotesensingimages AT hongzhenwang learnablegatedconvolutionalneuralnetworkforsemanticsegmentationinremotesensingimages AT xuezhiwang learnablegatedconvolutionalneuralnetworkforsemanticsegmentationinremotesensingimages AT yangangwang learnablegatedconvolutionalneuralnetworkforsemanticsegmentationinremotesensingimages AT shimingxiang learnablegatedconvolutionalneuralnetworkforsemanticsegmentationinremotesensingimages
_version_	1725123064735203328

Similar Items