Mask-R-FCN: A Deep Fusion Network for Semantic Segmentation

Remote sensing image classification plays a significant role in urban applications, precision agriculture, and water resource management. The task of classification in remote sensing is to map raw images to semantic maps. The fully convolutional network (FCN) is one of the most effective deep neural networks for semantic segmentation. However, small objects in remote sensing images are easily overlooked and misclassified as the majority label, which is often the image background. Although many works have attempted to address this problem, trading off background semantics against edge details remains difficult, mainly because these approaches rely on a single neural network model. To address this, a region-based convolutional neural network (R-CNN), which is highly effective for object detection, is leveraged as a complementary component in our work. A learning-based, decision-level strategy fuses the semantic maps produced by a semantic segmentation model and an object detection model. The proposed network is referred to as Mask-R-FCN. Experimental results on real remote sensing images from the Zurich dataset, the Gaofen Image Dataset (GID), and DataFountain2017 show that the proposed network obtains higher accuracy than single deep neural networks and other machine learning algorithms, with average accuracies approximately 2% higher than those of any single deep neural network on all three datasets.
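The abstract gives no implementation details of the learning-based, decision-level fusion. Purely as an illustration of the general idea — combining a per-pixel probability map from a segmentation model with one rasterized from detection masks, then taking a per-pixel decision — here is a minimal sketch. All names are hypothetical, and the fixed weight `alpha` stands in for what the paper describes as a learned fusion strategy:

```python
import numpy as np

def fuse_decision_level(fcn_probs, det_probs, alpha=0.6):
    """Illustrative decision-level fusion of two per-pixel class-probability
    maps of shape (H, W, C): one from a semantic segmentation model and one
    rasterized from object-detection masks. `alpha` is a fixed stand-in for
    the learned fusion weighting described in the paper."""
    fused = alpha * fcn_probs + (1.0 - alpha) * det_probs
    return fused.argmax(axis=-1)  # final per-pixel class labels

# Toy example: a 2x2 image with 3 classes (class 0 = background).
fcn = np.zeros((2, 2, 3)); fcn[..., 0] = 1.0   # segmentation calls everything background
det = np.zeros((2, 2, 3)); det[..., 0] = 1.0
det[0, 0] = [0.1, 0.9, 0.0]                    # detector finds a small object at (0, 0)
labels = fuse_decision_level(fcn, det, alpha=0.4)
print(labels)  # the small object survives fusion; the rest stays background
```

With `alpha=0.4`, the detector's confident response at pixel (0, 0) outweighs the segmentation model's background vote, which is the behavior motivating the fusion: small objects that a single segmentation network would absorb into the background are recovered from the detection branch.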


Bibliographic Details
Main Authors: Yunfeng Zhang, Mingmin Chi
Format: Article
Language: English
Published: IEEE, 2020-01-01
Series: IEEE Access
Subjects: Deep fusion; deep semantic segmentation; fully convolutional network; object detection; remote sensing
Online Access:https://ieeexplore.ieee.org/document/9151932/
DOI: 10.1109/ACCESS.2020.3012701
ISSN: 2169-3536