INVESTIGATIONS ON SKIP-CONNECTIONS WITH AN ADDITIONAL COSINE SIMILARITY LOSS FOR LAND COVER CLASSIFICATION

Bibliographic Details
Main Authors: C. Yang, F. Rottensteiner, C. Heipke
Format: Article
Language: English
Published: Copernicus Publications 2020-08-01
Series: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Online Access: https://www.isprs-ann-photogramm-remote-sens-spatial-inf-sci.net/V-3-2020/339/2020/isprs-annals-V-3-2020-339-2020.pdf
Description
Summary: Pixel-based land cover classification of aerial images is a standard task in remote sensing, whose goal is to identify the physical material of the earth’s surface. Most recent well-performing methods rely on convolutional neural networks (CNN) with an encoder-decoder structure. In the encoder, many successive convolution and pooling operations are applied to obtain features at a lower spatial resolution; in the decoder, these features are up-sampled gradually, layer by layer, to make predictions at the original spatial resolution. However, the loss of spatial resolution caused by pooling degrades the final classification performance, which is compensated for by skip-connections between corresponding features in the encoder and the decoder. The most popular ways to combine features are element-wise addition of feature maps and 1x1 convolution. In this work, we investigate skip-connections. We argue that not all skip-connections are equally important, and we therefore conducted experiments designed to find out which skip-connections are important. Moreover, we propose a new cosine similarity loss function that exploits the relationship between the features of pixels belonging to the same category within one mini-batch, i.e. these features should be close in feature space. Our experiments show that the new cosine similarity loss does help the classification. We evaluated our methods using the Vaihingen and Potsdam datasets of the ISPRS 2D semantic labelling challenge and achieved an overall accuracy of 91.1% for both test sites.
ISSN: 2194-9042, 2194-9050
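
Illustration: The summary describes a cosine similarity loss that pulls together the features of pixels of the same category within one mini-batch. The record does not give the exact formulation, so the following is only a minimal sketch under stated assumptions: the loss is taken to be the mean of (1 - cosine similarity) over all same-class feature pairs, and the names `cosine_similarity` and `same_class_cosine_loss` are hypothetical, not taken from the article.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def same_class_cosine_loss(features, labels):
    """Assumed form of the loss: mean of (1 - cos) over same-class pairs.

    features: list of per-pixel feature vectors from one mini-batch
    labels:   list of class labels, one per feature vector
    Returns 0.0 when the batch contains no same-class pair.
    """
    total, count = 0.0, 0
    for (f_i, y_i), (f_j, y_j) in combinations(zip(features, labels), 2):
        if y_i == y_j:
            total += 1.0 - cosine_similarity(f_i, f_j)
            count += 1
    return total / count if count else 0.0
```

Under this assumed form, same-class features that point in the same direction contribute zero loss, and the contribution grows as the angle between them widens. In a real network, the same quantity would be computed on tensors and added to the classification loss with a weighting factor, which is not specified in this record.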