FEATURE FUSION FOR CROSS-MODAL SCENE CLASSIFICATION OF REMOTE SENSING IMAGE

Scene classification plays an important role in the remote sensing field. Traditional approaches use high-resolution remote sensing images as the data source to extract powerful features. Although this kind of method is common, the model performance is severely affected by the image quality of the datase...


Bibliographic Details
Main Authors: W. Geng, W. Zhou, S. Jin
Format: Article
Language: English
Published: Copernicus Publications 2021-08-01
Series: The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Online Access: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLIV-M-3-2021/63/2021/isprs-archives-XLIV-M-3-2021-63-2021.pdf
id doaj-0d79093ab866400dba5154c9a98adcfe
record_format Article
spelling doaj-0d79093ab866400dba5154c9a98adcfe
2021-08-11T00:34:36Z
eng
Copernicus Publications
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
ISSN 1682-1750, 2194-9034
2021-08-01
Volume XLIV-M-3-2021, pages 63-66
DOI 10.5194/isprs-archives-XLIV-M-3-2021-63-2021
FEATURE FUSION FOR CROSS-MODAL SCENE CLASSIFICATION OF REMOTE SENSING IMAGE
W. Geng [0], W. Zhou [1], S. Jin [2], S. Jin [3]
[0] School of Remote Sensing and Geomatics Engineering, Nanjing University of Information Science and Technology, Nanjing, China
[1] School of Remote Sensing and Geomatics Engineering, Nanjing University of Information Science and Technology, Nanjing, China
[2] School of Remote Sensing and Geomatics Engineering, Nanjing University of Information Science and Technology, Nanjing, China
[3] Shanghai Astronomical Observatory, Chinese Academy of Sciences, Shanghai, China
collection DOAJ
language English
format Article
sources DOAJ
author W. Geng
W. Zhou
S. Jin
S. Jin
title FEATURE FUSION FOR CROSS-MODAL SCENE CLASSIFICATION OF REMOTE SENSING IMAGE
publisher Copernicus Publications
series The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
issn 1682-1750
2194-9034
publishDate 2021-08-01
description Scene classification plays an important role in the remote sensing field. Traditional approaches use high-resolution remote sensing images as the data source to extract powerful features. Although this kind of method is common, model performance is severely affected by the image quality of the dataset, and relying on a single modality (source) of images tends to cause the loss of some scene semantic information, which eventually degrades classification accuracy. Multi-modal remote sensing data have become easy to obtain with the development of remote sensing technology, and how to carry out scene classification on cross-modal data has become an interesting topic in the field. To address these problems, this paper proposes using feature fusion for cross-modal scene classification of remote sensing images, i.e., aerial and ground street view images, so that the advantages of aerial images and ground street view data complement each other. Our cross-modal model is based on a Siamese network. Specifically, we first train the cross-modal model on pairs of data from different sources, each pair consisting of an aerial image and ground data. Then, the trained model is used to extract the deep features of each aerial and ground image pair, and the features of the two perspectives are fused to train an SVM classifier for scene classification. Our approach is demonstrated on two public benchmark datasets, AiRound and CV-BrCT. The preliminary results show that the proposed method achieves state-of-the-art performance compared with traditional methods, indicating that information from ground data can contribute to aerial image classification.
url https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XLIV-M-3-2021/63/2021/isprs-archives-XLIV-M-3-2021-63-2021.pdf
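
The second stage summarized in the description (fuse the features of the aerial and ground views, then train an SVM on the fused vectors) can be illustrated with a small sketch. This is not the authors' implementation, and several parts are assumptions made only for illustration: fusion is done by simple concatenation (the record does not name the fusion operator), the "deep features" are synthetic stand-in vectors rather than outputs of a trained Siamese network, the problem is reduced to two classes, and the SVM is a minimal linear one trained by stochastic subgradient descent on the hinge loss. The function names `fuse`, `train_linear_svm`, `predict`, and `make_pair` are hypothetical.

```python
import random


def fuse(aerial_feat, ground_feat):
    """Fuse two feature vectors by concatenation (an assumed fusion operator)."""
    return list(aerial_feat) + list(ground_feat)


def train_linear_svm(X, y, epochs=20, lr=0.1, lam=0.01, seed=0):
    """Train a binary linear SVM (labels in {-1, +1}) with hinge-loss SGD.

    A constant 1 is appended to every sample so the bias is learned as
    the last weight and is regularized together with the other weights.
    """
    rng = random.Random(seed)
    Xa = [x + [1.0] for x in X]
    w = [0.0] * len(Xa[0])
    order = list(range(len(Xa)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, Xa[i]))
            # L2 regularization: shrink the weights a little at every step.
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Hinge loss is active only when the margin is below 1.
            if margin < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, Xa[i])]
    return w


def predict(w, x):
    """Return +1 or -1 for a fused feature vector x."""
    score = sum(wj * xj for wj, xj in zip(w, x + [1.0]))
    return 1 if score >= 0 else -1


def make_pair(label, rng):
    """Synthetic stand-ins for deep features: 3-d aerial, 2-d ground."""
    aerial = [2.0 * label + rng.uniform(-0.5, 0.5) for _ in range(3)]
    ground = [2.0 * label + rng.uniform(-0.5, 0.5) for _ in range(2)]
    return aerial, ground


rng = random.Random(42)
labels = [1, -1] * 20  # two hypothetical scene classes
fused = [fuse(*make_pair(lab, rng)) for lab in labels]
w = train_linear_svm(fused, labels)
accuracy = sum(predict(w, x) == lab for x, lab in zip(fused, labels)) / len(labels)
print(f"fused dimension: {len(fused[0])}, training accuracy: {accuracy:.2f}")
```

In a real pipeline the synthetic vectors would be replaced by features extracted from each branch of the trained cross-modal network, and a multi-class SVM from an established library would normally be used instead of this hand-written two-class version.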