Learning to Fuse Multiscale Features for Visual Place Recognition
Efficient and robust visual place recognition is of great importance to autonomous mobile robots. Recent work has shown that features learned from convolutional neural networks achieve impressive performance with efficient feature size, where most of them are pooled or aggregated from a convolutional...
Main Authors: | Jun Mao, Xiaoping Hu, Xiaofeng He, Lilian Zhang, Liao Wu, Michael J. Milford |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
Subjects: | Visual place recognition, deep learning, mobile robots, localization |
Online Access: | https://ieeexplore.ieee.org/document/8585013/ |
id |
doaj-b6f38de914b84263b22c9b11742cf4b1 |
---|---|
record_format |
Article |
spelling |
doaj-b6f38de914b84263b22c9b11742cf4b1; 2021-03-29T22:07:06Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; 7; 5723; 5735; 10.1109/ACCESS.2018.2889030; 8585013; Learning to Fuse Multiscale Features for Visual Place Recognition
Jun Mao (0), https://orcid.org/0000-0002-2477-0742; Xiaoping Hu (1); Xiaofeng He (2); Lilian Zhang (3); Liao Wu (4); Michael J. Milford (5)
Department of Automation, National University of Defense Technology, Changsha, China; Department of Automation, National University of Defense Technology, Changsha, China; Department of Automation, National University of Defense Technology, Changsha, China; Department of Automation, National University of Defense Technology, Changsha, China; School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, QLD, Australia; School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, QLD, Australia
Efficient and robust visual place recognition is of great importance to autonomous mobile robots. Recent work has shown that features learned from convolutional neural networks achieve impressive performance with efficient feature size, where most of them are pooled or aggregated from a convolutional feature map. However, convolutional filters only capture the appearance of their perceptive fields, which lack the considerations on how to combine the multiscale appearance for place recognition. In this paper, we propose a novel method to build a multiscale feature pyramid and present two approaches to use the pyramid to augment the place recognition capability. The first approach fuses the pyramid to obtain a new feature map, which has an awareness of both the local and semi-global appearance, and the second approach learns an attention model from the feature pyramid to weight the spatial grids on the original feature map. Both approaches combine the multiscale features in the pyramid to suppress the confusing local features while tackling the problem in two different ways. Extensive experiments have been conducted on benchmark datasets with varying degrees of appearance and viewpoint variations. The results show that the proposed approaches achieve superior performance over the networks without the multiscale feature fusion and the multiscale attention components. Analyses on the performance of using different feature pyramids are also provided.
https://ieeexplore.ieee.org/document/8585013/
Visual place recognition; deep learning; mobile robots; localization |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jun Mao, Xiaoping Hu, Xiaofeng He, Lilian Zhang, Liao Wu, Michael J. Milford |
author_sort |
Jun Mao |
title |
Learning to Fuse Multiscale Features for Visual Place Recognition |
title_sort |
learning to fuse multiscale features for visual place recognition |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2019-01-01 |
description |
Efficient and robust visual place recognition is of great importance to autonomous mobile robots. Recent work has shown that features learned from convolutional neural networks achieve impressive performance with efficient feature size, where most of them are pooled or aggregated from a convolutional feature map. However, convolutional filters only capture the appearance of their perceptive fields, which lack the considerations on how to combine the multiscale appearance for place recognition. In this paper, we propose a novel method to build a multiscale feature pyramid and present two approaches to use the pyramid to augment the place recognition capability. The first approach fuses the pyramid to obtain a new feature map, which has an awareness of both the local and semi-global appearance, and the second approach learns an attention model from the feature pyramid to weight the spatial grids on the original feature map. Both approaches combine the multiscale features in the pyramid to suppress the confusing local features while tackling the problem in two different ways. Extensive experiments have been conducted on benchmark datasets with varying degrees of appearance and viewpoint variations. The results show that the proposed approaches achieve superior performance over the networks without the multiscale feature fusion and the multiscale attention components. Analyses on the performance of using different feature pyramids are also provided. |
topic |
Visual place recognition; deep learning; mobile robots; localization |
url |
https://ieeexplore.ieee.org/document/8585013/ |