Combining OpenStreetMap with Satellite Imagery to Enhance Cross-View Geo-Localization
Cross-view geo-localization (CVGL) aims to determine the capture location of street-view images by matching them with corresponding 2D maps, such as satellite imagery. While recent bird’s eye view (BEV)-based methods have advanced this task by addressing viewpoint and appearance differences, the exi...
| Published in: | Sensors |
|---|---|
| Main Authors: | Yuekun Hu, Yingfan Liu, Bin Hui |
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-12-01 |
| Subjects: | cross-view geo-localization; OpenStreetMap; satellite imagery; data fusion |
| Online Access: | https://www.mdpi.com/1424-8220/25/1/44 |
| _version_ | 1849559393576681472 |
|---|---|
| author | Yuekun Hu; Yingfan Liu; Bin Hui |
| collection | DOAJ |
| container_title | Sensors |
| description | Cross-view geo-localization (CVGL) aims to determine the capture location of street-view images by matching them with corresponding 2D maps, such as satellite imagery. While recent bird’s eye view (BEV)-based methods have advanced this task by addressing viewpoint and appearance differences, the existing approaches typically rely solely on either OpenStreetMap (OSM) data or satellite imagery, limiting localization robustness due to single-modality constraints. This paper presents a novel CVGL method that fuses OSM data with satellite imagery, leveraging their complementary strengths to enhance localization robustness. We integrate the semantic richness and structural information from OSM with the high-resolution visual details of satellite imagery, creating a unified 2D geospatial representation. Additionally, we employ a transformer-based BEV perception module that utilizes attention mechanisms to construct fine-grained BEV features from street-view images for matching with fused map features. Compared to state-of-the-art methods that utilize only OSM data, our approach achieves substantial improvements, with 12.05% and 12.06% recall enhancements on the KITTI benchmark for lateral and longitudinal localization within a 1-m error, respectively. |
| format | Article |
| id | doaj-art-e9ade4e2cb0645edbade9004f3b26aca |
| institution | Directory of Open Access Journals |
| issn | 1424-8220 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | MDPI AG |
| record_format | Article |
| spelling | doaj-art-e9ade4e2cb0645edbade9004f3b26aca; 2025-08-20T02:36:08Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2024-12-01; vol. 25, no. 1, art. 44; DOI 10.3390/s25010044; Combining OpenStreetMap with Satellite Imagery to Enhance Cross-View Geo-Localization; Yuekun Hu, Yingfan Liu, Bin Hui (all: Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016, China); https://www.mdpi.com/1424-8220/25/1/44; cross-view geo-localization; OpenStreetMap; satellite imagery; data fusion |
| title | Combining OpenStreetMap with Satellite Imagery to Enhance Cross-View Geo-Localization |
| topic | cross-view geo-localization; OpenStreetMap; satellite imagery; data fusion |
| url | https://www.mdpi.com/1424-8220/25/1/44 |
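
The description reports recall for lateral and longitudinal localization within a 1-m error. As a rough illustration of how such a metric is commonly computed, the sketch below projects a 2D position error onto the vehicle's forward and left axes and counts the fraction of samples within a threshold. This is an assumption about the usual convention, not the authors' evaluation code; the function names `split_error` and `recall_within` and the sample values are hypothetical.

```python
import math


def split_error(pred_xy, true_xy, heading_rad):
    """Split a 2D position error into (lateral, longitudinal) parts.

    Assumption: heading_rad is the vehicle heading measured
    counter-clockwise from the +x axis of the map frame.
    """
    dx = pred_xy[0] - true_xy[0]
    dy = pred_xy[1] - true_xy[1]
    cos_h, sin_h = math.cos(heading_rad), math.sin(heading_rad)
    # Longitudinal: projection onto the forward axis (cos, sin).
    longitudinal = dx * cos_h + dy * sin_h
    # Lateral: projection onto the left axis (-sin, cos).
    lateral = -dx * sin_h + dy * cos_h
    return lateral, longitudinal


def recall_within(errors_m, threshold_m=1.0):
    """Fraction of samples whose absolute error is at most threshold_m."""
    if not errors_m:
        raise ValueError("need at least one sample")
    hits = sum(1 for e in errors_m if abs(e) <= threshold_m)
    return hits / len(errors_m)
```

With this convention, lateral and longitudinal recall are obtained by collecting the two error components over all test frames and calling `recall_within` on each list separately with a 1.0 m threshold.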
