A multi-scale contextual attention network for remote sensing visual question answering
Remote sensing visual question answering (RSVQA) is a user-friendly method used for analyzing remote sensing images (RSIs) in various tasks. However, current methods often overlook geospatial objects, which possess a multi-scale representation and require contextual information. Furthermore, limited...
| Published in: | International Journal of Applied Earth Observations and Geoinformation |
|---|---|
| Main Authors: | Jiangfan Feng, Hui Wang |
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2024-02-01 |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S156984322300465X |
Similar Items
Improving visual question answering for remote sensing via alternate-guided attention and combined loss
by: Jiangfan Feng, et al.
Published: (2023-08-01)
Review of Visual Question Answering Technology
by: WANG Yu, SUN Haichun
Published: (2023-07-01)
RSMoDM: Multimodal Momentum Distillation Model for Remote Sensing Visual Question Answering
by: Pengfei Li, et al.
Published: (2024-01-01)
Multilingual visual question answering for visually impaired people
by: Ratnabali Pal, et al.
Published: (2025-08-01)
A Comprehensive Review and Open Challenges on Visual Question Answering Models
by: Fasi Ahamad Shaik, et al.
Published: (2023-09-01)
Medical Visual Question Answering Based on Cross-Modal Attention Feature Enhancement
by: LIU Kai, REN Hongyi, LI Ying, JI Yi, LIU Chunping
Published: (2025-06-01)
Answer Distillation Network With Bi-Text-Image Attention for Medical Visual Question Answering
by: Hongfang Gong, et al.
Published: (2025-01-01)
PERS: Parameter-Efficient Multimodal Transfer Learning for Remote Sensing Visual Question Answering
by: Jinlong He, et al.
Published: (2024-01-01)
SBVQA 2.0: Robust End-to-End Speech-Based Visual Question Answering for Open-Ended Questions
by: Faris Alasmary, et al.
Published: (2023-01-01)
Unified Transformer with Cross-Modal Mixture Experts for Remote-Sensing Visual Question Answering
by: Gang Liu, et al.
Published: (2023-09-01)
Multi-Module Co-Attention Model for Visual Question Answering
by: ZOU Pinrong, XIAO Feng, ZHANG Wenjuan, ZHANG Wanyu, WANG Chenyang
Published: (2022-02-01)
A Semantic Weight Adaptive Model Based on Visual Question Answering
by: Li Huimin, et al.
Published: (2025-01-01)
Adaptive Conditional Reasoning for Remote Sensing Visual Question Answering
by: Yiqun Gao, et al.
Published: (2025-04-01)
Envisioning Answers: Unleashing Deep Learning for Visual Question Answering in Artistic Images
by: Erfan Zolghadriha, et al.
Published: (2024-03-01)
Designing and Evaluating a Dual-Stream Transformer-Based Architecture for Visual Question Answering
by: Faheem Shehzad, et al.
Published: (2024-01-01)
Co-LLaVA: Efficient Remote Sensing Visual Question Answering via Model Collaboration
by: Fan Liu, et al.
Published: (2025-01-01)
ECSA: Mitigating Catastrophic Forgetting and Few-Shot Generalization in Medical Visual Question Answering
by: Qinhao Jia, et al.
Published: (2025-10-01)
Zero-Shot Knowledge-Based Visual Question Answering with Frozen Language Models
by: Jing Liu, et al.
Published: (2025-12-01)
Seeing and Reasoning: A Simple Deep Learning Approach to Visual Question Answering
by: Rufai Yusuf Zakari, et al.
Published: (2025-04-01)
A Multi-Modal Attentive Framework That Can Interpret Text (MMAT)
by: Vijay Kumari, et al.
Published: (2025-01-01)
Intelligent visual question answering in TCM education: An innovative application of IoT and multimodal fusion
by: Wei Bi, et al.
Published: (2025-04-01)
RS-LLaVA: A Large Vision-Language Model for Joint Captioning and Question Answering in Remote Sensing Imagery
by: Yakoub Bazi, et al.
Published: (2024-04-01)
Survey of Visual Question Answering Based on Deep Learning
by: LI Xiang, FAN Zhiguang, LI Xuexiang, ZHANG Weixing, YANG Cong, CAO Yangjie
Published: (2023-05-01)
Deep Modular Bilinear Attention Network for Visual Question Answering
by: Feng Yan, et al.
Published: (2022-01-01)
Adversarial Learning with Bidirectional Attention for Visual Question Answering
by: Qifeng Li, et al.
Published: (2021-10-01)
PTCR: Knowledge-Based Visual Question Answering Framework Based on Large Language Model
by: XUE Di, LI Xin, LIU Mingshuai
Published: (2024-11-01)
SAR Strikes Back: A New Hope for RSVQA
by: Lucrezia Tosato, et al.
Published: (2025-01-01)
Cross-modal Information Filtering-based Networks for Visual Question Answering
by: HE Shiyang, WANG Zhaohui, GONG Shengrong, ZHONG Shan
Published: (2024-05-01)
Adaptable Closed-Domain Question Answering Using Contextualized CNN-Attention Models and Question Expansion
by: Mahsa Abazari Kia, et al.
Published: (2022-01-01)
Multi-Modal Explicit Sparse Attention Networks for Visual Question Answering
by: Zihan Guo, et al.
Published: (2020-11-01)
Machine-to-Machine Visual Dialoguing with ChatGPT for Enriched Textual Image Description
by: Riccardo Ricci, et al.
Published: (2024-01-01)
Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion
by: Junkai Zhang, et al.
Published: (2025-04-01)
Visual Question Answering Model Based on Multi-modal Deep Feature Fusion
by: ZOU Yunzhu, DU Shengdong, TENG Fei, LI Tianrui
Published: (2023-02-01)
MSAM: Video Question Answering Based on Multi-Stage Attention Model
by: LIANG Li-li, et al.
Published: (2022-08-01)
Object sequences: encoding categorical and spatial information for a yes/no visual question answering task
by: Shivam Garg, et al.
Published: (2018-12-01)
Towards Robust Chain-of-Thought Prompting with Self-Consistency for Remote Sensing VQA: An Empirical Study Across Large Multimodal Models
by: Fatema Tuj Johora Faria, et al.
Published: (2025-09-01)
Survey of Multimodal Medical Question Answering
by: Hilmi Demirhan, et al.
Published: (2023-12-01)
Prompting Large Language Models with Knowledge-Injection for Knowledge-Based Visual Question Answering
by: Zhongjian Hu, et al.
Published: (2024-09-01)
Dual modality prompt learning for visual question-grounded answering in robotic surgery
by: Yue Zhang, et al.
Published: (2024-04-01)
Question Difficulty Estimation Based on Attention Model for Question Answering
by: Hyun-Je Song, et al.
Published: (2021-12-01)
