Cross-Lingual Visual Grounding

Visual grounding is a vision-and-language understanding task that aims to locate a region in an image according to a specific query phrase. However, most previous studies address this task only for English. Although there are previous cross-lingual vision and language studies, they work...

Bibliographic Details
Main Authors: Wenjian Dong, Mayu Otani, Noa Garcia, Yuta Nakashima, Chenhui Chu
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9305199/