Joint Multimodal Embedding and Backtracking Search in Vision-and-Language Navigation
With recent advances in computer vision and natural language processing, interest has grown in multimodal intelligent tasks that require concurrently understanding diverse forms of input data, such as images and text. Vision-and-language navig...
| Main Authors: | Jisu Hwang, Incheol Kim |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2021-02-01 |
| Series: | Sensors |
| Online Access: | https://www.mdpi.com/1424-8220/21/3/1012 |
Similar Items

- Vision–Language–Knowledge Co-Embedding for Visual Commonsense Reasoning
  by: JaeYun Lee, et al.
  Published: (2021-04-01)
- Delivering task instructions in multimodal synchronous online language teaching
  by: Müge Satar, et al.
  Published: (2020-09-01)
- Affect Analysis in Arabic Text: Further Pre-Training Language Models for Sentiment and Emotion
  by: Alothaim, A., et al.
  Published: (2023)
- The Evolution of Language Models Applied to Emotion Analysis of Arabic Tweets
  by: Nora Al-Twairesh
  Published: (2021-02-01)
- Vision/INS Integrated Navigation System for Poor Vision Navigation Environments
  by: Youngsun Kim, et al.
  Published: (2016-10-01)