Object-Level Visual-Text Correlation Graph Hashing for Unsupervised Cross-Modal Retrieval

Bibliographic Details
Main Authors: Chen, Y. (Author), Li, F. (Author), Shi, G. (Author), Wu, L. (Author)
Format: Article
Language: English
Published: MDPI 2022
Subjects:
Online Access: View Fulltext in Publisher
LEADER 02923nam a2200385Ia 4500
001 10-3390-s22082921
008 220425s2022 CNT 000 0 eng d
022 |a 1424-8220 (ISSN) 
245 1 0 |a Object-Level Visual-Text Correlation Graph Hashing for Unsupervised Cross-Modal Retrieval 
260 0 |b MDPI  |c 2022 
856 |z View Fulltext in Publisher  |u https://doi.org/10.3390/s22082921 
520 3 |a The core of cross-modal hashing methods is to map high-dimensional features into binary hash codes, so that the Hamming distance metric can be used for efficient retrieval. Recent developments emphasize the advantages of unsupervised cross-modal hashing, since it relies only on the relevance of paired data, making it more applicable to real-world scenarios. However, two problems, namely intra-modality correlation and inter-modality correlation, have still not been fully considered. Intra-modality correlation describes the complex overall concept of a single modality and provides semantic relevance for retrieval tasks, while inter-modality correlation refers to the relationship between different modalities. From our observation and hypothesis, the dependency relationships within a modality and between modalities can be constructed at the object level, which can further improve cross-modal hashing retrieval accuracy. To this end, we propose an Object-level Visual-text Correlation Graph Hashing (OVCGH) approach to mine the fine-grained object-level similarity in cross-modal data while suppressing noise interference. Specifically, a novel intra-modality correlation graph is designed to learn graph-level representations of different modalities, capturing the dependency relationships between image regions and between text tags in an unsupervised manner. Then, we design a visual-text dependency building module that captures correlated semantic information across modalities by modeling the dependency relationship between image object regions and text tags. Extensive experiments on two widely used datasets verify the effectiveness of the proposed approach. © 2022 by the authors. Licensee MDPI, Basel, Switzerland. 
650 0 4 |a Correlation graphs 
650 0 4 |a Cross-modal 
650 0 4 |a Cross-modal hash learning 
650 0 4 |a Deep learning 
650 0 4 |a Deep model 
650 0 4 |a Dependency relationship 
650 0 4 |a Hamming distance 
650 0 4 |a Hash functions 
650 0 4 |a Hashing method 
650 0 4 |a Hashing retrieval 
650 0 4 |a Higher dimensional features 
650 0 4 |a Image regions 
650 0 4 |a Intermodality 
650 0 4 |a Modal analysis 
650 0 4 |a Semantics 
700 1 |a Chen, Y.  |e author 
700 1 |a Li, F.  |e author 
700 1 |a Shi, G.  |e author 
700 1 |a Wu, L.  |e author 
773 |t Sensors
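
The 520 abstract above names three concrete mechanisms: object-level intra-modality correlation graphs, a visual-text dependency module linking image regions to text tags, and Hamming-distance retrieval over binary hash codes. No code accompanies this record, so the following is a minimal NumPy sketch of those ideas, not the authors' OVCGH implementation; the cosine-similarity graph construction, the 0.5 edge threshold, the softmax dependency weights, the sign-based binarization, and all array shapes are illustrative assumptions.

    # Minimal sketch (illustrative assumptions throughout, not the OVCGH code):
    # object-level correlation graphs, a visual-text dependency matrix, and
    # Hamming-distance ranking over binary hash codes.
    import numpy as np

    def cosine(a, b):
        """Pairwise cosine similarity between the rows of a and b."""
        a = a / np.linalg.norm(a, axis=1, keepdims=True)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)
        return a @ b.T

    def intra_modality_graph(features, threshold=0.5):
        """Adjacency over objects of one modality (image regions or text
        tags): keep only edges whose cosine similarity clears the threshold."""
        adj = (cosine(features, features) >= threshold).astype(float)
        np.fill_diagonal(adj, 0.0)  # no self-loops
        return adj

    def visual_text_dependency(region_feats, tag_feats):
        """Cross-modal dependency weights between image regions and text
        tags, approximated here by row-wise softmax over similarities."""
        sim = cosine(region_feats, tag_feats)
        return np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)

    def binarize(real_codes):
        """Sign binarization of real-valued codes into {0, 1} hash bits."""
        return (real_codes > 0).astype(np.uint8)

    def hamming_rank(query_code, database_codes):
        """Rank database items by Hamming distance to one query code."""
        dists = np.count_nonzero(database_codes != query_code, axis=1)
        return np.argsort(dists), dists

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        regions = rng.normal(size=(5, 128))   # 5 object-region features
        tags = rng.normal(size=(3, 128))      # 3 text-tag features
        print(intra_modality_graph(regions))
        print(visual_text_dependency(regions, tags))

        image_codes = binarize(rng.normal(size=(100, 64)))  # 64-bit codes
        text_query = binarize(rng.normal(size=(64,)))       # one text query
        order, dists = hamming_rank(text_query, image_codes)
        print("top-5:", order[:5], "distances:", dists[order[:5]])

In the paper's setting the graphs would feed learned graph-level representations and the binary codes would come from trained hash functions; the thresholded cosine graph and sign binarization here only stand in for those learned components.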