A Benchmark Dataset and Learning High-Level Semantic Embeddings of Multimedia for Cross-Media Retrieval

The selection of semantic concepts for modal construction and data collection remains an open research issue. Choosing good multimedia concepts with small semantic gaps to ease the work of cross-media system developers is highly demanding, yet very little work has been done in this area. This paper contributes a new, real-world web image dataset for cross-media retrieval called FB5K. The proposed FB5K dataset has the following attributes: 1) 5130 images crawled from Facebook; 2) images categorized according to users' feelings; 3) images that are independent of text and language, with feelings used for search instead. Furthermore, we propose a novel approach that uses Optical Character Recognition and explicitly incorporates high-level semantic information. We comprehensively evaluate the performance of four different subspace-learning methods and three modified versions of the Correspondence Autoencoder, alongside numerous text features and similarity measurements, comparing Wikipedia, Flickr30k, and FB5K. To examine the characteristics of FB5K, we propose a semantic-based cross-media retrieval method. To accomplish cross-media retrieval, we introduce a new similarity measurement in the embedded space, which significantly improves system performance compared with the conventional Euclidean distance. Our experimental results demonstrate the efficiency of the proposed retrieval method on three different datasets in simplifying and improving general image retrieval.
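The abstract does not specify the exact form of the proposed similarity measurement, only that it outperforms Euclidean distance in the learned embedding space. The sketch below is a minimal illustration of that baseline comparison, assuming a shared cross-media embedding space and using cosine similarity as a stand-in alternative; the embedding dimensionality, gallery size, and choice of cosine are illustrative assumptions, not the authors' method.

```python
import numpy as np

# Hypothetical learned embeddings: rows are items projected into a shared
# cross-media space (e.g., text and image features mapped by a subspace-
# learning method or a correspondence autoencoder). Values are random here,
# purely for illustration.
rng = np.random.default_rng(0)
text_embeddings = rng.normal(size=(1000, 128))   # gallery: embedded text items
query_image = rng.normal(size=(128,))            # probe: one embedded image

def euclidean_rank(query, gallery):
    """Rank gallery items by ascending Euclidean distance to the query."""
    dists = np.linalg.norm(gallery - query, axis=1)
    return np.argsort(dists)

def cosine_rank(query, gallery):
    """Rank gallery items by descending cosine similarity to the query."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(g @ q))

print("Top-5 by Euclidean distance:", euclidean_rank(query_image, text_embeddings)[:5])
print("Top-5 by cosine similarity:", cosine_rank(query_image, text_embeddings)[:5])
```

In practice, the two rankings differ whenever embedding norms vary across items, which is one reason retrieval in learned embedding spaces often benefits from a measure other than raw Euclidean distance.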

Bibliographic Details
Main Authors: Sadaqat Ur Rehman, Shanshan Tu, Yongfeng Huang, Obaid Ur Rehman
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Cross-media retrieval; FB5K dataset; high-level semantic embeddings
Online Access:https://ieeexplore.ieee.org/document/8516912/
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2018.2878868
Published in: IEEE Access, vol. 6, pp. 67176-67188, 2018 (article no. 8516912)
Author Affiliations:
Sadaqat Ur Rehman (ORCID: 0000-0002-4449-1708), Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, China
Shanshan Tu, Faculty of Information Technology, Beijing University of Technology, Beijing, China
Yongfeng Huang, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, China
Obaid Ur Rehman (ORCID: 0000-0003-4577-6059), Sarhad University of Science and Information Technology, Peshawar, Pakistan