Context-Driven Image Caption With Global Semantic Relations of the Named Entities

Automatic image captioning has made great progress. However, existing captioning frameworks essentially enumerate the objects in an image, and the generated captions lack real-world knowledge about named entities and their relations, such as the relations among famous persons, organizations, and buildings. In contrast, humans interpret images by drawing on real-world knowledge of the relations among such named entities. To generate human-like captions, we focus on captioning news images, where the accompanying news articles provide real-world knowledge of the whole story behind the images. We then propose a novel model that brings captions closer to a human-like description of the image by leveraging the semantic relevance of the named entities. Named entities are not only extracted from the news text under the guidance of the image content, but also extended with external knowledge based on their semantic relations. Specifically, we propose a sentence correlation analysis algorithm to selectively draw contextual information from the news article, and an entity-linking algorithm based on a knowledge graph to discover the relations among entities from a global perspective. Extensive experiments on a real-world dataset collected from news show that our model generates image captions closer to the corresponding real-world captions.
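
The abstract describes two algorithmic components: a sentence correlation analysis that selects news sentences relevant to the image, and knowledge-graph-based entity linking that expands the detected named entities with related real-world entities. The snippet below is a minimal illustrative sketch of those two ideas only, not the authors' model: it assumes scikit-learn is available, uses TF-IDF cosine similarity as a stand-in for the paper's sentence correlation measure, and uses a hypothetical KNOWLEDGE_GRAPH dictionary in place of a real knowledge graph and entity-linking pipeline.

```python
# Illustrative sketch only: approximates the two ideas in the abstract with
# off-the-shelf tools (TF-IDF sentence relevance + a toy knowledge-graph lookup).
# It is NOT the paper's implementation; KNOWLEDGE_GRAPH is a hypothetical stand-in.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-in for a knowledge graph: maps an entity to related entities.
KNOWLEDGE_GRAPH = {
    "NASA": ["Artemis program", "Kennedy Space Center"],
    "Kennedy Space Center": ["Florida", "NASA"],
}

def select_context_sentences(news_sentences, image_keywords, top_k=2):
    """Rank news sentences by TF-IDF cosine similarity to image-derived keywords."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(news_sentences + [" ".join(image_keywords)])
    # Last row is the keyword "query"; compare it against every news sentence.
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    ranked = sorted(zip(scores, news_sentences), key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in ranked[:top_k]]

def expand_entities(entities):
    """Extend detected entities with their knowledge-graph neighbours."""
    expanded = set(entities)
    for entity in entities:
        expanded.update(KNOWLEDGE_GRAPH.get(entity, []))
    return sorted(expanded)

if __name__ == "__main__":
    news = [
        "NASA rolled the rocket to the launch pad on Monday.",
        "The launch is part of the Artemis program.",
        "Local restaurants expect a surge of visitors this weekend.",
    ]
    context = select_context_sentences(news, image_keywords=["rocket", "launch", "pad"])
    entities = expand_entities(["NASA"])
    print("Selected context:", context)
    print("Entities after expansion:", entities)
```

In the paper itself, the relevance scoring is guided by the image content and the entity expansion is driven by a full knowledge graph; the sketch above only mirrors the overall pipeline shape.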

Bibliographic Details
Main Authors: Yun Jing, Xu Zhiwei, Gao Guanglai
Format: Article
Language: English
Published: IEEE, 2020-01-01
Series: IEEE Access
Subjects: Image caption, named entity, semantic relation
Online Access: https://ieeexplore.ieee.org/document/9153759/
DOI: 10.1109/ACCESS.2020.3013321
ISSN: 2169-3536
Published in: IEEE Access, Vol. 8, pp. 143584-143594 (2020)
Author Affiliations: Yun Jing (ORCID: 0000-0001-9411-7207), Department of Computer Science, Inner Mongolia University, Hohhot, China; Xu Zhiwei (ORCID: 0000-0001-7733-8219), Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Gao Guanglai, Department of Computer Science, Inner Mongolia University, Hohhot, China
Source: DOAJ (record id: doaj-d3aff2e8280d4903a4863aff4dcb0ee3)