Context-Driven Image Caption With Global Semantic Relations of the Named Entities

Automatic image captioning has made great progress. However, existing captioning frameworks essentially enumerate the objects in an image, and the generated captions lack real-world knowledge about named entities and their relations, such as the relations among famous persons, organizations, and buildings. In contrast, humans interpret images by drawing on real-world knowledge of the relations among such named entities. To generate human-like captions, we focus on captioning news images, where the accompanying news articles provide real-world knowledge of the whole story behind the images. We then propose a novel model that brings captions closer to a human-like description of the image by leveraging the semantic relevance of the named entities. Named entities are not only extracted from the news text under the guidance of the image content, but also extended with external knowledge based on their semantic relations. Specifically, we propose a sentence correlation analysis algorithm to selectively draw contextual information from the news article, and an entity-linking algorithm based on a knowledge graph to discover the relations among entities from a global perspective. Extensive experiments on a real-world dataset collected from news show that our model generates image captions closer to the corresponding real-world captions.
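
The abstract describes two algorithmic components: a sentence correlation analysis that selects news sentences relevant to the image, and knowledge-graph-based entity linking that expands the detected named entities with related real-world entities. The snippet below is a minimal illustrative sketch of those two ideas only, not the authors' model: it assumes scikit-learn is available, uses TF-IDF cosine similarity as a stand-in for the paper's sentence correlation measure, and uses a hypothetical KNOWLEDGE_GRAPH dictionary in place of a real knowledge graph and entity-linking pipeline.

```python
# Illustrative sketch only: approximates the two ideas in the abstract with
# off-the-shelf tools (TF-IDF sentence relevance + a toy knowledge-graph lookup).
# It is NOT the paper's implementation; KNOWLEDGE_GRAPH is a hypothetical stand-in.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-in for a knowledge graph: maps an entity to related entities.
KNOWLEDGE_GRAPH = {
    "NASA": ["Artemis program", "Kennedy Space Center"],
    "Kennedy Space Center": ["Florida", "NASA"],
}

def select_context_sentences(news_sentences, image_keywords, top_k=2):
    """Rank news sentences by TF-IDF cosine similarity to image-derived keywords."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(news_sentences + [" ".join(image_keywords)])
    # Last row is the keyword "query"; compare it against every news sentence.
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    ranked = sorted(zip(scores, news_sentences), key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in ranked[:top_k]]

def expand_entities(entities):
    """Extend detected entities with their knowledge-graph neighbours."""
    expanded = set(entities)
    for entity in entities:
        expanded.update(KNOWLEDGE_GRAPH.get(entity, []))
    return sorted(expanded)

if __name__ == "__main__":
    news = [
        "NASA rolled the rocket to the launch pad on Monday.",
        "The launch is part of the Artemis program.",
        "Local restaurants expect a surge of visitors this weekend.",
    ]
    context = select_context_sentences(news, image_keywords=["rocket", "launch", "pad"])
    entities = expand_entities(["NASA"])
    print("Selected context:", context)
    print("Entities after expansion:", entities)
```

In the paper itself, the relevance scoring is guided by the image content and the entity expansion is driven by a full knowledge graph; the sketch above only mirrors the overall pipeline shape.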

Bibliographic Details
Main Authors: Yun Jing, Xu Zhiwei, Gao Guanglai
Format: Article
Language: English
Published: IEEE, 2020-01-01
Series: IEEE Access
Subjects: Image caption, named entity, semantic relation
Online Access: https://ieeexplore.ieee.org/document/9153759/
DOI: 10.1109/ACCESS.2020.3013321
ISSN: 2169-3536
Published in: IEEE Access, Vol. 8, pp. 143584-143594 (2020)
Author Affiliations: Yun Jing (ORCID: 0000-0001-9411-7207), Department of Computer Science, Inner Mongolia University, Hohhot, China; Xu Zhiwei (ORCID: 0000-0001-7733-8219), Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Gao Guanglai, Department of Computer Science, Inner Mongolia University, Hohhot, China
Source: DOAJ (record id: doaj-d3aff2e8280d4903a4863aff4dcb0ee3)