Spatial Encoding and Multi-layer Joint Encoding Enhanced Transformer for Image Captioning
Image captioning is an active research topic in computer vision. It is a cross-media data analysis task that combines computer vision and natural language processing. It describes an image by understanding its content and generating captions that are both semantically...
| Published in: | Jisuanji kexue |
|---|---|
| Main Author: | |
| Format: | Article |
| Language: | Chinese |
| Published: | Editorial office of Computer Science, 2022-10-01 |
| Subjects: | |
| Online Access: | https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-10-151.pdf |
