Advancement in Image Caption Generation
- DOI
- 10.2991/978-94-6463-738-0_47
- Keywords
- Image Captioning; Flickr8k; LSTM; CNN
- Abstract
Image captioning lies at the intersection of computer vision and natural language processing. The objective of this work is to build an effective image caption generator using deep learning. The process annotates images with English keywords, using a dataset during model training and combining natural language processing with computer vision to formulate captions. The dataset comprises pairs of images and their corresponding captions: a CNN encoder extracts semantic information from the visual content, while an LSTM decoder is trained to generate captions sequentially. The findings indicate that the deep learning approach is effective, reaching an accuracy of about 90%, and highlight potential applications in areas such as assistive technology and multimedia retrieval. Future research will concentrate on improving model performance and enhancing contextual understanding to yield more human-like descriptions.
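The abstract says the LSTM decoder is trained to generate captions sequentially from image-caption pairs. One common way to set up such training, sketched below, is to expand each caption into (image, prefix, next word) examples so that the decoder learns to predict one word at a time. This is a minimal illustration and not the authors' code: the sentinel tokens `<start>` and `<end>`, the helper names `build_vocab` and `make_training_pairs`, and the identifier `img_001` are all assumptions made for the example, and the CNN feature extraction step is left out.

```python
from collections import Counter

# Assumed sentinel tokens marking caption boundaries (not taken from the paper).
START, END = "<start>", "<end>"


def build_vocab(captions):
    """Map every word in the captions to an integer id (hypothetical helper)."""
    counts = Counter(word for c in captions for word in c.lower().split())
    # Sentinels come first, followed by the caption words in sorted order.
    return {w: i for i, w in enumerate([START, END] + sorted(counts))}


def make_training_pairs(image_id, caption, vocab):
    """Expand one caption into (image_id, prefix_ids, next_word_id) examples.

    At each step the decoder would see the image plus the words so far
    and be trained to predict the next word.
    """
    tokens = [START] + caption.lower().split() + [END]
    ids = [vocab[t] for t in tokens]
    return [(image_id, ids[:i], ids[i]) for i in range(1, len(ids))]


captions = ["a dog runs", "a dog jumps"]
vocab = build_vocab(captions)
pairs = make_training_pairs("img_001", captions[0], vocab)
for image_id, prefix, target in pairs:
    print(image_id, prefix, "->", target)
```

A three-word caption yields four examples here, the last of which has the `<end>` token as its target, which is what would let a trained decoder know when to stop generating.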
- Copyright
- © 2025 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Kaamya Sarda
AU - Anshu Mehta
AU - Aleem Ali
PY - 2025
DA - 2025/06/22
TI - Advancement in Image Caption Generation
BT - Proceedings of the International Conference on Advances and Applications in Artificial Intelligence (ICAAAI 2025)
PB - Atlantis Press
SP - 586
EP - 600
SN - 1951-6851
UR - https://doi.org/10.2991/978-94-6463-738-0_47
DO - 10.2991/978-94-6463-738-0_47
ID - Sarda2025
ER -