Proceedings of the International Conference on Advances and Applications in Artificial Intelligence (ICAAAI 2025)

Advancement in Image Caption Generation

Authors
Kaamya Sarda1, *, Anshu Mehta2, Aleem Ali3
1Chandigarh University, Mohali, Punjab, India
2Chandigarh University, Mohali, Punjab, India
3Chandigarh University, Mohali, Punjab, India
*Corresponding author. Email: sardakaamya@gmail.com
Available Online 22 June 2025.
DOI
10.2991/978-94-6463-738-0_47
Keywords
Image Captioning; Flickr8k; LSTM; CNN
Abstract

Image captioning lies at the intersection of computer vision and natural language processing. The objective of this work is to create an effective image caption generator using deep learning. The process entails annotating images with English keywords, using datasets during the training phase of the model and combining natural language processing with computer vision to formulate captions. The dataset comprises pairs of images and their corresponding captions: a CNN encoder extracts semantic information from the visual components, while an LSTM decoder is trained to generate captions sequentially. The findings indicate the efficacy of the deep learning approach, with an accuracy of about 90%, and highlight its potential applications in diverse areas such as assistive technology and multimedia retrieval. Future research will concentrate on improving model performance and enhancing contextual understanding to yield more human-like descriptions.
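
The sequential generation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the decoding strategy, so greedy decoding is assumed here, and `greedy_caption`, `toy_step`, the `<start>`/`<end>` tokens, and the score table are all hypothetical stand-ins. In a real system, `step` would be the trained LSTM decoder conditioned on the CNN encoder's image features.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"  # hypothetical boundary tokens

# A "step" function maps (image features, words so far) to next-word scores.
StepFn = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(image_features: Sequence[float],
                   step: StepFn,
                   max_len: int = 20) -> List[str]:
    """Build a caption one word at a time, always taking the top-scoring word."""
    words = [START]
    for _ in range(max_len):
        scores = step(image_features, words)
        next_word = max(scores, key=scores.get)
        if next_word == END:
            break
        words.append(next_word)
    return words[1:]  # drop the start token


def toy_step(image_features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Toy stand-in for a trained decoder: scores depend only on the last word."""
    table = {
        "<start>": {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.8, "<end>": 0.2},
        "dog": {"runs": 0.7, "<end>": 0.3},
        "runs": {"<end>": 1.0},
    }
    return table[words[-1]]


caption = greedy_caption([0.1, 0.5], toy_step)
print(" ".join(caption))  # prints "a dog runs"
```

The toy decoder ignores the image features entirely; they are passed through only to show where the encoder output would enter the loop.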

Copyright
© 2025 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


Volume Title
Proceedings of the International Conference on Advances and Applications in Artificial Intelligence (ICAAAI 2025)
Series
Advances in Intelligent Systems Research
Publication Date
22 June 2025
ISBN
978-94-6463-738-0
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Kaamya Sarda
AU  - Anshu Mehta
AU  - Aleem Ali
PY  - 2025
DA  - 2025/06/22
TI  - Advancement in Image Caption Generation
BT  - Proceedings of the International Conference on Advances and Applications in Artificial Intelligence (ICAAAI 2025)
PB  - Atlantis Press
SP  - 586
EP  - 600
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-738-0_47
DO  - 10.2991/978-94-6463-738-0_47
ID  - Sarda2025
ER  -