Evaluating the Impact of Self-Attention in Pix2Pix for Image-to-Image Translation
- DOI
- 10.2991/978-94-6239-648-7_83
- Keywords
- GAN; Pix2Pix; Attention Mechanism; Self-Attention
- Abstract
In this work, a self-attention module is incorporated into the generator of the Pix2Pix model, and its effects on image-to-image translation are assessed on the Facades dataset. The proposed architecture adds self-attention at the bottleneck of the U-Net generator to capture global context while retaining the original generator–discriminator structure. A detailed assessment, including training loss curves, discriminator dynamics, qualitative image comparison, and Fréchet Inception Distance (FID), was performed to study the influence of attention on perceptual output quality and on optimization behavior. The experimental results demonstrate that including self-attention leads to more stable adversarial loss curves, a lower and more stable L1 reconstruction loss, and more balanced discriminator responses, indicating training dynamics different from those of the vanilla Deep Convolutional Generative Adversarial Network (DCGAN). However, this modification does not yield measurable gains in perceptual fidelity: the synthesized images remain visually on par with those produced by the baseline Pix2Pix model, and the FID score shows no appreciable drop. These results show that, at least for the Facades dataset and the current training setup, the benefits of the self-attention module manifest more as training stability than as perceptual quality improvements in the generated images. The work provides empirical insight into attention mechanisms in conditional GANs, along with suggestions for further research such as multi-level attention, perceptual loss integration, and evaluation on more challenging datasets.
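The abstract describes inserting self-attention at the U-Net bottleneck, where each spatial position attends to all others so the generator can use global context. A minimal NumPy sketch of a SAGAN-style self-attention block of this kind is given below; the projection matrices `Wf`, `Wg`, `Wh` and the scalar `gamma` are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def self_attention(x, Wf, Wg, Wh, gamma=0.0):
    """SAGAN-style self-attention over a bottleneck feature map.

    x: (C, H, W) feature map; Wf, Wg: (C//8, C) query/key projections;
    Wh: (C, C) value projection. With gamma=0 (a common initialization)
    the block acts as an identity, so attention is blended in gradually.
    """
    C, H, W = x.shape
    flat = x.reshape(C, H * W)             # flatten spatial positions
    f = Wf @ flat                          # queries (C//8, N)
    g = Wg @ flat                          # keys    (C//8, N)
    h = Wh @ flat                          # values  (C,    N)
    logits = f.T @ g                       # (N, N) pairwise affinities
    logits -= logits.max(axis=0, keepdims=True)          # numerical stability
    attn = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
    o = h @ attn                           # each position mixes all values
    return (gamma * o + flat).reshape(C, H, W)           # residual blend
```

In a full Pix2Pix generator this block would sit between the encoder's last downsampling layer and the decoder's first upsampling layer, with `Wf`, `Wg`, `Wh` implemented as learned 1x1 convolutions and `gamma` as a trainable scalar.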
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Zheng Liao
PY - 2026
DA - 2026/04/24
TI - Evaluating the Impact of Self-Attention in Pix2Pix for Image-to-Image Translation
BT - Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
PB - Atlantis Press
SP - 761
EP - 773
SN - 2352-538X
UR - https://doi.org/10.2991/978-94-6239-648-7_83
DO - 10.2991/978-94-6239-648-7_83
ID - Liao2026
ER -