Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)

Data Cleaning and Visualization Analysis Based on Pandas and Matplotlib: A Case Study of the Titanic Dataset

Authors
Kaiwen Zuo1, *
1School of Science and Technology, Beijing Normal-Hong Kong Baptist University, Zhuhai, 519087, China
*Corresponding author. Email: t330026244@mail.bnbu.edu.cn
Corresponding Author
Kaiwen Zuo
Available Online 24 April 2026.
DOI
10.2991/978-94-6239-648-7_85
Keywords
Data Cleaning; Data Visualization; Survival Analysis
Abstract

In the era of big data, data cleaning and visualization have become essential for extracting valuable information from raw data. A central challenge is knowing which techniques work in which situations, and how, which calls for systematic practice in data preprocessing and exploratory visualization. This work uses the classic Titanic passenger dataset to address that gap. The Pandas library was applied for systematic data cleaning: handling missing values in ‘Age’ and ‘Embarked’, dropping the ‘Cabin’ column, and creating new features such as ‘Family Size’. A series of charts was then plotted with Matplotlib to examine the relationship between passenger survival rates and key factors such as gender, passenger class, and age. The results indicate that socio-demographic factors influenced survival, which supports the value of pairing thorough data cleaning with detailed visualization. Further research may incorporate additional external variables and more sophisticated visualization tools to enhance the depth and explanatory power of the analysis.
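The cleaning steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's code: the abstract does not say how missing values were filled, so the median for ‘Age’ and the mode for ‘Embarked’ are assumptions, as is the definition of ‘Family Size’ as siblings/spouses plus parents/children plus the passenger. The column names `Survived`, `Pclass`, `Sex`, `SibSp`, and `Parch` follow the commonly distributed version of the dataset and are also assumed, and the six sample rows are invented for demonstration rather than taken from the real data.

```python
import pandas as pd


def clean_titanic(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning pass over a Titanic-style table."""
    out = df.copy()
    # Assumption: fill missing ages with the median age.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # Assumption: fill missing embarkation ports with the most frequent port.
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode().iloc[0])
    # The abstract states that the 'Cabin' column was dropped.
    out = out.drop(columns=["Cabin"])
    # Assumption: family size counts relatives aboard plus the passenger.
    out["Family Size"] = out["SibSp"] + out["Parch"] + 1
    return out


def survival_rate_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Mean of the 0/1 'Survived' flag within each group of `column`."""
    return df.groupby(column)["Survived"].mean()


def plot_survival_rate(rates: pd.Series, title: str):
    """Draw one bar per group and return the Matplotlib figure."""
    import matplotlib.pyplot as plt  # imported here so cleaning works without it

    fig, ax = plt.subplots()
    ax.bar(rates.index.astype(str), rates.to_numpy())
    ax.set_ylabel("Survival rate")
    ax.set_ylim(0, 1)
    ax.set_title(title)
    return fig


# Invented sample rows, for illustration only (not real passenger records).
sample = pd.DataFrame(
    {
        "Survived": [1, 0, 1, 0, 1, 0],
        "Pclass": [1, 3, 2, 3, 1, 3],
        "Sex": ["female", "male", "female", "male", "female", "male"],
        "Age": [29.0, None, 35.0, 22.0, None, 40.0],
        "SibSp": [1, 0, 0, 1, 0, 2],
        "Parch": [0, 0, 2, 0, 0, 1],
        "Cabin": ["C85", None, None, None, "E46", None],
        "Embarked": ["S", "S", None, "Q", "S", "C"],
    }
)

cleaned = clean_titanic(sample)
rates = survival_rate_by(cleaned, "Sex")
```

Passing `rates` to `plot_survival_rate(rates, "Survival rate by gender")` would produce a bar chart of the kind the abstract describes; the same helper can be reused with `"Pclass"` or a binned age column.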

Copyright
© 2026 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
Series
Advances in Computer Science Research
Publication Date
24 April 2026
ISBN
978-94-6239-648-7
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Kaiwen Zuo
PY  - 2026
DA  - 2026/04/24
TI  - Data Cleaning and Visualization Analysis Based on Pandas and Matplotlib: A Case Study of the Titanic Dataset
BT  - Proceedings of the International Workshop on Advances in Deep Learning for Image Analysis and Computer Vision (IWADIC 2025)
PB  - Atlantis Press
SP  - 786
EP  - 794
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6239-648-7_85
DO  - 10.2991/978-94-6239-648-7_85
ID  - Zuo2026
ER  -