Proceedings of the 8th FIRST 2024 International Conference on Global Innovations (FIRST-ESCSI 2024)

Optimizing Data Preprocessing on Stunting Datasets: Identifying Relevant Attributes for Machine Learning Analysis

Authors
Devi Sartika¹, Febie Elfaladonna¹,*, Ayu Octarina¹
¹Department of Informatics Management, Politeknik Negeri Sriwijaya, Palembang, Indonesia
*Corresponding author. Email: febie_elfaladonna_mi@polsri.ac.id
Corresponding Author
Febie Elfaladonna
Available Online 1 May 2025.
DOI
10.2991/978-94-6463-678-9_45
Keywords
Stunting Dataset; Preprocessing Data; Relevant Variables
Abstract

In Indonesia, approximately 8.9 million children are affected by stunting, a prevalence of 30.8%. Stunting is more common in children older than 12 months because their nutritional needs increase. Health centers, particularly the UPTD Puskesmas Mariana in Banyuasin Regency, play a crucial role in monitoring stunting. However, data are still collected manually, which limits efficiency and responsiveness. In addition, low community awareness and the limited skills of posyandu cadres in taking physical measurements contribute to persistently high stunting rates. This study preprocesses data from 122 observations of children under five, collected in August 2024, with a primary focus on identifying relevant variables. The dataset contains 32 variables, all of which are free of outliers, which allows for more precise analysis. The results are expected to highlight key factors that contribute to stunting and to inform more effective health policies and interventions. After preprocessing, 29 attributes were selected; these attributes are expected to support future cluster selection with K-means clustering. The resulting data will serve as the foundation for a toddler stunting monitoring application that incorporates K-means clustering.
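
The abstract does not state which outlier test or attribute-selection rule was used, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes an interquartile-range (IQR) screen for outliers and a simple relevance filter that drops identifier columns and columns with zero variance. The column names and values are synthetic placeholders, not taken from the Puskesmas Mariana dataset.

```python
from statistics import pvariance, quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def select_attributes(columns, drop=()):
    """Keep numeric columns that vary and are not explicitly dropped.

    Assumption: a column is treated as irrelevant if it is an
    identifier (listed in `drop`) or if all of its values are equal.
    """
    kept = {}
    for name, values in columns.items():
        if name in drop:
            continue  # identifier or otherwise excluded by hand
        if pvariance(values) == 0:
            continue  # constant column carries no information
        kept[name] = values
    return kept


# Synthetic toy data (hypothetical column names, illustration only).
columns = {
    "record_id": [1, 2, 3, 4, 5, 6],
    "age_months": [13, 24, 36, 18, 48, 30],
    "height_cm": [74.0, 85.5, 93.0, 79.5, 101.0, 90.0],
    "weight_kg": [9.1, 11.4, 13.2, 10.0, 15.8, 12.5],
    "survey_month": [8, 8, 8, 8, 8, 8],
}

selected = select_attributes(columns, drop={"record_id"})
outlier_report = {name: iqr_outliers(vals) for name, vals in selected.items()}

print(sorted(selected))
print(outlier_report)
```

In this toy example the identifier and the constant column are dropped, and the report lists any values flagged by the IQR rule for each remaining attribute. Attributes kept this way would typically be scaled before being passed to a K-means implementation, since K-means relies on distances between observations.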

Copyright
© 2025 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


Volume Title
Proceedings of the 8th FIRST 2024 International Conference on Global Innovations (FIRST-ESCSI 2024)
Series
Advances in Engineering Research
Publication Date
1 May 2025
ISBN
978-94-6463-678-9
ISSN
2352-5401

Cite this article

TY  - CONF
AU  - Devi Sartika
AU  - Febie Elfaladonna
AU  - Ayu Octarina
PY  - 2025
DA  - 2025/05/01
TI  - Optimizing Data Preprocessing on Stunting Datasets: Identifying Relevant Attributes for Machine Learning Analysis
BT  - Proceedings of the 8th FIRST 2024 International Conference on Global Innovations (FIRST-ESCSI 2024)
PB  - Atlantis Press
SP  - 481
EP  - 501
SN  - 2352-5401
UR  - https://doi.org/10.2991/978-94-6463-678-9_45
DO  - 10.2991/978-94-6463-678-9_45
ID  - Sartika2025
ER  -