Prediction of Stunting Prevalence in Toddlers Using Support Vector Machine Algorithm and Synthetic Minority Oversampling Technique (SMOTE)

Taufik Hidayat
Irwan Sembiring
Hindriyanto Dwi Purnomo
Ade Iriani

Abstract

Stunting in toddlers is a condition of nutritional deficiency. The problem becomes more complex when insufficient information about stunting in toddlers is available. This study predicts the prevalence of stunting in toddlers by training a supervised learning model, a Support Vector Machine (SVM), on a dataset of stunting prevalence among toddlers, combined with the synthetic minority oversampling technique (SMOTE). SMOTE serves as the data balancing method, while exploratory data analysis (EDA) serves as the preprocessing method for the toddler dataset. On a dataset of 6879 toddlers, the model achieved 94% accuracy, 95% precision, 94% recall, and a 94% F1-score.
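
The balancing step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy data (age in months, height in cm) are hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data (age in months, height in cm); not from the study.
stunted = [(12.0, 68.0), (18.0, 72.5), (24.0, 77.0), (30.0, 80.5)]
normal = [(12.0, 75.0), (14.0, 77.0), (18.0, 82.0), (20.0, 83.5),
          (24.0, 87.0), (26.0, 88.5), (30.0, 91.5), (36.0, 95.0)]

# Oversample the minority class until both classes have the same size.
new_points = smote(stunted, n_new=len(normal) - len(stunted))
balanced_stunted = stunted + new_points
```

In a pipeline like the one the abstract describes, the balanced training set would then be passed to an SVM classifier. Oversampling is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.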

Section: Informatics
