Health Risk Level Prediction for Hajj Pilgrims Using Random Forest and Bayesian Optimization (Case Study: Hajj Pilgrims of Balikpapan Embarkation)

Luthfi Bhaktiawan Husag

doi:10.33650/jeecom.v7i2.12391

Authors

(1) * Luthfi Bhaktiawan Husag

(UNIVERSITAS AMIKOM YOGYAKARTA)
Indonesia
(*) Corresponding Author

Abstract

The Hajj pilgrimage is one of the largest religious rituals in the world, involving millions of pilgrims from many countries. Pilgrims' physical condition and health are crucial to the smooth conduct of the Hajj. Data from the Ministry of Health indicate that the mortality risk among Hajj pilgrims tends to increase annually, particularly among the elderly and among pilgrims with a history of certain diseases, such as hypertension, diabetes, and heart disease. This study compares the performance of a Random Forest model tuned with Bayesian Optimization against a Random Forest model without any optimization in predicting the health risk level of Hajj pilgrims at the Balikpapan Embarkation. K-Fold Cross-Validation was used for data splitting to avoid imbalance. The findings show that the model tuned with Bayesian Optimization outperforms the non-optimized model: it achieved an average Accuracy of 88.25% and an F1 Score of 88.19%, compared with 87.99% and 87.95% for the standard model on the same metrics (gains of 0.26 and 0.24 percentage points). Although the AUC scores were nearly identical (95.46% vs. 95.47%), the improvement in Accuracy and F1 Score indicates that Bayesian Optimization can produce a more balanced and accurate classification model. In conclusion, applying Bayesian Optimization to Random Forest proved effective for improving the accuracy of health risk prediction for Hajj pilgrims, and could support more proactive Hajj healthcare services.
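The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn, NumPy, and SciPy are available. Everything specific here is an assumption made for illustration and is not taken from the study: the synthetic three-class data standing in for the pilgrim records, the search space, the Gaussian-process surrogate with an expected-improvement acquisition, the macro-averaged F1, and the one-vs-rest AUC. It is not the author's implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import StratifiedKFold, cross_validate

# Assumption: synthetic 3-class data stands in for the real health records.
X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           n_classes=3, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
SCORING = {"accuracy": "accuracy", "f1": "f1_macro", "auc": "roc_auc_ovr"}

def evaluate(params):
    """Return mean K-fold scores for a Random Forest built from `params`."""
    model = RandomForestClassifier(random_state=0, **params)
    res = cross_validate(model, X, y, cv=cv, scoring=SCORING)
    return {name: float(res[f"test_{name}"].mean()) for name in SCORING}

# Hypothetical search space: inclusive integer bounds per hyperparameter.
SPACE = {"n_estimators": (20, 80), "max_depth": (2, 16), "min_samples_leaf": (1, 10)}
NAMES = list(SPACE)
LOWS = np.array([SPACE[k][0] for k in NAMES], dtype=float)
HIGHS = np.array([SPACE[k][1] for k in NAMES], dtype=float)

def to_params(u):
    """Map a point in the unit cube to integer hyperparameters."""
    vals = np.rint(LOWS + u * (HIGHS - LOWS)).astype(int)
    return {k: int(v) for k, v in zip(NAMES, vals)}

def expected_improvement(gp, candidates, best):
    """Expected improvement over `best` for a maximization problem."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt(n_init=4, n_iter=6, seed=0):
    """Maximize mean cross-validated F1 with a Gaussian-process surrogate."""
    rng = np.random.default_rng(seed)
    U = rng.random((n_init, len(NAMES)))              # random initial design
    scores = [evaluate(to_params(u))["f1"] for u in U]
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                      normalize_y=True, random_state=seed)
        gp.fit(U, scores)                             # refit surrogate on all trials
        cand = rng.random((500, len(NAMES)))          # random candidate points
        u_next = cand[np.argmax(expected_improvement(gp, cand, max(scores)))]
        U = np.vstack([U, u_next])
        scores.append(evaluate(to_params(u_next))["f1"])
    return to_params(U[int(np.argmax(scores))])

baseline = evaluate({})        # Random Forest with default hyperparameters
best_params = bayes_opt()
tuned = evaluate(best_params)  # Random Forest with the tuned hyperparameters
print("baseline:", baseline)
print("tuned:   ", tuned, best_params)
```

On synthetic data the tuned model will not necessarily beat the default one; the sketch only shows how both models can be scored on the same folds and the same three metrics so that the comparison is like for like.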

Keywords

Health Risk Prediction; Hajj Pilgrims; Random Forest; Bayesian Optimization

This work is licensed under a Creative Commons Attribution License (CC BY-SA 4.0)

Journal of Electrical Engineering and Computer (JEECOM)

Published by LP3M Nurul Jadid University, Probolinggo, East Java, Indonesia.
