Analysis of Dengue Fever Spread Prediction Using Ensemble Learning Approach with XGBoost and Random
Abstract
Dengue hemorrhagic fever (DHF) is a significant infectious disease in tropical countries and has a major public health impact. This study develops a predictive model to estimate the number of dengue cases in two cities, San Juan and Iquitos, using the Random Forest and XGBoost algorithms. The dataset is DengAI: Predicting Disease Spread, which contains environmental and weather features such as temperature, rainfall, humidity, and vegetation index, along with the number of reported dengue cases. The data were first pre-processed to ensure quality and suitability for modeling, and predictive models were then built with Random Forest and XGBoost. Model performance was evaluated using Mean Absolute Error (MAE). XGBoost predicted the number of dengue cases more accurately than Random Forest, achieving a lower MAE for both cities. The resulting model can help health authorities plan and implement more effective preventive measures. The study demonstrates the potential of machine learning techniques in infectious disease epidemiology and offers insight into the environmental factors that influence the spread of dengue.
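The MAE comparison described in the abstract can be illustrated with a minimal sketch. The case counts and the names `model_a` and `model_b` below are hypothetical placeholders chosen for illustration; they are not data or results from this study, and the code is not the authors' implementation.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between reported and predicted case counts."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical weekly case counts and model outputs (illustrative only,
# not taken from the DengAI dataset or from the study's results).
reported = [4, 5, 4, 3, 6]
model_a = [5, 5, 3, 3, 8]  # stand-in for one model's predictions
model_b = [4, 6, 4, 2, 7]  # stand-in for a second model's predictions

mae_a = mean_absolute_error(reported, model_a)
mae_b = mean_absolute_error(reported, model_b)

# The model with the lower MAE is, on average, closer to the reported counts.
better = "model_a" if mae_a < mae_b else "model_b"
print(f"MAE model_a={mae_a:.2f}, model_b={mae_b:.2f}, lower error: {better}")
```

Because MAE is expressed in the same unit as the target (cases per period), a lower value can be read directly as a smaller average miss in predicted case counts.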

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.