TY - JOUR
T1 - Understanding the spatial non-stationarity in the relationships between malaria incidence and environmental risk factors using Geographically Weighted Random Forest
T2 - A case study in Rwanda
AU - Nduwayezu, Gilbert
AU - Zhao, Pengxiang
AU - Kagoyire, Clarisse
AU - Eklund, Lina
AU - Bizimana, Jean Pierre
AU - Pilesjö, Petter
AU - Mansourian, Ali
PY - 2023/5/25
Y1 - 2023/5/25
N2 - As reported in the health literature, the strength of the association between climate and epidemiological diseases varies across regions. It is therefore reasonable to allow for the possibility that these relationships also vary spatially within regions. We implemented the geographically weighted random forest (GWRF) machine learning method to analyze ecological disease patterns caused by spatially non-stationary processes, using a malaria incidence dataset for Rwanda. We first compared geographically weighted regression (GWR), the global random forest (GRF), and the geographically weighted random forest (GWRF) to examine the spatial non-stationarity in the non-linear relationships between malaria incidence and its risk factors. Because the limited number of sample values gave an unsatisfactory model goodness of fit for explaining malaria incidence, we used a Gaussian areal kriging model to disaggregate malaria incidence to the local administrative cell level and examine the relationships at a fine scale. Our results show that, in terms of the coefficient of determination and prediction accuracy, the geographically weighted random forest model performs better than the GWR and the global random forest models. The coefficients of determination (R2) of the GWR, the global RF, and the GWRF were 4.74, 0.76, and 0.79, respectively. The GWRF algorithm achieves the best result and reveals that the risk factors (rainfall, land surface temperature, elevation, and air temperature) have a strong non-linear relationship with the spatial distribution of malaria incidence rates, which could have implications for supporting local initiatives for malaria elimination in Rwanda.
AB - As reported in the health literature, the strength of the association between climate and epidemiological diseases varies across regions. It is therefore reasonable to allow for the possibility that these relationships also vary spatially within regions. We implemented the geographically weighted random forest (GWRF) machine learning method to analyze ecological disease patterns caused by spatially non-stationary processes, using a malaria incidence dataset for Rwanda. We first compared geographically weighted regression (GWR), the global random forest (GRF), and the geographically weighted random forest (GWRF) to examine the spatial non-stationarity in the non-linear relationships between malaria incidence and its risk factors. Because the limited number of sample values gave an unsatisfactory model goodness of fit for explaining malaria incidence, we used a Gaussian areal kriging model to disaggregate malaria incidence to the local administrative cell level and examine the relationships at a fine scale. Our results show that, in terms of the coefficient of determination and prediction accuracy, the geographically weighted random forest model performs better than the GWR and the global random forest models. The coefficients of determination (R2) of the GWR, the global RF, and the GWRF were 4.74, 0.76, and 0.79, respectively. The GWRF algorithm achieves the best result and reveals that the risk factors (rainfall, land surface temperature, elevation, and air temperature) have a strong non-linear relationship with the spatial distribution of malaria incidence rates, which could have implications for supporting local initiatives for malaria elimination in Rwanda.
KW - variable importance
KW - partial dependence plot
KW - malaria incidence
KW - geographically weighted random forest
KW - spatial epidemiology
KW - Geographic information system (GIS)
KW - Artificial Intelligence (AI)
KW - Machine Learning (ML)
KW - Geospatial Artificial Intelligence (GeoAI)
U2 - 10.4081/gh.2023.1184
DO - 10.4081/gh.2023.1184
M3 - Article
C2 - 37246535
SN - 1970-7096
VL - 18
JO - Geospatial Health
JF - Geospatial Health
IS - 1
M1 - 1184
ER -