Isnan, Mahmud. Mapping poverty in Thailand using machine learning, satellite imagery, and crowd-sourced geospatial information. Master's Degree (Artificial Intelligence and Internet of Things). Thammasat University. Thammasat University Library: Thammasat University, 2021.
Mapping poverty in Thailand using machine learning, satellite imagery, and crowd-sourced geospatial information
Abstract:
Poverty, one of the major drivers of poor health outcomes worldwide, is the primary source of social instability and one of the most significant causes of lost human potential. Humanitarian organizations and policymakers, especially in developing countries, need to map the distribution of poverty to develop targeted programs and assistance. However, acquiring socioeconomic data with traditional methods can be costly, time-consuming, and labor-intensive. Satellite imagery and machine learning now offer an alternative: mapping poverty from space as a quick, affordable, and scalable method of producing detailed poverty data. This study investigates how poverty relates to environmental pollution and the normalized difference vegetation index (NDVI) extracted from satellite images, together with other geospatial data such as points of interest and road density, a combination that has not been well explored in previous studies. Satellite images are extracted with Google Earth Engine, and geospatial data are downloaded from an online repository, the OpenStreetMap (OSM) website. The study compares four machine learning models (random forest, XGBoost, ridge regression, and lasso regression) for poverty prediction at the sub-district level. The regression models are evaluated using the coefficient of determination (R²). The results show that satellite imagery (environmental pollution and NDVI) and geospatial data are potential sources for estimating poverty, since they are highly associated with the poverty rate. The random forest model achieves the highest prediction accuracy among the methods considered in this study, with an R² value of 0.83 in the final experiment, which is attributed to its ability to handle multicollinearity. Finally, we apply feature importance analysis using Shapley Additive Explanations (SHAP) to identify the most influential features and help decision-makers gain a better understanding of poverty.
Future research is needed to determine whether the association shown here is consistent over time and can be used to estimate the prevalence of poverty in years when no poverty survey is conducted. This paper contributes to academic knowledge by offering an alternative method and alternative features for mapping poverty at the sub-district level.
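The evaluation metric named in the abstract, the coefficient of determination, can be illustrated with a minimal sketch. It uses the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r2_score` and the poverty-rate values are hypothetical and chosen only for illustration; they are not data or code from the thesis.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical sub-district poverty rates (%), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
score = r2_score(observed, predicted)
print(f"R2 = {score:.3f}")
```

A score of 1 means the predictions match the observed values exactly, and a score of 0 means the model does no better than always predicting the mean poverty rate.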