Abstract:
Carbon dioxide (CO2) emissions from the transportation sector affect both the environment and human
health, so analyzing their trends is a critical task. This research focuses on the application of data
mining techniques, namely clustering, linear regression analysis, and random forest, to group and
predict future CO2 emission trends. Historical fuel consumption data was used as the primary input
for analysis. The performance of each technique was evaluated using metrics such as Mean Absolute
Error (MAE), Root Mean Square Error (RMSE), and the coefficient of determination (R²). The results
revealed that the random forest technique provided the highest accuracy, achieving an R² of 97.14%.
In comparison, linear regression and clustering yielded R² values of 86.58% and 51.14%, respectively. These findings
highlight the potential of the random forest algorithm as an effective tool for forecasting carbon
emissions to support greenhouse gas reduction planning efforts.
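
For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of their standard textbook definitions. It is not the study's own code: the function names and the sample values are hypothetical and chosen only for illustration. An R² reported as a percentage corresponds to the value returned here multiplied by 100.

```python
import math

def mae(y_true, y_pred):
    # Mean Absolute Error: average magnitude of the prediction errors.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root Mean Square Error: square root of the mean squared error.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical emission values, NOT taken from the study's data.
observed = [200.0, 250.0, 300.0]
predicted = [210.0, 240.0, 300.0]

print(mae(observed, predicted), rmse(observed, predicted), r2(observed, predicted))
```

Lower MAE and RMSE indicate smaller prediction errors, while an R² closer to 1 (100%) indicates that the model explains more of the variance in the observed values.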