Simulation of Long Time Series Spatial Distribution of PM2.5 in Beijing-Tianjin-Hebei Region Based on an Improved Machine Learning Method

Bibliographic Details
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Main Authors: Zheyuan Zhang, Huayang Song, Guang Tian, Hongyu Zhang, Jia Wang, Nina Xiong
Format: Article
Language: English
Published: IEEE 2025-01-01
Online Access: https://ieeexplore.ieee.org/document/11141525/
Description
Summary: The Beijing-Tianjin-Hebei (BTH) region has long faced serious fine particulate matter (PM2.5) pollution due to its geographical characteristics and industrial structure. In this study, we integrated seasonal-trend parameters derived from STL (seasonal-trend decomposition using Loess) in place of the conventional time variables as inputs to XGBoost, combined with Bayesian optimization and Hyperband (BOHB) for hyperparameter tuning. This integrated STL-XGBoost-BOHB framework addresses the bottleneck of missing early monitoring data in long-term PM2.5 inversion. Through the STL time series decomposition method, seasonal-trend parameters reflecting the variation of PM2.5 in the BTH region were obtained. These parameters were used as substitutes for time data, addressing the limitations of ground-based PM2.5 monitoring and overcoming the lack of early PM2.5 monitoring data in China. The BOHB algorithm was chosen through comparison. The STL-XGBoost-BOHB model reaches a coefficient of determination (R²) of 0.78 and a root mean square error of 15.8 μg/m³, demonstrating strong performance in PM2.5 retrieval. Model results revealed a distinct spatial distribution of PM2.5, with concentrations decreasing from southeast to northwest. In terms of temporal variation, PM2.5 concentration in the BTH region decreased significantly from 2011 to 2020. However, a PM2.5 exposure analysis based on population data found that the majority of the population in the region was concentrated in areas with higher PM2.5 concentrations, and that the population-weighted PM2.5 concentration was significantly higher than the unweighted concentration. This highlights the need for more targeted pollution control in densely populated areas.
ISSN: 1939-1404, 2151-1535
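
The summary contrasts population-weighted PM2.5 with the unweighted concentration. The record does not give the authors' formula, so the following is only a minimal sketch that assumes the common definition, sum(P_i * C_i) / sum(P_i) over grid cells. The function name and all numbers are hypothetical and are not taken from the article.

```python
def population_weighted_mean(concentrations, populations):
    """Return sum(P_i * C_i) / sum(P_i) over grid cells.

    Assumed definition of population-weighted concentration;
    the article's exact formulation may differ.
    """
    if len(concentrations) != len(populations):
        raise ValueError("inputs must have the same length")
    total_population = sum(populations)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    weighted_sum = sum(c * p for c, p in zip(concentrations, populations))
    return weighted_sum / total_population


# Hypothetical grid cells: PM2.5 in ug/m3 and resident population.
pm25 = [80.0, 60.0, 30.0]
population = [5_000_000, 3_000_000, 500_000]

unweighted = sum(pm25) / len(pm25)
weighted = population_weighted_mean(pm25, population)
print(f"unweighted mean: {unweighted:.1f}, population-weighted mean: {weighted:.1f}")
```

In this made-up example most people live in the most polluted cells, so the weighted mean exceeds the simple average, which is the pattern the summary reports for the BTH region.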