Estimating ground-level PM2.5 using micro-satellite images by a convolutional neural network and random forest approach
© 2020 Elsevier Ltd PM2.5 poses a serious threat to public health, however its spatial concentrations are not well characterized due to the sparseness of regulatory air quality monitoring (AQM) stations. This motivates novel low-cost methods to estimate ground-level PM2.5 at a fine spatial resolution so that PM2.5 exposure in epidemiological research can be better quantified. Satellite-retrieved aerosol products are widely used to estimate the spatial distribution of ground-level PM2.5. However, these aerosol products can be subject to large uncertainties due to many approximations and assumptions made in multiple stages of their retrieval algorithms. Therefore, estimating ground-level PM2.5 directly from satellites (e.g. satellite images) by skipping the intermediate step of aerosol retrieval can potentially yield lower errors because it avoids retrieval error propagating into PM2.5 estimation and is desirable compared to current ground-level PM2.5 retrieval methods. Additionally, the spatial resolutions of estimated PM2.5 are usually constrained by those of the aerosol products and are currently largely at a comparatively coarse 1 km or greater resolution. Such coarse spatial resolutions are unable to support scientific studies that thrive on highly spatially-resolved PM2.5. These limitations have motivated us to devise a computer vision algorithm for estimating ground-level PM2.5 at a high spatiotemporal resolution by directly processing the global-coverage, daily, near real-time updated, 3 m/pixel resolution, three-band micro-satellite imagery of spatial coverages significantly smaller than 1 × 1 km (e.g., 200 × 200 m) available from Planet Labs. In this study, we employ a deep convolutional neural network (CNN) to process the imagery by extracting image features that characterize the day-to-day dynamic changes in the built environment and more importantly the image colors related to aerosol loading, and a random forest (RF) regressor to estimate PM2.5 based on the extracted image features along with meteorological conditions. We conducted the experiment on 35 AQM stations in Beijing over a period of ~3 years from 2017 to 2019. We trained our CNN-RF model on 10,400 available daily images of the AQM stations labeled with the corresponding ground-truth PM2.5 and evaluated the model performance on 2622 holdout images. Our model estimates ground-level PM2.5 accurately at a 200 m spatial resolution with a mean absolute error (MAE) as low as 10.1 μg m−3 (equivalent to 23.7% error) and Pearson and Spearman r scores up to 0.91 and 0.90, respectively. Our trained CNN from Beijing is then applied to Shanghai, a similar urban area. By quickly retraining only RF but not CNN on the new Shanghai imagery dataset, our model estimates Shanghai 10 AQM stations' PM2.5 accurately with a MAE and both Pearson and Spearman r scores of 7.7 μg m−3 (18.6% error) and 0.85, respectively. The finest 200 m spatial resolution of ground-level PM2.5 estimates from our model in this study is higher than the vast majority of existing state-of-the-art satellite-based PM2.5 retrieval methods. And our 200 m model's estimation performance is also at the high end of these state-of-the-art methods. Our results highlight the potential of augmenting existing spatial predictors of PM2.5 with high-resolution satellite imagery to enhance the spatial resolution of PM2.5 estimates for a wide range of applications, including pollutant emission hotspot determination, PM2.5 exposure assessment, and fusion of satellite remote sensing and low-cost air quality sensor network information.
Zheng, T; Bergin, MH; Hu, S; Miller, J; Carlson, DE
Volume / Issue
Electronic International Standard Serial Number (EISSN)
International Standard Serial Number (ISSN)
Digital Object Identifier (DOI)