Estimation of daily PM10 and PM2.5 concentrations in Italy, 2013–2015, using a spatiotemporal land-use random-forest model

Particulate matter (PM) air pollution is one of the major causes of death worldwide, with demonstrated adverse effects from both short-term and long-term exposure. Most of the epidemiological studies have been conducted in cities because of the lack of reliable spatiotemporal estimates of particles...

Bibliographic Details
Main Authors: Massimo Stafoggia, Tom Bellander, Simone Bucci, Marina Davoli, Kees de Hoogh, Francesca de' Donato, Claudio Gariazzo, Alexei Lyapustin, Paola Michelozzi, Matteo Renzi, Matteo Scortichini, Alexandra Shtein, Giovanni Viegi, Itai Kloog, Joel Schwartz
Format: Article
Language: English
Published: Elsevier 2019-03-01
Series: Environment International
Online Access: http://www.sciencedirect.com/science/article/pii/S0160412018327685
id doaj-2409dfc36f8748e497741fe310f7358d
record_format Article
spelling doaj-2409dfc36f8748e497741fe310f7358d
2020-11-24T21:10:33Z
eng
Elsevier
Environment International
0160-4120
2019-03-01
Volume 124, pages 170–179
Estimation of daily PM10 and PM2.5 concentrations in Italy, 2013–2015, using a spatiotemporal land-use random-forest model
Massimo Stafoggia (Department of Epidemiology, Lazio Regional Health Service/ASL Roma 1, Via C. Colombo 112, 00147 Rome, Italy; Karolinska Institutet, Institute of Environmental Medicine, Stockholm, Sweden; corresponding author at: Department of Epidemiology of the Lazio Regional Health Service/ASL Roma 1, Via C. Colombo 112, 00147 Rome, Italy)
Tom Bellander (Karolinska Institutet, Institute of Environmental Medicine, Stockholm, Sweden)
Simone Bucci (Department of Epidemiology, Lazio Regional Health Service/ASL Roma 1, Via C. Colombo 112, 00147 Rome, Italy)
Marina Davoli (Department of Epidemiology, Lazio Regional Health Service/ASL Roma 1, Via C. Colombo 112, 00147 Rome, Italy)
Kees de Hoogh (Swiss Tropical and Public Health Institute, Basel, Switzerland; University of Basel, Basel, Switzerland)
Francesca de' Donato (Department of Epidemiology, Lazio Regional Health Service/ASL Roma 1, Via C. Colombo 112, 00147 Rome, Italy)
Claudio Gariazzo (INAIL, Department of Occupational & Environmental Medicine, Monteporzio Catone, Italy)
Alexei Lyapustin (National Aeronautics and Space Administration (NASA) Goddard Space Flight Center (GSFC), Greenbelt, MD, USA)
Paola Michelozzi (Department of Epidemiology, Lazio Regional Health Service/ASL Roma 1, Via C. Colombo 112, 00147 Rome, Italy)
Matteo Renzi (Department of Epidemiology, Lazio Regional Health Service/ASL Roma 1, Via C. Colombo 112, 00147 Rome, Italy)
Matteo Scortichini (Department of Epidemiology, Lazio Regional Health Service/ASL Roma 1, Via C. Colombo 112, 00147 Rome, Italy)
Alexandra Shtein (Department of Geography and Environmental Development, Ben-Gurion University of the Negev, Beer Sheva, Israel)
Giovanni Viegi (Institute of Biomedicine and Molecular Immunology “Alberto Monroy”, National Research Council, Palermo, Italy)
Itai Kloog (Department of Geography and Environmental Development, Ben-Gurion University of the Negev, Beer Sheva, Israel)
Joel Schwartz (Department of Environmental Health, Harvard T. H. Chan School of Public Health, Cambridge, MA, USA)
collection DOAJ
language English
format Article
sources DOAJ
author Massimo Stafoggia
Tom Bellander
Simone Bucci
Marina Davoli
Kees de Hoogh
Francesca de' Donato
Claudio Gariazzo
Alexei Lyapustin
Paola Michelozzi
Matteo Renzi
Matteo Scortichini
Alexandra Shtein
Giovanni Viegi
Itai Kloog
Joel Schwartz
author_sort Massimo Stafoggia
title Estimation of daily PM10 and PM2.5 concentrations in Italy, 2013–2015, using a spatiotemporal land-use random-forest model
publisher Elsevier
series Environment International
issn 0160-4120
publishDate 2019-03-01
description Particulate matter (PM) air pollution is one of the major causes of death worldwide, with demonstrated adverse effects from both short-term and long-term exposure. Most epidemiological studies have been conducted in cities because of the lack of reliable spatiotemporal estimates of particle exposure in nonurban settings. The objective of this study is to estimate daily concentrations of PM10 (PM < 10 μm), fine particles (PM < 2.5 μm, PM2.5) and coarse particles (PM between 2.5 and 10 μm, PM2.5–10) on a 1-km² grid for 2013–2015 using a machine learning approach, the Random Forest (RF). Separate RF models were defined to: predict PM2.5 and PM2.5–10 concentrations at monitors where only PM10 data were available (stage 1); impute missing satellite Aerosol Optical Depth (AOD) data using estimates from atmospheric ensemble models (stage 2); establish a relationship between measured PM and satellite, land-use and meteorological parameters (stage 3); apply the stage 3 model to predict PM over each 1-km² grid cell of Italy (stage 4); and improve stage 3 predictions by using small-scale predictors computed at the monitor locations or within a small buffer (stage 5). Our models were able to capture most of the PM variability, with mean cross-validation (CV) R² of 0.75 and 0.80 (stage 3) and 0.84 and 0.86 (stage 5) for PM10 and PM2.5, respectively. Model fit was weaker for PM2.5–10, in summer months and in southern Italy. Finally, predictions were equally good at capturing annual and daily PM variability; therefore, they can be used as reliable exposure estimates for investigating long-term and short-term health effects. Keywords: Aerosol optical depth, Exposure assessment, Machine learning, Particulate matter, Random forest, Satellite
url http://www.sciencedirect.com/science/article/pii/S0160412018327685