# Bayesian spatio-temporal modelling of air pollution burden in Europe: Integrating data from monitors, satellites and chemical transport models

Beloconi, Anton. Bayesian spatio-temporal modelling of air pollution burden in Europe: Integrating data from monitors, satellites and chemical transport models. 2020, Doctoral Thesis, University of Basel, Associated Institution, Faculty of Science.

Official URL: https://edoc.unibas.ch/89807/

## Abstract

Ambient air quality is a growing global public health concern. Reducing air pollution and its burden on health is the aim of several environmental agencies and global agendas, including the Sustainable Development Goals (SDGs). Reliable estimates of the spatio-temporal distribution of pollutant concentrations are important for monitoring compliance with air quality guidelines (AQGs), for evaluating environmental policies and for assessing the health impacts associated with air pollution exposure. In the last two decades, substantial research effort has gone into developing data-driven methods that generate gridded estimates of pollutant concentrations using data from several sources, including monitoring stations, satellites and chemical transport models. However, it is still unclear which determinants contribute most to a more accurate exposure estimation. It is important to build models that allow valid inferences regarding the association between the pollutants of interest and the corresponding predictors. Additionally, as the modelling becomes increasingly complex and computationally demanding, there is a pressing need to better understand the advantages and limitations associated with different approaches for different scales of analysis. A crucial aspect, often neglected in health impact assessment studies because of a lack of rigorous modelling that can quantify the prediction uncertainty, is the quality of the exposure estimates.
The overarching goal of this PhD thesis is to advance the methodology for assessing the burden of air pollution and to generate estimates that support future environmental policies in Europe and beyond. The specific objectives of this research are to: (i) develop Bayesian geostatistical regression (GR) models that predict surface particulate matter ($PM$) concentration at high spatial resolution across Europe, quantify the prediction uncertainty and estimate population exposure to levels above the air quality thresholds dictated by the international AQGs (Chapter 2); (ii) evaluate within a Bayesian GR modelling framework the parameters that contribute most to a more accurate estimation of the pan-European surface nitrogen dioxide ($NO_2$) concentration and quantify the number of people exposed to $NO_2$ levels above the AQG limits (Chapter 3); (iii) compare Bayesian GR models with state-of-the-art machine learning (ML) methods for estimating $PM_{2.5}$ and $NO_2$ exposure at large geographical scale, focusing on predictive ability, computational demand and ease of interpretation (Chapter 4); (iv) estimate trends in the long-term $PM$ pollution dynamics from 2006 to 2019 in Europe at high spatial resolution using Bayesian spatio-temporal (BST) models (Chapter 5); and (v) develop BST models that can rigorously assess the effect on the air pollution burden across the continent of the lockdown measures implemented by many European countries to stop the spread of SARS-CoV-2 (COVID-19) (Chapter 6).
In Chapter 2, we investigate the potential of satellite-derived products to improve estimates of particulate matter ($PM$) concentration, both fine ($PM_{2.5}$) and coarse ($PM_{10}$), by developing Bayesian GR models that address confounding between the spatial distribution of pollutants and remotely sensed predictors. The proposed methodology is compared to geostatistical, geographically weighted (GWR) and land-use regression (LUR) formulations. A rigorous model selection is used to identify the Earth observation data which contribute most to pollutants' estimation. We apply the best resulting model to predict yearly averaged concentration of $PM_{10}$ and $PM_{2.5}$ at 1 $km^2$ spatial resolution over 46 European countries and to estimate the number of people that were breathing air above the thresholds set by the European Union (EU) Directive and the World Health Organization (WHO) AQGs.
In Chapter 3, we apply Bayesian GR models to estimate yearly averaged $NO_2$ concentrations at 1 $km^2$ spatial resolution across Europe, integrating information from in situ monitoring stations, satellites and chemical transport model (CTM) simulations. We combine tropospheric values of $NO_2$ derived from the Ozone Monitoring Instrument (OMI) onboard the Aura satellite with simulations from the 3D global CTM (GEOS-Chem) to convert columnar $NO_2$ values to surface quasi-observations. For the first time, we evaluate the contribution of this conversion to the predictive capability of GR models. Furthermore, we compare the satellite-derived proxy with the higher-resolution ensemble of regional CTMs from the Copernicus Atmosphere Monitoring Service (CAMS). The best formulation is used to quantify the European population exposed to $NO_2$ levels above the international AQG limits.
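
The column-to-surface conversion described above is often done by ratio scaling: the satellite column is multiplied by the surface-to-column ratio that the CTM simulates at the same location. The following is a minimal sketch under that assumption. The function name, argument names and numbers are hypothetical, and the conversion actually used in Chapter 3 may differ.

```python
def column_to_surface(omi_column, ctm_surface, ctm_column):
    """Scale a satellite tropospheric column to a surface quasi-observation.

    Assumed ratio scaling (not taken from the thesis):
        surface_proxy = omi_column * (ctm_surface / ctm_column)

    omi_column  : satellite tropospheric NO2 column (e.g. molecules/cm^2)
    ctm_surface : CTM-simulated surface NO2 concentration (e.g. ug/m^3)
    ctm_column  : CTM-simulated tropospheric NO2 column (same unit as omi_column)
    """
    if ctm_column <= 0:
        raise ValueError("CTM column must be positive")
    return omi_column * (ctm_surface / ctm_column)


# Hypothetical values for a single grid cell.
surface_proxy = column_to_surface(
    omi_column=4.0e15, ctm_surface=12.0, ctm_column=3.0e15
)
print(f"surface NO2 proxy: {surface_proxy:.1f} ug/m^3")
```

The resulting proxy carries the unit of the CTM surface concentration and can then enter a regression model as a predictor alongside the monitor data.
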
In Chapter 4, we assess the predictive performance of the machine learning (ML) methods most commonly used in environmental epidemiology compared to that of the Bayesian GR models. We consider random forests (RF), artificial neural networks (NN) and support vector regression (SVR), as well as recently proposed "spatial" ML methods, including random forest for spatial predictions (RFsp), geographical random forest (GRF) and convolutional neural networks (CNN). An in-depth evaluation is carried out via two case studies that estimate large-scale (pan-European) exposure maps of $NO_2$ and $PM_{2.5}$ concentration. We assess and discuss the advantages and limitations associated with each modelling framework, focusing on their predictive ability, computational demand and ease of interpretation.
In Chapter 5, we link a large database of raw pollutant data with novel remotely sensed and CTM products within a Bayesian spatio-temporal (BST) modelling framework and, for the first time, estimate pan-European near-surface concentrations of $PM_{2.5}$ and $PM_{10}$ at 1 $km^2$ spatial resolution from 2006 to 2019. We assess country-wise trends in $PM$ dynamics over these 14 years and evaluate their compliance with the AQG thresholds set in 2005 by WHO.
In Chapter 6, we develop BST regression models to assess changes in surface $NO_2$ and $PM_{2.5}$ concentrations that followed the lockdown measures implemented by many European countries to stop the spread of SARS-CoV-2 (COVID-19). We propose two different model formulations that make it possible to distinguish the variation due to seasonality from the variation due to the lockdown policies. The first model compares the changes that occurred during the lockdown in 2020 with those during the same period in previous years. The second model adjusts for factors that contribute to each pollutant's formation, dispersion and transportation, such as weather conditions, local combustion sources and/or land surface characteristics.
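
The logic of comparing the 2020 lockdown period with the same calendar period in previous years can be illustrated with a plain descriptive contrast. This is only a sketch with hypothetical function names and made-up concentrations. It is not the BST model of Chapter 6, which additionally accounts for spatial and temporal correlation and quantifies uncertainty.

```python
from statistics import mean


def period_change(before, during):
    """Mean concentration during a window minus the mean just before it."""
    return mean(during) - mean(before)


def lockdown_contrast(before_2020, during_2020, before_ref, during_ref):
    """Change observed in 2020 minus the change seen in reference years.

    Subtracting the reference-year change removes the part of the 2020
    change that recurs every year in the same calendar window (seasonality).
    """
    return period_change(before_2020, during_2020) - period_change(
        before_ref, during_ref
    )


# Hypothetical NO2 concentrations (ug/m^3) at one monitoring station.
no2_before_2020 = [30.0, 32.0]   # weeks before the 2020 lockdown
no2_during_2020 = [20.0, 22.0]   # weeks during the 2020 lockdown
no2_before_ref = [31.0, 33.0]    # same pre-lockdown weeks, earlier years
no2_during_ref = [28.0, 30.0]    # same lockdown weeks, earlier years

contrast = lockdown_contrast(
    no2_before_2020, no2_during_2020, no2_before_ref, no2_during_ref
)
print(f"change beyond the seasonal pattern: {contrast:.1f} ug/m^3")
```

A negative contrast indicates a drop in 2020 that exceeds the usual seasonal change over the same weeks.
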
The main contribution of this PhD thesis to the field of environmental epidemiology is a comprehensive evaluation and further development of model-based methodology for estimating air pollution exposure on large spatial scales, integrating information from monitors, satellites and chemical transport models. Furthermore, it provides a rigorous framework for policy assessment and data-driven evidence to support decision makers in developing locally adapted environmental protection and public health strategies. This is achieved through: (i) spatial (geostatistical) and spatio-temporal Bayesian methodology for estimating air pollution exposure at high spatial resolution and large scale; (ii) precise pan-European estimates of $PM_{10}$, $PM_{2.5}$ and $NO_2$ concentration surfaces together with their corresponding prediction uncertainty; (iii) probabilistic statements at high geographical resolution about the areas that exceed the international AQG thresholds; (iv) estimates of the total number of people living in regions that exceed the AQG limits; (v) assessment of the advantages and limitations associated with state-of-the-art data-driven modelling approaches, including machine learning algorithms; and (vi) Bayesian spatio-temporal methodology to assess the impact of COVID-19 lockdown policies on the air pollution burden.
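
Contributions (iii) and (iv) can be illustrated with a small sketch, assuming that posterior predictive draws of the annual mean concentration are available for each grid cell. The exceedance probability of a cell is then the share of draws above the threshold, and weighting it by the cell population gives an expected number of people exposed. The function names, draws and population counts below are hypothetical; the threshold of 10 µg/m³ is the WHO 2005 annual guideline for $PM_{2.5}$.

```python
def exceedance_probability(draws, threshold):
    """Share of posterior predictive draws strictly above the threshold."""
    if not draws:
        raise ValueError("at least one draw is required")
    return sum(d > threshold for d in draws) / len(draws)


def expected_population_exposed(cell_draws, cell_population, threshold):
    """Sum over cells of population times exceedance probability."""
    return sum(
        pop * exceedance_probability(draws, threshold)
        for draws, pop in zip(cell_draws, cell_population)
    )


WHO_2005_PM25_ANNUAL = 10.0  # ug/m^3

# Hypothetical posterior draws of annual mean PM2.5 for two grid cells.
cell_draws = [
    [8.0, 9.0, 11.0, 12.0],    # cell A: half of the draws exceed 10
    [12.0, 13.0, 14.0, 15.0],  # cell B: all draws exceed 10
]
cell_population = [1000, 200]  # hypothetical residents per cell

exposed = expected_population_exposed(
    cell_draws, cell_population, WHO_2005_PM25_ANNUAL
)
print(f"expected number of people above the guideline: {exposed:.0f}")
```

Working with the full set of draws rather than a single point estimate is what allows the exceedance statement to be probabilistic.
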

Advisors: Utzinger, Jürg and Vounatsou, Penelope and Parlow, Eberhard

Faculties and Departments: 05 Faculty of Science > Departement Umweltwissenschaften > Ehemalige Einheiten Umweltwissenschaften > Meteorologie (Parlow); 09 Associated Institutions > Swiss Tropical and Public Health Institute (Swiss TPH) > Department of Epidemiology and Public Health (EPH) > Biostatistics > Bayesian Modelling and Analysis (Vounatsou); 09 Associated Institutions > Swiss Tropical and Public Health Institute (Swiss TPH) > Former Units within Swiss TPH > Health Impact Assessment (Utzinger)

Item type: Thesis (Doctoral Thesis)

Thesis no.: 14825

Thesis status: Complete

Number of pages: xxii, 169

Language: English

URN: urn:nbn:ch:bel-bau-diss148250

DOI: 10.5451/unibas-ep89807

Record dates: 29 Oct 2022 04:30; 28 Oct 2022 09:30
