Application of machine learning and statistical modeling to identify sources of air pollutant levels in Kitchener, Ontario, Canada

Wisam Mohammed; Adrian Adamescu; Lucas Neil; Nicole Shantz; Tom Townend; Martin Lysy; Hind A. Al-Abadleh

doi:10.1039/D2EA00084A

Wisam Mohammed,^a Adrian Adamescu,^a Lucas Neil,^b Nicole Shantz,^ab Tom Townend,^c Martin Lysy*^d and Hind A. Al-Abadleh*^a

Author affiliations

* Corresponding authors

^a Department of Chemistry and Biochemistry, Wilfrid Laurier University, 75 University Ave West, Waterloo, Canada
E-mail: halabadleh@wlu.ca
Tel: +1 519-884-0710 ext. 2873

^b Ausenco, 100–1016B Sutton Dr, Burlington, Ontario L7L 6B8, Canada

^c AQMesh, Environmental Instruments Ltd, Unit 5, The Mansley Centre, Timothy's Bridge Road, Stratford-upon-Avon, UK

^d Department of Statistics and Actuarial Science, University of Waterloo, 200 University Ave West, Waterloo, Canada
E-mail: mlysy@uwaterloo.ca
Tel: +1 519-888-4567 ext. 45503

Abstract

Machine learning is used across many disciplines to identify complex relations between outcomes and numerous potential predictors. In air quality research in heavily populated urban centers, such techniques have been used to assess the impacts of Traffic-Related Air Pollutants (TRAP) on vulnerable members of communities, to predict future pollutant levels, and to evaluate potential solutions that mitigate the adverse effects of poor air quality. However, machine learning tools have not been used to assess the variables that influence measured pollutant levels in a suburban environment. The objective of this study is to apply a novel combination of Random Forest (RF) modeling, a machine learning algorithm, and statistical significance analysis to assess the impacts of anthropogenic and meteorological variables on observed pollutant levels in two separate datasets collected during and after the COVID-19 lockdowns in Kitchener, Ontario, Canada. The results highlight that the TRAP levels studied here are linked to meteorology and traffic count/type, with relatively higher sensitivity to the former. When statistical significance was taken into account in assessing the relative importance of variables affecting pollutant levels, traffic variables had a more discernible influence than many meteorological variables. Additional studies with larger datasets spread throughout the year are needed to expand upon these initial findings. The proposed approach outlines a “blueprint” method for quantifying the importance of traffic in mid-size cities experiencing fast population growth and development.

This article is part of the themed collections: The Use of Machine Learning in Atmospheric Science Research - Topic Highlight and A collection on dense networks and low-cost sensors, including work presented at ASIC 2022

Article information

DOI: https://doi.org/10.1039/D2EA00084A
Article type: Paper
Submitted: 11 Jul 2022
Accepted: 05 Oct 2022
First published: 06 Oct 2022
This article is Open Access

Environ. Sci.: Atmos., 2022, 2, 1389-1399

Permissions

W. Mohammed, A. Adamescu, L. Neil, N. Shantz, T. Townend, M. Lysy and H. A. Al-Abadleh, Environ. Sci.: Atmos., 2022, 2, 1389 DOI: 10.1039/D2EA00084A

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

Environmental Science: Atmospheres
