 Open Access Article
      
        
          
Karl Ropkins *a, James E. Tate a, Anthony Walker b and Tony Clark b
      
a Institute for Transport Studies, University of Leeds, Leeds, LS2 9JT, UK. E-mail: k.ropkins@its.leeds.ac.uk

b Joint Air Quality Unit, Department for Transport & Department for Environment, Food and Rural Affairs, Marsham Street, London, SW1P 4DF, UK
    
First published on 15th April 2022
As part of air quality management plans, administrative authorities commonly implement interventions, such as Low Emission Zones (LEZs) and Clean Air Zones (CAZs), to improve air quality. The associated benefits are often difficult to quantify due to the high variability in ambient time-series measurements and influence of contributions from meteorology, background and other emission sources. Break-point techniques have previously been used on their own to detect large changes, and in combination with deseasonalisation and deweathering methods to detect smaller changes. However, getting down to the detection limits needed to measure change at the levels predicted for most contemporary air quality interventions remains a challenge, as does the conversion of such higher-level analytical techniques into tools that are suitable for routine use by those tasked with the evaluation of interventions. Here, methods are presented that incorporate background subtraction to improve sensitivity and confidently quantify changes not readily detected in initial air quality time-series. Applied to air quality data collected in Leeds in the UK, the methods indicate a general reduction in the local NO2 contribution across the studied period, 01 January 2015 to 31 January 2019, but also superimposed on that two discrete reductions: the first 2.4 μg m−3 (0.03 to −4.8 μg m−3; 95% confidence) in late 2015, and a second of 3.6 μg m−3 (1.2–6.1 μg m−3; 95% confidence), equivalent to a 12% (4% to 21%; 95% confidence) reduction in ambient air that coincides with the period when the local 2018 bus fleet was upgraded to cleaner Euro VI vehicles.
| Environmental significance: Break-point and change-segment detection methods can be used to independently detect and quantify change within time-series. However, the inherent variability in air quality data often hinders direct measurement of the impact of all but the largest change events. Here, deseasonalisation, deweathering and background subtraction are used to pre-process data, to improve sensitivity and detect change associated with an urban bus fleet upgrade not readily detected in ambient air. Such real-world evidence is much needed as part of efforts to evaluate the impacts of in-coming air quality initiatives, e.g. the Clean Air Zones in the UK, measure the impacts of discrete events, e.g. traffic network disruptions and forest fires, and to inform those developing next-generation environmental management policies. | 
Local air quality improvement activities include a wide range of interventions and actions to accelerate the renewal of vehicle fleets and reduction of on-road vehicle numbers and emission levels, including: vehicular access restrictions for vehicles deemed as excessive emitters e.g. in Low Emission, Ultra Low Emission and Clean Air Zones (LEZs, ULEZs and CAZs), cleaner public transport services, alternative vehicle procurement and older vehicle retrofit and scrappage incentivisation schemes, infrastructure development (e.g. electric vehicle charging points and alternative fuelling stations), traffic flow management and calming activities, and the promotion of active travel.8,9 Some large-scale interventions are reported to have demonstrable air quality benefits, e.g. reductions of 10–30% and 5–10% for PM10 and NO2, respectively, have been reported following the introduction of some LEZs in Europe.10–12 Integrated multi-action approaches have been reported to be even more effective, e.g. the combined air quality actions employed in China during the 2008 Beijing Olympic Games were reported to lower PM10 by 55% and NOx by 47%, respectively.13 The air quality impacts of most interventions are, however, more limited and less certain.8,9,14
Here, it is important to acknowledge both the complexity of such analyses, e.g. the inherent variability of air quality data,15 the influence of meteorology,16,17 the challenges of attributing impact to individual interventions that rarely occur in isolation,18 and even the limitation of the analytical methods themselves.14 It is also important to note that as air quality is improving (albeit slowly) in many countries with more progressive air quality management strategies,15,19,20 any intervention-related improvements are becoming not just harder to earn but also more challenging to isolate and quantify. It is also perhaps all too easy to dismiss local interventions as local actions with only local relevance. They are key elements of city and regional plans being rolled out across many countries, and their effectiveness is critical to the delivery of national air quality strategies, internationally.21–23 For example, UK Government is working with 61 local authorities across the country to tackle NO2 exceedances, the allocated budget is £880 million and local interventions are central to all these activities.7 Associated impact assessment is fundamental information, needed to support evidence-based policy making and those debating intervention performance, benchmarking, prioritisation and justification of investment.
Likewise, climate change is increasing the likelihood of atypical meteorological and environmental events, e.g. extreme weather events24 and wildfires,25 and with these the relative significance of related air pollution impacts. As a result, the identification, quantification and apportionment of air pollution associated with discrete changes is becoming an increasingly important element of environmental research, and various statistical methods have been applied as part of this work.26
Break-point detection methods test for points in a series of observations that are better explained, with greater statistical significance, by an abrupt change within the monitored system rather than chance, noise or underlying trends.27–30 They have been widely used in many commercial and research areas, including several air quality applications.16,31–33 Various signal isolation methods have also been used as a data ‘clean-up’ step prior to air quality data analyses. Background subtraction or correction methods have perhaps been most widely used, and provide a measure of local contributions or ‘increments’.34–37 Classical trend deconvolution methods such as ‘deseasonalisation’ assume there are regular frequency cycles in time-series, e.g. hour-of-day, day-of-week, and week-of-year cycles, and that modelling and subtracting these frequency patterns from time-series provides a clearer measure of underlying trends.38 Deweathering, sometimes also called ‘weather normalisation’ or ‘meteorological detrending’, extends this approach to the removal of variance associated with changes in meteorological conditions, such as wind speed and direction, air temperature and humidity.16,39–41 Conditional extraction,42 molecular tracers43 and diagnostic ratios,44,45 amongst other methods, have also all been used to isolate source-specific contributions. In the few cases where such signal isolation methods have been applied in combination with break-point methods, improved sensitivity26 and/or easier trend visualisation16 have been reported.
Methods extending the approach to break-segments, regions of change about a break-point, have been recently applied to the characterisation of major wide-scale air quality changes during the COVID-19 related UK lockdown.46 There, NO, NO2 and NOx decreases of (on average) 32% to 50% were observed at roadside Automatic Urban and Rural Network (AURN) sites across the UK, and associated O3 increases. That work also highlighted the extra insights to be gained if independent detection of the change-point is incorporated in such analysis, rather than explicitly assumed, as in conventional ‘before and after’ analyses. Here, extending on this work, a novel combination of local contribution isolation, employing a conservative implementation of deseasonalisation, deweathering and background subtraction, and break-point and change-segment analysis methods, is used to isolate, detect and then quantify change on smaller and more localised scales. The methods are applied to NO2 air quality data from Leeds in the UK as part of an investigation of the potential environmental impact of a single traffic intervention. Leeds is one of the largest cities in the UK. It is 272 km NNW of London, has a population of 789 194 (third largest after London and Birmingham47), and the city centre, which is both heavily urbanised and densely trafficked, has been actively air quality managed by Leeds City Council (LCC) for over a decade. As part of local air quality actions, from Autumn 2016 the local bus operator introduced a rapid bus fleet overhaul, upgrading buses on selected routes to Euro VI.
The investigation of the air quality impacts of interventions like this bus fleet upgrade allows impact measurement methods to be developed and tested, and provides evidence on the potential effectiveness of bus-related interventions that is of relevance to both their use as standalone actions and as elements of larger air quality activities, e.g. the selection of vehicle-type-related strategies as part of LEZ and CAZ implementations.
|  | ||
| Fig. 1 Locations of monitoring stations: UK map (left) showing Automatic Urban and Rural Network (AURN) Rural Background sites used (in blue), and Leeds local map (right) showing Leeds City Council (LCC) monitoring stations (in red). Headingley is AURN affiliated, but shown in red here to indicate LCC as source of data used in this study. A wind rose is also included as an insert to indicate wind speeds and directions for the Leeds area during the study period, 01 January 2015 to 31 January 2019 (data source openair MET; see also Table 1) (map tiles produced by Stamen Design, under CC BY 3.0. Data under ODbL using R package OpenStreetMap50). | ||
| Site | Site type^a | Data type^b | Data source^c | Latitude | Longitude | Altitude (m) | Data capture (%) |
|---|---|---|---|---|---|---|---|
| Headingley^d | Kerbside | CL NO2 | LCC | 53.82035 | −1.57669 | 85 | 99 |
| Kirkstall Road | Roadside | CL NO2 | LCC | 53.80873 | −1.58929 | 34 | 90 |
| Temple Newsam | Background | CL NO2 | LCC | 53.78557 | −1.45607 | 67 | 86 |
| Glazebury | Rural background | CL NO2 | AURN | 53.46008 | −2.47206 | 21 | 90 |
| High Muffles | Rural background | CL NO2 | AURN | 54.33494 | −0.80855 | 267 | 90 |
| Ladybower^e | Rural background | CL NO2 | AURN | 53.40337 | −1.75201 | 420 | 95 |
| Market Harborough | Rural background | CL NO2 | AURN | 52.55444 | −0.77222 | 145 | 95 |
| Leeds MET^f | Meteorological | WS, WD | LCC | 53.78634 | −1.54166 | 30 | 97 |
| Leeds Bradford MET | Meteorological | WS, WD | NOAA | 53.866 | −1.66100 | 208 | 93 |
| Openair MET^g | Meteorological | WS, WD | WRF | — | — | — | 98 |
| Headingley A660 traffic | Traffic counter | Vehicle flow | LCC | 53.80539 | −1.57843 | 31 | 94 |
| Kirkstall Road A65 traffic | Traffic counter | Vehicle flow | LCC | 53.81230 | −1.55824 | 87 | 80 |

^a Headingley and AURN site type assignments by Defra classification scheme, Kirkstall Road and Temple Newsam assignments by Local Authority.
^b CL NO2: nitrogen dioxide measured using chemiluminescence analyser; WD and WS: wind direction and speed, both measured by anemometer or model thereof.
^c LCC: Leeds City Council; AURN: Automatic Urban and Rural Network; NOAA: National Oceanic and Atmospheric Administration Integrated Surface Database; WRF: Ricardo Weather Research And Forecasting meteorological model.
^d Although Headingley data was obtained from LCC for this study, the station is AURN affiliated so data is also in the AURN archives.
^e All time-series were for full study period except Ladybower which was non-operational in 2015 (see also Table 2).
^f Although Leeds MET data capture was relatively high, some reported values were questionable, so openair MET data was used for main analysis reported here.
^g Downloaded from AURN archive using openair R Package; data is 10 × 10 km resolution model output so area-specific rather than site-specific.
Headingley is operated as an affiliated site as part of Defra's national Automatic Urban and Rural Network (AURN; see also https://uk-air.defra.gov.uk/networks/network-info?view=aurn), while Kirkstall Road and Temple Newsam are operated by LCC as independent local monitoring stations. Headingley and Kirkstall Road are roadside sites on major roads (A660 and A65, respectively), both have similar road layouts, traffic junctions, signalised pedestrian crossings, and are subject to similar traffic flows/volumes. As the Euro VI bus fleet upgrade was only initially applied to buses on routes on the A660, the combination of Headingley (roadside, intervention) and Kirkstall Road (roadside, non-intervention) provides a classical ‘test and control’ combination. They exhibit similar air quality characteristics, as indicated by their NO2 polar plots (Fig. 2), which are both dominated by similar SE (S to ESE) features: main maxima, 40–55 and 40–45 μg m−3, respectively, at ca. 5 m s−1 suggesting highest contributions from nearby sources in that direction and secondary maxima at ca. 0 m s−1 consistent with local NO2 accumulation during periods of air stagnation. The third site, Temple Newsam, is on rural land to the SE of Leeds, where NO2 concentrations are typically much lower (compare e.g. polar plots in Fig. 2). Prevailing winds (indicated by the wind rose for the full study period, inset to Fig. 1) show that both roadside sites were often up-wind of Temple Newsam during the study period, making this a less than ideal candidate for use as an associated background. However, as Stedman and colleagues observe, ideal background sites are exceptionally rare,34 but with respect to both Headingley and Kirkstall Road, Temple Newsam meets the key criteria they recommend for a viable background, most importantly that it is near, all evidence indicates that it is relatively unexposed to local pollution, and foreground/background concentration ratios are typically >1 (see e.g. Fig. 3).
|  | ||
| Fig. 2 Polar plots of NO2 at Headingley, Kirkstall Road and Temple Newsam for the study period, 01 January 2015 to 31 January 2019. All plots generated using WRF meteorological data from Headingley AURN ‘openair’ dataset, see also Table 1 and discussion in Section 2.1. | ||
In addition, 1 hour resolution NO2 data was obtained for the nearest AURN rural background sites for the same period: Glazebury, High Muffles, Ladybower and Market Harborough. These were selected as the nearest sites (all within 200 km of Headingley) that were classified as ‘Rural Backgrounds’ and had NO2 data for the study time period, and accessed using R package ‘openair’.48,49 Data capture was lower at the non-AURN-affiliated background site Temple Newsam, which is arguably understandable given that its upkeep might be of lower priority.
Meteorological data from three sources was used in this study: the Leeds Meteorological Station (Pottery Fields House, Kidacre Street), the Leeds Bradford Airport Meteorological Station, the nearest site submitting data to the NOAA Integrated Surface Database (https://www.ncdc.noaa.gov/isd), accessed using R package ‘worldmet’,50 and modelled data generated by the Ricardo WRF model (https://ee.ricardo.com/air-quality) and supplied with AURN data when downloaded using ‘openair’ function importAURN. Data from the three sources were typically similar for near locations (e.g. r ≈ 0.77–0.88 for wind speed). Here, Ricardo WRF meteorological data was selected for use in the main study for three reasons: (1) where the largest differences were observed between the datasets, the Leeds Meteorological data was often most suspect (e.g. highly noisy or very different from neighbouring measurements); (2) the Ricardo WRF model is implemented at about 10 × 10 km grid resolution (see Table 1), so meteorological data from the nearest AURN site could be assigned to any site in the Leeds area without dedicated meteorological time-series of its own, making the method readily transferable to nearby non-AURN sites; and (3) although break-point/segment analyses using either Ricardo WRF or NOAA meteorological data provided highly similar results, data capture levels were highest for Ricardo WRF meteorological data, an important consideration when using both signal isolation and change detection methods (see also Section 2.3 regarding Background Data handling and ESI† regarding selection of data and methods used for signal isolation). Meteorological data selection should, however, be considered on a case-by-case basis because, e.g., not all AURN datasets include as extensive WRF time-series as Leeds Headingley, and at other sites NOAA or other local meteorological data sources may provide a better proxy.
Near-continuous 15 minute averaged data from Automatic Traffic Count (ATC) sites (two inductive loops per lane configuration) located on the Headingley Lane (A660) and Kirkstall Road (A65) arterials was accessed via the Drakewell C2-Cloud interface (https://www.drakewell.com/c2-web). The dates of the phases of the Euro VI bus fleet renewal were provided by LCC and the public transport operator First Bus (https://www.firstgroup.com/leeds). The month-by-month share of Euro V and VI double-decker buses operating on the two arterials was provided by the same sources on 2nd May 2020.
|  | (1) | 
For all work reported here, the test window size, TW, was set to 10% of full data range, Ttotal, and F-Stat scores were used to test significance. Although the number of break-points searched was not limited during this step, window size and assignment strategy restricts the maximum number that can be detected to approximately (Ttotal/TW) − 2, or ca. 8 in this case.
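The windowed F-statistic search described above can be illustrated with a minimal Python sketch. It is not the strucchange implementation: it tests for a single break in a straight-line trend, and the function names (`ols_rss`, `f_stat_scan`, `max_breakpoints`), the rounding of the window size and the synthetic data are all assumptions made for illustration.

```python
import random


def ols_rss(t, y):
    """Residual sum of squares for a straight-line fit y = a + b*t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    sxx = sum((ti - mt) ** 2 for ti in t)
    sxy = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y))
    b = sxy / sxx
    a = my - b * mt
    return sum((yi - (a + b * ti)) ** 2 for ti, yi in zip(t, y))


def f_stat_scan(y, window_fraction=0.10):
    """Return (best_index, best_F) for one break in a linear trend.

    Candidate breaks closer than one test window to either end are skipped.
    The returned index is the first point of the second segment.
    """
    n = len(y)
    t = list(range(n))
    h = max(3, int(n * window_fraction))  # test window T_W (assumed rounding)
    rss_pooled = ols_rss(t, y)
    k = 2  # parameters per segment: intercept and slope
    best_index, best_f = None, float("-inf")
    for bp in range(h, n - h + 1):
        rss_split = ols_rss(t[:bp], y[:bp]) + ols_rss(t[bp:], y[bp:])
        f = ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
        if f > best_f:
            best_index, best_f = bp, f
    return best_index, best_f


def max_breakpoints(t_total, t_window):
    """Approximate upper limit quoted in the text: (T_total / T_W) - 2."""
    return round(t_total / t_window) - 2


# Synthetic series (illustrative only): slow decline plus a step down at index 60.
rng = random.Random(0)
series = [30.0 - 0.01 * i - (3.6 if i >= 60 else 0.0) + rng.gauss(0, 0.5)
          for i in range(100)]
break_index, f_value = f_stat_scan(series)

# 01 January 2015 to 31 January 2019 is 1492 days; a 10% window is 149.2 days.
limit = max_breakpoints(t_total=1492, t_window=149.2)
```

With a 10% window the limit evaluates to 8, matching the figure given in the text. A full analysis would search for multiple breaks and apply the BIC and stepwise checks described next.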
All potential break-points identified based on F-Stat scores were tested and candidate break-points selected or discarded on the basis of Bayesian Information Criterion (BIC) in accordance with standard strucchange methods20 and subsequent independent testing of BIC-selected break-points in the form eqn (2):
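| $[\mathrm{NO_2}]_{t=1:T_\mathrm{total}} = \mathrm{lm}(\mathrm{trend}_{t=1:\mathrm{BP}_1} + \mathrm{trend}_{t=\mathrm{BP}_1:\mathrm{BP}_2} + \dots + \mathrm{trend}_{t=\mathrm{BP}_n:T_\mathrm{total}})$ | (2) | 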
Independent testing of BIC-selected break-points was applied in a stepwise fashion. The initial model was accepted if it was statistically valid (more specifically, if all terms associated with individual break-points were statistically significant at p < 0.05). If not, all model combinations discarding one of the initial break-points were built, tested and compared, and the statistically valid model with the highest correlation (all break-points individually statistically significant, p < 0.05, and highest R) was accepted, or the process was repeated until a statistically valid model was obtained or all break-points were rejected. This step is included as an additional test of BIC-selected break-points, rather than an alternative to BIC-based selection. Additional tests were investigated in light of concerns raised about BIC by the strucchange authors,28 and the current additional method was selected on the basis of performance in simulation testing. That said, at this stage, this is presented as an empirical solution and arguably more work may yet be required on break-point selection.
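One possible reading of this stepwise check is sketched below in Python. The `evaluate` callback is hypothetical: it stands in for fitting the segmented model of eqn (2) and returning one p-value per break-point term plus the model correlation. Repeating the search over progressively smaller subsets is an interpretation of "the process was repeated", not a confirmed detail of the authors' code.

```python
from itertools import combinations


def select_breakpoints(candidates, evaluate, alpha=0.05):
    """Stepwise check of BIC-selected break-points.

    evaluate(subset) -> (p_values, r) is a hypothetical callback that fits
    the segmented model for `subset` and returns one p-value per
    break-point term and the model correlation r.
    """
    candidates = tuple(sorted(candidates))
    # Start with the full set, then try every subset with one fewer break-point.
    for size in range(len(candidates), 0, -1):
        valid = []
        for subset in combinations(candidates, size):
            p_values, r = evaluate(subset)
            if all(p < alpha for p in p_values):
                valid.append((r, subset))
        if valid:
            # Accept the statistically valid model with the highest correlation.
            return list(max(valid)[1])
    return []  # all break-points rejected


# Toy stand-in for model fitting: fixed p-values, correlation grows with size.
toy_p = {300: 0.01, 700: 0.20, 1200: 0.03}


def toy_evaluate(subset):
    return [toy_p[b] for b in subset], 0.5 + 0.1 * len(subset)


kept = select_breakpoints([300, 700, 1200], toy_evaluate)
none_kept = select_breakpoints([700], toy_evaluate)
```

In the toy case the break-point at position 700 is not significant, so the three-break model is rejected and the two-break model without it is accepted.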
Elsewhere31 it has been observed that such approaches test for instantaneous change, and that real environmental changes are more likely to happen gradually. For example, local residents and businesses would be expected to start purchasing compliant vehicles ahead of a LEZ/CAZ start date, and for purchases to increase as that deadline approached. To investigate evidence of more gradual change, break-point models were extended to test for more gradual changes using methods reported by Muggeo.53–55 Here, change-segments (regions of change) were allowed around identified break-points. The break-point confidence intervals, calculated using the methods of Bai,56 were applied as initial estimates of segment start and end points and then refined by iteratively testing neighbouring start and end points to provide estimates of break/change-segment time range and magnitude.
Here, a series of Generalised Additive Models (GAMs) were built for Temple Newsam using the R package ‘mgcv’57,58 and the AURN rural background data, starting with eqn (3):
| $[\mathrm{NO_{2,TN}}] = s_1([\mathrm{NO_{2,BG1}}]) + s_2([\mathrm{NO_{2,BG2}}]) + s_3([\mathrm{NO_{2,BG3}}]) + s_4([\mathrm{NO_{2,BG4}}])$ | (3) | 
Model predictions were then generated for all valid cases (i.e. time intervals with measurements for all four AURN rural backgrounds). This process was then repeated for the best fitting three-input model, and any new predictions were added to the prediction time-series. It was repeated again with the next best fitting three-input model, and so on, until predictions were generated for all cases with three valid inputs in the AURN datasets.
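The fill order described above can be sketched as a simple priority cascade. This is an illustration only: the site keys `BG1` to `BG4` follow eqn (3), while `cascade_predict`, `mean_of` and the averaging "models" are hypothetical stand-ins for the fitted GAMs.

```python
def cascade_predict(rows, models):
    """Fill a prediction series using the first model whose inputs are present.

    rows   : list of dicts mapping site name -> value (None when missing)
    models : list of (required_sites, predict) in priority order, e.g. the
             four-input model first, then three-input models by fit quality
    """
    predictions = []
    for row in rows:
        value = None  # stays None when no model has all of its inputs
        for required, predict in models:
            if all(row.get(site) is not None for site in required):
                value = predict(row)
                break
        predictions.append(value)
    return predictions


def mean_of(sites):
    """Hypothetical stand-in for a fitted GAM: a plain average of its inputs."""
    return lambda row: sum(row[s] for s in sites) / len(sites)


four = ("BG1", "BG2", "BG3", "BG4")
models = [
    (four, mean_of(four)),
    (("BG1", "BG2", "BG3"), mean_of(("BG1", "BG2", "BG3"))),
    (("BG2", "BG3", "BG4"), mean_of(("BG2", "BG3", "BG4"))),
]
rows = [
    {"BG1": 4.0, "BG2": 6.0, "BG3": 8.0, "BG4": 10.0},   # all four inputs
    {"BG1": 3.0, "BG2": 6.0, "BG3": 9.0, "BG4": None},   # one input missing
    {"BG1": None, "BG2": 6.0, "BG3": None, "BG4": 9.0},  # two inputs missing
]
filled = cascade_predict(rows, models)
```

Intervals with fewer than three valid inputs are left unfilled, which corresponds to the small gaps in the model output discussed below.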
Fig. 3 and 4 provide time-series and polar plot comparisons of the original Temple Newsam monitoring data and model outputs. By comparison to the original Temple Newsam data, the model typically produces a smoothed estimate of local concentrations (lower maxima, higher minima, correlation coefficient, r, 0.76). As work here focused on the detection of changes, the modelled outputs, rather than hole-filled original data, were used in the subsequent analysis in case data source switching between measured and modelled data introduced artefact changes. As a result, small gaps can be seen in model outputs, e.g. mid and late 2015, indicating periods where NO2 data was missing from 2/4 of the AURN time-series. This is considered an acceptable trade-off for the much larger data coverage gains, e.g. in early 2016 and late 2018/early 2019. One additional interesting feature worth noting here is that while the Temple Newsam model trends are most often similar to monitoring data trends, the model output does not include a high NO2 period seen in Temple Newsam data in late 2016. As similarly elevated levels of NO2 were not seen at Headingley during this period, nor at nearby AURN Rural Background sites, it seems likely that it is from a local pollution source near Temple Newsam rather than background levels more generally. Because the modelling process removed this late 2016 feature, it is proposed that this type of nearest-neighbour background modelling could also provide an option in cases where local background data is suspect or not available, and work is on-going to investigate this.
|  | ||
| Fig. 4 NO2 polar plots for original Temple Newsam monitoring data-series and the Temple Newsam model (right top and bottom, respectively) and the four AURN Rural Background monitoring sites used to build the Temple Newsam model. All plots generated using WRF meteorological data: for AURN sites own data; for Temple Newsam from Headingley AURN, all via ‘openair’, see also Table 1 and discussion in Section 2.1. | ||
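| $[\mathrm{NO_2}] = s_1([\mathrm{NO_{2,TN}}]) + s_2(\mathrm{wd}, \mathrm{ws}) + s_3(\mathrm{hd}) + s_4(\mathrm{jd})$ | (4) | 
| $[\mathrm{NO_{2,local}}] = ([\mathrm{NO_2}] - [\mathrm{NO_{2,mod}}]) + \mathrm{mean}([\mathrm{NO_2}])$ | (5) | 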
Here, non-local contributions are modelled in a single step, rather than sequential steps to minimise compounding of modelling errors, and a minimal-input design was adopted both as part of efforts focused on making these methods more widely accessible, and as a response to trends observed during simulation testing (see also ESI†).
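As a small worked illustration of eqn (5), the sketch below subtracts a modelled series from an observed one and re-centres the residual on the observed mean. The function name `local_contribution`, the example values, and the choice to compute the mean only over intervals where both series are present are assumptions, not details taken from the study.

```python
def local_contribution(observed, modelled):
    """Eqn (5): (observed - modelled) + mean(observed), with gaps kept as None.

    Assumption: the mean is taken over intervals where both series are valid.
    """
    pairs = [(o, m) for o, m in zip(observed, modelled)
             if o is not None and m is not None]
    mean_obs = sum(o for o, _ in pairs) / len(pairs)
    return [None if o is None or m is None else (o - m) + mean_obs
            for o, m in zip(observed, modelled)]


# Illustrative values only (ug m-3); the third interval has no observation.
observed = [30.0, 40.0, None, 50.0]
modelled = [28.0, 41.0, 35.0, 45.0]
local = local_contribution(observed, modelled)
```

Adding the mean back keeps the isolated signal on the same scale as the ambient measurements, so later changes can be expressed as a percentage of ambient NO2.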
In addition to GAMs, a number of other modelling strategies have been used in similar work, e.g. Boosted Regression Trees (BRTs)36,58 and Random Forests (RFs).16 One of the advantages commonly cited for many of these methods, GAMs, BRTs and RFs included, is that they are better able to handle nonlinearity and interactions than classical linear modelling approaches, and are not subject to collinearity. While this is true, it is important to recognise that the use of non-linear modelling does not prevent problems; rather, it moves the potential modelling issue from collinearity to something more complex. In the case of GAMs, which fit curves, the issue becomes concurvity. Here, one of the main reasons for using GAMs was that the R package ‘mgcv’ includes methods to test models for concurvity.59 This potential issue is not widely acknowledged in the literature, and it is unclear how analogous issues can be confidently tested for when using modelling approaches that adopt more complex input response models.
The first GAM component of the model, s1([NO2,TN]), is the background contribution, in this case the Temple Newsam model output. The use of a GAM, rather than a direct subtraction as would be applied in e.g. local increment calculations, means that this is a non-linear estimate of background contribution at the study site rather than an explicit assumption that [NO2,TN] is an absolute measure of the background. Elsewhere, foreground NO2 has been modelled as a function of foreground NOx, and background ozone and NO2 to provide a more sophisticated model of background and ambient NO2 chemistry.32 Here, the simpler background descriptor was adopted because it was more readily applicable and because exclusion of other inputs did not appear to significantly affect subsequent findings. It is, however, important to note that the focus in the current work was on break-point/segment performance rather than the statistical significance of the isolation model. The second GAM component, s2(wd, ws), is the meteorological contribution. Here, wind speed and direction are modelled as a two-input (or surface) model to reflect the more complex wind speed and direction interactions. Elsewhere more meteorological descriptors have been included in such models, including modelling terms for measures such as air temperature, humidity and pressure,17 but again, here, the more simplified description was adopted as a demonstration of what can be done even in cases where data is limited. The final two GAM components, s3(hd) and s4(jd), are included as seasonal terms. Here, these are not included as direct measures of any dependencies but rather as surrogates for contributions that have cyclic frequencies and are not captured by other model inputs. Comparisons of this and other models are provided in the ESI.†
|  | ||
| Fig. 5 General ambient NO2 trends at Headingley (red lines and circles), Kirkstall Road (blue lines and triangles) and Temple Newsam (green lines and squares), estimated using deseasonalised month-average measurement and Theil-Sen method in openair49 [all trends >95% confidence, p < 0.05, see also Table 2 and in ESI†]. | ||
|  | ||
| Fig. 7 Break-point detection and change-segment analysis of normalised local contributions for Headingley and Kirkstall Road time-series shown in Fig. 6 (top and bottom); data (grey), change-segments, with start and ends marked (blue) and associated confidence intervals (blue dashed) and segmented trends (red). The middle plot shows the percentage of Euro VI buses in the Euro V and VI fleet on routes through Headingley. By comparison, the local fleet on Kirkstall Road was 100% Euro V over the same timescales. | ||
|  | ||
| Fig. 8 Partial contribution plots for local contribution isolation (eqn (4) and (5)): left combined wind speed/direction term; middle top background NO2 term; right top hour-of-day term; and, right bottom day-of-year term. | ||
| Site | Mean NO2 (μg m−3), 2015 | 2016 | 2017 | 2018 | 2019^a | Trend^b,c (μg per m3 per year) |
|---|---|---|---|---|---|---|
| Headingley | 36.47 | 37.02 | 30.79 | 29.55 | 32.51 | −3.06 (−1.88 to −4.04)*** |
| Kirkstall Road | 27.0 | 23.0 | 23.0 | 21.0 | — | −1.04 (−0.24 to −1.93)* |
| Temple Newsam | 15.0 | 17.0 | 13.0 | 13.0 | — | −1.01 (−0.61 to −1.32)** |
| Glazebury | 11.14 | 11.21 | 9.01 | 10.19 | 21.54 | −0.85 (−0.29 to −1.34)** |
| High Muffles | 2.94 | 3.12 | 2.77 | 2.93 | 3.63 | −0.09 (−0.33 to 0.16) |
| Ladybower | — | 5.02 | 4.67 | 4.15 | 6.63 | −0.44 (−1.11 to 0.04) |
| Market Harborough | 6.79 | 7.25 | 6.58 | 5.72 | 10.73 | −0.51 (−0.8 to −0.18)** |

^a Studied data range was 01 January 2015 to 31 January 2019, so mean for 2019 was for January only.
^b Trend was calculated using Theil-Sen method in openair with deseasonalisation49 [***p < 0.001, **p < 0.01, *p < 0.05].
^c Removing the 2019 data has a small but non-significant effect on these trends (about ±15%), well within confidence intervals.
Break-point tests identified two break-points in the local NO2 at Headingley, the first on 30 October 2015 and the second on 07 April 2018. Change-segment modelling provided estimates of associated change periods of 25 September to 22 December 2015 (88 days) and 16 March to 27 April 2018 (42 days), respectively. The 2015 change was least distinct: −2.4 μg m−3 (0.03 to −4.8 μg m−3; 95% confidence), equivalent to a change of −6.8% (−2.3 to 11%; 95% confidence) in ambient NO2 at that time. The second change was, by contrast, larger, more rapid and more confidently measured: −3.6 μg m−3 (−1.2 to −6.1; 95% confidence), equivalent to a change of −12% (−4 to −21%; 95% confidence) in ambient NO2 at that time. By comparison, no break-points/segments were identified for local NO2 at Kirkstall Road, which maintained a 100% older Euro V Bus fleet through this period. In the periods before, between and after the two changes, the gradients in local NO2 at Headingley were all close to that observed at Kirkstall Road (−1.3, −1.5 and −1.3 μg per m3 per year), supporting the interpretation that general trends at Headingley were the result of two discrete changes, one in late 2015 and the other early 2018, superimposed on a more general decrease, that was highly similar to that seen at Kirkstall Road.
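The link between the absolute and percentage figures for the second change can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that the 2018 Headingley annual mean from Table 2 (29.55 μg m−3) is a reasonable proxy for "ambient NO2 at that time"; the reference level actually used in the analysis is not stated here, and `percent_of_ambient` is a hypothetical helper.

```python
def percent_of_ambient(change, ambient_mean):
    """Express an absolute change (ug m-3) as a percentage of ambient NO2."""
    return 100.0 * change / ambient_mean


# Assumption: 2018 Headingley annual mean (Table 2) as the ambient reference.
headingley_2018_mean = 29.55

# Second change-segment: estimate and 95% confidence bounds, in ug m-3.
second_change = {"estimate": -3.6, "upper": -1.2, "lower": -6.1}
second_change_pct = {
    name: round(percent_of_ambient(value, headingley_2018_mean), 1)
    for name, value in second_change.items()
}
```

Under that assumption the values come out at roughly −12%, −4% and −21%, in line with the percentages reported for the 2018 change.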
As both the Headingley and Kirkstall Road sites are on urban arterial roads where flow capacities were unchanged throughout the study period and subject to fixed-time signalisation strategies that were also unchanged, overall traffic flows would be expected to be relatively constant. However, when quantifying the air quality impact of a traffic fleet upgrade, it is obviously important to confirm this because a change in local traffic flows, especially one that associates with the introduction of the upgrade, could make an indirect contribution to observed changes. Here, Theil-Sen analysis suggests small changes in traffic flows at both the Headingley and Kirkstall sites, 7.8 and −6.09 vehicles per hour per year, respectively, but neither were statistically significant (p > 0.05; Fig. 9 top), and prediction ranges are much larger (−9.13 to 23.55 and −18.01 to 4.12, respectively), indicating that overall traffic flows did not change significantly at either site during the timescales of this study. Similarly, break-point testing of the deseasonalised and deweathered traffic flow data identified no statistically significant break-points/segments in traffic flows at either site (Fig. 9 middle and bottom). These findings show that there were neither gradual nor abrupt changes in traffic flows at or about the times of the investigated bus fleet upgrade, supporting the assumption that changes seen at the time of the bus upgrade were due to the upgrade itself.
Although month-by-month data on the share of Euro V and VI double-decker buses operating on the two arterials were not of sufficient resolution for similar analysis, the percent Euro VI time profile for A660 Headingley bus routes included in Fig. 7 shows bus fleet composition changes during the study period. As part of a programme of on-going fleet improvement, the first batch of Euro VI buses, 6 of 48, were introduced in December 2016, and a further 42 Euro VI buses, taking the local fleet on the A660 to 100% Euro VI (all Wrightbus Streetdecks with Daimler OM934 diesel engines), were introduced from January to April 2018. The first observation is that the second change independently detected by change-point/segment methods coincides very closely with the time period of the main bus fleet upgrade. The combination of Selective Catalytic Reduction (SCR) and Diesel Particulate Filter (DPF) emission abatement used by Euro VI buses should, if correctly set up and maintained, deliver significant benefits by comparison to earlier technologies. Transport for London (TfL), for example, reported that standard Euro VI London buses emitted an estimated 95% and 85% less NOx and PM10, respectively, by comparison to Euro V.60 Concerns have however been raised about Euro VI emissions at lower speeds, when exhaust gases are cooler and SCRs tend to be less effective, but modelling based on emission inventories indicates that they should still deliver appreciable real-world air quality benefits.61 Also, in areas on major bus routes where SCRs have been retrofitted to earlier bus fleets, air quality improvements of 9% and 14% have been reported for NOx and NO2, respectively.62 So, the improvements estimated here for the period of the main intervention are of a magnitude that could realistically be attributed to the Euro VI bus fleet upgrade.
The earlier change-point/segment and earlier intervention are, however, not as readily associated. Simulation testing, described in further detail in ESI,† indicates that at the time of the intervention the methods used as reported here would have a higher than 70% likelihood of assigning the change to a point within ca. 2 months of the actual date of occurrence (see e.g. Fig. S9–S12 and associated discussion in ESI†). So, the earlier break-point/segment, seen about a year before the first upgrade, is highly unlikely to be an early prediction of this event. Similarly, the observed magnitude for the event (ca. 6.8%) is significantly larger than would be anticipated for the first event if it were the result of an intervention one-quarter of the scale (6 versus 24 buses) of the second intervention. The anticipated magnitude, −3% (−12% × 6/24), is also well below the 5% detection limit estimated for the method by simulation. It is therefore considered highly unlikely that the observed change could be the direct outcome of the first intervention.
Although not associated with method implementation or the investigated intervention, some further observations are made regarding the earlier 2015 event. Briefly summarising the preliminary analysis in Section 4 of the ESI,† although limited by the completeness of the data, similar analyses of other nearby sites suggest that the 2015 event is more widely observed than the 2018 event, but urban rather than background/regional in nature. This is at about the time Euro 6 vehicle regulations were introduced in the UK, and others32 have reported similar changes that aligned with the introduction of earlier vehicle regulations. However, at this stage, without further work and the analysis of more sites across the UK, a similar interpretation would be highly speculative in this case.
Direct analysis of the Headingley NO2 time-series using break-point methods alone was hindered by seasonal, meteorological and background contributions to the area (see e.g. Fig. 6). Analysis suggests that these non-local contributions could hinder the detection of discrete changes of less than 35–50% of the local ambient mean concentration, which would be equivalent to a change of 10–14 μg m−3 at Headingley at the time of the interventions. As few air quality-related traffic interventions could realistically be expected to deliver such benefits, this significantly limits the value of such break-point methods when applied to typical urban air quality data.
Previously, it has been reported that various time-series deconvolution procedures, e.g. background correction, deweathering and deseasonalisation,16,32 can improve the sensitivity of break-point analysis. Here, using a relatively simple local contribution isolation method and relatively few inputs, all of which are readily available to many local authorities in the UK, methods are demonstrated that can be used to isolate local changes not readily detectable in ambient air measurement data-series (cf. e.g. Fig. 6 and 7).
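The background-subtraction element of local contribution isolation can be sketched very simply. The Python example below assumes that roadside and background measurements are held as dictionaries keyed by timestamp, with missing values stored as `None`; the function name `isolate_local`, the data layout and the example values are all hypothetical, and the deseasonalisation and deweathering steps are deliberately omitted.

```python
def isolate_local(roadside, background):
    """Subtract background from roadside concentrations, timestamp by timestamp.

    Only timestamps with a valid (non-None) value in both series are kept, so
    gaps in either record do not produce spurious local contributions.
    """
    local = {}
    for stamp, road_value in roadside.items():
        back_value = background.get(stamp)
        if road_value is None or back_value is None:
            continue  # skip hours missing from either site
        local[stamp] = road_value - back_value
    return local


# Hypothetical hourly NO2 values (ug m-3), not measured data.
roadside = {
    "2018-03-01 08:00": 41.0,
    "2018-03-01 09:00": 38.5,
    "2018-03-01 10:00": None,
}
background = {
    "2018-03-01 08:00": 12.0,
    "2018-03-01 10:00": 11.0,
}
print(isolate_local(roadside, background))
```

Dropping hours that are missing at either site, rather than filling them, keeps the sketch conservative; any gap-filling strategy would be a further assumption.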
Using these methods, a discrete and significant NO2 change was observed at Headingley, a decrease of 3.6 μg m−3 (1.2–6.1; 95% confidence), at the time of the major investigated intervention, equivalent to a 12% improvement in ambient NO2 levels. This local change, not seen at the Kirkstall Road site, was superimposed on a less pronounced and similar general decrease seen at both sites across the study period, supporting the apportionment of both a smaller scale and more general improvement in air quality at the two sites and a more abrupt step-change in Headingley in early 2018. The break-point analysis was extended using change-segment prediction methods,52–54 which provided an estimate of the change period of 16 March to 27 April 2018 (42 days).
Urban arterial capacity was broadly stable through the study period, there was no statistically significant change in traffic demand, and the second, major change-point/segment aligned closely in time with the larger bus upgrade. This change is therefore attributed to the fleet upgrade to cleaner Euro VI powertrains.
Simulation studies indicate that, at the time of the intervention, the methods used as reported here would have a local contribution detection limit of ca. 5% for a decrease (<12% change observed) and a higher than 70% likelihood of assigning the change to a point within 2 months of actual date of occurrence (see e.g. Fig. S9–S12 and associated discussion in ESI†). We therefore conclude that, within the accuracy of the methods, this is also highly consistent with the observed change being the direct outcome of the intervention.
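The general idea of estimating how often a change is assigned to within a given window of its true date can be illustrated with a small Monte Carlo experiment. The Python sketch below is not the simulation procedure described in the ESI: it assumes independent Gaussian noise, a single instantaneous step and a simple least-squares break locator, and the names `locate_step` and `location_hit_rate`, together with all parameter values, are hypothetical.

```python
import random
from itertools import accumulate


def locate_step(values, min_segment=30):
    """Return the index that best splits the series into two constant-mean segments."""
    n = len(values)
    prefix = [0.0] + list(accumulate(values))
    total = prefix[n]
    best_index, best_gain = None, -1.0
    for k in range(min_segment, n - min_segment + 1):
        # Between-segment sum of squares; maximising it minimises residual error.
        diff = prefix[k] / k - (total - prefix[k]) / (n - k)
        gain = diff * diff * k * (n - k) / n
        if gain > best_gain:
            best_index, best_gain = k, gain
    return best_index


def location_hit_rate(step, noise_sd, n_days=730, true_break=365,
                      window=60, trials=200, seed=1):
    """Fraction of simulated series whose located break is within `window` days."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    hits = 0
    for _ in range(trials):
        series = [
            rng.gauss(0.0, noise_sd) + (step if day >= true_break else 0.0)
            for day in range(n_days)
        ]
        if abs(locate_step(series) - true_break) <= window:
            hits += 1
    return hits / trials


# Compare a clear step with no step at all, using the same noise level.
print(location_hit_rate(step=-3.6, noise_sd=1.0))
print(location_hit_rate(step=0.0, noise_sd=1.0))
```

Repeating the calculation over a range of step sizes gives a simple picture of how location accuracy degrades as the detection limit is approached. Measured air quality series are autocorrelated and trend over time, so hit rates from this white-noise sketch should not be read as estimates for the methods reported here.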
Further work with the simulation methods, included in ESI (Fig. S12–S19† and associated discussion), also provided useful insights into performance of the methods, e.g. predictive power at the start and end of time-series ranges and as detection limits are approached, and method behaviour with near break-points, a situation that could be encountered if e.g. data properties such as the gradient are varied (Fig. S12† and associated discussion).
Looking forward, it is important to note that many of the contemporary traffic interventions currently proposed by local authorities in the UK are expected to deliver changes of the order of 1–2% (ref. 7 and 63), and to acknowledge that a detection limit of ca. 10% (5–15% depending on underlying trend and direction of change, see ESI†) is still relatively large if these methods are to be used to robustly benchmark the performance of the full range of local air quality actions and interventions. But, here, simulation also provides useful insight regarding options for refinements, e.g. break-point input data variance has a significant effect on detection limit (see e.g. Fig. S12–S14 in ESI†). However, simulation tests using 1 hour to 1 week resolution data indicated an optimum combination of 1 hour resolution when applying the local contribution isolation and 1 day resolution when applying the break-point testing to the Leeds 01 January 2015 to 31 January 2019 time-series. In particular, when used in combination with 1 hour contribution isolation, break-point method performance did not improve significantly when used at resolutions lower than 1 day (e.g. 1 week or 1 month). This was because time averaging decreases both variance and mean. So, simply using lower time resolutions is not necessarily the answer.
Some signal isolation methods decrease variance without a pronounced effect on mean, e.g. multiple-site time-series averaging, time-frequency filtering and ‘ensemble of ensembles’ methods (which build multiple models using different subsets of the input data and average the predictions of these). However, here again caution may be needed because simulation also suggests that there are likely to be trade-offs, and that some of the variance removed by more aggressive or larger-scale normalisation strategies could sometimes be variance needed to robustly detect and quantify small-scale change (see e.g. Fig. S13, S18 and S19 and associated discussion in ESI†). There was no need to apply further variance reduction here because the change of interest was detectable without more aggressive signal isolation. This is a ‘conservative’ strategy for studies applying combined signal isolation and break-point detection methods that is arguably best both adopted and recommended, at least until any trade-offs between signal isolation and break-point detection are better understood. However, in other applications smaller changes will need to be detectable and such trade-offs do need to be further investigated as part of that process, along with options for the more strategic use of other data types to enhance signal isolation, e.g. traffic data and vehicle fleet proportions.
By comparison to their use in other sectors, e.g. share trading in business, process line management in manufacturing and smart diagnostics in clinical practice, where break-point methods are routinely applied in near-real time,64 break-point methods have traditionally tended to be employed on time-series of years in environmental studies. While it is important to acknowledge that longer timescale implementations are more easily justified, and that methods are unlikely to be completely transferable, more timely break-point methods were developed in other sectors because they were needed.65 Similarly, local authorities need more timely assessments of the impacts of their traffic management activities and other interventions, so they can more confidently ensure they are delivering intended benefits, ideally at the earliest possible stage of the intervention process. So, there is also a need to investigate how much before-and-after data is actually needed to quantify a break-point, or more likely how predictive power changes with increasing monitoring time. In addition, there is also a need to consider methods for less certain measures of air quality, e.g. diffusion tubes and low-cost sensors, because in areas where conventional continuous analysers cannot be used, local authorities, and other interested parties, will undoubtedly be looking to use these. However, both case study findings and simulation studies reported here demonstrate that these approaches can already be used with confidence to measure the air quality impacts of larger traffic interventions using continuous analyser data.
| Footnote | 
| † Electronic supplementary information (ESI) available: CRAN (stable) release version of associated R package at https://CRAN.R-project.org/package=AQEval; Project website and developer’s code at https://karlropkins.github.io/AQEval/. See https://doi.org/10.1039/d1ea00073j. | 
| This journal is © The Royal Society of Chemistry 2022 |