Emissions and Secondary Formation of Air Pollutants from Modern Heavy-Duty Trucks in Real-World Traffic—Chemical Characteristics Using On-Line Mass Spectrometry

Complying with stricter emissions standards, a new generation of heavy-duty trucks (HDTs) has gradually increased its market share and now accounts for a large percentage of on-road mileage. The potential to improve air quality depends on an actual reduction in both emissions and subsequent formation of secondary pollutants. In this study, the emissions in real-world traffic from Euro VI-compliant HDTs were compared to those from older classes, represented by Euro V, using high-resolution time-of-flight chemical ionization mass spectrometry. Gas-phase primary emissions of several hundred species were observed for 70 HDTs. Furthermore, the particle phase and secondary pollutant formation (gas and particle phase) were evaluated for a number of HDTs. The reduction in primary emission factors (EFs) was evident (∼90%) and in line with a reduction of 28–97% for the typical regulated pollutants. Secondary production of most gas- and particle-phase compounds, for example, nitric acid, organic acids, and carbonyls, after photochemical aging in an oxidation flow reactor exceeded the primary emissions (EFAged/EFFresh ratio ≥2). Byproducts from urea-selective catalytic reduction systems had both primary and secondary sources. A non-negative matrix factorization analysis highlighted the issue of vehicle maintenance as a remaining concern. However, the adoption of Euro VI has a significant positive effect on emissions in real-world traffic and should be considered in, for example, urban air quality assessments.

Supplemental information, Zhou et al.

Table S5. Summary of selected studies and methods to derive secondary PM for Figure 5.
Table S6. Background variation and its influence on the uncertainties of derived EFs.

Non-negative matrix factorization (NMF) and hierarchical cluster analysis (HCA)

NMF
Where the dimensionality of a dataset is large, it is useful to reduce that dimensionality to better understand the properties or processes of the system under study. Non-negative matrix factorization (NMF) 1 is a dimensionality reduction technique used in various fields such as atmospheric chemistry 2,3 to simplify data interpretation.
Here we use NMF to find common factors in the CIMS EFs for the sampled vehicles. To find the number of factors for decomposition (k), we use the cophenetic correlation coefficient (CCC), as performed in other studies 2,5 . For each k, NMF is initialized from 20 different random seeds to reduce the risk that the chosen solution is a poor local minimum and to check reproducibility. Each optimization is allowed to run for 500 iterations. Once k has been selected, the run with the lowest residual error (distance) is taken as the chosen solution.
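The multi-start selection described above can be illustrated with a minimal, dependency-free sketch. It assumes Lee-Seung multiplicative updates with a Frobenius-norm residual; the source does not name the NMF solver used, so this is a stand-in rather than the authors' implementation. The matrix `V`, the function names, and the seed values are hypothetical, and the CCC-based choice of k is not shown.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def residual(v, w, h):
    """Frobenius norm of V - WH (the 'distance' to be minimized)."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

def nmf_once(v, k, seed, n_iter=500, eps=1e-9):
    """One NMF run from a random start (assumed multiplicative updates)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h, residual(v, w, h)

def best_of_seeds(v, k, n_seeds=20, n_iter=500):
    """Run NMF from n_seeds random starts and keep the lowest-residual run."""
    runs = [nmf_once(v, k, seed, n_iter) for seed in range(n_seeds)]
    return min(runs, key=lambda run: run[2])

# Hypothetical scaled EF matrix: 6 vehicle passages x 4 ions.
V = [[5.0, 1.0, 0.5, 0.2], [4.0, 1.0, 0.4, 0.3], [6.0, 1.2, 0.6, 0.2],
     [2.0, 3.0, 2.5, 1.0], [1.5, 2.8, 2.4, 1.1], [5.5, 1.1, 0.5, 0.25]]
W, H, err = best_of_seeds(V, k=2)
```

Fixing the seeds (here 0 to 19) makes the selection repeatable, which is one way to read the reproducibility requirement in the text.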

HCA
Hierarchical cluster analysis (HCA) is a technique used to group observations into clusters that are more easily interpretable than individual observations. HCA is agglomerative such that individual observations merge into clusters, which can then merge further into larger clusters.

HCA is independent of calibration; instead, it describes the similarity of the trends of observations, which is useful here because the measurements span several orders of magnitude.
Here we use HCA to group vehicles into clusters using their NMF factors derived from the CIMS EFs. The implementation of HCA as used in this instance is described elsewhere 6 .
Briefly, the similarity between two observations (A and B) is found by minimizing the root of the sum of squared differences of a pair of observations (vehicle EFs, i), i.e., the Euclidean distance:

$$d(A, B) = \sqrt{\sum_{i} (A_i - B_i)^2}$$
The Ward linkage criterion was chosen to determine the distance between sets of observations, as it gave similar or more interpretable results compared with other linkage criteria. Where the mean square error between a pair of candidate clusters and their subsequent merged cluster is minimal, the clustering is allowed to proceed; see Priestley et al. 6 for more detail. The user then determines the final number of clusters. This decision depends on the level of granularity required to interpret and describe the system. HCA was implemented using the cluster.hierarchy module of the SciPy (1.2.0) scientific Python library with Python 3.6.
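To make the Ward criterion concrete, the sketch below is a naive pure-Python stand-in for the SciPy routine used in the study; it is not the SciPy implementation. At each step it merges the pair of clusters whose merge gives the smallest increase in within-cluster sum of squares, and it stops at a user-chosen number of clusters. The function names and the five two-factor observations are hypothetical.

```python
def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clusters(points, n_clusters):
    """Agglomerate observations until n_clusters remain (naive O(n^3) loop)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([points[k] for k in clusters[i]],
                                 [points[k] for k in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical (factor 1, factor 2) contributions for five vehicle passages.
factors = [(0.95, 0.05), (0.90, 0.10), (0.92, 0.08), (0.55, 0.45), (0.60, 0.40)]
groups = ward_clusters(factors, n_clusters=2)
```

With SciPy available, `scipy.cluster.hierarchy.linkage(X, method="ward")` followed by `fcluster` would be the more usual route; the loop above only spells out what the criterion does.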

Analysis procedure
As many of the vehicles exhibit no detectable EF for a given species, the 223 ions that have a data coverage of 75% of the 73 passages were selected for factorization. This data coverage gave the best trade-off between retaining useful data and keeping the highest number of dimensions available for NMF. To prevent dominant EFs from skewing the process, data were scaled by first taking the log and then scaling between 0 and 1. Any missing values were imputed with the value 0. For the NMF analysis, solutions with 2 to 11 factors were explored, each initialized at 20 randomly seeded points. 500 iterations were allowed before reaching the final solution. The NMF run with the lowest reconstruction error for a given number of factors was chosen as the optimum solution. Here a 2-factor solution is chosen (Figure S3). The cophenetic correlation matrices can be found in Figure S4.
Factor 1 is the dominant factor for all vehicles, although magnitudes vary. In contrast, Factor 2 is the minority factor and can vary in magnitude from 0, i.e., it is not present at all, to nearly 50% of the total factor contribution (Figure 3a). Analysis of the top ten species for each factor shows that Factor 2 is a low molecular weight factor that contains a strong contribution from small organic and inorganic molecules. Conversely, Factor 1 is a high molecular weight factor comprising many high mass organic compounds. The average mass for the top ten contributors is 287 ± 62 (1σ) for Factor 1 and 93 ± 33 for Factor 2. This separation of low and high mass factors is demonstrated in Figures S5 and S6. For these reasons, Factor 1 is designated the high mass and high carbon (HMHC) factor, whereas Factor 2 is designated the low mass and low carbon (LMLC) factor. These species are listed in Table S2. The 73 passages for which the identity is known were then clustered by HCA using these two NMF-derived factors as the input variables.
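The preprocessing steps above (coverage filter, log transform, 0 to 1 scaling, zero imputation) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: scaling is assumed to be applied per ion, non-positive EFs are treated as missing because their log is undefined, and the function names and example values are hypothetical.

```python
import math

def select_ions(ef_table, min_coverage=0.75):
    """Keep ions with a detectable EF in at least min_coverage of passages."""
    kept = {}
    for ion, efs in ef_table.items():
        detected = sum(1 for v in efs if v is not None)
        if detected / len(efs) >= min_coverage:
            kept[ion] = efs
    return kept

def log_minmax_scale(efs):
    """Log-transform, scale to [0, 1], then impute missing values with 0."""
    logs = [math.log10(v) if v is not None and v > 0 else None for v in efs]
    present = [x for x in logs if x is not None]
    lo, hi = min(present), max(present)
    span = hi - lo
    scaled = []
    for x in logs:
        if x is None:
            scaled.append(0.0)           # missing value imputed with 0
        elif span == 0:
            scaled.append(1.0)           # assumption: constant ion maps to 1
        else:
            scaled.append((x - lo) / span)
    return scaled

# Hypothetical EFs for two ions over four passages (None = not detected).
ef_table = {"ion_A": [1.0, 10.0, 100.0, None],
            "ion_B": [5.0, None, None, None]}
selected = select_ions(ef_table)
scaled = {ion: log_minmax_scale(efs) for ion, efs in selected.items()}
```

Because min-max scaling removes any constant multiplier, the result does not depend on the base of the logarithm. Note that the smallest detected value and a missing value both map to 0 under this scheme.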

Emission factor calculations
The EF calculation was based on the carbon balance method 7,8 , and details have been given in our previous studies and, for this specific campaign, in the overview paper by Zhou, et al. 9 . EFs were calculated using equation X1:

$$EF_{pollutant} = \frac{\int_{t_1}^{t_2} \left([pollutant]_t - [pollutant]_{bg}\right)\,dt}{\int_{t_1}^{t_2} \left([CO_2]_t - [CO_2]_{bg}\right)\,dt} \times EF_{CO_2} \qquad (X1)$$

where EFpollutant is the emission factor of the respective pollutant, the subscript t denotes the concentration at time t, and the subscript bg denotes the background concentration. The time interval of t1 to t2 represents the period when the instruments measured the concentration of an entire pollutant peak from an individual HDT; typical durations of plumes can be seen in Figure S1. t1 and t2 were determined independently for each pollutant peak to account for differences in the response time of individual instruments to the exhaust plume. The starting time (t1) can easily be identified, while t2 is when the intensity after the peak levels out and becomes indistinguishable from background levels. It is noted that the total integrated peak intensity is usually insensitive to the exact location of t2, since the added integrated signals at or beyond this point are small and represent the noise around the background level. The background concentrations (except for the FIGAERO particle phase) were derived using data points just prior (ca. 5 seconds) to the concentration peak for individual HDTs. This minimizes any effects of fluctuations in ambient concentration. At this specific background site, the general variation in background concentration was low for all pollutants and varied on timescales much longer than the duration of each measurement. To get an emission factor per kg fuel, an EFCO2 of 3158 g (kg diesel fuel)⁻¹ was used, assuming complete combustion and a carbon content of 86.1% as given in Edwards, et al. 10 .
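The carbon balance calculation described above can be sketched numerically as the ratio of two background-subtracted peak integrals multiplied by EFCO2. The sketch assumes trapezoidal integration and that pollutant and CO2 concentrations are expressed in consistent mass-based units (otherwise a molar-mass conversion would be needed). The function names and the time series are hypothetical; only the EFCO2 value of 3158 g per kg fuel comes from the text.

```python
EF_CO2_G_PER_KG = 3158.0  # g CO2 per kg diesel fuel, as stated in the text

def trapezoid(times, values):
    """Trapezoidal integral of values over times."""
    return sum(0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
               for i in range(len(times) - 1))

def emission_factor(times, pollutant, co2, pollutant_bg, co2_bg,
                    ef_co2=EF_CO2_G_PER_KG):
    """EF in g per kg fuel from background-subtracted plume integrals."""
    excess_pollutant = [p - pollutant_bg for p in pollutant]
    excess_co2 = [c - co2_bg for c in co2]
    return trapezoid(times, excess_pollutant) / trapezoid(times, excess_co2) * ef_co2

# Hypothetical plume from t1 = 0 s to t2 = 6 s, sampled at 1 Hz.
times = [0, 1, 2, 3, 4, 5, 6]
co2 = [400.0, 400.0, 450.0, 500.0, 450.0, 410.0, 400.0]
pollutant = [2.0, 2.0, 2.05, 2.1, 2.05, 2.01, 2.0]
ef = emission_factor(times, pollutant, co2, pollutant_bg=2.0, co2_bg=400.0)
```

In this toy plume the pollutant excess is one thousandth of the CO2 excess at every point, so the EF comes out at about 3.158 g per kg fuel. Extending t2 into the flat background adds near-zero terms to both integrals, which is why the result is insensitive to the exact choice of t2.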

Model calculations of OH exposure
The OHexp in Go:PAM was calculated using the model described by Watne, et al. 11 . Briefly, a chemical model containing a comprehensive description of ozone photolysis and HOx chemistry and a skeleton description of NOx, CO, HC and SOx chemistry was used to mimic the gas-phase chemistry in Go:PAM (Table S4). The minimum OH exposure was derived for each HDT passage plume using the maximum NOx, HC and CO concentrations in Go:PAM and the corresponding water and ozone concentrations. The assumed speciation of HC was aldehydes (26%), alkanes (33%), alkenes (14%) and aromatic compounds (27%). The oxidation capacity of Go:PAM was calibrated offline using SO2 as described by Lambe et al. 12 , where the photon flux at 254 nm, PFLUX254 = 1.57×10¹⁶ cm⁻² s⁻¹, and the first-order loss rates of OH were derived by matching the measured and modeled SO2 and O3 decreases.
Recently, concerns about non-OH chemistry in the OFR have been raised. 13 In this study, we estimated the ratios of exposures of non-OH species to OH exposure for O3, O(¹D) and O(³P), and they were generally on the orders of 10³, 10⁻⁷ and 10⁻³ cm s⁻¹, respectively. The relative importance of non-OH chemistry was evaluated according to Peng et al. 13 , by taking toluene as a surrogate, as it is a common SOA precursor found in vehicle emissions. 14

Uncertainty discussion
The uncertainties of derived EFs can be divided into analytical uncertainties for each instrument, the combined error of the method used to derive EFs (e.g., background correction), and the overall variability of conditions for the on-road vehicles (e.g., engine speed). All of these uncertainties contribute to the variability of the observed EF of each vehicle class, as presented in Table 1 and Table S1.

Analytical uncertainties for each instrument
Uncertainties in CO2 and NOx measurements were estimated to be around ±2% for the LI-840 analyzer and ±1% for the chemiluminescent analyzers (model 42i, Thermo Scientific Inc.), respectively. For the remote sensing device measurements (AccuScan™ RSD 5000, OPUS Inspection Inc.) of CO, NOx, and HC, the uncertainties were about ±15% of the readings 16 (these data were not used for EF calculations but rather to estimate OH exposure for the aged data). For EEPS data, the EFPN was compared with the EF derived from a CPC unit (see Figure S1 in Zhou et al., 2020). The uncertainty in the slope has a relative standard error of 3%. The uncertainty in the EFPM is more complex to derive and depends on the sizing and nature of the emitted particles, which for combustion-generated particles is somewhat unknown. However, previous diesel engine tests showed that the total particle number concentrations measured by the EEPS (soot matrix) and SMPS differed by less than 26%. 17 This corresponds to about 13% in mass concentrations. 17 Consequently, similar uncertainties may be assumed for this study. For HR-ToF-CIMS measurements, a sensitivity factor to convert the CIMS signal into concentration is necessary to estimate absolute EFs. The accuracy of the pollutant concentration was limited by the uncertainty in the sensitivity factor. Based on Lopez-Hilfiker et al. 18 , the maximum sensitivity (collision-limited) in this study was determined to be 20 Hz ppt⁻¹, which falls within previously reported ranges. Using the maximum sensitivity provides a lower-limit estimate of EF for all the oxygenated volatile organic compounds (OVOCs). 19 One may note that the instrument sensitivity did not change during the campaign and did not vary between captured HDTs; it therefore does not influence the conclusions on the observed relative emission reductions resulting from the change from Euro V to Euro VI or on the ratio of aged to fresh emissions.
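The role of the sensitivity factor can be illustrated with a short sketch. It assumes the simple relation concentration (ppt) = signal (Hz) / sensitivity (Hz ppt⁻¹), which follows from the stated units. The signal values and the lower alternative sensitivity are hypothetical; only the 20 Hz ppt⁻¹ maximum comes from the text.

```python
MAX_SENSITIVITY_HZ_PER_PPT = 20.0  # collision-limited value stated in the text

def signal_to_ppt(signal_hz, sensitivity=MAX_SENSITIVITY_HZ_PER_PPT):
    """Convert a CIMS signal (Hz) to a mixing ratio (ppt)."""
    return signal_hz / sensitivity

def relative_reduction(old_signal_hz, new_signal_hz, sensitivity):
    """Fractional reduction between two signals at a common sensitivity."""
    old = signal_to_ppt(old_signal_hz, sensitivity)
    new = signal_to_ppt(new_signal_hz, sensitivity)
    return 1.0 - new / old

# Hypothetical signals for the same compound from two vehicle classes.
lower_limit_ppt = signal_to_ppt(200.0)                   # with maximum sensitivity
higher_estimate_ppt = signal_to_ppt(200.0, sensitivity=5.0)
reduction_at_max = relative_reduction(400.0, 40.0, 20.0)
reduction_at_low = relative_reduction(400.0, 40.0, 5.0)
```

Any compound detected less efficiently than the collision limit has a true concentration above `lower_limit_ppt`, which is why the maximum sensitivity gives a lower-limit EF. The sensitivity cancels in the relative reduction, so a constant sensitivity leaves Euro V to Euro VI comparisons and aged-to-fresh ratios unaffected.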

Error due to background correction
As described in the emission factor calculations section (SI), the treatment of the background for gas- and particle-phase constituents (except for the FIGAERO particle phase) is straightforward, i.e., an average (ca. 5 s) of the signal prior to the plumes. The uncertainty in the background correction depends on the potential variation of pollutant background concentrations, which, however, were relatively stable during the 5 s before each peak (Figure S1).
The relative standard deviation of the background varied for each passage and species with an average of 27% for the 4672 background averages used (73 passages × 64 species matrix).
However, the absolute influence on the EF is usually smaller, since the signal is greater than the background. Therefore, the uncertainties arising from background subtraction were estimated to be generally less than 20% (see examples in Table S6).
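A small numerical illustration shows why the background variability has a reduced influence on the EF. It assumes a simple propagation in which the standard deviation of the background is compared with the background-subtracted signal; the source does not specify how the values in Table S6 were derived, so this is only a sketch. The function names and all numbers are hypothetical.

```python
import statistics

def background_rsd(background):
    """Relative standard deviation of the pre-plume background points."""
    return statistics.stdev(background) / statistics.mean(background)

def relative_effect_on_net_signal(background, plume_mean):
    """Background standard deviation relative to the net (plume - background) signal."""
    net = plume_mean - statistics.mean(background)
    return statistics.stdev(background) / net

# Hypothetical background points (ca. 5 s before the peak) and mean plume signal.
background = [10.0, 12.0, 8.0, 11.0, 9.0]
rsd = background_rsd(background)                           # about 0.16
effect = relative_effect_on_net_signal(background, 60.0)   # about 0.03
```

Here a background relative standard deviation of roughly 16% translates into only about 3% of the net signal, because the plume signal is several times larger than the background.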

Variability within the fleet
The largest influence on the EFs derived in this study comes from the variability of the fleet and the real-world driving conditions, which was greater than any of the analytical and background correction uncertainties. One may note that this variability is an important aspect of how EFs

c EFAged after the emissions were exposed to OH of 1.0×10¹¹ molecules cm⁻³ s.

Table S5. Summary of selected studies and methods to derive secondary PM for Figure 5.
Instrumentation, the characterized constituents and the density assumption.
b The time interval of t1 to t2 represents the duration of the specific plume.