Neighborhood-Level Nitrogen Dioxide Inequalities Contribute to Surface Ozone Variability in Houston, Texas

In Houston, Texas, nitrogen dioxide (NO2) air pollution disproportionately affects Black, Latinx, and Asian communities, and high ozone (O3) days are frequent. There is limited knowledge of how NO2 inequalities vary in urban air quality contexts, in part because of the lack of time-varying neighborhood-level NO2 measurements. First, we demonstrate that daily TROPOspheric Monitoring Instrument (TROPOMI) NO2 tropospheric vertical column densities (TVCDs) resolve a major portion of census tract-scale NO2 inequalities in Houston, comparing NO2 inequalities based on TROPOMI TVCDs and spatiotemporally coincident airborne remote sensing (250 m × 560 m) from the NASA TRacking Aerosol Convection ExpeRiment–Air Quality (TRACER-AQ). We further evaluate the application of daily TROPOMI TVCDs to census tract-scale NO2 inequalities (May 2018–November 2022). This includes explaining differences between mean daily NO2 inequalities and those based on TVCDs oversampled to 0.01° × 0.01° and showing that daily NO2 column-surface relationships weaken as a function of observation separation distance. Second, census tract-scale NO2 inequalities, city-wide high O3, and mesoscale airflows are found to covary using principal component and cluster analysis. A generalized additive model of O3 mixing ratios versus NO2 inequalities reproduces established nonlinear relationships between O3 production and NO2 concentrations, providing observational evidence that neighborhood-level NO2 inequalities and O3 are coupled. Consequently, emissions controls specifically in Black, Latinx, and Asian communities will have co-benefits, reducing both NO2 disparities and high O3 days city-wide.


Equation S1. Population-weighted NO2 columns are calculated as the product of the census tract-averaged NO2 TVCD ($\mathrm{NO}_{2,i}$) and the demographic group population ($p_{i,j}$) in the $i$th tract, summed over all tracts with NO2 data ($n$). The summation is divided by the total population of demographic group $j$ across those same tracts ($\sum_{i=1}^{n} p_{i,j}$).
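As a minimal sketch of the weighting in Equation S1, the following Python function computes a population-weighted NO2 column for one demographic group. The function name, the use of `None` to mark tracts without NO2 data, and the toy values are assumptions made for illustration; they are not taken from the paper's analysis code.

```python
def population_weighted_no2(tract_no2, tract_group_pop):
    """Population-weighted NO2 for one demographic group (Eq. S1).

    tract_no2: tract-averaged NO2 TVCDs, with None where a tract has no data.
    tract_group_pop: population of the demographic group in each tract.
    """
    numerator = 0.0
    denominator = 0.0
    for no2, pop in zip(tract_no2, tract_group_pop):
        if no2 is None:
            # Only tracts with NO2 data enter either sum.
            continue
        numerator += no2 * pop
        denominator += pop
    if denominator == 0:
        raise ValueError("no group population in tracts with NO2 data")
    return numerator / denominator


# Hypothetical toy values (arbitrary column units), not data from the study.
toy_no2 = [4.0, 6.0, None, 10.0]
toy_pop = [100, 300, 500, 600]
weighted = population_weighted_no2(toy_no2, toy_pop)
print(weighted)
```

The third tract is skipped in both the numerator and the denominator, so a tract without an observation does not dilute the weighted mean.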
(Eq. S1) $\text{Population-weighted NO}_{2,j} = \dfrac{\sum_{i=1}^{n} \mathrm{NO}_{2,i}\, p_{i,j}}{\sum_{i=1}^{n} p_{i,j}}$

understanding the interactions between different variables. Linear regression models of $Y_i$ and $X_i$ using one explanatory variable are expressed by: $Y_i = \alpha + \beta X_i + \varepsilon_i$, where $\varepsilon_i \sim N(0, \sigma^2)$ and $\alpha$ and $\beta$ are the unknown intercept and slope. Linear models fail when relationships are nonlinear and nonmonotonic, e.g., the PO3 dependence on NO2; therefore, we use a generalized additive model (GAM), which provides better performance for nonlinear relationships and is more appropriate for limited sample sizes than neural networks. In additive modeling, the relationship between $Y_i$ and $X_i$ is expressed by: $Y_i = \alpha + f(X_i) + \varepsilon_i$, where $\varepsilon_i \sim N(0, \sigma^2)$.[4]

We use R to construct the GAM, testing the 'gam' and 'mgcv' packages on NO2 inequalities calculated using TROPOMI NO2 reprocessed on the S5P-PAL system from May 2018-September 2021 and temperature ranges based on quintiles. These data differ from those in the final version of the paper, but the differences should not substantially affect our decisions based on this evaluation. In the 'gam' package, a back-fitting algorithm is used to estimate one smoother at a time. The 'mgcv' package allows for cross-validation and generalized additive mixed modeling. We test a LOESS smoother and a spline smoother in the 'gam' package and a cubic spline smoother in the 'mgcv' package. The core R code of each method is shown in Table S12.

Using LOESS in the 'gam' package, we fit a local polynomial model of order two and set the fraction of the data inside the window to 0.9 to avoid over-wiggling (Figure S8). We fit a smoothing spline using the 'gam' package with three degrees of freedom (df) (Figure S9), which provides the best performance out of 2-10 df based on the Akaike Information Criterion (AIC), a measure that balances goodness of fit against model complexity (Table S13).
Details on the parameters in the core code are provided in Hastie.[5] We used the cubic regression spline function in the 'mgcv' package to fit the GAM, using cross-validation to estimate the optimal amount of smoothing (Figure S10). Briefly, 'mgcv' divides X into intervals and fits a cubic polynomial in each interval; these piecewise polynomials are joined to construct the smoothing curve.
We check the models for homogeneity and normality, as shown in Figures S11-S13 and S14-S16, respectively. Heterogeneity occurs when the data spread is not equal at each X value and can be identified by plotting the residuals against fitted values. We opt against transforming the data, as there is no extreme heterogeneity. We generate QQ-plots of the residuals, finding that the residuals are close to normally distributed for all metrics.[3] Finally, we compare the AIC values of all these methods (Table S13). While the AIC values are similar, the cubic spline smoother in the 'mgcv' package generally gives the best performance, followed by the spline smoother in the 'gam' package. However, 'mgcv' over-wiggles in some cases; for this reason, we use the spline smoother in the 'gam' package with 3 df as our model.
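The AIC-based comparison described above can be illustrated with a small, self-contained sketch. It does not reproduce the R code of Table S12: as a stand-in for the LOESS and spline smoothers, it fits a straight line and a quadratic by ordinary least squares to hypothetical toy data with a peaked (nonmonotonic) shape, then compares them with the Gaussian AIC, n ln(RSS/n) + 2k, which holds up to an additive constant shared by all models fit to the same data. All function names and data values are assumptions made for illustration.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients in ascending order of power.
    """
    k = degree + 1
    a = [[sum(xi ** (p + q) for xi in x) for q in range(k)] for p in range(k)]
    b = [sum(yi * xi ** p for xi, yi in zip(x, y)) for p in range(k)]
    return solve_linear_system(a, b)


def gaussian_aic(x, y, coefs):
    """AIC for Gaussian errors, up to a constant common to all models."""
    n = len(y)
    rss = sum(
        (yi - sum(c * xi ** p for p, c in enumerate(coefs))) ** 2
        for xi, yi in zip(x, y)
    )
    k = len(coefs) + 1  # regression coefficients plus the residual variance
    return n * math.log(rss / n) + 2 * k


# Hypothetical toy data: a peaked response with small deterministic noise.
x = list(range(11))
noise = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2, 0.4, -0.1, -0.5, 0.0]
y = [30 - (xi - 5) ** 2 + e for xi, e in zip(x, noise)]

linear_coefs = fit_polynomial(x, y, 1)
quad_coefs = fit_polynomial(x, y, 2)
aic_linear = gaussian_aic(x, y, linear_coefs)
aic_quadratic = gaussian_aic(x, y, quad_coefs)
print("AIC linear:", aic_linear, "AIC quadratic:", aic_quadratic)
```

The lower AIC identifies the preferred model. For a GAM smoother, the parameter count k would be replaced by the effective degrees of freedom of the smooth term.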
Table S12. Three initial methods tested for GAM construction. The envelopes are the 95% confidence intervals.

Figure S3. Scatterplots between daily UA and MSA-level relative and absolute inequalities on

Figure S5. Scatter plots of absolute NO2 inequalities and surface wind speed during winter months

Figure S6. Scatter plots of absolute NO2 inequalities and surface NO2* during winter months

Figure S8. MSA-level mean MDA8 O3 with fitted value for NO2 inequalities for Black and

Figure S9. MSA-level mean MDA8 O3 with fitted value for NO2 inequalities for Black and

Figure S10. MSA-level mean MDA8 O3 with fitted value for NO2 inequalities for Black and

Figure S11. Homogeneity of GAMs for MSA-level mean MDA8 O3 and NO2 inequalities for

Figure S12. Homogeneity of GAMs for MSA-level mean MDA8 O3 and NO2 inequalities for

Figure S13. Homogeneity of GAMs for MSA-level mean MDA8 O3 and NO2 inequalities for

Figure S14. Normality of GAMs for MSA-level mean MDA8 O3 and NO2 inequalities for Black

Figure S15. Normality of GAMs for MSA-level mean MDA8 O3 and NO2 inequalities for Black

Figure S16. Normality of GAMs for MSA-level mean MDA8 O3 and NO2 inequalities for Black

Table S4. Mean daily inequalities (May 2018-November 2022) by TROPOMI pixel size. The pixel size thresholds are defined according to the pixel size quintiles. Uncertainties are 95% confidence intervals from bootstrapped distributions sampled with replacement $10^4$ times.

Table S5. Mean daily inequalities (May 2018-November 2022) by TROPOMI coverage, defined as the percentage of census tracts with observations. Uncertainties are 95% confidence intervals from bootstrapped distributions sampled with replacement $10^4$ times.

Table S6. Oversampled versus mean daily NO2 inequalities on days classified in each orbit pattern.

Table S7. Mean daily population-weighted NO2 by race-ethnicity in the Houston UA (May 2018-November 2022). Uncertainties are 95% confidence intervals from bootstrapped distributions sampled with replacement $10^4$ times.

Table S8. Weekday inequalities during TRACER-AQ along spatially coincident census tracts also sampled during DISCOVER-AQ. Spatially coincident census tracts are identified using the two GCAS flights with the greatest tract coverage during DISCOVER-AQ (4 September 2013,

Table S10. Demographics and number of census tracts covered along spatially coincident weekday flights during DISCOVER-AQ compared to TRACER-AQ. The differences in demographics with respect to the demographics across the urban area and metro area are reported.

Table S11. Demographics and number of census tracts covered along spatially coincident weekday flights during TRACER-AQ compared to DISCOVER-AQ. The differences in demographics with respect to the demographics across the urban area and metro area are reported.

Table S13. AIC values of the three methods based on absolute inequalities for each race-ethnicity group.
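
The bootstrapped 95% confidence intervals cited in the captions of Tables S4, S5, and S7 can be sketched as follows. The captions state only that distributions are sampled with replacement 10^4 times; this sketch assumes a percentile bootstrap of the mean, which may differ from the procedure actually used. The function name, the fixed seed, and the toy daily values are hypothetical.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (assumed method)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample the data with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[round((alpha / 2) * n_resamples)]
    upper = means[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical daily inequality values (percent), not data from the study.
daily_values = [12.0, 15.5, 9.8, 20.1, 17.3, 11.2, 14.9, 18.4, 13.7, 16.0,
                10.5, 19.2, 12.8, 15.1, 14.2, 17.9, 11.9, 16.6, 13.1, 18.8]
ci_low, ci_high = bootstrap_mean_ci(daily_values)
print(statistics.fmean(daily_values), ci_low, ci_high)
```

The interval bounds are the 2.5th and 97.5th percentiles of the resampled means, so the interval width reflects day-to-day variability in the inequality metric.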