Source Quantification of South Asian Black Carbon Aerosols with Isotopes and Modeling

Black carbon (BC) aerosols perturb climate and impoverish air quality/human health—affecting ∼1.5 billion people in South Asia. However, the lack of source-diagnostic observations of BC is hindering the evaluation of uncertain bottom-up emission inventories (EIs) and thereby also models/policies. Here, we present dual-isotope-based (Δ14C/δ13C) fingerprinting of wintertime BC at two receptor sites of the continental outflow. Our results show a remarkable similarity in contributions of biomass and fossil combustion, both from the site capturing the highly populated highly polluted Indo-Gangetic Plain footprint (IGP; Δ14C-fbiomass = 50 ± 3%) and the second site in the N. Indian Ocean representing a wider South Asian footprint (52 ± 6%). Yet, both sites reflect distinct δ13C-fingerprints, indicating a distinguishable contribution of C4-biomass burning from peninsular India (PI). Tailored-model-predicted season-averaged BC concentrations (700 ± 440 ng m–3) match observations (740 ± 250 ng m–3), however, unveiling a systematically increasing model-observation bias (+19% to −53%) through winter. Inclusion of BC from open burning alone does not reconcile predictions (fbiomass = 44 ± 8%) with observations. Direct source-segregated comparison reveals regional offsets in anthropogenic emission fluxes in EIs, overestimated fossil-BC in the IGP, and underestimated biomass-BC in PI, which contributes to the model-observation bias. This ground-truthing pinpoints uncertainties in BC emission sources, which benefit both climate/air-quality modeling and mitigation policies in South Asia.


References (S40 -S44) Supplementary Notes
Note S1. A discussion on the radiocarbon (Δ 14 C) isotope endmember for biomass burning, based on trends in atmospheric 14

CO2
Fresh Biomass: A long-term decreasing trend in contemporary Δ 14 C-CO2 has been observed over pristine sites in northern and southern latitudes after the post-bomb period [1][2][3] . Based on observations from the southern hemisphere, the Δ 14 C-CO2 during 2015-2016 was ~ +20‰ 3 . This value also coincides with the observations from a northern mid-latitude site (upon assuming a steady annual decrease of roughly 3‰ since 2012) [1][2][3] . It has been suggested that for tropical regions an additional 5‰ increase in Δ 14 C-CO2 is plausible due to the increasing fossil fuel CO2 emissions in the northern midlatitudes which increases the meridional gradient and thereby the background in the tropics 1 . This aspect was taken into consideration based on the northern and southern hemispherical values for Δ 14 C-CO2. Thus, the Δ 14 C-CO2 for contemporary biomass (meaning one-year plants) for the South Asian region was approximated to be +20±10‰ as of year 2016.
Aged Biomass: The Δ 14 C value for multi-year biomass sources are complicated by the atmospheric nuclear bomb tests in the 1960s, which significantly increased the 14 C levels in CO2 at that time. Although these enriched levels are also decreasing on decadal time-scales, this means that the Δ 14 C signature of a tree is essentially dependent on when/for how long it was growing, and the annual increase in biomass [1][2][3][4][5] . Hence, trees with life-span on the order of 100 years, will have accumulated Δ 14 C signature non-linearly over this time which in general is ~ less than half the age of the tree 4,5 . On the other hand, one-year plants will only represent the atmospheric 14 C signature of CO2 for that year 4 . This complicates the situation as a few assumptions regarding the non-linear biomass accumulation of over time need to be made. Moreover, the age of the trees is region specific and hence the distribution of tree ages is different for different regions of the world (temperate vs. non-temperate) 4,6 .
To test the dependency of the biomass age and the Δ 14 C signature we made a sensitivity analysis.
Although very few trees in South Asia are > 100 years 6 , initially, we assume that biomass was uniformly distributed in trees between 1800 to 2015. This would mean averaging over the entire 14  value was also found to be similar to the endmember value from wood smoke (+102.5‰) derived from averaging sections of an 80-year-old pinewood in an indoor study 7 . A Δ 14 C endmember (+107.5±50‰) from wood burning in temperate regions was also found to be close to the estimated value from the sensitivity test 8 . Thus, based on the observations 5,7,8 and the sensitivity measure we approximated the Δ 14 C endmember for wood logged in the late 1990s and 2000s (~ 20 to 30-year aged biomass) to be +110±70‰.

Note S2. South Asia-specific radiocarbon biomass endmember (Δ 14 Cbiomass)
For South Asia we consider two main biomass combustion sources: C3-plants (the largest group of terrestrial plants e.g., wood, rice, wheat), C4-plants (e.g., sugarcane, maize, millet). This division is based on two principles: i) These source categories together capture most emissions of BC biomass in South Asia 9 and, ii) These sources may be differentiated by using dual-carbon isotope techniques 10,11 .

Note S3. Isotopic fractionation in the biomass source classes
The biomass sources can be divided into two categories based on photosynthetic pathways leading to distinct stable isotopic signatures: C3-plants are named based on the three-carbon compound produced by the CO2 fixation mechanism 10 . The CO2 uptake in this category of plants is limited by the carboxylation step involving large fractionation (~20‰) causing the average  13 CC3 biomass to cluster around -27.1±2.0‰ 10 . The C4-plants produce a four-carbon compound in a more efficient form of CO2 fixation. The rate limiting step is suggested to be diffusion rather than carboxylation involving a relatively smaller isotopic fractionation (~ 4‰). Thus, the average  13 CC4 biomass has been reported to cluster around -13.1± 1.2‰ 10 . Other, presumably less important, factors which contribute to the variability within the  13 C range of C3-and C4 biomass include species, latitude/longitude, soil water deficit, irradiance, topographic position, degree of utilization of respired CO2 10,12,13 .
Combustion of C3-and C4 biomass may induce carbon isotopic fractionation. Both plant type (e.g., initial starting material) and burn conditions (e.g., temperature, humidity and ventilation) have been shown to modulate the fractionation effects 12,13 . The differences in the biochemical fractions of depleted (e.g., lignin, cellulose, lipids) vs enriched (hemicellulose, sugars) compounds in the plant material or differences in the isotopically distinct component diverted to or retained during burning has been reported to affect the magnitude of fractionation 12,13 .
Several field and chamber investigations of aerosol (smoke) and residue (ash) from biomass burning have revealed that the C3-plants generally do not significantly fractionate C isotopes (< 0.7‰), implying that BC produced from burning C3 biomass usually represents the  13 C of the original plant 12,13 . However, the C4-plants have shown to have a plant-specific fractionation (C-13 depletion) ranging from 0.5‰ to 7‰ 10,12,13 . As an example, the isotopic fractionation in indoor burn experiments for sugarcane has yielded only a slight isotopic fractionation (< 1‰) compared to other C4-type grasses which have shown substantial depletion of ~ 4‰ and higher relative to the original plant material 12 .
As the major fraction of crop-residue burnt in South Asia is C3-type followed by C4 (mostly sugarcane) 14 , the isotopic fractionation effects during biomass burning are considered insignificant for these biomass types in South Asia and hence the  13 CBC endmembers for both these categories are assumed to be the same as the original plant material (see SI Table S5 for isotopic endmembers).

Note S4. A discussion on the sample SPX-BHL-121 not used in source apportionment calculations
The sample SPX-BHL-121 was collected between 23 rd Jan 2016 (5:00 pm) -24 th Jan 2016 (8:00 am). The sampling parameters from DH-77 Hi-vol sampler (flow rate-520 l/min, frequency-141, Va-461m 3 ) showed no peculiarity. Further, the OC, EC and IC data were comparable to other samples collected from BCOB during SAPOEX-16. The HYSPLIT based back-trajectories also showed the IGP influence similar to other samples collected during January 2016.
The EC (BC)-isotopic data reported in SI Table S4 for sample SPX-BHL-121 (accession #: OS-141179) was found to be ambiguous. SPX-BHL-121 was isolated for EC-isotopic investigation twice Local emission: BCOB is a rural site with limited activity around the observatory and receives aerosols predominantly from the IGP region during winter (SI Figures S2-S3). Assuming the third-run isolate of SPX-BHL-121 (accession #: OS-141179) is devoid of the aspect of human error, based on the  14 C values showing a highly depleted (> 90% fossil) signature, it is plausible that the sample is affected by emissions from 13 C depleted fuels like natural gas. However, gas-flaring is not a common source of black carbon emissions in South Asia 15,16 . It is possible that there could be an affect from a steamer/cruise ship (running on natural gas) crossing the region in the N. Bay of Bengal. A combination of extremely low wind speeds and stable boundary layer might cause the BCOB footprint to be potentially affected by such an event.
Overall, SPX-BHL-121 has shown large fluctuations in the isotopic signatures and the reasons for these are still unclear. Owing to the uncertainty, we have not included this data in further source apportionment calculations in the current manuscript.

Note S5. Bayesian statistical modeling for SAPOEX-16 study
By combining the dual isotope signatures (Δ 14 C and δ 13 C) and assuming mass balance, it is possible to differentiate the relative contributions from various source classes (SCs): A major complication to solving Equation 1 for realistic applications is the variability in the isotopic signatures of Δ 14 C and δ 13 C of various source classes i.e., endmember variability (e.g., SI Table   S5). For Δ 14 C, the fossil source is completely depleted in 14 C (Δ 14 C = -1000±0‰), the endmember variability thus relates to biomass. For SAPOEX-16, based on weighted black carbon emissions from C3-and C4-biomass source classes, we established a South Asia specific biomass endmember Δ 14 Cbiomass = +70±35‰ (SI Notes S1-S2 and Table S8). For δ 13 C, the endmember variability is larger and less well-constrained, due limited in-source measurements. The δ 13 C endmembers used in this study are based on a thorough literature review of existing δ 13 C data for C3-and C4-biomass source classes (SI Note S3 and Table S5) and from our previous publication where the liquid fossil and fossil coal δ 13 C endmembers were established 11 . As this data is reported in different formats (individual data points; ranges; and mean ± standard deviation) a careful statistical assessment for each source class category was made and then the following endmember values were established: δ 13 Ccoal= -23.4±1.3‰, δ 13 Cliquid fossil= -25.5±1.3‰, δ 13 CC3-biomass= -27.1±2‰, δ 13 CC4-biomass= -13.1±1.2‰. The uncertainties in endmembers dominate over the measurement uncertainties. The narrow variability of our δ 13 C and Δ 14 C data for a given site/month, also suggests this to be the case (SI Tables S4-S5).
It is recognized that in order to correctly estimate the relative source contributions and related uncertainties, the endmember variability as well as other sources of uncertainty needs to be included in the analysis [17][18][19] . Markov Chain Monte Carlo (MCMC)-driven Bayesian approaches have recently been implemented to account for multiple sources of uncertainties/variabilities. Such an approach for isotope-based source apportionment of atmopsheric aerosols was recently established in a previous publication 11 and has been used in multiple studies 15,18,[20][21][22] . Given the authenticity of the δ 13 C endmember distributions and the underlying well-established statistical methodology, the resulting estimates of the relative source contributions are very robust; the resulting probability density functions therefore give a 'least-biased' representation of the precision.
For the determined scenario (as an example MCMC3,coal), we have an isotopic mass-balance as:  Tables S4-S5). The key difference in these scenarios is the number of sources that are used in the source apportionment calculation as well as the type of sources considered. The observed dual-isotope signatures are then apportioned relative to the endmembers of each of these sources (SI Table S5).
Hence, based on the geometry of the data with respect to the endmembers in each of these scenarios, the contribution of the sources in expected to differ. The posterior probability density functions of relative source contributions for the source apportionment conducted using these three MCMC scenarios for SAPOEX-16 is presented in SI Figure S5.

Note S6. FLEXPART-ECLIPSE-GFED modeling for SAPOEX-16 study
For the bottom-up estimates of BC concentrations at MCOH the Lagrangian particle dispersion model FLEXPART version 9.2 23,24 was used. A particular advantage of FLEXPART over ordinary air mass back-trajectory models is that it involves many processes relevant for removal as well as dispersion of aerosols such as wet and dry deposition, convective mixing, turbulence above and below the boundary layer 23 . Furthermore, FLEXPART can be run in two in-time modes: forward and backward. With forward modeling, concentration fields are simulated whereas backward modeling (typically initialized from a measurement location i.e., receptor point) provides source-receptor relationships, also referred to as potential emission sensitivity (PES) 25 . The PES describes the sensitivity of receptor 'z' to source 's'. A matrix with elements forms the source-receptor relation.
The backward modeling is useful to understand contributing source regions and transport pathways to the observation site.  29,30 . The emissions are available on a yearly resolution (from which monthly are derived by splitting into 12 components) for various source types such as residential combustion, transport and shipping, thermal plants. Version 5 has ~44% higher emissions over India than version 4a as regional sources such as wick lamps and diesel generators have been included 16 . For this study, the emissions were explicitly split between biofuel (modern; for example, wood burning) and fossil fuel emissions (SI Tables S6 and S9)

Note S7. Air mass back trajectories and identification of source regions for SAPOEX-16 study
The NOAA Hybrid Single-Particle Lagrangian Integrated Trajectory model (HYSPLIT) version 4 35 was used to generate five-day air mass back trajectories (BTs) for BCOB and nine-day BTs for MCOH using the Global Data Assimilation System (GDAS; 1º × 1º) archived dataset, respectively. The BTs were generated at an arrival height of 100 m for every 3 h. This was followed by detailed air mass cluster analysis using the HYSPLIT desktop-based software (See SI Figure S2). The following air mass To further ascertain the influence of air mass clusters to the aerosol sampling conducted at both sites, the fractional cluster contribution (for every 24h) was computed for the entire duration of the campaign (SI Figure S3 a-b). Overall, it is clear that there was a temporal shift in the meteorology causing air masses from different source regions to be intercepted at BCOB and MCOH during SAPOEX-16. Although this analysis provides quantitative apportionment of the clusters, it is not fully effective in delineating the potential contribution of various source regions to the observed aerosol concentrations at both sites. We, therefore, performed the concentrated weighted trajectory (CWT) analysis 36,37 (SI Figure S3 c-d). In general, CWT analysis has been used to conduct source apportionment of various components such as BC 38 , aerosol optical depth 39 , atmopsheric mercury 40 .
Here we have performed CWT analysis for BC measured at the two receptor sites during SAPOEX-16 (see SI Figure S3 c-d). During CWT analysis, the values of BC concentrations are assigned to the respective BTs and a weighted concentration is allocated to the sequence of grid cells based on the residence time of the BTs as follows : where, Cij is the average weighted concentration in ij th grid cell, Cl is the measured BC concentration of a sample, τijl is the no. of trajectory endpoints in the ij th grid cell associated with Cl sample, L is the total number of BTs. The CWT analysis confirms quantitatively that the most important potential source regions of BC emissions influencing sampling at MCOH during January was the IGP. A particular observation in the CWT analysis is that the peninsular Indian region was the dominant BC emitting source region during ARS and SE Asia air mass clusters at MCOH (see SI Figures S2-S3).
Sampling at BCOB, as inferred from cluster analysis as well, was mainly influenced by BC emitted in the IGP with minor contributions from the N BOB.
The shift in meteorology, and thereby the source regions influencing MCOH, between January to March 2016 is further corroborated by FLEXPART-derived footprint emission sensitivity maps (see SI Figure   S11) and potential source contributions (SI Figure S13). Figure S1.

Fraction fossil estimates from bottom-up emission inventories of black carbon (BC)
for South Asia as shown in SI Table S1.

Figure S2. Air mass clusters during the South Asian Pollution Experiment 2016 (SAPOEX-16).
See methodological description in SI Note S7.        Table S6 for BC-Fire data. This is also corroborated by a slight trend in Δ 14 C-fbio between different air mass influenced periods at MCOH (fbio-Jan: 49±4%; fbio-Mar: 58±5%; see SI Table S4), which together suggest that open burning activities occurred in the latter half of the winter period in peninsular India (see also SI Figures S9, S11 and S13) affecting the sampling at MCOH.   Tables   Table S1.

Fraction fossil estimates from bottom-up emission inventories of BC for South Asia.
The compilation is based on previous publications 9, [21][22][23][24][25][26][27][28][29][30] ; asterisk refers to emission inventory predictions for a future base year. A plot of the same can be seen in SI Figure S1.   Note there were frequent power outages for long duration at BCOB, which was common in Bangladesh in 2016 and more so in remote locations 31 , this was a deterrent for continuous 24h
Fossil endmembers are unchanged as in a previous publication 11 .