A Picture of Disorder in Hydrous Wadsleyite—Under the Combined Microscope of Solid-State NMR Spectroscopy and Ab Initio Random Structure Searching

The Earth’s transition zone, at depths of 410–660 km, while being composed of nominally anhydrous magnesium silicate minerals, may be subject to significant hydration. Little is known about the mechanism of hydration, despite the vital role this plays in the physical and chemical properties of the mantle, leading to a need for improved structural characterization. Here we present an ab initio random structure searching (AIRSS) investigation of semihydrous (1.65 wt % H2O) and fully hydrous (3.3 wt % H2O) wadsleyite. Following the AIRSS process, k-means clustering was used to select sets of structures with duplicates removed, which were then subjected to further geometry optimization with tighter constraints prior to NMR calculations. Semihydrous models identify a ground-state structure (Mg3 vacancies, O1–H hydroxyls) that aligns with a number of previous experimental observations. However, predicted NMR parameters fail to reproduce low-intensity signals observed in solid-state NMR spectra. In contrast, the fully hydrous models produced by AIRSS, which enable both isolated and clustered defects, are able to explain observed NMR signals via just four low-enthalpy structures: (i) a ground state, with isolated Mg3 vacancies and O1–H hydroxyls; (ii/iii) edge-sharing Mg3 vacancies with O1–H and O3–H species; and (iv) edge-sharing Mg1 and Mg3 vacancies with O1–H, O3–H, and O4–H hydroxyls. Thus, the combination of advanced structure searching approaches and solid-state NMR spectroscopy is able to provide new and detailed insight into the structure of this important mantle mineral.


S1. Referencing of DFT-calculated NMR parameters
To compare experimental chemical shift values to those calculated using DFT, the calculated isotropic chemical shielding values, σ_iso^calc, must be converted to a corresponding calculated shift, δ_iso^calc, by

$$\delta_{iso}^{calc} = \sigma_{ref} - \sigma_{iso}^{calc}$$

where σ_ref is a reference shielding. The most straightforward way to determine σ_ref is to compare the experimental shift and calculated shielding for a simple model compound for which the experimental shift is accurately determined and the structure is well understood.
Though this approach can work well, in some cases uncertainties in the experimental measurements of the NMR parameters and in the structural models (and therefore in the calculated shielding values), together with the limitations of the DFT approximation employed, can lead to errors in the computed chemical shifts. If a material contains more than one crystallographically distinct species, an alternative approach is to consider only the relative shift/shielding differences between the peaks (i.e., determining a reference shielding from either one, or the average of several, shift/shielding values within the selected reference material). A more robust approach, however, is to compare the experimental shift and calculated shielding for multiple materials. Although this approach reduces the effect of any experimental errors or structural uncertainties, it cannot correct for errors arising from the chosen exchange-correlation functional. Therefore, to mitigate any issues of chemical transferability, it is preferable that the set of materials used to generate the computed chemical shifts be as similar to the system of interest as possible.
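The referencing strategies described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in this work: it assumes the usual convention δ = σ_ref − σ, the function names are hypothetical, and the numerical values are invented purely to exercise the code.

```python
import numpy as np

def shift_from_shielding(sigma_calc, sigma_ref):
    """Convert calculated isotropic shielding(s) to shift(s): delta = sigma_ref - sigma."""
    return sigma_ref - np.asarray(sigma_calc, dtype=float)

def reference_from_compounds(delta_exp, sigma_calc):
    """Reference shielding from several (shift, shielding) pairs.

    The slope is fixed at -1, so sigma_ref is simply the mean of
    (delta_exp + sigma_calc) over all pairs supplied.
    """
    delta_exp = np.asarray(delta_exp, dtype=float)
    sigma_calc = np.asarray(sigma_calc, dtype=float)
    return float(np.mean(delta_exp + sigma_calc))

def linear_reference(delta_exp, sigma_calc):
    """Alternative: least-squares fit of delta_exp = slope * sigma_calc + intercept."""
    slope, intercept = np.polyfit(sigma_calc, delta_exp, 1)
    return float(slope), float(intercept)

# Hypothetical values in ppm, for illustration only (not data from this work).
sigma_calc = [30.0, 28.0, 25.0]
delta_exp = [0.5, 2.5, 5.5]

sigma_ref = reference_from_compounds(delta_exp, sigma_calc)
slope, intercept = linear_reference(delta_exp, sigma_calc)
predicted = shift_from_shielding(27.0, sigma_ref)
```

Fixing the slope at −1 keeps a single adjustable parameter (σ_ref), whereas the linear fit also absorbs systematic scaling errors; neither can remove errors that differ between chemically dissimilar materials, which is why a reference set similar to the system of interest is preferred.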
In Tables S1.1 and S1.2, where more than one of a particular protonation arrangement is observed, i.e., when two or four protons are all located on O1 sites, the average 1H σ_iso^calc is used.

S3. k-means clustering of AIRSS-generated hydrous wadsleyite structures
A k-means clustering approach was adopted for the selection of smaller subsets of AIRSS-generated structures for further study. Scripts were developed in Python 2.7, using the structural and statistical functionality of the Soprano library, 12 which extends the Atomic Simulation Environment (ASE) library 13 for the analysis of computed crystal structures. k-means clustering, a method of "unsupervised machine learning", involves the subdivision of a dataset into k groups, called "clusters", whose members are determined to be similar according to a normalized set of pre-defined parameters, known as "genes". In an iterative process, k data points are first chosen at random and neighboring data points become "members" of these clusters. Thereafter, the central data point of each cluster is set to that closest to the cluster mean, and the cluster members are refined according to their distances to the mean data point. For a set of chemical structures, the end goal is to segregate the set into k groups of like structures, from each of which one representative structure can be used for further studies while, with due caution, the remaining cluster members can be discarded.
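The selection procedure can be illustrated with a short, self-contained sketch. It is not the Soprano-based script used here: it implements standard (Lloyd) k-means in NumPy on min-max normalized genes and then picks, for each cluster, the member closest to the cluster mean as the representative. The function names, the normalization scheme and the example gene values are all assumptions made for illustration.

```python
import numpy as np

def normalize_genes(genes):
    """Scale each gene (column) to the range [0, 1]; constant columns map to 0."""
    genes = np.asarray(genes, dtype=float)
    lo = genes.min(axis=0)
    span = genes.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant genes
    return (genes - lo) / span

def assign(x, centres):
    """Label each data point with the index of its nearest centre."""
    dist = np.linalg.norm(x[:, None, :] - centres[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans_representatives(genes, k, n_iter=100, seed=0):
    """Return (labels, representatives) for k clusters.

    representatives holds, per non-empty cluster, the index of the
    structure closest to the cluster mean.
    """
    rng = np.random.default_rng(seed)
    x = normalize_genes(genes)
    # Start from k distinct data points chosen at random.
    centres = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(x, centres)
        new_centres = centres.copy()
        for j in range(k):
            members = x[labels == j]
            if len(members):
                new_centres[j] = members.mean(axis=0)
        converged = np.allclose(new_centres, centres)
        centres = new_centres
        if converged:
            break
    labels = assign(x, centres)
    representatives = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if len(idx):
            d = np.linalg.norm(x[idx] - centres[j], axis=1)
            representatives.append(int(idx[d.argmin()]))
    return labels, representatives

# Hypothetical genes: (relative enthalpy in eV, integer code for the O type).
genes = [[0.0, 0], [0.1, 0], [0.2, 0], [5.0, 1], [5.1, 1], [5.2, 1]]
labels, reps = kmeans_representatives(genes, k=2)
```

Only the structures indexed by `reps` would be carried forward; the remaining members of each cluster are treated as near-duplicates of their representative.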

Semi-hydrous wadsleyite clustering
For clustering, genes were constructed based on: the relative enthalpy, ΔH, with respect to the lowest enthalpy structure in the series of semi-hydrous structures with a

The quality of clustering was assessed by plotting each cluster as a Gaussian function against values contained in that cluster according to a specific gene: or O type of a given structure, µ is the mean of that quantity for all structures in the cluster to which that structure belongs and σ is the standard deviation of that value in the cluster. Equation S3.1 was used for the ΔH and OH genes and Equation S3.2 was used for the O-type gene, since a value of σ = 0 was possible and precluded visualization. In the case of the ΔH gene, the aim was to have narrow Gaussians at low ΔH and broad Gaussians at high ΔH (see Figure S3.2a), since the low-ΔH structures would be better candidates for study. A weighting was applied to the O-type gene, owing to its discrete nature, such that each cluster contained structures with a single common O type. This is illustrated in Figure S3.2c, where the Gaussians are generally narrow. The robustness of the method for structure selection was tested by comparing the selected structures against the series of structures with a Mg3 vacancy studied in previous work, 11 where the aim was to manually select structures for their uniqueness. Indeed, the fewest structures were selected where we had found structures to be identical (or very similar), normally forming plateaus in ΔH, and more structures were selected where more structural diversity was found, such as at higher energy.
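As an illustration of this kind of cluster-quality check, the sketch below evaluates one Gaussian per cluster from the mean and standard deviation of a single gene. The functional form (a unit-height Gaussian) and the small floor placed on σ to keep zero-width clusters plottable are assumptions made here; they are not the forms of Equations S3.1 and S3.2, and the example values are hypothetical.

```python
import numpy as np

def cluster_gaussians(values, labels, grid, min_sigma=1e-3):
    """Return {cluster label: Gaussian evaluated on grid} for one gene.

    Each curve is exp(-(x - mu)^2 / (2 sigma^2)), with mu and sigma the mean
    and standard deviation of the gene within the cluster. sigma is floored
    at min_sigma (an assumption) so clusters with sigma = 0 remain visible.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    grid = np.asarray(grid, dtype=float)
    curves = {}
    for c in np.unique(labels):
        v = values[labels == c]
        mu = v.mean()
        sigma = max(v.std(), min_sigma)
        curves[int(c)] = np.exp(-0.5 * ((grid - mu) / sigma) ** 2)
    return curves

# Hypothetical gene values (e.g. relative enthalpy in eV) and cluster labels.
values = [0.0, 0.1, 0.2, 1.0, 1.0]
labels = [0, 0, 0, 1, 1]
grid = np.linspace(0.0, 1.0, 11)
curves = cluster_gaussians(values, labels, grid)
```

Narrow curves indicate tightly defined clusters for that gene; a broad curve signals that a cluster spans a wide range of values and its representative should be treated with more caution.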

S4. Additional plots of structural and NMR parameters for semi-hydrous wadsleyite
The plots shown in Figure S4.1 highlight the variation in 1H δ_iso and 2H C_Q with the hydroxyl bond distance for the 58 semi-hydrous wadsleyite structures with ΔH < 1.0 eV. The plots shown in Figure S4.2 highlight the variation in 1H δ_iso and 2H C_Q with the hydrogen-bond distance for all 88 semi-hydrous wadsleyite structures. Figures S4.2a and S4.2b show there are reasonably well-defined regions of 1H δ_iso and 2H C_Q for Mg-OH and Si-OH environments. Figures S4.2c and S4.2d show that the hydrogen-bond distance for the O2-H hydroxyl is noticeably larger than the hydrogen-bond distances for protons on either O3 or O4 sites. Figures S4.2c and S4.2d also show that the hydrogen-bond distances, as well as the 1H δ_iso and 2H C_Q values, for protons on O3 or O4 sites are very similar and cover the same range, meaning that no clear distinction between a protonated O3 site and a protonated O4 site can be made on the basis of the hydrogen-bond distances.

S5. The enthalpic stability of Si-vacant fully-hydrous wadsleyite
Fully-hydrous wadsleyite models with Si4+ vacancies were produced by changing the AIRSS protocol such that four H atoms were added to a unit cell of β-Mg2SiO4 with a single Si atom removed (producing 251 structures). The resulting enthalpy profile is shown in Figure S5.

In order to compare hydration mechanisms featuring Si vacancies against those with Mg vacancies, a balanced isodesmic reaction was set up interchanging these materials. The formula for a unit cell of wadsleyite structures with Si vacancies (I) is  Table 2 in the main text and

S8. Additional NMR experiments
The plot of calculated 1H δ_iso against 29Si δ_iso for motifs G-J in Figure 9c (main text) shows four groups of points with reasonably well-defined regions of chemical shift, corresponding to the four different types of H⋯Si interactions. In contrast, the 1H-29Si CP HETCOR spectrum published by Griffin et al. 10 shows two regions of significant intensity, corresponding to MgOH (more specifically MgOH⋯SiO) and Si-OH environments, with no significant signal observed for the two additional sets of 1H-29Si correlations shown in Figure 9c. However, lower-intensity signals are present in this region of the spectrum, as seen in Figure S8.1, which shows three 1H-29Si HETCOR spectra of fully-hydrous wadsleyite (containing ~3 wt % water), acquired with three different CP contact times.

S9. Experimental FTIR and simulated IR spectra of fully-hydrous wadsleyite
The hydrous wadsleyite sample used in NMR experiments herein and by Griffin et al. 10 was also studied using Fourier-transform infrared (FTIR) experiments, providing an opportunity to compare simulated IR frequencies and intensities from the set of AIRSS-generated structures (i.e., G-J) used to rationalize the NMR data with experimental measurements. FTIR spectra were recorded using a Bruker Hyperion 2000 microscope and