Solvatochromic Effects on the Absorption Spectrum of 2-Thiocytosine

The solvatochromic effects of six different solvents on the UV absorption spectrum of 2-thiocytosine have been studied by a combination of experimental and theoretical techniques. The steady-state absorption spectra show significant shifts of the absorption bands, where in more polar solvents the first absorption maximum shifts to higher transition energies and the second maximum to lower energies. The observed solvatochromic shifts have been rationalized using three popular solvatochromic scales and with high-level multireference quantum chemistry calculations including implicit and explicit solvent effects. It has been found that the dipole moments of the excited states account for some general shifts in the excitation energies, whereas the explicit solvent interactions explain the differences in the spectra recorded in the different solvents.


Level of Theory
Table S1 presents the dielectric constants at low and infinite frequencies employed in the PCM calculations with MOLCAS for the six considered solvents. For the micro-solvated vertical excitation calculations, it was not possible to use the large ANO-RCC-VQZP basis set as in the vacuum reference calculation, because for the EtOH and EtOAc micro-solvated structures this basis set would imply more than 2300 basis functions, currently too many for MS-CASPT2 computations with our available computational resources. Hence, the micro-solvated vertical excitation calculations were conducted with the more economical cc-pVDZ basis set, which leads to only 500 basis functions for the largest systems.
As shown in Reference S1, with such smaller basis sets it is advantageous to set the IPEA shift S2 to zero, because the effect of the small basis set and the effect of the neglected shift cancel out to a large degree.
To verify that the combination "cc-pVDZ+zero IPEA shift" still gives satisfactory results compared to the reference setup "ANO-RCC-VQZP+default IPEA shift", we performed an MS-CASPT2 calculation for the H2O-micro-solvated structure, with ANO-RCC-VQZP for 2tCyt, ANO-RCC-VDZP for water, and the default IPEA shift. This computation should give results very close to a full "ANO-RCC-VQZP+default IPEA shift" computation, as only the water molecules, which do not participate directly in the excitation, are described with the smaller basis set. Figure S1 compares the results of the quadruple-ζ computation (E_QZ+IPEA) with the double-ζ computation (E_DZ−IPEA, taken from Table 4 in the main manuscript). The two methods agree very well with each other, except for a constant shift of −0.17 eV. The standard deviation of E_DZ−IPEA from the dashed line is 0.08 eV, which is significantly less than the variations of the excitation energies due to the solvent effects in Figure 6 in the main manuscript.
Figure S1: Comparison of the MS-CASPT2(14,10) vertical excitation energies for micro-solvated (H2O) 2tCyt with "ANO-RCC-VQZP+default IPEA shift" (E_QZ+IPEA) and with "cc-pVDZ+zero IPEA shift" (E_DZ−IPEA). Both computations employ the same geometry (optimized with BP86/aug-cc-pVDZ, see below for coordinates).
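The two numbers quoted for this comparison (a constant shift and a standard deviation about the shifted line) can be illustrated with a minimal sketch. It assumes that the dashed line is the diagonal displaced by the mean difference and that the population standard deviation is meant; neither detail is stated in the text. The function name and the energies are hypothetical and are not the values behind Figure S1.

```python
import math


def offset_and_scatter(e_ref, e_cheap):
    """Return (mean offset, standard deviation about that offset).

    e_ref and e_cheap are equally long lists of excitation energies
    for the same states at the two levels of theory.
    """
    if len(e_ref) != len(e_cheap) or not e_ref:
        raise ValueError("need two non-empty lists of equal length")
    # Per-state difference between the cheaper and the reference calculation.
    diffs = [c - r for r, c in zip(e_ref, e_cheap)]
    shift = sum(diffs) / len(diffs)
    # Population standard deviation of the differences (an assumption).
    variance = sum((d - shift) ** 2 for d in diffs) / len(diffs)
    return shift, math.sqrt(variance)


# Hypothetical excitation energies in eV, for illustration only.
e_qz = [3.50, 4.10, 4.60, 5.00]
e_dz = [3.30, 3.95, 4.45, 4.80]
shift, scatter = offset_and_scatter(e_qz, e_dz)
print(f"shift = {shift:.3f} eV, scatter = {scatter:.3f} eV")
```

A small scatter relative to the solvent-induced shifts is what justifies treating the cheaper setup as a uniformly shifted copy of the reference.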

Gaussian fitting of the absorption spectra
The absorption spectra in Figure 3 in the main manuscript were fitted in the energy domain with the following equation:
\[ \sigma(E) = \sum_{i=1}^{N} A_i \exp\left[-4\ln 2\,\frac{(E-E_i)^2}{F_i^2}\right] \]
where σ(E) is the total fitted spectrum, A_i are the heights of the Gaussians (i.e., the molar absorptivities), E_i are the central energies of the Gaussians, F_i are the full widths at half maximum (FWHM), and N = 5 or 3 is the number of Gaussians employed. The spectra were fitted to this function using Gnuplot 5.0 S3 with the Marquardt-Levenberg algorithm, under the constraint that the molar absorptivities A_i remain larger than zero. The fitted parameters and their standard errors are given in Table S2.
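As an illustration of the model function described above, the following sketch evaluates a sum of Gaussians parametrized by height, central energy, and FWHM. It is not the Gnuplot script used for the fits; the function name and the band parameters are hypothetical. The factor 4 ln 2 is what makes F_i the full width at half maximum.

```python
import math


def gaussian_sum(energy, params):
    """Evaluate a sum of Gaussians at one energy.

    params is a list of (A_i, E_i, F_i) tuples: height, central energy,
    and full width at half maximum of each Gaussian.
    """
    total = 0.0
    for height, center, fwhm in params:
        # exp(-4 ln2 x^2 / F^2) drops to 1/2 at x = F/2.
        total += height * math.exp(
            -4.0 * math.log(2.0) * ((energy - center) / fwhm) ** 2
        )
    return total


# Hypothetical single band: 10000 M^-1 cm^-1 high, centered at 4.5 eV,
# with a FWHM of 0.6 eV.
single = [(10000.0, 4.5, 0.6)]
peak = gaussian_sum(4.5, single)        # value at the band center
half = gaussian_sum(4.5 + 0.3, single)  # value half a FWHM away
print(peak, half)
```

Evaluating the band half a FWHM away from its center returns half the peak height, which confirms that the width parameter is a FWHM and not a standard deviation.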
As can be seen in the table, the standard errors for g_1 and g_2 are all relatively small (at most 15% for A_1 of MeOH). For the remaining Gaussians g_3 to g_5, the central energies also show small relative errors (at most 7%). Some of the FWHM and absorptivities show larger errors, for different reasons: Gaussian g_4 for EtOH and MeOH has only a very small intensity; Gaussians g_3 and g_4 for ACN and H2O are noticeably correlated; and g_5 is fitted to the absorption at the spectral boundary, making it statistically less meaningful.
The fits show that the EtOAc and DMSO spectra can be well described with three Gaussians. For ACN, at least four Gaussians are required, although the shape of the second-lowest absorption band suggests that it is better described with one additional Gaussian, for a total of five. Similarly, the measured spectra of EtOH, MeOH, and H2O could possibly be described with four Gaussians, but the band shape suggests that five Gaussians better describe the spectral features. Figure S2 presents the residuals from the Gaussian fits. The absolute values of the residuals never exceed 1000 M⁻¹ cm⁻¹, and this value is only reached at the edge of the EtOAc spectrum, where fitting is difficult because the second absorption band is covered by solvent absorption. As can be seen in the figure, the relative errors rarely exceed 5%, with the spectrum in EtOAc showing the largest relative errors. In general, the oscillations in the residuals arise from the deviation of the actual line shape from the idealized Gaussian line shape we assumed in the fits. These oscillations could be reduced by fitting with more Gaussians, but since the residuals are already small, these additional Gaussians would hardly be statistically meaningful. We are therefore confident that three or five Gaussians are the proper number to describe the measured spectra.
Figure S2: Residuals from the Gaussian decompositions of the absorptivity spectra of 2tCyt (arranged as in Figure 3 of the main manuscript, but note the different y axis range). The gray area reproduces the experimental spectra divided by 20, therefore showing where the residuals exceed a relative error of 5%.
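The 5% criterion used in Figure S2 (residual compared against the spectrum divided by 20) can be expressed as a short check. This is a sketch with hypothetical names and made-up absorptivities, not the procedure used to produce the figure.

```python
def exceeds_relative_error(measured, fitted, fraction=0.05):
    """Mark the points where |measured - fitted| > fraction * |measured|.

    With fraction = 0.05 this is the same comparison as plotting the
    residual against the measured spectrum divided by 20.
    """
    return [abs(m - f) > fraction * abs(m) for m, f in zip(measured, fitted)]


# Made-up absorptivities in M^-1 cm^-1, for illustration only.
measured = [1000.0, 8000.0, 12000.0]
fitted = [1100.0, 8100.0, 12000.0]
flags = exceeds_relative_error(measured, fitted)
print(flags)
```

The same absolute residual of 100 M⁻¹ cm⁻¹ is flagged in the weak part of the spectrum but not near the band maximum, which is why the largest relative errors appear where the absorptivity is small.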

Solvatochromic Analyses
We performed three different solvatochromic analyses with the absorption energies from the six solvents considered here. In these analyses, the solvent-dependent energy shifts of the absorption bands are linearly related to several solvatochromic parameters, as given by Catalán, S5 or Reichardt. S6 All analyses use the same absorption energy data. We consider the energies of the lowest-energy absorption maximum E_max, as well as the energies where the red tail falls below a given absorptivity. The latter energies are denoted as E_200, E_300, E_400, and E_500 for the energies where the absorptivity is 200 to 500 M⁻¹ cm⁻¹, respectively. We focus on these two spectral features, in part because the corresponding transitions play a key role in the photophysics of 2tCyt, but also because in DMSO and EtOAc the high-energy band is overshadowed by the strong absorption of the solvents below 250 nm and thus we cannot determine the energetic position of this band. The fitting data are presented in Table S3.
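One way to read off a red-tail energy such as E_200 from a sampled spectrum is linear interpolation between the two points that bracket the threshold. The sketch below assumes the spectrum is given on an ascending energy grid; the text does not say how the values were extracted, and the function name and data are hypothetical.

```python
def red_tail_energy(energies, absorptivities, threshold):
    """Lowest energy at which the absorptivity rises through threshold.

    energies must be in ascending order. The crossing is located by
    linear interpolation; None is returned if it is never reached.
    """
    points = list(zip(energies, absorptivities))
    for (e0, a0), (e1, a1) in zip(points, points[1:]):
        if a0 < threshold <= a1:
            # a1 > a0 is guaranteed here, so the division is safe.
            return e0 + (threshold - a0) * (e1 - e0) / (a1 - a0)
    return None


# Made-up red-tail data (eV, M^-1 cm^-1), for illustration only.
energies = [3.0, 3.1, 3.2, 3.3]
absorptivities = [50.0, 150.0, 350.0, 900.0]
e200 = red_tail_energy(energies, absorptivities, 200.0)
print(e200)
```

Scanning from the low-energy side ensures that the crossing on the red tail is returned even if the absorptivity passes the same threshold again at higher energies.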

Catalán Analysis
The Catalán solvent scale employs four parameters: the solvent acidity SA, solvent basicity SB, solvent dipolarity SdP, and solvent polarizability SP. The parameters for the six solvents are given in Table S4. [...] whereas the other single-parameter fits did not satisfactorily explain the data (R² = 0.52 for "only SB", R² = 0.17 for "only SdP", R² = 0.08 for "only SP"). The "SA+SB" model yielded (R² = 0.99):

Both models also have a high statistical significance (p < 0.001). Adding the SdP and/or SP parameters to these two models did not lead to statistically significant improvements of the fit.
For the red tail absorption energies, no single-parameter model describes the energies in Table S3 well. The best results are obtained for the E_200 set: "only SA" gives R² = 0.81, "only SB" R² = 0.47, "only SdP" R² = 0.05, and "only SP" R² = 0.00. The best of those, the "only SA" model, is given by: [...] Models with more than one parameter do not significantly improve the fits. Hence, the most important parameter for the position of the red absorption tail seems to be the solvent acidity, although it appears that the Catalán parameters do not describe the tail position as well as the maximum position. This could be due to the low signal-to-noise ratio of the red tail data.
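A single-parameter model of the kind compared above ("only SA", "only SB", and so on) is an ordinary least-squares line together with its coefficient of determination R². The following sketch shows that calculation in plain Python. The function name and the data are hypothetical; the numbers are neither Catalán parameters nor measured energies.

```python
def fit_line(x, y):
    """Least-squares fit y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with >= 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0.0 or ss_tot == 0.0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1.0 - ss_res / ss_tot


# Made-up solvent parameter and absorption energies (eV) for six solvents.
param = [0.0, 0.1, 0.3, 0.5, 0.7, 1.0]
e_abs = [4.50, 4.52, 4.56, 4.60, 4.64, 4.70]
intercept, slope, r2 = fit_line(param, e_abs)
print(f"E = {intercept:.3f} + {slope:.3f} * x, R^2 = {r2:.3f}")
```

With only six solvents, each added parameter removes one of very few degrees of freedom, which is why a higher R² alone does not establish that a multi-parameter model is a significant improvement.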

Reichardt Analysis
We also carried out a solvatochromic analysis based on the Reichardt parameter E_T^N, S6 considering also a second, general parameter f(n²) describing induction and dispersion. S10,S11 The parameters S6 for the six solvents are given in Table S6. Note that we have chosen to use the dimensionless E_T^N parameter rather than Reichardt's E_T(30) values, because the parameters of the Catalán and Kamlet-Taft analyses are also dimensionless. Because Reichardt and coworkers S11 argue that the f(n²) parameter should not be applied to protic solvents, we assume a value of zero for these solvents (see Table S6). The linear regression for the lowest-energy maximum showed that the E_T^N parameter can describe the absorption energies quite well (p < 0.01): [...] The addition of the f(n²) parameter to the model does not lead to statistically significant improvements of the fit. However, as the E_T^N parameter subsumes both hydrogen bond and electrostatic interactions, this good fit does not allow conclusions to be drawn regarding the influence of the different solvent effects.
For the red absorption tail data, the E_T^N and f(n²) parameters do not provide satisfactory fits, with the largest R² values around 0.6.
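For reference, the dimensionless Reichardt parameter is commonly obtained from the E_T(30) values (in kcal/mol) by normalizing between tetramethylsilane (TMS) and water. The relation below is the standard definition as we recall it, not a formula quoted from this document:

```latex
E_T^N = \frac{E_T(\text{solvent}) - E_T(\text{TMS})}{E_T(\text{water}) - E_T(\text{TMS})}
      = \frac{E_T(30) - 30.7}{32.4}
```

By construction, water has E_T^N = 1 and TMS has E_T^N = 0, so all common solvents fall between these two limits.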

Structural Parameters
In Table S7, we compile the most relevant bond lengths of 2tCyt in the different solvents, taken from the coordinate data below. Columns printed in bold (the S=C2-N3=C4-N moiety mentioned in the main manuscript) are the ones exhibiting the largest changes when going from vacuum to water.
Table S7: Bond lengths (Å) of 2tCyt from the above coordinate data.
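A bond length follows from Cartesian coordinates as the Euclidean distance between the two atoms. The sketch below uses made-up coordinates and a hypothetical function name; it does not reproduce any entry of Table S7.

```python
import math


def bond_length(atom_a, atom_b):
    """Distance between two atoms given as (x, y, z) tuples in Å."""
    return math.dist(atom_a, atom_b)


# Made-up coordinates in Å, for illustration only.
atom_1 = (0.0, 0.0, 0.0)
atom_2 = (1.2, 0.9, 0.0)
length = bond_length(atom_1, atom_2)
print(f"{length:.3f} Å")
```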

Comparison of Gas Phase Calculations
In Figure S3 we compare the gas phase energies, reported in the main manuscript in Tables 3 and 4. In Table 3, the results were computed with MS(9)-CASPT2(14,10)/ANO-RCC-VQZP (default IPEA shift, no level shift). In Table 4, the results were computed with the smaller cc-pVDZ basis set (IPEA shift set to zero, 0.3 a.u. imaginary level shift). The agreement for these two gas-phase calculations is slightly worse than the one above for the micro-solvated geometries. Nevertheless, the standard deviation is only 0.12 eV for the energies and 0.03 for the oscillator strengths, which shows that the cheaper level of theory can reasonably reproduce the results of the more expensive computation. In particular, for the two states involved in the first absorption band, the oscillator strength of S₄ is well reproduced with 0.69 vs. 0.66, whereas the oscillator strength of S₂ is 0.02 vs. 0.06. These differences are due to slightly different mixing of S₂ and S₄ at the two levels of theory.
Figure S3: Comparison of the MS-CASPT2(14,10) vertical excitation energies (a) and oscillator strengths (b) for gas-phase 2tCyt with "ANO-RCC-VQZP+default IPEA shift" (E_Table 3) and with "cc-pVDZ+zero IPEA shift" (E_Table 4). The computations employ different geometries ("Table 3" optimized with RI-MP2/cc-pVQZ, "Table 4" with BP86/aug-cc-pVDZ).