Fast Time-Dependent Density Functional Theory Calculations of the X ‑ ray Absorption Spectroscopy of Large Systems

: The computational cost of calculations of K -edge X-ray absorption spectra using time-dependent density functional (TDDFT) within the Tamm − Danco ﬀ approximation is signi ﬁ cantly reduced through the introduction of a severe integral screening procedure that includes only integrals that involve the core s basis function of the absorbing atom(s) coupled with a reduced quality numerical quadrature for integrals associated with the exchange and correlation functionals. The memory required for the calculations is reduced through construction of the TDDFT matrix within the absorbing core orbitals excitation space and exploiting further truncation of the virtual orbital space. The resulting method, denoted fTDDFT s , leads to much faster calculations and makes the study of large systems tractable. The capability of the method is demonstrated through calculations of the X-ray absorption spectra at the carbon K -edge of chlorophyll a , C 60 and C 70 .


■ INTRODUCTION
In recent years, advances in the intensity and resolution obtainable with synchrotron radiation has led to significant progress in spectroscopic techniques in the X-ray region. 1 With these techniques it is now possible to achieve a much greater richness in detail combined with the advantage that X-ray spectroscopy provides a local probe of structure. The development of X-ray free-electron lasers that can deliver short femtosecond pulses of hard X-rays has opened up new prospects in time-resolved X-ray absorption measurements that can probe ultrafast chemical processes at an atomic level. 2 In this work, we focus on X-ray absorption spectroscopy (XAS), or more specifically near-edge X-ray absorption fine structure (NEXAFS), which arises from the excitation of a core electron to low-lying virtual orbitals to form a state below the ionization threshold. Currently, XAS is used in a wide range of applications, including surface science, 3,4 bioorganic chemistry, 5−7 and to probe the structure of liquids. 8 Alongside the progress in experimental measurements there has been a considerable effort in the development of computational methods that can accurately predict X-ray absorption spectra. Many early calculations of XAS were based on the direct static exchange (STEX) method. 9−11 The approximations made in this approach include the neglect of electron correlation and the independent channel approximation. In later work the transition potential method 12,13 was introduced in an effort to improve the STEX approach. In this approach, the orbital binding energy is computed as the derivative of the total energy with respect to the orbital occupation number. To take into account the relaxation of the orbitals, the energy is approximated by calculating the derivative at the point corresponding to an occupation of one-half. Formally, this corresponds to a core orbital with half an electron removed which captures a balance between final and initial states. XAS can also be computed using the Bethe− Salpeter equation. 14,15 A more recent trend in the development of methods for the calculation of XAS is to use established methods for the calculation of the valence-electron spectroscopy and adapt them for the calculation of core−electron spectroscopies. Within correlated wave function based theories, calculation of XAS has been reported within coupled cluster theory, 16−20 the second-order algebraic diagrammatic construction scheme (ADC), 21−23 and singles configuration interaction with second-order perturbation theory (CIS(D)). 24 Time-dependent density functional theory (TDDFT) is another approach to simulating XAS, and a review of the application of TDDFT to core excitations is available. 25 Within TDDFT core-excitations can be computed efficiently by limiting the single excitation space to include only excitations from the relevant core orbital, 26 using a projection scheme to find excitation energies in a specified range, 27 or a resonant converged complex polarization propagator approach. 28−30 One of the limitations of TDDFT calculations of core excitations using standard exchange-correlation functionals is that the predicted excitation energies are underestimated relative to experiment. 31 This has been attributed to the selfinteraction error associated with approximate exchange functionals and a number of solutions to this problem have been proposed. 32−37 One such scheme involves the introduction of a large fraction of Hartree−Fock exchange in the short-range 38 with the resulting functionals providing accurate predictions of the transition energies. 39−42 More recent work has extended the application of these methods to study X-ray emission spectroscopy. 43−46 While the computational cost of NEXAFS calculations with standard linear response TDDFT is considerably less than the cost of the wave function based approaches, it still limits the application of TDDFT. For large systems, the cost of simulating NEXAFS spectra is considerably greater than that of valence electronic spectra due to the high density of states in the NEXAFS region. This often requires a very large number of states to be evaluated, particularly when excitations from many core orbitals need to be considered, for example at the carbon K-edge of organic molecules. Evaluating such a large number of states can be challenging with respect to the time required for the calculations and also, more critically, the memory required. Recently, Grimme introduced a simplified TDDFT approach that enabled calculations of UV spectroscopy to be performed for systems with 500−1000 atoms, achieving speed ups of two to three orders of magnitude compared to standard TDDFT. 47 The success of this approach is predicated on imposing a reduction in the excitation space and also simplifying the matrix elements within the TDDFT calculation. Within the context of NEXAFS calculations it has been demonstrated that the use of severe integral screening and reduced quality integration grids results in a considerable reduction in the computational cost with the introduction of a small associated error. 48 In this paper, we describe an implementation of TDDFT within the Tamm−Dancoff approximation (TDA) to compute NEXAFS spectra that adopts a strategy of evaluating only the most significant integrals in conjunction with truncating the available single excitation space to provide a fast TDDFT approach with reduced memory requirements that can be applied to study the NEXAFS of large systems. The method is demonstrated on a several systems including organic molecules, fullerenes, and transition metal complexes.

■ COMPUTATIONAL DETAILS
Calculations of XAS with TDDFT usually adopt the TDA wherein the excitation energies and associated transition dipole moments are obtained from the following equation 49 ω = AX X (1) The matrix A is given by and ϵ i are the orbital energies and E XC is the exchange correlation functional. For the calculation of X-ray absorption spectra it is usual to limit A to include only excitations from the relevant core orbitals. 26 Within this approach, the roots corresponding to the required core-excitations are the lowest energy roots and can be found efficiently using the Davidson procedure. 50 Since there is virtually no mixing between the excitations at a given K-edge and other excitations in the molecule, this approximation has no significant effect on the computed transition energies and oscillator strengths.
For the study of large systems where it may be necessary to compute a very large number of states, calculations within this formalism can become prohibitively expensive owing to a number of contributory factors. One of the primary reasons for the growth in computational cost is associated with the evaluation of all of the integrals required to form the matrix elements of A. It has been shown that the cost of evaluating the two-electron integrals can be reduced significantly through the use of standard integral screening techniques but with larger thresholds, resulting in far fewer integrals being evaluated. In quantum chemistry codes it is standard to screen two-electron integrals in which the integrals where Ω ab is a product of two Gaussian basis functions, are precalculated. Only integrals for which τ > G G ab cd (7) are evaluated. A typical value of τ is 10 −11 a.u., however, for Kedge core-excitations a much larger value of τ = 10 −3 a.u. has been shown to have a very small effect on the computed transition energies and oscillator strengths while resulting in significant computational savings. 48 In some cases, particularly if diffuse basis functions are used, this can lead to spurious low energy excitations being present. In this work, in which we only consider excitations at the K-edge, we explore a different screening strategy where primitive two-electron integrals are only evaluated if they contain the s basis function describing the 1s orbital of the atoms from which core excitations are being computed. Integrals involving the exchange-correlation functional are evaluated numerically using quadrature schemes. We consider two simplifications for this term. The first is the total neglect of the DFT exchange-correlation to the matrix elements. The second is the use of a very coarse quadrature grid of these integrals. The standard quadrature grid used in Q-Chem is a pruned 50-point Euler−Maclaurin radial grid with a 194-point Lebedev angular grid, denoted (50,194), 51 and this grid is reduced to a (10,18) grid. We emphasize that these changes are introduced in the TDDFT part of the calculation only, and it is important that the Kohn−Sham DFT calculation proceeds as normal.
Another limiting factor for the calculation of a large number of roots is the memory required. The Davidson diagonalization procedure is an iterative subspace approach where for a given root k the eigenvectors of interest are expanded in an orthonormal vector space 52 where x k is the approximation to X k for the current iteration. A subspace matrix is diagonalized As the number of expansion vectors b i is increased the eigenvalues and eigenvectors of the subspace matrix will approach the exact (within the model) eigenvalues and eigenvectors. On each iteration if the root is not converged the subspace is expanded by adding a correction vector δ k (11) and r k is the residual vector, p k is the current eigenvalue, and D is the diagonal part of A. The correction vector δ k is orthogonalized with respect to the existing expansion vectors and added to the subspace. In this procedure it is necessary to store all subspace vectors b i , this corresponds to L vectors of length n occ × n virt , where n occ is the number of occupied orbitals and n virt is the number of virtual orbitals. The space required to store these vectors can become considerable when a large number of roots are required for a relatively large system. For core-excitations the initial subspace vectors are created from excitations arising from the specified subset of core orbitals. Transitions outside of the subspace defined by excitations from the core orbitals are removed from the correction vector by projection with a vector that describes the core-excitation subspace (v ss )

Journal of Chemical Theory and Computation
This approach is efficient in the respect that the core-excited states will be the lowest energy states obtained; however, it does not reduce the size of the correction vectors stored and it requires diagonalization of a subspace matrix that is unnecessarily large. In the implementation used here the subspace matrix is formed within the truncated excitation subspace and the space required to store the correction vectors becomes L × n core × n virt , where n core is the number of core orbitals, and the size of the subspace matrix is correspondingly reduced. A further reduction in the size of the subspace vectors can be achieved through truncation of the virtual orbital subspace to include only the lowest energy n virt sub orbitals. The molecules used to assess the accuracy of the fast TDDFT method are shown in Figure 1, with the structures optimized at the B3LYP/6-31G* level of theory. NEXAFS spectra have been computed within a restricted DFT formalism with the hybrid B3LYP 53,54 exchange-correlation functional and short-range corrected (SRC) functionals. There are two variations of SRC functionals (SRC1 and SRC2), 38 and in this work the SRC2 functional was used, unless stated otherwise. For the K-edge of second row elements and transition metals, relativistic effects are incorporated into the calculations by modifying the diagonal elements of A such that 46 (13) where Δ R is the magnitude of the energy change in the core orbital energy due to relativistic effects. This shift can be estimated from the difference in the 1s orbital energy between relativistic and nonrelativistic HF/cc-pCVTZ for the relevant atom with the relativistic effects modeled using the Douglas− Kroll−Hess Hamiltonian. This gives Δ R values for chromium and iron of 38.41 and 51.76 eV, respectively. Previous work 55 has estimated a value of Δ R for sulfur of 7.4 eV from the shift in the 1s ionization potential, and we use this value here. Computational timings correspond to running in serial on an Intel Xeon 2.67 GHz processor. NEXAFS spectra were generated by convoluting the computed transitions energies and oscillator strengths with Gaussian functions with a full-width at half-maximum (fwhm) of 0.35 eV. The calculations were performed with a development version of the Q-Chem program. 56 ■ RESULTS AND DISCUSSION Tables 1 and 2 show the average and maximum error introduced by the different TDDFT models relative to an unmodified TDDFT calculation for the calculation of the lowest eight states of benzene (carbon K-edge), ethanethiol (sulfur K-edge), histidine (nitrogen K-edge), nitrophenylanaline (oxygen K-edge), and Cr(CO) 6 (chromium K-edge). Also shown is the percentage error in the computed oscillator strength for the most intense transition and the percentage computational time compared to the standard TDDFT calculation. Six different models are considered, the first (denoted JKs) introduces the screening to the two-electron integrals to only include integrals that involve the s basis function of the absorbing atom(s), the second (denoted τ(10 −3 )) used standard integral screening with a threshold parameter of τ = 10 −3 a.u., the third (denoted noXC) neglects the DFT exchange-correlation term and the fourth (denoted (10,18)) uses a reduced quality numerical integration grid for the integrals involving the exchange and correlation functionals. The final two models combine the JKs two-electron integral screening with the reduced grid and the τ(10 −3 ) screening with the reduced grid. Table 1 gives data for the B3LYP exchangecorrelation functional and Table 2 gives data for the SRC2 functional with the 6-31G* basis set used in both sets of calculations.
The JKs integral screening introduces only a small error into the computed excitation energies, with the largest error observed being less than 0.1 eV for both B3LYP and SRC2 functionals. This magnitude of error is not very significant,

Journal of Chemical Theory and Computation
Article particularly in the context of other errors inherent in the calculations. Similarly there is essentially no effect on the computed oscillator strengths. However, the reduction in the times for the TDDFT calculations is significant, with the time reduced by over a factor of 2 for the larger molecules with the SRC2 functional. This approximation is particularly effective for Cr(CO) 6 which has the most compact core orbital. The small errors are reduced even more for the τ = 10 −3 a.u. calculation, although the computational savings are not as large. Neglecting the DFT exchange-correlation term does result in relatively large errors in both the transition energies and the oscillator strength. In particular, some large errors are observed for the calculated oscillator strengths which would lead to a noticeably different spectral profile. Consequently, the total neglect of the DFT exchange-correlation term is not a good approximation. A compromise is to treat this term with a more approximate integration grid. The results show that the use of a coarse integration grid retains the majority of the computational savings while not introducing large errors. Combining the integral screening and coarse integration grid gives very large computational savings. For the JKs screening with (10,18) grid there is a large reduction in computational time with little error introduced into the calculated values. In particular for the B3LYP functional, the time is reduced by over a factor of 10 for these moderately sized systems. While the computational savings for the SRC2 functional are less for some of the molecules, they remain substantial. The approximations are particularly effective for the transition metal complex Cr(CO) 6

Journal of Chemical Theory and Computation
Article where excitations from a single compact s orbital are considered. For the SRC2 functional, with default parameters the TDDFT calculation for Cr(CO) 6 takes 874 s, and this is reduced to 34 s for the JKs+(10,18) calculation. We denote this method fTDDFTs for brevity. The reduced integration grid in combination with the τ = 10 −3 a.u. threshold is more expensive (76 s for Cr(CO) 6 ), but the small errors introduced with fTDDFTs are largely removed. Table 3 shows the effect of restricting the virtual orbital space within the fTDDFTs model. Two levels of truncation are considered; v(50%) includes half of the virtual orbitals and v(25%) includes one-quarter of the virtual orbitals. In both cases the orbitals of lowest energy are included. Reducing the virtual orbital space by 50% does not add significantly to the error. The MAD is largely unchanged, but there is some small increase in the maximum error observed. However, decreasing the size of the virtual space further does lead to notable additional errors in both the excitation energies and computed oscillator strength of the most intense transition. The purpose of the truncation of the virtual space is to reduce the memory required and does not significantly affect the calculation time, and the reduction in time for these calculations is comparable to the calculations with the full virtual space. The molecules considered here are relatively small and reducing the size of the virtual space to 25% leaves relatively few virtual orbitals. For larger systems, many low-lying virtual orbitals will remain in the virtual orbital subspace when reducing the size of the virtual space by a large fraction. As will be shown later, for such systems it is possible to reduce the size of the virtual orbital subspace significantly without introducing a noticeable error. Consequently, truncation of the virtual orbital subspace is only recommended for large systems where there is a very large number of virtual orbitals, and following truncation of the orbital subspace a large number of virtual orbitals (>50) are still retained in the calculation. For small systems, truncation of the virtual orbital subspace is usually not necessary since the memory required for these molecules is low. Table 4 shows the effect of the approximations when applied in conjunction with the large 6-311++G** basis set that includes diffuse basis functions. For the 6-311++G** basis set the inner two s basis functions are included within the JKs integral screening. This basis set is not available for chromium, and the 6-31+G* is used for this atom. The results of the larger basis set calculations are consistent with the results for the smaller 6-31G* basis set. The reduction in computational time remains high and the errors introduced for the transition energies are negligible. With the Dunning basis sets aug-cc-pVDZ and aug-cc-pVTZ the cost of the calculation is reduced to 17% and 7%, respectively, with the JKs+(10,18) scheme for ethanethiol. For the larger basis set, the cost of the JKs screening is noticeably lower than the τ(10 −3 ) screening. An ethanethiol truncation of the virtual orbital subspace by 50% leads to a significant error in the oscillator strength. In general there is a greater sensitivity to the truncation of the virtual orbital subspace when diffuse basis functions are present in the basis set. However, this error reduces as the system size increases and illustrates that it is important that the size of the virtual orbital subspace remains large. Figure 2 shows computed spectra for some larger systems. The calculations included 200, 50, and 60 states for coronene, a

Journal of Chemical Theory and Computation
Article merocyanine dye and ferrocene, respectively. The computed spectra with fTDDFTs are indistinguishable from those with default settings for both functionals. However, the reduction in the time for the calculations is substantial, and these are summarized in Table 5. For coronene the time for the fTDDFTs calculations are about 15% of the time for the unmodified calculation, and for the other two molecules the time is reduced to about 4%. The relative reduction in time is not as great for coronene since excitations from all carbon 1s orbitals are included which makes the JKs integral screening less efficient compared with the merocyanine dye and ferrocene in which excitations from only one core orbital is required. It is also of interest to compare the spectra predicted by B3LYP and the SRC functionals. It is well-known that the B3LYP functional underestimates the excitation energy, and it is common to shift the computed spectra to align with experiment. The spectra computed with the SRC functionals are in good agreement with the experimental band positions without the need for a significant shift in energy. The spectrum for the merocyanine dye is about 0.3 eV too high, while the spectrum for coronene is over 1 eV too high. However, previous work, 48 has shown that the spectrum for coronene becomes closer to experiment if a larger basis set is used. The calculations presented here show that the predicted band profiles from the SRC functionals are in better agreement with experiment, particularly for coronene where B3LYP does not predict a split peak for the most intense band.
To demonstrate the application of fTDDFTs to large systems the carbon K-edge NEXAFS spectra for chlorophyll a, C 60 , and C 70 have been studied. These calculations are not feasible with the computational resources available to us without exploiting the efficiencies introduced in this work. In particular the reduction in the memory required is critical for these calculations. The spectra are shown in Figure 3 and have been computed with the SRC2 functional and 6-31G* basis set, and the spectra presented included 600 roots. In all three calculations the virtual orbital space is truncated to include only the 100 orbitals with lowest energy. This represents a reduction in the size of the virtual orbital space to 13−15% of the full virtual space. The carbon NEXAFS spectrum of chlorophyll a has been measured by Legall et al. 60 and the NEXAFS spectra of C 60 and C 70 have been measured and calculated using a Kohn−Sham DFT-based approach reported by Nyberg and coworkers. 61−63 Overall there is good agreement between experiment and theory with the band profiles closely matching experiment, although the computed spectra have a small shift to higher energy. Chlorophyll a has two peaks evident before the steep rise of the absorption edge. The calculations predict these to lie at 285.2 and 286.5 eV compared with the measured values of 284.5 and 285.4 eV. For the two fullerenes, the relative position and intensity of the bands is described accurately, and the calculations capture the subtle differences between the two spectra observed in experiment. For the fullerene systems, the higher energy bands are not reproduced by the calculations since more roots need to be included and also a better quality basis set is required to describe these states well.

■ CONCLUSION
This work has shown how the computational cost of TDDFT calculations within the Tamm−Dancoff approximation of XAS at the K-edge can be significantly reduced while introducing a negligible additional error to the calculations. This is achieved by introducing a severe integral screening procedure that includes only integrals that involve the core s basis function of the absorbing atom(s) coupled with a reduced quality numerical quadrature for integrals associated with the DFT exchange and correlation functionals. The core s basis function integral screening procedure leads to a greater reduction in computational cost compared to the standard integral screening with a large threshold, although the errors in transition error observed are slightly greater. The memory required for the calculations in the Davidson iterative scheme is reduced through construction of the TDDFT matrix within the core orbitals excitation space and further truncation of the virtual orbital space. The resulting method is denoted fTDDFTs and results in a 20-fold speed up in calculations of moderately sized molecules and makes calculations of large systems such as C 70 at the carbon K-edge tractable. In this work only K-edge spectra have been considered. The less localized nature of 2p orbitals will mean that the approximations made are likely to be less applicable to L-edge spectra and would need to be reassessed within the context of L-edge excitations.

Notes
The author declares no competing financial interest.