Density Functional Theory under the Bubbles and Cube Numerical Framework

Density functional theory within the Kohn–Sham density functional theory (KS-DFT) ansatz has been implemented into our bubbles and cube real-space molecular electronic structure framework, where functions containing steep cusps in the vicinity of the nuclei are expanded in atom-centered one-dimensional (1D) numerical grids multiplied with spherical harmonics (bubbles). The remainder, i.e., the cube, which is the cusp-free and smooth difference between the atomic one-center contributions and the exact molecular function, is represented on a three-dimensional (3D) equidistant grid by using a tractable number of grid points. The implementation of the methods is demonstrated by performing 3D numerical KS-DFT calculations on light atoms and small molecules. The accuracy is assessed by comparing the obtained energies with the best available reference energies.


INTRODUCTION
Kohn−Sham density functional theory (KS-DFT) is nowadays among the most popular quantum mechanical methods employed in molecular and material science simulations. 1 The exchange-correlation functional is the key component of the DFT theory and searching for approximate and more accurate functional forms has been going on since the invention of KS-DFT. 2 There are several aspects that need to be considered when implementing exchange-correlation (XC) functionals for efficient and accurate calculations. Many different basis-set options are employed, while Gaussian-type 3 basis functions dominate the quantum chemistry field. These global analytic atomic-centered basis functions capture the main features of the orbitals near the nuclei. However, high accuracy is hard to achieve by systematically increasing the basis set size without running into linear dependence problems. 4−7 Despite using global basis sets, the XC energy contribution cannot be obtained analytically but has to be computed numerically, 8−15 implying that the quadrature must be fast and accurate.
The bubbles and cube numerical framework is a finiteelement approach developed in our group. 16,17 The functions containing the steep nuclear cusps are expanded using local functions on atom-centered one-dimensional (1D) numerical grids multiplied with spherical harmonics (bubbles). The remainder, which is the cusp-free and smooth difference between the main atomic contributions and the exact molecular function, is represented by a tractable amount of equidistant grid points (cube).
Grid-based fast multiple methods (GBFMM) can be used for speeding up the computations, and they allow an efficient and even distribution of the memory and computational load between the computational nodes. 18,19 General-purpose graphics processing unit (GPGPU), message-passing interface (MPI) and open multiprocessing (OpenMP) parallelizations are utilized in our software in order to effectively use the available computational resources. 20 Tensor decomposition approaches are used for reducing memory requirements. 21,22 The aim of this work is to implement the KS-DFT method into the above framework. The implementation of the XC functional in the double basis (bubbles and cube) involves novel challenges because of the nonlinear nature of the XC term. As most of the needed components have already been implemented, we concentrate here on the missing part, namely the algorithms and the detailed implementation employed in the calculation of the exchange-correlation potential and the corresponding energy.
In the following section, we briefly present the basic KS-DFT theory. After that, we give the mathematical details of representing functions using the double bubbles and cube basis. We discuss the SCF-algorithm at the KS-DFT level in this framework and calculation of the exchange-correlation terms. We apply the methods on a few molecules and present energies obtained in KS-DFT calculations using the LDA and GGA functionals. Finally, we compare our results with accurate energies obtained using multiwavelet calculations. 5 , and E XC [ρ] are the kinetic energy, external potential energy, the energy of the Coulomb repulsion between electrons, and the exchange-correlation energy, respectively. The energy contributions are functions of the total electron density ρ, which is computed using the occupied molecular orbitals, ψ i , as where N occ is the number of occupied molecular orbitals. E XC has an integral form: where ε(r⃗ ) is the exchange-correlation energy per particle. The orbital energies, ϵ i , can be obtained from the Kohn− Sham eigenvalue equation where the Kohn−Sham potential consists of the external potential, v ext , the Coulomb potential, v J , and the exchange- At the local density approximation (LDA), the exchangecorrelation functional and the corresponding potential depend only on the electron density v XC ε ρ = ∂ ∂ (6) Density functionals at the generalized gradient approximation (GGA) also depend on the gradient of the electron density ∇ρ. At the restricted GGA level, the v XC can be written as which can also be expressed using the second derivative

BUBBLES AND CUBE
In the numerical bubbles and cube framework, the threedimensional (3D) functions, f(r⃗ ), are represented as a linear combination of the cube, f Δ (r⃗ ), expanded on a 3D equidistant grid and the bubbles, which are radial functions f Alm (r A ) expanded on a one-dimensional (1D) radial grid multiplied with real spherical harmonic functions, Y lm (θ A ,ϕ A ) centered on nucleus A. The radial grids of the bubbles are denser closer to the atom centers. The 1D radial grids of the bubbles and the grid of the cube overlap and all the grid points of the cube are independent of the grid points of the bubbles and vice versa. 16,17 A general scalar function f(r⃗ ) is expressed in the double basis of bubbles and cube as The bubbles functions take care of the cusp behavior of functions near the nuclei implying that also core electrons can be considered in the calculations. Thus, there is no need to use soft pseudo potentials or to refined the grid for representing the core electrons. 4,24−40 The 1D radial functions represent each function accurately in the vicinity of the nuclei with only a small fraction of the total computational costs. The remaining cube part is smooth and a good accuracy is obtained with a relatively coarse 3D grid.
We divide the range of each Cartesian direction of the cube and the radial range of the bubbles into a number of cells with 7 grid points. The first and the last grid point of each cell are shared with the neighboring cell. Each grid point in a cell corresponds to a Lagrange interpolation polynomial (LIP) and all the LIPs of a given cell are used when interpolating or calculating derivatives at any point of interest between grid points. Having 7 grid points in a cell implies that when differentiating the functions, the maximum error is in the order of O(h 6 ), where h is the spacing between two grid points. In our framework and in this article, the entities consisting of the cube and the bubbles are called Function3D below.

KS-DFT SCF USING BUBBLES AND CUBE
The simplest possible self-consistent field (SCF) orbital optimization scheme in the numerical framework is an iterative process with only a few steps namely: calculation of the orbital energies ϵ i and the total energy, E; calculation of the Kohn− Sham potential, v KS ; updating and orthonormalizing the orbitals. The main advantage of the optimization process of numerical KS-DFT SCF is that the algorithm scales linearly as only one bracket element per orbital is needed in order to calculate the orbital energies.
In the present approach, the orbitals are updated by integrating the Helmholtz kernel for each orbital U e r r 4 ; with 2 Integration of the six-dimensional Helmholtz kernel has been discussed in detail in our previous publications. 19,41 At the end of each Helmholtz integration step the orbitals are reorthogonalized using the Gram−Schmidt method. Calculation of electron−electron interaction potentials 11) and the corresponding energies have been discussed in our earlier work. 18−20 Calculation of the kinetic energy of the orbitals has also been discussed before, 17 We have previously developed computational algorithms needed for computing all terms that are needed for Hartree− Fock calculations. For KS-DFT calculations, we need accurate and efficient algorithms for calculating the exchangecorrelation potential, v XC , and the corresponding energy, E XC ,

Journal of Chemical Theory and Computation
Article respectively. Our approach to calculate these entities will be discussed in the next section.

EXCHANGE-CORRELATION TERMS WITH BUBBLES AND CUBE
Calculating the exchange-correlation energy and the exchangecorrelation potential for a known electron density is straightforward and can be done with existing libraries such as LibXC. 42 However, in our framework the density is divided into several parts and the exchange-correlation terms are not linearly dependent on the density, implying that they cannot be calculated separately. Thus, some additional steps are required before and after the library call, as we want to partition the resulting function and some intermediate ones into bubbles and cubes. The calculation of the input density and the gradient also requires some special attention.
We discuss the reconstruction of the output bubbles in section 5.1 and the calculation of density and its gradient in section 5.2. In section 5.3, we conclude this part by presenting the algorithms we use to compute the exchange-correlation energy and the exchange-correlation potentials for the functionals of the local density approximation (LDA) and the generalized gradient approximation (GGA).

Reconstruction of Bubbles.
When a function f(r⃗ ) can be calculated in any point in space, a straightforward method to construct the bubbles is by direct projection. Following Becke's fuzzy cells concept, 8 a function can be partitioned into atomic cell contributions The radial part of the bubbles, f A (r⃗ ), is obtained by projecting the angular part of f(r⃗ ) using spherical harmonics

Journal of Chemical Theory and Computation
where ω is the solid angle, w A is a Becke partition function, and w′ i is the integration weight of the ith point. The numerical integration on a sphere uses Lebedev grids. f Δ (r⃗ ) is obtained by subtracting the bubbles from f(r⃗ ). In the projection, the 3D information is compressed into 1D radial grid functions multiplied with the corresponding spherical harmonics. The efficiency of the compression can be evaluated by calculating the norm of the cube, ∫ (f Δ (r⃗ )) 2 dr⃗ . 5.2. Calculation of Electron Density and Its Gradient. In our DFT implementation, we use two different ways to calculate the electron density at the points of interest, p. The advantages and disadvantages of the two approaches are discussed in this section.
In the first approach, the electron density is calculated by adding the squared occupied orbitals in the Function3Ds and storing it in another Function3D object. The electron density at the points of interest can be obtained by computing the resulting Function3D electron density using eq 9. The three gradient components of the electron density of the cube are evaluated and stored as the cubes of three independent Function3D objects, whereafter the gradient of the bubbles contribution to the electron density is calculated and stored as bubbles to the Function3D objects using the relations derived in appendix A. After the gradient Function3Ds are formed, the exact values of gradient at points of interest are calculated using the same procedure as used for the density. This algorithm is described as a pseudocode in Figure 1.
The second way to get the electron density and its gradient is to perform the steps of the first algorithm in reversed order.
The orbital values are calculated in the desired points, p, which are squared and added with the contributions from the other orbitals to form the electron density. The gradient is obtained via the differentiation product rule using orbital gradient values and the orbital values in the points of interest as described in Figure 2.
The differences between the two algorithms are in speed and accuracy. The first algorithm is faster, as there are fewer pointwise calculations of Function3Ds. However, the latter algorithm increases the precision of the electron density and especially of the gradient, because it has fewer multiplications between Function3D objects prior to the calculation of the gradient. The smaller part of the differentiated functions in Function3Ds are stored in cubes prior to the calculation of the gradient.
In both algorithms, the gradients of the functions are calculated by first determining the gradients as Function3D objects instead of calculating the gradients directly from the electron density or the orbitals, rendering the resulting gradients slightly less accurate. However, the values near the nuclei are more stable, resulting in a higher precision for the optimized electron density.

The Exchange-Correlation Algorithm.
We have now all algorithms needed for calculation of the exchangecorrelation potential (v XC ) and energy (E XC ) within the bubbles and cube framework. The main algorithm for obtaining v XC and E XC at the restricted LDA and GGA level is given as pseudocode in Figure 3. We discuss the GGA algorithm, whereas the LDA algorithm can be obtained by omitting the steps involving the gradient of the electron density in the GGA algorithm.

Journal of Chemical Theory and Computation
Article representation for each term. The individual steps are described as pseudo code in the Figures 4 and 5.
Calculations of the three exchange-correlation terms begin by computing the electron density, ρ p , and its gradient, ∇ρ p , at the points of interest. In the bubbles case, ρ p and ∇ρ p are calculated in the Lebedev integration points for distances from the nuclei corresponding to 1D-grid points. For the cube, ρ p and ∇ρ p are calculated for all grid points of the cube. The second common step is the call to the LibXC library, 42 which provides the values for the derivatives of the energy, , and the energy per particle, ε p , for the chosen points. ε p is then multiplied with the electron density, ρ p , and p 2 ε ρ ∂ ∂ | ∇ | is multiplied with ∇ρ p followed by an reordering of the data to cube or bubbles format. For the cube, the data are reshaped into a 3D grid of the correct dimension size, whereas for the bubbles one has to employ the procedure described in section 5.1.
The bubbles contributions of the three terms are extracted from the cubes, leaving only those parts in the cube that cannot be accurately expanded in bubbles. The exchangecorrelation potential, v XC , is obtained by calculating the divergence of the Function3D containing

RESULTS
The accuracy of our approach is demonstrated by calculating total energies for a few atoms and molecules at the LDA level 43,44 using the VWN5 correlation functional 45 as well as at the GGA level using the PBE exchange-correlation functional. 46,47 The total energies and the relative errors are compared in Tables 1, 2, and 3 with values calculated using the multiwavelet approach. 5,23 For atoms, a cube grid step, h, of 0.05 was used. For molecules, h values of 0.05, 0.07, and 0.10 were used. The maximum angular momentum quantum number of the bubbles, l MAX , was 3, 4, or 5 depending on the size of the molecule and the employed exchangecorrelation functional. The 1D radial grid of the bubbles had 9000−14000 grid points within a radius of 14 bohr from the atom centers. The number of grid points of the bubbles depends on the charge of the corresponding nucleus. In the PBE calculations, the dimension of the cube grid was chosen such that the distance between the outermost nucleus and the edge of the cube was at least 9−10 bohr in each direction. In the LDA calculations, the distance between the outermost nucleus and the cube edge was 7 bohr.
In the LDA calculations, we used the simple method described in Figure 1 to evaluate the electron density at the points of interest, whereas in the PBE calculations, we utilized the more complex method described in Figure 2. The straightforward approach can without doubt be used in the LDA-calculations, because the two approaches yield practically the same accuracy. However, the accuracy of the electrondensity gradient depends on the employed approach. The accuracy is increased by an order of magnitude when using the more complex approach in PBE calculations.
The complex approach is computationally more expensive, because it involves an additional computational step that scales linearly with respect to the number of occupied molecular orbitals. However, the complex approach is a more effective way to increase the accuracy of the calculation than the alternative of decreasing the grid step of the cube. The complex approach can be turned on at a later stage of a GGA calculation when aiming at a higher accuracy.
The results of the atomic calculations are summarized in Table 1. The calculations demonstrate that a high accuracy can be obtained using the two exchange-correlation functionals. The obtained LDA energies for the studied molecules given in Table 2 are also accurate. The total energies are within the reported error bars of 3 × 10 −6 and 3 × 10 −5 au of the reference values for C 2 H 4 and N 2 , respectively. 23 For the coarser grids, our calculations have errors in the total energy that are still smaller than threshold for chemical accuracy of 10 −4 au.
The PBE results for molecules are given in Table 3 where one sees that accuracy of our calculations does not reach the level of the reference values with any of the used grid steps. However, the errors are much smaller than the chemical accuracy of 10 −4 au when using the smaller grid steps of 0.05 and 0.07. When the sparse grid with spacings of 0.10 au is employed, the error is smaller than 10 −4 au for all studied molecules except CH 4 , whose energy has an error of 1.9 × 10 −4 au.
Some of the energies in Tables 1, 2, and 3 lie below the reference value showing that the numerical calculations are not completely variational. The nonvariational behavior is caused by accumulations of numerical errors without any predefined sign. In this regard, numerical methods differ from analytical approaches, where the deviations from the exact values are most likely due to limitations in the employed basis set, whereas in that case numerical rounding errors are much smaller.

Article
The reason for the somewhat larger inaccuracy of the GGA approach is the uncertainties introduced when calculating the gradient. Especially, the gradient of cube is very prone to errors as even the densest grid is still relatively sparse from the point of view of numerical differentiation. The effect of the numerical differentiation error can be reduced by increasing the l MAX , i.e., the length of the spherical harmonics expansion of the bubbles. However, the computational cost of the algorithm also increases when larger l MAX values are used.
Reducing h even further would increase the accuracy. However, the use of denser grids is not feasible for the single node implementation of the software used here due to memory limitations of the available hardware. Thus, testing the accuracy limits of our DFT implementation has to be left to the future, when we have implemented a multinode version of the approach.

CONCLUSIONS
We have developed and implemented a fully numerical method for molecular electronic structure calculations at the density functional theory (DFT) level using exchange-correlation functionals at the local density approximation (LDA) and the generalized gradient approximation (GGA). The numerical methods can be extended to hybrid functionals that mix DFT and Hartree−Fock (HF) exchange potentials, because methods for GGA and HF calculations have been developed, whereas implementation of range-separated functionals requires development of new algorithms.
The numerical algorithms used for calculating the exchangecorrelation potential and energy contributions are presented in detail. In our bubbles and cube approach, we expand the steep part of the functions in one-center atom-like contributions and the remainder is expanded on an equidistant grid. We also discussed a simple way to calculate gradient for the bubbles part and real spherical harmonics expansions in general. The accuracy of our approach was demonstrated by performing calculations on atoms and small molecules at the LDA level and at the GGA level using the PBE functional. The obtained energies were compared with values calculated using multiwavelet codes. The accuracy can be improved by using larger grid sizes, which require more memory and increase the computational effort. The computational time and memory requirement increase linearly with the number of grid points.
The LDA and PBE calculations demonstrate that chemical accuracy can be achieved using our approach. Excellent accuracy is easily obtained at the LDA level even when relatively sparse grids are employed. The calculations showed that it is hard to reach a very high accuracy at the GGA level due to the uncertainies introduced by numerical differentiation of the electron density. A few extra computational steps had to be introduced in order to improve the accuracy of gradient of the electron density for obtaining sufficiently precise energies.
Presently, we use a single-node implementation without utilizing the parallelization of the recently developed gridbased fast multipole method (GBFMM), 19,41 which limits the size of the grids that are currently feasible. Consequently, the accuracy of the PBE calculations leaves room for improvement. Although, multinode parallelization of the calculation of the DFT contributions to energy and the exchange-correlation potential should be a relatively simple to implement, an efficient implementation needs some considerations. The GPGPU parallelization of the DFT part will therefore be implemented later.