Do Double-Hybrid Functionals Benefit from Regularization in the PT2 Term? Observations from an Extensive Benchmark

We put to the test a recent suggestion [Shee, J., et al. J. Phys. Chem. Lett.2021, 12 (50), 12084–12097] that MP2 regularization might improve the performance of double-hybrid density functionals. Using the very large and chemically diverse GMTKN55 benchmark, we find that κ-regularization is indeed beneficial at lower percentages of Hartree–Fock exchange, especially if spin-component scaling is not applied [such as in B2GP-PLYP or ωB97M(2)]. This benefit dwindles for DSD and DOD functionals and vanishes entirely in the ∼70% HF exchange region optimal for them.

D ouble-hybrid density functional (DHDF) theory (for reviews, see refs 1−4) represents a special case of fifthrung functionals on "Jacob's Ladder" 5 (the fifth rung is where dependence on unoccupied orbitals enters). As such, DHDF theory resides on the seamline between density functional theory (DFT) and wave function approaches. DHDFs are the most accurate DFT methods available to date for main group energetics, 6−8 transition metal catalysis, 9−13 electronic excitation spectroscopy, 14−22 external magnetic 23−26 and electric field 27,28 induced properties, and vibrational frequencies. 29 Its reliance on a second-order perturbation theory term for part of the correlation energy creates an Achilles' heel for molecules with small band gaps, due to the presence of orbital energy differences in the denominator. Using the spin−orbital notation, the MP2-like Gorling−Levy 30 nonlocal correlation term (GLPT2) has the following form: where the indices i and j refer to occupied orbitals and a and b to virtual orbitals, while the energy denominator Δ ij ab = ε a + ε b − ε i − ε j .
One remedy that has been proposed in the past, in the context of single-reference or multireference MP2, has been DCPT (degeneracy-corrected perturbation theory 31 ). Another is regularization of the expression presented above to remove the singularity for Δ ij ab → 0. Stuck and Head-Gordon 32 proposed a simple level-shift regularizer (δ) in the context of ROOMP2 33−35 (restricted orbital-optimized second-order perturbation theory). They found that this works well for single-bond breaking, but the required level shifts for multiple bonds were so large that they disrupted thermochemistry results. 36 Lee and Head-Gordon 37 have recently proposed two energy gap-dependent regularizers: σ and κ. Although these forms were developed initially with OOMP2 in mind, Head-Gordon and co-workers 38,39 have shown that even without orbital optimization, σand κ-MP2 can achieve better accuracy than ordinary (unregularized) MP2 for both main group and transition metal thermochemistry, barrier heights, and noncovalent interactions. They also found that both sets of regularizers performed comparably; hence, we will focus here on only κ-regularized MP2 correlation, which has the following expression: where κ is a fixed regularization parameter. In the large-κ or large-Δ ij ab limits, the regularization factor approaches unity, while in the small-κ or small-Δ ij ab limits, the corresponding term in the energy summation approaches zero, i.e., the Hartree− Fock energy is recovered.
Reference 37 states that "We are optimistic that this study could pave the way for future development of double-hybrid density functionals based on nonlocal correlation expressions that are more appropriate than conventional MP2 for large dispersion-bound systems and organometallic bonding, yet still free of self-correlation errors. κMP2, σ 2 MP2, and σMP2 are promising candidates in this regard." Such DHDFs as our own minimally empirical dispersioncorrected spin-component scaled families, e.g., revDSD, 37 revDOD, 37 and xDSD, 40 Grimme's PWPB95, 41 or the more heavily parametrized ωB97M(2) 42 range-separated DHDF reach accuracies approaching those of composite wave function approaches (see ref 43 for a head-to-head comparison). Yet there might still be room for further improvement, particularly in terms of resilience for systems with small band gaps, and hence significant type A static correlation (also known as absolute near-degeneracy correlation 44 ).
In this letter, we will attempt to confirm or refute the conjecture presented above from ref 37, that is, to determine whether using spin-component-scaled κ-GLPT2 instead of unregularized (i.e., conventional) PT2 correlation can further improve the performance of DSD functionals. For the κregularized DSD functionals, the final energy has the following expression: = + + − + + + + and c C,DFT the coefficient for that energy part. E 2ab,κ-PT2 and E 2ss,κ-PT2 are the opposite-spin and same-spin κ-GLPT2 correlation energies, respectively, and their respective linear coefficients are c 2ab and c 2ss . Finally, E disp is a dispersion correction such as D3(BJ) 45−47 with any associated adjustable parameters: in the work presented here (as in ref 48), the nonlinear damping-function shape parameters are fixed at a 1 = 0 and a 2 = 5.5, and the higher-order coefficient is fixed at s 8 = 0 (as we have found 48 to be appropriate for double hybrids).
We have used the GMTKN55 benchmark suite 7 (general main group thermochemistry, kinetics, and noncovalent interactions) throughout. It comprises 55 types of chemical problems, which can be further divided into five major  where |Δ | E i is the mean absolute value of all of the reference energies from i = 1 to 55, N i is the number of systems in each subset, and MAD i is the mean absolute difference between calculated and reference energies for each of the 55 subsets. For the details of all 55 subsets with proper references, see Table 1 All electronic structure calculations were performed using the QCHEM 5.4 49 package on the ChemFarm HPC cluster in the Faculty of Chemistry at the Weizmann Institute of Science. The Weigend−Ahlrichs def2-QZVPP 50 basis set was used for all subsets except seven, for which we used the diffuse-function augmented variant def2-QZVPPD 51 instead: the rare gas clusters RG18 and the six anion-containing subsets WATER27, IL16, G21EA, BH76, BH76RC, and AHB21. For the C60ISO and UPU23 subsets (which have comparatively small weights in WTMAD2), we settled for the def2-TZVPP basis set to reduce the computational cost. For the remaining 46 subsets, the def2-QZVPP 50 basis set was used. The corresponding standard RI-MP2 52 and Coulomb exchange fitting (RI-JK) 53 basis sets were employed throughout to reduce the computational cost further. The large pruned integration grid SG-3 was used across the board. 54 Generally, inner-shell orbitals were frozen, but in subsets where such orbitals come close enough to the valence shell to qualify as "honorary valence orbitals", 55 the same frozen-core settings were used as in refs 48, 56, and 57.
Similar to the unregularized (x)DSD functionals, one fully optimized κ-DSD also has six adjustable linear parameters: c X,HF , c C,DFT , c 2ab , c 2ss (=c 2aa + c 2bb ), and for the D3(BJ) dispersion correction one prefactor s 6 and one parameter a 2 for the damping function (like in refs 58 and 59, we constrain a 1 and s 8 to zero). In addition, our new κ(x)DSD functionals also contain the PT2 regularization parameter κ.
Powell's BOBYQA 60 (bound optimization by quadratic approximation) derivative-free constrained optimizer together with in-house written scripts and Fortran programs were used to optimize all parameters.
What if we allow only one degree of freedom between c 2ab and c 2ss but self-consistently reoptimize the remaining three parameters (i.e., c C,DFT , c 2ab , and s 6 )? We considered two possibilities. The first is imposing the condition c 2ab = c 2ss , i.e., simple double hybrids like B2PLYP, 64 B2GPPLYP, 65 PBE0-2, 66 etc., as well as the combinatorially optimized ωB97M(2). 42 The nomenclature, "DSD", is no longer appropriate in this case and will be replaced by κDH-XC-D3BJ. The second option is to exclude same-spin correlation entirely (c 2ss = 0), also known as the DOD forms. These are of interest because they are amenable to reduced-scaling opposite-spin MP2 techniques like Laplace transform MP2 of Haser and Almlof, 67 which scales as Ο(N 4 ) with system size, or the tensor hypercontraction approach of Song and Martinez, 68 which scales as Ο(N 3 ). As expected, constraining c 2ab = c 2ss always increases WTMAD2 over the κDSD forms. For both κDH-PBEPBE-D3BJ and κDH-PBEP86-D3BJ, we found the lowest WTMAD2 at κ min = 2.5, corresponding to decreases of 0.11 and 0.10 kcal/mol, respectively, compared to their unregularized forms. For κDH-BLYP-D3BJ, the WTMAD2 gap between κ min (κ value for which we obtain the minimum WTMAD2) and κ = ∞ is ∼0.08 kcal/mol. Interestingly, for the κDOD functionals, GLPT2 regularization does more harm than good, and the unregularized forms always offer the best performance (see Figure 1).
The use of the more modern D4 69,70 dispersion correction instead of D3BJ for the regularized functionals does not affect any trends with respect to the regularizer (κ). For the PBE-PBE and PBE-P86 exchange-correlation combinations, the The Journal of Physical Chemistry Letters pubs.acs.org/JPCL Letter WTMAD2 gaps between the D3BJ and D4 corrected forms increase gradually with an increase in κ. However, switching from D3BJ to D4 has no significant effect on the performance of the κxDSD 75 -PBEP86-D4 and κDSD-BLYP-D4 functionals (see Table S2). The same story is repeated for the κDOD and κxDOD functionals with D4 dispersion correction (see Table  S3).
Thus far, for the κDSD functionals, we have used the same parameter for exact exchange as for their unregularized forms. 40,48 Our next objective is to check whether the Hartree−Fock exchange prefactors of the unregularized forms are still optimal for the new regularized functionals. To answer this question, we have considered seven c X,HF values ranging from 0.5 to 1.0, and the κDSD X -PBEP86-D3BJ The Journal of Physical Chemistry Letters pubs.acs.org/JPCL Letter functional together with its κDH, κDOD, and dispersion uncorrected variants (X represents the percentage of HF exchange used; only one joint set of electronic structure calculations is required for all variants). It turns out that regularization in κDSD-PBEP86-D3BJ becomes gradually more beneficial as c X,HF is decreased, with κ min decreasing concomitantly. For example, we obtain the lowest WTMAD2 near κ = 1.67 when c X,HF = 0.55, but when c X,HF = 0.5, the κ min decreases to 1.45 (see Figure 2). Now, splitting each WTMAD2 into five major subcategories, we found that regularization does more harm than good across the board for small-molecule thermochemistry. However, the extent of performance deterioration with respect to the κ values becomes less prominent as c X,HF is decreased. Using a small κ value can severely harm the performance for barrier heights at higher percentages of HF exchange, but much less so at lower percentages. When κ = 1.1, the κDSD 50 -PBEP86-D3BJ functional marginally outperforms the unregularized variant. For large-molecule reactions, the κDSD-PBEP86-D3BJ functional is a better choice across the board compared to the revDSD-PBEP86-DBJ functional, and κ min decreases gradually with an increase in c X,HF . The regularized forms with 69% and 66% Hartree−Fock exchange offer the best performance near κ = 3.0 and 2.0, respectively. For intramolecular interactions, the trends are largely the same as what we obtained for the smallmolecule thermochemistry subsets. Finally, for intermolecular interactions, regularized forms always outperform the unregularized alternatives, and κ min decreases with an increase in c X,HF . For this subset, κDSD 69 -PBEP86-D3BJ and κDSD 63 -PBEP86-D3BJ are the two best picks at κ = 1.45 and 1.33, respectively (for optimized parameters, the total WTMAD2 for full GMTKN55, and its decomposition into five major subsets, see Table S4).
The benefit in noncovalent interactions is not as prominent as that found by Shee et al. 39 for pure MP2, which is expected because MP2 correlation in their case has a coefficient of unity while in a DHDF functional the MP2-like correlation (or PT2 correlation) is scaled down by a factor in the range of 0.3−0.5; hence, regularization will impact overall performance less. Additionally, HF exchange and PT2 correlation in a basis of KS orbitals are a different proposition from the same in a basis of HF orbitals.
As we found no material change from D3BJ to D4 in the previous section, we have decided not to explore that avenue for different c X,HF values. Now, what happens if we impose c 2ab = c 2ss , i.e., a simple double hybrid rather than a DSD or DOD form? (As always, parameters are reoptimized self-consistently.) The results can be found in Figure S3 and Table S5. Even when c X,HF = 0.75, we find a shallow WTMAD2 reduction (0.04 kcal/mol) at κ min ; as c X,HF is decreased, this "well" is deepened until it reaches 0.36 kcal/mol at c X,HF = 0.50 (κ min decreases in tandem with c X,HF ). Among the five major subsets, at low c X,HF values, the WTMAD2 component from noncovalent interactions (NCI) decreases as κ min decreases, while this is detrimental to smallmolecule thermochemistry and (at κ min = 1.1−1.45) for barrier heights: the former tendency grows weaker, and the latter stronger, as c X,HF is increased. We note that the ωB97M(2) combinatorially optimized range-separated double hybrid of Mardirossian and Head-Gordon 42 has c 2ab = c 2ss and might hence benefit. (The way spin-component-scaled MP2 behaves differently from standard MP2 has been rationalized to some degree as approximate higher-order effects. 71,72 ) Interestingly, the behavior seen for the κDOD X -PBEP86-D3BJ functionals (i.e., when c 2ss = 0) is fairly similar, with lower percentages of HF exchange significantly benefiting from PT2 regularization. For example, the WTMAD2 "well" for κDOD 50 -PBEP86-D3BJ is 0.39 kcal/mol, but for κDOD 63 -PBEP86-D3BJ, this shrinks to just 0.04 kcal/mol. Except for small-molecule thermochemistry, we found the same trend as κDSD X -PBEP86-D3BJ for the remaining subsets. At small c X,HF values, regularization seems to be slightly beneficial for the small-molecule thermochemistry subsets (see Figure S3 and Table S6).
For the sake of completeness, for dispersion-uncorrected κnoDispSD X -PBEP86 functionals, we likewise found that at higher fractions of HF exchange, κ min approaches infinity (see Figure S3 and Table S7). In this case, obviously the balance is tipped more strongly toward high percentages of PT2 correlation (and of HF exchange, as the two are wellknown 65 to be linearly related) as long-range dispersion is not covered by anything else.
Thus far, we have used the same percentage of HF and semilocal exchange for orbital generation and final energy calculation. What happens when we use a different fraction of HF and DFT exchange for orbitals and final energies, e.g., PBE0P86 (i.e., 0.25HFx + 0.75PBEx + 1.0P86c) orbitals in κDSD 69 -PBEP86-D3BJ? These new DSD functionals are found not to benefit from PT2 regularization (see Table S8 and Figure S4). When WTMAD2 = 2.28 kcal/mol, the unregularized version, revDSD-PBEP86-D3BJ@PBE0P86, slightly outperforms the original revDSD-PBEP86-D3BJ (WTMAD2 = 2.38 kcal/mol): nearly all of that gain comes from RSE43 (radical-stabilization energies), due to substantially reduced spin contamination thanks to the smaller c X,HF in the orbitals. If we constrain c 2ab = c 2ss , the WTMAD2 gap between κ min and κ = ∞ is 0.13 kcal/mol, marginally larger than what we obtained for κDH 69 -PBEP86-D3BJ (0.10 kcal/ mol). κ min also increases from 2.0 (for κDH 69 -PBEP86-D3BJ) to 3.0. For the κDOD variants of these new functionals, we saw the same trend that we did for the κDSD forms.
Summing up an extensive survey of regularized DHDFs using the large and chemically diverse GMTKN55, we can conclude the following.
The benefits of PT2 regularization for intermolecular interactions and large-molecule reactions are negated by the losses for small-molecule thermochemistry, barrier heights, and conformer energies. Hence, κ-GLPT2 correlation causes no significant reduction in WTMAD2 compared to the unregularized revDSD-PBEP86-D3BJ, revDSD-PBE-D3BJ, and xDSD 75 -PBEP86-D3BJ functionals. However, the significantly better performance of κDSD-BLYP-D3BJ when κ = 2.2 for The Journal of Physical Chemistry Letters pubs.acs.org/JPCL Letter large-molecule reactions has enough impact on WTMAD2 that, overall, it marginally outperforms revDSD-BLYP-D3BJ. Replacing D3BJ with D4 dispersion does not affect those trends.
If we eliminate spin-component scaling (i.e., c 2ab = c 2ss ), the WTMAD2 gap between the κ min and κ = ∞ (i.e., unregularized) forms of κ(x)DH-XC-D3BJ is more significant than that we obtained for the κ(x)DSD functionals. In contrast, for the κ(x)DOD forms, unregularized functionals always perform better.
Regularization of the GLPT2 terms in double hybrids is most helpful at lower percentages (e.g., 50%) of HF exchange. At higher percentages of HF exchange, the benefits for intermolecular interactions and large-molecule reactions are outweighed by the deterioration in the remaining three subsets. At lower percentages of HF exchange, the benefits are heightened and the deterioration is mitigated, hence, an overall beneficial effect. In special cases in which DHs with a small fraction of HF exchange might be more resilient (e.g., systems with strong static correlation or prone to severe spin contamination error), κ-regularized double hybrids will offer advantages over their unregularized counterparts.