Surprisingly Good Performance of XYG3 Family Functionals Using Scaled KS-MP3 Correlation

By adding a GLPT3 (third-order G\"orling-Levy perturbation theory, or KS-MP3) term E3 to the XYG7 form for a double hybrid, we are able to bring down WTMAD2 (weighted total mean absolute deviation) for the very large and chemically diverse GMTKN55 benchmark to an unprecedented 1.17 kcal/mol, competitive with much costlier composite wavefunction ab initio approaches. Intriguingly: (a) the introduction of E3 makes an empirical dispersion correction redundant; (b) GGA or mGGA semilocal correlation functionals offer no advantage over LDA in this framework; (c) if a dispersion correction is retained, then simple Slater exchange leads to no significant loss in accuracy. It is possible to create a 6-parameter functional with WTMAD2=1.42 that has no post-LDA DFT components and no dispersion correction in the final energy.

Over the past few decades, Kohn-Sham density functional theory 1 (KS-DFT) has become the goto electronic structure method, especially for medium and large molecular systems, as it combines reasonable accuracy with relatively mild computational cost-scaling with system size. 2 Being at the top of the 'Jacob's Ladder' of density functional approximations, 3 double hybrid density functionals (DHs) (see Refs. [4][5][6][7][8] for reviews) are the most accurate DFT methods available to date for modeling a variety of chemical problems, like main group energetics, [9][10][11] transition metal catalysis, [12][13][14][15] electronic excitation spectroscopy, [16][17][18][19][20][21][22][23][24] external magnetic [25][26][27][28] and electric field 29,30 response properties, and so on. Double hybrids can be further categorized according to their use of lower rung (hybrid) functionals in the orbital generation step as gDHs (after Grimme 31 ) and xDHs (after the XYG3 32 and related functionals 5 ). Xu and co-workers used B3LYP [33][34][35] orbitals while calculating the final energy for the original XYG3 32 functional and its later variants, lrc-XYG3 36 and XYGJ-OS. 37 Along this line, xDSD, [38][39][40] ωDSD, 39 and ωDSD3 41 functionals from Martin group and the combinatorially optimized range separated double hybrid, ωB97M(2) 42 by Mardirossian and Head-Gordon have shown remarkable performance for extensive and chemically diverse main-group benchmark like MGCDB84 9 (Main-Group Chemistry Database with about 5000 test cases divided into 84 subsets) and GMTKN55 10 (General Maingroup Thermochemistry, Kinetics, and Noncovalent interactions, 55 problem subsets). In a recent study, Zhang and Xu 43 explored the limits of the XYG3-type double hybrids utilizing B3LYP reference orbitals to be consistent with the original XYG3 functional. By gradually relaxing the constraints of the XYG3 they were able to improve the performance of their new xDHs and with all the constraints removed, XYG7@B3LYP appeared as the best in class. Their most intriguing finding was that decoupling Slater exchange from the GGA enhancement factor, i.e., giving B88x and pure Slater exchange independent coefficients, markedly improved performance.
In this letter, we consider two avenues for potential further improvement over XYG7@B3LYP. First, whether there might be more suitable orbitals than B3LYP for this type of functionals. Second, in light of our previous findings for the DSD3 and ωDSD3 functionals, 41 whether it would be advantageous to add a scaled nonlocal GLPT3 (third-order Görling-Levy 44 perturbation theory) term on top of the same-spin and opposite-spin GLPT2 (second-order GLPT) terms already present. To answer both questions, we begin by employing reference orbitals where the percentage HF-like exchange (HFx) is increased systematically. Following Ref. 43 , the number m in XYGm refers to the number of terms considered; as we shall add a D3BJ [45][46][47] empirical dispersion correction term, we shall refer to these xDH's as XYG8@BnLYP and XYG9@BnLYP, where n stands for the percentage of HFx used in the orbital generation. The final energy expressions of the XYG8@BnLYP and XYG9@BnLYP functionals have the form:  [45][46][47] or D4 48-50 , each coming with adjustable parameters. In the present work, the dispersion energy term is made proportional to a linear coefficient s6.
In the discussion below, XYG8[f1]@BnLYP will indicate that one parameter was frozen, XYG8[f2]@BnLYP that two parameters were frozen, and so on.
Unlike B3LYP, BnLYP reference orbitals used for our XYG8 and XYG9 functionals do not contain any "unenhanced" Slater exchange. The BnLYP orbitals only differ in the exchange part where n% HFx and (100-n)% DFTx are used while the correlation part is kept unchanged throughout (19% VWN5 + 81% LYP). During the orbital generation and the final energy evaluation step, we make use of the same semilocal GGA-type and LDA-type exchange and correlation functionals.
For the present study, we have used the aforementioned GMTKN55 benchmark suite 10 throughout to validate and parametrize our new functionals. It comprises 55 types of chemical problems, which can be further classified into five major subcategories: thermochemistry of small and medium-sized molecules, barrier heights, large-molecule reactions, intermolecular interactions, and conformer energies (or intramolecular interactions) (for the details of all 55 subsets with proper references see Table S1 in Supporting Information). The WTMAD2 (so-called weighted total mean absolute deviation of type 2) has been used as our primary metric of choice: where |∆ | 666666 % is the mean absolute value of all the reference energies from = 1 to 55, % is the number of systems in each subset, % is the mean absolute difference between calculated and reference energies for each of the 55 subsets.
All electronic structure calculations except the third-order Møller-Plesset correlation (MP3) part were performed using the MRCC2020 51 package. The Weigend−Ahlrichs family def2-QZVPP 52 basis set was used for all subsets except for the anion-containing ones WATER27, RG18, IL16, G21EA, BH76, BH76RC and AHB21 -where the diffuse-function augmented def2-QZVPPD 53 was employed instead -and the C60ISO and UPU23 subsets, where we settled for the def2-TZVPP basis set to reduce computational cost. To reduce the latter further, the corresponding standard RI 54 and RI-JK 55 basis sets were employed for the correlation as well as for the Coulomb (J) and HF exchange (K) parts. The LD0110-LD0590 angular integration grid was used throughout, being a pruned Lebedev-type integration grid similar to Grid=UltraFine in Gaussian 56 or SG-3 in Q-Chem. 57 The same frozen core settings were used as in Refs. 40,41 MP3 calculations in both HF and KS orbitals were performed using Q-Chem 5.4, 57 def2-TZVPP being used throughout -we previously found that this basis set is adequate for post-MP2 corrections, both in composite WFT and in double-hybrid DFT contexts. 41,58 All calculations were performed on the ChemFarm HPC cluster in the Faculty of Chemistry at the Weizmann Institute of Science.
We employed Powell's BOBYQA 61 (Bound Optimization BY Quadratic Approximation) derivative-free constrained optimizer, together with scripts and Fortran programs developed inhouse, for the optimization of all parameters.
The statistical significance of adding or omitting a parameter was decided with the help of the Bayesian Information Criterion. 62,63 For the dataset at hand (see Appendix A in the Supporting Information for details), a change of ~0.6% in WTMAD2 corresponds to "strong" and ~3.5% to "decisive" importance according to the Kass-Raftery criteria. 63 We first explore the performance of the XYG8@BnLYP functionals with n ranging from 10-70%. The minimum WTMAD2 occurs near n = 20%; in fact, the WTMAD2 surface is relatively flat between 20-30%. While optimizing all eight parameters, the coefficient for the semilocal GGAtype correlation (cC,GGA) settles near zero, thus fixing it to zero and reoptimizing the remaining seven parameters led us to WTMAD2=1.85 kcal/mol. That being said, the XYG8[f1]@B25LYP and XYG8[f1]@B30LYP are two very close contenders -which illustrates the flat behavior of the WTMAD2 surface between the 20-30% region (see Figure1).
Next, what if we leave out the dispersion component entirely, thus eliminating one additional parameter? On average, this would sacrifice only ~0.08 kcal/mol accuracy in terms of total WTMAD2, or about 4% of the total, which just exceed the "decisive" BIC criterion. As expected, among the top five subcategories of GMTKN55, the noncovalent interactions, and to some extent, barrier heights are most affected (see Table S2 in the Supporting Information). Although the D3BJ-uncorrected XYG8[f2]@B25LYP functional offers the lowest WTMAD2 (1.92 kcal/mol), the performances of the XYG8[f2] variants utilizing B20LYP and B30LYP orbitals are pretty close. For the XYG8[f1]@BnLYP functionals, the coefficients for the GGA exchange (i.e., cX,GGA) decrease as n increases (see Table 1). So, together with cC,GGA if we also fix cX,GGA at zero and optimize the remaining coefficients, the WTMAD2 difference between the XYG8[f1]@BnLYP and the new form, XYG8[f2g]@BnLYP, becomes narrower with increasing HF-like exchange in the orbitals and almost invisible for n = 70% (see Table 1). The letter g in XYG8[f2g]@BnLYP refers to the nature of the frozen parameters -g for GGA and l for LDA. Going from XYG8[f1]@B20LYP to XYG8[f2g]@B20LYP WTMAD2 increases by 0.16 kcal/mol. Now, comparing these two for the five top-level subsets, performance for all four top-level subsets other than barrier heights is degraded by imposing the cX,GGA=cC,GGA=0 constraint (see Table S2). Interestingly, among the small molecule thermochemistry subsets, the G21EA subset actually improves by imposing that constraint. Now, if we omit the dispersion correction term only and reoptimize everything else (i.e, XYG8[f1d]@B20LYP, where d stands for dispersion), we obtain WTMAD2= 1.92 kcal/mol, which is practically the same as what we got for the XYG8[f2]@B20LYP functional. Interestingly for XYG8[f1]@B20LYP, if both s6 and cX,GGA are frozen (i.e, XYG8[f3]) this drives up WTMAD2 by almost 1 kcal/mol, an order of magnitude more than freezing each of these individually (see Table  S1 in the Supporting Information).
Next, as a control experiment: by fixing the coefficients for the LDA exchange and correlation (cC,LDA and cX,LDA) at zero, we always obtain higher WTMAD2 than the corresponding functional forms where both cX,GGA and cC,GGA are zero. Once again, the WTMAD2 difference seems to become narrower for larger values of n (see Table S2 in the Supporting Information). The best pick in this category, the XYG8[f2l]@B50LYP offers WTMAD2= 2.12 kcal/mol, which is comparable to our group's latest xDHs, xDSD75-PBEP86-D3BJ (WTMAD2 = 2.14 kcal/mol) and xDSD75-PBEP86-D4 (WTMAD2 = 2.12 kcal/mol). 39 Incidentally, for XYG8[f5]@B20LYP, using only 100% HF-like exchange with scaled LDAc and PT2 correlation, WTMAD2 rises to 3.59 kcal/mol (see Table 1).
We now explore the effect of using different LDA semilocal correlations for the construction of these new functionals. XYG7@B3LYP is chosen for this purpose, where the VWN3 correlation is replaced by VWN5. For B3LYP orbitals, we also checked PW92 and found the results to be essentially the same as for VWN5 (see Table S3 and S4 in the Supporting Information). Without further optimization, the original XYG7 43 offers WTMAD2=1.88 kcal/mol, which is 0.17 kcal/mol lower than the value reported by Zhang and Xu. This discrepancy could be attributed to differences in the basis sets used, particularly for anionic subsets.  In 2003, Grimme introduced the SCS-MP3 method, which outperformed regular MP3 at no extra cost; 64 later, Hobza and coworkers 65 coined the term MP2.5 for the average of MP2 and MP3, and showed its superior performance for noncovalent interactions. Recently, we have reported 41 that adding an MP3-like correlation term in the final KS energy expression can significantly improve the performances of both global and range-separated DSD double hybrids, albeit at some extra computational expense. As technical limitations precluded using KS reference orbitals for MP3 at that time, we used the HF reference orbitals to calculate the conventional MP3 correlation energies and added them into the DSD and ωDSD functional forms with an additional linear coefficient. Motivated by the above two findings, here we combine the scaled E3 correlation energies with the XYG8@BnLYP functionals to check whether it is possible to improve their performance further. Similar to Ref. 41 , we have omitted the 50 largest systems, i.e., the UPU23 and C60 subsets, ten largest ISOL24, three INV24, and a single IDISP, we refer to this reduced version of GMTKN55 as mod-GMTKN55 in this paper.
Surprisingly enough, the lowest WTMAD2 we got is 1.28 kcal/mol using the B67LYP orbitals. Similar to the PT2 based functionals above, here too, the optimized cC,GGA values approach zero, and thus cC,GGA can be eliminated without any accuracy loss (see Table 2). Upon comparing the performance of the eight parameter XYG9MP3[f1]@B67LYP with the PT2 based XYG8[f1]@B20LYP for the five top-level subsets, we find that the lion's share of the WTMAD2 improvement is seen in the small and medium molecule thermochemistry and large molecule reactions subsets. Next, comparing the XYG9MP3[f1] with the PT2-based XYG8[f1] evaluated using the same set of orbitals, namely B67LYP, we find that statistics of large-molecule reactions subset improve by 50% upon inclusion of MP3 correlation (see Table S5 in the Supporting Information). In Ref. 41 , for none for the global or range-separated DSD3 functionals could we omit the dispersion correction term with impunity: for instance, ωDSD3-PBEP86-D3BJ offered WTMAD2 = 2.08 kcal/mol while the dispersion-free variant ωDSD3-PBEP86 has WTMAD2 = 2.86 kcal/mol (see Table 3 in Ref. 41 ). However, for the present conventional-MP3-based XYG9MP3[f1]@BnLYP functionals, discarding the dispersion term has only a negligible effect on total WTMAD2 (~0.01-0.02 kcal/mol) -which is a much narrower range compared to that in PT2-based XYG8[f1]@BnLYP functionals (see Figures 1).
We now examine the importance of considering the semilocal DFT exchange coefficient (cX,GGA). The optimized cX,GGA values decrease as n increases (see Table 2). If we fix both cX,GGA and cC,GGA at zero and optimize other coefficients (i.e., XYG9MP3[f2g]@B67LYP), the resulted WTMAD2 goes up only a negligible amount compared to that for the XYG9MP3[f1]@B67LYP. Now, if we also discard the empirical dispersion correction term -which leaves us with only six adjustable parameters (same as our group's revDSD-PBEP86-D3BJ 40 ), we get WTMAD2 = 1.38 kcal/mol. So, for these PT3-containing double hybrids, the dispersion correction term is nearly redundant as far as total WTMAD2 is concerned.
Thus far, we have used canonical MP3 correlation energies as an additional component to improve the performance of GLPT2 based XYG8[f1]@BnLYP double hybrids. In a recent study, Head-Gordon and co-workers have shown 66,67 that the use of DFT orbitals instead of pure HF ones for MP3 calculation can significantly improve performance for thermochemistry, barrier heights, noncovalent interactions, and dipole moments. Additionaly, Jana et al. 68 have found that considering Görling−Levy perturbation terms beyond second order can improve the performance of PBE-based double hybrids for some selected datasets. So, here we explore whether employing the scaled MP3 correlation energies evaluated in a basis of BnLYP orbitals (i.e., KS-MP3) can still be beneficial compared to the already excellent XYG9MP3@BnLYP functionals. We have used the same mod-GMTKN55 dataset for the parametrization as well as validation of these functionals.
We again notice that cC,GGA settles near zero after optimization, so we proceed by setting it to zero and refitting the eight remaining parameters for the KS-MP3 based functionals (namely XYG9[f1]@BnLYP). Using 44% HFx in the orbital generation step, the XYG9[f1]@B44LYP offers the lowest WTMAD2 (1.17 kcal/mol), which is even lower than what we got for the top canonical MP3 based xDH, XYG9MP3[f1]@B67LYP(1.28 kcal/mol). However, the WTMAD2 surface for XYG9[f1]@BnLYP is pretty flat between the 40-50% region (see Figure 1). That being said, the performance of the XYG9[f1]@B44LYP is now in the territory of the best G4-type composite WFT methods (cWFT), recently proposed by Semidalas and Martin. 58,69 It actually surpasses the performance of the G4-T approach 58 by 0.34 kcal/mol; the improvement lies mainly in conformers and to a lesser degree in large molecule energetics (see Table 1 in Ref. 58 ). The correlation consistent cc-G4-type protocols 69 still have an edge over XYG9[f1]@B44LYP, but the accuracy gap has narrowed even more since the top cWFT performers, cc-G4-T and the parameter-free cc-G4-F12-T, have WTMAD2 = 0.90 and 1.04 kcal/mol, respectively. Now, comparing the XYG9[f1]@B44LYP with the PT2-based XYG8[f1]@B20LYP (WTMAD2 = 1.79 kcal/mol for mod-GMTKN55) for the five top-level subsets, we found that the large molecule reactions and small molecule thermochemistry are the two most benefited classes. Let's check where exactly KS-MP3 based functionals benefit over the canonical MP3 based variants. We compare the WTMAD2 contribution of individual subsets between the XYG9[f1] and XYG9MP3[f1] (WTMAD2=1.49 kcal/mol) functionals evaluated over the same set of KS orbitals, B44LYP (see Table S12 and S17 in the Supporting Information). Although performance improved for most of the subsets, the top affected subsets are: TAUT15, RSE43, HAL59, PAREL, MCONF, G21EA and CDIE20. Now, what if we leave the dispersion correction out entirely? The WTMAD2 gap between the XYG9[f1]@BnLYP and XYG9[f2]@BnLYP then almost vanishes (see Figure 1). Due to the negligible effect of the dispersion correction term, we elected not to fit the more advanced D4 49,50 dispersion correction on top of XYG9[f2]@BnLYP. At a reviewer's request, a plot of the interaction energy of stretched Ar2 can be found in the Supporting Information (see Figure S2): proper asymptotic R -6 dependence is observed there, indicating this was not sacrificed by omitting the dispersion correction.
Similar to the PT2 based XYG8@BnLYP and canonical MP3 based XYG9MP3@BnLYP double hybrids, cX,GGA gradually decreases with the increase of n (see Table 3). Now, if we fix both cX,GGA and cC,GGA at zero and reoptimize the remain seven parameters, we obtain WTMAD2 = 1.25 kcal/mol. Additionally, if we also discard the empirical dispersion term (WTMAD2 = 1.42 kcal/mol), we lose a nontrivial amount (0.25 kcal/mol) of accuracy compared to our best pick XYG9[f1]@B44LYP. In contrast, the loss in WTMAD2 is 0.17 kcal/mol compared to the corresponding D3BJ-uncorrected variant XYG9[f2]@B44LYP.
Next, if we consider only pure HF-exchange (i.e., fixing cX,HF=1) and optimize the four remaining correlation prefactors, we get WTMAD2 = 1.56 kcal/mol for the XYG9[f5]@B44LYP (see Table 3). Comparing XYG9[f5]@B44LYP with XYG9[f2]@BnLYP for all 55 subsets of GMTKN55, we find the greatest differences for BUT14DIOL, G21EA, RG18, and BSR36. However, we must mention that the resulting functional XYG9[f5]@B44LYP is no longer a "double hybrid" in the usual sense, as we are mixing the semilocal and nonlocal terms only for the correlation part. However, if we additionally constrain cC,LDA=0 for XYG9[f5]@B44LYP (ak.a, XYG9[f6]@B44LYP), WTMAD2 increases to 2.50 kcal/mol. For the XYG9[f1]@B44LYP functionals, we note that the sum of three exchange parameters is very close to one. So, if we re-enforce the cX,HF + cX,GGA + cX,LDA =1 constraint (one among the four constraints of the original XYG3 32 ) and optimize the seven remaining parameters, how much performance is sacrificed? Performing this test on both the XYG9[f1]@B40LYP and XYG9[f1]@B44LYP and their D3BJ-uncorrected variants, we found that the loss in terms of WTMAD2 is almost imperceptible (see Table S3 in the Supporting Information). So, it is possible to reduce the number of parameters for both variants without sacrificing any significant accuracy.
A reviewer inquired about the counterintuitively negative cX,GGA coefficients in Table 3. In response, we carried out constrained reoptimizations for a sequence of fixed values of cX,HF. The result can be found in the Supporting Information (see Figure S3). It can be seen there that cX,GGA has a linear dependence with a negative slope on cX,HF; beyond ca. 87%, cX,GGA becomes negative. A tentative explanation can be advanced as follows: considering that EX,B88=EX,LDA.FB88(s), where F(s) is an enhancement factor depending on the reduced density gradient s=∇⍴/⍴ 4/3 , it follows that cX,LDA.EX,LDA + cX,GGA.EX,B88 = (cX,LDA+cX,GGA)EX,LDA + cX,GGA(FB88(s) -1). Now as FB88(0)=1 and F(s)≥1 for all s, a negative cX,GGA effectively implies that semilocal exchange will be progressively throttled as s becomes large (in the periphery of the atomic density), while a positive cX,GGA implies the opposite. It stands to reason that, as more "exact" exchange is introduced, the functional would increasingly benefit from semilocal exchange "getting out of the way" as s increases.
Lastly, by adding a KS-MP3 correlation component on top of XYG3@B3LYP and optimizing all four parameters against mod-GMTKN55, we get WTMAD2= 1.64 kcal/mol. On the other hand, doing the same for the XYG7@B3LYP offers WTMAD2= 1.35 kcal/mol (see Table S6 in the Supporting Information).
We can conclude the following from an extensive study of XYG8@BnLYP, XYG9MP3@BnLYP and XYG9@BnLYP double-hybrid density functionals using the large and chemically diverse GMTKN55 and mod-GMTKN55 databases: • For all three xDH variants we have discussed above, the optimized parameter for the semilocal DFT correlation is close to zero across the board and can be safely eliminated without sacrificing any accuracy in terms of WTMAD2. • Both for the PT2 based and MP3 based xDHs presented in this letter, the coefficient for the semilocal DFT exchange term decreases with increasing % HFx used during the orbital generation. Thus, this parameter can also be set to zero if one use orbitals with higher fraction of HFx for the final energy evaluation. We note that this leads to a situation where the final energy expression is entirely independent of post-LDA enhancement factors in both exchange and correlation. • Interestingly enough, using the full HF-like exchange and only the four correlation energy terms (i.e., cC,LDA, c2ab, c2ss and c3) in the final energy expression, we can achieve WTMAD2 = 1.56 kcal/mol for XYG9[f5]@B44LYP, which is only 0.38 kcal/mol inferior to our best pick XYG9[f1]@B44LYP. A final remark is due on the computational cost. Table S18 and Figure S1 show wall clock time dependence on the system size in the ADIM6 subset (n-alkane dimers, n=2-7). XYG8[f1]@B44LYP and other MP2-based double hybrids, in this size range and with the RI approximation, scale about O(N 3 ) with basis set size. The RI-MP3 algorithm used scales almost like canonical O(N 6 ), and hence that term rapidly dominates the overall calculation. However, recent developments 67,70 in tensor hypercontraction approaches 71 reduce scaling to O(N 4 ) and may be effectively exploited here.
We would like to acknowledge helpful discussions with Prof. Dr. A. Daniel Boese (Karl-Franzens-Universität Graz, Austria). GS acknowledges a doctoral fellowship from the Feinberg Graduate School (WIS). The work of E.S. on this scientific paper was supported by the Onassis Foundation -Scholarship ID: F ZP 052-1/2019-2020. The authors would like to thank Dr. Nisha Mehta (WIS) for critical reading of the manuscript. Appendix on the Bayesian Information Criterion; Abridged details of all 55 subsets of GMTKN55 with proper references; Optimized parameters, total WTMAD2 and its division into five major subsets for several XYG8@B3LYP variants and original XYG7@B3LYP; Total WTMAD2 and its division into five major subsets for different variants of XYG8@BnLYP, XYG9MP3@BnLYP and XYG9@BnLYP; MAD, MSD and RMSD as well as breakdown of total WTMAD2 per subset for the XYG8