Exploring Avenues beyond Revised DSD Functionals: II. Random-Phase Approximation and Scaled MP3 Corrections

For revDSD double hybrids, the Görling–Levy second-order perturbation theory component is an Achilles’ heel when applied to systems with significant near-degeneracy (“static”) correlation. We have explored its replacement by the direct random phase approximation (dRPA), inspired by the SCS-dRPA75 functional of Kállay and co-workers. The addition to the final energy of both a D4 empirical dispersion correction and of a semilocal correlation component lead to significant improvements, with DSD-PBEdRPA75-D4 approaching the performance of revDSD-PBEP86-D4 and the Berkeley ωB97M(2). This form appears to be fairly insensitive to the choice of the semilocal functional but does exhibit stronger basis set sensitivity than the PT2-based double hybrids (due to much larger prefactors for the nonlocal correlation). As an alternative, we explored adding an MP3-like correction term (in a medium-sized basis set) to a range-separated ωDSD-PBEP86-D4 double hybrid and found it to have significantly lower WTMAD2 (weighted mean absolute deviation) for the large and chemically diverse GMTKN55 benchmark suite; the added computational cost can be mitigated through density fitting techniques.


INTRODUCTION
While the Kohn−Sham density functional theory (KS-DFT) 1 in principle would be exact if the exact exchange-correlation (XC) functional were known, in practice its accuracy is limited by the quality of the approximate XC functional chosen in electronic structure calculations. Over the past few decades, a veritable "zoo" (Perdew's term 2,3 ) of such functionals has emerged. Perdew introduced an organizing principle known as the "Jacob's Ladder," 4 ascending by degrees from the Hartree "vale of tears" (no exchange, no correlation) to the heaven of chemical accuracy: on every degree or rung, a new source of information is introduced. LDA (local density approximation) constitutes the first rung, GGAs (generalized gradient approximations) the second rung, and meta-GGAs (mGGAs, which introduce the density Laplacian or the kinetic energy density) represent the third rung of the ladder. The fourth rung introduces dependence on the occupied Kohn−Sham orbitals: hybrid functionals (global, local, and range-separated) are the most important subclass here. Lastly, the fifth rung corresponds to inclusion of virtual orbital information, such as in double hybrids (see refs5−7 for reviews, and most recently ref8 by the present authors).
Building on the earlier work of Gorling and Levy 9 who introduced perturbation theory in a basis of Kohn−Sham orbitals, Grimme's 2006 paper 10 presented the first double hybrid in the current sense of the word. The term refers to the fact that aside from an admixture of (m)GGA and "exact" Hartree−Fock (HF)-like exchange, the correlation is treated as a hybrid of (m)GGA correlation and GLPT2 (second-order Gorling−Levy 9 perturbation theory). Following a Kohn−Sham calculation with a given semilocal XC functional and a given percentage of HF exchange, the total energy is evaluated in the second step as: validation benchmarks like GMTKN55 11 (general main-group thermochemistry, kinetics, and noncovalent interactions) that rival those of composite wavefunction theory (cWFT) methods like G4 theory 12,13 (see, however, Semidalas and Martin for some ways to improve cWFT at zero to minimal cost 14,15 ). One Achilles' heel for GLPT2 are molecules with small band gaps (a.k.a absolute near-degeneracy correlation, type A static correlation 16 ), owing to the orbital energy difference in the PT2 denominator becoming very small. One potential remedy would be to replace PT2 by the random phase approximation (RPA) 17 for the nonlocal correlation part. From the viewpoint of wavefunction theory, Scuseria and co-workers 18,19 have analytically proven the equivalence of RPA and direct ring coupled clusters with all doubles (drCCD). While the coupled-cluster singles and doubles (CCSD) method is not immune to type A static correlation, it is much more resilient compared to PT2.
The very first foray in this direction was made by Ahnen et al., 20 who substituted RPA for GLPT2 in the B2PLYP double hybrid. 10 Later, Kaĺlay and co-workers, 21 as well as Grimme and Steinmetz, 22 have explored this possibility in greater depth and came up with their own double hybrids featuring the direct random phase approximation (dRPA, ref 23 and references therein). The dRPA75 "dual hybrid" of Kaĺlay and co-workers, which uses orbitals evaluated at the PBE 75 level (with 75% Hartree−Fock exchange and full PBEc correlation), but only includes pure dRPA correlation in the final energy, is closer in spirit to dRPA than to a double hybrid. In contrast, Grimme and Steinmetz's PWRB95 employs computationally inexpensive mGGA orbitals (specifically, mPW91B95 24,25 ) to evaluate a final energy expression consisting of 50% HF exchange, 50% semilocal exchange, 35% dRPA correlation, 71% semilocal correlation, and 65% nonlocal 26 dispersion correctionmaking it an obvious double hybrid.
One major issue with the dRPA75 was its poor performance for total atomization energies (TAEs, the computational cognates of heats of formation). The authors later remedied that by spin-component scaling: 27 although dRPA is a spin-free method and thus such scaling would have no effect for closedshell systems, it will affect open-shell cases (most relevantly for TAEs, atoms), particularly as dRPA has a spurious selfcorrelation energy for unpaired electrons. 28 The so-called SCS-dRPA75 functional employs c X = 0.75, c o−s = 1.5, and c s−s = (2 − c o−s ) = 0.5addressing the issue for atoms and other open-shell species while being equivalent to dRPA75 for closedshell species. 27 In their revision of the S66x8 noncovalent interactions data set, 29 Brauer et al. 30 found that the ostensibly good performance of dRPA75/aug-cc-pVTZ resulted from a spurious error compensation between the basis set superposition error and the absence of a dispersion correction. They also observed, as expected, that the basis set convergence behavior of dRPA is similar to that of CCSD. A D3BJ dispersion correction 31 was parametrized for use with dRPA75 and its parameters found to be very similar to those optimized on top of CCSD (coupled cluster with all singles and doubles 32 ); from a symmetry-adapted perturbation theory 33,34 perspective, the most important dispersion term not included in dRPA and CCSD is the fourth-order connected triple excitations term.
In addition, as already mentioned, the dRPA75 and SCS-dRPA75 forms do not include any semilocal correlation contribution in their final energy expressions.
The first research question to be answered in this paper is (see Section 3.1) whether (SCS)dRPA75 can be further improved by not only admitting modern dispersion corrections and semilocal correlation but also reparametrizing against a large and chemically diverse database. The functional form is denoted as DSD-XCdRPAn-Disp, where DSD stands for dispersioncorrected, spin-component-scaled double hybrid, XC stands for the nonlocal exchange-correlation combination used for both the orbital generation in the first step and energy calculation in the second step; n is the percentage of HFexchange used for both the steps. The final energy for DSD-XCdRPAn-Disp has the form: where, c o−s and c s−s stand for opposite-spin and same-spin dRPAc coefficient, respectively. All other terms are the same as eq 1. In this notation, the SCS-dRPA75 dual hybrid is a special case where c X,HF = 0.75, c C,XC = 0, and s 6 = s 8 = 0. As we will show later on, the answer to our research question is affirmative, and the resulting functionals approach the accuracy of the best PT2based double hybrids known thus farMardirossian and Head-Gordon's 35 ωB97M(2) and our own 36 revDSD-PBEP86-D4. The second research question (to be answered in Section 3.2) is: would taking GLPT2 beyond the second-order improve the performance of revDSD functionals further? Radom and coworkers 37 considered MP3 (third-order many-body perturbation theory), MP4, and CCSD instead of MP2 and found no significant improvement over regular double hybrids. However, this may simply have been an artifact of the modest basis sets and relatively small training set used in ref 37. Such considerations have been examined in ref 14 where it was also found that the benefits of including an MP3 "middle step" in a 3-tier cWFT can be realized also with a medium-sized basis set for this costly term. In the sections below, we shall consider its addition to global double hybrid revDSD 36 and range-separated ωDSDtype double hybrids using the GMTKN55 data set for training/ calibration. Newly developed functionals will be denoted as DSD3 for global DHs and ωDSD3 for range-separated DHs. The final energy expression of a DSD3 functional has the following form: The primary parametrization and validation set used in this work is the GMTKN55 (general maingroup thermochemistry, kinetics, and noncovalent interactions) benchmark 11 by Grimme, and co-workers. This database is an updated and expanded version of its predecessors GMTKN24 39 and GMTKN30. 40 GMTKN55 comprises 55 types of chemical model problems, which can be further classified into five major (top-level) subcategories: thermochemistry of small and medium-sized molecules, barrier heights, large-molecule reactions, intermolecular interactions, and conformer energies (or intramolecular interactions). One full evaluation of the GMTKN55 requires a total of 2459 single point energy calculations, leading to 1499 unique energy differences (complete details of all 55 subsets and original references can be found in Table S1 in the Supporting Information).
The WTMAD2 (weighted mean absolute deviation, type 2) as defined in the GMTKN55 paper 11 has been used as the primary metric of choice throughout the current work: where |ΔE̅ | i is the mean absolute value of all the reference energies from i = 1 to 55, N i is the number of systems in each subset, MAD i is the mean absolute difference between the calculated and reference energies for each of the 55 subsets. Mean absolute deviation (MAD) is a more "robust" metric than root-mean-square difference (RMSD), in the statistical sense of the word 41 that it is more resilient to a small number of large outliers than the RMSD. For a normal distribution without systematic errors, RMSD ≈ 5MAD/4. 42 As one reviewer pointed out, the average absolute reaction energies (AARE) for subsets NBPRC and MB16-43 given in the GMTKN55 paper 11 differ from the corresponding values calculated from the individual data provided in the Supporting Information. If these corrected AARE values were employed in the construction of the WTMAD2 equation, eq 4, then their average, which appears in eq 4 as the overall scale factor, would be 57.76 rather than 56.84. However, as all previously published papers on GMTKN55 (such as refs3, 8, 31, 36, 43−45) have used the original (smaller) coefficient, we are retaining it as well for the sake of compatibility. This obviously will not affect the ranking between functionals; those who prefer WTMAD2 57.76 can simply multiply all WTMAD2 values by 1.0162.
Reference geometries were downloaded from the Supporting Information of refs 11 and 46 and used without further geometry optimization.
2.2. Electronic Structure Calculations. The MRCC2020 47 program package was used for all calculations involving dRPA correlation. The Weigend−Ahlrichs 48 def2-QZVPP basis set was used for all of the subsets except WATER27, RG18, IL16, G21EA, BH76, BH76RC, and AHB21where the diffuse-function augmented def2-QZVPPD 49 was employedand the C60ISO and UPU23 subsets, where we settled for the def2-TZVPP basis set to reduce computational cost. 48 The LD0110-LD0590 angular integration grid was used for all the DFT calculations; this is a pruned Lebedev-type integration grid similar to Grid = UltraFine in Gaussian 50 or SG-3 in Q-Chem. 51 In their original GMTKN55 paper, Goerigk et al. 11 correlated all electrons in the post-KS steps. However, in a previous study by our group, 36 we have shown that core-valence correlation is best omitted when using the def2-QZVPP basis set (which has no core-valence functions), while a more recent study on composite wavefunction methods indicated that even with correlation consistent core-valence sets, the effect of subvalence electrons on WTMAD2 of GMTKN55 is quite smallbenefits gained there are mostly from the added valence flexibility of the basis sets. 15 Exceptions were made for MB16-43, HEAVY28, HEAVYSB11, ALK8, CHB6, and ALKBDE10 subsetswhere the orbital energy gaps between the halogen and chalcogen valence and metal subvalence shells can drop below 1 hartree, such that subvalence electrons of metal and metalloid atoms must be unfrozenas well as for the HAL59 and HEAVY28 subsets, where (n − 1)spd orbitals on heavy p-block elements were kept unfrozen. We note in passing that, unlike the valence correlation consistent basis sets, the Weigend−Ahlrichs QZVPP basis set is multiple-zeta in the core as well and contains some core-valence polarization functions: see Table 1 of ref 48. At any rate, we have considered 15 the impact of core-valence correlation on GMTKN55 using correlation consistent corevalence basis sets and found (in the context of pure wavefunction calculations) that its impact is on the order of 0.05 kcal/molwhich will be further reduced here through attenuation of the correlation terms.
For the DSD3 and ωDSD3 functionals, QCHEM 51 5.3 was used throughout. The same "Frozen core" settings and integration grids were applied as were used in the preceding paper on the revDSD and ωDSD functionals. 36 In order to reduce the computational cost, all the MP3 calculations were done using the def2-TZVPP basis set; 48 all other energy components were evaluated using the same basis set  However, DSD3-type functionals (see below) introduce one additional parameter (c 3 ) for the MP3 correlation term. For the ωDSD3 family, yet another parameter ω needs to be considered for range-separation, which brings the total number of empirical parameters to eightstill only half the number involved in the current "best in class" double hybrid ωB97M(2), 35 which has 16 empirical parameters.
We employed Powell's BOBYQA 56 (Bound Optimization BY Quadratic Approximation) derivative-free constrained optimizer, together with scripts and Fortran programs developed inhouse, for the optimization of all parameters.
Once a full set of GMTKN55 calculations is done for one set of fixed nonlinear parameters c X,HF and c C,DFT (for ωDSD3 also ω), the associated optimal values of the remaining parameters {c 2ab , c 2ss , (c 3 ), s 6 , a 2 } can be obtained in a "microiteration" process. This entire process corresponds to one step in the "macroiterations" in which we minimize WTMAD2 with respect to {c X,HF , c C,DFT } and, where applicable, (ω). The process is somewhat akin to microiterations in CASSCF algorithms w.r.t. CI coefficients vs orbitals (see ref 57 and references therein), or QM-MM geometry optimizations where geometric parameters in the MM layer are subjected to microiteration for each change of coordinates in the QM layer (e.g., ref 58).
In view of the small number of adjustable parameters, we have elected, as in our previous studies, to effectively use all of GMTKN55 as both the training and validation set.

RESULTS AND DISCUSSION
3.1. GMTKN55 Suite. In our previous study, 36 we found that refitting of the original DSD functionals 54,55 to the large and chemically diverse GMTKN55 data set led to greatly improved performance, particularly for noncovalent interaction and largemolecule reaction energy. Motivated by this prior finding, we attempted first to reoptimize the spin-component-scaling factors in SCS-dRPA75 and obtained WTMAD2 = 4.71 kcal/moljust a marginal improvement over the original 27 dual hybrid (WTMAD2 = 4.79 kcal/mol).
In the S66x8 noncovalent interaction benchmark paper, 30 dRPA75-D3BJ with basis set extrapolation was found to be the best performer of all DFT functionals. Inspired by this observation, we added a D3BJ correction on top of the Kaĺlay SCS-dRPA75 dual hybrid 27 and found that WTMAD2 dropped from 4.79 to 2.89 kcal/mol. For perspective, it should be pointed out that the lowest WTMAD2 thus far found for a rung-four functional is 3.2 kcal/mol for ωB97M-V. 59 By additionally relaxing the opposite spin and same spin (SS−OS) balance of the dRPA correlation in the optimization, WTMAD2 can be further reduced to 2.76 kcal/mol (see Table 1). As expected, the majority of the improvement comes from the noncovalent interaction and large molecule reaction subsets (Figure 1).
Considering that the energy expression for optSCS-dRPA75-D3BJ contains full dRPA correlationunlike revDSD double hybrids, where the GLPT2 correlation is scaled down by ∼50%one can reasonably expect basis set sensitivity. Would improving the basis set beyond def2-QZVPP reduce WTMAD2 further? Extrapolating from def2-TZVPP and def2-QZVPP using the familiar L −3 formula of Halkier et al., 60 we found a reduction by only 0.03 kcal/molwhile using a compromise extrapolation exponent between the L −3 opposite-spin and L −5 for same-spin correlation, α = 3.727 from solving [((4/3) 3 −   Table 1). Among all 55 subsets, BSR36, MCONF, and to some extent WATER27 and PNICO23 benefitted by considering D4. Incidentally, in response to a reviewer query, we have evaluated the impact of the recent revision 65 of D4 (corresponding to version 3 of the standalone dftd4 program) and found the difference for WTMAD2 to be negligible (0.005 kcal/mol) even for PBE0-D4, where s 6 = 1 unlike for the double hybrids at hand.
Thus far, we have only considered dRPA correlation for the nonlocal correlation part of the dual hybrids. Can further improvement be achieved by also mixing some semilocal correlation component into the final energy (i.e., by transforming Kaĺlay's dual hybrid into the true DHDF form)? By doing so, we obtained the DSD-PBEdRPA 75 -D3BJ functional for which WTMAD2 is reduced by an additional 0.38 kcal/mol (see Table 1) at the expense of introducing one additional parameter (c C,DFT ). The intermolecular interactions subset is the only one that does not show a net improvement. The individual data sets that do benefit most are SIE4x4, AMINO20X4, ISOL24, PCONF21, BH76, and PNICO23 (for S66 and BSR36, performance deteriorates). Indeed, this DSD-PBEdRPA 75 -D3BJ (WTMAD2 = 2.36 kcal/mol) compares favorably to its GLPT2-based counterpart, revDSD-PBE-D3BJ (WTMAD2 = 2.67 kcal/mol): a detailed inspection suggests significant improvements for BUT14DIOL, AMINO20x4, TAUT15, HAL59, G21EA, and BHPERI and degradations for SIE4x4 and RG18. If we additionally relax a 2 from its fixed value (while keeping a 1 = s 8 = 0 fixed) WTMAD2 drops slightly further to 2.33 kcal/mol.
Supplanting D3BJ with the D4 61,62 correction leads to a further drop in WTMAD2 to 2.32 kcal/molslightly better than its PT2-based counterpart revDSD-PBEPBE-D4 36 (WTMAD2 = 2.39 kcal/mol). Comparing these two for the five top-level subsets, we found that the dRPA-based double hybrid performs worse for the intermolecular interaction (the lion's share of that due to RG18), comparably for conformer energies, and better for the remaining three (see Figure 1), despite the exception of SIE4x4 due to increased self-interaction error. TAUT15 and G21EA are the two subsets which benefit the most, whereas the two subsets that deteriorate most are SIE4x4 and RG18.
The poor performance of DSD-PBEdRPA 75 -D4 for SIE4x4 can be mitigated by applying the constraints c s−s = 0 and c o−s = 2: MAD for SIE4x4 drops from 9.0 to 4.7 kcal/mol, at the expense of spoiling thermochemical performance.
In a previous study, we found 36 that including the subvalence electron correlation in the GLPT2 step marginally improved WTMAD2 further. This is not the case here: in fact, correlating subvalence electrons with the given basis sets (which do not contain core-valence correlation functions) actually does more harm than good. Therefore, we have not pursued this avenue further (for a detailed discussion and review on basis set convergence for core-valence correlation energies, see ref66).
Thus far, we have kept c X,HF fixed at 0.75. What if we include it too in the optimization process? For each value of c X,HF , a complete evaluation of the entire GMTKN55 data set is required. We performed such evaluations for five fixed c X,HF points (c X,HF = 0.0, 0.25, 0.50, 0.75, and 0.90), where the same fraction of HF-exchange was used for both the orbital generation and the final energy calculation steps. Interpolation to the aforementioned data points suggests a minimum in WTMAD2 near c X,HF = 0.68; however, upon actual GMTKN55 evaluation at that point, we found that the corresponding WTMAD2 value (2.34 kcal/mol) is very close to the minimum WTMAD2 calculated, 2.32 kcal/mol for c X,HF = 0. 75. It thus appears that the The Journal of Physical Chemistry A pubs.acs.org/JPCA Article WTMAD2 hypersurface in that region is rather flat with respect to variations in c X,HF . Performance of the barrier heights subset deteriorates sharply beyond c X,HF = 0.75; for all other subsets, however, trends are not as straightforward. Error statistics for conformer energies remain more-or-less unchanged beyond 50% HF exchange. For c X,HF < 0.5, a high WTMAD2 value is obtained due to poor performance for small-molecule thermochemistry (see the left side of Figure 2). For each c X,HF , the optimized parameters, the WTMAD2, and its breakdown into five top-level subset components can be found in Table S3 in the Supporting Information. We also noticed that, with increasing %HF for our functionals, the fraction of DFT correlation in the final energy expression decreases almost linearly and approaches zero near c X,HF = 0.85.
For the GLPT2-based double hybrids, we found that in both the original 54,55 and revised 36 parametrizations, the P86c 67,68 semilocal correlation functional yielded superior performance to PBEc 69 (and indeed all other options considered), while we earlier found 54,55 that pretty much any good semilocal exchange functional will perform equally well. Presently, however, we found that DSD-PBEP86dRPA n alternatives yield only negligible improvements over their DSD-PBEP86dRPA n counter-partspresumably because the coefficient for the semilocal correlation is so much smaller here.
That being said, our own DSD-PBEdRPA75-D4 and DSD-PBEP86dRPA75-D4 are still inferior to Mardirossian and Head-Gordon's 35 combinatorially optimized range-separated double hybrid, ωB97M(2) (WTMAD2 = 2.13 kcal/mol) (see Table S2 in the Supporting Information). It should be noted here that ωB97M(2) was not trained against GMTKN55 but against a subset of the ca. 5000-point MGCDB84 (main group chemistry data base 70 ), although substantial overlap exists between GMTKN55 and MGCDB84.
3.2. "External" Benchmarks. Next, we tested our new dRPA-based double hybrids against two separate data sets very different from GMTKN55: the metal−organic barrier height (MOBH35) database by Iron and Janes 71 (see also erratum 72 ) and the polypyrrols (extended porphyrins) data set POLY-PYR21. 73,74 Both data sets are known to exhibit moderately strong static correlation (a.k.a., near-degeneracy correlations) effects. 16 3.2.1. MOBH35. This database 71 comprises 35 reactions ranging from σ-bond metathesis over oxidative addition to ligand dissociations. 71 We extracted the reported best reference energies" from the erratum 72 to the original ref 71 The def2-QZVPP basis set was used for all of our calculations reported here.
Note that these are all closed-shell systems, hence dRPA75, SCS-dRPA75, and optSCS-dRPA75 are equivalent for this problem. Unless a semilocal correlation is introduced into the final energy expression, adding a D3BJ or D4 dispersion correction appears to do more harm than good. However, if the association reactions 17−20 are removed from the statistics, the difference goes awaystrongly pointing toward basis set superposition error as the culprit (omitting dispersion corrections would lead to an error cancellation 30 ). Among all the functionals tested, DSD-PBEP86dRPA75-D4 and DSD-PBEdRPA75-D4 are the two best performers, both with MAD = 0.9 kcal/mol. Both with D3BJ and D4 corrections, DSD-PBEdRPA75 and DSD-PBEP86dRPA75 are better performers compared to their GLPT2-based revDSD counterparts (the purple bars in Figure 3).
Semidalas et al. (to be published) have recently investigated MOBH35 using a variety of diagnostics for static correlation, as well as recalculated some of the reference energies using canonical CCSD(T) rather than the DLPNO-CCSD(T) approximation. 75 They found that severe type A static correlation in all three structures for reaction 9 (but especially the product) led to a catastrophic breakdown of DLPNO-CCSD(T), to the extent that it can legitimately be asked if even canonical CCSD(T) is adequate. Therefore, omitting this particular reaction and recalculating MADs using the remaining 34 reactions (the orange bars in Figure 3) causes all MADs for the revDSD double hybrids to drop significantly. In contrast, If, in addition to reaction 9, we also leave out the bimolecular reactions 17−20 (we note that these reactions were omitted from Dohm et al.'s recent revision 76 of MOBH35 as well) and calculate MADs for the remaining 30 reactions (the green bars in Figure 3), the MAD values are seen to drop across the board. However, unlike the full MOBH35, here all the dual hybrids perform similarly, whether we include any dispersion correction or not. The same is true for all the double hybrids. From Figure  3, it is clear that, for DSD-PBEP86dRPA75-D4 and DSD-PBEdRPA75-D4, the MAD values drop slightly compared to the MADs calculated against the original MOBH35.
3.2.2. POLYPYR21. This data set contains 21 structures with Huckel, Mobius, and figure-eight topologies for representative [4n] π-electron expanded porphyrins, as well as the various transition states between them. 73,74 Among these 21 unique structures, Mobius structures and transition states resembling them exhibit pronounced multireference character (for more details see ref 73). We have used def2-TZVP basis set throughout; CCSD(T)/CBS reference energies have been extracted from ref 73. As these are all closed-shell systems, changing the OS-SS balance has no effect on the RMSD value, hence dRPA75, SCS-dRPA75, and optSCS-dRPA75 offer identical error statistics. Adding either D3BJ or D4 dispersion correction on top of that does more harm than good.
Next, similar to what we found for GMTKN55, mixing in semilocal correlation (i.e., DSD-XCdRPAn-Disp) helps quite a bit. Considering the D3BJ dispersion correction, both the dRPA-based double hybrids outperform their PT2-based revDSD counterparts. On the contrary, with D4 dispersion correction, revDSD-D4 functionals have a slight edge over the dRPA-based double hybrids. As expected, the performance variation mainly comes from the Mobius structures, whereas RMSD statistics for the Huckel and twisted-Huckel topologies stay more or less the same for all DSD-DHs (see the third and fourth columns of Table 2).
3.3. DSD3 and ωDSD3 Family Functionals: Introducing Scaled Third-Order Correlation. As mentioned in the Introduction, Radom and co-workers 37 tried to improve on double hybrids by introducing MP3, MP4, and CCSD correlation. Unfortunately, using fairly modest basis sets and fitting correlation energy coefficients to the small and chemically one-sided G2/97 77 database of atomization energies, they failed to discern any significant improvement beyond regular double hybrids. From our previous experience, 36 we know that the use of small, idiosyncratic training sets for empirical functionals may lead to highly suboptimal performance. Thus, here, we are instead employing GMTKN55, which is more than an order of magnitude larger and covers many other types of energetic properties. All the "microiteration" (i.e., linear) parameters were refitted (i.e., c DFT , c 2ab , c 2ss , and c 3 ; s 6 for D3BJ subject to s 8 = a 1 = 0, a 2 = 5.5 fixed; s 6 , a 1 , and a 2 for D4 subject to s 8 = 0, c ATM = 1). Two functionals, DSD-PBEP86 and ωDSD 69 -PBEP86 (ω = 0.16) are considered as the representatives of global and rangeseparated DHs for the present study. It was previously found, 14 in a cWFT context, that the MP3 term does not change greatly beyond the def2-TZVPP basis set, hence we restrict ourselves to the latter in an attempt to control computational cost.
Total WTMAD2 and optimized parameters for all the DSD3, ωDSD3 and corresponding revDSD functionals are presented in Table 3 (for individual subsets of GMTKN55, see Tables S13− S16 in the Supporting Information). Analyzing the results, we can conclude the following.
• Considering PT2 and MP3 correlation together and scaling the MP3 term by an extra parameter (c 3 ) does improve performance for both the DSD3 and ωDSD3 functionals at the expense of the extra computational cost entailed by the MP3/def-TZVPP calculations. • For DSD3 with D4 dispersion correction, the improvement is 0.17 kcal/mol compared to revDSD-PBEP86-D4. Among all 55 individual subsets, the RSE43 subset benefited the most and performance for BHPERI and TAUT15 also improved to some extent. However, for ωDSD3 the performance gain is more pronounced, 0.29 kcal/mol (see Table 3). Inspection of all 55 individual subsets reveals that the RSE43 and TAUT15 subsets showed significant gain in accuracy and AMINO20x4, RG18, ADIM6, and S66 only marginally improved. It differs from the present work in that the semilocal starting point is 100% Hartree−Fock without semilocal correlation, rather than a hybrid GGA as here. Clearly the latter offers an advantage.
Although both the G4(MP3|KS)-D-v5 and DSD3 method use spin component-scaled PT2 correlation and scaled MP3 correlation, the key differences between these two are: no ). This scaling behavior is similar to that of canonical MP3, which however does not need to store the amplitudes within a direct algorithm. Thus, the estimated speedup of MP3 over CCSD would be 10−20 times (i.e., the typical number of CCSD iterations). Therefore, in terms of computational cost, this advantage makes the results obtained for DSD3 and ωDSD3 functionals interesting enough without the need for using any further acceleration techniques, such as tensor hypercontraction density fitting (THC-DF-MP3) 78 or the interpolative separable density fitting (ISDF) 79 for the MP3 step.
Following a bug fix to the open-source electronic structure program system PSI4, 80 (version 1.4rc1+) we were able to run RI-MP3 (a.k.a. DF-MP3) for all but a couple dozen of the species for which we had canonical MP3. In the size range of melatonin conformers, we found this to be about seven times faster (wall clock) than conventional MP3, and the overall wall clock time for DSD3-PBEP86-D3BJ and ωDSD3-PBEP86-D3BJ was found to be about three times shorter. It should be noted that our machines are equipped with fast solid state disk scratch arrays with a 3 Gb/s bandwidth for sequential writes; for conventional scratch disks, the canonical:RI wall time ratio would be much more lopsided. By way of example, for DSD3-PBEP86-D3BJ, WTMAD2 using conventional MP3 and RI-MP3 components differs by just 0.03 kcal/mol; when substituted for canonical MP3 inside DSD3-PBEP86-D3BJ and ωDSD3-PBEP86-D3BJ, the effects on WTMAD2 are just −0.009 and −0.004 kcal/mol, respectively.
The computational time requirements were checked for two molecules from GMTKN55: one melatonin conformer and one peptide conformer (see Figure S1 for structures). From Figure 4 we can conclude the following:

CONCLUSIONS
Analyzing the results presented above for the dRPA-based double hybrids; original and reparametrized form of SCS-dRPA75 dual hybrid; and DSD3-and ωDSD3-type double hybrid functionals (all evaluated against GMTN55), we are able to state the following conclusions. Concerning the first research question:  The Journal of Physical Chemistry A pubs.acs.org/JPCA Article a) Following the recommendation of Martin and coworkers, 30 adding a dispersion correction on top of the original SCS-dRPA75 significantly improved the WTMAD2 statistics, D4 slightly more so than D3BJ. b) By additionally admitting a semilocal correlation component into the final energy expression, we were able to obtain DSD-PBEdRPA 75 -D3BJ and DSD-PBEdRPA75-D4 functionals that actually slightly outperform their PT2-based counterparts, 36 revDSD-PBE-D3BJ and revDSD-PBE-D4. c) We considered different percentages of HF exchange but found the WTMAD2 curve flat enough in the relevant region, for both the DSD-PBEdRPAn-D4 and DSD-PBEP86dRPAn-D4 variants, that c X,HF = 0.75 is a reasonable choice. d) Judging from the SIE4x4 subset, we found that the refitted SS-OS balance in dRPAc apparently causes significant self-interaction errors. This issue can be eliminated by applying the constraint, c s−s = 0, c o−s = 2at the expense of spoiling small-molecule thermochemistry. Concerning the second research question, we considered a different post-MP2 alternative, namely, the addition of a scaled MP3 correlation term (evaluated in a smaller basis set, and using HF orbitals, for technical reasons). Particularly when using range-separated hybrid GGA orbitals, we achieved a significant improvement in WTMAD2. Especially in conjunction with RI-MP3 or with further acceleration techniques like fragment molecular orbital-based FMO-RI-MP3 81 or the chain-of-spheres approximation for SCS-MP3 as implemented by Izsaḱ and Neese, 82 this approach could potentially be very useful. Head-Gordon and co-workers have very recently shown 83 that the use of DFT orbitals for regular MP3 level calculation results significantly improved performance for thermochemistry, barrier heights, noncovalent interactions, and dipole moments compared to the conventional HF-based MP3. Unlike what Semidalas and Martin 14 observed for their G4(MP3|KS)-D-v5 method, we have found that the dispersion correction term cannot be neglected for DSD3 or ωDSD3 functionals.
More extensive validation calculations of these and prior functionals, both in quantity (using the larger MGCDB84 benchmark 70 ) and in system size (MPCONF196, 84 37CONF8, 85 S30L, 86,87 and to some extent MOR41 88 ), are in progress in our laboratory.
Abridged details of all 55 subsets of GMTKN55 with proper references; breakdown of total WTMAD2 into five major subcategories for the original and refitted SCS-DRPA75 variants, DSD-PBEdRPA 75 , DSD-PBEP86dR-PA 75 , and corresponding revDSD functionals with dispersion correction; optimized parameters and breakdown of total WTMAD2 into five top-level subsets for DSD-PBEdRPAn and DSD-PBEP86dRPAn with both D3BJ and D4 dispersion correction; MAD, MSD, and RMSD as well as breakdown of total WTMAD2 by each subset for each new functionals we propose here; figure showing the structures of two systems upon which we experimented the computational time requirements for different functionals; and ORCA sample inputs for revDSD-PBEP86-D3BJ and revDSD-PBEPBE-D3BJ (PDF)