FCIQMC-Tailored Distinguishable Cluster Approach

The tailored approach is applied to the distinguishable cluster method together with a stochastic FCI solver (FCIQMC). It is demonstrated that the new method is more accurate than the corresponding tailored coupled cluster and the pure distinguishable cluster methods. An F12 correction for tailored methods and FCIQMC is introduced, which drastically improves the basis set convergence. A new black-box approach to define the active space using the natural orbitals from the distinguishable cluster is evaluated and found to be a convenient alternative to the usual CASSCF approach.


INTRODUCTION
Coupled cluster (CC) theory 1−3 is a reliable and standard tool in quantum chemistry to treat many-body electron correlations at high accuracy level. 4−7 However, the standard CC methods yield qualitatively erroneous results when applied to strongly correlated systems, e.g., multicenter transition metal complexes and other multiradicals, or in other highly degenerate situations encountered, e.g., in systems away from the equilibrium geometry. These complex systems require multiple similarly weighted configurations in the wave function even for qualitatively correct description and represent a challenging problem for the contemporary quantum chemistry.
Conventional methods applied to such strongly correlated systems rely on the multireference approach. However, the complexity and computational cost of this methodology is comparably high and makes the single-reference-based (SRbased) approaches more appealing. There have been many attempts to design SR-CC type methods able to properly describe also static electron correlation. 8−19 Generally, these approaches can be divided into two classes of corrections externally and internally corrected CC methods, and a good overview of these methods can be found in a recent review by Paldus. 20 The externally corrected CC methods rely on externally generated cluster amplitudes, which are adopted as a correction to the CC amplitudes. These methods make use of the fact that the wave function can be expressed in the split-amplitude ansatz. 17,18 The external source of the cluster amplitudes can be chosen to describe well the static part of the electron correlation in the system, leaving the rest to the CC methodology, thereby alleviating the sensitivity of the CC methods to the static correlation. Usually, the separation is done at the orbital level, introducing an active space, which is treated at the full configuration interaction (FCI) level. The separation can be combined with the self-consistent orbital optimization, yielding the well-known complete-active-space self-consistent field (CASSCF) method, which defines the strong electron correlation part and generates the external cluster amplitudes. In the tailored CC (TCC) methods 19,21,22 the external amplitudes do not change during the solution of the CC amplitude equations and are simply replacing the corresponding CC amplitudes; the remaining CC amplitudes are relaxed as usual.
The internally corrected CC methods, on the other hand, rely on possible mutual cancellations of certain cluster components, e.g., approximate coupled pair (ACP), [9][10][11]23,24 approximate coupled-cluster doubles (ACCDs), 25 nCC hierarchy, 14 parametrized coupled-cluster (pCCSD), 26 or the distinguishable cluster (DC) 27−31 methods, or introduce new terms into the CC amplitude equations, e.g., quasi-variational CC 15 or CC valence bond. 16 The DC with singles and doubles (DCSD) method 27,28 has been introduced as a small modification of the CC with singles and doubles (CCSD) amplitude equations, and it has been demonstrated in numerous studies that the resulting method is more accurate than CCSD for weakly correlated systems. 7,29,31−35 Additionally, DCSD often provides a qualitatively correct description of the strongly correlated systems.
Recently, it has been demonstrated that the DC approximation is also applicable to triple excitations. 36, 37 The high accuracy of the DCSD method and improved stability to the onset of the static correlation suggest its superiority over CCSD also in the tailored formalism, and in this work we investigate the applicability of the tailored CC methodology to the DCSD method, denoted in the following as TDC.
Further, calculation of the external correction can easily become a bottleneck in this method, since the tailored methods generally require large active spaces to accurately describe multiradicals. In this respect different ideas have been investigated, and especially the density-matrix-renormalization-group (DMRG) based TCC 38−40 demonstrated very promising results. In the present study, we utilize the full configuration interaction quantum Monte Carlo (FCIQMC) method 41−44 to obtain the external amplitudes in large active spaces well beyond 20 active orbitals. We will refer to this combination of TCC/TDC with FCIQMC as FCIQMC-TCC/ FCIQMC-TDC methods.
Additionally, in the present work we investigate various possibilities to define the active space. Apart from the usual (but expensive) CASSCF route, we employ natural orbitals from a preceding DCSD calculation, with occupation numbers significantly differing from zero and two, as the active orbital subspace for FCI or FCIQMC calculations. This route is especially appealing as a basis set correction for the expensive FCIQMC calculations, and in this respect we also introduce a simple F12 explicit correlation correction on top of the TCC/ TDC methods based on the Valeev's perturbative F12 approach. 45 Finally, we compare the tailored methods to a simple subtractive embedding scheme for strongly correlated systems, similar to the one utilized in the embedded MRCC methods. 46 Here, the embedded FCI calculation is performed in the space of most active DCSD natural orbitals, and the correction to the total DCSD correlation energy is calculated as the difference between the embedded FCI energy and the embedded DCSD energy.

OVERVIEW OF COMPUTATIONAL APPROACHES
In the following, we will briefly describe the computational methods exploited in this work. The tailored DC and CC methods have been implemented in the Molpro package, 47−49 and the FCIQMC calculations have been done using the NECI program. 44 2.1. Distinguishable Cluster Approach. In the DC with doubles (DCD) method exchange diagrams in quadratic terms in the CC with doubles (CCD) amplitude equations are removed and the remaining quadratic terms are rescaled to retain the overall exactness for two electrons (after orbital relaxation) and the particle−hole symmetry of the equations. 27,50 Alternatively, the method can be derived from a screened Coulomb formalism by introducing a direct-randomphase-approximation (dRPA) screening and considering all possible pairwise interactions. 30 The resulting DCD amplitude equations are very similar to CCD equations, and the method is also size-extensive and orbital-invariant. The orbital relaxation can be added using similarity transformation of the Hamiltonian with e T1 (as well as other techniques), 28 and the resulting DCSD method exhibits the same level of insensitivity with respect to occupied-virtual orbital rotations as CCSD.
However, the DCSD method possesses a number of advantages compared to the CCSD method. It can be applied to strongly correlated systems without exhibiting the usual breakdown of the traditional CC methods; it is also much more accurate than CCSD even for weakly correlated systems. 29 Besides, it can be implemented more efficiently and has a lower computational scaling in some representations. 36,51,52 For particle−hole symmetric systems as Pariser−Parr−Pople or Hubbard models, DCD is equivalent to the ACP-D14 method, 10 which has been shown to be exact in the fully correlated limit of these models. We refer the interested reader to the recent review by Paldus, 20 which discusses the connection of ACP and DCD methods.
The one-body density matrices can be calculated using the established Lagrange technique, which is approximately twice as expensive as the energy calculation, since Lagrange multiplier equations have to be solved. Natural orbitals and the corresponding occupation numbers can be obtained as eigenvectors and eigenvalues of the density matrices.
The basis set convergence of the method can be very much improved by adopting the F12 techniques, either by adding the F12 terms to the amplitude equations 29,53 or as a perturbative correction. 45,54 The later approach will be utilized here to speed up the basis set convergence in the tailored methods.
2.2. FCI Quantum Monte Carlo. FCIQMC is a stochastic method to compute the ground-state energy of extremely large many-body Hamiltonians in the context of FCI methods, i.e., electronic wave functions expanded in Slater determinant spaces comprising all possible determinants constructable from a given spatial orbital basis. Compared to the deterministic FCI, much larger systems can be calculated with over 100 active orbitals.
FCIQMC samples the Slater determinant space by a number of walkers, whose population dynamics evolves according to a simple set of rules: spawning, death, and annihilation processes. The equation that guides the walkers population dynamics is 1) where N i is the number of walkers on the determinant i, τ is the imaginary time, H ij are the Hamiltonian matrix elements in the basis of Slater determinants, and S is the shift parameter that controls the total walker number. The idea is to perform a long time integration of the imaginary-time Schrodinger equation, with the propagator expanded to first order using a discrete time step Δτ. In the limit of long imaginary-time propagation, the algorithm converges to the ground-state wave function, as long as the number of walkers used is sufficient to sample the Hilbert space. However, the last condition is not straightforward to follow because the number of walkers usually scales with the size of the Hilbert space itself. In this respect, the initiator method (i-FCIQMC) introduced in 2010 by Cleland et al., 42 here used by default, made possible to get a stable convergence with a lower number of walkers. However, this method introduces a bias which can get substantial in large systems. This initiator error can be seen as the sign of size-inconsistency error that normally affects CI methods. A recent development of the so-called adaptive-shift method dramatically reduces this error and allows one to obtain near-FCI quality results up to systems such as benzene. 55 The latter method modifies the shift applied to a non-initiator determinant such that only a suitable local shift is used.
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article Additionally, the semistochastic approach 56,57 can be used, where a number of the most populated determinants is selected and this part of the imaginary-time propagation is performed deterministically, which dramatically reduces the stochastic noise. It turned out to be especially beneficial for our purposes of extracting accurate CI coefficients from the FCIQMC calculation, since it allows one to reduce the time-averaging of the coefficients.
Recently, FCIQMC has been combined with coupled-cluster methods, 58,59 either as a means to select important high-order excitations for CC(P,Q) methods, or as a means to accelerate the convergence of the FCIQMC method with respect to imaginary-time propagation; in both cases the objective being to approach the exact FCI energy with greater efficiency than allowed either by pure high-order CC methods or FCIQMC methods. In the methodology to be presented below the objective is different: it is to use FCIQMC as a complete-activespace solver, whose amplitudes are then used in an externally corrected tailored method to solve the CC or DC amplitude equations in the full space. This should enable us to treat much larger problems, for which the direct application of the FCIQMC or any similar exact method would be impractical, taking advantage of the fact that the coupled-cluster description of the correlation outside of the active space is likely to be very accurate, if not perfect. Such an approach would present a computationally attractive alternative to true multireference methods. Here, T̂C AS represents the cluster amplitudes of the active space calculated from the extracted CI coefficients in the FCI (or FCIQMC) calculations, while T̂C C is the rest of the cluster operator. In this way, we assume that T̂C AS describes most of the static electron correlation of the system, and this information is contained in the amplitudes of the singles and doubles cluster excitation operators, T1 CAS and T2 CAS . In the tailored formalism, 19 these amplitudes of the cluster operators T̂C AS are kept frozen during the CC iterations. Only the T1 CC and T2 CC are optimized with the usual projective CCSD amplitude equations, here a, b, ... and i, j, ... represent the unoccupied and occupied orbitals in the Φ 0 reference determinant, respectively. The resulting tailored CCSD (TCCSD) method is size-consistent, 19 as long as the CI coefficients have been extracted from a sizeconsistent calculation.
The tailored methods are based on a single-reference formalism, and therefore a qualitatively correct description of the strong electron correlation in the underlying coupled-cluster approaches can be beneficial. Besides, the enhanced accuracy of the distinguishable cluster methods for weakly correlated systems is expected to be transferable to the respective tailored approaches. Although the DCSD method cannot be formulated in terms of excitation operators as CCSD, it is still possible to apply the same strategy to the DCSD amplitude equations in order to define the TDCSD method. Since the DCSD method is size-consistent, the TDCSD is size-consistent as well (again, for size-consistent CI calculations).
Freezing active amplitudes facilitates a straightforward implementation into an existing CCSD or DCSD code. However, the missing relaxation of the complete-active-space (CAS) amplitudes in the presence of other amplitudes obviously represents an approximation of the method. We have investigated a simple intermediate region approach to alleviate this error; i.e., only amplitudes from a subset of the CAS space are frozen in the TDCSD/TCCSD amplitude equations, and the rest of the CAS space represents a buffer region, in which amplitudes are relaxed both in FCI and in the subsequent TDC/ TCC calculation.
Extraction of the CC amplitudes from the FCIQMC run requires collecting and averaging singles and doubles CI coefficients from the FCIQMC wave function. The singles and doubles amplitudes (T CAS ) a i and (T CAS ) ab ij are computed using the well-known relation between the CC amplitudes and CI coefficients in the spin−orbitals, For the definition of the active orbital space various techniques can be utilized. CASSCF calculations are an obvious choice, which however require expensive CASSCF optimizations and can easily become a bottleneck for large active spaces. Therefore, we have investigated an alternative procedure of using DCSD active natural orbitals, i.e., with occupation numbers significantly different from two and zero, as the simple low-cost option, which can be applied in a black-box manner and relies on the ability of the DC methods to describe strongly correlated systems qualitatively correctly. Additionally, amplitudes from the tailored run (e.g., with a small active space) can be used as a starting guess for the DC methods, which helps to target the physical solutions in complicated cases.
Another option to utilize the TDCSD/TCCSD amplitudes is to calculate a basis set correction to the FCIQMC. It can be achieved either by simply running the tailored calculations for the frozen virtual natural orbitals or by additionally calculating a perturbative F12 correction. 45,54 For the latter we calculate the DCSD/CCSD F12 Lagrangian removing the contribution of the DCSD/CCSD active-space residual, where T 1 , T 2 , and C denote singles and doubles TDCSD/ TCCSD amplitudes, and the F12 amplitudes that satisfy the first-order coalescence conditions, respectively, and the tilde denotes the corresponding contravariant quantities, which replace the Lagrange multipliers. Ω i , Ω i AS , and Ω F12 are the DCSD-F12/CCSD-F12 amplitude equations, DCSD/CCSD residuals in the active space, and the F12 amplitude equations. Here we employ the F12a and F12b approaches. 29,53,60 If the active space spans the complete orbital space, the F12 correction is in fact applied to the FCIQMC method and is therefore denoted FCIQMC F12 , which is an alternative to the previously introduced explicitly correlated FCIQMC methods. 61 65 and full FCIQMC from the NECI program package. 44 All tailored calculations have been done on top of natural orbitals, either from CASSCF or from DCSD calculations. For the DCSD natural orbitals we have used the orbital-relaxed density matrices produced by the closed-shell implementation in Molpro.
Finally, we extended our study to the ozone molecule in order to investigate the accuracy of the approach for calculation of absolute and relative energies and for somewhat larger systems. The ozone molecule is not trivial for many quantum chemical methods, because of the strong interplay of dynamical and static electron correlation, even when no bond breaking is involved. 66−69 In all calculations, the cc-pVTZ basis set has been employed and the core electrons have been kept frozen in the correlated calculations. In the F12 calculations the cc-pVTZ/jkfit and cc-pVTZ/mp2fit basis sets have been used for the complementary auxiliary basis (CABS) and for density fitting, respectively.
All of the FCIQMC calculations have been characterized by a fixed number of iterations (5000) to average the CI coefficients. These iterations are supposed to be the last 5000 in order to extract the coefficients when the calculation is already converged. Further, the number of walkers has been defined each time according to the CAS under study and the multireference character of the system. This has brought from a minimum of 10 5 walkers in the N 2 case for the CAS(10,8) up to a maximum of 10 8 walkers used in the ozone case. Moreover, the deterministic space in the semistochastic method was chosen to be represented by the most populated 1000 Slater determinants in most of the calculations. Larger values were considered only in a few sensitive cases. Additionally, the initiator method is used by default in all calculations. The set of initiator determinants is not fixed during a calculation but can be enlarged when a noninitiator determinant reaches a population which is larger than a preset value. The latter has been defaulted to 3 in all our calculations. Finally, we also made use of the adaptive-shift method 55 in FCIQMC, in order to deal with the ozone case. The  The sizes of the active spaces are denoted in the usual manner as (n electrons ,n orbitals ).
3.2. Nitrogen (N 2 ). First, we examine the triple-bondbreaking process in the N 2 molecule. Figure 1 shows PECs calculated using TDCSD and TCCSD with the CI coefficients extracted from the deterministic and the stochastic FCI. The active space contains 10 electrons in 8 orbitals, (10, 8), and the 8 active orbitals represent the complete valence space of the molecule with two σ, two π, two σ*, and two π* orbitals, given by the 2s and 2p orbitals from both of the atoms. It is evident from Figure 1 that the TDCSD curve is much closer to the reference MRCI+Q curve than TCCSD, especially further away from the equilibrium geometry. Note that the TDCSD curve is on top of the DCSD curve up to a distance of 3.4 bohr, i.e., in the region where DCSD has been shown to be very accurate in the previous studies. 27,29 However, for larger distances the error in DCSD increases while the TDCSD curve stays close to the reference curve. As expected, the FCI-and FCIQMC-tailored calculations yield the same results.
In Figure 2, TDCSD and TCCSD PECs with a larger active space (10,16) originating from CASSCF natural orbitals (CASSCF NOs) and DCSD natural orbitals (DCSD NOs) are presented. The additional 8 virtual orbitals in the active space of the CASSCF calculations correspond to 3s and 3p orbitals for short distances, gradually acquiring the 3d character for larger distances. The DCSD natural orbitals in the active space are also composed of s, p, and d atomic orbitals. Evidently, in this case TDCSD and TCCSD based on the CASSCF defined active space yield considerably more accurate results than the ones based on the DCSD-NOs active space. Accidentally, the TDCSD curve using DCSD NOs is very close to the TCCSD curve using CASSCF NOs, which demonstrates that the accuracy of the correlated approach here is of similar importance as the quality of the active space. As in the previous example with the smaller active space, the TDCSD method outperforms TCCSD for both choices of the active orbitals. Additionally, the CASSCF-based TDCSD curve is much more parallel to the MRCI+Q reference curve than other methods, with nonparallelity errors of 5 mHa in the region between 1.4 and 5 bohr (compared to nearly 15 mHa for TCCSD on CASSCF orbitals).
In Figure 3, various choices of the active space for TDCSD are compared, all using DCSD natural orbitals. Additionally to the previously used (10,16), a larger active space (10,26) is tested with the size of a double-ζ basis set. This active space includes a large portion of the dynamical correlation, and the TDCSD part of the calculation acts as a basis set correction. As a result the PEC from these calculations is very accurate and lies on top of the MRCI+Q (10,16) curve. Besides, results from the TDCSD method with a buffer region for a partial relaxation of the amplitudes are shown. In this approach we utilize the CI  Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article coefficients from a FCIQMC(10,26) calculation, but only from a smaller region of (10,16), denoted as FCIQMC (10,26) (10,16)-TDCSD. As a result, the amplitudes from the intermediate region influence the CI coefficients but still are relaxed in the subsequent DCSD calculation. As can be seen in Figure 3, PEC from calculations with the buffer region lies between TDCSD with (10,16) and (10,26) PECs. Evidently, in this case the relaxation of the amplitudes in the buffer region is less important than the accuracy of the method in this region. However, we also see that inclusion of some screening effects outside of the active space can lead to large improvements in the accuracy. This will be investigated further in a forthcoming publication. The remaining curves in Figure 3, FCIQMC-(10,16)+ΔDCSD and FCIQMC(10,26)+ΔDCSD, are obtained with a subtractive embedding scheme of the FCI or FCIQMC methods into DCSD, The energies from FCIQMC(10,16)+ΔDCSD are much higher than from the equally sized TDCSD calculations, which clearly demonstrates the importance of the coupling between the active region and the rest. However, the results from the subtractive embedding are much closer to the corresponding TDCSD values if larger active spaces are used, as can be seen from the (10,26) curves.
In the last graph on N 2 , Figure 4, the F12b basis set correction 53,54 is applied to two TDCSD calculations with different active spaces and to FCIQMC in the full cc-pVTZ basis set; see eq 4. The TDCSD and FCIQMC F12b results match very nicely the MRCI+Q-F12 numbers around the equilibrium; however, they start to deviate much more for stretched geometries. One of the problems is certainly the usage of contravariant amplitudes instead of Lagrange multipliers in eq 4, which in the case of ODCD F12b calculations led to errors of around 10 mHa for the dissociated N 2 geometries. 54 Another source of the discrepancy is the F12 approximations themselves, as can be seen from the TDCSD F12a curve, which uses a different F12 approximation and is extremely close to the reference MRCI+Q-F12 PEC.
3.3. Fluorine (F 2 ). Now we turn our attention to the PEC of the F 2 molecule. The first active space considered here corresponds to the full-valence active space, (14,8), composed of 2s and 2p atomic orbitals and, therefore, contains only one single virtual spatial orbital. Thus, only single and double excitation are possible inside of this active space, and CCSD and DCSD are in fact equivalent to FCI there. As has to be expected, the accuracy of the tailored methods is very low with such limited active space; see Figure 5. In fact, even MRCI+Q with this active space is not very accurate compared to the CCSDTQ reference, although the curve is much more parallel to the reference than from the tailored methods. The TCCSD curve is  Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article very close to MRCI+Q around the equilibrium geometry but is too high for stretched geometries. The TDCSD curve is too low at the equilibrium, and comparing to the DCSD curve, it is obvious that adding some relaxation or screening to the active region would improve the results. The results are much better in Figure 6, where a larger (14,16) active space is considered. It includes eight more virtual orbitals, composed in the case of CASSCF mostly of the 3s and 3p atomic orbitals, and with somewhat larger 3d contributions in the case of DCSD natural orbitals. Here, in contrast to the N 2 molecule, tailored methods on top of the DCSD NOs are much closer to the reference than the ones based on the CASSCF NOs. And, again, TDCSD performs noticeably better than TCCSD using any of the representations.
In Figure 7, TDCSD PEC using a further enlarged active space of the double-ζ size, (14,26), is presented. Even though it is still   Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article shifted compared to the CCSDTQ curve, it runs much more parallel to the reference than the (14,16) PEC. The (14,26) and (14,16) curves agree quite well at the equilibrium geometry and start to deviate more along the dissociation. As expected, the FCI and FCIQMC values are essentially equal. In Figure 8, PECs from explicitly correlated methods are shown. Again, the PEC from TDCSD with the (14,26) active space is consistent with the FCIQMC curve and shifted up by a constant.
3.4. Water Molecule (H 2 O). Next, we evaluate the doublebond-breaking process in the H 2 O molecule. The two O−H σ bonds are stretched simultaneously while the bond angle is kept fixed to 107.6°. We again start with a full-valence active space (8,6) with 1s hydrogen orbitals and 2s and 2p oxygen orbitals.   The corresponding PECs are presented in Figure 9. All curves overlap in the equilibrium region; however, the TDCSD curve is too low in the intermediate region compared to the reference MRCI+Q curve, while the TCCSD curve is too high in the dissociation region. Thus, one has to increase the active space size in order to get accurate results over the whole potential energy curve. In Figure 10, calculations for a larger (8,18) space with 12 additional virtual orbitals given by the 3s and 3p orbitals of the oxygen and the 2s and 2p orbitals of the hydrogen atoms are shown. With this active space the TDCSD, TCCSD, and MRCI +Q curves are indistinguishable from each other on this scale. Adding the explicit correlation to the methods does not change the picture in this case: the TDCSD F12b curve is still essentially on top of the MRCI+Q-F12 curve; see Figure 11.
3.5. H-Chain (H 10 ). The uniform dissociation of a 1D hydrogen chain is another challenging problem for the conventional coupled cluster methods. Here, we focus on a simple case of 10 hydrogen atoms, and the active space corresponds to the minimal active space of 10 electrons in 10 orbitals. As evident from Figure 12, in this example TDCSD and TCCSD methods produce very similar results, and all tailored methods overestimate the correlation in the intermediate region compared to the reference MRCI+Q values, with the TDCSD and TCCSD on DCSD NOs showing a slight improvement with respect to the CASSCF NOs ones. Compared to the simple DCSD approach, the tailored DCSD is in fact less accurate, which can be attributed again to the frozen amplitudes in the active space. The corresponding second-order CAS perturbation theory (CASPT2) results are much less accurate around the equilibrium but approach the MRCI+Q reference results for larger separations.
In Figure 13, TDCSD PEC using a larger active space with 10 more virtual DCSD natural orbitals with the largest occupation numbers are presented (FCIQMC(10,20)-TDCSD). However, the results become only slightly better compared to (10,10) numbers. Putting the added space instead to the buffer region (FCIQMC (10,20) (10,10)-TDCSD) results in a small further improvement, which suggests that the main problem here is the missing relaxation of the frozen amplitudes. Interestingly, a simple subtractive embedding procedure (FCIQMC-(10,10)+ΔDCSD) yields much better results in this case, with a perfect overlap with the MRCI+Q curve.
3.6. Ethylene (C 2 H 4 ). Another case of study of processes governed by static electron correlation is here given by the torsion of ethylene. While at the equilibrium the wave function of this molecule is dominated by a single determinant, at the transition state the bonding π and antibonding π* orbitals become degenerate, and two equally weighted determinants represent the main contribution to the wave function. Potential energy curves corresponding to the rotation along the C−C axis are presented in Figure 14. The full-valence active space (12,12)  Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article has been employed, and as in previous studies, 29,36 the geometry is relaxed at the DCSD level for each fixed torsion angle. Evidently, results from both tailored methods are in a good agreement with MRCI+Q, both being quite parallel to the reference curve. The nonparallelity error of the TDCSD curve amounts to merely 0.5 mHa and of the TCCSD to 1.4 mHa, with respect to MRCI+Q. As in the water molecule case, adding the explicit correlation to the methods does not change the picture: curves from the tailored methods remain parallel to the MRCI+Q-F12 curve, and the TDCSD F12b results are essentially on top of the MRCI +Q-F12; see Figure 15.
3.7. Cyclobutadiene (C 4 H 4 ). Cyclobutadiene (CBD) has been studied for many decades 70−73 and it has been shown already in early works that its ground state is a closed-shell singlet in a planar rectangle geometry. 74−76 Here, we focus on the automerization reaction coming from the double-bond flipping of two equivalent rectangular singlets, which exhibits a challenging wave function transition in its square transition-state geometry. The corresponding potential energy curves can be found in Figure 16. The molecular geometries are obtained as a linear interpolation of the optimized structures from ref 73, with the internal coordinates l i calculated as l i (λ) = (1 − λ)l i (rt) + λl i (sq) and with λ varying in a range of 0−1 in steps of 0.2. Here, l i (sq) and l i (rt) represent the optimized parameters in the square and rectangle geometries, respectively, and correspond to the two different C−C bond distances (l 1 (rt) = 1.349 Å; l 2 (rt) = 1.562 Å; l 1 (sq) = l 2 (sq) = 1.447 Å). The remaining internal coordinates, i.e., the H−C−C angle (135.0°), H−C distance (1.076 Å), and C−C−C angle (90.0°), are kept frozen. The CASSCF (12,12) active space employed here includes molecular orbitals related to the C−C bonds.
CCSD, CCSD(T), and DCSD produce qualitatively wrong curves, although DCSD noticeably improves upon CCSD in the absolute energies as well as in the barrier height, cf., Table 1. The tailored methods show a trend comparable to the MRCI+Q curve, and the absolute energy of TDCSD matches well the CCSD(T) energy in the equilibrium geometry.
Energy barrier heights of the double-bond shift are listed in Table 1. The experimental value spreads over a wide range from 1.6 to 10 kcal/mol, 77,78 and one of the best theoretical estimates from the multireference average-quadratic coupled cluster (MR-AQCC) amounts to 8.8 kcal/mol. 73 The tailored methods yield very accurate barrier heights, with the one from TCCSD coinciding with the MRCI+Q result, and the one from TDCSD coinciding with MR-AQCC 73 and very close to the multireference Mukherjee's CCSD(T) (MR-MkCCSD(T)). 81 The results of TCCSD and TDCSD using (12,12) CASSCF improve upon previous TCCSD calculations employing restricted Hartree−Fock (RHF) orbitals and (2,2) active space. 79 3.8. Ozone (O 3 ). The ozone molecule is well-known to be difficult to describe accurately without high-order excitations or multireference methods. The electronic structure of O 3 was first investigated decades ago, with the pioneering studies of Hay et  Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article al. 66,67 Since then many further investigations have been focusing on the understanding of this system with respect to structural optimizations and vibrational frequencies, 68,69,82−89 including an application of the TCCSD method. 21 The potential energy surface of ozone has two ground-state minima, the global open minimum and the local ring minimum. 66,90,91 The latter was shown to exist only by theoretical calculations, and no experimental evidence is available so far. Further, it was demonstrated by the CI wave function analysis in earlier works 66,91 that both the open and the ring isomers have a large multireference character, making this system a nontrivial problem for single-reference methods. The transition state between these two minima has an even larger multireference character, and therefore an accurate prediction of the energy differences between these geometrical structures is especially difficult without high excitations or a multireference treatment, which would ensure a balanced description. In the present work, we use three geometries, the open minimum (OM), the ring minimum (RM), and the transition state (TS), taken from ref 92. According to the previous studies, 88,92 in the transition-state geometry the ground and the first excited state are almost degenerate with the excitation energy of 0.01−0.1 eV; therefore, we have calculated both states and denote them as TS and TS2, ordered as in ref 88.
As before, we start with the full-valence active space of ozone, (18,12), which is defined either by CASSCF or by DCSD natural orbitals. The absolute energies of OM and relative energies of TS, TS2, and RM with respect to OM are shown in Table 2. For this confined active space the tailored methods based on the CASSCF natural orbitals work very well. Deviations in the relative energies from the reference values of Chien et al., 92 given by the semistochastic Heath−Bath CI (SHCI) method, are well below 0.1 eV. The RM−OM gap differs from the SHCI value only by 0.02 eV in the case of TDCSD and by 0.04 eV in the case of TCCSD. The deviations for the TS states are slightly larger: 0.03 and 0.06 eV for the first state and 0.04 and 0.04 eV for the second state, for TDCSD and TCCSD, respectively. Additionally, the energetic order of the states is swapped for both tailored methods. Using the DCSD natural orbitals to define the active space increases the errors in most of the relative energies. However, even then the errors are much smaller than from pure DCSD and CCSD calculations, and the accuracy can be improved further by increasing the active space.
For large active-space calculations, we have used the DCSD natural orbitals and selected a (18,39) active space of the double-ζ size, i.e., representing again a basis-set-correction type approach. The results can be found in Table 2. The absolute energies from the (18,39)-tailored methods agree well with high-level coupled-cluster results. For the OM geometry, the TDCSD(18,39) total energy turned out to be almost exactly in the middle of the CCSDT (−225.1317 Ha) and CCSDTQ (−225.1378 Ha) energies. The relative energies of TDCSD and TCCSD agree with each other within the given precision and are very close to the SHCI reference values with discrepancies of only 0.01−0.02 eV.

CONCLUSIONS
We have implemented and benchmarked the tailored distinguishable cluster method and combined it with the stochastic FCI (FCIQMC) solver, facilitating large active spaces.
The TDCSD calculations show an evident improvement to the standard DCSD in six representative cases, N 2 , F 2 , H 2 O, C 2 H 4 , C 4 H 4 , and O 3 potential energies. Compared to a simple subtractive embedding, TDCSD provides more accurate absolute energies in the nitrogen case, especially for smaller active spaces. However, DCSD still produces considerably better results for the uniform dissociation of a hydrogen chain, even if large active spaces are used in TDCSD, and a correction from the subtractive embedding makes the results essentially exact. For the ozone molecule, FCIQMC-TDCSD and FCIQMC-TCCSD using an active space (18,39) produce very accurate isomerization energies, which differ only by 0.01−0.02 eV from the reference values.
In most of the cases a tailored distinguishable cluster is noticeably more accurate than the tailored coupled cluster. Obviously, the larger the active space is, the less pronounced is the difference; however, the higher accuracy of DCSD will be especially important for larger systems with large inactive spaces.
Additionally, we have investigated accuracy of the tailored methods with active spaces defined by most active DCSD natural orbitals. Depending on the system and the size of the active space, the accuracy is quite comparable to CASSCFdefined active spaces. These findings allow for faster and more TCCSD-(2,2) RHF orbitals 79 12.9 TCCSD(T)-(2,2) RHF orbitals 79 7.0 RMR CCSD(T) 80 7.5 MR-MkCCSD(T) 81 8.9 MR-AQCC 73 8.8 experimental range 77,78 1.6−10 Furthermore, an explicit correlation correction for tailored methods has been introduced and benchmarked, yielding remarkably accurate results along the H 2 O double-dissociation and the C 2 H 4 torsion curves, overlapping with the reference MRCI+Q-F12 curves. Besides, this F12 correction represents a new type of explicitly correlated FCIQMC method (FCIQMC F12 ) when the active space spans the full orbital space.
A simple buffer region test applied here demonstrated the importance of a relaxation of the amplitudes in the active space. There are various possibilities of the active-space relaxation, 93−95 which can be applied to the TDCSD (and TCCSD) methods. Higher order excitations can also be easily extracted from FCIQMC calculations and employed to improve results further.
Our present implementation is limited to closed-shell molecular systems; however, this is a purely technical limitation, and an unrestricted open-shell version is currently being developed. Additionally, an extension to linear-scaling localcorrelation-tailored methods is possible, 96 which greatly increases the applicability of the approach.