Compression of Spin-Adapted Multiconfigurational Wave Functions in Exchange-Coupled Polynuclear Spin Systems

We present a protocol based on unitary transformations of molecular orbitals to reduce the number of nonvanishing coefficients of spin-adapted configuration interaction expansions. Methods that exploit the sparsity of the Hamiltonian matrix and compactness of its eigensolutions, such as the full configuration interaction quantum Monte Carlo (FCIQMC) algorithm in its spin-adapted implementation, are well suited to this protocol. The wave function compression resulting from this approach is particularly attractive for antiferromagnetically coupled polynuclear spin systems, such as transition-metal cubanes in biocatalysis, and Mott and charge-transfer insulators in solid-state physics. Active space configuration interaction calculations on N2 and CN– at various bond lengths, the stretched square N4 compounds, the chromium dimer, and a [Fe2S2]2– model system are presented as a proof-of-concept. For the Cr2 case, large and intermediate bond distances are discussed, showing that the approach is effective in cases where static and dynamic correlations are equally important. The [Fe2S2]2– case shows the general applicability of the method.


INTRODUCTION
Polynuclear transition metal and f-element systems play central roles in biochemical processes and as building blocks of Mott and charge-transfer insulators. Understanding their electronic structure is of paramount importance to control their properties. At the atomic level, these compounds have complex electronic structures, with several unpaired electrons per metal center distributed among near-degenerate valence d (or f) orbitals. Orbital degeneracies are partially lifted by ligand-field effects, at the price of even more complex electronic structures characterized by charge-transfer excitations between metal centers and ligands (consider the super-exchange mechanism in solids as an example 1 ) and degeneracies between metal and ligand orbitals. These systems also exhibit multiple quasidegenerate low-lying spin states whose relative order is easily altered by small external perturbations. 2 Locally (at each metal center), Hund's rules suggest that the unpaired electrons have parallel spins. However, kinetic-exchange interactions, including direct-exchange and super-exchange mechanisms, favor electrons residing in adjacent metal centers to couple with antiparallel spins, thus inducing antiferromagnetism. 3−9 Computational investigations of these systems require advanced multiconfigurational electronic structure methods, such as the complete active space self-consistent field approach (CASSCF). 10−14 However, for these methods, even the determination of the spin of the ground state is computationally demanding, and predictions of reaction mechanisms and electronic properties are, in practice, limited to systems containing at most two transition-metal atoms. 15−18 When studying systems with numerous unpaired and low-spincoupled electrons, the limiting step is the exponential scaling of the Hilbert space size with respect to the size of the chosen active space. This limitation is exemplified by the {Mn 4 CaO 5 } cluster of photosystem II. In its relaxed form, the S1 state, the cluster consists of two d 4 -Mn(III) and two d 3 -Mn(IV) ions. The minimal active space for this system is the CAS (14,20), consisting of the valence orbitals and electrons of the four metal centers. In the low spin (singlet, S = 0), the configuration interaction (CI) vector contains ∼6 × 10 9 Slater determinants (SDs) and ∼1 × 10 9 configuration state functions (CSFs), quickly reaching the present computational limits. A more adequate active space would also contain orbitals and electrons of the bridging oxygens, CAS (44,35), largely exceeding the current computational limits.
In recent years, a number of methods have been developed to circumvent the exponential scaling of CAS wave functions, which use algorithms such as density matrix renormalization group (DMRG) 19−28 or full configuration interaction quantum Monte Carlo (FCIQMC) 29−35 as CI eigensolvers. Within the framework of the novel stochastic-CASSCF approach, 36 active spaces containing up to 38 electrons and 40 orbitals have been reported. 37,38 In FCIQMC, a finite number of "walkers" are used to stochastically sample CAS (or FCI) wave functions and information is stored only for those SDs that are populated by walkers at the given instantaneous imaginary time step. For a fixed number of walkers, the stochastic representation of the wave function is generally more accurate for sparse wave functions than for dense ones. Thus, it seems relevant for methods that benefit from wave function sparsity, such as FCIQMC, to ask whether techniques exist that can reduce the number of nonvanishing coefficients in CI wave functions.
The graphical unitary group approach (GUGA) 39,40 is a technique that constrains multiconfigurational wave functions to a chosen total spin, S. The method has been pioneered by Paldus, Shavitt, and others, 39,41−47 and it has been used for decades in conventional MCSCF methods. Since 2011, the GUGA approach has also been adapted to generalized active space SCF wave functions (GASSCF) 48 and to the GASPT2 approach. 49 Recently, a spin-adapted version of the FCIQMC algorithm based on GUGA has also been developed in our laboratories. 50 When used in conventional CI procedures, GUGA represents the most compact way of storing CI expansions, as it contains a much smaller number of parameters (the CSF coefficients) than the ones in Slater determinant expansions. However, Slater determinant representations are more effective in direct-CI driven procedures, as the evaluation of the sigma vector, σ = HC, only relies on the Slater−Condon rules and vectorization is possible. 51,52 The advantage of both expansions, Slater determinants for computing the σ vector and CSFs for storing the wave function parameters, can be combined at the extra cost of efficient ways of transforming the wave function between the two bases. Already in 1976, Grabenstetter 53 suggested one of such methods. This method is currently used in many chemistry software packages, including MOLCAS, 54,55 LUCIA, 56 and DALTON. 57 More recently, an algorithm has been suggested by Olsen 58 that avoids the large spin-coupling transformation matrix and the operation count can be reduced for systems featuring a large number of low-spin-coupled unpaired electrons.
Within the GUGA formalism, one additional property emerges: orbital reordering impacts the sparsity of the CI Hamiltonian matrix and the number of nonvanishing CI coefficients in the CI eigensolutions. This feature, which has already been discussed for several simple cases by Brooks and Schaefer III in 1979, 47,59 is unique to CSF expansions, and it is not present when an SD basis is utilized. The property follows from the way CSFs are constructed and coupled via the spinfree nonrelativistic Hamiltonian operator, and it will be discussed in great detail in the present manuscript.
Orbital reordering is also a crucial element for the convergence of the DMRG procedure. 60−67 However, the reordering discussed in this manuscript differs from the one discussed within the DMRG framework, both in motivation and in aim. In the context of DMRG, the role of the reordering is to accelerate the convergence of the method with respect to the bond dimension (referred to as the M, or D, value). This is achieved by exploiting the concepts of entanglement entropy 63,64,66 and locality 67 between adjacent sites (orbitals) and by analyzing one-and two-electron integrals. Although locality and electron repulsion integrals could also be related to our orbital reordering schemes, our reordering schemes are strictly motivated by the intrinsic mechanisms of the GUGA algorithm and aim at the compression of wave functions expanded in CSFs.
The structure of CI expansions based on different orbital representations (natural, pseudo-natural, and canonical orbitals) has previously been investigated by Shavitt for the water molecule in its ground-state equilibrium geometry and only for configuration interaction with single-and double-excitation (CISD) wave functions in spin-adapted basis. 68 The wave function compression is here analyzed in combination with two orbital representations commonly used in multiconfigurational CI approaches, the active natural orbitals produced by diagonalizing the active space one-body density matrix and the localized active orbitals that are obtained by localizing occupied and virtual orbitals together (a not-invariant transformation for HF wave functions). Split-localized orbitals, obtained by localizing occupied and virtual orbitals separately (an invariant transformation for HF wave functions), and mixed localized/delocalized basis represent alternative routes. Although we use different orbital representations, the aim of this work is not to compare between them, but rather to study the effect of reordering schemes on the ground-state wave function sparsity in a spin-adapted basis within them. These reordering schemes are based on the occupation numbers for natural orbitals, real-space orbital separation arguments for localized orbitals, and generalized active space 48,52,54,55,69 orbital partitioning for both of them. As it has already been noticed by Shavitt in ref 59, orbital reordering is the only tool available in the unitary group approach, including GUGA, for controlling spin couplings of CSFs and, although it is not to be expected to find in all cases the specific ordering that reduces the number of coupling CSFs to its minimum, it is generally possible, as shown in this manuscript, to find the ordering that leads to a considerable reduction of the interacting space.
Conventional CASCI procedures, as well as the spin-adapted FCIQMC algorithm, are used to show the wave function compression effect that follow specific orbital reordering schemes. The increased sparsity obtained facilitates the convergence of spin-adapted FCIQMC calculations with respect to walker distributions, and it is strongly recommended for polynuclear transition-metal complexes with antiferromagnetically coupled metal centers. The N 2 and CN − at various bond distances, the stretched N 4 molecules, the chromium dimer at 2.4 Å and at dissociation, and a model system of the oxidized form of the [Fe 2 S 2 ] 2− cluster in its lowest spin state (S = 0, singlet) will be used as examples.
We discuss the theoretical foundation of the compression of spin-adapted wave functions in Section 2 and present numerical examples in Section 3, using conventional CI procedures and the stochastic FCIQMC algorithm. For the latter, we show that the convergence behavior with respect to the total number of walkers can be greatly improved by taking advantage of the wave function compression that follows orbital reordering.

THEORETICAL DETAILS
2.1. Representation of CSFs. CSFs are generally represented by one of the three equivalent tables of Figure  1, known as Gel'fand, Paldus, and Weyl tableaux, respectively. 43,70,71 The top row of the Gel'fand tableau completely where n, N, and S are the total number of orbitals, electrons, and the total spin, respectively. The m 1i elements represent the individual entries of the top row. The example of Figure 1 represents a system with eight orbitals (n = 8, dimension of the top row) and six electrons (N = 6, sum of the m 1i entries) coupled to a triplet spin state (S = 1, sum of 1-entries divided by 2). The other rows in the Gel'fand tableau identify a specific CSF for the given electronic state, as it will be explained in the following. Considering that Gel'fand tableaux contain only 0, 1, and 2 entries, 43 a more compact "three-column table" can be used, where the number of 0's, 1's, and 2's is counted. This table is referred to as the Paldus ABC tableau ( Figure 1). The sum of the entries in Paldus ABC tableau equals the row index (from bottom to top) thus, any two columns are sufficient to uniquely determine the state and the specific CSF. Paldus AC tableaux are derived from the ABC tableaux by excluding the second column, B. Paldus ABC (or AC) tableaux can be recast in "variation tables" with Δx i = x i − x i−1 (x = a, b, c), as shown in Figure 1. Starting from the top row of Paldus ABC, or ΔABC variation tableau, four actions recursively follow to obtain the possible lower rows and generate the CSFs for the targeted electronic state: • remove one empty orbital, Δa i = 0, Δb i = 0, Δc i = 1, • reduce spin by 1/2, Δa i = 0, Δb i = 1, Δc i = 0 (negative spin coupling), • remove one doubly occupied orbital and one empty orbital and increase the spin by 1/2, Δa i = 1, Δb i = −1, Δc i = 1 (positive spin coupling), and • remove one doubly occupied orbital, Δa i = 1, Δb i = 0, Δc i = 0.
Lexically ordered CSFs are obtained when the above steps are followed in order. While the Δa and Δc entries are restricted to 0 and 1 values, the Δb column may assume 1, 0, and −1 entries. All CSFs for a given state can be constructed by allowing the possible variations of a i , b i , and c i according to the actions given above, decreasing the values of a i , b i , and c i down to (0 0 0). From the ΔAC tableaux, the Weyl representation is promptly obtained by writing the row indices of the left 1entries and the right 0-entries, as indicated in Figure 1. Each Weyl tableau represents a CSF with a defined total spin, S, and the left and right columns represent the positively and negatively spin-coupled contributions in a cumulative sense.
Step Vector. The four possible actions that lead from the top row of the ABC tableaux to the bottom can be expressed in a more compact form via the step vector, whose elements (the step values) are defined as Depending on the action that leads to the lower row index, the step values will assume values from 0 to 3. Table 1 summarizes the correspondence between the possible step values and the Δa i , Δb i , and Δc i variations.
Step values, d i , of 0, 1, 2, or 3 correspond to empty, singly occupied orbitals increasing the total spin by 1/2 (positive spin coupling and referred to as u in this work), singly occupied decreasing the total spin by 1/2 (negative spin coupling and referred to as d), or doubly occupied ith-orbital, respectively. The d i ′ labels of Table 1 represent a more intuitive step-value nomenclature to specify CSFs that will be used in the rest of this manuscript. Graphical Unitary Group Approach (GUGA). When constructing the CSFs of a given multiconfigurational wave function, rows in Paldus ABC tableaux repeat for different CSFs. Repetitions can be avoided by listing only nonequivalent rows. The table collecting all of the nonequivalent rows is referred to as a distinct row table (DRT) ( Table 2), introduced by Shavitt. 39 Each row of a DRT is identified by a pair of indices, (i, j), with i = a i + b i + c i being the level index, and j the lexical row index, a counting index such that j < j′ if a i > a i ′ or if a i = a i ′ and b i > b i ′. CSFs are generated by connecting rows with decreasing level index. Allowed connections between rows are indicated by downward chaining indices. For a given lexical row, the downward chaining indices define the connected rows of the lower level row after the action of the four possible step values, d 0 , d 1 , d 2 , and d 3 . Table 2 summarizes the DRT of a CAS(6,6) wave function, coupled to a singlet spin state. A more compact representation of DRT tables is obtained by means of graphs ( Figure 2). Each vertex of the graph represents one distinct row of the DRT. Arcs connect only vertices linked by downward chaining indices. Vertices are labeled by the lexical ordering index, j, and arcs by the corresponding step value. The head node corresponds to the top row and the tail node corresponds to the bottom row (0 0 0) of the corresponding DRT table. Vertices with the same ivalue are aligned horizontally. Vertices are also left−rightsorted with respect to the a and b values of the DRT. The left− right ordering ensures that the slope of each arc corresponds to its step value. Direct walks through the graph, following only vertices connected by arcs, lead to all possible CSFs of the given multiconfigurational wave function. In Figure 2, three CSFs have been highlighted. The step vector d = |111222⟩, represented by the orange path in Figure 2 (read from bottom Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article to top), corresponds to the CSF |uuuddd⟩ (u = positively spincoupled, d = negatively spin-coupled), the step vector d = | 121212⟩ (green path in Figure 2) corresponds to the CSF | ududud⟩, and the step vector d = |333000⟩ (blue path in Figure  2) corresponds to the closed-shell |222000⟩ CSF. GUGA Representation of GAS Wave Functions. When the GUGA representation of CSFs is used for generalized active space (GAS) wave functions, 48 a number of direct walks in the GUGA graph are not permitted by the occupation number constraints of the GAS specifications. A GAS6(6,6) is considered as an example, which contains six active electrons and six active orbitals, each orbital in a separate GAS subspace. The six GAS subspaces are populated by only one electron and thus referred to as disconnected spaces (interspace electron excitations are not allowed). This GAS wave function corresponds to a configurational space where only spin recouplings via exchange-driven spin-flips are permitted. As a guide for the eye, the cumulative occupation number, N elec , associated with each vertex is shown in Figure 2. Some of the arcs of Figure 2 are not permitted for this GAS6 (6,6). For instance, the arc connecting vertices (30) and (27) corresponds to populating the first orbital with two electrons, which is not permitted by the chosen GAS. The paths permitted by the GAS6(6,6) restrictions are depicted in Figure   3. The filled black circles indicate the allowed vertices within the GAS restrictions. The thin black lines and circles of Figure   3 represent CSFs of the auxiliary space, a space that is forbidden by GAS rules, but necessary for the coupling of permitted CSFs via double excitations. The gray lines and circles are prohibited by GAS rules and not necessary for the auxiliary space.     Table 2 with the additional GAS6(6,6) constraints discussed in the main text. The orange, green, and thick black paths between them represent the allowed CSFs of the GAS wave function, connecting the allowed vertices, indicated by the filled circles. The thin black lines and circles belong to the auxiliary space (see the main text), and gray nodes and arcs are prohibited by GAS rules and not necessary for the auxiliary space.

Journal of Chemical Theory and Computation
is given by where h pq and (pq|rs) are the one-and two-electron integrals, and ⟨m′|Ep q |m⟩ and ⟨m′|ep q,rs |m⟩ are the coupling coefficients between two SDs or CSFs. The integral values depend on the shape of the orbitals, while the coupling coefficients depend on the entries in |m⟩ and |m′⟩ (depending on whether an SD or CSF basis is chosen). The Slater− Condon rules apply for the coupling coefficients between SDs, which can be evaluated very efficiently. However, these rules are not applicable for CSFs, and, consequently, wave function optimizations in CSF basis have been less popular than optimizations in SD basis. Paldus, Shavitt, and others 40,41,47 have demonstrated that the efficient evaluation of CSF coupling terms is possible via the GUGA approach.
One-Electron Coupling Coefficients. One-electron coupling coefficients, ⟨m′|Ep q |m⟩, can be computed graphically by first identifying |m⟩ and |m′⟩ paths in the corresponding GUGA graph followed by their connection via the excitation operator, Ep q . For nonvanishing coefficients, the walks of |m′⟩ and |m⟩ on the graph must coincide outside the (p, q) range, defined by the excitation operator, Ep q . The value of the coupling coefficient is independent of the overlapping outer regions and depends only on the shape of the loop formed by the two CSFs in the range defined by the operator Ep q . At each row level, k (inside the range), nonvanishing terms satisfy the conditions Shavitt proved that coupling coefficients can be factorized as (8) and the values of W(Q k ; d k ′, d k , Δb k , b k ) are tabulated. 59 The factors, W, depend on step-vector values, d k ′ and d k , Δb k = b k − b k ′, and the b k value of |m⟩ at the k-level. They also depend on the k-segment shape, Q k , which indicates the relation of the klevel arcs of the two CSFs, |m′⟩ and |m⟩. If the k-arc of |m′⟩ is on the left, coincident or on the right of the k-arc of |m⟩, Q k is labeled as raising (R), weight (W), or lowering (L), respectively. For the segments where the loop begins (bottom) and ends (top), under-bars and over-bars (R, L and R̅ , L̅ ), respectively, are used as labels. As an example, the ⟨2uud0d|E1 5 | uuuddd⟩ coupling term is promptly evaluated using the graph of Figure 3, the labeling rules defined above, and  (9) where we have used the Δb k R d k ′d k b k symbols for each k-level inside the loop. When orbitals are reordered, the graphical representation of any CSF in a GUGA graph and the couplings between CSFs are altered and thus nonvanishing coupling terms may vanish after orbital reordering.
Two-Electron Coupling Coefficients. Matrix elements of two-body excitation operators, Ep q Er s , can either be evaluated by introducing a summation over intermediate states, |m″⟩ (resolution of identity) pq rs (10) or directly in a factorized form similar to eq 8; see ref 40. Similar to the one-body coupling coefficients, orbital reordering also impacts these terms; thus, it is possible to increase the number of vanishing coupling terms and produce a more sparse Hamiltonian matrix and compact representation of the many-body wave function. ), or (b) by grouping orbitals residing on the same atom (atom-separated ordering). At large distance, Hund's rules suggest that, for each magnetic center, the unpaired electrons have parallel spins; this leads to very few nonvanishing paths (to the extreme of one for the CAS(6,6) wave function of N 2 or the CAS (12,12) wave function of Cr 2 at dissociation).

APPLICATIONS
For weakly interacting magnetic centers, say N 2 at a bond length of 2.0 Å (Section 3.3), Cr 2 at a bond length of 2.4 Å (Section 3.4), or more realistic transition-metal clusters, such as the [Fe 2 S 2 ] 2− system (Section 3.5), this partitioning is still very useful, and one has neutral configurations in addition to charge-transfer ones. Higher-order charge-transfer configurations carry small weights, and GUGA allows for an easy selection of dominant paths.
As demonstrated in detail in Section 3.1, the sparsity of the CI Hamiltonian matrix and its eigenvectors arises from multiple factors: (a) vanishing spin-coupling coefficients, (b) vanishing electron integral values, (c) exact cancellation of terms with opposite signs, and (d) terms that fortuitously fall below thresholds. The latter case leads to unpredictable numerical zeroes, while the former cases (zeroes by symmetry) can be predicted and their effect traced into the leading configurations of the multiconfigurational wave function. It is hard to distinguish and quantify the various sources that lead to sparsity and impractical when tackling realistic cases featuring complex wave functions. In Section 3.3, the L 1 and L 4 norms will be utilized to quantify the compression effect for different orbital representations and orderings for the homonuclear N 2 and the noncentrosymmetric CN − species Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article at various bond lengths. Instead, in Section 3.5, the convergence of the spin-adapted FCIQMC projected energy will be utilized as a practical mean to measure the sparsity of the CAS (22,26) wave function of the [Fe 2 S 2 ] 2− model system.

Stretched Nitrogen Molecule.
The nitrogen molecule at dissociation is a simple pedagogical example that demonstrates how orbital reordering impacts the sparsity of the many-body CI expansion when CSF representations are utilized. Consider a CAS(6,6) active space, consisting of six MOs formed by a linear combination of the 2p atomic orbitals on each atom, and their electrons. The CASSCF(6,6) wave function is optimized to a singlet spin state. Two sets of orbitals are considered: delocalized natural orbitals, with bonding and antibonding characters, and localized orbitals (atomic-orbital-like). C 1 point group symmetry has been used for all cases. The CAS(6,6) CI expansion contains 175 CSFs. These CSFs are represented by all possible walks in the GUGA graph of Figure 2. The number and the list of nonvanishing terms in the optimized CI wave function for each type of orbital shape and ordering are given in Tables 3 and 4, respectively.
Natural Orbitals. Two ordering schemes have been adopted for the natural orbitals of the CAS(6,6) wave function: the canonical ordering, with bonding orbitals (σ, π x , and π y ) preceding the antibonding orbitals (σ*, π x *, and π y *), and the pair ordering, where orbitals are sorted in (σ, σ*), (π x , π x *), and (π y , π y *) pairs. The pair ordering has the effect of reducing the number of nonvanishing terms in the CI expansion with respect to the canonical ordering, from 20 to 14 CSFs (Table  3).
In the natural orbital basis, there is a strong coupling between bonding and antibonding orbital pairs, i.e., σ ↔ σ* and π x/y ↔ π x/y * . As a consequence, the significant off-diagonal molecular integrals at dissociation are the exchange-like, (σσ*|σσ*) and (π x/y π x/y * |π x/y π x/y * ), and the "coherent" combinations of them, i.e., (σσ*|π x/y π x/y * ) (8-fold permutational symmetry implied). The list of electron repulsion integrals is available in the Supporting Information. We will explain the increased sparsity of the pair ordering scheme using the example of three CSFs shown in Table 5.
In the canonical ordering scheme, the Hamiltonian matrix elements ⟨1|Ĥ|2⟩ and ⟨1|Ĥ|3⟩ are driven by the large integral contribution (π x π x *|π y π y *), which is (25|34). The coupling coefficients ⟨2|e5 2;43 |1⟩ and ⟨3|e5 2;43 |1⟩ do not vanish, and, upon multiplication by the (25|34) integral value, they result in the nonvanishing ⟨1|Ĥ|2⟩ and ⟨1|Ĥ|3⟩ matrix elements. Also, in canonical ordering, the following matrix element does not vanish where (34|34) and (25|25) correspond to the large and identical integrals between the bonding and antibonding π orbitals, while the (35|35) and (24|24) are the small (π y π x *|π y π x *) and (π x π y *|π x π y *) integral values, respectively. Thus, in the canonical ordering scheme, configurations |1⟩, | 2⟩, and |3⟩ are coupled via the Hamiltonian operator, causing the wave function to be in general dense. For the pair ordering scheme, the coupling coefficient between states |1′⟩ and |3′⟩ is zero for the strong (π x π x *|π y π y *) integral contribution, which is (34|56), and is only driven by the much smaller (π x π y |π x *π y *) and similar contributions. At the same time, in the pair-ordered case, the matrix element  a Natural orbitals and localized orbitals are shown. Natural orbitals in canonical ordering are sorted as (σπ x π y , π y *π x * σ*). Natural orbitals in pair ordering are sorted as (σσ*, π x π x *, π y π y *). Localized orbitals in pair ordering are sorted as (p . Localized orbitals in atom-separated ordering are sorted as (p orbital labels σ π x π y π y * π x * σ* 0  pair order  1  2  3  4  5  6 orbital labels σ σ * π x π x * π y π y * |1′⟩ 2 goes to zero, as all of the involved integrals correspond to identicaland weak(π x π y |π x π y ) types. Thus, the pair ordering scheme reduces the connectivity within the Hilbert space (a) by zeroing coupling coefficients that multiply strong integral contributions or (b) by cancellation of equal integral contributions, due to the sign structure of the resulting coupling coefficients. This reduced connectivity within CSFs leads to more sparse wave functions.
It is important to highlight already at this point that, for some orbital representations, integrals that multiply nonzero coupling coefficients may become vanishingly small (numerical zeroes). It is difficult in realistic cases to distinguish truly vanishing elements (by conditions (a) and (b) above) from terms that fall below a certain threshold and it will not be attempted in the subsequent sections.
Localized Orbitals. Localization of the natural orbitals produces atomic-orbital-like molecular orbitals. The pair ordering scheme (p ) have been considered. In the pair ordering scheme, the wave function contains five nonvanishing terms. A single-configurational (yet multideterminantal) wave function is obtained when the atomseparated ordering is adopted. Localization schemes without paying particular attention on the orbital ordering is not a sufficient condition for optimal wave function compression.
The five nonvanishing CSFs, for the pair-ordered localized orbitals, are graphically shown in Figure 2. They are represented by all of the paths inside (and including) the orange and green direct walks. These CSFs share a common property: they all feature singly occupied orbitals. Thus, a GAS wave function can be constructed, with each orbital in a separate GAS subspace, and the spaces kept disconnected. The graph of Figure 3 represents such a wave function. The singleconfigurational character of the CI expansion in the localized orbitals and atom-separated ordering completely reflects the chemical nature of this system, that is, two noninteracting nitrogen atoms, each in its ground state, 4 S, antiferromagnetically coupled to form a singlet spin-state compound. This information is not promptly accessible when orbitals are delocalized or localized and pair-ordered.
GAS Hamiltonian Matrix. The single-configurational character of the wave function in atom-separated ordering scheme is bound to the sparsity of the corresponding Hamiltonian matrix. Only a matrix with vanishing off-diagonal elements can provide a strictly single-configurational eigenvector. For simplicity and without loss of generality, only the GAS6(6,6) Hamiltonian matrix is discussed. The GAS6(6,6) wave function contains a total of five CSFs (listed in the third column of Table 4). The GAS6(6,6) Hamiltonian matrices in the localized orbital basis and using pair ordering and atomseparated ordering are reported in Figure 4.
As an example, two off-diagonal elements, ⟨ududud|Ĥ| uuuddd⟩ and ⟨uduudd|Ĥ|uuuddd⟩, are evaluated, and it is shown why in the atom-separated ordering these terms vanish, while in the pair ordering they do not.
General one-electron excitation operators, Êp q (p ≠ q), applied to any of the CSFs of the GAS6(6,6) wave function necessarily generate CSFs outside the GAS expansion (CSFs with doubly occupied orbitals), and no contribution to the offdiagonal elements of the GAS Hamiltonian matrix can arise from them. Only exchange two-particle operators can contribute to these elements (double spin-flips), namely The resolution of identity (eq 10) is used for their evaluation.
The |m″⟩ configurations of the auxiliary space, which simultaneously couple with |uuuddd⟩ and |ududud⟩ (or | uduudd⟩), are found by applying conditions eqs 6 and 7. The black thin lines of Figure 3 represent the CSFs of the auxiliary space that simultaneously couple with |uuuddd⟩ and |ududud⟩, and the resulting coupling coefficients are listed in Table 6.
In the pair ordering case, instead, the orbitals of these pairs reside on the same atom and the two-electron repulsion integrals do not vanish, leading to a nonvanishing Hamiltonian matrix element. 3.2. Square N 4 System at Dissociation. In this section, the square N 4 model compound at dissociation and in its singlet spin state is discussed. A CAS(12,12) active space that  Natural and localized orbitals are used as the basis for the CI procedure, using canonical, type, and atom-separated orderings. The type ordering for the localized orbitals is the following The number of nonvanishing terms for each orbital representation is summarized in Table 3. As for the nitrogen molecule, the most compact representation of the CI wave function is obtained when localized orbitals in atom-separated ordering are utilized. The ground state of this system is 4-fold degenerate, and, in principle, any combination of the four states is a solution. The two most compact solutions correspond to the single |uuuuuudddddd⟩ CSF and the | uuuddduuuddd⟩ CSF. These two CSFs correspond to the extreme paths of Figure 5b. The other two degenerate states are linear combinations of the other CSFs depicted in Figure  5b. Thus, in the atom-separated ordering, the worst compression would contain 20 nonvanishing CSFs. This number can be derived from a GAS12 (12,12) wave function with disconnected spaces and singly occupied orbitals only. Under these conditions, the first three electrons, residing on the first atom, are coupled to a quartet, the last three electrons are coupled antiferromagnetically to the previous ones, locally with parallel spins, while the intermediate six electrons couple in all possible ways with parallel or antiparallel spin (see the genealogical branching diagram of Figure 5b). In the type ordering, the gray paths of Figure 5b do not vanish; instead, they also contribute to the CI expansion, thus reducing the sparsity. 3.3. N 2 and CN − at Varying Bond Distances. In this section, the compression that results from different orbital representations and orderings will be discussed for N 2 and CN − dimers at various bond lengths. The noncentrosymmetric CN − anion has been chosen to investigate the role of molecular symmetry (N 2 is centrosymmetric, while CN − is not) and to show that the compression does not depend strongly on symmetry considerations, which could result in fortuitous zeroing of electron repulsion integrals, as might be thought in view of the results earlier presented for the N 2 system. However, it is important to note that CN − is isoelectronic to N 2 , effectively making the CI space of the two systems identical, while their electron repulsion integral values differ.
For both systems, five bond lengths have been considered, namely, 10, 3, 2, and 1.4 Å and the equilibrium bond distance, R eq , of 1.0975 Å for N 2 and 1.1770 Å for CN − , respectively. A basis set of atomic natural orbital (ANO-RCC) type has been used for both C and N atoms, contracted to 3s2p functions. In all cases, the 1s core orbitals have been kept frozen, while the 10 valence electrons have been correlated into the space of the remaining 16 orbitals, CAS (10,16). In all cases, a space of 4504864 CSFs is built and the exact CASCI wave function is deterministically computed.
Two orbital representations have been used for these tests, (a) the natural orbitals, ordered by irreducible representations (irreps) and within the irrep in the decreasing order of occupation numbers (symmetry ordering), and (b) the Pipek− Mezey localized orbitals. The localization procedure was applied either only to the six valence 2p orbitals (valence localized) or to the entire list of correlating orbitals (all localized). Upon localization, two reordering schemes (as discussed in Section 3.1) have been utilized, namely, the pair ordering and the atom-separated ordering.
Results for the different geometries and orbital representations/orderings are summarized in Figures 6 and 7 for N 2 and CN − , respectively. For the CN − system, at 10 Å, natural orbitals are already localized into the atoms, as opposite to the other geometries where orbitals are delocalized between the two centers. Thus, considering that we already have three different schemes using localized orbitals and for consistency with the other geometries, we have used the delocalized natural orbitals obtained for CN − at a bond length of 3.0 Å to discuss the sparsity of the wave function for the CN − at the more stretched bond distance.
Two measures of sparsity of the wave functions are reported in Figures 6 and 7, namely, the L 1 and L 4 norms of the CI vector, defined as L n = ∑ i |c i | n , where it is assumed that the wave functions are normalized in the sense of L 2 = 1. The L 1 Figure 5. Genealogical branching diagram for the N 4 model system at dissociation in the localized orbital basis and considering only spinflips of singly occupied orbitals with type ordering (a) and atomseparated ordering (b). Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article norm can range from a value of 1 (single-configurational wave function) to N CSF , N CSF being the size of the wave function (highly multiconfigurational wave function with equal coefficients in the entire Hilbert space). A wave function with many small (but nonvanishing) components can have a large L 1 norm, and therefore the L 1 is quite sensitive to parts of the wave function with small but non-negligible amplitudes. In the context of FCIQMC, the L 1 norm is related to the number of walkers required to instantaneously describe the wave function. The L 4 norm (also called the participation ratio in localization theory) provides a different measure of the sparsity. Here, the maximal value is 1 and corresponds to a wave function with only one nonzero component (i.e., fully localized), while a small L 4 norm indicates that the wave function is spread over several (potentially many) components. The components of a CI expansion below roughly 10 −3 make a numerically negligible contribution to the L 4 norm, so L 4 −1 is a measure of the number of substantially nonzero components present in a wave function.
Both systems behave quite similarly in terms of the sparsity and localization properties of the wave functions, with differences between the two molecules observed only in the intermediate stretching regime. At equilibrium and nearequilibrium geometries, the symmetry-adapted orbitals lead to the sparsest (most localized) description of the wave function, with L 4 values of 0.8 for both N 2 and CN − . On the contrary, the L 4 norm is only 0.04−0.08 for the localized basis. At 2.0 Å, N 2 is clearly more compactly represented by the localized orbitals in atom-separated ordering (L 4 = 0.48, compared to 0.16 for the symmetry-ordered orbitals). By contrast, the CI wave function of CN − at this geometry is more localized in the symmetry-adapted basis L 4 = 0.37 instead of 0.16. However, for both N 2 and CN − , the L 1 norm for the various orbital representations investigated is very similar at a bond distance of 2.0 Å, making any judgment on wave function compression far from convincing.
At yet more stretched geometries (3.0 Å and beyond), the localized and atom-separated orderings lead to a more compact wave function for both molecules (with L 4 > 0.9, compared to L 4 ∼ 0.05 in the symmetry-adapted basis). It is also striking that the equilibrium geometries are always less sparse than the stretched (near-dissociated) geometries, when each is expressed in their optimal orbital representation. This observation shows that describing correlation at the equilibrium geometries of molecules is actually harder compared to that of the stretched geometries. This arises thanks to the compactness that the GUGA formulation allows, for the optimally ordered representations of the stretched molecules.
In Figures 6 and 7, the (percentage) weight of the dominant CSF is also reported for each orbital representation and geometry. For a given system and geometry, this weight may drastically differ depending on the orbital representation, being close to 1 for some representations and close to zero in others. Sampling a wave function dominated by a single CSF is much easier in FCIQMC compared to wave functions in which multiple CSFs are important. This is partly because the presence of a single dominant configuration leads to greatly reduced sign problems (there being fewer important CSFs in which the signs are determined through a delicate cancellation of oppositely spawned walkers) and partly because the stochastic noise of the projected energy is generally reduced in such systems (the number of walkers on the reference appears in the denominator of the projected energy expression). For these two reasons, any representation that leads to a single-configurational wave function is to be preferred in FCIQMC.
As will be demonstrated in Section 3.5, there are realistic systems possessing multiple open-shell orbitals, such as polynuclear transition-metal (TM) catalysts, which behave similarly in terms of the induced sparsity in the wave function, to the small dimers (N 2 and CN − ) at dissociation. This makes the strategy presented here a powerful tool for wave function compression for cases of practical interest, beyond pedagogical model systems.
3.4. Chromium Dimer, CAS (24,48). Small active spaces have been considered for the N 2 and N 4 systems, with wave functions that can be optimized by conventional methods (Davidson in its direct-CI formalism). GAS restrictions, similar to the nitrogen cases, discussed in Sections 3.1 and 3.2, could also be applied to the valence orbitals of the chromium dimer at dissociation on a localized basis. Consequently, the valenceonly active space for the chromium dimer at dissociation, CAS (12,12), can be related to the CAS(6,6) of the nitrogen molecule, with the two chromium atoms in their high-spin, 7 S, ground state and antiferromagnetically coupled. This CAS- (12,12) wave function would be single configurational if represented by localized orbitals in atom-separated ordering and will not be discussed further.
Instead, we are interested in a considerably larger active space, CAS (24,48) (see ref 17 for details on the active space). Conventional CI optimization procedures are infeasible, and the spin-adapted implementation of the FCIQMC algorithm has been utilized. In the CAS (24,48) case, doubly occupied, singly occupied, and empty orbitals simultaneously occur in the wave function, and a mixture of static and dynamic correlations characterize this wave function, independently of the orbital representation chosen. Two geometries are Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article discussed, one at the dissociation limit and one at a bond distance of 2.4 Åthe "shoulder region" of the potential energy curvewhere a more complex wave function is to be expected.
Dissociation Limit. Figure 8 shows four spin-adapted FCIQMC calculations using (a) delocalized orbitals in canonical ordering, (b) delocalized orbitals in pair ordering, (c) localized orbitals in pair ordering, and (d) localized orbitals in atom-separated ordering. For case (b), pair ordering was adopted for all orbitals, and orbitals have been reordered such that orbitals with p and s characters (any shell) are in adjacent positions. In case (d), orbitals have been sorted as An unstable spin-adapted FCIQMC dynamics is observed for the delocalized orbitals in canonical ordering even at a population of 20 × 10 6 walkers (20M). The wave function is highly multiconfigurational in this orbital representation, and walkers are evenly distributed among the many equivalent configurations. As a consequence, the occupation of the reference CSF drops to zero, which, in turn, causes the energy estimate to diverge. This result is to be compared to the findings summarized in Figure 6 for the simple N 2 system. Also for the N 2 dimer at dissociation (bond distance of 10.0 Å), the weight of the dominant configuration is very small, only 6%. The pair ordering increases the weight of the reference configuration and increases sparsity in the CI wave function. These two effects are sufficient to stabilize the FCIQMC dynamics. Yet, with a population of 20 × 10 6 walkers (20M), the energy is not converged. Localization schemes greatly improve the dynamics, and the atom-separated reordering has a major effect on the sparsity of the wave function, in line with the findings reported for the simpler N 2 system. Already with a population of 1 × 10 6 walkers (1M), a satisfactory dynamics is observed with a projected energy estimate of 12 kcal/mol lower than the case with localized orbitals in pair ordering and 20 kcal/mol lower than the energy estimate obtained with delocalized orbitals and with higher walker population (20M). Furthermore, increasing the walker population from 1 × 10 6 to 50 × 10 6 walkers causes only a marginal lowering of the projected energy, by less than 1 milliHartree, and projected energy estimates for 10 × 10 6 (10M) and 50 × 10 6 (50M) walkers are, in practice, identical, suggesting that convergence with respect to walker population has been reached. The difference of 12 kcal/mol between localized/pair-ordered and localized/atom-separated orbital bases clearly demonstrates that the convergence pattern of the spin-adapted FCIQMC with respect to the walker number (initiator error) changes for different orbital representations and orderings, being more advantageous for the orbital ordering that compresses the wave function the most.
Intermediate Bond Distance. Orbital representation has also a significant impact on wave function sparsity at an intermediate bond distance of 2.4 Å ( Figure 9). As for the asymptotic case, an unstable dynamics is observed when using delocalized orbitals in canonical ordering and populations up to 20 × 10 6 walkers (20M). The dynamics stabilizes when delocalized orbitals are ordered in bonding and antibonding pairs and further improves when localized orbitals in atomseparated ordering are utilized, as already observed for the dimer at dissociation. The latter representation leads to an FCIQMC dynamics that, already at 5 × 10 6 walker population (5M), is 18 kcal/mol lower than the 20M walker simulation with delocalized and pair-ordered orbitals.
Analogously to the asymptotic case, the different convergence rate is entirely attributed to the initiator error, greatly reduced for the localized/atom-separated orbitals, thus leading to lower energy. For the localized/atom-separated representation, a population of 5 × 10 6 walkers (5M) is close to convergence. Increasing the population to 50 × 10 6 walkers (50M) lowers the energy by 1.5 kcal/mol. Increasing further to 100 × 10 6 walkers (100M) has a marginal effect of 0.3 kcal/ mol (see the inset of Figure 9). However, while at dissociation a negligible energy difference was observed for the spinadapted FCIQMC dynamics at 10 × 10 6 and 50 × 10 6 walkers, at the shoulder region the two dynamics still differ by 1 kcal/ mol. This result suggests that the localized/orbital-separated Figure 8. Spin-adapted FCIQMC dynamics for the chromium dimer at the dissociation limit, using a (24,48) active space for different orbital representations and orderings. The inset shows the convergence using localized orbitals ordered "atom-separated" with respect to the total walker number. Within the localized orbitals in atom-separated ordering, the projected energy estimates at 10 × 10 6 (10M) and 50 × 10 6 (50M) walkers are, in practice, identical, indicating that convergence is reached. Figure 9. Comparing the spin-adapted FCIQMC dynamics of the chromium dimer at a bond distance of 2.4 Å, using the (24, 48) active space, for different orbital representations and orderings. The inset shows the convergence with respect to the total walker number for the localized orbitals in atom-separated ordering.
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article representation leads to the most compact representation of the wave function as already observed for the asymptotic limit. However, the slower convergence with respect to the walker population, as compared to the stretched geometry, is a clear indication that the wave function at this geometry is already more dense.
3.5. [Fe 2 S 2 ] 2− Model System. In this section, we show that orbital reordering can lead to higher sparsity in spinadapted CI wave functions also in practical cases, where no obvious simplifications of the wave function can easily be predicted. An active space of 22 electrons and 26 orbitals is considered for a [Fe 2 (III) S 2 ] 2− model system ( Figure 10; coordinates available in the Supporting Information), 72 which consists of the 20 valence, 3d, and double-shell, d′, orbitals on the metal centers, and the 6 3p orbitals of the bridging sulfur atoms. Formally, the orbitals of the bridging sulfur atoms are doubly occupied (12 electrons, S 2− ), and the iron atoms are in their Fe(III) oxidation state (d 5 configuration, 10 electrons), for a total of 22 electrons. The low-spin state (singlet) with antiferromagnetically coupled spins at the metal centers is characterized by a highly correlated wave function (details of the wave function go beyond the scope of the present work). Spin-adapted FCIQMC wave function optimizations have been performed on the basis of the stochastic-CASSCF- (22,26) 36 optimized natural orbitals (natural orbital coefficients in symmetry order and Molcas 54,55 format are available in the Supporting Information). Delocalized and localized orbitals are discussed for this system. Similar to the Cr 2 case, the spin-adapted FCIQMC dynamics is highly unstable when the delocalized natural orbitals in canonical order are utilized. This behavior is observed for any walker population up to 20 × 10 6 walkers (20M) and does not arise in FCIQMC dynamics in the Slater determinant basis. The spin-adapted FCIQMC dynamics becomes stable when a qualitative pair ordering scheme for the valence Fe 3d and S 3p orbitals is utilized (orbital ordering given in Supporting Information). The pair reordering in this case is qualitative due to the mixing of the sulfur atomic orbitals into the metalcentered molecular orbitals.
We now turn our attention to the construction of the localized orbitals. Within the active space, CAS (22,26), a CASSCF(10,10) optimization has been performed, keeping the inactive and virtual orbitals of the CAS (22,26) frozen to their original shape. This procedure separates the open-shell 3d orbitals from the ligand orbitals. Next, the Pipek−Mezey localization procedure has been used only for the 10 open-shell orbitals, leaving the molecular orbitals centered on the bridging sulfur atoms and the double-shell orbitals delocalized. The mixed localized/delocalized orbitals and their relative ordering are available in the Supporting Information.
In Figure 11, the horizontal violet line corresponds to the converged stochastic-CASSCF (22,26) energy, using 2 × 10 9 walkers (2B), the Slater determinant basis, and delocalized canonical orbitals. Two important features emerge from Figure  11, (i) within the localized atom-separated orbital representation, the spin-adapted FCIQMC dynamics converges faster than the Slater determinant counterpart with respect to the walker population, and (ii) within the Slater-determinantbased FCIQMC dynamics, the delocalized basis leads to fast convergence with respect to the number of walkers. In the localized basis already for a population of 100 × 10 6 walkers, the spin-adapted FCIQMC projected energy estimate is ∼2 mHartree lower than the Slater-determinant-based FCIQMC projected energy estimate and only ∼0.5 mHartree above the reference value. The faster convergence of the spin-adapted formalism is to be expected considering that the spin recoupling is implicitly accounted by GUGA, while, in Slater determinant basis, spin re-coupling has to be handled explicitly by the FCIQMC dynamics. The Slater determinant FCIQMC dynamics, however, converges faster when delocalized and pair-ordered orbitals are utilized. In this representation, the leading determinants of the CI expansion have a relatively low number of singly occupied orbitals (also referred to as seniority in the literature), ranging from 0 to 2, greatly reducing the explicit spin re-coupling with respect to the localized case. It is the reduction of spin recombinations that favor the delocalized orbital representation when utilizing Slater-determinant-based FCIQMC dynamics.

CONCLUSIONS AND OUTLOOK
We have demonstrated that the sparsity of multiconfigurational wave functions expanded in CSFs depends on the orbital ordering, as well as orbital representation, the former feature  . Spin-adapted and Slater determinant FCIQMC convergences with respect to walker population for the [Fe 2 S 2 ] 2− cluster, using a CAS (22,26) active space. The horizontal line corresponds to the converged stochastic-CASSCF (22,26) wave function (in Slater determinant basis), using 2 × 10 9 walkers, and lies at −5092.9513 ± 0.0002 au. The mauve band indicates the corresponding standard deviation. The green triangles correspond to spin-adapted FCIQMC dynamics for the localized/atom-separated orbitals. The orange squares and the red diamonds correspond to FCIQMC dynamics in Slater determinant basis and using delocalized/pair-ordered and localized orbitals, respectively.
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article being unique to CSF expansions. Orbital transformations can be applied that greatly reduce the number of nonvanishing CI coefficients of multiconfigurational wave functions. Bonding and antibonding pair orderings for delocalized orbitals and atom-separated ordering schemes for localized orbitals maximally increase the sparsity of spin-adapted wave functions for the examples discussed in this work. The increased sparsity that follows from the orbital reordering is beneficial for methods that approximate FCI wave functions, such as FCIQMC. This procedure improves the convergence of the spin-adapted FCIQMC algorithm and, in certain difficult cases, is found to be the only viable way to stable dynamics. The protocol is general and can be applied to molecular systems of practical interest. A simple reordering of the CAS (22,26) natural orbitals of a [Fe 2 S 2 ] 2− model system has been discussed.
The main aim of this work is to show how reordering schemes within a chosen orbital representation (delocalized or localized) may affect large CI expansion sparsity. Our results indicate that, for complex molecular systems, a mixed delocalized/localized orbital representation is preferable, using a delocalized and pair-ordered representation for orbitals that describe covalent bonds and a localized and atomseparated representation for singly occupied orbitals. Localization of covalent bonds can easily lead to unnecessary complications at the level of the wave function, with new terms appearing to account for the orbital mixing. On the other hand, a localized representation for singly occupied orbitals guarantees the highest sparsity for antiferromagnetically coupled polynuclear spin systems, removing the unnecessary entanglement of electrons that follows from using delocalized orbitals for unpaired electrons (as already shown for the N 2 system).
■ ASSOCIATED CONTENT * sı Supporting Information