A Stabilizer Framework for the Contextual Subspace Variational Quantum Eigensolver and the Noncontextual Projection Ansatz

Quantum chemistry is a promising application for noisy intermediate-scale quantum (NISQ) devices. However, quantum computers have thus far not succeeded in providing solutions to problems of real scientific significance, with algorithmic advances being necessary to fully utilize even the modest NISQ machines available today. We discuss a method of ground state energy estimation predicated on a partitioning of the molecular Hamiltonian into two parts: one that is noncontextual and can be solved classically, supplemented by a contextual component that yields quantum corrections obtained via a Variational Quantum Eigensolver (VQE) routine. This approach has been termed Contextual Subspace VQE (CS-VQE); however, there are obstacles to overcome before it can be deployed on NISQ devices. The problem we address here is that of the ansatz, a parametrized quantum state over which we optimize during VQE; it is not initially clear how a splitting of the Hamiltonian should be reflected in the CS-VQE ansätze. We propose a “noncontextual projection” approach that is illuminated by a reformulation of CS-VQE in the stabilizer formalism. This defines an ansatz restriction from the full electronic structure problem to the contextual subspace and facilitates an implementation of CS-VQE that may be deployed on NISQ devices. We validate the noncontextual projection ansatz using a quantum simulator and demonstrate chemically precise ground state energy calculations for a suite of small molecules at a significant reduction in the required qubit count and circuit depth.


Introduction
Quantum computers promise to yield solutions to complex problems that have previously been unattainable by classical means, yet experimental demonstration remains challenging.
To date, the largest molecules simulated on noisy intermediate-scale quantum (NISQ) hardware are H$_{12}$ (albeit only a Hartree-Fock calculation), conducted by Google using just 12 of the 53 qubits available on their superconducting quantum processor Sycamore 1, and H$_2$O, performed independently by IonQ using 3 qubits of an unspecified proprietary trapped-ion device 2 and by IBM using 5 of the 27 qubits on the now-decommissioned ibmq_dublin superconducting device 3.
Due to the limitations of shallow circuit depth and short coherence times that characterise the NISQ era, we are not able to harness the full state-space afforded to these machines.
To circumvent the above issues, we turn to the class of variational quantum algorithms, of which the Variational Quantum Eigensolver (VQE) 4 is most widely studied. In contrast with eigenvalue-finding algorithms requiring fault-tolerant machines, such as Quantum Phase Estimation (QPE) 5, which necessitates state evolution over an extended period of coherence, VQE executes a large ensemble of comparatively shallow parametrized circuits to estimate energy expectation values, informing a classical optimizer that updates the parameter settings before reinitialization of the quantum circuit. Its success is predicated on the variational principle, meaning the ground state energy of the system bounds expectation values from below 6. However, VQE is not without its challenges. First of all, the parametrized quantum state mentioned above, known as an ansatz, needs to be constructed carefully; it must be sufficiently expressible that the subspace of quantum states it spans contains the true ground state. On the other hand, if the ansatz is too expressible, we run into the problem of barren plateaus 7, where we observe vanishing gradients. This is more often a symptom of 'hardware efficient' ansätze 1,8-12, which aim to access the largest possible region of Hilbert space for the fewest number of native quantum gates.
To avoid barren plateaus, one must take into account some of the underlying problem structure to define ansatz circuits whose image is confined to a smaller, but more targeted, region of Hilbert space. Within this category are 'chemically inspired' ansätze that represent sequences of electronic excitation operators in-circuit; unitary coupled cluster (UCC) 13,14 is widely acknowledged as the gold standard for electronic structure simulations, albeit computationally very expensive in practice.
More recently, we have seen the development of hybrid ansätze that bridge the gap between hardware efficiency and chemical motivation. For example, Gard et al. 15 designed a compact circuit that conserves molecular symmetries such as particle number and spin, while Adaptive Derivative-Assembled Pseudo-Trotter (ADAPT) VQE 16-19 describes a more complete approach to scalable quantum chemistry simulations by defining selection criteria for ansatz terms drawn from a pool of excitation operators.
In this work we are concerned with Contextual Subspace VQE (CS-VQE) 45, which describes a method of partitioning the molecular Hamiltonian into disjoint parts so that an electronic structure problem may be simulated to some degree on the available quantum device, even when the dimension of the full problem is too great to be encoded on the number of qubits available. This is supplemented by some classical overhead, but it often permits one to achieve chemical accuracy (1.6 mHa ≈ 4 kJ/mol) at a saving of qubits, as indicated by Kirby et al. 45.
There has since been further research into the use of classical estimates of the electronic structure problem to reduce the resource requirements on quantum hardware. In particular, Classically-Boosted VQE (CB-VQE) 46 identifies classically tractable states and excludes them from the quantum simulation, alleviating some measurement and fidelity requirements of the VQE routine. CS-VQE is also related to qubit tapering 47,48, which exploits $\mathbb{Z}_2$ symmetries of the Hamiltonian; the differences and similarities are highlighted herein and by Kirby et al. 45. However, there are still a number of problems to address before CS-VQE may be successfully deployed on real quantum hardware, most notably with regard to the ansatz, which is the principal focus of this work.

Preliminaries
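The notation used throughout shall be to write operators in standard capital font ($A, B, C, \dots$), with the exception of single-qubit Pauli operators, which are written in the form $\sigma_p$ for $p \in \{0, 1, 2, 3\}$, where $\sigma_0 = \mathbb{1}$ is the identity and $\sigma_1, \sigma_2, \sigma_3$ are the Pauli $X$, $Y$ and $Z$ matrices respectively. Sets are denoted by calligraphic letters ($\mathcal{A}, \mathcal{B}, \mathcal{C}, \dots$) and vector spaces by script typeface ($\mathscr{A}, \mathscr{B}, \mathscr{C}, \dots$). The state space of $N$ qubits may be identified with the $2^N$-dimensional Hilbert space $\mathscr{H} = (\mathbb{C}^2)^{\otimes N}$, with the space of (bounded) linear operators acting upon $\mathscr{H}$ denoted $\mathcal{B}(\mathscr{H})$.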
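We introduce the Pauli group $\mathcal{P}_N$, whose elements are (up to a phase) the $N$-fold tensor products
$$\sigma = \bigotimes_{i=0}^{N-1} \sigma_{q_i},$$
where $q_i \in \{0, 1, 2, 3\}$ indicates the type of single-qubit Pauli acting on qubit position $i \in \mathbb{Z}_N$. An $N$-qubit Hamiltonian can be written in the form
$$H_{\mathcal{T}} = \sum_{\sigma \in \mathcal{T}} h_\sigma \sigma, \qquad h_\sigma \in \mathbb{R}, \tag{2}$$
for a set of Pauli operators $\mathcal{T} \subset \mathcal{P}_N$; specifying real coefficients ensures $H_{\mathcal{T}}$ is Hermitian.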
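The objective of quantum chemistry simulations is to estimate the ground state energy
$$\epsilon_0 := \min_{|\psi\rangle \in \mathscr{H}} \langle\psi| H_{\mathcal{T}} |\psi\rangle \le \min_{\theta} \langle\psi(\theta)| H_{\mathcal{T}} |\psi(\theta)\rangle,$$
where the minimum on the left is taken over normalized states, $|\psi(\theta)\rangle$ is a parametrized ansatz state and the expectation values on the right are evaluated via many prepare-and-measure cycles. The choice of ansatz restricts us to a subspace of quantum states, and it must therefore be carefully designed to be sufficiently expressible so as to capture the true ground state of the system.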
A common form of ansatz state, particularly in relation to the electronic structure problem, is
$$|\psi(\theta)\rangle = e^{iA(\theta)} |\psi_{\mathrm{ref}}\rangle, \tag{5}$$
where $|\psi_{\mathrm{ref}}\rangle \in \mathscr{H}$ is some fixed reference state in which the quantum circuit is initialized and $A(\theta) = \sum_{\sigma \in \mathcal{A}} \theta_\sigma \sigma$ for parameters $\theta_\sigma \in \mathbb{R}$ and Pauli operators $\sigma \in \mathcal{A} \subset \mathcal{P}_N$; the unitary $e^{iA(\theta)}$ effects excitations above the reference state. Such ansätze as unitary coupled cluster (UCC) 13,14 may be expressed by our choice of $\mathcal{A}$ (taking as reference the Hartree-Fock state), in addition to any others based on the theory of excitation operators such as ADAPT-VQE 16-19. The quantum advantage in VQE stems from the ability to prepare classically intractable states from our parametrized ansatz circuits.

Projections onto stabilizer subspaces
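Given an operator $\sigma \in \mathcal{P}_N$, the quantum states $|\psi\rangle \in \mathscr{H}$ that it stabilizes are those satisfying $\sigma |\psi\rangle = |\psi\rangle$; together they form the $+1$-eigenspace of $\sigma$. Extending this notion to an abelian subgroup of Pauli operators $\mathcal{Q} \subset \mathcal{P}_N$, there is an induced vector space $\mathscr{V}_{\mathcal{Q}}$ of states stabilized by the elements of $\mathcal{Q}$.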
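A particularly useful definition is that of a Hamiltonian symmetry, taken here to mean a set $\mathcal{S} \subset \mathcal{P}_N$ of Pauli operators such that
$$[S, \sigma] = 0 \quad \forall\, S \in \mathcal{S},\ \sigma \in \mathcal{T}.$$
In other words, a symmetry of $H_{\mathcal{T}}$ is any set of Pauli operators that commute universally among $\mathcal{T}$, which we may extend to an abelian group $\overline{\mathcal{S}} := \langle \mathcal{S} \rangle$ generated by $\mathcal{S}$ under operator multiplication, which we shall call a symmetry group.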
Note the setting in which we present symmetries here is stricter than the conventional definition, which considers any operator $S$ that commutes with the Hamiltonian, i.e. $[S, H_{\mathcal{T}}] = 0$, to be a symmetry. Such an operator need not commute with the individual terms as we require here. For example, in the fermionic picture, the number operator $\sum_i a_i^\dagger a_i$ (where $a_i$ is the fermionic annihilation operator and its Hermitian conjugate $a_i^\dagger$ represents the creation operator) commutes with the full second-quantized molecular Hamiltonian, but not with an arbitrary excitation term.
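The operators of the symmetry group will in general be algebraically dependent, but the theory of stabilizers 49 ensures the existence of a set of independent generators $\mathcal{G}$ such that $\langle \mathcal{G} \rangle = \langle \mathcal{S} \rangle$. Now, recall the Clifford group consists of unitary operators $U \in \mathcal{B}(\mathscr{H})$ with the property $U \sigma U^\dagger \in \mathcal{P}_N\ \forall\, \sigma \in \mathcal{P}_N$, i.e., $U$ normalizes the Pauli group. We may construct a Clifford operation $U$ mapping each symmetry generator to distinct single-qubit Pauli operators $\sigma_p$, where we are free to choose $p \in \{1, 2, 3\}$. More precisely, there exists a subset of qubit positions $\mathcal{I}_{\mathrm{stab}} \subset \mathbb{Z}_N$ with $|\mathcal{I}_{\mathrm{stab}}| = |\mathcal{G}|$ and a bijective map $f : \mathcal{G} \to \mathcal{I}_{\mathrm{stab}}$ such that
$$U G U^\dagger = \sigma_p^{(f(G))} \quad \forall\, G \in \mathcal{G}, \tag{7}$$
where $\sigma_p^{(i)}$ denotes the Pauli operator $\sigma_p$ acting on qubit position $i$.
This is a powerful concept that provides a mechanism for reducing the number of qubits in the Hamiltonian whilst preserving its energy spectrum. This is at the core of qubit tapering 47,48, in which it is observed that
$$[\sigma_p^{(i)}, U \sigma U^\dagger] = 0 \quad \forall\, i \in \mathcal{I}_{\mathrm{stab}},\ \sigma \in \mathcal{T},$$
implying the rotated Hamiltonian $H'_{\mathcal{T}} := U H_{\mathcal{T}} U^\dagger$ consists solely of identity or Pauli $\sigma_p$ operators in the qubit positions indexed by $\mathcal{I}_{\mathrm{stab}}$. Taking expectation values, one may replace the qubits $\mathcal{I}_{\mathrm{stab}}$ by their eigenvalues $\nu_i = \pm 1$; each assignment defines a symmetry sector and at least one such sector will contain the true solution to the eigenvalue problem. Note the other sectors still have physical significance and may, for example, relate to solutions with different particle numbers or to excited states. In the Supporting Information we report the symmetry generators and corresponding sector for the Hamiltonians representing the molecular systems listed in Table 1.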
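As a small illustration of such a rotation, the following sketch (plain Python, no external libraries) checks that a CNOT gate, which is Clifford, maps the two-qubit operator $Z \otimes Z$ onto the single-qubit operator $\mathbb{1} \otimes Z$. The choice of CNOT and of the generator $Z \otimes Z$ is an assumption made purely for illustration and is not taken from the molecular systems studied here.

```python
# Illustrative check: a Clifford gate maps a two-qubit Pauli onto a single-qubit Pauli.

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]

# CNOT with qubit 0 as control and qubit 1 as target.
# It is real and self-inverse, so its Hermitian conjugate is itself.
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

generator = kron(Z, Z)                              # hypothetical symmetry generator
rotated = matmul(matmul(CNOT, generator), CNOT)     # U G U†
single_qubit_z = kron(I2, Z)                        # Z acting on qubit 1 only
```

After this rotation the generator acts on qubit 1 alone, so that qubit may be replaced by an eigenvalue $\pm 1$.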
A quantum state consistent with any such sector must be stabilized by the operators $\nu_i \sigma_p^{(i)}$ for $i \in \mathcal{I}_{\mathrm{stab}}$, and we may define a projection onto the corresponding stabilizer subspace. In general, a projection is defined to be an idempotent operator $P \in \mathcal{B}(\mathscr{H})$, i.e. $P^2 = P$; the projection onto the $\pm 1$-eigenspace of a single-qubit Pauli operator $\sigma_p$ for $p \in \{1, 2, 3\}$ may be written
$$P_p^{\pm} := \tfrac{1}{2} (\mathbb{1} \pm \sigma_p).$$
States with no component inside the chosen eigenspace are mapped to zero, and we observe that
$$P_p^{\pm} \sigma_q P_p^{\pm} = \pm \delta_{p,q} P_p^{\pm} \tag{11}$$
for $q \in \{1, 2, 3\}$.
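The identity (11) can be checked numerically for every combination of $p$, $q$ and sign. The sketch below uses plain Python with $2 \times 2$ matrices as nested lists; the helper names are hypothetical.

```python
# Numerical check of P σ_q P = ±δ_{p,q} P for the single-qubit Pauli projectors.

IDENTITY = [[1, 0], [0, 1]]
PAULI = {
    1: [[0, 1], [1, 0]],        # σ_1 = X
    2: [[0, -1j], [1j, 0]],     # σ_2 = Y
    3: [[1, 0], [0, -1]],       # σ_3 = Z
}

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def projector(p, sign):
    """Projector (1 ± σ_p) / 2 onto the ±1-eigenspace of σ_p."""
    return [[(IDENTITY[i][j] + sign * PAULI[p][i][j]) / 2 for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def identity_holds(p, q, sign):
    P = projector(p, sign)
    lhs = matmul(matmul(P, PAULI[q]), P)
    delta = 1 if p == q else 0
    rhs = [[sign * delta * P[i][j] for j in range(2)] for i in range(2)]
    return close(lhs, rhs)

all_hold = all(identity_holds(p, q, s)
               for p in (1, 2, 3) for q in (1, 2, 3) for s in (+1, -1))
idempotent = all(close(matmul(projector(p, s), projector(p, s)), projector(p, s))
                 for p in (1, 2, 3) for s in (+1, -1))
```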
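Let $\mathscr{H}_{\mathrm{stab}}$ be the reduced Hilbert space supported by the stabilized qubits $\mathcal{I}_{\mathrm{stab}}$ and $\mathscr{H}_{\mathrm{red}}$ its complement such that $\mathscr{H} = \mathscr{H}_{\mathrm{stab}} \otimes \mathscr{H}_{\mathrm{red}}$. Given an assignment of eigenvalues $\nu \in \{\pm 1\}^{\times |\mathcal{I}_{\mathrm{stab}}|}$, we may project onto the corresponding sector via
$$P_\nu := \bigotimes_{i \in \mathcal{I}_{\mathrm{stab}}} P_p^{\nu_i}, \tag{12}$$
acting on the stabilized qubits, and subsequently perform a partial trace over the stabilized qubits $\mathcal{I}_{\mathrm{stab}}$. This is effected by the unique linear map $\mathrm{Tr}_{\mathrm{stab}} : \mathcal{B}(\mathscr{H}) \to \mathcal{B}(\mathscr{H}_{\mathrm{red}})$ satisfying the property
$$\mathrm{Tr}_{\mathrm{stab}}(A \otimes B) = \mathrm{Tr}(A)\, B \quad \forall\, A \in \mathcal{B}(\mathscr{H}_{\mathrm{stab}}),\ B \in \mathcal{B}(\mathscr{H}_{\mathrm{red}}).$$
Finally, we may define the full stabilizer subspace projection map
$$\pi_\nu^U : \mathcal{B}(\mathscr{H}) \to \mathcal{B}(\mathscr{H}_{\mathrm{red}}); \quad A \mapsto \mathrm{Tr}_{\mathrm{stab}}\big(P_\nu\, U A U^\dagger P_\nu\big), \tag{13}$$
which, using the linearity of $\mathrm{Tr}_{\mathrm{stab}}$, yields a reduced Hamiltonian
$$H^{\mathrm{red}}_{\mathcal{T}} := \pi_\nu^U(H_{\mathcal{T}}) = \sum_{\sigma \in \mathcal{T}} h'_\sigma\, \sigma'_{\mathrm{red}}, \tag{14}$$
where $\sigma' = U \sigma U^\dagger = \bigotimes_{i=0}^{N-1} \sigma_{q_i}$ and we have written $\sigma' = \sigma'_{\mathrm{stab}} \otimes \sigma'_{\mathrm{red}}$. The new coefficients $h'_\sigma := h_\sigma \prod_{i \in \mathcal{I}_{\mathrm{stab}},\, q_i \neq 0} \nu_i$ differ from $h_\sigma$ by a sign dependent on the chosen symmetry sector.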
In qubit tapering $U$ is taken as in (7), with the corresponding basis $\mathcal{G}$ a generating set for a full Hamiltonian symmetry 47,48. Assuming identification of the correct sector, the ground state energy of the $(N - |\mathcal{G}|)$-qubit reduced Hamiltonian $H^{\mathrm{red}}_{\mathcal{T}}$ will coincide with the true value of the full system $H_{\mathcal{T}}$.
This stabilizer projection procedure is straightforward with respect to the Hamiltonian, since the stabilized qubits contain only operators with non-zero image under conjugation with $P_\nu$. However, suppose we were to take another observable $A \in \mathcal{B}(\mathscr{H})$ and wish to determine a reduced form on $\mathcal{B}(\mathscr{H}_{\mathrm{red}})$ that is consistent with the reduced Hamiltonian $H^{\mathrm{red}}_{\mathcal{T}}$.
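This may be achieved by following precisely the same process that was applied to $H_{\mathcal{T}}$, but the symmetry $\mathcal{S}$ will not in general be a symmetry of $A$ and therefore the 'symmetry-breaking' terms (those which anticommute with at least one of the generators $\mathcal{G}$) will vanish under projection onto the stabilizer subspace, as per (11). Letting $\mathcal{A} \subset \mathcal{P}_N$ be the set of terms in the Pauli-basis expansion $A = \sum_{\sigma \in \mathcal{A}} a_\sigma \sigma$, observe that
$$\pi_\nu^U(A) = \sum_{\substack{\sigma \in \mathcal{A} \\ q_i \in \{0, p\}\ \forall\, i \in \mathcal{I}_{\mathrm{stab}}}} a'_\sigma\, \sigma'_{\mathrm{red}}, \qquad a'_\sigma := a_\sigma \prod_{\substack{i \in \mathcal{I}_{\mathrm{stab}} \\ q_i \neq 0}} \nu_i, \tag{15}$$
with $\sigma' = U \sigma U^\dagger = \sigma'_{\mathrm{stab}} \otimes \sigma'_{\mathrm{red}}$, recalling that $q_i$ indicates the type of single-qubit Pauli acting on qubit position $i \in \mathbb{Z}_N$ in some tensor product, as defined in Section 2.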
The resulting form is identical to (14), except we are explicit that the terms surviving projection are only those whose qubit positions indexed by $\mathcal{I}_{\mathrm{stab}}$ consist exclusively of identity and Pauli $\sigma_p$ operators; this is trivially true for the Hamiltonian by construction. Most importantly, this extends the stabilizer subspace projection to ansätze defined on the full system for use in variational algorithms. It should be noted that the above operations are classically tractable and can be implemented efficiently in the symplectic representation of Pauli operators.
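The projection of an already-rotated operator can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: operators are dictionaries from Pauli strings to coefficients rather than symplectic vectors, the rotation $U$ is assumed to have been applied beforehand, and the function name `project` together with the toy operators `H` and `A` are hypothetical.

```python
def project(operator, stabilized, nu, p="Z"):
    """Project a rotated Pauli operator onto a stabilizer subspace.

    operator   : dict mapping Pauli strings (e.g. "ZIX") to coefficients
    stabilized : qubit positions whose state is fixed
    nu         : dict mapping each stabilized position to its eigenvalue (+1 or -1)
    p          : single-qubit Pauli type stabilizing those positions
    """
    reduced = {}
    for pauli, coeff in operator.items():
        # Anything other than identity or σ_p on a stabilized qubit vanishes.
        if any(pauli[i] not in ("I", p) for i in stabilized):
            continue
        # Each σ_p on a stabilized qubit is replaced by its eigenvalue.
        sign = 1
        for i in stabilized:
            if pauli[i] == p:
                sign *= nu[i]
        # Drop the stabilized positions to obtain the reduced Pauli string.
        rest = "".join(ch for i, ch in enumerate(pauli) if i not in stabilized)
        reduced[rest] = reduced.get(rest, 0) + sign * coeff
    return reduced

# Toy two-qubit example: qubit 0 is stabilized by Z in the sector ν_0 = -1.
H = {"ZI": 0.5, "ZZ": 0.3, "IX": 0.2}   # every term commutes with Z on qubit 0
A = {"XY": 1.0, "ZY": 0.4}              # "XY" breaks the symmetry and is discarded

H_red = project(H, stabilized=[0], nu={0: -1})
A_red = project(A, stabilized=[0], nu={0: -1})
```

Every Hamiltonian term survives with a sector-dependent sign, whereas the observable `A` loses its symmetry-breaking term.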
We would be remiss not to draw attention to the likeness of (13) with Positive Operator-Valued Measures (POVM) 50; indeed, the projectors (12) define a complete set of Kraus operators 51. The stabilizer subspace projection procedure is reduced to a matter of enforcing a partial measurement over some subsystem of the full problem, for which the relevant outcomes have been determined via an auxiliary method. For example, this could involve identifying a quantum state with a known non-zero overlap with the true ground state; measuring the symmetry generators $\mathcal{G}$ in this state will yield the correct sector.
Hartree-Fock often provides such a state for electronic structure problems, although it is not immune to failure; this is particularly true in the strongly-correlated regime.In these cases, we should defer to more effective reference states such as those obtained from Møller-Plesset perturbation theory (MP2), coupled-cluster (CC) methods and so on.One can imagine a hierarchy of increasingly precise ground state approximations, for which we should hope to obtain at some point a non-zero overlap with the true ground state.

CS-VQE in the stabilizer formalism
We now describe the Contextual Subspace VQE (CS-VQE) method in the stabilizer setting introduced in Section 3. CS-VQE partitions the Hamiltonian (2) into two disjoint components: one that is noncontextual and another that is contextual, the latter providing quantum corrections to the former via VQE 45. Explicitly, this allows us to write
$$H_{\mathcal{T}} = H_{\mathcal{T}_{\mathrm{nc}}} + H_{\mathcal{T}_{\mathrm{c}}},$$
where $\mathcal{T}_{\mathrm{nc}}$ is a noncontextual set of Pauli operators and $\mathcal{T}_{\mathrm{c}} := \mathcal{T} \setminus \mathcal{T}_{\mathrm{nc}}$ is what remains, which will in general be contextual.
CS-VQE differs from qubit tapering (described in Section 3) in the following way: the latter exploits existing (i.e. physical) symmetries of the Hamiltonian, whereas in CS-VQE we impose additional 'pseudo-symmetries' derived from the noncontextual Hamiltonian. This results in a loss of information, since any terms of $\mathcal{T}$ not commuting with the symmetry generators will vanish under projection.

The noncontextual problem
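The notion of contextuality goes back to the Bell-Kochen-Specker theorem 52-54. A noncontextual set of Pauli operators may be decomposed as
$$\mathcal{T}_{\mathrm{nc}} = \mathcal{S} \cup \mathcal{C}_1 \cup \dots \cup \mathcal{C}_M, \tag{17}$$
where $\mathcal{S}$ contains the terms that commute with every element of $\mathcal{T}_{\mathrm{nc}}$. There is an implied structure where the $\mathcal{C}_i$ are equivalence classes with respect to commutation; in other words, elements of the same class commute and across classes they anticommute. Conversely, a set of Pauli operators is contextual if and only if commutation fails to be transitive on $\mathcal{T}_{\mathrm{nc}} \setminus \mathcal{S}$.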
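The transitivity criterion lends itself to a direct test. The following sketch, assuming Pauli operators are given as strings over `I`, `X`, `Y`, `Z` and ignoring coefficients, first removes the universally commuting terms and then checks whether commutation is transitive on the remainder. The function names and the two example sets are hypothetical.

```python
from itertools import permutations

def commute(p, q):
    """Pauli strings commute iff they clash on an even number of positions,
    a clash being two different non-identity Paulis on the same qubit."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0

def is_noncontextual(paulis):
    # Universally commuting terms (the set S in the text).
    universal = [p for p in paulis if all(commute(p, q) for q in paulis)]
    rest = [p for p in paulis if p not in universal]
    # Commutation must be transitive on what remains.
    for a, b, c in permutations(rest, 3):
        if commute(a, b) and commute(b, c) and not commute(a, c):
            return False
    return True

# S = {IZ}, with equivalence classes {ZI, ZZ} and {XI}.
noncontextual_set = ["IZ", "ZI", "ZZ", "XI"]
# XI commutes with IX, IX commutes with ZI, yet XI anticommutes with ZI.
contextual_set = ["XI", "IX", "ZI", "IZ"]
```

The brute-force loop over triples is cubic in the number of terms, which is adequate for a sketch but not tuned for large Hamiltonians.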
The symmetry $\mathcal{S}$ can be expanded by taking pairwise products within equivalence classes, and we may define the expanded set
$$\mathcal{S}' := \mathcal{S} \cup \{ C C' \mid C, C' \in \mathcal{C}_i,\ 1 \le i \le M \}.$$
As before, in Section 3, $\mathcal{S}'$ induces a symmetry group for which one may define independent generators $\mathcal{G}$ and a Clifford operation $U_{\mathcal{G}}$ mapping the generators to single-qubit Pauli operators; the expectation value over these qubits will again be determined by an assignment $\nu \in \{\pm 1\}^{\times |\mathcal{G}|}$ of eigenvalues, analogous to the selection of a symmetry sector in qubit tapering.
From each equivalence class $\mathcal{C}_i$ we select a representative $C_i \in \mathcal{C}_i$ and construct an observable
$$C(r) := \sum_{i=1}^{M} r_i C_i, \qquad r \in \mathbb{R}^M,\ |r| = 1.$$
Kirby & Love 57 found that quantum states $|\psi_{(\nu,r)}\rangle \in \mathscr{H}$ stabilized by the operators $\{\nu_{f(G)} G \mid G \in \mathcal{G}\} \cup \{C(r)\}$ are consistent with a classical objective function $\eta(\nu, r)$ (derived in the Supporting Information), in the sense that $\eta(\nu, r)$ coincides with the noncontextual energy expectation value $\langle H_{\mathcal{T}_{\mathrm{nc}}} \rangle_{\psi_{(\nu,r)}}$ for all parametrizations $(\nu, r)$. This is a consequence of the joint probability distribution chosen over the phase-space points of their (epistricted) model 57,58.
The noncontextual energy spectrum is therefore parametrized by two vectors: the $\pm 1$ eigenvalue assignments $\nu$, determining the contribution of the universally commuting terms, and $r$, encapsulating the remaining pairwise anticommuting classes. In this sense, we may refer to $(\nu, r)$ as a state of the noncontextual Hamiltonian itself, abstracted from quantum states of the corresponding stabilizer subspace. Optimizing over these parameters, we obtain the noncontextual ground state energy
$$\epsilon_0^{\mathrm{nc}} := \min_{(\nu, r)} \eta(\nu, r)$$
and call an element $(\nu, r)$ of the preimage $\eta^{-1}(\epsilon_0^{\mathrm{nc}})$ a noncontextual ground state of $H_{\mathcal{T}_{\mathrm{nc}}}$.
Let us denote by $\Delta_{\mathrm{nc}} := |\epsilon_0^{\mathrm{nc}} - \epsilon_0|$ the absolute error with respect to the true ground state energy $\epsilon_0$.
In Section 5 we find the difference between the noncontextual ground state energy, taken as a classical estimate of the ground state energy of the full Hamiltonian $H_{\mathcal{T}}$, and the Hartree-Fock energy to be negligible for each of the molecules simulated, since the heuristic used to choose $H_{\mathcal{T}_{\mathrm{nc}}}$ prioritizes diagonal Hamiltonian terms. In principle, the noncontextual estimate may improve upon Hartree-Fock, as the noncontextual set can also take into account an off-diagonal contribution within the anticommuting classes. This is highly dependent on the chosen form of noncontextual set; a reformulation in terms of graphs (e.g. representing Pauli operators as nodes with (non)adjacency indicating (anti)commutation) would allow one to identify the equivalent problems in computer science and therefore draw upon the vast body of existing research on graph algorithms. It should be noted that the 'optimal' noncontextual subset will not necessarily be that which minimizes the noncontextual ground state energy; some consideration of the resulting quantum corrections must inform this choice.

Quantum corrections
Our simulation approach has thus far been strictly classical; now we arrive at the quantum element of CS-VQE. We have derived a classical estimate of the ground state energy from the noncontextual part of the Hamiltonian $H_{\mathcal{T}_{\mathrm{nc}}}$; however, the contextual component $H_{\mathcal{T}_{\mathrm{c}}}$ has so far been neglected.
While C(r) is not a stabilizer in the strict sense (it is not an element of the Pauli group), it is unitarily equivalent to one as a linear combination of anticommuting Pauli elements.
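The reason a normalized combination of anticommuting Pauli operators behaves like a stabilizer is that it squares to the identity: the cross terms cancel and the coefficients satisfy $\sum_i r_i^2 = 1$. The sketch below verifies this symbolically for two class representatives; the representatives `ZI` and `XI`, the angle and the helper names are assumptions made for illustration.

```python
import math

# Single-qubit products σ_a σ_b = phase · σ_c for distinct non-identity Paulis.
_PRODUCT = {("X", "Y"): (1j, "Z"), ("Y", "Z"): (1j, "X"), ("Z", "X"): (1j, "Y"),
            ("Y", "X"): (-1j, "Z"), ("Z", "Y"): (-1j, "X"), ("X", "Z"): (-1j, "Y")}

def pauli_mul(p, q):
    """Multiply two Pauli strings, returning (phase, resulting string)."""
    phase, out = 1, []
    for a, b in zip(p, q):
        if a == "I":
            out.append(b)
        elif b == "I":
            out.append(a)
        elif a == b:
            out.append("I")
        else:
            factor, c = _PRODUCT[(a, b)]
            phase *= factor
            out.append(c)
    return phase, "".join(out)

def op_mul(A, B):
    """Multiply two operators stored as {Pauli string: coefficient}."""
    result = {}
    for p, a in A.items():
        for q, b in B.items():
            phase, s = pauli_mul(p, q)
            result[s] = result.get(s, 0) + phase * a * b
    # Discard terms whose coefficients have cancelled.
    return {s: c for s, c in result.items() if abs(c) > 1e-12}

angle = 0.3                                             # r = (cos 0.3, sin 0.3), so |r| = 1
C = {"ZI": math.cos(angle), "XI": math.sin(angle)}      # anticommuting representatives
C_squared = op_mul(C, C)
```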
Similar to the symmetry generators $\mathcal{G}$, it is possible to define a unitary operation $U_C$ mapping $C(r)$ onto a single-qubit Pauli operator, following the approach of unitary partitioning 32-36.
However, unlike the $U_{\mathcal{G}}$ rotation, $U_C$ is not Clifford, as it collapses $M$ terms onto a single Pauli operator and can therefore introduce additional terms to the Hamiltonian. Kirby et al. 45 cautioned that, in principle, this increase in Hamiltonian complexity could be exponential in the number of equivalence classes $M$, namely a scaling of $O(2^M)$. However, Ralli et al. 36 demonstrated that the general scaling is $O(x^{M-1})$ where $x \in [1, 2]$; that is, still exponential, yet the necessary conditions to obtain the worst-case $x = 2$ are contrived and have not been observed for any molecular Hamiltonians investigated to date. Regardless, one may circumvent this potential adverse scaling by implementing the linear combination of unitaries approach, at the expense of one ancillary qubit and a necessarily probabilistic implementation 33,35,36.
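Appending $C(r)$ to our set of generators, $\tilde{\mathcal{G}} := \mathcal{G} \cup \{C(r)\}$, and defining $U := U_C U_{\mathcal{G}}$, there exists a subset of qubit indices $\mathcal{I}_{\mathrm{stab}} \subset \mathbb{Z}_N$ and a bijective map $f : \tilde{\mathcal{G}} \to \mathcal{I}_{\mathrm{stab}}$ such that
$$U G U^\dagger = \sigma_p^{(f(G))}$$
for each $G \in \tilde{\mathcal{G}}$, where $\sigma_p^{(i)}$ denotes $\sigma_p$ acting on qubit position $i$. We reiterate that $p \in \{1, 2, 3\}$ may be chosen at will; the approach taken by Kirby et al. 45 is to select $p = 3$ to enforce diagonal generators.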
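Suppose we have a quantum state $|\psi_{(\nu,r)}\rangle$ that is consistent with $\epsilon_0^{\mathrm{nc}}$; since the rotated state $|\psi'_{(\nu,r)}\rangle = U |\psi_{(\nu,r)}\rangle$ must be stabilized by $\nu_i \sigma_p^{(i)}\ \forall\, i \in \mathcal{I}_{\mathrm{stab}}$, the qubit positions $\mathcal{I}_{\mathrm{stab}}$ must be fixed. This implies a decomposition
$$|\psi'_{(\nu,r)}\rangle = |b_{(\nu,r)}\rangle \otimes |\varphi\rangle,$$
where $|b_{(\nu,r)}\rangle$ represents a single basis state of $\mathscr{H}_{\mathrm{stab}}$ and $|\varphi\rangle \in \mathscr{H}_{\mathrm{red}}$ is independent of the parameters $(\nu, r)$. Therefore, the expectation value of the full Hamiltonian may be expressed as
$$\langle H_{\mathcal{T}} \rangle_{\psi_{(\nu,r)}} = \epsilon_0^{\mathrm{nc}} + \langle \pi_\nu^U(H_{\mathcal{T}_{\mathrm{c}}}) \rangle_{\varphi},$$
where $\pi_\nu^U(H_{\mathcal{T}_{\mathrm{c}}})$ contains only the terms of the contextual Hamiltonian that commute with all the noncontextual generators, just as in (15). This form is obtained naturally when applying the stabilizer subspace projection to the full Hamiltonian, which automatically includes the noncontextual energy by fixing the corresponding eigenvalue assignments. Now, we may perform unconstrained VQE over $\mathscr{H}_{\mathrm{red}}$ to obtain a quantum-corrected estimate $\epsilon_0^{\mathrm{c}}$ of the true ground state energy with absolute error
$$\Delta_{\mathrm{c}} := |\epsilon_0^{\mathrm{c}} - \epsilon_0| \le \Delta_{\mathrm{nc}}.$$
We have equality when the stabilizers span every qubit position, which is the case when the extended generating set satisfies $|\tilde{\mathcal{G}}| = N$, since the generators must be algebraically independent; this means the initial quantum correction is trivial, as the noncontextual part determines the entire system.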
For instances of the electronic structure problem there is no guarantee that $\epsilon_0^{\mathrm{c}}$ will achieve chemical accuracy ($\Delta_{\mathrm{c}} < 1.6$ mHa ≈ 4 kJ/mol) and, indeed, it might not improve upon the noncontextual estimate (although it will never be worse, due to the variational principle applying in this case). However, one can easily define a subset of $\mathcal{T}_{\mathrm{nc}}$ that is again noncontextual; this is achieved by discarding one of the noncontextual generators $G \in \tilde{\mathcal{G}}$, along with the operators that it generates. We now append the discarded operators to the contextual Hamiltonian, relaxing the stabilizer constraint on the qubit position $f(G)$ and permitting a search over its Hilbert space. This process may be iterated until the noncontextual set is exhausted and we recover full VQE. This means that, unless the ground state energies of $H_{\mathcal{T}_{\mathrm{nc}}}$ and $H_{\mathcal{T}}$ coincide, CS-VQE will improve upon the noncontextual energy using fewer quantum resources than full VQE; this is more rigorously defined in the next section.
In summary, what we have described here is a technique for scaling the relative sizes of the noncontextual (read classical) and contextual (read quantum) simulations in a reciprocal manner. We can therefore trade off quantum and classical workloads in CS-VQE.

Expanding the contextual subspace
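Now we describe the process of growing the contextual subspace more rigorously. We select a subset of noncontextual generators $\mathcal{F} \subset \tilde{\mathcal{G}}$ whose stabilizer constraints we mean to enforce and construct a new noncontextual set $\mathcal{T}'_{\mathrm{nc}} := \mathcal{T}_{\mathrm{nc}} \cap \langle \mathcal{F} \rangle$; the contextual set is expanded accordingly by appending the terms not generated by $\mathcal{F}$, i.e., $\mathcal{T}'_{\mathrm{c}} := \mathcal{T}_{\mathrm{c}} \cup (\mathcal{T}_{\mathrm{nc}} \setminus \langle \mathcal{F} \rangle)$. As before, there exists a unitary operation $U_{\mathcal{F}}$, a subset of qubit indices $\mathcal{I}_{\mathrm{fix}} \subset \mathcal{I}_{\mathrm{stab}}$ and a bijective map $f : \mathcal{F} \to \mathcal{I}_{\mathrm{fix}}$ such that
$$U_{\mathcal{F}}\, G\, U_{\mathcal{F}}^\dagger = \sigma_p^{(f(G))} \quad \forall\, G \in \mathcal{F}$$
(the rotation $U_{\mathcal{F}}$ may or may not be Clifford depending on whether $C(r)$ is among the stabilizers we wish to fix).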
Denote by $\epsilon_0^{\mathrm{nc}}(\mathcal{F})$ the ground state energy of the new noncontextual Hamiltonian $H_{\mathcal{T}'_{\mathrm{nc}}}$, with absolute error $\Delta_{\mathrm{nc}}(\mathcal{F}) \ge \Delta_{\mathrm{nc}}$. While this is weaker as an estimate of the true ground state energy of the full system, at the very least we are guaranteed to recover the initial noncontextual ground state energy from performing a simulation of the expanded contextual subspace 45, which we describe below.
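The stabilizer constraints of $\mathcal{F}$ are enforced over the Hilbert space $\mathscr{H}_{\mathrm{fix}} = (\mathbb{C}^2)^{\otimes |\mathcal{I}_{\mathrm{fix}}|}$ of qubits indexed by $\mathcal{I}_{\mathrm{fix}}$, whereas we may perform a VQE simulation over $\mathscr{H}_{\mathrm{sim}} = (\mathbb{C}^2)^{\otimes |\mathcal{I}_{\mathrm{sim}}|}$, the Hilbert space of the remaining $N - |\mathcal{F}|$ qubits indexed by $\mathcal{I}_{\mathrm{sim}} = \mathbb{Z}_N \setminus \mathcal{I}_{\mathrm{fix}}$. Invoking the stabilizer subspace projection map $\pi_{\nu'}^{U_{\mathcal{F}}}$ with the eigenvalue assignments $\nu' = (\nu_i)_{i \in \mathcal{I}_{\mathrm{fix}}}$ yields an expanded contextual subspace Hamiltonian
$$\tilde{H}_{\mathcal{T}'_{\mathrm{c}}} := \pi_{\nu'}^{U_{\mathcal{F}}}(H_{\mathcal{T}}) \tag{23}$$
with an error satisfying $\Delta_{\mathrm{c}}(\mathcal{F}) \le \Delta_{\mathrm{c}}$. Recall that $\Delta_{\mathrm{c}} = \Delta_{\mathrm{c}}(\tilde{\mathcal{G}})$ corresponds with the contextual error when we enforce the full set of noncontextual stabilizers.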
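Observe that, when $|\mathcal{I}_{\mathrm{sim}}| = N$, we are simply performing full VQE over the entire system; this occurs when we do not enforce the stabilizer constraint for any of the noncontextual generators, i.e. $\mathcal{F} = \emptyset$. Therefore, it must be the case that $\Delta_{\mathrm{c}}(\emptyset) = 0$. Furthermore, given a nested sequence of generator subsets $\mathcal{F}_0 \supset \mathcal{F}_1 \supset \dots \supset \mathcal{F}_K = \emptyset$, the errors decrease monotonically, $\Delta_{\mathrm{c}}(\mathcal{F}_0) \ge \Delta_{\mathrm{c}}(\mathcal{F}_1) \ge \dots \ge \Delta_{\mathrm{c}}(\mathcal{F}_K) = 0$, and the aim is to identify the $\mathcal{F}$ requiring the fewest simulated qubits such that $\Delta_{\mathrm{c}}(\mathcal{F}) < 1.6$ mHa. In general, we will not have access to a target energy and so will not necessarily know when the desired precision is achieved; instead, we might iterate until the VQE convergence is within some fixed bound.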
Greedily selecting combinations of d ≤ N generators that yield the greatest reduction in error, necessitating This idea comes from the theory of pseudopotential approximations 59 , in which it is observed that chemically relevant electrons are predominantly those of the valence space, whereas the core may be 'frozen', thus reducing the electronic complexity.
Alternatively, one might define a Hamiltonian term-importance metric that considers coefficient magnitudes 60 or second-order response with respect to a perturbation of the Hartree-Fock state 61 .In relation to this, it is also not clear which features of a molecular system mean that it might be more or less amenable to CS-VQE; additional insight here would allow one to predict how many qubits will be required to simulate a given problem to chemical accuracy.

The noncontextual projection ansatz
CS-VQE has thus far not been applied to systems exceeding 18 qubits and the resulting reduced Hamiltonians (23) have been solved by direct diagonalization 45; clearly, this will not scale to larger systems, with the required classical memory increasing exponentially.
Instead, they must be simulated by performing VQE routines, but defining an ansatz for the contextual subspace has been an obstacle to achieving this in practice.
However, having now placed the problem within the stabilizer formalism described in Section 3, we have already introduced (in Sections 4.1-4.3) the tools necessary to restrict an ansatz of the form (5), defined over the full system, to the contextual subspace (23).
The approach adopted here is equivalent to that which we defined for qubit tapering in (15).
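To restrict a parametrized ansatz operator in line with the stabilizer constraints $\mathcal{F} \subset \tilde{\mathcal{G}}$ we may simply call upon the stabilizer subspace projection map $\pi_{\nu'}^{U_{\mathcal{F}}}$ once more, which yields a restricted ansatz state
$$|\tilde{\psi}(\theta)\rangle = e^{i\tilde{A}(\theta)} |\tilde{\psi}_{\mathrm{ref}}\rangle, \tag{27}$$
where $\tilde{A}(\theta) := \pi_{\nu'}^{U_{\mathcal{F}}}(A(\theta))$. Any rotated ansatz term $U_{\mathcal{F}} \sigma U_{\mathcal{F}}^\dagger$ that is not identity or a Pauli $\sigma_p$ on some subset of the qubit positions indexed by $\mathcal{I}_{\mathrm{fix}}$ will vanish.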
The restricted reference state $|\tilde{\psi}_{\mathrm{ref}}\rangle$ is obtained from a partial projective measurement of $U |\psi_{\mathrm{ref}}\rangle$ (see the discussion on POVMs in Section 3) with outcomes defined by the eigenvalue assignments $\nu'$, which yields a product state
$$P_{\nu'}\, U |\psi_{\mathrm{ref}}\rangle \propto |b_{(\nu,r)}\rangle \otimes |\tilde{\psi}_{\mathrm{ref}}\rangle. \tag{29}$$
The post-measurement state $|b_{(\nu,r)}\rangle \in \mathscr{H}_{\mathrm{fix}}$ on the noncontextual subspace represents a single basis vector and can therefore be disregarded, leaving just the state of the contextual subspace; this we take as reference for our restricted ansatz. We stress this 'measurement' is not performed in-circuit but is instead to be evaluated classically when constructing the restricted ansatz circuit.
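We may now define the contextual subspace energy expectation function
$$\tilde{E}(\theta) := \langle \tilde{\psi}(\theta) | \tilde{H}_{\mathcal{T}'_{\mathrm{c}}} | \tilde{\psi}(\theta) \rangle,$$
with $\tilde{H}_{\mathcal{T}'_{\mathrm{c}}}$ as in (23), at which point we have reduced the problem to standard VQE, performed over a subspace of the full problem.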
In order to prepare the projected ansatz state (27), we first initialize the $|\mathcal{I}_{\mathrm{sim}}|$-qubit quantum circuit in the noncontextual ground state, achieved by applying a Pauli $\sigma_1$ operator in each of the qubit positions $i \in \mathcal{I}_{\mathrm{sim}}$ such that $\nu_i = -1$. This is visible in Figure 2, in which the VQE routine is initiated with the optimization parameters zeroed, i.e. $\theta = 0$; since $e^{i\tilde{A}(0)} = \mathbb{1}$, optimization begins at the noncontextual ground state energy.
It is not in general possible to implement the unitary operation $e^{i\tilde{A}(\theta)}$ exactly as a quantum circuit (except when the terms of $\tilde{A}(\theta)$ mutually commute); however, one may do so approximately via the commonly used technique of Trotterization (see the Supporting Information for further details).
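To make the approximation concrete, the sketch below compares the exact exponential $e^{i(aX + bZ)}$ of two non-commuting single-qubit terms with its first-order Trotter product $(e^{iaX/n} e^{ibZ/n})^n$. The single-qubit operator, the coefficients and the helper names are assumptions chosen so that both sides have closed forms; this is not the Trotterization scheme described in the Supporting Information.

```python
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_involution(theta, P):
    """e^{iθP} = cos(θ)·1 + i·sin(θ)·P, valid for any P with P² = 1."""
    return [[math.cos(theta) * I2[i][j] + 1j * math.sin(theta) * P[i][j]
             for j in range(2)] for i in range(2)]

def exact(a, b):
    """e^{i(aX + bZ)}, using (aX + bZ)² = (a² + b²)·1."""
    r = math.hypot(a, b)
    unit = [[(a * X[i][j] + b * Z[i][j]) / r for j in range(2)] for i in range(2)]
    return exp_involution(r, unit)

def trotter(a, b, steps):
    """First-order Trotter product with the given number of steps."""
    step = matmul(exp_involution(a / steps, X), exp_involution(b / steps, Z))
    out = I2
    for _ in range(steps):
        out = matmul(out, step)
    return out

def trotter_error(a, b, steps):
    U, V = exact(a, b), trotter(a, b, steps)
    return max(abs(U[i][j] - V[i][j]) for i in range(2) for j in range(2))

errors = [trotter_error(0.8, 0.5, n) for n in (1, 10, 100)]
```

The error shrinks roughly in proportion to $1/n$, at the cost of an $n$-fold deeper circuit.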

Simulation results
The molecular systems that were simulated to benchmark the noncontextual projection ansatz for CS-VQE are given in Table 1. The molecule geometries were obtained from the Computational Chemistry Comparison and Benchmark Database (CCCBDB) 62 and their Hamiltonians constructed using IBM's Qiskit Nature 63 with PySCF as the underlying quantum chemistry package 64.
Before we evaluate the efficacy of our noncontextual projection ansatz, there are a few features of (27) that should be highlighted. First of all, in (29) we are applying the operation $U$ in-circuit, introducing further gates that will contribute additional noise. However, when the reference state is taken to be that of Hartree-Fock, we observed $U |\psi_{\mathrm{ref}}\rangle$ to coincide with the noncontextual ground state. This is an artifact of the noncontextual set construction heuristic, used within both this work and Ref. 45, prioritizing diagonal entries. This need not always be the case, but for the molecular systems investigated this allows us to avoid performing $U$ in-circuit and instead take the noncontextual ground state as our reference.
Secondly, application of the unitary partitioning rotations $U_C$ to the ansatz operator $A(\theta)$ may introduce additional terms by a scaling factor of $O(x^{M-1})$, where $M$ is the number of equivalence classes in (17) and $x \in [1, 2]$ is a parameter depending on the given Hamiltonian, as discussed in Section 4.2. We obtained $M = 2$ for all of the molecules tested, although for a general Hamiltonian this need not be the case and is also dependent on the form of the noncontextual set $\mathcal{T}_{\mathrm{nc}}$. Here we prioritize universally commuting terms, but it is equally valid to maximize the anticommuting contribution.
Despite this, upon the subsequent projection of A(θ), it is possible that a significant number of terms will vanish.This is highly dependent on the quality of the initial ansatz and how heavily it is supported on the stabilized qubit positions I fix .Figure 1 presents circuit depths of the noncontextual projection ansatz as a proportion of the base ansatz from which it is derived, in this case the unitary coupled-cluster singles and doubles (UCCSD) operator.
A net reduction in circuit depth is observed, which is quite dramatic up to the point of reaching chemical accuracy in the CS-VQE routine; in Table 2 we give the specific number of ansatz terms before and after application of the noncontextual projection to UCCSD and UCCSDT for the fewest number of qubits permitting chemical accuracy. In order to identify a compact ansatz that closely captures the underlying chemistry with minimal redundancy, we employ the ADAPT-VQE methodology 16-19. The algorithm centres around an operator pool from which terms are selected in line with a gradient-based argument and appended to a dynamically expanding ansatz whose parameters are optimized at each cycle via VQE. The particular approach we implement here is that of qubit-ADAPT-VQE 17, which searches at the level of Jordan-Wigner encoded Pauli operators; the seminal ADAPT-VQE paper 16 instead defines its operator pool over fermionic excitations.
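The Jordan-Wigner transformation 65 maps a single fermionic annihilation operator onto two Pauli operators,
$$a_i = \Big( \bigotimes_{j < i} \sigma_3^{(j)} \Big) \otimes \tfrac{1}{2} \big( \sigma_1^{(i)} + i \sigma_2^{(i)} \big),$$
with the creation operator given by its Hermitian conjugate $a_i^\dagger$. Therefore, an excitation of order $s \in \mathbb{N}$ of the form
$$\tau = a^\dagger_{i_1} \cdots a^\dagger_{i_s}\, a_{j_1} \cdots a_{j_s}$$
is represented by $2^{2s}$ Pauli operators under this encoding. In the unitary coupled cluster theory, we are interested rather in the operator $\tau - \tau^\dagger$ to ensure unitarity upon exponentiation; this may be expressed by $2^{2s-1}$ Pauli terms.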
As such, after a mapping onto qubits via the Jordan-Wigner transformation, single, double and triple excitations account for 2, 8 and 32 Pauli operator terms respectively; while these are required to enforce various electronic symmetries in the ansatz state, not all are necessary to reach chemical accuracy. This idea lies behind qubit-ADAPT-VQE, which will select only the necessary Pauli terms and therefore yields considerably reduced circuit depths 17.
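The term counts quoted above can be checked with a short symbolic calculation. The following is a minimal sketch, not the implementation used in this work: it assumes a rank-$s$ excitation built from $s$ creation and $s$ annihilation operators acting on $2s$ distinct spin orbitals, and the helper names (`multiply`, `annihilation`, `dagger`, `excitation_generator`) are hypothetical.

```python
from itertools import product

# Single-qubit Pauli products: (left, right) -> (phase, result)
_MUL = {
    ("I", "I"): (1, "I"), ("I", "X"): (1, "X"), ("I", "Y"): (1, "Y"), ("I", "Z"): (1, "Z"),
    ("X", "I"): (1, "X"), ("X", "X"): (1, "I"), ("X", "Y"): (1j, "Z"), ("X", "Z"): (-1j, "Y"),
    ("Y", "I"): (1, "Y"), ("Y", "X"): (-1j, "Z"), ("Y", "Y"): (1, "I"), ("Y", "Z"): (1j, "X"),
    ("Z", "I"): (1, "Z"), ("Z", "X"): (1j, "Y"), ("Z", "Y"): (-1j, "X"), ("Z", "Z"): (1, "I"),
}


def multiply(a, b):
    """Product of two operators stored as {pauli_string: coefficient}."""
    out = {}
    for (p, cp), (q, cq) in product(a.items(), b.items()):
        phase, chars = 1, []
        for x, y in zip(p, q):
            ph, r = _MUL[(x, y)]
            phase *= ph
            chars.append(r)
        key = "".join(chars)
        out[key] = out.get(key, 0) + phase * cp * cq
    return out


def annihilation(j, n):
    """Jordan-Wigner a_j on n qubits: Z...Z (X + iY)/2 I...I."""
    prefix, suffix = "Z" * j, "I" * (n - j - 1)
    return {prefix + "X" + suffix: 0.5, prefix + "Y" + suffix: 0.5j}


def dagger(op):
    # Pauli strings are Hermitian, so only the coefficients are conjugated.
    return {p: complex(c).conjugate() for p, c in op.items()}


def excitation_generator(s):
    """a - a^dagger for a rank-s excitation (assumed to act on 2s spin orbitals)."""
    n = 2 * s
    a = {"I" * n: 1}
    for j in range(s, n):   # creation operators on orbitals s, ..., 2s-1
        a = multiply(a, dagger(annihilation(j, n)))
    for j in range(s):      # annihilation operators on orbitals 0, ..., s-1
        a = multiply(a, annihilation(j, n))
    a_dag = dagger(a)
    diff = {p: a.get(p, 0) - a_dag.get(p, 0) for p in set(a) | set(a_dag)}
    return {p: c for p, c in diff.items() if abs(c) > 1e-12}


counts = [len(excitation_generator(s)) for s in (1, 2, 3)]
print(counts)  # [2, 8, 32]
```

For $s = 1$ the surviving terms are proportional to $YX - XY$, which illustrates how subtracting the Hermitian conjugate removes half of the $2^{2s}$ Pauli strings.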
To leverage ADAPT-VQE in the context of CS-VQE, we define an operator pool $\mathcal{O} \subset \mathcal{P}_N$ and apply to it the stabilizer subspace projection (13) to define a reduced pool $\pi^{U_{\mathcal{F}}}_{\nu}(\mathcal{O})$ for the corresponding contextual subspace. The algorithm is then executed as normal, only terminating once the ADAPT-VQE energy is chemically accurate with respect to the FCI energy; for scalability, one should terminate computation when the largest gradient in magnitude falls below some predefined threshold, since the true ground state energy will not in general be known. In the Supporting Information, we provide a detailed description of the specific ADAPT-VQE implementation used within this work.
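To illustrate how a pool shrinks under projection, the sketch below applies a stabilizer subspace projection to a toy pool of Pauli strings. It is an illustration under stated assumptions rather than the code used in this work: we assume the rotation has already mapped the chosen stabilizers onto single-qubit $Z$ operators at the fixed positions, and the names `project_term`, `project_pool`, `pool` and `fixed` are hypothetical.

```python
def project_term(pauli, coeff, fixed):
    """Project one Pauli term; fixed = {qubit_index: eigenvalue (+1 or -1)}.

    Assumes the stabilizers have been rotated onto single-qubit Z operators
    acting on the fixed qubit positions.
    """
    sign = 1
    for i, nu in fixed.items():
        if pauli[i] in "XY":   # anticommutes with the stabilizer Z_i: term vanishes
            return None
        if pauli[i] == "Z":    # Z_i is replaced by its assigned eigenvalue
            sign *= nu
    reduced = "".join(p for i, p in enumerate(pauli) if i not in fixed)
    return reduced, sign * coeff


def project_pool(pool, fixed):
    """Project every term, merging those that coincide on the remaining qubits."""
    out = {}
    for pauli, coeff in pool.items():
        projected = project_term(pauli, coeff, fixed)
        if projected is None:
            continue
        reduced, c = projected
        out[reduced] = out.get(reduced, 0) + c
    return {p: c for p, c in out.items() if abs(c) > 1e-12}


# Toy 4-qubit pool; qubits 2 and 3 are stabilized with eigenvalues -1 and +1.
pool = {"XYZI": 1.0, "XYIZ": 0.5, "YXIZ": -1.0, "ZIXY": 1.0, "IZYX": 1.0}
fixed = {2: -1, 3: +1}
reduced_pool = project_pool(pool, fixed)
print(reduced_pool)  # {'XY': -0.5, 'YX': -1.0}
```

Here two of the five terms vanish because they act with $X$ or $Y$ on a stabilized qubit, and two more merge into a single two-qubit term, so the projected pool is both smaller and narrower than the original.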
For the following, we take our pool O to be the terms of the UCCSD operator for each of the molecules in Table 1 before tapering and projecting into the relevant contextual subspace.
In Figure 2, we present the ADAPT-VQE convergence data with expectation values obtained via exact wavefunction (statevector) calculations (i.e., no statistical or hardware noise); chemical accuracy is achieved in each instance. We used the adaptive moment estimation (Adam) 66 classical optimizer and computed parameter gradients as per the parameter shift rule 67.
The number of ADAPT-VQE cycles (and therefore the number of terms in the resulting ansatz operator) is presented in Table 2, alongside the size of the projected UCCSD operator pool used; one observes a significant reduction in the number of terms. The optimized ADAPT-VQE ansatz operators are reported in the Supporting Information, along with a description of the smallest CS-VQE problem permitting chemical accuracy. This includes the optimal noncontextual generator subset $\mathcal{F}$, the resulting noncontextual projection ansatz (27), the restricted reference state $|\psi_{\mathrm{ref}}\rangle$ (29), the target error $\Delta_c(\mathcal{F})$ (25) and that which was actually achieved in our VQE simulations (Figure 2). We also include the corresponding contextual subspace Hamiltonians for reproducibility.
Extracting the optimal parameter configuration $\theta_{\min}$, i.e., that which minimizes (30), from the wavefunction simulations in Figure 2, we subsequently assess the effect of sampling noise on the simulation error, with our ansatz circuit preparing the optimal quantum state $|\psi_{\mathrm{anz}}(\theta_{\min})\rangle$. Note that, for each of the molecular systems in Table 1, $\theta_{\min}$ is given explicitly in the Supporting Information.
To achieve an absolute error of $\Delta > 0$, one should expect to perform $O(1/\Delta^2)$ shots (for each term of the Hamiltonian) 4. Conversely, suppose we are allocated a quantity $S \in \mathbb{N}$ of shots; the obtained error should be of the order $O(1/\sqrt{S})$. To improve the accuracy of the estimate, we collected the Pauli terms into qubit-wise commuting (QWC) groups 25 using the graph-colouring functionality of NetworkX 68; such groups may be measured simultaneously.
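The grouping step can be sketched as a greedy colouring of the graph whose edges join Pauli strings that do not qubit-wise commute. The example below is a dependency-free stand-in for the NetworkX routine, comparable in spirit to a "largest first" greedy strategy; it is not the code used in this work, and the function names and toy term list are hypothetical.

```python
def qubit_wise_commute(p, q):
    """True if, on every qubit, the two Paulis are equal or one is the identity."""
    return all(a == b or a == "I" or b == "I" for a, b in zip(p, q))


def qwc_groups(paulis):
    """Greedy colouring of the graph whose edges join non-QWC Pauli strings."""
    conflicts = {
        p: [q for q in paulis if q != p and not qubit_wise_commute(p, q)]
        for p in paulis
    }
    # Visit the most-conflicted terms first.
    order = sorted(paulis, key=lambda p: len(conflicts[p]), reverse=True)
    colour = {}
    for p in order:
        used = {colour[q] for q in conflicts[p] if q in colour}
        # Smallest colour not already taken by a conflicting neighbour.
        colour[p] = next(c for c in range(len(paulis)) if c not in used)
    groups = {}
    for p, c in colour.items():
        groups.setdefault(c, []).append(p)
    return list(groups.values())


terms = ["ZZ", "ZI", "IZ", "XX", "XI", "IX", "YY"]
groups = qwc_groups(terms)
for g in groups:
    print(g)
```

Each colour class is a set of terms that can be estimated from the same measurement outcomes, so in this toy case seven terms require three measurement settings rather than seven.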
In Figure 3, the number of shots $S = 2^n$ for $n = 0, \ldots, 20$ carried out per QWC group is varied and we observe the root mean-square error (RMSE) over twenty realizations of the ground state energy estimate, plotted on a log-log scale. For clarity, note the only source of noise here is that which arises from statistical variation of the quantum circuit sampling; we have not introduced hardware noise in the form of imperfect quantum gates or decoherence. Two error regimes are observed, one of which is quite trivial: at high shot-counts we see a plateau resulting from the optimal error $|\tilde{E}(\theta_{\min}) - \epsilon_0|$ being recovered. To assess the convergence properties outside of this limiting region, we plot a line of best fit $m \cdot \log_{10}(S) + c$ among the data not exhibiting such behaviour; since the data is represented on a log-log scale, this corresponds with a decay in error of $O(S^m)$. In each plot of Figure 3  per QWC group. However, our shot budget could be reduced by implementing more advanced allocation strategies, for example according to the magnitude of Hamiltonian term coefficients 69 or a classical shadow tomography approach 30,31.
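The fitting procedure can be illustrated on a toy problem. The sketch below is not the simulation reported in Figure 3: it samples a single hypothetical Pauli term with a made-up expectation value of 0.3, uses far fewer shots, and involves no ansatz error, so no plateau appears. It only shows how the exponent $m$ is extracted from a least-squares fit to the log-log data.

```python
import math
import random


def estimate(expval, shots, rng):
    """Sample mean of `shots` outcomes in {+1, -1} whose true mean is `expval`."""
    p_plus = (1 + expval) / 2
    hits = sum(rng.random() < p_plus for _ in range(shots))
    return 2 * hits / shots - 1


def rmse(expval, shots, rng, realizations=20):
    """Root mean-square error over independent realizations of the estimate."""
    errs = [(estimate(expval, shots, rng) - expval) ** 2 for _ in range(realizations)]
    return math.sqrt(sum(errs) / realizations)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = m*x + c."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    m = sxy / sxx
    return m, ybar - m * xbar


rng = random.Random(0)
expval = 0.3  # hypothetical expectation value of a single Pauli term
shots = [2 ** n for n in range(13)]
log_s = [math.log10(s) for s in shots]
log_err = [math.log10(rmse(expval, s, rng)) for s in shots]
m, c = fit_line(log_s, log_err)
print(f"fitted slope m = {m:.2f}")
```

The fitted slope should lie close to $-1/2$, consistent with the $O(1/\sqrt{S})$ scaling expected of pure sampling noise.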

Conclusions
We have placed CS-VQE on the theoretical footing of stabilizer subspace projections, which allows one to compare it against other qubit reduction techniques such as qubit tapering 47,48.
Tapering defines a projection dependent on a symmetry of the full Hamiltonian and preserves the ground state energy exactly, whereas CS-VQE is approximate and projects onto a contextual subspace consistent with a symmetry of the noncontextual sub-Hamiltonian, augmented by an anticommuting contribution. In combination, the two techniques can effect a significant reduction in quantum resource requirements, as illustrated by Kirby et al. 45 and in Figure 1.
Previously, the only obstacle to building a CS-VQE framework that would be faithful to deployment on quantum devices was that of the ansatz, which has been addressed within this work. Furthermore, we demonstrated how CS-VQE may be combined with the ADAPT-VQE 16-19 ansatz construction framework by applying our noncontextual projection to the operator pool; validation was presented in Figure 2, in which we achieved chemical accuracy for the suite of small molecules outlined in Table 1. This combination provides considerable flexibility in both qubit count and circuit depth, allowing one to identify a reduced problem that may be simulated on the available quantum resource.
A number of research questions concerning the scalability of CS-VQE remain; we recapitulate these here. Firstly, the success of CS-VQE is sensitive to the generator subset $\mathcal{F}$ one chooses to constrain in the stabilizer subspace projection. To date, the most effective method for choosing this subset has been a greedy-search heuristic necessitating $O(N^{d+1})$ VQE simulations, where $d \leq N$ is the search depth; this is expensive for NISQ hardware and there is room for more targeted heuristics. For example, we may draw on chemical intuition to inform the selection of a contextual subspace that captures information about the underlying electronic structure problem. The second obstacle lies in the approach taken to construct the noncontextual sub-Hamiltonian. There is currently no intuition as to what constitutes an effective choice here, although it should be noted the 'optimal' noncontextual subset will not necessarily be that which minimizes the noncontextual ground state energy; some consideration of the resulting contextual subspaces must come into the construction of the noncontextual problem. We leave these issues for future work.
Finally, we have written an open-source Python package that facilitates the stabilizer subspace projection techniques of this paper, with in-built tapering and CS-VQE functionality.
We welcome the reader to make use of our code 70 , which is freely available on GitHub.
up to multiplication by ±1, ±i. Note the distinction between the bold font $\boldsymbol{\sigma}$ denoting tensor products and $\sigma_p$ a single-qubit Pauli operator; we will sometimes write $\sigma_p^{(i)}$ to index explicitly the qubit position $i \in \mathbb{Z}_N$ on which it acts. We shall also make use of the commutator $[A, B] := AB - BA$ and anticommutator $\{A, B\} := AB + BA$, defined for operators $A, B \in \mathcal{B}(\mathcal{H})$, which are zero when $A$ and $B$ commute/anticommute, respectively.
Performing an $|\mathcal{I}_{\mathrm{sim}}|$-qubit VQE simulation over the contextual subspace we obtain a new quantum-corrected estimate $\epsilon_0^{c}(\mathcal{F})$, the minimum over states $|\phi\rangle \in \mathcal{H}_{\mathrm{sim}}$ of the expectation value of the contextual subspace Hamiltonian, and the convergence is monotonic. In this way, CS-VQE describes an interpolation between a purely classical estimate of the ground state energy and a full VQE simulation of the Hamiltonian performed over some ansatz space. In the context of electronic structure calculations, this often permits one to achieve chemical accuracy at a saving of qubit resource, as indicated by Kirby et al. 45 for a suite of tapered test molecules of up to 18 qubits. Suppose we wish to find the optimal contextual subspace Hamiltonian of size $N' < N$. The problem reduces to minimizing the error $\Delta_c(\mathcal{F})$ over the $\binom{|\mathcal{G}|}{N - N'}$ generator subsets $\mathcal{F} \subset \mathcal{G}$ satisfying $|\mathcal{F}| = N - N'$. CS-VQE is highly sensitive to this choice, which remains a vital open question for the continued success of the technique. For chemistry applications, we grow the contextual subspace until the CS-VQE error attains chemical accuracy, which means finding

Figure 1: Ideal CS-VQE errors (left-hand axis) and corresponding noncontextual projection ansatz circuit depths as a proportion of the full UCCSD operator from which it is derived (right-hand axis) against the number of qubits simulated.

Figure 2: Validation of the noncontextual projection approach to ansatz construction for CS-VQE (27), used here in conjunction with ADAPT-VQE 16-19. We plot (on a $\log_{10}$ scale) the absolute error of wavefunction simulations conducted for the suite of trial molecules outlined in Table 1, each shown to achieve chemical accuracy; the horizontal axis indicates the number of function evaluations (nfev). Adaptive moment estimation (Adam) 66 is the classical optimizer used in the VQE routine performed over the contextual subspace at each ADAPT-VQE cycle. The parameter gradients $\partial \tilde{E}(\theta)/\partial \theta_i$, required for both operator pool term selection and VQE, were computed using the parameter shift rule 67.

Figure 3: Each of the plots 3a-3i corresponds to 2a-2i above and illustrates the statistical effect of sampling noise at the optimal parametrization $\theta_{\min}$ determined from the ADAPT-VQE statevector simulations in Figure 2. We plot the root mean-square error (RMSE) for twenty 'realizations' of the ground state energy estimate with $S \leq 10^6$ shots executed via IBM's QASM simulator; determining the line of best fit $m \cdot \log_{10}(S) + c$ with respect to the log-log data indicates a decay in error of $O(S^m)$.
Here we use an explicit condition for the noncontextuality of a set of Pauli operators, developed by Kirby & Love 55 and independently by Raussendorf et al. 56. Strictly speaking, this condition tests for strong measurement contextuality. In this setting, a set $\mathcal{T}_{\mathrm{nc}}$ is understood to be noncontextual if and only if commutation forms an equivalence relation on $\mathcal{T}_{\mathrm{nc}} \setminus \mathcal{S}$, where we have defined the sub-Hamiltonian symmetry $\mathcal{S}$. It was observed by Kirby et al. 45 that any term which anticommutes with at least one noncontextual generator must have zero expectation value and our stabilizer subspace projection captures this fact.
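The noncontextuality condition stated above lends itself to a direct check. The following sketch assumes, as our reading of the criterion, that the symmetry set consists of those terms commuting with every element of the set, and that it suffices to test transitivity of commutation on the remainder (reflexivity and symmetry hold automatically). The function names are hypothetical.

```python
from itertools import combinations


def commute(p, q):
    """Pauli strings commute iff they clash on an even number of qubits."""
    clashes = sum(a != "I" and b != "I" and a != b for a, b in zip(p, q))
    return clashes % 2 == 0


def is_noncontextual(terms):
    """Test whether commutation is an equivalence relation once the
    universally commuting terms (the assumed symmetry set) are removed."""
    symmetry = [p for p in terms if all(commute(p, q) for q in terms)]
    rest = [p for p in terms if p not in symmetry]
    for a, b, c in combinations(rest, 3):
        # Transitivity fails exactly when two of the three pairs commute
        # but the third pair does not.
        if commute(a, b) + commute(b, c) + commute(a, c) == 2:
            return False
    return True


print(is_noncontextual(["ZI", "IZ", "XI"]))        # True
print(is_noncontextual(["ZZ", "XI", "YI"]))        # True
print(is_noncontextual(["XI", "ZI", "IX", "IZ"]))  # False
```

In the last example $XI$ commutes with $IX$, and $IX$ commutes with $ZI$, yet $XI$ and $ZI$ anticommute, so commutation is not transitive and the set is contextual.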
Inspecting (20), we may optimize freely over quantum states $\phi$, i.e., we are not constrained by the noncontextual ground state within $\mathcal{H}_{\mathrm{red}}$. In fact, we may absorb the noncontextual ground state energy into the reduced contextual Hamiltonian, defining the contextual subspace Hamiltonian simulations, is an effective stabilizer relaxation ordering heuristic. Taking $d = 2$ produces a good balance between efficiency and efficacy 45, but there is room for more targeted approaches that exploit some structure of the underlying problem. For example, in quantum chemistry problems it could be that one should relax the stabilizers that have non-trivial action near the Fermi level, between the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO). Excitations clustered around this gap are more likely to appear in the true ground state and should therefore not be assigned definite values under the noncontextual projection.

Table 1: The systems investigated to benchmark the noncontextual projection ansatz (all in the STO-3G basis). The CS-VQE column indicates the smallest number of qubits required to achieve chemical accuracy.

Table 2: The number of Pauli terms $|A|$ for a selection of (tapered) ansätze. The $|\mathcal{I}_{\mathrm{sim}}|$ column indicates the number of qubits in the contextual subspace over which the ansatz is projected and each tuple (full/proj) gives the number of terms before and after projection. The final column gives the number of ADAPT-VQE cycles required to achieve chemical accuracy, with the operator pool consisting of the projected UCCSD terms; each simulation is plotted in Figure 2.