Fermionic Reduced Density Low-Rank Matrix Completion, Noise Filtering, and Measurement Reduction in Quantum Simulations

Fermionic reduced density matrices summarize the key observables in Fermionic systems. In electronic systems, the two-particle reduced density matrix (2-RDM) is sufficient to determine the energy and most physical observables of interest. Here, we consider the possibility of using matrix completion to reconstruct the two-particle reduced density matrix to chemical accuracy from partial information. We consider the case of noiseless matrix completion, where the partial information corresponds to a subset of the 2-RDM elements, as well as noisy completion, where the partial information corresponds to both a subset of elements and statistical noise in their values. Through experiments on a set of 24 molecular systems, we find that 2-RDM can be efficiently reconstructed from a reduced amount of information. In the case of noisy completion, this results in a multiple orders of magnitude reduction in the number of measurements needed to determine the 2-RDM with chemical accuracy. These techniques can be readily applied to both classical and quantum algorithms for quantum simulations.


I. INTRODUCTION
Although quantum states live in a Hilbert space that is exponentially large in physical system size, most information of physical interest can be captured by quantities of much reduced dimension.For time-independent fermionic observables, the relevant quantities are the fermionic reduced density matrices (RDMs) [1,2].For example, the -RDM, defined as where  †  ,   denote fermionic creation and annihilation operators in an orbital basis, contains all information on fermion observables.We will be interested in electronic systems, where the interparticle interaction is Coulombic, and the Hamiltonian is thus of two-body form.In this case, the 2-RDM 2  , = ⟨ †   †      ⟩ is of particular interest, as it determines the electronic energy [3,4].
Because Ψ can be quite complicated in a correlated electronic state, obtaining an accurate -RDM can be expensive.Here we discuss how to obtain improved approximations to the -RDM (specifically, the 2-RDM 2  , although the procedures are general) from incomplete information on its elements.We consider two types of incomplete information.The first is a noiseless setting where we have only computed a subset of the RDM elements.This situation is relevant to deterministic algorithms (or stochastic algorithms in a setting where the statistical noise is very small) when obtaining the full -RDM is expensive.The second is a noisy setting, where the goal is to reduce the total number of measurements.Such a noisy setting arises in both quantum Monte Carlo algorithms (as a statistical noise) [5][6][7][8][9][10][11][12][13] and in quantum simulations (as a measurement shot noise) [14][15][16][17][18][19][20][21][22].In the latter case, measurement reduction  is especially relevant to hybrid quantum-classical algorithms [26,43,[46][47][48][49][50][51][52][53][54][55][56][57][58][59][60][61][62][63] which rely on feedback from measured quantities.The quantum shot noise will be the specific noise setting considered in this work.
Here we focus instead on the k-fermionic RDMs, and the specific matrix completion heuristics applicable to an electronic structure setting.
Matrix completion algorithms rely on a number of input parameters.We first define how such input parameters, such as the target rank, sampling method, incoherent basis, etc. can be determined in an electronic structure setting.We further introduce simple postprocessing (or error mitigation) techniques to improve the results of the completion.We then analyze noiseless and noisy matrix completion using a testbed of molecules from a subset of the G2 dataset [99].In general, we find that with an optimized completion protocol, it is possible to reduce the measurement cost, either with respect to the number of elements or with respect to the number of shots, by 1-3 orders of magnitude across our dataset, while retaining a relevant accuracy to chemistry.

A. Recap of matrix completion and low-rank noise filtering
We briefly recall some relevant aspects of matrix completion.For a more detailed introduction, we refer to Refs.[75,78,100].We restrict ourselves to square symmetric positive semi-definite matrices.The objective is to recover an approx-arXiv:2306.05640v1[quant-ph] 9 Jun 2023 imation to a low-rank  ×  matrix  from incomplete information about its elements.We first consider the case where we can measure the elements exactly (i.e.without noise) and the incompleteness is from measuring a subset of the elements Ω.Then, given |Ω| ≡  sample elements of matrix , we can solve for a positive low-rank approximation   through the minimization: where M is the desired low-rank completion.Because we restrict to square symmetric positive definite , we can use the parametrization M =  †  where  is a real  ×  matrix, and then perform minimization over  by gradient-based techniques.
The efficiency of the above matrix completion can be discussed in terms of the fraction of sampled elements  sample =  sample ∕ 2 required to obtain a given distance between M and , such as the relative error (in the Frobenius norm) The efficiency clearly depends on the sampling scheme (i.e. the elements in Ω) and how information about the matrix is distributed in its entries (the matrix coherence).Assuming a random sampling scheme, successful matrix completion requires information to be spread over all matrix elements.For example, a matrix with only one nonzero element can only be completed correctly if the nonzero element is sampled.The distribution of such nonzero information can be quantified by the coherence in terms of the singular vectors of  [75,100]: for  =  Λ  with  a × matrix, we define the geometric coherence  as where   ∈ ℝ  is the standard basis.If all elements of  have magnitude 1∕ √ , this yields the minimum coherence  = 1, while if the columns of  align with the standard basis, we obtain a maximum coherence  = ∕.The number of elements required to complete the matrix successfully can be shown to increase linearly with the coherence as ( poly(log )) [75,100].
In our application, we require two generalizations of the above matrix completion.The first is that  is only approximately low rank, i.e. there are  singular values above some threshold, but also singular values below this threshold.Given some assumed rank  in the matrix completion, we can expect the best recoverable matrix to be   =  Λ    (where Λ  contains the  largest singular values) and there is a remaining rank truncation error ∑ > Λ 2  where Λ  are the singular values in decreasing order.The best choice of rank  is not known ahead of time.We thus discuss how  can be estimated below using an independent approximate model of .
The second generalization is that we consider matrix completion in the presence of noise.The statistical noise decreases as we increase the number of measurement shots  like 1∕ √ .The efficiency of matrix completion can be assessed as   = ∕ 0 where  0 is the number of shots required in some standard measurement scheme to achieve a given error in .Matrix completion is a useful technique in this context because statistical noise does not have a low-rank structure.Thus, if the noise is not too large, performing low-rank matrix completion filters out the noise.There are thus two potential gains in noisy matrix completion: one from measuring fewer distinct elements of the matrix, and one from requiring fewer shots to reduce the noise.

B. The fermionic 2-RDM
The fermionic 2-RDM, which we label  for simplicity, determines the electronic energy of the system.Given  , , we obtain the 1-RDM where  el = Tr is the number of electrons.The electronic energy is then where   and   are the one-and two-electron integrals, respectively.We denote the size of the orbital basis by .We will refer to the two-electron part of the energy as The 2-RDM has a number of symmetries.In a real-orbital basis, the 2-RDM has 8-fold symmetry We will only sample or measure unique elements (e.g.only the lower triangular part of  ), and non-zero spin sectors, in our completion tests below, although for simplicity, we will refer to all spin sectors collectively as  .
The maximum rank  is .For orientation, if one assumes the Hartree-Fock density matrix, where and  is idempotent, then (  , ) =   (  − 1)∕2 and (  , ) =     where   and   are the number of spinup and spin-down electrons, respectively.These are the minimum ranks for an electronic system: if there are electron correlations, the rank of the 2-RDM increases.In Fig. 1, we show the singular values of the spin-components of  for two models of electron correlation: coupled cluster singles and doubles (CCSD) and second-order Møller Plesset perturbation theory (MP2) [101][102][103].In both models, the singular value spectrum contains large singular values, corresponding to the Hartree-Fock piece of the 2-RDM.Beyond these, the singular values decay approximately exponentially [104].

C. Noiseless completion of the 2-RDM
We first consider the noiseless completion of the 2-RDM, where we have an incomplete sampling of the elements.To define the minimization problem in Eq. 2 concretely, as described in Sec.II A, we must specify (i) how to sample the elements,(ii) the estimated matrix rank , and (iii) the number of elements to sample.
While there are procedures to estimate the approximation rank  on the fly [105,106], here we use a simpler process that is likely available in many applications.Recall that we wish to use matrix completion in a setting where obtaining the elements of  is expensive.We can determine a less accurate model 2-RDM   via a cheaper procedure, and use the model to determine the optimal sampling, choice of rank , and the number of elements to sample.We define the rank  model approximation    =  Λ   † and choose  such that where  0 is our target completion error, and  is an empirical constant to account for the fact that our final error includes not only the rank truncation error arising from Eq. 9 but also a completion error from incomplete sampling.Here we use  = 1∕2.Next, we consider the element sampling.Since we have a model available, one might consider sampling elements in the descending order of magnitude of elements of   in a basis such as the canonical molecular orbital (MO) basis.However, for the smaller elements necessary to complete  to chemical accuracy in energy, we observe a significant difference between our model   and  .Thus, sampling in this order does not give a favorable completion efficiency.As a result, we instead use uniform random sampling of the elements, which is efficient if the matrix is not very coherent.To minimize the coherence of the 2-RDM, we optimize orthogonal matrices  such that the coherence of See III for practical implementation.Fig. 2 shows the reduction in the coherence of a model MP2   after such orbital rotations.To estimate  sample (the fraction of elements to sample), we perform matrix completion on the model   for the specified rank , and coherence optimized orbitals, and choose  sample so ( P   ,   ) <  0 .An example of such a model matrix completion is shown in Fig. 3.As the fraction of sampled elements increases, the completion error saturates at the rank truncation error.However, there is an unusual feature where the com-pletion error rises near the theoretical information bound (the number of elements needed to exactly complete a symmetric matrix with the exact rank of ).This appears related to the approximate low-rank nature of   , and the non-trivial feature which is difficult to describe purely theoretically illustrates the value in having an explicit model   to determine the parameters of matrix completion.RDMs in the cc-pVDZ basis as a function of the fraction of sampled elements  sample .The completion rank is chosen according to Eq. 9.
The theoretical information bound of (2 −  2 + )∕(( + 1)) is the ratio of degrees of freedom in a rank- symmetric  ×  to that of a rank- symmetric matrix.

D. Measuring the 2-RDM in the quantum setting
We now consider the problem of measuring the 2-RDM with noise, which we will take to arise from quantum measurements.We choose a Jordan-Wigner encoding of fermions and assume we are measuring Pauli operators.The expectation value of strings of Pauli operators can then be converted to fermion expectation values.For a quartet of fermion labels , , , , the 3 fermion expectation values not related by permutational symmetry ⟨ †   Because the quantum state is not in a simultaneous eigenstate of all the measured operators, there will be statistical errors in (some of) the measurements.Although there exist a variety of techniques to minimize the number of measurement settings by grouping simultaneously measurable operators [27,30,32], we use the straightforward approach of independently measuring each Pauli string and leave potential improvement by grouping to future work.Thus we sample fermionic terms in sets of 3 in Eq. 10, each set associated with a , , ,  quartet and reconstructed from the same 8 Pauli strings.The measurement variance for the Pauli string  is then obtained from the binomial distribution as  2 = (1 + ⟨⟩)(1 − ⟨⟩)∕ Q where  Q is the number of measurements of the string.
To define the efficiency of matrix completion, we first need to define a "standard" measurement procedure, where no matrix completion is performed.In this scheme, all Pauli strings required for the fermionic 2-RDM are measured with the same number of shots yielding a noisy P .To estimate the total number of shots  required, we measure  in the coherence minimized orbital basis, and choose  such that ( P ,  ) <  0 .In our tests, coherence minimization does not appreciably change the  required, as illustrated on the model   in Fig. 4.

E. Noisy completion of the 2-RDM
We now discuss matrix completion when measurements include statistical errors.Given our model   , we use the same rank estimation procedure and coherence minimization procedure as in the noiseless setting.However, we need a different procedure to determine the number of (sets of) fermionic elements  sample to measure, as the actual cost we wish to optimize is related to the total number of measurement shots .For simplicity, we assume each Pauli string is being measured with the same number of shots.For a given  we should then search over  sample to find the number of fermionic elements to measure that complete   with the lowest completion error; we then increase  until the completion error is below  0 .
In Fig. 5 we illustrate a typical result from searching for the optimal  sample .We see that in this problem, for a given  ( = 0 line) it is in fact optimal to sample close to 100% of the elements.This means matrix completion is performing almost entirely as a low-rank noise filter.In the other lines, we illustrate how the cost balance changes if we introduce a cost to switch the measurement setting when changing the Pauli strings (in multiples of the measurement cost,  = 500, 10000 data;  = 500 corresponds to the reported cost to switch measurements for the Sycamore quantum processor [107]).For a very high measurement setting cost, e.g. the case of  = 10000, there is a benefit to sample fewer elements.However, in the subsequent calculations, we will neglect the cost of measurement switching and determine the best ,  sample pair assuming  = 0.

F. Post-processing the completed 2-RDM
We can improve the results of matrix completion and noise filtering through post-processing.In the noisy quantum setting, this can be viewed as a form of error mitigation.We perform the following steps: 1.For noiseless completion, we replace sampled terms in the completed 2-RDMs with their exact values (i.e.giving zero completion error on the sampled terms).
3. For noiseless completion, we apply matrix completion to obtain P   , and the 2-RDM error of the model   − P   is added to our completed P  .

III. COMPUTATIONAL DETAILS
We use 24 small molecular systems from the G2-1 test set [99], including both singlet (s) and triplet (t) states of CH 2 and SiH 2 and 20 other molecules in their lowest spin state for our noiseless completion studies, and a further subset of the 7 smallest ones to study the basis dependence of noiseless completion and for the noisy completion studies.(The 7 smallest molecules serve as representative examples for the larger set, encompassing both those with the smallest and largest cost reductions in the noiseless completion results).We used (aug-)cc-pVXZ bases [108][109][110] throughout and in CCSD and MP2 calculations froze the lowest energy orbitals (1 for first row, 122 for second row).Completion is thus only performed for the non-core part of  .Molecular geometries at the B3LYP/6-31G(2df, p) [111][112][113] level of theory were taken from Ref. [114,115].We computed unrestricted CCSD density matrices, as the reference "exact" density matrices.We computed unrelaxed unrestricted MP2 density matrices as the model density matrices.
For our completion studies, we used a target completion error of  0 = 1%.Minimizing the geometric coherence corresponds to minimizing where  , are the singular vectors of [   ] , and  is the basis rotation matrix to be optimized.However, this minimization is numerically inconvenient because the max function is not differentiable everywhere.Instead, we perform the minimization starting from 10 Haar random [116] initial guesses of orthogonal matrices .In the noiseless matrix completion, for the MP2   , we randomly generated 10 different element samplings for each  sample , and for each sampling, used a maximum of 15000 iterations in the matrix completion optimization with the L-BFGS-B algorithm [117,118]. sample was chosen so that  ≤  0 in no less than 90% of the model completions.The CCSD 2-RDM completion was carried out using the same 10 element samplings as for MP2 2-RDMs.The CCSD completion errors were then averaged over all 10 trials except for cases where the optimization was not converged.
In the noisy measurement setting, 10 random samplings were generated for each  sample to estimate the best ,  sample pair.All quantum chemistry calculations were carried out with PySCF [119,120], while the conversion of the fermionic operators to Jordan-Wigner form was carried out using the Open-Fermion package and the OpenFermion-PySCF plugin [119,121].The Jordan-Wigner transformations were carried out on the 2-RDM in the coherence minimized basis.

A. Noiseless completion
Fig. 6a shows the completion results for  0 = 1% for 2-RDMs in the cc-pVDZ basis for the 24  The error of the MP2 model translates to the observed errors in the target CCSD 2-PDM matrix completion, with  ≤ 1.5%.The CCSD two-particle energy error is somewhat larger and grows from left to right in the plot.The molecules in the plot are ordered in terms of increasing  el ∕.For molecules such as BeH, where there is a significant difference between the singular value of the MP2 model and CCSD (reflecting the stronger correlation described by CCSD), the main reason for the increased CCSD completion error comes from the increased rank truncation error.In Fig. 6c, we see that  sample is roughly proportional to ( el ∕) 2 .This comes from the rank of the Hartree-Fock 2-RDM as discussed in section II B.
The above suggests that matrix completion is more useful in larger basis sets.We test this in the subset of 7 molecules in Fig. 7. To achieve 1% completion error in the model MP2 2-RDM, we find that the fraction of samples needed decreases with basis size as ∼ 1∕ 2 log 0.6 ().This falls between the fundamental information lower bound of 1∕ 2 [93,122], below which  is underdetermined, and the best provable bound of 1∕ 2 log(), which guarantees a high probability to complete  to a constant error [98].
As expected, post-processing reduces the energy error of the completed target 2-RDMs.In Fig. 8 we show the effect of the different steps on the two-particle energy  2 .Out of the 3 post-processing steps, normalizing the trace of  reduces the error the most, by 1-2 orders of magnitude.This suggests that the majority of energy error comes from the low-rank approximation, i.e. from truncating small eigenvalues, which reduces the trace of  .After all post-processing steps, the two-particle energy errors are around chemical accuracy (1.6 mHa).

B. Noisy 2-RDM completion
We now carry out similar numerical experiments in the presence of measurement noise.In Fig. 9a, we report the average number of shots (i.e.number of shots divided by the number of unique elements of  , denoted m) in the standard measurement scheme and the average number of shots in the matrix completion scheme required to obtain  0 = 1% for the subset of 7 molecules in the aug-cc-pVDZ basis.We see that across all molecules, there is a significant reduction (1∕  ) in the average number of shots required compared to the standard measurement scheme; the total reduction is between 1 to 3 orders of magnitude in the aug-cc-pVDZ basis.For matrix completion, the associated  sample used to generate the matrix completion data in Fig. 9a is reported in Fig. 9b.Almost all terms are measured for all the molecules, consistent with Fig. 5. Thus, resource reduction primarily comes from filtering the statistical noise in the measurements.In Fig. 9c we show the observed measurement cost reduction 1∕  to complete to 1% accuracy as a function of the estimated rank of   .We find   ∼ ∕ (the relative rank of the 2-RDM), which, when rescaled for || || ∼ , matches theoretical sample complexities of low-rank completions performed on normalized density matrices || P || = 1 [93,98].Therefore, resource reduction due to high-rank noise filtering is closely related to the low-rank property of the 2-RDM.
In Fig. 10, we show the energy error and 2-RDM error before and after normalizing  to the correct number of electrons.After this post-processing, the energy errors are all within chemical accuracy.

V. CONCLUSIONS
We have demonstrated that matrix completions can effectively reduce the effort to obtain fermionic -RDMs of interesting in electronic structure, and in particular, the 2-RDM.This was achieved by exploiting the low-rank structure as well as information obtained from approximate models of the 2-RDM.
The current work has immediate applications in both classical and quantum algorithms to obtain 2-RDMs.In the classical setting, we envision that these techniques can easily be employed in quantum Monte Carlo simulations.In the hybrid quantum algorithm setting, there exist other techniques to reduce the measurement resources, such as optimizing the groups of qubit-wise commuting Pauli terms [30,32], or employing classical shadows [27,29].It is likely these methods can be employed in conjunction with the matrix completion technique.In addition, it will be interesting to explore analogs of matrix completion which use the tensor structure of the 2-RDM, or impose additional constraints, such as representability conditions [123].

FIG. 1 .
FIG. 1. Singular values of the a)   , b)   , and c)   sectors of the (unrestricted) CCSD (solid) and MP2 (dash) 2-RDMs of the HF and CH molecules in the cc-pVDZ basis.The -axis is the rank  divided by the rank of the corresponding Hartree-Fock 2-RDM.

FIG. 3 .
FIG. 3. Completion errors of BeH, CH, OH and HF MP2  RDMs in the cc-pVDZ basis as a function of the fraction of sampled elements  sample .The completion rank is chosen according to Eq. 9.The theoretical information bound of (2 −  2 + )∕(( + 1)) is the ratio of degrees of freedom in a rank- symmetric  ×  to that of a rank- symmetric matrix.

FIG. 4 .
FIG. 4. Average variance per Pauli string of MP2 2-RDMs in the augcc-pVDZ basis from a quantum measurement where each Pauli term is measured with 1 shot.The gray bars labeled "MO" denote measurements in the canonical MOs; the green bars labeled "incoherent" denote measurements in the coherence minimized orbital basis.

FIG. 5 .
FIG. 5. Completion errors of BeH CCSD  in the aug-cc-pVDZ basis in three settings, each with a different cost  to switch the measurement, for a fixed number of shots per element (1317636).The completion error at each  sample is averaged over errors from 1000 different random samplings and the standard deviation is taken as the error bar.

FIG. 9 .
FIG. 9. a)The average number of shots per unique Pauli term m for 7 molecules needed in the "standard" measurement scheme and when using "matrix completion".Their ratio 1∕  , i.e. the factor of measurement cost reduction, is reported next to the bars.b) Fraction of fermionic terms sampled  sample used in a) "matrix completion".c) The measurement cost reduction, 1∕  , is proportional to ∕, where  is the MP2 rank estimate used in matrix completion.( and  are averaged over the 3 spin sectors).

FIG. 10 .
FIG.10.Absolute energy error from the completed 2-RDM before and after step 2 (trace normalization) of post-processing (PP).After post-processing, the two-particle energy error is within chemical accuracy (1.6 mHa).
Coherence of the MP2 model 2-RDM in the cc-pVDZ basis.In the canonical MO basis, the coherence (purple) is close to the maximal coherence  = ∕ (black circle).After random orbital rotations, the coherence diminishes significantly, approaching the minimal coherence  = 1 (black dashed line).Note that after discarding the core orbitals (see main text) Li 2 , LiH, and BeH only have a non-zero   sector.
. 8. The effect on the two-particle energy  2 of each postprocessing option as defined in Sec.II F. FIG