High Magnetic Field Stability in a Planar Graphene-NbSe2 SQUID

Thin NbSe2 retains superconductivity at a high in-plane magnetic field up to 30 T. In this work we construct a novel atomically thin, all van der Waals SQUID, in which current flows between NbSe2 contacts through two parallel graphene weak links. The 2D planar SQUID remains uniquely stable at high in-plane field, which enables tracing critical current interference patterns as a function of the field up to 4.5 T. From these we extract the evolution of the current distribution up to high fields, demonstrating sub-nanometer sensitivity to deviation of current flow from a perfect atomic plane and observing a field-driven transition in which supercurrent redistributes to a narrow channel. We further suggest a new application of the asymmetric SQUID geometry to directly probe the current density in the absence of phase information.

(see Fig. S1b). Note that there was trapped flux in the system in one of the measurements at B lab x = 0T , as evidenced by the finite compensation field. Note also that as B lab x approaches 4T, the compensation curve becomes non-linear; this is due to the automated cross-correlation algorithm gradually failing as the critical current descends below the minimal current detectable by the set voltage threshold. We present here another measurement nearly identical to the one shown in the main text in Fig. 4a, but with a wider range of B ⊥ , thus containing a few more lobes in the interference pattern. We used this data set to extract the current density in main text Fig. 4b using the maximum entropy method (as shown in supplementary section 3), since the additional lobes improve the spatial resolution of the extracted current density. However, this measurement loses stability at B ∥ = 2.5T , compared with the measurement shown in main text Fig. 4a which remains stable up to B ∥ = 4.5T . Indeed, we performed around 10 similar measurement maps with different B ⊥ ranges, and found that in general, the narrower the range of B ⊥ , the higher the values of B ∥ we could reach without incurring flux jumps. We attribute the flux jumps as arising from the penetration of vortices to the vicinity of the junction.

S3 Supplementary Section: Analytical calculation of two channel interference pattern
The Josephson effect occurs when current flows between two superconducting electrodes (in this case NbSe 2 ) connected by a weak link. The proximity of the superconductors, under the correct conditions, allows a supercurrent to flow through the junction. Upon application of magnetic field perpendicular to the junction, the superconducting order parameter ∆e iφ acquires a position-dependent phase and undergoes interference, resulting in a diffraction pattern of the critical current in magnetic field. The first Josephson relation relates the critical current density J(x) to the phase difference between the order parameters of the two superconductors A and B, γ( Here J 0 is the maximal possible critical current density at location x. The order parameter must retain a single-valued phase around any closed loop through which current may circulate, leading to the requirement: Where ϕ A is the magnetic flux through a loop connecting x 1 , x 2 and extending across the junction length d into the superconductors up to the London penetration depth λ on either side (see Fig. S3b).
For convenience we denote L ≡ 2λ + d.
Now we address our specific geometry. For this device we use NbSe 2 of thickness around 12-20 layers (≈7-13 nm). Fifteen layers are clearly seen in cross-section TEM measurement (Fig. S3c), below 3-5 nm of oxide; also from optical images, NbSe 2 thickness may vary slightly throughout the sample (Fig. S3e).
The junction, of length d=140 nm in the direction of the current flow (y axis), has an MLG weak link of width W M LG ≈ 1.45µm and an FLG weak link of width W F LG ≈ 0.45µm and thickness h=2.4 nm (8 layers), their centers separated by a distance 2δ ≈ 2.7µm. We define the x axis in the plane of the MLG/FLG flakes, perpendicular to current flow, and z perpendicular to the MLG/FLG plane. The magnetic field B ∥ we refer to as "in-plane" is oriented parallel to the mean SQUID plane: the plane connecting the center of the MLG and FLG flakes. This plane is at a small angle tan (θ) = h 2δ with respect to the x axis. The field referred to as B ⊥ is perpendicular to B ∥ . This choice in alignment of B ∥ maintains the peak of the central SQUID oscillation at B ⊥ = 0 regardless of the applied B ∥ (as described in Section S1). The applied field in x,z coordinates is as follows: The approximations are to the first order in the small parameters θ and B ⊥ /B ∥ . The corresponding wave-numbers are: We define a reference phase γ 0 = γ(x = 0), with x = 0 at the center of the junction, such that the centers of the MLG and FLG flake are both at the same distance δ from zero. Finally, the critical current is given by integrating over the current density J 0 (x)sin(γ(x)). We take care that the accumulated phase difference around a closed loop is always 0, accounting for the B x flux exiting through loops with a vertical portion formed by the step h between the MLG and FlG: The integral can be written as the imaginary part of a complex exponential, and the whole expression becomes a Fourier transform: To get a simple "zero-order" analytical expression for the critical current in our junction, we normalize the interference pattern by dividing by I C (B ⊥ = 0), and assume a constant current density in each channel, with the ratio J F J M ≡ f an unknown parameter. Normalization of I C (B ⊥ = 0) implies that J F W F +J M W M = 1, and leaves us with the following current densities expressed in terms of f, W F , W M : Thus we obtain: And the analytical expression for the interference pattern: The angle of the SQUID plane with respect to the MLG and FLG plane translates into a phase difference between the two channels. Following is a tabulation of the fit parameters W M , W F , 2δ, θ, h and their errors, extracted from a Matlab non-linear least squares fit of data shown in main text Figs.
3, 4 to the analytical model. The extracted dimensions may be compared to the measured dimensions given above. Note that there are different combinations of slightly different parameter values that can also yield a similar fit, therefore the true fit error is larger than the error bars given in the table.  In order to extract the current distribution in greater detail, we postulate an initial current density profile sampled at N discrete spatial points and subject to physical constraints, and calculate the corresponding interference pattern. We then adjust the density profile sequentially to obtain the best least-squares fit of the calculated interference pattern to the data, subject to a maximum entropy constraint. This is done via Markov chain Monte Carlo simulated annealing.
We begin by guessing a current density vector sampled at N pointsJ 0 = [J 0 (x 1 ), J 0 (x 2 )...J 0 (x N )] normalized such that n J x (x n ) = 1. Valid guesses are constrained such that non-zero current density can exist only within the MLG and FLG channels, and current reversal (negativeJ 0 ) is disallowed. We then calculate the corresponding interference pattern based on equation S9, with the perpendicular field sampled at M pointsB ⊥ = [B 1 ⊥ , B 2 ⊥ ...B M ⊥ ], B ∥ set to some value, step height h = 1 nm and junction length L = 2.2µm extracted from the analytical fit described in Section S2. Explicitly, we define the matrix element A m n = exp (i 2π ϕ0 LB m z x n ) for x n < 0, and A m n = exp (i 2π ϕ0 L(B m z x n − B m x h)) for x n > 0. We then compute: In order to quantify the fit of our guessJ 0 we calculate the least squares difference of I calc C (B m ⊥ ) with respect to the measured interference pattern I meas C (B m ⊥ ): We then sequentially adjust the fit by employing a Metropolis algorithm Markov chain Monte Carlo (MCMC) process which samples possibleJ 0 configurations and assigns them a free energy reflecting a competition between the goodness of fit χ 2 and the entropy.
Samples are correspondingly weighted with the standard Boltzmann weight e −βF . The first term in the free energy penalizes a large deviation of the fit from the measurement, the second (entropy) term penalizes non-uniformity of the postulated current distribution, and the hyper-parameter λ tunes between them. A finite "temperature" T = β −1 > 0 introduces noise to the equilibrium current distribution but allows the algorithm to consider corrections toJ 0 which result in energy loss, and thus helps to avoid converging to local minima of χ 2 . In order to find the minimum of F , we employ simulated annealing, increasing the inverse temperature from 0 to β linearly with each Monte Carlo (MC) step. We ensure thatJ 0 remains normalized by making changes in discrete units of size ∆J; a unit removed from site x i must be added to some other site x j . The steps of the algorithm are as follows: 1. Make an initial guessJ 0 obeying normalization and constraints.

Calculate
3. Choose x i , x j at random from among the sites at whichJ 0 is allowed to be non-zero. Propose is negative, return to step 3. Otherwise, continue.
5. Calculate the new free energy F new (J ′ 0 ) 6. If F new < F , accept new current density and return to step 3 updatingJ 0 →J ′ 0 . Otherwise, accept changeJ 0 →J ′ 0 with probability e −βi(Fnew−F ) . The temperature at step i is given by β i = β/K * i, where K is the total number of MC steps. 7. Return to step 3 and iterate K times.
We have freedom to change our initial guessJ 0 , and to tune the hyper-parameters λ, β, N, ∆J, K.
The x range of the simulation is defined between ±2µm, while the current in the MLG/FLG channels is bounded within the widths determined from the SEM measurement. Fourier uncertainty indicates that an interference pattern with N nodes yields a spatial resolution of W N ; that is, we can choose N evenly spaced discrete points to sample within the overall width W of the current carrying channels. However our fitting method is not an inverse Fourier transform; it introduces additional information through geometrical constraints as well as the maximum entropy constraint. Increasing the number of sampling points helps better fit the width and separation of the channels, while the maximum entropy constraint smooths any sharp spatial features. Thus, we choose N = 50, a few times larger than the bandwidth.
We use a uniform initial distributionJ 0 = 1 N * , with N * being the number of sample points allowed to carry current.
Supplementary Figure S4 At zero magnetic field and maximal charge carrier density (V G =-1V) we have measured a stable, nearly symmetric interference pattern with many lobes (see main text Fig. 3). We use this pattern to tune all of the hyper-parameters of the fitting algorithm. To tune the parameters β, ∆J we set λ = 0 and try several initial guesses for β, ∆J in powers of 10 before settling on β = 10000, ∆J = 0.001 to obtain a convergence of F in a few thousand MC steps (see S4a). We note that any 10 3 < β and ∆J < 10 −2 would work as well. We then choose K = 5000, several thousand MC steps after convergence. Finally, we tune λ by studying the L curve (see S4b), and choosing λ = 2.4 which provides a trade-off between decrease in goodness of fit and increase in entropy. Any λ in the vicinity (up to ≈ 6) gives a qualitatively similar current distribution, with smaller λ generating a noisier distribution and larger λ providing a poorer fit. As a sanity check, the spatial features of the current distribution generated by the fit do not have significant features on a length scale finer than that offered by the bandwidth of the original signal, as can be seen in Figs. 3, 4 of the main text. These parameters, chosen once based on the best data set, were then used to fit all of our measured data. See for example measured data of I C vs. B ⊥ and gate voltage (S4c) and I C vs. B ⊥ ,B ∥ ) (S4e), and compare to the fit in panels d,f. As the MLG current channel grows narrow at high B ∥ (see main text figure 4), the pre-defined geometric constraint of the width of the MLG flake provides more freedom, leading to reduction in the quality of the maximum entropy fit.
We note that there can in principle be multiple current distributions which give a comparable fit to the interference pattern, due to the loss of phase information. However, fitting three different experimental repetitions of the B ∥ measurement, changing the fitting hyper-parameters, changing initial conditions, etc. in the above described fitting procedure always results in similar current distributions if we use λ in the vicinity of the optimal λ determined by the L curve.

S5 Supplementary Section: Current density extraction using the Wiener Khinchin theorem
The Wiener Khinchin theorem states that the energy spectral density of a function and its autocorrelation C(l) are Fourier transform pairs. In our case, |I C (B ⊥ )| 2 is the energy spectral density of J 0 (x), and thus: Consider a two channel device, with the current density in the first channel very sharp and narrow, approximated by the Dirac delta function located at x = a, and the current density in the second channel given by some function F of finite width W centered at x = −a. The current density is thus: The autocorrelation of the current density in this case is: δ(x−a)δ(x−a+l)+F (x+a)F (x+a+l)+F (x+a)δ(x−a+l)+F (x+a+l)δ(x−a)dx (S14) The first two terms give some function centered at l = 0. If we assume W<a, this function extends no further than l = ±W . The second two terms give F (2a ± l): this is the current density of the second channel, centered at shifts equal to the distance between the two channels, l = ±2a (and mirrored with respect to l around l = 2a). In our case the FLG is only a few times narrower than the MLG, and carries a similar current density. In this instance, the autocorrelation convolves the FLG and MLG densities, resulting in a feature which qualitatively resembles the MLG current density "smeared" at the scale of the FLG width, and centered at l = −2δ equal to the distance between the centers of the two channels.
Note that the calculated auto-correlation functions plotted in figures 3,4 in the main text are smoothed as a result of zero-padding before calculating the discrete Fourier transform of |I C (B ⊥ )| 2 .