Polarization-encoded co-localization microscopy at cryogenic temperatures

Super-resolution localization microscopy is based on determining the positions of individual fluorescent markers in a sample. The major challenge in reaching an ever higher localization precision lies in the limited number of collected photons from single emitters. To tackle this issue, it has been shown that one can exploit the increased photostability at low temperatures, reaching localization precisions in the sub-nanometer range. Another crucial ingredient of single-molecule super-resolution imaging is the ability to activate individual emitters within a diffraction-limited spot. Here, we report on the photoblinking behavior of organic dyes at low temperature and elaborate on the limitations of this ubiquitous phenomenon for selecting single molecules. We then show that recording the emission polarization not only provides access to the molecular orientation, but also facilitates the assignment of photons to individual blinking molecules. Furthermore, we employ periodic modulation of the excitation polarization as a robust method to effectively switch fluorophores. We benchmark each approach by resolving two emitters on different DNA origami structures.


Introduction
With the advent of super-resolution methods, optical microscopy has provided fascinating new insights into the sub-cellular domain and has become an indispensable tool in elucidating the structure and function of biological systems at the nanoscale. 1 The high specificity and spatial resolution of fluorescence imaging have the potential to deliver further information on the molecular architecture of proteins and their complexes even in a native environment, e.g. membrane proteins or protein aggregates implicated in diseases. Recently, it has been recognized that super-resolution microscopy performed at cryogenic temperatures can be of great value. [2][3][4][5][6][7][8][9][10][11] The main advantage of this approach stems from the fact that photochemistry is considerably slowed down at low temperatures. As a result, each fluorophore can emit more than two orders of magnitude more photons than at room temperature before it photobleaches. This translates into a higher localization precision and, thus, better resolution in co-localization of several fluorophores. Another important benefit of cryogenic light microscopy is its potential for combination with cryogenic electron microscopy and correlative microscopy. 9-13 While cryogenic super-resolution microscopy in organic crystals predates conventional super-resolution microscopy by about a decade, 14,15 its use in biologically relevant applications has become a subject of research only recently. 2,[4][5][6][7] The best resolution in biological super-resolution microscopy has been reported by Cryogenic Optical Localization in three Dimensions (COLD), reaching Angstrom resolution for up to four fluorophores on a single protein. 2 In that work, cases of exceptionally slow blinking were used to identify brightness levels of the individual emitters and their combinations on a single protein.
However, this strategy limits the yield of the experimental procedure because, as we discuss in this work, most molecules show faster photophysics. To understand and tame this difficulty, we have performed more detailed photophysics studies at liquid helium temperature. Furthermore, we have exploited the polarization degree of freedom associated with the dipole moments of the fluorophores as a resource for separating their signals and discuss its influence on localization accuracy. 16,17

Blinking of red fluorescent organic dyes at cryogenic temperatures
Naturally occurring stochastic blinking of fluorophores is a ubiquitous phenomenon with different physical origins, e.g. intersystem crossing to triplet states, charge trapping, conformational changes or transient binding. [18][19][20][21] Blinking offers a convenient universal scheme for nanoscopic studies that involve a handful of molecules within a range of a few nanometers, but the different time scales, spanning from microseconds to minutes, can make the distinction of a large number of emitters in a diffraction-limited spot extremely difficult.
In the simplest approach, one records videos from a field-of-view of about 1000 µm² and examines the time trace from each diffraction-limited spot. 2 In the ideal case one obtains 2^N intensity levels corresponding to N active fluorophores such that it is possible to find frames where only one fluorophore remains on and can thus be localized. By repeating this procedure for videos as long as tens of minutes or hours, one gathers sufficient data to reach sub-nanometer localization precision for each fluorophore and, hence, resolve their relative positions. For this procedure to work, it is important to know the switching rates of the fluorophore under the specific experimental conditions. While the most commonly used organic dyes have been well characterized at room temperature, 22 there is still little information on the photophysics of fluorescent labels at low temperatures. A quantitative understanding of this topic requires a thorough study of many parameters regarding the fluorophore and its environment and is beyond the scope of our work. Nevertheless, we attempt to present a flavor of the phenomena at hand for the three common red-fluorescent dyes Alexa Fluor 647, Cy5 and ATTO647N in a poly-vinyl alcohol (PVA) host matrix. We perform single-photon counting to obtain the time traces of single molecules and extract the duration of on- and off-periods. As shown in Figure 1a, application of a threshold at 3 standard deviations from the mean background photon count of the brightness histogram allows us to flag an event as on or off. We verified that small variations of the threshold did not change the obtained on- and off-times significantly and also found good agreement with time constants computed from the analysis of the autocorrelation function.

Figure 1: a) Time trace of a single Alexa Fluor 647 molecule in poly-vinyl alcohol, binned in 1 ms intervals. A zoom-in shows short emission bursts followed by longer dark periods. The red line indicates the threshold used to compute on- and off-times. The right plot shows the intensity histogram of the signal bursts. b) Histogram of the on-times of the trace shown in a) with an exponential fit. c) Histogram of off-times for the same trace. Here, the statistics are best described by a bi-exponential function with a short and a long component. d) Summary of long off-times for the different dye species at different excitation intensities. e) Summary of the corresponding on-times.
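The thresholding step described above can be sketched in a few lines. This is a minimal illustration and not the analysis code used for Figure 1: the function name `dwell_times`, its arguments, and the choice to discard the first and last (truncated) periods are assumptions made here for clarity.

```python
import numpy as np

def dwell_times(counts, background, bin_width=1e-3, n_sigma=3.0):
    """Split a binned photon-count trace into on- and off-periods.

    counts:      photon counts per time bin
    background:  counts per bin from a stretch known to be dark
    bin_width:   bin duration in seconds (1 ms in the text)
    n_sigma:     threshold distance from the background mean, in standard deviations
    """
    counts = np.asarray(counts)
    threshold = np.mean(background) + n_sigma * np.std(background)
    is_on = counts > threshold

    # Indices where the state flips mark the borders between periods.
    change = np.flatnonzero(np.diff(is_on.astype(int))) + 1
    borders = np.concatenate(([0], change, [is_on.size]))
    durations = np.diff(borders) * bin_width
    states = is_on[borders[:-1]]

    # Drop the first and last periods: the recording window cuts them short.
    durations, states = durations[1:-1], states[1:-1]
    return durations[states], durations[~states], threshold
```

For exponentially distributed on-times, the sample mean of the returned on-durations is the maximum-likelihood estimate of the time constant, although the finite bin width biases it for events shorter than a few bins.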
In Figure 1b,c we find that in the case of Alexa Fluor 647 the duration of on-times follows an exponential distribution whereas off-times are best described by a bi-exponential function with a short and a long time constant. Figure 1d shows the (long) off-times for three different excitation intensities, revealing little dependence over the investigated range. The on-times, however, decrease down to 5 ms at elevated excitation intensities (exceeding 1 kW/cm²) as shown in Figure 1e, a phenomenon which is exploited in a typical (d)STORM situation and can even be chemically engineered. 23 As a rule of thumb, the larger the off-on ratio, the higher the probability to localize individual emitters from a set of many within a diffraction-limited spot. However, considerations such as the relation between the integration time and the on- and off-times should also be taken into account. 24 The three dyes investigated here show very similar transition rates and brightnesses. We point out that the characteristic exponential blinking kinetics are sometimes interrupted by long emission bursts or long dark periods in all cases, possibly indicating reversible changes of the triplet state lifetime 25 or the molecular configuration. 26 This behavior is in line with a similar conclusion found for ATTO647N at room temperature, where the blinking statistics were shown to depend on the environment 19,20 with charge transfer, triplet states and radical ion states as primary sources.
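The rule of thumb about the off-on ratio can be made concrete with a simple estimate. The sketch below assumes N independent, identical two-state emitters in steady state, characterized only by their mean on- and off-times; it ignores the bi-exponential off-time statistics and the finite camera integration time. The function name and the off-times in the example are hypothetical.

```python
def single_emitter_odds(tau_on, tau_off, n_emitters):
    """Return (P[exactly one emitter on], fraction of bright instants with a single emitter)."""
    p_on = tau_on / (tau_on + tau_off)            # duty cycle of one emitter
    p_none = (1 - p_on) ** n_emitters             # all emitters dark
    p_one = n_emitters * p_on * (1 - p_on) ** (n_emitters - 1)
    purity = p_one / (1 - p_none)                 # single-emitter share of all bright instants
    return p_one, purity

# Illustrative numbers: 5 ms on-time as in the text, hypothetical mean off-times.
p_one_short, purity_short = single_emitter_odds(5.0, 45.0, 2)   # off-on ratio 9
p_one_long, purity_long = single_emitter_odds(5.0, 95.0, 2)     # off-on ratio 19
```

In this picture a larger off-on ratio raises the purity, i.e. the share of bright instants that stem from a single emitter, at the price of fewer bright instants overall.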
Next, we briefly present the blinking behaviors of two Alexa Fluor 647 dyes placed on an origami nanoruler. As displayed in Figure 2a, one can clearly identify two brightness levels at a temporal resolution of 1 ms where the blinking events are sufficiently oversampled.
The brightness histogram of the same trace shows a continuum without any distinct levels if binned to 10 ms, even though the blinking of individual fluorophores is still temporally resolved. This study emphasizes that frame acquisition slightly faster than the on-time is required for assigning an intensity level to a given molecule, limiting the performance of this method.

Selection via the emission polarization
To alleviate the difficulty of identifying two fluorophores based on their fluctuating brightness levels, we now exploit the polarization degree of freedom. 27,28 The emission dipole moment of a dye molecule is usually well defined with respect to its backbone such that the polarization of the radiated field can directly report on its orientation. While in room-temperature aqueous environments the fluorescent label is free to rotate about its linker, the orientation of an emitter is typically fixed at cryogenic temperatures. Hence, the emission dipole orientation θ in the image plane can be determined from a measurement of the emission intensities I_{x,y} projected along two orthogonal lateral axes x and y according to

$$\theta = \arctan\sqrt{I_y / I_x}. \tag{1}$$

Given that the orientations of several individual molecules are independent, one can distinguish their signals if their polarization angles are sufficiently spaced. To implement this idea, we separated the two polarizations along the x and y directions with a polarizing beam splitter and directed them to two separate synchronized cameras (see inset in Figure 3a). Figure 3b shows an exemplary time trace of the polarizations extracted by applying Eq. 1 to the signals I_{x,y}. Although both label molecules might be blinking during one frame, the extreme signals marked by the blue and red circles clearly point to situations where only one molecule was on. We only use the data from these frames for localization to avoid any overlap. Figure 3c presents the same data as a histogram. We see that in contrast to the brightness histogram (see Figure 2b), the dipole angle follows a symmetrical distribution that clearly identifies two distinct polarizations, greatly facilitating the assignment of the signals to individual molecules, indicated by the red and blue portions. The correspondingly color-coded spots in Figure 3a display the registered localizations from the two fluorophores of the nanorulers.
A zoom into one of the spots shows a strong overlap of the localized positions, which after averaging yields two spots separated by 7.5 nm. In this example, the two point-spread functions (PSFs) could be clearly separated because the dipole orientations of the two molecules on a DNA origami were sufficiently different.
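To illustrate why the extreme polarization values single out frames with only one active molecule, the following sketch computes the apparent in-plane angle from the two channel intensities. It assumes the relation θ = arctan √(I_y/I_x), which is what an ideal linear dipole projected onto two orthogonal analyzers would give, and it assumes that the emission of two molecules adds incoherently. The function names are hypothetical.

```python
import numpy as np

def polarization_angle_deg(i_x, i_y):
    """In-plane angle (0 to 90 degrees) from the intensities in the x and y channels."""
    i_x = np.clip(np.asarray(i_x, dtype=float), 0.0, None)
    i_y = np.clip(np.asarray(i_y, dtype=float), 0.0, None)
    # arctan2 copes with i_x == 0 (dipole along y) without dividing by zero.
    return np.degrees(np.arctan2(np.sqrt(i_y), np.sqrt(i_x)))

def mixed_angle_deg(theta_deg, powers):
    """Apparent angle when several dipoles with the given powers emit in the same frame."""
    theta = np.radians(np.asarray(theta_deg, dtype=float))
    powers = np.asarray(powers, dtype=float)
    i_x = np.sum(powers * np.cos(theta) ** 2)
    i_y = np.sum(powers * np.sin(theta) ** 2)
    return polarization_angle_deg(i_x, i_y)

only_first = mixed_angle_deg([20.0, 70.0], [1.0, 0.0])   # first molecule alone
both_on = mixed_angle_deg([20.0, 70.0], [1.0, 1.0])      # both molecules emit
```

Whenever both molecules contribute, the apparent angle falls between the two single-molecule values, so frames at the outer edges of the polarization histogram are the ones most likely to contain a single emitter.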
To state a statistically meaningful distance between the labeling sites of a nanostructure, we examine a large number of particles. Figure 4a,c display the distribution of the separations deduced from measurements on origami samples where two Alexa Fluor 647 molecules were placed at a design separation of 22.8 nm (Tilibit) and two ATTO647N molecules were placed at a design separation of 6.5 nm (GATTAQuant), respectively. We now discuss the various effects that determine the shapes of these distributions.
We point out that our samples have a linear architecture so that the two-dimensional projection of two PSFs should always report distances equal to or smaller than the design distance. Furthermore, the PSF of a molecule depends on the orientation of its dipole moment. 16,17 Unless the dipole fully lies in the lateral or the axial planes, its PSF is skewed, leading to systematic errors in localization and thus an apparent shift of the molecular center of mass (see Figure S2). Thus, to account for the distribution of the occurrence frequencies in the histograms, one has to consider the localization uncertainty, which can be estimated as the quadratic sum of independent contributions. These include a statistical localization error σ_loc shown in Figure 4b,d, an image registration error σ_reg (see Figure S3), a residual sample drift σ_dft and an average error σ_dip due to the fixed dipole orientation (see Figure S2), yielding

$$\sigma = \sqrt{\sigma_\mathrm{loc}^2 + \sigma_\mathrm{reg}^2 + \sigma_\mathrm{dft}^2 + \sigma_\mathrm{dip}^2}.$$

Considering the estimated error, we can now fit the localization distributions using a bivariate normal distribution with non-zero mean, also called Rician distribution. 29,30 In this model, if the separation d between the two molecules is much larger than the localization un- 14.0 ± 0.8 nm. In Figure 4e, we plot the dependence of the extracted distance from data fits on the input value of σ, verifying that the distance assignment is very sensitive to σ for the smaller nanoruler. 30 The shape of the distance distributions using an extended model accounting for the 3D orientation of origamis (see Figure S4) indicates that the origami structures mostly lie parallel to the surface. Nevertheless, the dipole moments of the individual molecules could be arbitrarily oriented in space. The radiation of an axial dipole moment placed at a dielectric interface is emitted into larger angles, leading to a doughnut-shaped PSF. 16 It follows that the fluorescence of such a molecule is less efficiently excited and collected by an air objective and the molecule appears less bright. A mixture of in-plane and out-of-plane dipole components leads to an asymmetric PSF, thus introducing a systematic localization error if the PSF is simply fitted by a Gaussian function. However, the PSFs of two dipoles with the same orientations are shifted synchronously, leaving their center-to-center separation almost unaffected even if the fit function is not ideally adapted. Figure 4f shows that, indeed, by selecting bright dipoles and small relative angles between the two fluorophores of the data presented in Figure 4b, we arrive at narrower and more symmetric distributions. We remark that the complication caused by the 3D orientation of the dipole moment could be addressed more rigorously by direct measurement of the complete orientation 8 or by filtering the azimuthal contributions of the PSF with a phase mask. 31 We also remark that the success of polarization selection comes at the cost of a lower signal in each channel since we have to split the emission from single molecules. To maintain a good signal-to-noise ratio, we placed a mirror at the substrate surface in order to also capture the light that is emitted away from the microscope objective. 32,33 Besides enhancing the excitation and collection efficiencies (see Figure S1), the mirror also eliminates autofluorescence of the glass substrates, which would introduce a considerable background. However, it also affects the PSF and therefore the distance measurement (see Figure S2).

Switching molecules via the excitation polarization
We now excite a nanoruler carrying two dye molecules with absorption dipole moments along θ_1 and θ_2, respectively (see Figure 5b). As the excitation polarization angle α is rotated, the total brightness of the detected light can be expressed as a superposition of the components P_{1,2} originating from the two fluorophores along the unit vectors e_1 and e_2 which signify the directions of their dipole moments, respectively. We note that although the absorption and emission dipoles of organic dyes are generally not aligned, 37 we assume this to be the case for the sake of simplicity here. onto two regions of a camera to analyze the emission polarization. The remaining light was projected to a second synchronized camera to perform localization (see Figure 5c). With this imaging scheme, localization and emission polarization can be measured independently while each individual localization can be assigned to a polarization state. An important advantage of this approach is that it is largely independent of the blinking dynamics. The excitation polarization can be rotated very slowly as long as the fluorophores do not photobleach during one 180° rotation. This allows for long integration times and an increased signal-to-noise ratio per camera frame. Another advantage of the scheme is that localization can be performed without the need for image registration.
In Figure 5d-g, we present an example of a nanoruler carrying two dyes. Here, each frame was recorded for 3 s, resulting in a single-molecule localization precision of a few nanometers per frame. Three frames were then averaged for each excitation polarization angle, which was incremented in steps of 2°. Figure 5d shows the total brightness recorded from the two molecules on a nanoruler. While a periodic modulation is evident, the angles θ_1 and θ_2 are not easily identifiable. However, if we exploit the information about the emission polarization, i.e. our knowledge of the vectors e_1 and e_2, we can assign a polarization to the detected fluorescence as illustrated in Figure 5b. Figure 5e shows that, indeed, the angle attributed to the total emission is confined between two extrema. In the special case that the two absorption dipole moments are perpendicular to each other and lie in the substrate plane, the vertical axis in Figure 5e would cover the full range of 0-90°. We note that comparing Figure 5d with Figure 5e, we also find a clear correlation between brightness and emission polarization. Having identified the conditions where only one fluorophore is on, we can now localize it on the second camera.
The camera images of the cases where both molecules contribute also remain useful because they help obtain a robust fit to the vectorial model illustrated in Figure 5b. In particular, we expect the PSF of such intermediary states to wander between two extreme positions. Indeed, the x- and y-displacements of the recorded PSF shown in Figure 5f reveal that the center-of-mass of the fluorescence spot moves back and forth between two locations as the excitation polarization rotates. By analyzing these data, we could determine the distance between the two fluorophores to be 21.1 ± 2.0 nm, in agreement with the previous result (see Figure 5g).
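The wandering of the fluorescence spot can be illustrated with a simple forward model and a linear fit. This is a sketch under stated assumptions and not the fitting routine behind Figure 5: each molecule is taken to be excited with a cos² dependence on the angle between the excitation polarization and its dipole, the recorded centroid is taken to be the brightness-weighted mean of the two positions (reasonable only for separations far below the PSF width), and dipole-induced PSF distortions are ignored. The function names, dipole angles, amplitudes and positions are hypothetical.

```python
import numpy as np

def excitation_weights(alpha_deg, theta_deg, amplitudes):
    """Relative brightness of each emitter for every excitation polarization angle."""
    alpha = np.radians(np.asarray(alpha_deg, dtype=float))[:, None]
    theta = np.radians(np.asarray(theta_deg, dtype=float))[None, :]
    power = np.asarray(amplitudes, dtype=float)[None, :] * np.cos(alpha - theta) ** 2
    return power / power.sum(axis=1, keepdims=True)   # shape (n_angles, n_emitters)

def fit_positions(weights, centroids):
    """Least-squares emitter positions from centroid = weights @ positions."""
    positions, *_ = np.linalg.lstsq(weights, centroids, rcond=None)
    return positions                                   # shape (n_emitters, 2)

# Noise-free toy example with made-up values (positions in nm).
alpha = np.arange(0, 180, 2)                           # 2 degree steps over one period
weights = excitation_weights(alpha, theta_deg=[20.0, 75.0], amplitudes=[1.0, 0.8])
true_positions = np.array([[0.0, 0.0], [21.0, 0.0]])
centroids = weights @ true_positions                   # spot position in each frame
fitted = fit_positions(weights, centroids)
distance = np.linalg.norm(fitted[0] - fitted[1])
```

In an experiment the weights would come from the measured emission polarization rather than from the model, and the frames with intermediate weights are what constrain the fit between the two extreme positions.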
The controlled switching of single molecules simplifies the experimental work because we can use long camera integration times and work with much lower light levels than required for stochastic switching. Furthermore, polarization measurement and localization are now separated, eliminating the errors associated with image registration between two cameras.
The only requirement for this method is linearly polarized absorption and emission dipole moments of the fluorescent labels, regardless of their other photophysical properties such as blinking.

Conclusions
Cryogenic optical localization has reached nanometer resolution, making it a valuable tool for structural biology and other applications in physics and materials science, e.g. localizing defects and color centers in 2D materials. We have shown that including the polarization degree of freedom in the localization analysis allows for more robust assignment of fluorescence photons to individual emitters. Aside from boosting the localization accuracy, this approach also provides direct access to molecular orientations, which can be useful in the context of studying agglomeration or oligomerization of proteins. [38][39][40][41] Furthermore, we have shown that polarization can be used to achieve controllable switching, which is largely independent of the stochastic blinking, works with lower light levels and allows for longer camera integration times. The implementations of the ideas in our work are straightforward and not restricted to specific photophysical properties. We have shown the potential of this technique for imaging nanostructures containing two molecules. Future efforts will tackle problems with many fluorophores, where the current work could also be combined with SOFI 13 or sparsity-enhancing algorithms. 34

Acknowledgement
We are grateful to the Max Planck Society for financial support. We acknowledge Tobias Utikal for assistance with cryogenic experiments and Alexander Gumann for preparing mirror-enhanced substrates.

Supporting Information Available
The following files are available free of charge.

Sample preparation
Data acquisition
Image analysis
Estimation of distance and uncertainty

Cryogenic microscope
All measurements were performed in a cryogenic microscope where the sample is kept at liquid helium temperature. Fluorescence is excited in wide-field configuration with circularly polarized light at 640 nm from a diode laser (iBeamSmart, Toptica) and collected through a long-working-distance objective with a numerical aperture of 0.9 (MPLAN 100x, Mitutoyo) mounted in vacuum. The laser light is filtered with a 647 nm dichroic mirror (RazorEdge, Semrock) and further suppressed with 650 nm long-pass filters (Thorlabs). In order to determine the switching rates of the stochastic on-off blinking we use single-photon counting with a 30:70 beam-splitter dividing the signal between an EMCCD camera (iXon, Andor) and an avalanche photodiode (Lasercomponents). For super-resolution imaging a polarizing beam-splitter (Thorlabs) is used together with a second EMCCD camera, and acquisition of frames is synchronized via common trigger pulses. Polarization modulation nanoscopy was performed in a slightly modified optical setup. The unmodified point-spread function is extracted from one camera receiving 70% of all photons. The weights in each frame are determined from the remaining 30% of photons projected into a polarization-resolved channel on two halves of a second camera chip using a Wollaston prism. We use custom-designed dielectric mirrors (Laseroptik) and low angles of incidence on other components to minimize polarization-dependent phase shifts throughout the imaging optics. This turned out to be sufficient for our co-localization analysis. More quantitative studies of molecular orientation may require taking the distortion of polarization in high-NA collection into account. S1,S2

Mirror enhancement
Although the cryogenic use of microscope objectives with high numerical aperture has been demonstrated, S3-S5 their operation imposes some restrictions and, thus, most cryogenic microscopes still use air objectives. This restricts the available signal-to-noise ratio as a consequence of having to image through vacuum. Since the emission at an interface occurs predominantly into the medium of higher refractive index, the collection efficiency is generally low. For a dipole close to a cover glass/air interface only 14% of the radiated power falls within the solid angle of a 0.9 NA objective. In order to mitigate these losses we employ a simple antenna design. Specifically, we apply a 100 nm gold layer on top of a silicon substrate to reflect emitted fluorescence towards the objective. In order to avoid quenching we also apply an 80 nm aluminum oxide spacer. The presence of this mirror leads to a threefold enhancement of both the excitation intensity and the power radiated into the upper half-space (see Figure S1). However, it also causes a modification of the far-field emission pattern. To investigate the consequences of this effect, we performed simulations for dipoles placed at different distances. Figure S2a displays the systematic localization error for a single molecule of arbitrary orientation angle with respect to the substrate plane and its distance to the mirror. Figure S2b shows that co-localization of two emitters at a nominal location of 23 nm leads to a broadening of the distribution of separations if the two emitters have random azimuthal orientations. We also found more frequent outliers at smaller and larger distances.

Sample preparation
whereas the 22.8 nm ruler is based on a box-shaped design. In both cases we took care to keep origamis in the right buffer conditions to avoid bending or unfolding.

Data acquisition
Before the start of a new measurement sequence we let the experiment settle for 2 h after cool-down to reduce sample drift. We then take 100,000 frames at 70 Hz for every field-of-view (FOV). The excitation laser intensity was set to 1 kW/cm² to generate the desired low on-off ratio as described in Figure 1. For polarization modulation measurements the excitation intensity was reduced to 0.05 kW/cm² to avoid saturation and transitions to long-lived dark states. The input polarization is rotated in steps of 2° by a motorized linear polarizer, with three frames of 3 s each taken at each position. We typically image 3 complete cycles of the modulation to ensure a robust fit to our model.
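As a rough guide to the recording times implied by these settings, the arithmetic can be written out as follows. The estimate assumes that each of the three frames per polarizer position is exposed for 3 s, that one modulation cycle corresponds to a 180° rotation of the polarizer, and that readout and polarizer motion add no overhead; none of these details are stated explicitly for the acquisition protocol.

```python
# Stochastic-blinking measurement: 100,000 frames at 70 Hz.
frames, frame_rate_hz = 100_000, 70
blinking_minutes = frames / frame_rate_hz / 60

# Polarization-modulation measurement (assumed: 3 s per frame, 180 degrees per cycle).
step_deg, frames_per_step, exposure_s, cycles = 2, 3, 3, 3
steps_per_cycle = 180 // step_deg
cycle_minutes = steps_per_cycle * frames_per_step * exposure_s / 60
modulation_minutes = cycles * cycle_minutes
```

Under these assumptions one field-of-view takes about 24 minutes in the blinking measurement, while a modulation measurement takes 13.5 minutes per cycle and about 40 minutes for three cycles.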

Image analysis
A 2D median filter with a kernel size of 5 µm is applied to suppress background fluorescence we obtain the histogram shown in Figure S3. After drift correction and image registration, localizations are assigned to a fluorophore according to the estimated polarization. To this end, we determine the two peak positions in the polarization histogram and pick frames where the value is smaller or larger than the peak value. To further limit false assignments due to a small residual overlap we require a minimum separation of the peak values of 15°.
As indicated in Figure 4 of the main manuscript, we require a localization precision better than 1 nm for both fluorophores. Data from polarization modulation measurements were analyzed in a similar fashion. However, here we only require a coarse image registration on the pixel scale to identify the same PSF in the two polarization channels as well as on the localization camera. For statistical analysis of distances we only kept data points with an uninterrupted trace.
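The frame-selection rule described above can be sketched as follows. The sketch takes the two peak positions of the polarization histogram as given, since the peak-finding step is not specified in detail; the function name and the label convention (0 for rejected frames, 1 and 2 for the two fluorophores) are choices made here for illustration.

```python
import numpy as np

def assign_frames(angles_deg, peak_a, peak_b, min_separation=15.0):
    """Label frames by polarization angle: 1 below the lower peak, 2 above the upper peak."""
    low, high = sorted((peak_a, peak_b))
    if high - low < min_separation:
        return None                      # peaks too close, particle is discarded
    angles = np.asarray(angles_deg, dtype=float)
    labels = np.zeros(angles.size, dtype=int)
    labels[angles < low] = 1             # attributed to the fluorophore with the smaller angle
    labels[angles > high] = 2            # attributed to the fluorophore with the larger angle
    return labels

labels = assign_frames([10.0, 25.0, 40.0, 55.0, 70.0], peak_a=20.0, peak_b=60.0)
rejected = assign_frames([10.0, 70.0], peak_a=30.0, peak_b=40.0)
```

Keeping only the outer flanks of the two peaks trades photons for purity: frames between the peaks may contain emission from both molecules and are left out of the localization.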

Estimation of distance and uncertainty
In order to estimate the distance between the localized sites in an ensemble of particles one has to keep several considerations in mind. First, our technique relies on averaging many particles and therefore assumes a homogeneous ensemble. Second, the distance distribution is generally asymmetric. S7 Third, depending on the sample preparation one might have to consider the 3D orientation of the particles. In Figure S4 we show the theoretical distance distributions of a model that allows for inclination of a nanoruler outside the image plane. S8 One can see clearly that a 20 nm ruler with arbitrary 3D orientation shows distributions distinct from our experimental data. For completeness, we also show the distributions of a 6 nm ruler.

In the uncertainty of the distance measurements we also considered several contributions. First, we have to take the calibration of the pixel size of our cameras with an uncertainty of 1 nm into account. Next, we included an estimation error of the model parameter σ on the fitted distance as shown in Figure 4e. Here, the largest contribution to the uncertainty is due to the dipole error. Furthermore, the influence of false positive detection events through occasional failures of the analysis routine was excluded by performing manual particle picking. These statistically independent contributions add to give a final distance uncertainty of about 2 nm in our experiment. Lastly, it should be pointed out that DNA origamis can exhibit significant structural heterogeneity, leading to an additional uncertainty contribution that is difficult to characterize but may depend on the buffer environment S6 and even exceed 1 nm.

Figure S4: Extended model of the distance distribution between two fluorophores. The Rician distribution describes the effect of finite localization uncertainty while its convolution with a cos α function also accounts for arbitrary 3D orientation. Tilting of a nanoruler with respect to the optical axis leads to a shorter apparent distance between the fluorophores in the image plane. a) For a true distance of 20 nm we compare the expected distributions for arbitrary 3D orientation at two different localization uncertainties (red and blue) with only flat-lying nanorulers (gray line). As the experimental distributions in Figure 4 do not show the features of arbitrary orientation we infer that the nanorulers lie mostly parallel to the surface. b) The same comparison for a true distance of 6 nm.
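The two distance models compared in Figure S4 can be sketched numerically. The code below is an illustration under assumptions and not the fit routine used for Figure 4: `sigma` is taken as the standard deviation of each Cartesian component of the measured separation vector, and the tilted model assumes isotropically oriented rulers, for which the sine of the tilt out of the image plane is uniformly distributed. The function names and the example values are hypothetical.

```python
import numpy as np
from scipy.special import i0e

def rician_pdf(r, d, sigma):
    """Density of the measured 2D separation r for a true in-plane distance d."""
    r = np.asarray(r, dtype=float)
    z = r * d / sigma**2
    # i0e(z) = exp(-z) * I0(z) for z >= 0; merging the exponentials avoids overflow.
    return (r / sigma**2) * np.exp(-((r - d) ** 2) / (2 * sigma**2)) * i0e(z)

def tilted_rician_pdf(r, d, sigma, n_angles=200):
    """Rician density averaged over isotropic 3D orientations of a ruler of length d."""
    u = (np.arange(n_angles) + 0.5) / n_angles     # sin(tilt), uniform for isotropic rulers
    projected = d * np.sqrt(1.0 - u**2)            # apparent in-plane distance
    r = np.atleast_1d(np.asarray(r, dtype=float))
    return rician_pdf(r[:, None], projected[None, :], sigma).mean(axis=1)

# Example with made-up values (nm): flat-lying versus randomly tilted 20 nm rulers.
r = np.linspace(0.0, 80.0, 8001)
flat = rician_pdf(r, d=20.0, sigma=3.0)
tilted = tilted_rician_pdf(r, d=20.0, sigma=3.0)
```

The flat-lying model peaks slightly above the true distance, whereas the orientation-averaged model acquires a tail towards short separations, which is the qualitative difference used to argue that the nanorulers lie mostly parallel to the surface.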