Combining Graphical and Analytical Methods with Molecular Simulations To Analyze Time-Resolved FRET Measurements of Labeled Macromolecules Accurately

Förster resonance energy transfer (FRET) measurements from a donor, D, to an acceptor, A, fluorophore are frequently used in vitro and in live cells to reveal information on the structure and dynamics of DA labeled macromolecules. Accurate descriptions of FRET measurements by molecular models are complicated because the fluorophores are usually coupled to the macromolecule via flexible long linkers allowing for diffusional exchange between multiple states with different fluorescence properties caused by distinct environmental quenching, dye mobilities, and variable DA distances. It is often assumed for the analysis of fluorescence intensity decays that DA distances and D quenching are uncorrelated (homogeneous quenching by FRET) and that the exchange between distinct fluorophore states is slow (quasistatic). This allows us to introduce the FRET-induced donor decay, εD(t), a function solely depending on the species fraction distribution of the rate constants of energy transfer by FRET, for a convenient joint analysis of fluorescence decays of FRET and reference samples by integrated graphical and analytical procedures. Additionally, we developed a simulation toolkit to model dye diffusion, fluorescence quenching by the protein surface, and FRET. A benchmark study with simulated fluorescence decays of 500 protein structures demonstrates that the quasistatic homogeneous model works very well and recovers for single conformations the average DA distances with an accuracy of < 2%. For more complex cases, where proteins adopt multiple conformations with significantly different dye environments (heterogeneous case), we introduce a general analysis framework and evaluate its power in resolving heterogeneities in DA distances. 
The developed fast simulation methods, relying on Brownian dynamics of a coarse-grained dye in its sterically accessible volume, allow us to incorporate structural information in the decay analysis for heterogeneous cases by relating dye states with protein conformations to pave the way for fluorescence and FRET-based dynamic structural biology. Finally, we present theories and simulations to assess the accuracy and precision of steady-state and time-resolved FRET measurements in resolving DA distances on the single-molecule and ensemble level and provide a rigorous framework for estimating approximation, systematic, and statistical errors.

Table S1. Fluorescence parameters of the fluorescent dye Alexa647 coupled to various proteins.
(a) Different variants were labeled with Alexa647 C2 maleimide (order number: A20347). The naming scheme highlights introduced mutations and potential labeling positions by the original amino acids, sequence numbers and the introduced mutations. In PSD-95, hGBP1, HIV-RT, LiF and T4L, native amino acids were replaced by cysteines. These cysteines were labeled using maleimide chemistry. In PSD-95 and p27, two cysteines were present; thus, Alexa647 is distributed among two potential labeling sites.
(b) The fluorescence lifetimes were determined by fitting a multiexponential relaxation model.
(c) The species-weighted lifetimes were calculated using the fitted species fractions x(i) and lifetimes τ(i).
(d) The residual anisotropies r∞ were determined by the offset of the time-resolved anisotropy decays r(t).

Columns: Protein; Variant; Fluorescence lifetime distribution (b): x(1), τ(1)/ns; Species-weighted lifetime (c)

[...] Alexa488 hydroxylamine (order number: A30632). The naming scheme highlights the labeling position by the original amino acid, its sequence number and the introduced mutations. In PSD-95, hGBP1, HIV-RT and LiF, native amino acids were replaced by cysteines. These cysteines were labeled with Alexa488 C5 maleimide. In T4L the unnatural amino acid p-acetyl-L-phenylalanine (pAcF) was introduced. The keto group of pAcF was labeled with Alexa488 hydroxylamine. As HIV-RT is a complex consisting of two subunits, p51 and p66, the respective subunit name is additionally given.
(b) The fluorescence lifetimes were determined by fitting a multiexponential relaxation model.
(c) The species-weighted lifetimes were calculated using the fitted species fractions x(i) and lifetimes τ(i).
(d) The residual anisotropies r∞ were determined by the offset of the time-resolved anisotropy decays r(t).
(e) The simulated species-averaged lifetimes were determined by simulating the fluorescence decay using parameters as given in Fig. 8 and protein structures as given in Table S3.

Table S3. Crystal structures used in the BD simulations presented in Figure 8

Protein                                       PDB
T4 Lysozyme (T4L) (a)                         148L, 172L
Human guanylate binding protein 1 (hGBP1)     1F5N
HIV reverse transcriptase (HIV-RT)            1RTD
PSD-95                                        3ZRT

(a) In the case of T4L it was assumed that 50% is in the "closed" conformation (148L) and 50% in the "open" conformation (172L).

Note S1. Decay analysis by normally distributed distances
As demonstrated in Fig. 13 [...] is given by the species-weighted average: [...]
By combining the above equation with eqs. (7), (9) and (17) [...]
To obtain a time-resolved quantifier which provides the steady-state transfer efficiency E in the limit [...], the intensities have to be replaced by the cumulative intensities: [...]
Using the cumulative intensities, a time-dependent quantity is obtained with the meaning of a transfer efficiency: [...]
This quantity describes the time-dependent yield of the FRET process up to the time T.
Van der Meer defines the "time-dependent transfer-efficiency" (TRE) as: [...] This does not quantify the yield of the FRET process. Thus, the TRE is not a FRET efficiency. In a mixture of fluorescent species, its asymptote provides the species fraction of FRET-active molecules and not the FRET efficiency.
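The difference between the two quantities can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes mono-exponential donor decays with a rate constant k0, a mixture of species with FRET rate constants k_RET, and a TRE of the form 1 - f_DA(t)/f_D0(t); this form is an assumption, since the defining equation is not reproduced in this text. All function names and parameter values are hypothetical.

```python
import math

def steady_state_efficiency(k0, species):
    """Species-weighted steady-state FRET efficiency, sum of x * k_ret / (k0 + k_ret)."""
    return sum(x * k / (k0 + k) for x, k in species)

def cumulative_efficiency(T, k0, species):
    """Transfer efficiency computed from cumulative intensities up to time T.

    species: list of (fraction, k_ret) pairs; k0: donor decay rate without FRET.
    Uses the analytical integral of exp(-k t) from 0 to T, (1 - exp(-k T)) / k.
    """
    cum_da = sum(x * (1.0 - math.exp(-(k0 + k) * T)) / (k0 + k) for x, k in species)
    cum_d0 = (1.0 - math.exp(-k0 * T)) / k0
    return 1.0 - cum_da / cum_d0

def tre(t, species):
    """Assumed TRE form, 1 - f_DA(t) / f_D0(t); the donor-only rate k0 cancels."""
    return 1.0 - sum(x * math.exp(-k * t) for x, k in species)

# Hypothetical mixture: 70% FRET-active molecules (E = 0.75), 30% without FRET.
k0 = 0.25                              # 1/ns, i.e. a 4 ns donor lifetime
species = [(0.7, 0.75), (0.3, 0.0)]    # (species fraction, k_RET in 1/ns)

for T in (1.0, 5.0, 20.0, 200.0):
    print(T, round(cumulative_efficiency(T, k0, species), 4), round(tre(T, species), 4))
print("steady-state E:", steady_state_efficiency(k0, species))
```

In this example the cumulative quantity converges to the steady-state efficiency (0.525) at long times, whereas the assumed TRE converges to the fraction of FRET-active molecules (0.7).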

Note S3. Accessible volume simulations to assess the effect of labeling symmetry
Since the dyes were tethered to the protein by long linkers, the spatial distribution of the fluorophores had to be considered. The dye distributions were modeled by the accessible volume (AV) approach according to [75,108,109]. The AV-approach uses a geometric search algorithm to determine all dye positions within the linker-length from the attachment point which do not cause steric clashes with the macromolecular surface. The dyes were approximated by ellipsoids. The center of each ellipsoid was connected to its attachment point by a flexible linkage of length L_link. Here, the Cβ-atoms were used as attachment points. The linker-length is given by the longest distance from the attachment point (Cβ-atom of the cysteine) to the center of the dye. It includes the reactive group, a spacer and the internal linker of the dye. Both Alexa Fluor 488 C5 maleimide (Alexa488) and Alexa Fluor 647 C2 maleimide (Alexa647) were modeled with a linker width of 4.5 Å. Linker-lengths L_link of 20.5 Å and 22 Å were used for Alexa488 and Alexa647, respectively. The radii of the ellipsoid (R_Dye1, R_Dye2 and R_Dye3) were determined by the spatial dimensions of the dyes. Alexa488 was modeled using radii of 5.0 Å, 4.5 Å and 1.5 Å. Alexa647 was modeled using radii of 11.0 Å, 4.7 Å and 1.5 Å. To study the effect of the linker-length on the symmetry, the fluorophore pair BodipyFL C1 iodoacetamide (Bodipy) and Alexa647 was simulated. To simulate Bodipy, a linker-length of 10.8 Å and a width of 4.5 Å were used, while the dye shape was approximated by radii of 4.5 Å, 3.2 Å and 0.9 Å.
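The geometric idea behind an accessible volume can be sketched in a few lines. The following is a strongly simplified illustration and not the algorithm of [75,108,109]: it assumes a spherical dye instead of an ellipsoid, uses the straight-line distance to the attachment point instead of a path search around obstacles, ignores the linker width, and assigns a single radius to all protein atoms. The function name, the atom radius and the grid spacing are hypothetical.

```python
import math
from itertools import product

def accessible_volume(attachment, atoms, linker_length, dye_radius,
                      atom_radius=1.7, grid_spacing=2.0):
    """Return grid points (in Å) that the dye center may occupy.

    A point is kept if it lies within linker_length of the attachment point
    (straight-line distance, a simplification) and if a sphere of dye_radius
    placed there does not overlap any atom of radius atom_radius.
    """
    n = int(linker_length // grid_spacing)
    clash_distance = dye_radius + atom_radius
    points = []
    for i, j, k in product(range(-n, n + 1), repeat=3):
        p = (attachment[0] + i * grid_spacing,
             attachment[1] + j * grid_spacing,
             attachment[2] + k * grid_spacing)
        if math.dist(p, attachment) > linker_length:
            continue  # beyond the reach of the linker
        if any(math.dist(p, a) < clash_distance for a in atoms):
            continue  # steric clash with the macromolecule
        points.append(p)
    return points

# Hypothetical example: linker-length of Alexa488 (20.5 Å), one obstructing atom.
attachment = (0.0, 0.0, 0.0)
av_free = accessible_volume(attachment, [], 20.5, dye_radius=3.5)
av_hindered = accessible_volume(attachment, [(10.0, 0.0, 0.0)], 20.5, dye_radius=3.5)
print(len(av_free), len(av_hindered))
```

The number of retained grid points multiplied by the cube of the grid spacing gives an estimate of the AV volume.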
To determine the effect of the labeling symmetry as shown in Fig. 13, a set of 5592 protein structures with at least 360 amino acids in the chain, a minimum resolution of 1.8 Å and no unresolved amino acids was selected from the Protein Data Bank using the program "PDBselect" [103]. For each structure, at least 180 random amino acid pairs were chosen.

Next, for each pair of amino acids, the accessible volumes of the DA-pair, where the donor is located at the first amino acid, and of the AD-pair, where the donor is located at the second amino acid, were simulated. Using the AV-simulations, the mean and the width of the distance distribution were calculated for both pairs. If one of the two amino acids was buried within the structure and inaccessible for the dye, the amino acid pair was discarded. To identify inaccessible labeling sites, FRET-pairs were discarded if the volume of one of the AVs was smaller than 3.0% of the average AV-volume over all structures.
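The comparison of the DA- and AD-pairs can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: each AV is represented as a list of grid points, the distance distribution is estimated by randomly sampling point pairs, the width is taken as the standard deviation, and the AV volume is estimated from the number of grid points. All function names and the sample size are hypothetical.

```python
import math
import random

def distance_stats(av_donor, av_acceptor, n_samples=10000, seed=0):
    """Mean and width (standard deviation) of the inter-dye distance distribution,
    estimated from randomly sampled pairs of donor and acceptor positions."""
    rng = random.Random(seed)
    d = [math.dist(rng.choice(av_donor), rng.choice(av_acceptor))
         for _ in range(n_samples)]
    mean = sum(d) / len(d)
    width = math.sqrt(sum((x - mean) ** 2 for x in d) / len(d))
    return mean, width

def is_accessible(av, grid_spacing, mean_av_volume, min_fraction=0.03):
    """Keep a labeling site only if its AV volume reaches 3.0% of the average AV volume."""
    volume = len(av) * grid_spacing ** 3
    return volume >= min_fraction * mean_av_volume

def labeling_asymmetry(av_d_site1, av_a_site2, av_d_site2, av_a_site1):
    """Differences in mean and width between the DA-pair (donor at site 1)
    and the AD-pair (donor at site 2)."""
    mean_da, width_da = distance_stats(av_d_site1, av_a_site2)
    mean_ad, width_ad = distance_stats(av_d_site2, av_a_site1)
    return mean_da - mean_ad, width_da - width_ad
```

Because donor and acceptor differ in linker-length and dye dimensions, the four AVs passed to `labeling_asymmetry` are generally all different, which is why the DA- and AD-pairs need not yield the same distance distribution.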
In the absence of surface interactions, a main peak and a shoulder are visible (Fig. 9A, top). In the presence of surface interactions, the width of x(R_app) increases, its mean distance shifts by ~3 Å and the shoulder is less pronounced (Fig. 9A, bottom). The features of x(R_app) depend on the diffusion coefficients. If the dyes interact with the surface, they diffuse on average more slowly, and thus the differences between x(R_DA) and x(R_app) are less pronounced. In both cases the mean distance of x(R_app) is shifted by 2 Å, indicating that the mean of x(R_DA) can be approximated by the mean of x(R_app). However, the width of x(R_app) is decreased by ~3 Å for the chosen diffusion coefficient. Such narrowing was previously observed experimentally [9,75].

Note S4. The estimation of statistical errors
To estimate statistical errors due to photon noise, we use the Fisher information matrix (FIM) and the Cramér-Rao inequality. The Cramér-Rao inequality states that the variance-covariance matrix Σ is bounded from below by the inverse of the FIM I (Σ ≥ I⁻¹). For two model parameters α and β, the elements of the FIM are given by: [...] Under these conditions the FIM does not contain the experimental information. Hence, the variances and covariances can be predicted a priori given a model function.
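The a priori character of this error estimate can be illustrated with a short calculation. The following is a minimal sketch under explicit assumptions and is not taken from the original text: the photon counts in each time bin are assumed to be Poisson distributed, for which the FIM elements take the standard form I_αβ = Σ_i (1/f_i)(∂f_i/∂α)(∂f_i/∂β) with f_i the model counts in bin i, and the model is a hypothetical mono-exponential decay with an amplitude and a lifetime as parameters. All names and values are illustrative.

```python
import math

def decay_counts(params, times):
    """Hypothetical model: expected counts per bin of a mono-exponential decay."""
    amplitude, tau = params
    return [amplitude * math.exp(-t / tau) for t in times]

def fisher_information(model, params, times, rel_step=1e-6):
    """FIM for Poisson-distributed counts, I_ab = sum_i (df_i/da)(df_i/db) / f_i.
    Derivatives are taken numerically by central differences."""
    f = model(params, times)
    grads = []
    for a, value in enumerate(params):
        h = rel_step * abs(value) if value != 0 else rel_step
        up, down = list(params), list(params)
        up[a] += h
        down[a] -= h
        f_up, f_down = model(up, times), model(down, times)
        grads.append([(u - d) / (2.0 * h) for u, d in zip(f_up, f_down)])
    return [[sum(x * y / fi for x, y, fi in zip(ga, gb, f)) for gb in grads]
            for ga in grads]

def invert_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# 100 ns window with 10 ps bins; amplitude 1000 counts per bin, lifetime 4 ns.
times = [(i + 0.5) * 0.01 for i in range(10000)]
params = [1000.0, 4.0]
fim = fisher_information(decay_counts, params, times)
cov_bound = invert_2x2(fim)                  # Cramér-Rao lower bound on Σ
n_photons = sum(decay_counts(params, times))
rel_err_tau = math.sqrt(cov_bound[1][1]) / params[1]
print(rel_err_tau, 1.0 / math.sqrt(n_photons))
```

No measured data enter the calculation: only the model function, the parameter values and the time axis are required. For this idealized case the lower bound on the relative lifetime error agrees with the familiar 1/√N scaling with the number of detected photons N.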