Dimerization of European Robin Cryptochrome 4a

Homo-dimer formation is important for the function of many proteins. Although dimeric forms of cryptochromes (Cry) have been found by crystallography and were recently observed in vitro for European robin Cry4a, little is known about the dimerization of avian Crys and the role it could play in the mechanism of magnetic sensing in migratory birds. Here, we present a combined experimental and computational investigation of the dimerization of robin Cry4a resulting from covalent and non-covalent interactions. Experimental studies using native mass spectrometry, mass spectrometric analysis of disulfide bonds, chemical cross-linking, and photometric measurements show that disulfide-linked dimers are routinely formed, that their formation is promoted by exposure to blue light, and that the most likely cysteines are C317 and C412. Computational modeling and molecular dynamics simulations were used to generate and assess a number of possible dimer structures. The relevance of these findings to the proposed role of Cry4a in avian magnetoreception is discussed.


Experimental supplementary figures
Full SDS-PAGE gel, peak isolation using tandem native MS and XL-MS results 1
Figure S1: (A)-(C) Denaturing SDS-PAGE gels run for the different ErCry4a proteins displayed in the native MS spectra shown in the main text. Proteins are as indicated at the tops of the columns. All leftmost columns display protein ladders. In (A) the SeeBlue™ Plus2 Pre-stained Protein Standard ladder (Invitrogen) was used, and in (B) and (C) the PageRuler Prestained Protein Ladder (Thermo Scientific) was used, with the molecular weight (MW) markers in kDa as indicated. Unlabelled columns are not relevant to the present investigation. (D)-(F) Native MS in combination with tandem MS was used to isolate a monomer and a dimer peak of ErCry4a WT (D) (without its His-tag) and ErCry4a C317S (E) (with His-tag; shipped in 10 mM BME to prevent higher-order oligomerisation during transport) and to compare their stability upon exposure to high HCD (high energy collisional dissociation) energies. The first rows of both (D) and (E) show the full spectra, the second rows the isolated dimer peaks, and the third rows the isolated monomer peaks. HCD values were applied as indicated in the spectra. The spectra displayed for comparison in (F) are of CRP (C-reactive protein), a pentameric protein known to be non-covalently bound [1]. The first row shows the full spectrum, the second an isolated pentamer peak, and the third the isolated pentamer peak after applying an HCD value of 70 V, showing that it falls apart into smaller subunits at comparatively low HCD energies. By contrast, the ErCry4a dimers did not dissociate even at the highest HCD energy available on the instrument (500 V). The coloured rectangles show which peaks were isolated using tandem MS.

Full SDS-PAGE gels displaying the degree of covalent dimerisation over time
Figure S2: Denaturing SDS-PAGE gels, parts of which are shown in the main text. Proteins are as indicated at the tops of the columns. Both of the leftmost columns display the PageRuler Prestained Protein Ladder (Thermo Scientific) with the molecular weight (MW) markers in kDa as indicated. The samples in (A) were kept in darkness and the samples in (B) were incubated under blue light. Unlabelled columns are not relevant to the present investigation. The SDS-PAGE gel on the left side of (A) is the same as that displayed in Fig. S1 (B) and is repeated here for easier comparison.

Photometric cysteine exposure measurements 2
Figure S3: Cysteine accessibility calibration curve. DTNB (60 µM) was added to 0, 5, 10, 15, 20, 25 and 30 µM L-cysteine. After 10 min of incubation at room temperature, the absorbance was measured at 412 nm. The y-axis shows the absorbance of TNB at 412 nm resulting from the reaction of DTNB with the thiol groups of the cysteines. The solid line is a linear regression with slope 0.0124 µM⁻¹.

*A disulphide bond between two C412 residues was found in both the monomer and dimer fractions, possibly due to cross-contamination between the two fractions on the SDS-PAGE gel.

Table S2: XL-MS results. Only disulphide bonds and crosslinks between peptides containing lysine residues [2] that were identified with a MeroX score greater than 50 were considered candidates for further analysis. The residues in brackets indicate the position of the linked residue within each peptide.
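The calibration curve of Figure S3 can be used to convert a measured absorbance at 412 nm into a thiol concentration. The following is a minimal sketch of that conversion, not the analysis script used for the paper. The function names are hypothetical, a zero intercept is assumed because only the slope is reported, and the absorbance values are idealised points generated from the reported slope rather than measured data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def thiol_concentration_uM(absorbance, slope=0.0124, intercept=0.0):
    """Invert the calibration line; the zero intercept is an assumption."""
    return (absorbance - intercept) / slope


# L-cysteine standards listed in the caption of Figure S3 (µM).
standards_uM = [0, 5, 10, 15, 20, 25, 30]
# Hypothetical, noise-free absorbances generated from the reported slope.
ideal_a412 = [0.0124 * c for c in standards_uM]

slope, intercept = fit_line(standards_uM, ideal_a412)
print(f"slope = {slope:.4f} per µM, intercept = {intercept:.4f}")
print(f"A412 = 0.124 corresponds to {thiol_concentration_uM(0.124):.1f} µM thiol")
```

With real data, the fitted intercept would replace the assumed value of zero.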

ErCry4a non-covalent dimers
The full length ErCry4a WT structure (1-527) was used to produce the stable non-covalently bound dimer ncov A , and a truncated sequence was used to construct the ncov M dimer (illustrated in Figure S4). Several analyses were performed on the simulated dimers; the results are summarised in Table S4 and Figure S5. Table S4 lists the averages calculated over the duration of the production simulation, while Figure S5 shows the time evolution of the RMSD. The ncov A and ncov M dimers appear to be rather stable, with average RMSD values of 3.51 ± 0.36 Å and 3.36 ± 1.36 Å, respectively, whereas ncov 4 is rather unstable, with an average RMSD of 8.98 ± 2.00 Å. The contrast in the stabilities of the three dimers is attributed to differences in the interaction energy and the hydrogen bonding network, both of which are much stronger in ncov A and ncov M (see Table S4).
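To illustrate how an average of the form "mean ± spread" can be obtained from an RMSD time series, the sketch below computes the RMSD of each frame against a reference and summarises it. It is an illustration only: the frames are assumed to be already superimposed on the reference, the population standard deviation is an assumed choice of spread, and the coordinates are toy values rather than simulation data.

```python
import math
from statistics import mean, pstdev


def rmsd(frame, reference):
    """RMSD between two equally sized lists of (x, y, z) positions.

    Assumes the frame has already been superimposed on the reference.
    """
    squared = sum(
        (a - b) ** 2
        for p, q in zip(frame, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(reference))


def rmsd_summary(trajectory, reference):
    """Mean and population standard deviation of the RMSD over all frames."""
    values = [rmsd(frame, reference) for frame in trajectory]
    return mean(values), pstdev(values)


# Toy two-atom system (hypothetical coordinates in Å).
reference = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
trajectory = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(3.0, 0.0, 0.0), (0.0, 3.0, 0.0)],
]
average, spread = rmsd_summary(trajectory, reference)
print(f"RMSD = {average:.2f} ± {spread:.2f} Å")
```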
Average interaction energies, E_tot, for the non-covalent dimer family are given in Table S4. The ncov A dimer was selected for further comparative analysis because of its exceptionally strong interaction energy, which is already apparent after the 2 ns equilibration simulation (E_tot = 928 ± 25 kcal mol⁻¹, Table S5). The ncov 4 dimer was considered interesting because its spatial arrangement is similar to that of the covalent dimer cov 317A discussed below. Because the full length ErCry4a protein was used for the ncov 4 and ncov A simulations, the presence of the CTT might have added to the stability of these dimers by contributing favourably to the resulting interaction energies.
The average value of the radius of gyration, R_g, was computed for the non-covalent dimers by averaging over the MD trajectories. The results in Table S4 show that R_g for ncov A and ncov M is significantly lower than for ncov 4 , indicating that these two dimers are more compact.
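For reference, the radius of gyration of a single frame is commonly defined as the mass-weighted root-mean-square distance of the atoms from the centre of mass. The sketch below implements that standard definition; whether the analysis in this work used mass weighting is an assumption, and the coordinates and masses are toy values.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one frame.

    coords: list of (x, y, z) positions in Å; masses: matching list of masses.
    """
    total_mass = sum(masses)
    centre = tuple(
        sum(m * r[k] for m, r in zip(masses, coords)) / total_mass
        for k in range(3)
    )
    weighted = sum(
        m * sum((r[k] - centre[k]) ** 2 for k in range(3))
        for m, r in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)


# Two equal masses 2 Å apart: each lies 1 Å from the centre of mass.
toy_coords = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_masses = [12.0, 12.0]
print(f"R_g = {radius_of_gyration(toy_coords, toy_masses):.2f} Å")
```

Averaging this quantity over all frames of a trajectory gives a value comparable in kind to those reported in Table S4.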
The average RMSF is smallest for ncov A (Table S4); the largest fluctuations occur in the CTT domain (residues 498-527), as one would expect for an intrinsically disordered region of the protein (see Figure S5D-F).
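The per-atom fluctuation underlying such a flexibility analysis can be sketched as the root-mean-square deviation of each atom from its own time-averaged position. The code below is a minimal illustration under the assumption that all frames have been aligned beforehand; the two-atom trajectory is a toy example, not simulation output.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for a list of frames, each a list of (x, y, z) positions.

    Assumes the frames have already been aligned to a common reference.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean_pos = [
            sum(frame[i][k] for frame in trajectory) / n_frames
            for k in range(3)
        ]
        # Mean squared displacement from that average position.
        msf = sum(
            sum((frame[i][k] - mean_pos[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msf))
    return result


# Atom 0 oscillates by ±1 Å along x; atom 1 does not move.
toy_trajectory = [
    [(-1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
]
print(rmsf(toy_trajectory))
```

A mobile region such as a disordered tail would show up as a run of atoms with large values in the returned list.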
There are 106 and 111 inter-monomer hydrogen bonds in the ncov A and ncov M dimers, respectively, while only 40 exist in ncov 4. The numbers of inter-monomer salt bridges in ncov 4 and ncov A are similar and significantly larger than in ncov M.
The interaction surface areas of ncov A and ncov M are more than double that of ncov 4. Interestingly, even without the CTT, ncov M has a larger interaction surface than ncov 4. Taking all these factors into consideration, ncov A and ncov M were selected as the two most promising non-covalent ErCry4a candidates for further comparative analysis.

The table includes the computed values of the radius of gyration R_g, the average RMSD and RMSF values, the total area of the binding interface A_IS, and the total number of inter-monomer hydrogen bonds and salt bridges. Only salt bridges present in more than 10% of the MD frames were counted. All values have been averaged over the duration of the corresponding MD simulations.
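The occupancy criterion for salt bridges (present in more than 10% of the MD frames) can be sketched as a simple counting filter. In the example below, the per-frame detection of salt bridges is assumed to have been done elsewhere; the function name and the residue-pair labels are hypothetical.

```python
from collections import Counter


def persistent_salt_bridges(frames, min_fraction=0.10):
    """Return residue pairs present in more than min_fraction of the frames.

    frames: one collection of residue-pair labels per MD frame.
    """
    counts = Counter(pair for frame in frames for pair in set(frame))
    cutoff = min_fraction * len(frames)
    return sorted(pair for pair, n in counts.items() if n > cutoff)


# Toy trajectory of 20 frames with two hypothetical inter-monomer pairs.
pair_a = ("A:LYS10", "B:GLU20")  # present in 3 of 20 frames (15%)
pair_b = ("A:ARG5", "B:ASP7")    # present in 2 of 20 frames (exactly 10%)
frames = [set() for _ in range(20)]
for index in range(3):
    frames[index].add(pair_a)
for index in range(2):
    frames[index].add(pair_b)

print(persistent_salt_bridges(frames))
```

Because the criterion is "more than 10%", a pair present in exactly 10% of the frames is excluded in this sketch.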

ErCry4a 317 dimer family
Residue C317 is of interest in the context of covalent dimerisation of ErCry4a because it is close to the Trp tetrad (which is involved in magnetic sensing) and is the most solvent-exposed cysteine residue in WT ErCry4a (Fig. 4A and Table S3). The experiments described in the main text also point to C317 as a promising linking residue. For this and all other covalent dimers, a truncated sequence (8-495) was used to construct the dimers. This truncation was based on the sequence used for the structure determination of pigeon Cry4a [3].
Three different dimeric structures were constructed to investigate the involvement of C317 in covalently-bound ErCry4a dimers (see Table S16). Dimers cov 317A and cov 317B are illustrated in Figure S6. Table S6 gives the average RMSD values for the three dimers covalently linked through the C317 residue. The time-dependence of the RMSD is given in Figure S7A-C. The results in Table S6 demonstrate that cov 317A and cov 317B are much more stable than cov(317) 3. Table S6 also gives the average RMSF values, which provide information on the flexibility of the three structures. The average RMSF is smallest for cov 317A, indicating that this structure is the least flexible, and unusually large for cov 317B.
The analysis of the average radius of gyration, R_g, in Table S6 shows that R_g is comparable for all three structures, indicating that their geometric shapes are similar, which is also suggested by their similar interaction surface areas. Furthermore, the interaction energy of the subunits, shown in Figure S7D-F, indicates that the most favourable interactions occur between the subunits of the cov 317A dimer. The hydrogen bonding network is more elaborate for the cov(317) 3 dimer; here the number of hydrogen bonds could be directly related to the larger interaction surface area. The salt bridge analysis favours cov 317A, supporting this dimer as the representative candidate of the ErCry4a C317 dimer family.

ErCry4a cov D - a dimer with two disulphide bonds
Figure S1 shows that ErCry4a dimers can still be formed after mutation of important cysteine residues that account for part of the dimerisation. Figure S1 also indicates that small amounts of higher-order oligomers exist in addition to the more prominent monomers and dimers, suggesting that several oligomerisation surfaces may exist in ErCry4a and therefore that more than one disulphide bridge could be involved in linking monomers. Oligomers were also found for Arabidopsis thaliana cryptochrome 2 (AtCry2), where a tetrameric structure is important in the regulation of plant growth [4,5]. Another experimental indication of dimer stability comes from the application of high HCD energies, which shows that ErCry4a dimers do not disintegrate easily (see Figure S1). This result suggests that a covalent linkage between the subunits might involve more than one disulphide bond. Using M-ZDOCK [6], a tool that symmetrically docks multimers, a dimeric structure was found in which C116 in each of the monomers was close to C313 in the other monomer (see Table S19).
This led to the construction of a potentially stable ErCry4a dimer containing two covalent bonds between the monomers, cov D = cov(116 A 313 B -313 A 116 B ), where A and B stand for the monomers (Figure S8). Table S7 lists the RMSD values, which indicate that the dimeric cov D structure is stable. A full temporal analysis of RMSD and RMSF is shown in Figure S9. Furthermore, the RMSF analysis indicates low flexibility, while the interaction energy between the two subunits averages 178 ± 44 kcal mol⁻¹ and is comparable with the values for cov 317A and cov 317B, even though the interaction surface area for cov D is twice those of the cov(317) dimers. Figure S8 shows that the ErCry4a cov D dimer has an inversion centre (centre of symmetry), which is not found for any of the dimeric structures presented in Fig. 1 in the main text.

Cys189, the third most solvent-exposed residue of monomeric ErCry4a, was found to have a possible binding motif to the Cys458 residue of another monomer, suggesting a dimer with a disulphide bond between Cys189 and Cys458, denoted cov 189A (see Table S15).
Computational analysis reveals that cov 189A is stable, with a low RMSF. Although its interaction surface is not particularly large compared to some of the other covalent dimers, the interaction energy of its monomers, 533 kcal mol⁻¹, is the largest of all the covalent dimers studied. Figure S10 shows the spatial orientation of the monomeric subunits; this structure does not have inversion symmetry. Figure S11 shows the temporal analysis of the RMSD, RMSF and E_tot.

Residue C412 was also investigated computationally as a possible linker of covalent ErCry4a dimers (Table S17). Eight structures were created: cov(412) 1,2,3,A,B,6-8.
The structures of the most stable dimers, cov 412A and cov 412B (Figure S12), display a symmetric orientation of the monomeric subunits relative to one another. Table S9 lists the average RMSD values: cov 412A and cov(412) 7 appear to be the most stable. cov 412A has the smallest RMSF (see Figure S14). The interaction energies (Table S9 and Figure S15) indicate that cov 412A and cov 412B should be the most stable of the dimers, even though cov 412B would be considered highly unstable based on the RMSD analysis (see Figure S13). The strongest interaction energy (for cov 412A) is accompanied by the largest interaction surface area, suggesting that cov 412A is the most stable dimer of the 412 family.

Figure S15: Time evolution of the interaction energies between two monomeric subunits of the cov(412) n ErCry4a dimers. Red and green denote the van der Waals and Coulomb contributions, respectively, to the total interaction energy between the monomers, shown in blue.

Figure S16: cov 412A dimer with the two monomers coloured in red and blue for clarity. All of the cysteine residues are shown in VDW representation, and some distances between the sulphur atoms at the end of the simulation are noted. Insets show the C361-C361 and C458-C458 distances plotted against simulation time.

Distance analysis between cysteine residues
To determine which cysteine residues are in close contact, the six most surface-exposed cysteines (C317, C116, C189, C68, C412, C73) and C179 (see Table S3) were analysed in the 49 dimeric structures obtained using the docking tools. Tables S11-S17 summarise the distances between these cysteine residues. The tables are organised by the contact residue used in the docking procedure to produce the various families of structures. Distances between cysteines of less than 10 Å are coloured green, and those between 10 Å and 20 Å are coloured yellow. Tables S18 and S19 summarise the distances between cysteines for complexes that were docked without defining a contact residue, in one case using ZDOCK [7,8], and in the other using M-ZDOCK [5], which was used for symmetric multimer docking. The complex names used in the following tables (Complex N) are internal to this section and are not used elsewhere in this paper.

Table S10: Analysis of inter-monomer distances between cysteine residues in ErCry4a for the 10 complexes produced by ZDOCK, where Cys68 was used as a contact residue. The distance measures the separation of the sulphur atoms in the two cysteine residues.

Table S15: Analysis of inter-monomer distances between cysteine residues in ErCry4a for the 3 complexes produced by ZDOCK, where Cys317 was used as a contact residue. The distance measures the separation of the sulphur atoms in the two cysteine residues.

Table S17: Analysis of inter-monomer distances between cysteine residues in ErCry4a for the 10 complexes produced by ZDOCK, where no contact residue was specified. The distance measures the separation of the sulphur atoms in the two cysteine residues.

Table S19: Summary of all the MD simulations performed. cov and ncov denote covalent and non-covalent dimers, respectively. aa stands for amino acid. cov D is the dimer with two disulphide bonds (Cys116-Cys313 and Cys313-Cys116). ncov M is the non-covalent dimer based on the structure of mouse Cry2.
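The colour coding used in the cysteine distance tables can be expressed as a small classification of sulphur-sulphur distances. The sketch below is illustrative only: the function names and coordinates are hypothetical, and treating the boundary values of exactly 10 Å and 20 Å as yellow is an assumption, since the text does not state how the boundaries were handled.

```python
import math


def sulphur_distance(s_a, s_b):
    """Euclidean distance in Å between two sulphur atom positions."""
    return math.dist(s_a, s_b)


def classify_distance(distance_angstrom):
    """Map a sulphur-sulphur distance to the colour code of the tables.

    Assumed boundaries: < 10 Å green, 10-20 Å (inclusive) yellow.
    """
    if distance_angstrom < 10.0:
        return "green"
    if distance_angstrom <= 20.0:
        return "yellow"
    return "uncoloured"


# Hypothetical sulphur positions of two cysteines in different monomers.
s_monomer_a = (0.0, 0.0, 0.0)
s_monomer_b = (3.0, 4.0, 0.0)
d = sulphur_distance(s_monomer_a, s_monomer_b)
print(f"{d:.1f} Å -> {classify_distance(d)}")
```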