Noncovalently Associated Peptides Observed during Liquid Chromatography-Mass Spectrometry and Their Effect on Cross-Link Analyses

Cross-linking mass spectrometry draws structural information from covalently linked peptide pairs. When these links do not match previous structural models, they may indicate changes in protein conformation. Unfortunately, such links can also result from experimental error or artifacts. Here, we describe the observation of noncovalently associated peptides during liquid chromatography-mass spectrometry analysis, which can easily be misidentified as cross-linked. Strikingly, they often do not match the protein structure. Noncovalently associated peptides presumably form during ionization and can be distinguished from cross-linked peptides by observing coelution of the corresponding linear peptides in MS1 spectra, as well as the presence of the individual (intact) peptide fragments in MS2 spectra. To suppress the formation of noncovalent peptide associations, increasingly disruptive ionization settings can be used, such as in-source fragmentation.


S1 Visualization of cross-links and noncovalently associated peptides
Fig. S1 illustrates the different types of peptide species relevant to the main manuscript. Importantly, only the cross-linked peptide (a) and the noncovalently associated peptide pair in (c) have the same mass, because of the loop-link in (c). This mass ambiguity is the reason that noncovalently associated peptides can be misidentified as cross-links. More details on general cross-linking nomenclature can be found, for example, in Rappsilber [1]. Note that the two peptides in (c) do not need to have the same sequence.
Figure S1: Visualization of different peptide definitions. (a) A typical cross-link between two peptides. (b) A non-covalent peptide association between two peptides. (c) Same as (b), but one of the peptides is loop-linked; the mass of this species is the same as that of a cross-linked peptide.
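The mass ambiguity between (a) and (c) can be illustrated with a minimal sketch. The peptide masses below are made-up values, and the linker increment of 82.04186 Da (C5H6O) is an assumption for a reacted SDA linker rather than a value taken from this text. The function names are hypothetical.

```python
# Assumed monoisotopic mass (Da) added by one reacted SDA linker (C5H6O).
# This value is an assumption for illustration, not quoted from the text.
LINKER_DELTA = 82.04186


def cross_link_mass(m_a, m_b, delta=LINKER_DELTA):
    """Mass of two peptides joined covalently by one linker (case a)."""
    return m_a + m_b + delta


def noncovalent_mass(m_a, m_b, n_loop_links=0, delta=LINKER_DELTA):
    """Mass of two associated peptides carrying n loop-links in total (cases b, c)."""
    return m_a + m_b + n_loop_links * delta


# Made-up peptide masses, for illustration only.
pep_a, pep_b = 1000.5, 1500.7
case_a = cross_link_mass(pep_a, pep_b)
case_b = noncovalent_mass(pep_a, pep_b, n_loop_links=0)
case_c = noncovalent_mass(pep_a, pep_b, n_loop_links=1)
```

Under these assumptions `case_a` and `case_c` coincide, while `case_b` is lighter by one linker increment, so a precursor-mass filter alone cannot separate (a) from (c).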

S2 CLMS identifications assuming cleavability of SDA
In this section we used MeroX [2] to identify MS-cleavable cross-linker products from the Q Exactive (QE) data set, with the same settings for MeroX (v. 1.6.6.6) as in [3]. The SDA reaction product is assumed to be cleavable when it involves a carboxylic acid functional group [3]. The individual search results were combined using the MeroX Merger (v. 1.2) with the -P 5 setting, which sets the desired FDR cut-off to 5%. From the merged results we extracted the unique links and computed their distances in the crystal structure of HSA (PDB: 1AO6). For the QE data, 184 unique links were identified, of which 160 could be mapped to the crystal structure: 38% (61 links) were long-distance links (Cα distance ≥ 25 Å), while 62% (99 links) matched the distance constraint. For the Velos data, 34 unique links were identified, of which 29 could be mapped to the crystal structure: 21% (6 links) were long-distance links, while 79% (23 links) matched the distance constraint. These results are consistent with the data presented in Figure 1 of the manuscript. The distance histogram for the QE data shows a very prominent enrichment of false positives exceeding the distance threshold. In both cases the desired FDR is not met; we hypothesize that the Velos results suffer from the low number of identified links, which hinders reliable FDR estimation.
In general, Fig. S2a-b shows that MeroX also identifies the non-covalent peptide associations in a cleavable cross-linker search. However, it is difficult to judge how many of the identified cross-links below the 25 Å cut-off are true cross-links (assuming cleavage of the cross-linker) and how many are non-covalent associations. Since the search itself is not aware of any distance constraint, an obvious assumption is that non-covalent associations are distributed without preference below and above the distance cut-off, whereas true cross-links are enriched below it. Visually projecting the number of long-distance links onto the area below the distance cut-off indicates that a large portion of the within-distance links are in fact non-covalent associations.
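The distance classification and the projection argument can be sketched as follows. This is a minimal illustration, not the analysis code used here. The helper names are hypothetical, and the projection assumes that non-covalent hits follow the distance distribution of random residue pairs in the structure, so the fraction of random pairs within the cut-off (`frac_pairs_within`) must be supplied by the user; no value for it is given in this text.

```python
import math

CUTOFF = 25.0  # Cα-Cα distance threshold in Å, as used in the text


def ca_distance(a, b):
    """Euclidean distance between two Cα coordinates given as (x, y, z)."""
    return math.dist(a, b)


def split_by_distance(distances, cutoff=CUTOFF):
    """Separate link distances into within-cutoff and long-distance lists."""
    within = [d for d in distances if d < cutoff]
    long_range = [d for d in distances if d >= cutoff]
    return within, long_range


def projected_noncovalent_within(n_long, frac_pairs_within):
    """Estimate how many within-cutoff hits are non-covalent.

    Assumes all long-distance hits are non-covalent and that non-covalent
    hits are spread like random residue pairs, of which a fraction
    `frac_pairs_within` (0 <= value < 1) lies below the cutoff.
    """
    return n_long * frac_pairs_within / (1.0 - frac_pairs_within)


# Fractions reported for the QE data: 61 of 160 mapped links are long-distance.
qe_long_fraction = 61 / 160
```

If, purely for illustration, half of all random residue pairs were within the cut-off, the 61 long-distance QE links would project to about 61 non-covalent hits among the 99 within-distance links.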
In addition, we analysed the retention times (RT) of linear peptides with SDA modifications (e.g. loop-linked or hydrolyzed cross-linker; see [4] for visualizations of the modifications). Interestingly, a single SDA loop-link modification increases the RT of a linear peptide by approximately 24 minutes (Fig. S2c). The shift is almost doubled (42 minutes) when two loop-links are found in a peptide, compared to the unmodified version. This information can be used to compare the RTs of the linear peptides that were identified in a cross-link / non-covalent peptide association. We used the simplifying assumption that identifications are true cross-links when the distance constraint is met and non-covalent associations otherwise. Fig. S2d shows the RT difference between the two peptides in a cross-link / non-covalent association. Initially, we tried to map the individual peptide sequences identified by MeroX to the linear (modified) peptide identifications from MaxQuant. For this, one of the two peptides identified in a cross-link by MeroX was assumed to carry a loop-link modification. Under these assumptions only a small number of cross-linked peptides yield an RT for both peptides, because the individual peptides identified by MeroX were not identified in their loop-linked form by MaxQuant. For peptides that are cross-linked, the RT difference between the individual peptides should be randomly distributed. For peptides that are noncovalently associated, it should be closely distributed around zero. Because MeroX does not consider loop-link modifications when it identifies noncovalently associated peptides, the RT difference introduced by this modification needs to be accounted for. Therefore, the expected RT difference between the individual peptides of a non-covalent association is on average 24 minutes.
Indeed, the two RT difference distributions of cross-links and non-covalent associations look different and match the expectation described above (Fig. 3c). However, the large enrichment of within-distance links with a very small RT difference hints at these identifications being non-covalent peptide associations.
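The RT reasoning can be sketched in a few lines. This is an illustrative check under stated assumptions, not the procedure used for Fig. S2d: the function names and the tolerance of 3 minutes are hypothetical, and only the approximate 24-minute loop-link shift comes from the text.

```python
LOOP_LINK_SHIFT_MIN = 24.0  # approximate RT increase per SDA loop-link (Fig. S2c)


def rt_difference(rt_a, rt_b):
    """Absolute RT difference (minutes) between the two linear peptides."""
    return abs(rt_a - rt_b)


def looks_noncovalent(rt_a, rt_b, loop_link_matched, tolerance=3.0):
    """Check whether an RT difference fits the non-covalent expectation.

    If the loop-linked form of the modified peptide was matched, coeluting
    peptides should differ by about zero minutes. If only the unmodified
    form was matched, the expected difference is about one loop-link shift.
    The tolerance is a hypothetical choice, not a value from the text.
    """
    expected = 0.0 if loop_link_matched else LOOP_LINK_SHIFT_MIN
    return abs(rt_difference(rt_a, rt_b) - expected) <= tolerance


# Made-up retention times in minutes, for illustration only.
example_unmodified = looks_noncovalent(50.0, 74.5, loop_link_matched=False)
example_loop_linked = looks_noncovalent(50.0, 51.0, loop_link_matched=True)
example_far_apart = looks_noncovalent(20.0, 80.0, loop_link_matched=False)
```

A cross-linked pair is not expected to satisfy either condition systematically, since the RTs of its constituent linear peptides are unrelated.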

S3 Flow Rate Analysis on Q Exactive High-field
To further investigate the effect of different flow rates on the formation of non-covalent associations, we acquired the protein mix (non-cross-linked sample) on the Q Exactive High-field with three flow rates (in triplicate): 0.2 µL min⁻¹, 0.25 µL min⁻¹ and 0.3 µL min⁻¹. The differences in the number of identifications were small (Fig. S3a) and comparable to the results from the main text (24 PSMs with IS-CID 0). To achieve the desired FDR cut-off of 5%, the results were cut after the first decoy hit (Fig. S3b).
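Cutting a result list at the first decoy can be sketched as below. This is one reading of the procedure, assuming PSMs are ranked by descending score and that only the targets ranked above the first decoy are kept; the tuple layout and function name are hypothetical.

```python
def cut_at_first_decoy(psms):
    """Keep the target PSMs that score above the best-scoring decoy.

    `psms` is an iterable of (score, is_decoy) tuples; higher scores are
    assumed to be better. The decoy itself is not returned.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    kept = []
    for score, is_decoy in ranked:
        if is_decoy:
            break  # everything from the first decoy downwards is discarded
        kept.append((score, is_decoy))
    return kept


# Made-up scores, for illustration only.
example = [(10.0, False), (5.0, True), (8.0, False), (3.0, False)]
accepted = cut_at_first_decoy(example)
```

With only a few dozen PSMs, a single decoy already corresponds to an FDR estimate of several percent, which is why such a cut is a natural way to stay near a 5% threshold on small result lists.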

S4 Falsely identified cross-link suggesting homo-dimerization
The spectrum in Fig. S4 shows an example of a cross-link identification that can falsely lead to the assumption of homodimerization.
Figure S4: Alternative explanation for a cross-link that would suggest homodimerization. Upper panel: annotation from the non-covalent search. Lower panel: annotation from the cross-link search. Raw file: V127 J; scan: 34926