Optimization Workflow for the Analysis of Cross-Linked Peptides Using a Quadrupole Time-of-Flight Mass Spectrometer

Cross-linking mass spectrometry is an emerging structural biology technique. Almost exclusively, the analyzer of choice for such experiments has been the Orbitrap. We present an optimized protocol for the analysis of cross-linked peptides on a Synapt G2-Si. We first tested six different energy ramps and analyzed the fragmentation behavior of cross-linked peptides identified by xQuest. By combining the most successful energy ramps, cross-link yield can be increased by up to 40%. When compared to previously published Orbitrap data, the Synapt G2-Si also offers improved fragmentation of the β peptide. We have also developed ValidateXL, a programmatic solution that works with existing cross-linking software to improve cross-link quality control.

Cross-linking mass spectrometry (XLMS) can be used to gain structural insights into proteins and complexes that cannot easily be studied by high-resolution structural techniques. 1,2 Cross-linking is based on the covalent bonding of two amino acids that lie within a certain proximity. This low-resolution fixation of structure provides information on both distance restraints and solvent accessibility. Following digestion of the cross-linked species, a mixture containing both linear and cross-linked peptides is produced. One of the challenges of a cross-linking experiment is that cross-links are low in abundance. As a result, their signal is often suppressed by more intense linear peptide ions during analysis. 3 Earlier studies utilized an LTQ-Orbitrap mass analyzer, with most experimental designs using the linear ion trap (LIT) for both collision-induced dissociation (CID) and analysis of fragment ion spectra. 4−6 The lower resolution data obtained using the LIT, however, may lead to incorrect assignments of peptide sequence. This in turn causes an increased risk of missed or false positive cross-link assignment. Current developments in the field now recognize that the accuracy of this analyzer is insufficient to correctly annotate fragment ions, with higher energy collision dissociation (HCD) becoming the dominant analysis method. 7 Recent advancements in time-of-flight (TOF) technology have allowed an increase in achievable resolving power, providing fragment ion scans up to 40 000 fwhm. 8 In addition, TOF resolving power does not depend on acquisition time, offering a near constant resolution across the mass range and faster scan speeds compatible with ultrahigh performance liquid chromatography (UPLC) for both precursor and fragment ion scans. 9,10 Furthermore, the ability to seamlessly integrate quadrupole time-of-flight analyzers (QTOFs) with ion mobility separation (IMS) offers an extra degree of separation by size, shape, and charge, without the need for additional analysis time. 11 This potentially provides an opportunity to increase cross-link yield through exploitation of the larger and more highly charged nature of cross-linked peptides.
Although cross-linked peptides have been studied using a variety of analyzers, including QTOFs, 18,29−31 the majority of cross-linking studies have been performed on Orbitrap mass analyzers. As a consequence, current cross-linking software has been optimized for Orbitrap data. Furthermore, a recent study showed that HCD and ion trap CID required different collision energies for optimal fragmentation of cross-linked peptides. 12 It is therefore also necessary to optimize collision energies for the analysis of cross-linked peptides on a QTOF instrument.
Here, we present an optimized method for the analysis of cross-linked peptides using a QTOF geometry. We performed triplicate analysis of six different energy ramps in order to determine the optimal energy for cross-link fragmentation. The resulting approach allows QTOF data to be analyzed with existing cross-linking software, xQuest, 13 with minimal adaptations. We analyze the effects of collision energy on the fragmentation efficiency of bovine serum albumin cross-linked with isotopically labeled bis(sulfosuccinimidyl) suberate (BS3 d0/d12). We offer a programmatic solution, ValidateXL.py, which works with result files from xQuest to improve cross-link quality control and reduce manual validation. This software is freely available to download at https://github.com/ThalassinosLab/ValidateXL. Finally, in contrast to data collected from Orbitrap analyzers, 14 we also demonstrate improved fragmentation of the β peptide in the cross-link.

■ RESULTS AND DISCUSSION
In order to determine the most efficient settings for cross-link fragmentation using a QTOF, a comparison of fragmentation energies was carried out using the collision energy ramps described. The ramps were tested in triplicate using cross-linked bovine serum albumin (BSA). Data were then analyzed by xQuest according to the workflow outlined in Figure 1A. Although only minor adaptations were necessary, QTOF data sets were not immediately readable and required further conversion by MSConvert. In addition, a far greater number of cross-link identifications were found, with improved scores, when using the slow de-isotoping algorithm during data processing at both the precursor and fragment ion level (Figure S1, parts A and B, and Table S1). Cross-links that pass a prescoring filter are all returned to the user by xQuest. A final score threshold of 30 was originally recommended to determine true positive cross-link identifications. 15 A more recent analysis revised this threshold to 16 for trypsin digests and also revealed a dependence of the score threshold on the enzyme used to digest samples. 16 Following evaluation of the various scoring thresholds, we chose a conservative score of 20 in our initial validation (Figures S2−S4). All identified unique BSA cross-linked peptides are shown in Figure 1B. Unique cross-links include those with linkages in the same absolute position but with different peptide lengths and/or modifications such as oxidized methionine. 12 Overall, there is good reproducibility in the cross-links scoring above 20 (Figures S5−S10). The technical replicates of the mid energy ramp contain the greatest degree of overlap, with 38 identical cross-link residue pairs appearing in all three repeats. The high-iTRAQ and mid-iTRAQ repeats contain an intersection of 23 and 22 identifications, respectively (Figures S5−S8). The mid energy ramp is the most reproducible and identifies the most high-scoring cross-links: 103, 111, and 131 in the three replicate analyses, respectively.
Cross-links were then manually validated in the raw data by confirming their charge, mass-to-charge ratio (m/z), and the presence of doublet precursors with a 12 Da mass shift, corresponding to the d0 and d12 versions of the cross-linker. Figure 1C shows the mean and standard deviation for the validated cross-links found in the triplicate studies. The wide energy ramp and those including the iTRAQ (isobaric tags for relative and absolute quantitation) method display the highest degree of variability. This is most likely due to the broad range of energies sampled by these ramps. The iTRAQ method and the wide energy ramp expose peptides to low collision energies irrespective of their m/z. The iTRAQ method does so in a temporal manner, with 50% of the scan time devoted to these low ranges. 17 The wide energy ramp encompasses the full range of energies tested by all five other ramps. As the m/z of the cross-link increases, the range of energies it is exposed to also increases. Thus, by reducing the selectivity of the energy range, a greater distribution of fragmentation patterns is observed. This most likely leads to greater variability in the number of cross-links identified. In Figure 1C the high-iTRAQ method shows the best performance of all energy ramps. This is due to variability in the performance of the second and third technical replicates (Figure S2) and is most likely the result of the broader range of energies applied in the iTRAQ method. At higher scoring thresholds the mid energy ramp continues to perform the best (Figure S4). The combination of all three replicates analyzed using the mid energy ramp identified the highest number, some 277 unique cross-links (see the Supporting Information). As defined above, unique cross-links are described by their sequence, not solely by their position. Hence, this high number of identifications is due to differences in peptide lengths and modifications.
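The doublet criterion used in the manual validation above can be illustrated with a short sketch. It assumes the exact monoisotopic shift between the d0 and d12 cross-linker (twelve hydrogen-to-deuterium substitutions, nominally 12 Da) and a hypothetical 10 ppm matching tolerance; the peak list, the tolerance, and the function name find_doublets are illustrative assumptions and are not taken from the workflow described here.

```python
# Mass difference between 2H and 1H (monoisotopic), in Da.
D_MINUS_H = 2.014102 - 1.007825
# d12 versus d0 cross-linker: twelve substitutions, about 12.0753 Da.
DOUBLET_SHIFT = 12 * D_MINUS_H


def find_doublets(peaks, tol_ppm=10.0):
    """Pair peaks of equal charge separated by DOUBLET_SHIFT / charge.

    peaks is a list of (m/z, charge) tuples; tol_ppm is a hypothetical
    matching tolerance, not a value from the source.
    """
    pairs = []
    for mz_light, z_light in peaks:
        expected = mz_light + DOUBLET_SHIFT / z_light
        for mz_heavy, z_heavy in peaks:
            same_charge = z_heavy == z_light
            error_ppm = abs(mz_heavy - expected) / expected * 1e6
            if same_charge and error_ppm <= tol_ppm:
                pairs.append(((mz_light, z_light), (mz_heavy, z_heavy)))
    return pairs


# Invented precursor list for illustration only: (m/z, charge).
peaks = [(800.4000, 4), (803.4188, 4), (650.3000, 3), (656.3000, 3)]
doublets = find_doublets(peaks)
```

In this toy list only the 4+ pair is separated by the expected 12.0753/4 m/z units; the 3+ pair is 6 m/z units apart and is not reported.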
To assess the fragmentation patterns of the cross-links, tandem mass spectrometry (MS/MS) spectra were interrogated for the presence of cross-linked and linear fragment ions. Figure 2 shows spectra for one of the three cross-links identified by all of the energy ramps (see also Figure S11 for another such peptide). Ions corresponding to cross-linked fragments are shown in red, and linear ones are in blue. It can be seen that the high energy ramp contains no cross-linked fragment ions. The sequence coverage of the low and wide energy ramps is considerably worse, with the base peak in the wide energy ramp representing the intact cross-linked precursor. As this ramp covers a vast range of energies over the course of the scan, insufficient time is spent at any one energy to achieve efficient fragmentation. The mid energy ramp appears to offer the best fragmentation for this cross-link, providing the highest degree of sequence coverage. It should also be noted that, despite the poor sequence coverage of the cross-link in both the low and wide energy ramps, xQuest scores each favorably at 26 and 30, respectively. This shows that, although a score threshold is necessary, it is not sufficient to differentiate reliable cross-link identifications.
xQuest distinguishes between the two types of fragment ions and scores the correlation between the observed and theoretical fragment ion spectra separately for each type of ion. The results of these correlations are reported as two subscores: XCorrx for cross-linked fragment ions and XCorrb for linear fragment ions. In order to better visualize the effects of the ramps on each type of fragment ion produced, a kernel density estimation (KDE) was performed on the XCorrx and XCorrb subscores. Kernel density estimation estimates the underlying probability distribution from which a sample is drawn. It requires that each instance within the sample is both independent and identically distributed. Parts A and B of Figure 3 show a comparison of the estimated distribution for both the linear fragment ion correlation score and the crosslinked fragment ion correlation score, for the high, mid, and low energy ramps.
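As a minimal sketch of the kernel density estimation described above, the following uses a Gaussian kernel with the normal reference rule for the bandwidth, h = 1.06·σ·n^(−1/5). The kernel and bandwidth choice are assumptions (the text does not state which were used), and the subscore values are invented purely for illustration.

```python
import math
import statistics


def gaussian_kde(sample, bandwidth=None):
    """Return a function estimating the density of `sample` at a point x."""
    n = len(sample)
    if bandwidth is None:
        # Normal reference rule of thumb (an assumed, common default).
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


# Invented XCorrx-like subscores for two ramps (not data from this study).
xcorrx_low = [0.22, 0.31, 0.28, 0.35, 0.40, 0.26]
xcorrx_high = [-0.05, 0.02, 0.08, -0.01, 0.10, 0.04]

kde_low = gaussian_kde(xcorrx_low)
kde_high = gaussian_kde(xcorrx_high)

# Evaluate on a grid from -0.5 to 1.0 and locate the mode of each estimate.
grid = [i / 100 for i in range(-50, 101)]
mode_low = max(grid, key=kde_low)
mode_high = max(grid, key=kde_high)
area_low = sum(kde_low(x) for x in grid) * 0.01  # Riemann sum, close to 1
```

Comparing the position of the modes, or plotting both curves on the same axis, gives the kind of shift between ramps that the density comparison is intended to reveal.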
For the linear fragment ion correlation the low ramp performs poorly while the high ramp performs well, as indicated by the shifts in KDE in Figure 3A. This suggests that at lower energies linear peptides are fragmenting less efficiently. In Figure 3B the opposite phenomenon is observed.  For cross-linked fragment ions the low ramp performs better than the high ramp. These results demonstrate an opposing effect between preserving cross-linked fragments and obtaining sufficient sequence coverage for linear peptides. It is, therefore, not surprising that the mid energy ramp provides a higher overall number of cross-link identifications when used in combination with the xQuest analysis software.
To further examine the fragmentation patterns associated with each ramp, we analyzed the data for the presence of BS3/DSS diagnostic ions 18 (Figure 4). These ions have previously been identified in 71% of cross-link spectra analyzed by HCD. 19 In good agreement with previous studies, we find that at high energy 52% of cross-link spectra contain the tetrahydropyridine modification (Table 1). The negative XCorrx scores in the high energy ramp are likely due to fragmentation of the cross-linker amide bond, which prevents the observation of cross-linked fragment ions.
When cross-links are utilized in structural modeling approaches, cross-link residue pairs, defined solely by their position in the structure, are of paramount importance. 2,20 To compare in detail the effects of the collision energy ramps on residue pair identifications, cross-links were further validated using in-house software (Figure 5A). The program works with the xQuest xml result files to filter cross-links based on whether the identification meets the more recent published standards for confident cross-link assignment. 7 These include signal/noise, fragmentation of both peptides, and the presence of both cross-linked and linear fragment ions. As expected, these more stringent criteria lower the overall number of identifications (Figure 5B). The identifications made by the high ramp were reduced significantly, with a mean cross-link identification rate of 11 (Figure S12). As mentioned, this is most likely due to the loss of cross-linked fragment ions. The mid ramp remains the best performing, identifying 37, 37, and 39 cross-links in each of the technical repeats (Table S2). In addition, the intersection of the three best performing ramps shows that there is little overlap between the identifications made by each one (Figure 5B). (The intersection for all energy ramps is shown in Figure S13.) Incorporating more than one energy ramp leads to a higher number of cross-link identifications. This can be explained by the relationship between fragmentation efficiency and m/z. 21 Cross-links are composed of two peptides, each with its own optimal fragmentation energy. By combining different energy ramps, up to 40% more distinct cross-link identifications can be achieved (Figure 5B).
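The gain from combining ramps can be expressed as a set operation on residue pairs. The sketch below is illustrative only: the residue pairs are invented, and the function name combined_gain is hypothetical; it simply shows how the union of identifications across ramps is compared with the best single ramp.

```python
def combined_gain(best, *others):
    """Return the union of residue pairs and the percentage gain over `best`."""
    union = set(best).union(*others)
    gain_percent = (len(union) - len(best)) / len(best) * 100
    return union, gain_percent


# Invented residue pairs (position 1, position 2) for three ramps.
mid = {(28, 36), (235, 245), (374, 380), (455, 463), (548, 557)}
low = {(28, 36), (235, 245), (198, 204)}
high = {(374, 380), (88, 100)}

union, gain = combined_gain(mid, low, high)
shared_by_all = mid & low & high  # pairs found by every ramp
```

With these toy sets the low and high ramps each contribute one pair that the mid ramp misses, so the union is larger than any single ramp even though no pair is shared by all three.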
It has frequently been reported that fragmentation of both peptides within a cross-link is not symmetric. 7,12,14,19 One peptide, in most cases the larger of the two, will fragment more readily than the other, so a higher sequence coverage is observed. In line with xQuest, we define the larger of the two peptides within the cross-link as the α peptide for this analysis. Figure 6 displays the distribution of annotated peaks for both the α (lilac) and β (purple) peptides within the cross-links. Across most of the energy ramps the α peptide fragments more than the β peptide, with the only exception being the low ramp. However, annotation of both peptides is higher than 40%. This is in contrast to previous reports, using both ion trap CID 14 and HCD 12 with an Orbitrap analyzer, where the β peptide consistently displayed poorer fragmentation. In the case of ion trap CID, only 22% of the most intense annotated fragment ions corresponded to the β peptide. 14 Kolbowski et al. attribute this phenomenon to the nature of cross-linked peptides rather than the cross-linker used. 12 One possible explanation for this difference in β-peptide annotation may involve the way fragmentation energy values are calculated by vendor instrument software. Normalized collision energy reported by Thermo Scientific is a linear percentage of the available collision energy, which compensates for the mass dependency of optimal collision energy. 22 An ion of a particular m/z value will be exposed to a single energy value. As previously discussed, the Waters Corporation energy ramp exposes a precursor of a particular m/z to a range of energies across the course of a scan. As cross-links are a combination of two peptides, each with different m/z ranges, this may be advantageous; the optimal energy for fragmentation is unlikely to be related solely to the m/z of the entire species.
Analytical Chemistry Article

xQuest does not consider ions generated through fragmentation of both peptides in a cross-link. As the software was written using data from fragment ions generated by ion trap CID, with a mass range of 200−2000 Da, immonium ions are also not considered. These ions are diagnostic of the presence of specific amino acids in a peptide and are often used in de novo peptide sequencing. During analysis of fragment ion spectra the presence of these ions was found to inhibit rather than enhance the xQuest scoring algorithms. Figure 7 shows the fragment ion spectra of a monolink identified by xQuest. Immonium ions for both lysine, following the loss of ammonia (84 Da), and valine (72 Da) are present and indicated in the figure. The mass shift between these ions is 12 Da; xQuest has erroneously identified the peak at 84 Da as a cross-linked peak. Although xQuest can be modified to include a wider mass range, it cannot annotate the peaks correctly as the algorithms do not expect the presence of immonium ions. As the correlation algorithm does not have a low-mass cutoff, this leads to unannotated peaks in the spectra and poorer overall correlation of the theoretical and observed cross-linked fragment ions. Adjustment of the xQuest search parameters to include a wider mass range is therefore not recommended as a necessary modification for searching QTOF data.
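The nominal masses quoted for the two immonium ions can be reproduced from standard monoisotopic residue masses, taking the immonium ion as the residue minus CO plus a proton. The sketch below is a worked calculation only; the constants are standard reference values rather than numbers from this study, and the variable names are illustrative. It also compares the gap between the two ions with the exact d0/d12 isotope shift; how a given search engine treats that difference depends on its shift tolerance, which is not examined here.

```python
# Standard monoisotopic masses in Da (reference values, not from the source).
PROTON = 1.007276
CO = 27.994915
NH3 = 17.026549
RESIDUE = {"K": 128.094963, "V": 99.068414}


def immonium_mz(residue):
    """m/z of the singly charged immonium ion: residue - CO + proton."""
    return RESIDUE[residue] - CO + PROTON


lys_minus_nh3 = immonium_mz("K") - NH3   # lysine immonium after ammonia loss
val = immonium_mz("V")                   # valine immonium
immonium_gap = lys_minus_nh3 - val       # nominally 12 Da

# Exact shift between d12 and d0 cross-linker for a singly charged ion.
doublet_shift = 12 * (2.014102 - 1.007825)
gap_minus_shift = immonium_gap - doublet_shift
```

The two immonium ions fall at about m/z 84.08 and 72.08, so their separation coincides with the nominal 12 Da label shift while differing from the exact isotope shift by a few hundredths of a dalton.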

■ CONCLUSION
QTOF mass spectrometers are ubiquitous in most lines of MS research including metabolomics, proteomics, and small-molecule analysis; however, they have not yet been widely applied to the field of cross-linking mass spectrometry. We have shown that QTOF instruments can indeed be used to analyze cross-linked peptide samples and that data from this instrument configuration can be processed directly with existing cross-linking software applications, with little modification to their parameters. Owing to the unique divergent energy ramp, which applies a range of energies across each scan, we have demonstrated that the Synapt G2-Si offers improved fragmentation of the β peptide when compared with published data collected on an Orbitrap instrument. 12,14 Our analysis also shows that, by combining two energy ramps that occupy alternate energy ranges, different cross-links can be identified. This presents a unique opportunity to improve cross-link yield by up to 40%. Given that adaptation of collision energy is not required for the analysis of different cross-linker chemistries by an Orbitrap, 23 this protocol could be applicable to a broad range of cross-linkers. This includes the widely used MS-cleavable cross-linkers. 24,25 In addition, validation of the results is laborious; time requirements increase with cross-link yield. Employment of a simple score threshold serves only as a guide. We offer a programmatic solution which works directly with the xQuest result xml file to gain confidence in the results generated from the search. This enables cross-link identifications to feed directly into structural modeling approaches, as we have previously shown that higher quality cross-link identifications are more beneficial for modeling purposes. 20,26,27 Finally, the development of an optimized method for the analysis of cross-linked peptides on a QTOF provides the groundwork for exploration of the addition of IMS to a cross-linking experiment. 
This gas-phase fractionation is conducted online and may lead to a reduction in sample preparation

Sample Preparation. Amounts of 0.3 mg/mL BSA and 1 mg BS3 d0/d12 were prepared in 20 mM HEPES at pH 7.6. The cross-linker was added to the protein and diluted to a final concentration of 2.5 mM BS3 d0/d12. The sample was then incubated at room temperature for 40 min under mild agitation. Following incubation, the reaction was quenched by adding 1 M ammonium bicarbonate to a final concentration of 50 mM. The samples were then evaporated to dryness in a vacuum concentrator and resuspended in 8 M urea at a concentration of 1.1 mg/mL.
A 1% solution of RapiGest was added to a final concentration of 0.1% to aid solubilization before digestion. The sample was then incubated with 10 mM dithiothreitol at 37 °C for 30 min to denature the protein and reduce the disulfide bonds. Following incubation, the sample was cooled to room temperature. In order to prevent disulfide bond reformation, iodoacetamide was added to a final concentration of 20 mM. As iodoacetamide is unstable when exposed to light, the mixture was incubated in the dark at room temperature for 30 min. The sample was then diluted with 50 mM ammonium bicarbonate to reduce the final concentration of urea to <1 M. Trypsin was added to the sample at a 50:1 protein-to-enzyme ratio. The reaction mixture was incubated overnight at 37 °C with mild agitation. Following overnight incubation, enzymatic activity was quenched by adding formic acid to a final concentration of 2% (v/v). In preparation for size exclusion chromatography (SEC) fractionation, the sample was purified using Sep-Pak SPE cartridges and evaporated to dryness.
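The final concentrations quoted in the sample preparation follow from simple mixing arithmetic. As a hedged illustration, the sketch below computes the volume of a stock needed to reach a target final concentration and the volume of diluent needed to lower a concentration. The starting volumes (100 µL and 50 µL) are hypothetical, since sample volumes are not given in the text, and the function names are illustrative.

```python
def stock_volume_to_add(sample_ul, stock_conc, final_conc):
    """Volume of stock (same units as sample_ul) to reach final_conc.

    Solves stock_conc * V = final_conc * (sample_ul + V), assuming the
    sample initially contains none of the reagent.
    """
    return sample_ul * final_conc / (stock_conc - final_conc)


def diluent_to_reach(sample_ul, start_conc, target_conc):
    """Volume of diluent that lowers start_conc to exactly target_conc."""
    return sample_ul * (start_conc / target_conc - 1)


# Quench: 1 M (1000 mM) ammonium bicarbonate to 50 mM final,
# for a hypothetical 100 uL reaction.
quench_ul = stock_volume_to_add(100.0, 1000.0, 50.0)

# Urea: 8 M lowered to 1 M for a hypothetical 50 uL sample; going below
# 1 M requires more diluent than this boundary value.
urea_diluent_ul = diluent_to_reach(50.0, 8.0, 1.0)
```

For the urea step the result is a lower bound: an 8-fold dilution gives exactly 1 M, so any larger dilution satisfies the <1 M condition.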
Following SPE, the sample was resuspended in 20 μL of SEC buffer (degassed water/acetonitrile/TFA at 70/30/0.1 v/v/v). An amount of 15 μL of sample was injected onto an equilibrated Superdex Peptide column. Fractions of 100 μL were collected. The samples were then evaporated to dryness and resuspended in 10 μL of liquid chromatography−mass spectrometry (LC−MS) buffer: 95% H2O, 5% acetonitrile, and 0.1% formic acid.
LC−MS/MS Analysis. Samples were introduced using a nano ultraperformance liquid chromatograph (10 kpsi nanoACQUITY, Waters) with buffers (A) MS grade water with 0.1% formic acid and (B) MS grade acetonitrile with 0.1% formic acid. Samples were desalted by a reversed-phase Symmetry C18 trap column (180 μm internal diameter, 20 mm length, 5 μm particle size, Waters Corporation) at a flow rate of 8 μL/min for 3 min in 99% buffer A. Peptides were then separated using a linear gradient (0.3 μL/min, 35 °C; 97−60% buffer A over 90 min) using a BEH130 C18 nanocolumn (75 μm internal diameter, 400 mm length, 1.7 μm particle size, Waters Corporation). The TOF analyzer was externally calibrated from m/z 175.11 to 1285.54 using [Glu1]-fibrinopeptide B at 500 fmol/μL.

Data-Dependent Acquisition. Samples were analyzed using a Waters Synapt G2-Si quadrupole time-of-flight mass spectrometer tuned to a resolution of 20 000 (fwhm). Accurate mass measurements were made using data-dependent acquisition (DDA). The top 10 most intense precursors with charge states between +3 and +7 were selected over a mass range of 50−3000 Da with a scan time of 0.15 s and an interscan delay of 0.05 s. MS2 spectra were acquired using the collision energy ramps specified in Table 2.
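The precursor selection rule stated above (top 10 by intensity, charge states +3 to +7, within 50−3000 on the mass axis) can be sketched as a simple filter and sort. This is an illustration of the stated criteria only, not the instrument vendor's acquisition algorithm; the peak dictionary layout and the function name select_precursors are assumptions, and the survey peaks are invented.

```python
def select_precursors(survey_peaks, top_n=10, min_z=3, max_z=7,
                      mz_range=(50.0, 3000.0)):
    """Return the top_n most intense peaks passing charge and range filters."""
    low, high = mz_range
    eligible = [
        p for p in survey_peaks
        if min_z <= p["z"] <= max_z and low <= p["mz"] <= high
    ]
    return sorted(eligible, key=lambda p: p["intensity"], reverse=True)[:top_n]


# Invented survey-scan peaks for illustration only.
survey = [
    {"mz": 650.3, "z": 3, "intensity": 900.0},
    {"mz": 800.4, "z": 4, "intensity": 1500.0},
    {"mz": 500.2, "z": 2, "intensity": 5000.0},   # charge too low
    {"mz": 3100.0, "z": 3, "intensity": 4000.0},  # outside the range
    {"mz": 720.9, "z": 5, "intensity": 300.0},
]
selected = select_precursors(survey, top_n=2)
```

Restricting selection to charge states of +3 and above biases acquisition toward cross-linked species, which tend to carry more charge than linear peptides.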
Cross-Linking Analysis. Raw files were processed in PLGS v3.2.0 (Waters) using the slow de-isotoping algorithm for both MS and MS2 data. After PLGS processing, the files were exported in Mascot generic format (mgf) and further converted to mzXML format for input to xQuest. 4 Conversion to mzXML was accomplished with MSConvert (v3.0.7414) using 32-bit binary encoding.
xQuest was designed and trained on sample data obtained from an Orbitrap mass analyzer operated with ion trap CID; as such, modifications to the search parameters were necessary to accommodate QTOF data. These can be found in Tables S3−S7. Briefly, data obtained from an Orbitrap mass analyzer differ from data obtained on a QTOF in that they are not deconvoluted and contain fragment ions in multiple charge states. Tolerance for peak matching was adjusted to 5 ppm for precursors and 10 ppm for fragment ions. Additional search parameters included the following: cross-linking was assumed to occur with lysine, tyrosine, serine, threonine, and the protein N-terminus; two possible missed cleavages; variable modification of methionine oxidation; fixed modification of cysteine carbamidomethylation. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE 28 partner repository with the data set identifier PXD011704.
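The peak-matching tolerances quoted above are relative (ppm) tolerances, so the absolute window scales with m/z. A minimal sketch of the calculation follows; the function names and the example m/z values are illustrative assumptions, and only the 5 ppm and 10 ppm limits come from the text.

```python
PRECURSOR_TOL_PPM = 5.0   # precursor tolerance stated in the text
FRAGMENT_TOL_PPM = 10.0   # fragment ion tolerance stated in the text


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed, theoretical, tol_ppm):
    """True if the absolute ppm error does not exceed tol_ppm."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


# Invented example values: a theoretical m/z and two observations.
theoretical_mz = 1000.5000
close_match = within_tolerance(1000.5040, theoretical_mz, PRECURSOR_TOL_PPM)
loose_as_precursor = within_tolerance(1000.5080, theoretical_mz, PRECURSOR_TOL_PPM)
loose_as_fragment = within_tolerance(1000.5080, theoretical_mz, FRAGMENT_TOL_PPM)
```

At m/z 1000.5 an offset of 0.004 is about 4 ppm and an offset of 0.008 is about 8 ppm, so the second observation would be rejected under the precursor tolerance but accepted under the fragment tolerance.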
ValidateXL.py is a script written in Python 3.5. ValidateXL.py analyzes the xQuest merged_xquest.xml files which are generated and stored in the xQuest result file after a completed search.
ValidateXL extracts information from the XML files to assess the quality of a cross-link identification based upon the signal/noise ratio of both the cross-linked and linear fragment ions. It filters cross-links and creates three CSV files: cross-links which are of sufficiently high quality to be used for modeling, cross-links that require manual validation, and cross-links that should be rejected. ValidateXL requires the XML and fasta files for the protein of interest and will run on multiple xQuest search results. A full description and details on usage can be found in the Supporting Information.
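The three-way sorting described above can be illustrated with a short sketch. This is not the ValidateXL implementation: the field names, the signal/noise threshold of 3, and the rule that a hit passing only some checks goes to manual validation are all hypothetical choices made for illustration. The three checks themselves mirror the criteria named in the text (signal/noise, fragmentation of both peptides, and presence of both cross-linked and linear fragment ions).

```python
import csv
import io


def triage(hit, min_snr=3.0):
    """Classify a hit as 'accept', 'manual', or 'reject' (hypothetical rule)."""
    checks = (
        min(hit["snr_xlinked"], hit["snr_linear"]) >= min_snr,
        hit["alpha_ions"] > 0 and hit["beta_ions"] > 0,
        hit["n_xlinked"] > 0 and hit["n_linear"] > 0,
    )
    if all(checks):
        return "accept"
    return "manual" if any(checks) else "reject"


def write_buckets(hits):
    """Write hit ids to one in-memory CSV per category and return the ids."""
    buffers = {name: io.StringIO() for name in ("accept", "manual", "reject")}
    writers = {name: csv.writer(buf) for name, buf in buffers.items()}
    for hit in hits:
        writers[triage(hit)].writerow([hit["id"]])
    return {name: buf.getvalue().split() for name, buf in buffers.items()}


# Invented hits with hypothetical field names.
hits = [
    {"id": "A", "snr_xlinked": 5.0, "snr_linear": 4.0,
     "alpha_ions": 6, "beta_ions": 4, "n_xlinked": 5, "n_linear": 5},
    {"id": "B", "snr_xlinked": 5.0, "snr_linear": 4.0,
     "alpha_ions": 6, "beta_ions": 0, "n_xlinked": 2, "n_linear": 5},
    {"id": "C", "snr_xlinked": 1.0, "snr_linear": 1.0,
     "alpha_ions": 3, "beta_ions": 0, "n_xlinked": 0, "n_linear": 3},
]
buckets = write_buckets(hits)
```

In-memory buffers stand in for the three CSV files here; replacing each io.StringIO with an opened file would write them to disk.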