Isobaric Labeling Proteomics Allows a High-Throughput Investigation of Protein Corona Orientation

ABSTRACT: The formation of the biomolecular corona represents a crucial factor in controlling the biological interactions and trafficking of nanomaterials. In this context, the availability of key epitopes exposed on the surface of the corona, and able to engage the biological machinery, is important to define the biological fate of the material. While the full biomolecular corona composition can be investigated by conventional bottom-up proteomics, the assessment of the spatial orientation of proteins in the corona in a high-throughput fashion is still challenging. In this work, we show that labeling corona proteins with isobaric tags in their native conditions and analyzing the MS/MS spectra of tryptic peptides allow an easy and high-throughput assessment of the inner/outer orientation of the corresponding proteins in the original corona. We put our results in the context of what is currently known of the protein corona of graphene-based nanomaterials. Our conclusions are in line with previous data and were confirmed by in silico calculations.


■ INTRODUCTION
The past 20 years have witnessed a steadily increasing research interest in the use of nanomaterials for biological and biomedical applications. 1−3 Among nanomaterials, graphene-based nanoflakes have shown interesting potential for biological applications, 4−6 and several works have investigated their properties in terms of interactions with proteins and cells. 7−13 In this context, the study of the biological interactions occurring on the nanomaterial surface and their engagement mechanisms with biological systems is of paramount importance to identify successful strategies for the design of effective nanoformulations. 14 It is well accepted that the layer of biomolecules, mostly proteins, adsorbing on the surface of nanomaterials (the so-called biomolecular or protein corona) determines the final biological identity of the nanomaterials once in contact with biological fluids in vitro and in vivo. 15−20 It has been shown that the biomolecular corona can mask the nanoparticle-designed functionality, strongly affecting the efficacy of the formulation and providing an entirely new set of interacting moieties. The corona composition, generally characterized through mass spectrometry, 21−23 is strictly related to several factors such as the core material, the nanomaterial size and shape, the surface chemistry, and the biofluid composition. However, only the biomolecular domains that are exposed at the periphery of the layer participate in the biological engagement with other biomolecules, cell membranes, and cell receptors. 21,24−26 Therefore, it is not sufficient to characterize the protein layer in terms of mere composition; it is of utmost importance to identify how proteins preferentially orient their structure onto the surface of the nanomaterial to identify key epitope domains driving the bionano-interactions. To this aim, several recent works have made use of immunometric mapping techniques based on various types of reporters. 27−31
These techniques allow identifying the availability and functionality of specific recognition sites by exploiting the affinity and specificity of antibodies, and it is conceptually possible to multiplex the mapping of antibody signal by using a plethora of different reporters. However, the methodology still presents some practical challenges, and complications arise when considering that the biomolecular corona in situ is not a monolayer, and proteins undergo dynamic exchanges in vivo. 32 Zhang and colleagues 31 recently characterized the binding proteins forming the total corona, suggesting that only a minor percentage of proteins is able to bind its target. Based on their findings, they proposed a multilayer corona structure, focusing on the active sites of biological recognition of the external corona. These approaches, however, do not provide information on the spatial orientation of the whole inner protein structure (in contact with the material surface, the so-called hard corona), an aspect that is commonly investigated in silico. 33,34 In this work, we propose a novel, complementary, high-throughput methodology for the study of protein corona orientation. Graphene-based materials, due to their specific affinity for proteins and large carrier capability, are emerging as a novel class of biomedical nanomaterials for drug delivery, sensing, and in vitro diagnostics. 35−37 Given the involvement of our group in the Graphene Flagship, we used two graphene-based materials (with distinct surface chemistries) as models for a proof-of-concept test of our approach. This approach exploits isobaric labeling of proteins, 38 followed by conventional bottom-up proteomics, to derive information on the spatial orientation of a given peptide directly from the list of identified proteins. Other methods, 39 while similar in the general idea, do not allow easy and high-throughput investigation or multiplexing. 
The results we obtained for the protein corona of few-layer graphene (FLG, bare graphene surface) and graphene oxide (GO), confirmed by in silico analysis, suggested an intriguing connection between protein preferential adsorption and orientation onto specific graphene-based surface chemistries.

■ MATERIALS AND METHODS
Our work relies on a standard approach for the bottom-up proteomics investigation of the protein corona, performed by high-resolution mass spectrometry. In short, the materials were suspended in PBS and incubated with commercial human plasma. The corona was first labeled at the protein level, in native conditions, with the first tag; it was then denatured and labeled with the second tag, digested with trypsin, and analyzed. The only difference from conventional workflows for corona proteomics is the labeling with isobaric tags itself, which was performed following the protocol suggested by the vendor. The full description of our experimental approach is reported in the Supporting Information.

■ RESULTS AND DISCUSSION
We started from previous works in the field of protein corona orientation 27,30 to devise a high-throughput way to obtain relevant information directly from the conventional bottom-up proteomics experiments used to identify the corona composition. The idea is to employ well-established techniques for protein isobaric labeling, widely used over the past 10 years to quantify protein over- or underexpression in various experimental conditions. 40,41 Briefly, this approach uses isotopically labeled tags to covalently label K residues and N-termini of tryptic peptides from different experimental conditions, say A and B. While each tag is isotopically and specifically labeled with the use of 15N and 13C atoms, the overall mass of the tag is the same for all tryptic peptides deriving from either the A or the B condition. In the bottom-up proteomics experiments, both peptides elute simultaneously from the chromatographic column and, having the same mass-to-charge ratio (m/z), are selected together for the MS/MS fragmentation. The fragment ions related to the primary sequence of the peptides (allowing one to assign the peptide to a given protein) are the same for both peptides. At the lower end of the MS/MS spectrum, two distinct reporter ions, each at a characteristic m/z ratio, are present. The relative intensities of the reporter ions are directly related to the abundances of the two peptides (and proteins) in the experimental groups. For our purposes, we chose Thermo tandem mass tags (TMT) 42 to selectively label the "outer" and "inner" surfaces of the protein corona. This approach has the advantage of being independent of the complex dynamics happening at the periphery of the layer, as it can identify the preferential orientation of a protein onto a certain nanomaterial with a given surface chemistry. 
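The reporter-ion readout described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis pipeline: it assumes that the In/Out ratio is the intensity of the 127 reporter (label applied after denaturation, "inner") divided by that of the 126 reporter (label applied in native conditions, "outer"), and the `tolerance` band around 1 is a hypothetical parameter introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class ReporterPair:
    """Reporter-ion intensities read from one peptide MS/MS spectrum."""
    outer_126: float  # 126 m/z reporter: tag applied to the native corona
    inner_127: float  # 127 m/z reporter: tag applied after denaturation


def in_out_ratio(pair: ReporterPair) -> float:
    """Assumed definition: inner (127) intensity over outer (126) intensity."""
    if pair.outer_126 <= 0:
        raise ValueError("the 126 reporter intensity must be positive")
    return pair.inner_127 / pair.outer_126


def orientation(ratio: float, tolerance: float = 0.2) -> str:
    """Classify a ratio; `tolerance` is a hypothetical dead band around 1."""
    if ratio < 1.0 - tolerance:
        return "Out"
    if ratio > 1.0 + tolerance:
        return "In"
    return "undetermined"


# Illustrative intensities only, not measured data.
example = ReporterPair(outer_126=200.0, inner_127=100.0)
example_ratio = in_out_ratio(example)
example_call = orientation(example_ratio)
```

A ratio below 1 means that the residue was mostly tagged while the corona was still intact, which is read here as an outward-facing position.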
To maintain the spatial information, we labeled the corona at the protein level rather than at the peptide level: labeling therefore occurs before digestion with trypsin, contrary to the usual protocol. This procedure generally results in a slightly lower number of positive protein identifications and a lower protein sequence coverage, and it produces larger peptides because labeled K residues are not recognized by trypsin, resulting in missed cleavage sites. This alternative approach has also been widely used in the past, 43,44 including by our group for proteomics on graphene oxide. 7 It is important to point out that the information on the orientation of the K residues of each peptide positively assigned to a protein ID is automatically calculated by any proteomics data analysis software and is retrieved directly from the list of identified proteins. The orientation data thus come automatically, together with all other conventional bottom-up proteomics results, making this methodology suitable for high-throughput screening of nanomaterials. Figure 1 shows a general schematic of the implemented isobaric labeling strategy in the corona orientation workflow.
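The effect of protein-level labeling on peptide length can be illustrated with a simple in silico digestion. The sketch below is a simplification under stated assumptions: trypsin is modeled as cleaving after K or R unless the next residue is P, labeling is assumed to block every K, and the function name `digest` is hypothetical. The test sequence is the ApoA1 stretch quoted in the Results.

```python
import re
from typing import List


def digest(sequence: str, lysine_blocked: bool = False) -> List[str]:
    """Split a protein sequence at tryptic sites.

    With `lysine_blocked=True`, labeled K residues are assumed not to be
    recognized, so cleavage happens only after R.
    """
    cleavable = "R" if lysine_blocked else "KR"
    # Zero-width split point: after a cleavable residue, not before proline.
    site = re.compile(rf"(?<=[{cleavable}])(?!P)")
    return [piece for piece in site.split(sequence) if piece]


apoa1_stretch = "QEMSKDLEEVKAKVQPYLDDFQKKWQEEMELYR"

unlabeled = digest(apoa1_stretch)                      # conventional digestion
labeled = digest(apoa1_stretch, lysine_blocked=True)   # K residues tagged first
```

In this example the conventional digestion yields six short fragments, whereas the labeled protein gives a single 33-residue peptide, which is why protein-level labeling tends to produce larger peptides and lower sequence coverage.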
Analytical Chemistry pubs.acs.org/ac Article
As a proof-of-concept study, two distinct materials, few-layer graphene (FLG) and graphene oxide (GO), were incubated in human plasma (HP) for 48 h to enrich the corona in proteins having the highest affinities. 36 The materials characterization is reported in Figure S1. After washing away the loosely attached proteins with a series of centrifugation steps in phosphate-buffered saline (PBS; see Materials and Methods), the exposed K residues of the long-lasting corona under its native condition were labeled with the 126 TMT tag (an isobaric tag that produces a 126 m/z fragment ion in the MS/MS spectrum). After removing the unreacted tag, the protein corona was denatured using very harsh conditions (see Materials and Methods). This treatment detached the corona from the material, exposing the "inner" part of the corona, which was then labeled with the 127 TMT tag. After precipitation in cold acetone, the labeled proteome was digested with trypsin, and a conventional LC−MS/MS protein identification experiment was performed. As usual, peptides underwent MS/MS experiments to match their backbone fragments against suitable databases for protein identification. Only the peptides carrying labeled K were selected, as they were the ones showing the two reporter fragment ions (126 and 127 m/z) in their MS/MS spectrum. The relative intensity of these ions was correlated to the preferential In/Out orientation of the corresponding peptide in the native corona. The samples were analyzed by LC−MS/MS, and all the obtained MS/MS spectra were searched against the Homo sapiens protein database. We first focused on protein identification, retaining those proteins identified in all three replicates of each group. We then compared the protein IDs from the two groups, as reported in Figure 2 (left).
A total of 57 proteins were identified in both groups, while 4 and 66 were exclusively identified in the FLG and GO groups, respectively (see Supporting File 1 for the complete list of proteins). With gene enrichment analysis (Figure 2, right), we highlighted the top 10 most significant biological pathways differently represented by the proteins forming the corona on the two materials. Figures S2 and S3 show the corresponding most enriched biological processes and molecular functions, respectively. We then focused our attention on the 57 proteins shared by the two materials, visually scrutinizing the peptides assigned to each ID to determine the influence of the surface chemistry on the protein preferential orientation. The full peptide dataset is reported in Supporting File 2. We identified 46 peptides, assigned to 22 of the common proteins, observed in both samples (FLG and GO) in all three replicates. All these peptides have K in their sequence and thus carry information about their In/Out orientation. From a visual
To obtain a general overview of the whole dataset, we then applied principal component analysis (PCA) using the observed In/Out ratios of the 46 peptides as variables. We then performed t-test statistics for the two groups and a heatmap analysis to highlight differential orientation trends. Figure 3 summarizes our findings.
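The protein-level comparison reported above (proteins retained only if found in all three replicates, then compared between FLG and GO) amounts to a few set operations. The following sketch uses hypothetical replicate lists with placeholder protein IDs; only the final arithmetic uses the counts reported in the text.

```python
from typing import Iterable, List, Set, Tuple


def consistent_ids(replicates: List[Iterable[str]]) -> Set[str]:
    """Keep only the protein IDs present in every replicate of a group."""
    sets = [set(r) for r in replicates]
    if not sets:
        return set()
    return set.intersection(*sets)


def venn_counts(a: Set[str], b: Set[str]) -> Tuple[int, int, int]:
    """Return (only in a, shared, only in b)."""
    return len(a - b), len(a & b), len(b - a)


# Hypothetical replicate lists, for illustration only.
flg = consistent_ids([
    ["ALBU", "APOE", "FIBA"],
    ["ALBU", "APOE", "TRFE"],
    ["ALBU", "APOE", "FIBA"],
])
go = consistent_ids([
    ["ALBU", "TRFE", "HBA"],
    ["ALBU", "TRFE", "HBA"],
    ["ALBU", "TRFE", "HBA"],
])
only_flg, shared, only_go = venn_counts(flg, go)

# Counts reported in the text: 57 shared, 4 FLG-only, 66 GO-only.
flg_total = 57 + 4    # proteins in the FLG corona
go_total = 57 + 66    # proteins in the GO corona
```

The totals obtained this way (61 for FLG and 123 for GO) match the per-material numbers quoted later in the comparison with published datasets.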
The PCA shows that despite the highly complicated appearance of the quantitative dataset, it is indeed possible to observe patterns in the In/Out orientation of the 46 peptides, as also depicted in the heatmap analysis, which shows orthogonality between the orientations on the two surfaces for some of the proteins. For example, two of the three mapped albumin peptides (NECFLQHKDDNPNLPR and VTKCCTESLVNR) appear to be oriented "In" when in FLG corona (red values) and "Out" when in GO corona (blue values). The third peptide (YTKKVPQVSTPTLVEVSR) generated apparently random In/Out ratios, with blue and red values mixed in the replicates. In this case, the covalent labeling is capturing a nonspecific orientation of this part of the sequence. This happens because that part of the sequence is flexible enough to be dynamically moving on the corona structure. Similar conclusions can be drawn for the APOE peptide LSKELQAAQAR and for the HPDEAAFFDTASTGKTFPGFFSPMLGEFVSETESR peptide of fibrinogen alpha chain (FIBA). An opposite trend ("Out" for FLG and "In" for GO) is observed for the DASGVTFTWTPSSGKSAVQGPPER peptide of immunoglobulin heavy constant alpha 1. Our method is, in this case, capturing kinetic processes ongoing for parts of the sequences of corona proteins. The observed trends in the In/Out ratios might thus be associated with preferential orientations, or most likely configurations, that these sequences assume in the corona. Based on this assumption, we then hypothesized that proteins showing high specificity for a given material might have a specific affinity for its surface, therefore assuming more univocally oriented conformation in the corresponding corona. We then looked into the 4 and 66 proteins exclusively observed in FLG or GO, respectively. We used the Venn analysis of the peptides (Figure S4) to select those peptides unique to each of the two materials. 
In the end, a list of 42 peptides was present in all three replicates of each group (the details of the analysis are reported in Supporting File 2). The proteins corresponding to these peptides present a more specific orientation onto the surface of the two graphene materials. Indeed, 19 out of 42 peptides (45%) show concordant In/Out ratios in all three replicate samples. This suggests a link between the protein affinity for given surface chemistries and preferential orientation of the protein onto that surface. Quite interestingly, all the concordant peptides were observed in the GO group only, perhaps suggesting that this material can induce a more defined orientation of the specific proteins in its corona. Table 1 reports the summary of these 17 concordant peptides (14 corresponding proteins).
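The notion of concordant In/Out ratios can be made concrete with a short sketch. The criterion used here is an assumption, since the text does not spell it out: a peptide is counted as concordant when its ratios fall on the same side of 1 in all replicates. The peptide names and ratios in `toy` are placeholders, not measured data.

```python
from typing import Dict, Sequence


def is_concordant(ratios: Sequence[float]) -> bool:
    """Assumed criterion: every replicate ratio lies on the same side of 1."""
    if not ratios:
        return False
    return all(r < 1.0 for r in ratios) or all(r > 1.0 for r in ratios)


def concordant_fraction(peptides: Dict[str, Sequence[float]]) -> float:
    """Fraction of peptides whose replicate ratios are concordant."""
    if not peptides:
        return 0.0
    hits = sum(is_concordant(r) for r in peptides.values())
    return hits / len(peptides)


# Placeholder peptides with three replicate In/Out ratios each.
toy = {
    "PEPTIDEK": [0.6, 0.7, 0.5],   # always below 1: concordant "Out"
    "SAMPLEKR": [1.4, 0.8, 1.1],   # mixed: not concordant
    "ANOTHERK": [1.8, 2.1, 1.5],   # always above 1: concordant "In"
    "LASTKR": [0.9, 1.2, 0.7],     # mixed: not concordant
}
toy_fraction = concordant_fraction(toy)

# Percentage quoted in the text for 19 concordant peptides out of 42.
reported_percent = round(100 * 19 / 42)
```

A stricter criterion, for example requiring each ratio to differ from 1 by a minimum margin, would lower the concordant fraction; the choice is left open here.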
To confirm our findings, we modeled the interaction of some of these proteins with the GO surface by performing docking simulations for proteins showing an In/Out peptide ratio smaller than 1 and a known experimental structure. In general, we observed a good agreement between the measured In/Out ratio of the peptides and the predicted docked orientation. Figure 4 reports the highest-ranking docked poses of five proteins on GO: carboxypeptidase N1 (CBPN), hemoglobin α (HBA), immunoglobulin heavy variable 3-23 (HV323), serotransferrin (TRFE), and ubiquitin 60S (RL40), along with the average measured In/Out ratio of the corresponding observed peptides (represented as yellow spheres).
As expected from the experimental proteomics data, the peptides with In/Out < 1 are mostly located on the periphery of the protein, thus indicating that the preferential orientation of specific proteins is efficiently "captured" by this labeling strategy and proteomics workflow. When comparing the results obtained with published proteomics data for FLG and GO, some difficulties are expected, given the use of different graphene-based materials (GRMs), experimental setups, and protein identification databases. We compared our results to those of two recent publications on the protein corona of bare graphene nanoflakes in human serum 11 and of GO in human plasma. 35 The extensive comparison of the protein IDs of our study with those reported in these two papers is reported in Supporting File 3. We first noticed that despite the already discussed limitations of protein-level labeling, the total numbers of identified proteins in these studies (100 for the work of Castagnola et al. 11 and 153 for the work of Di Santo et al. 35 ) are only slightly higher than those found in this study (61 for FLG and 123 for GO). When comparing our GO data with the work of Di Santo et al. (Figure S5A), roughly 50% of the IDs match, indicating that our GO corona composition is comparable. As far as FLG is concerned, the comparison with the human serum protein corona from Castagnola et al. 11 (Figure S5B) indicates that 35% of the corona proteins are shared by all three materials. GRMs are not standard nanomaterials for proteomics studies, as they do not present a regular geometrical nanoshape and are mostly a collection of "flakes" with different sizes and shapes. Moreover, GRMs present diverse dispersion stability in relation to their surface chemistry and hydrophobicity level. It is interesting to note that despite the intrinsic complexity in handling these materials, a trend in the preferential protein affinity for different GRM surfaces can be extrapolated from data coming both from the present study and from other reports, even though the GRMs and biofluids employed came from different sources (see Figure S6). Thirty-one proteins, reported in Table 2, are recurrent in the protein corona of these four GRMs. This comparison suggests that our orientation data obtained with TMT experiments were generated on a protein corona that is representative of what is currently known for other GRMs. 
We also focused on the corona orientation data reported by Castagnola et al. 11 Based on a positive recognition of the corona proteins by an antibody targeting the 113−243 sequence of ApoA1, the authors show that this (large) part of the protein (54% of the sequence) is exposed on the outer surface of the corona. It is difficult to compare results from such different techniques: the broad part of a protein recognized by an antibody might be globally oriented outward, while individual K residues present in its sequence might still be facing the material. We nevertheless searched our orientation data on ApoA1 for the corona of FLG, the closest material to the graphene nanoflakes used in that paper. The sequence (155−173) QKLHELQEKLSPLGEEMR shows an apparently random orientation in our data, with an average In/Out ratio of 1.1. The preceding peptide 108−140 QEMSKDLEEVKAKVQPYLDDFQKKWQEEMELYR, on the contrary, was not immediately present in our proteomics dataset. This type of very large peptide (this one has a mass of 5325 Da with the TMT tags) is not usually taken into consideration in bottom-up proteomics experiments. We then manually searched our RAW data, looking for MS and MS/MS data of this individual peptide. Indeed, we found evidence of this peptide (Figure S7A), as demonstrated by the presence of the extracted ion current of its charge state 6+ ion at 887.98 m/z. The mass spectrum (Figure S7B) shows that this peptide is detectable as charge states 6+, 5+, and 4+. It is normally difficult to obtain good MS/MS data from such large peptides, but by dedicated experiments, we managed to confirm the sequence of this molecule (Figure S8). Despite the presence of five K residues in its sequence, each in principle having a different orientation, we retrieved the data on its global In/Out ratio, which, in our experiments, was consistently <1 for the three FLG replicates (Figure S9). This result supports the global "outer" orientation of this part of ApoA1, and it is in agreement with previous data.
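The relation between the observed 6+ ion and the other charge states of the large ApoA1 peptide can be checked with the standard m/z formula for protonated ions. The sketch below is a worked calculation under one assumption: the 887.98 m/z value is treated as a single isotopic peak of the [M+6H]6+ ion. The function names are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass M from the m/z of an [M + zH]z+ ion."""
    return charge * (mz - PROTON_MASS)


def mz_for_charge(mass: float, charge: int) -> float:
    """Expected m/z of the [M + zH]z+ ion for a neutral mass M."""
    return (mass + charge * PROTON_MASS) / charge


# 6+ ion reported in the text for the labeled 108-140 ApoA1 peptide.
mass_from_6plus = neutral_mass(887.98, 6)

# Where the other charge states mentioned in the text would be expected.
expected_mz = {z: mz_for_charge(mass_from_6plus, z) for z in (6, 5, 4)}
```

The neutral mass obtained this way is about 5322 Da, close to the roughly 5325 Da quoted in the text; the small difference may reflect an average rather than monoisotopic mass or the isotopic peak chosen for extraction, which the text does not specify. The 5+ and 4+ ions would then be expected near 1065.4 and 1331.5 m/z.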

■ CONCLUSIONS
This work represents a proof-of-concept study highlighting an innovative strategy to simultaneously investigate the composition and the orientation of the proteins forming the biomolecular corona onto nanomaterial surfaces. Our corona composition data are in line with the current knowledge of the protein corona of widely used GRMs (FLG and GO), and the orientation data we obtained with the TMT tags match the poses calculated by molecular docking. Our method allows us to obtain data on protein orientation directly from large-scale proteomics datasets, thus in a high-throughput way. In addition to conventional bottom-up proteomics experiments for protein corona, this approach is (1) fast (a few additional hours of treatment are required), (2) easy to perform, (3) relatively cheap, and (4) minimally invasive toward posttranslational modification analysis (several tens of phosphorylation sites were identified in our data). Moreover, our method can probe individual K residues, and it thus provides a higher resolution compared to conventional immunometric mapping, which normally relies on the antibody recognition of large sequences of corona proteins. For a better understanding, Table 3 gives a general overview and a comparison of the advantages of our method compared to others currently available.
The workflow we developed could be routinely implemented into protein corona proteomics studies, and it would represent a valuable tool in line with the great efforts for nanomaterial biointeraction screening and classification promoted by several EU nanosafety projects (Nanosolutions: https://nanosolutionsfp7.com/, NanoREG I and II: http://nanoreg2.eu/, Nanoclassifier: https://cordis.europa.eu/project/id/324519/it, etc.). These efforts would greatly benefit from easy-to-obtain information on the epitopes of corona proteins that are exposed to biological interactions. Furthermore, our approach is multiplexable: TMT technology (Thermo) allows us to label 2, 6, or 10 different conditions, and iTRAQ (Sciex) allows 4 or 8 multiplexed experiments. In this study, we used the TMT-6plex with the 126 and 127 tags. Proteomics data analysis software packages make it easy to handle the plexing multiplicity, customize the reporting in many ways, and retrieve different In/Out ratios directly from the protein lists. This protocol opens up possibilities for analyses of increasing complexity to get closer to the realistic in vivo scenario. For example, different "layers" of corona proteins can be analyzed in the same experiment by detaching treatments of increasing strength followed by selective labeling with the corresponding multiplexed tags. This might give an opportunity to investigate proteins that are more loosely bound to the material surface and prone to dynamic exchange equilibria with the solvent and the biological environment (the so-called soft corona). 52 This investigation is currently performed with in situ spectroscopic techniques, 29,53−55 given the difficulty in tuning the incubation/removal conditions to discriminate and isolate the different corona layers, i.e., the outer layers of the corona, loosely bound to the material. 
While our method can generate orientation data with small additional effort compared to routine proteomics experiments on protein corona, the graphical representation of protein orientation will require dedicated software solutions. Finally, to achieve the highest sequence coverage (thus orientation data), overcoming the , is gratefully acknowledged for the use of Raman spectroscopy. The authors acknowledge the electron microscopy facility members of the Nanophysics Department at IIT Genova for the assistance with electron imaging.