MicroPOTS Analysis of Barrett’s Esophageal Cell Line Models Identifies Proteomic Changes after Physiologic and Radiation Stress

Moving from macroscale preparative systems in proteomics to micro- and nanotechnologies offers researchers the ability to deeply profile smaller numbers of cells that are more likely to be encountered in clinical settings. Herein a recently developed microscale proteomic method, microdroplet processing in one pot for trace samples (microPOTS), was employed to identify proteomic changes in ∼200 Barrett’s esophageal cells following physiologic and radiation stress exposure. From this small population of cells, microPOTS confidently identified >1500 protein groups, and achieved a high reproducibility with a Pearson’s correlation coefficient value of R > 0.9 and over 50% protein overlap from replicates. A Barrett’s cell line model treated with either lithocholic acid (LCA) or X-ray had 21 (e.g., ASNS, RALY, FAM120A, UBE2M, IDH1, ESD) and 32 (e.g., GLUL, CALU, SH3BGRL3, S100A9, FKBP3, AGR2) overexpressed proteins, respectively, compared to the untreated set. These results demonstrate the ability of microPOTS to routinely identify and quantify differentially expressed proteins from limited numbers of cells.


INTRODUCTION
Mass spectrometry (MS) has emerged as the most powerful technology for analysis and discovery of proteins. 1 Since the term proteome was coined in 1994, 2 researchers have used MS to comprehensively define the molecular mechanisms that underpin cellular functions. To achieve this goal, there are two main approaches to proteomics: top-down and bottom-up, with the latter being more often applied today. 3,4 Significant gains have been made by applying these approaches in large-scale studies 5 to fully profile protein expression and posttranslational modifications. However, standard proteomic analysis demands substantial amounts of starting material to exhaustively characterize a proteome. For instance, from about 10^5 to millions of cells have typically been used to achieve high proteome coverage. 6 Historically, the need for such large amounts of starting material often prevented proteomics from competing with genomics in the analysis of small numbers of cells, because genomic sample material can be amplified through polymerase chain reaction (PCR). 7 Proteomics, on the other hand, has had to pay special attention to sample preparation of small numbers of cells in order to avoid adsorptive losses of low-abundance proteins. This limitation has hampered the application of proteomics to samples of limited availability, such as human tissue biopsies.
Against the backdrop of the shortcomings of traditional macroscale sample preparation, mostly inherited from the field of protein chemistry, methods for working with limited numbers of cells have recently been reported, including laser capture microdissection, immobilized enzyme reactors, fluorescence-activated cell sorting (FACS), and microfluidics formats. 8−14 For example, using laser capture microdissection, Clair et al. 10 identified >3400 proteins from 4000 cells. However, none of the recent developments that allow characterization of proteomes from fewer than 1000 cells by MS-based proteomics can yet be said to be the method of choice for exploring proteomes. Thus, there is burgeoning interest in the community in developing and optimizing highly sensitive and specific microscale proteomic workflows to interrogate protein changes in both health and disease. Consequently, reports of micro- and nanoscale MS-based proteomics have increased markedly in number because they open new opportunities to explore trace levels of samples previously out of reach to researchers. 15−18 For instance, an ultrasensitive nanoscale method, which used gold nanoparticles, identified 650 proteins from a proteomic analysis of 80 cells with a protein detection limit reaching 50 zmol. 19 Additionally, integrated proteome methods optimized for single-cell analysis are increasingly becoming commonplace. 20,21 Some newly assembled proteome analysis devices have reported high numbers of identified proteins, e.g., 328 proteins identified from analysis of 10 single HeLa cells, with a detection limit estimated to be between 1.7 and 170 zmol. 22 One recently developed microfluidics-based platform, termed nanodroplet processing in one pot for trace samples (nanoPOTS), has demonstrated remarkable results in the proteomic analysis of small samples. 23 By applying a bottom-up proteomic approach, nanoPOTS proved to be capable of processing samples in nanowells with volumes of less than 200 nL. 23 This method was applied to the analysis of about 10 to 140 cells, and over 1500 proteins were confidently identified. Recently, nanoPOTS was also integrated with a top-down proteomic workflow, and ∼170 to ∼620 proteoforms from ∼70 to ∼770 HeLa cells were quantitatively identified with high confidence. 24 An adaptation of nanoPOTS that uses conventional micropipettes and operates in the low-microliter range, called microdroplet processing in one pot for trace samples (microPOTS), has also been developed to address bottlenecks of nanoPOTS such as the need for a nanoliter pipetting platform and highly skilled personnel. Initially, microPOTS was applied to ∼25 cultured HeLa cells and 50 μm square mouse liver tissue thin sections, and about 1800 and 1200 unique proteins were identified from HeLa cells and mouse liver, respectively. 25 Additionally, high reproducibility was reported in that analysis, with pairwise Pearson's correlation coefficients of 0.96−0.98 and median CVs of ≤12.4%.
In this study, we applied microPOTS to characterize the proteomes of ∼200 cells from a Barrett's esophagus cell model following various perturbations. Barrett's esophagus is a premalignant condition thought to arise in the lower esophagus due to chronic reflux of gastric acid and bile, leading to genotoxic stress and mutation of the gatekeeper genes TP53 and SMAD4. 26 Barrett's confers an approximately 100-fold greater risk of developing esophageal adenocarcinoma, and understanding the molecular changes during carcinogenesis may be useful to guide preventative therapy. 27 It is known that gut bacteria modify bile acids derived from cholic acid and chenodeoxycholic acid to deoxycholic acid and lithocholic acid (LCA). 28 In turn, deoxycholic acid and lithocholic acid are conjugated to yield a variety of conjugated bile salts that can exist in wide-ranging concentrations in patients being monitored for acid-reflux disease. It is not known whether there is a specific role for bile acids in the selection for specific genetic mutations in esophageal adenocarcinoma progression. Recent work has demonstrated that a novel sponge device (Cytosponge) can sample small numbers of surface cell populations from the esophagus without endoscopy and can determine the presence of Barrett's esophagus. 29 This device can also be used to triage patients with Barrett's for more intensive endoscopic surveillance according to the presence of markers of progression to esophageal adenocarcinoma. 30 As these devices become integrated into clinical practice, the molecular changes during the progression from Barrett's to esophageal adenocarcinoma need to be identified from the small numbers of cells retrieved during sampling. Identifying protein markers of progression that can be tested by immunohistochemistry will aid in improving the sensitivity and translation of this technology. The findings of this study reveal that microPOTS allowed the identification of >1500 proteins from fewer than 200 cells, and that radiation and physiologic stress induce proteomic changes in the cell models.

Materials
MicroPOTS chips were fabricated in-house as described previously. 25 The microwell chips were designed with a well diameter of 2.2 mm and a well-to-well spacing of 4.5 mm. LC-MS grade water and acetonitrile (ACN), formic acid (FA), iodoacetamide (IAA), and dithiothreitol (DTT) were purchased from Thermo Fisher Scientific (Waltham, MA). n-Dodecyl β-D-maltoside (DDM) was a product of Sigma-Aldrich (St. Louis, MO). Both Lys-C and trypsin were purchased from Promega (Madison, WI).

Cell Culture
CP-A cells were cultured in keratinocyte media (Thermo Fisher) supplemented with human recombinant epidermal growth factor (rEGF), bovine pituitary extract (BPE), and 1% penicillin/streptomycin (Invitrogen), and were incubated at 37°C with 5% CO2. All other chemicals and reagents were obtained from Sigma unless otherwise mentioned. The guide RNAs targeting the p53 and smad4 genes to generate isogenic gene knockout cells are described in a separate manuscript. Briefly, the guide RNAs were either cloned into the lentiCRISPRv2 transfer plasmid or procured as custom synthetic crRNAs from Integrated DNA Technologies (IDT), USA; tracrRNA was also manufactured by IDT. Cells were either transfected using Attractene transfection reagent (Qiagen) or electroporated using Nucleofector Kit V (Amaxa, Lonza). After electroporation, cells were transferred into a 6-well plate and allowed to recover for 3−5 days. Single cells were then isolated from the bulk population by flow cytometry and deposited individually into 96-well plates using a BD FACSJazz cell sorter. The individual colonies obtained were later replicated into 96-well plates and screened for successful gene deletion using immunocytochemistry against p53 or Smad4. The clonal lines that stained negative for the corresponding proteins were further expanded, and the loss of functional p53 and Smad4 protein was confirmed by immunoblotting with the respective antibodies. For selected knockout clones, Sanger sequencing with primers flanking the gRNA cleavage site confirmed the genetic editing.

Proteomic Sample Preparation in Microwells
To prepare samples for LC-MS analysis, 5 μL of 1% DDM and 0.5 μL of 500 mM DTT were added to 50 μL of sample, followed by incubation at 65°C and 600 rpm for 1 h to lyse the cells and denature proteins. The cell lysates were diluted to 200 cells/500 nL with 50 mM ammonium bicarbonate (ABC, pH 8.5), and 500 nL of the diluted lysate was pipetted into each microwell. Next, 500 nL of 10 mM IAA was added, and the samples were incubated in the dark at room temperature for 45 min. Two-step enzymatic digestion was performed by sequentially adding 500 nL of 10 ng/μL Lys-C and 500 nL of 20 ng/μL trypsin in ABC buffer, followed by incubation at 37°C for 3 and 10 h, respectively. Thereafter, 500 nL of 5% FA was added, followed by incubation at room temperature for 1 h. The chips were stored in a humidified box sealed in a Ziploc bag at 4°C until analysis.
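The additions described above can be tallied to check the nominal droplet volume and enzyme load per microwell. The following is a minimal sketch in Python; the function and variable names are illustrative only and are not part of the published protocol, and evaporation during incubation is ignored.

```python
# Illustrative tally of the additions made to one microwell.
# Volumes (nL) and concentrations are taken from the protocol text;
# all names are hypothetical, and evaporation is ignored.

ADDITIONS_NL = {
    "diluted lysate (200 cells)": 500,
    "10 mM IAA": 500,
    "10 ng/uL Lys-C": 500,
    "20 ng/uL trypsin": 500,
    "5% FA": 500,
}


def total_volume_nl(additions):
    """Sum of all dispensed volumes, in nanoliters."""
    return sum(additions.values())


def enzyme_mass_ng(conc_ng_per_ul, volume_nl):
    """Enzyme mass delivered: concentration (ng/uL) times volume (uL)."""
    return conc_ng_per_ul * volume_nl / 1000.0


if __name__ == "__main__":
    print("nominal final volume (nL):", total_volume_nl(ADDITIONS_NL))
    print("Lys-C delivered (ng):", enzyme_mass_ng(10, 500))
    print("trypsin delivered (ng):", enzyme_mass_ng(20, 500))
```

Under these assumptions each well receives 5 ng of Lys-C and 10 ng of trypsin, and the nominal final volume is 2.5 μL.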

LC-MS/MS Analysis
A nanoPOTS autosampler was employed to introduce samples from the microwells into the LC-MS system. 31 The samples in the microwell chips were extracted and loaded onto a solid-phase extraction (SPE) column (4 cm long, 150 μm i.d., packed with 3 μm, 300 Å C18 particles, Phenomenex, Torrance, CA, USA) using 100% buffer A (0.1% formic acid) delivered by a Dionex UltiMate NCP-3200RS pump. After sample loading, the concentrated peptides were separated using a 50 cm long, 50 μm i.d. nanoLC column with an integrated electrospray emitter (PicoFrit column, New Objective, Woburn, MA, USA). The LC column was packed in-house with the same C18 particles used for the SPE column. The LC flow rate was 150 nL/min. A 50 min linear gradient from 8% to 22% buffer B (0.1% formic acid in ACN) was used for peptide elution, followed by an increase to 35% buffer B over 10 min to elute hydrophobic peptides. The column was then washed with 80% buffer B for 5 min and finally equilibrated with 2% buffer B for 15 min before the next injection. An Orbitrap Fusion Lumos Tribrid MS (Thermo Fisher, San Jose, USA) operating in data-dependent acquisition mode was employed for peptide signal collection. To trigger electrospray, a high voltage of 2200 V was applied at the metal union (Valco, Houston, USA) between the SPE column and the LC column. The ion transfer tube was set at 200°C for desolvation, and the radio frequency of the ion funnel was set at 30% for optimal peptide transmission. For MS1 acquisition, an Orbitrap resolution of 120 000, an m/z scan range of 375 to 1600, an AGC level of 1 × 10^6, and a maximum injection time of 100 ms were used. Precursor ions with charges between +2 and +7 and intensity values over 1 × 10^4 were selected for fragmentation and MS2 scanning. Precursors were isolated with an m/z window of 2 and fragmented by higher-energy collisional dissociation (HCD) set at 30%. The fragment ions were transferred to the Orbitrap for MS2 acquisition at a scan resolution of 60 000 and a maximum injection time of 118 ms. To reduce repeated sampling, an exclusion duration of 30 s and an m/z tolerance of ±10 ppm were applied.
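The gradient program and the ppm-based exclusion tolerance described above can be written out numerically. The sketch below is illustrative only: the segment table is transcribed from the text, the changes to the wash and equilibration phases are assumed to be instantaneous steps (the text does not give ramp times), and the function names are hypothetical.

```python
# Hypothetical piecewise description of the LC gradient given in the text.
# Step changes into the wash and equilibration phases are an assumption.

SEGMENTS = [
    # (start_min, end_min, %B at start, %B at end)
    (0.0, 50.0, 8.0, 22.0),    # main elution gradient
    (50.0, 60.0, 22.0, 35.0),  # elution of hydrophobic peptides
    (60.0, 65.0, 80.0, 80.0),  # column wash
    (65.0, 80.0, 2.0, 2.0),    # re-equilibration
]


def percent_b(t_min):
    """Return %B at time t_min by linear interpolation within a segment."""
    for start, end, b0, b1 in SEGMENTS:
        if start <= t_min < end:
            frac = (t_min - start) / (end - start)
            return b0 + frac * (b1 - b0)
    raise ValueError("time outside the 0-80 min program")


def exclusion_halfwidth_mz(mz, ppm=10.0):
    """Half-width, in m/z units, of a +/- ppm tolerance around a precursor."""
    return mz * ppm / 1e6


if __name__ == "__main__":
    for t in (0, 25, 55, 62, 70):
        print(f"t = {t:>2} min -> {percent_b(t):.1f}% B")
    print("±10 ppm at m/z 500:", exclusion_halfwidth_mz(500.0))
```

With these segments the full program runs for 80 min per injection, and a ±10 ppm tolerance corresponds to about ±0.005 at m/z 500.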

Data and Statistical Analysis
The .raw files from LC-MS/MS were loaded into the MaxQuant software (v1.6.7.0) for analysis. Peptides were identified using the built-in Andromeda search engine to search against the UniProt human proteome database (2019 release with a total of 42 427 entries, of which 20 350 were reviewed and 22 077 were unreviewed). All search parameters were used at their default settings: the enzyme was set to trypsin, the maximum number of missed cleavages was set to 2, carbamidomethylation of cysteine was set as a fixed modification, and the false discovery rate (FDR) at the peptide-spectrum match (PSM) and protein levels was set to 0.01. The resulting .txt output files from MaxQuant were loaded into the R statistical environment (v4.0.2) and preprocessed before analysis with the DEP (Differential Enrichment analysis of Proteomics data) package. 32,33 The DEP package offers a robust and reproducible analysis workflow for MS-based proteomics data when determining differentially enriched/expressed proteins. DEP filters out contaminant and reverse protein sequences, log-transforms the data, and then normalizes the data by the variance stabilizing normalization (vsn) method. Subsequently, it imputes missing values and runs statistical tests to determine proteins with significantly altered expression levels. The latter step is carried out by the test_diff function, which performs a differential enrichment test based on protein-wise linear models and empirical Bayes statistics using limma. 34,35 Data visualization was carried out using the BPG (v6.0.1) 36 and BioVenn (v1.0.2) 37 packages. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE 38 partner repository with the data set identifier PXD020741.
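The first two preprocessing steps named above (removal of reverse and contaminant hits, then log transformation) can be illustrated outside of R. The sketch below is written in Python purely for illustration and is not the DEP implementation: it assumes rows shaped like MaxQuant's proteinGroups.txt with "Reverse" and "Potential contaminant" flag columns, the LFQ column name and protein IDs are toy values, and the vsn normalization, imputation, and limma testing steps are not reproduced.

```python
import math

# Toy rows shaped like a MaxQuant proteinGroups.txt table (assumed layout).
# Protein IDs and the LFQ column name are placeholders, not study data.
EXAMPLE_ROWS = [
    {"Protein IDs": "P1", "Reverse": "", "Potential contaminant": "",
     "LFQ intensity A": "1024"},
    {"Protein IDs": "REV__P2", "Reverse": "+", "Potential contaminant": "",
     "LFQ intensity A": "2048"},
    {"Protein IDs": "CON__P3", "Reverse": "", "Potential contaminant": "+",
     "LFQ intensity A": "4096"},
    {"Protein IDs": "P4", "Reverse": "", "Potential contaminant": "",
     "LFQ intensity A": "0"},
]


def is_decoy_or_contaminant(row):
    """True if the row is flagged as a reverse hit or a contaminant."""
    return row.get("Reverse") == "+" or row.get("Potential contaminant") == "+"


def log2_lfq(row, lfq_columns):
    """Log2-transform LFQ intensities; zeros are treated as missing (None)."""
    out = {}
    for col in lfq_columns:
        value = float(row[col])
        out[col] = math.log2(value) if value > 0 else None
    return out


def preprocess(rows, lfq_columns):
    """Drop flagged rows, then log2-transform the remaining intensities."""
    cleaned = []
    for row in rows:
        if is_decoy_or_contaminant(row):
            continue
        cleaned.append({"Protein IDs": row["Protein IDs"],
                        **log2_lfq(row, lfq_columns)})
    return cleaned


if __name__ == "__main__":
    for entry in preprocess(EXAMPLE_ROWS, ["LFQ intensity A"]):
        print(entry)
```

Treating zero intensities as missing rather than as true zeros is a design choice of this sketch; it keeps the log transform defined and leaves the missing values for a later imputation step.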

Protein Annotation and Assessment of Physicochemical Aspects
The sequences of the identified proteins and gene ontology (GO) for protein annotation regarding cellular components were accessed and retrieved from UniProtKB. 39 To calculate the grand average of hydropathy (GRAVY) value for the protein sequences, an online GRAVY calculator was used. 40
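For reference, the GRAVY value of a sequence is the sum of the per-residue hydropathy values divided by the sequence length. The sketch below assumes the standard Kyte-Doolittle hydropathy scale; how the online calculator used in this study handles nonstandard residues is not stated, so skipping them here is an assumption, and the function name is illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue.

    Residues outside the 20 standard amino acids are skipped (an assumption
    of this sketch) and do not count toward the length.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()
              if aa in KYTE_DOOLITTLE]
    if not scores:
        raise ValueError("sequence contains no standard residues")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Short toy peptides, not proteins from this study.
    for peptide in ("IIII", "AR", "KKDE"):
        print(peptide, round(gravy(peptide), 3))
```

Positive GRAVY values indicate an overall hydrophobic sequence and negative values an overall hydrophilic one.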

Analyzing Barrett's Esophageal Cell Samples by MicroPOTS
The field of proteomics is rapidly changing, with new technologies and methods advancing the capacity to examine a larger proportion of the proteome from small numbers of cells, even down to the single-cell level, 41 with a high degree of certainty. Over the past five years, the quest to design and develop highly sensitive and specific proteomic methodologies for this purpose has become intense, and many such methods have now been made available. 42−44 Moving away from macroscale preparative techniques in proteomics to micro- and nanotechnologies offers researchers the ability to characterize smaller numbers of cells that are more likely to be encountered in clinical settings. The recently unveiled microPOTS system is one of the promising methodologies poised to widen the window of possibilities in proteomics research. 25 This study aimed to apply the microPOTS system to identify proteome signatures of either physiological stress or radiation in 200 Barrett's esophageal cells. Initially, CP-A p53 single null (CP-A KO) and CP-A p53-SMAD4 double null (CP-A dKO) cells were generated and, together with the parental CP-A wild-type cells (CP-A WT), were subjected to either LCA or X-ray treatment. The microPOTS system was then applied for protein extraction, followed by a nano-LC-MS workflow as depicted in Scheme 1.
Journal of Proteome Research pubs.acs.org/jpr Article

Protein Identification from Different CP-A Genotypes and Stress Conditions
All the samples yielded a high number of identified protein groups, with each replicate having over 1500 protein groups from only 200 cells. Replicates from each sample type were averaged, and the mean number of protein groups was plotted in a bar graph (Figure 1a). [...] (Figure 1c). Out of these 1867 proteins, 93 (3.4%), 185 (6.7%), and 195 (7%) were unique to LCA-WT, LCA-KO, and LCA-dKO, respectively. The X-ray-treated groups likewise showed a high overlap of 2354 (77%) proteins (Figure 1d). Also, the WT, KO, and dKO samples were compared with their corresponding treatment groups (LCA and X-ray) to determine the number of shared proteins and the proteins that were unique to each sample. The overlap in identified proteins was greater than 60% for each sample type (Figure S1). The number of identified proteins in this work is comparable to that of a previous study that performed proteomic analysis on small numbers of cells (∼100) on an LTQ-Orbitrap system and identified ∼1500 proteins. 45
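The shared and unique protein counts reported in such Venn comparisons follow from simple set operations. The sketch below is illustrative only: the function name is hypothetical, the protein identifiers are toy values, and the overlap percentage is taken relative to the union of all groups, which may differ from the denominator used for the figures in this study.

```python
def overlap_summary(groups):
    """Summarize overlap between groups of protein identifiers.

    groups maps a sample name to the set of protein IDs identified in it.
    Percentages are relative to the union of all groups (an assumption).
    """
    union = set().union(*groups.values())
    shared = set.intersection(*groups.values())
    summary = {
        "total": len(union),
        "shared": len(shared),
        "shared_pct": 100.0 * len(shared) / len(union) if union else 0.0,
        "unique": {},
    }
    for name, ids in groups.items():
        # Proteins seen in this group and in no other group.
        others = set().union(*(v for k, v in groups.items() if k != name))
        summary["unique"][name] = len(ids - others)
    return summary


if __name__ == "__main__":
    # Toy identifiers, not study data.
    toy = {
        "WT": {"a", "b", "c"},
        "KO": {"b", "c", "d"},
        "dKO": {"c", "e"},
    }
    print(overlap_summary(toy))
```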

Evaluation of Protein Extraction Efficiency and Reproducibility
The results show that over 1500 proteins were confidently identified from 200 cells in all sample types. This high number of identified proteins from only 200 cells indicates the high extraction efficiency of the microPOTS system for microscale proteomic analysis. The means and standard deviations across the sample types reveal consistency in the number of identified proteins. As illustrated in Figure S2, almost all replicates had over 75% fully tryptic cleavage sites with fewer than 25% missed cleavages, indicating good tryptic digestion, which also translates into a high extraction efficiency. The evaluation of microPOTS reproducibility in this study was based on two approaches: qualitative and quantitative analysis. As in other method comparison studies that use this kind of approach, 46−48 we used this strategy to assess the performance of microPOTS. The reproducibility of the measurements is important in evaluating the results, and correct identification of proteins is critical to discovering new proteomic signatures with high certainty and specificity. 49−52 Qualitative reproducibility was assessed by comparing the overlap of the identified proteins between replicates of each sample type and illustrating the shared proteins in an area-proportional Venn diagram. In Figure 2a, results for LCA-treated CP-A dKO cells showed over 52% protein overlap between the replicates, whereas for the X-ray-treated CP-A dKO cells an overlap of 73% was found between replicates. Additionally, protein overlap between replicates for the rest of the samples was assessed, and a high protein overlap of over 50% was found for almost all comparisons (Figure S3). Further, quantitative reproducibility between the replicates was assessed using the LFQ values of the replicates to perform a pairwise Pearson's correlation coefficient analysis. As shown in Figure 2b, the quantitative assessment of reproducibility demonstrated a high Pearson's correlation coefficient (R > 0.9). A high correlation coefficient (R > 0.9) was observed for almost all the pairwise comparisons that were conducted (Figure S4). CP-A KO replicates had a lower correlation coefficient (R = 0.83) than the rest of the samples. Reproducibility of the microPOTS system was further assessed by computing the coefficient of variation (CV) for individual protein intensities in each sample condition (Figure 2c). All samples except CP-A KO had a median CV below 50% (Table S1).
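The two quantitative reproducibility metrics used here, the pairwise Pearson's correlation coefficient and the per-protein coefficient of variation, can be sketched as follows. This is an illustrative Python version using only the standard library; the function names are hypothetical, the input values are toy numbers, and whether the study computed CVs on raw or transformed intensities is not restated here, so the scale is left to the caller.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cv_percent(values):
    """Coefficient of variation: sample standard deviation / mean, in %."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)


def median_cv(intensity_rows):
    """Median CV over proteins; each row holds one protein's replicates."""
    return statistics.median(cv_percent(row) for row in intensity_rows)


if __name__ == "__main__":
    # Toy intensities for two replicates, not study data.
    rep1 = [20.1, 22.4, 25.0, 27.3]
    rep2 = [20.4, 22.1, 25.5, 27.0]
    print("Pearson R:", round(pearson_r(rep1, rep2), 3))
    print("median CV (%):", round(median_cv(list(zip(rep1, rep2))), 2))
```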

Comparison of MicroPOTS Data to Bulk Proteomics Data Set
The microPOTS data were compared to an existing bulk proteomics data set of the same cell line (CP-A). The samples that generated the bulk data were prepared and analyzed according to standard proteomic workflows as described in Box S1. The resulting data were processed and analyzed using bioinformatics tools as stated in Box S2. Next, we correlated all the overlapping proteins that were confidently identified in both data sets using the Pearson correlation method. As illustrated in the scatter plot in Figure S5, there was a positive correlation of R = 0.625 across the 1066 overlapping proteins.

Comparison of Physicochemical Aspects of Identified Proteins
It has been established that high molecular weight (MW) and basic proteins are often challenging to extract due to their propensity to undergo intra- and intermolecular interactions. 53,54 As such, we were interested in exploring the MW and GRAVY distributions of the identified proteins. The GRAVY scores for LCA-treated CP-A dKO cells indicated that a high number of hydrophobic proteins were detected with microPOTS (Figure 3a). A similar GRAVY distribution was seen for all the sample types, with the majority of identified proteins within the 0.4 to −0.2 range (Figure S6). In LCA-treated CP-A WT cells, the largest share of identified proteins had a MW between 20 and 30 kDa (Figure 3b). A similar MW distribution pattern was observed across all sample types, which demonstrates that the identified proteins show no major difference in their physicochemical characteristics (Figure S7). Next, the distribution of identified proteins according to their subcellular localization was explored, which showed that over 50% of the identified proteins resided within the cytosolic region, and this was observed across all sample types. Nearly 20% of confidently identified proteins for all the sample types were located within the cytosol (Figure 3c). All samples revealed that few ribosomal proteins could be detected by the microPOTS system, and this finding is consistent with what Zhu et al. 23 reported in their study. Ribosomal and cytoskeleton-derived proteins were the least likely to be identified, at less than 5%. The distribution pattern for subcellular localization was almost the same across all sample types (Figure S8).

Effect of Stress on Protein Expression
Differential expression levels of the identified proteins were determined between CP-A WT, CP-A KO, and CP-A dKO as well as between their corresponding treatment sets (LCA- and X-ray-treated groups). A pairwise comparison for all sample types was carried out to evaluate differentially expressed proteins between cells treated with different stresses. The findings from the present study indicate that LCA and X-ray induced changes in the proteome of CP-A cells, which is consistent with the findings of Proungvitaya et al., who reported bile acid-induced alteration of protein expression in a model cell system. 55 Significant alterations were observed in CP-A dKO cells following LCA and X-ray treatment. Specifically, a pairwise comparison between CP-A LCA-dKO and CP-A dKO revealed that 21 proteins were upregulated and 13 proteins were downregulated, some of which include ASNS, RALY, CSRP1, CTSD, FAM120A, ESTD, and GSR (Figure 4a). Also, the comparison between CP-A X-ray-dKO and CP-A dKO revealed 32 upregulated and 14 downregulated proteins, including NONO, SAR1A, HNRL2, PLEC.1, FARSA, S100P, FKBP3, and AGR2 (Figure 4b). Table S2 and Table S3 give the complete lists of proteins that were significantly differentially expressed for LCA- and X-ray-treated CP-A dKO cells, respectively. Anterior Gradient 2 (AGR2) is a member of the protein disulfide isomerase family, and its overexpression has been associated with many human cancers, including neoplasia of the esophagus. 56,57 This evidence is consistent with the present study, which has shown that AGR2 is overexpressed in Barrett's esophageal cells (CP-A X-ray-dKO).
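Counts of up- and downregulated proteins such as those above are typically obtained by thresholding the per-protein log2 fold change and adjusted p-value from the statistical test. The sketch below illustrates that final counting step only. The cutoffs (adjusted p < 0.05 and an absolute log2 fold change of at least 1) are assumptions made for illustration and are not stated in this article; the protein names and values are toy data, and the function names are hypothetical.

```python
def classify(log2_fc, adj_p, lfc_cutoff=1.0, alpha=0.05):
    """Label one protein as 'up', 'down', or 'unchanged'.

    The default cutoffs are illustrative assumptions, not study settings.
    """
    if adj_p < alpha and log2_fc >= lfc_cutoff:
        return "up"
    if adj_p < alpha and log2_fc <= -lfc_cutoff:
        return "down"
    return "unchanged"


def count_changes(results, lfc_cutoff=1.0, alpha=0.05):
    """Count labels over (name, log2 fold change, adjusted p) tuples."""
    counts = {"up": 0, "down": 0, "unchanged": 0}
    for _name, log2_fc, adj_p in results:
        counts[classify(log2_fc, adj_p, lfc_cutoff, alpha)] += 1
    return counts


if __name__ == "__main__":
    # Toy results for a treated-versus-untreated contrast, not study data.
    toy_results = [
        ("X1", 1.5, 0.01),    # passes both cutoffs, positive change
        ("X2", -2.0, 0.001),  # passes both cutoffs, negative change
        ("X3", 0.2, 0.5),     # small and not significant
        ("X4", 3.0, 0.2),     # large but not significant
    ]
    print(count_changes(toy_results))
```

Requiring both a significance and an effect-size cutoff means that a large but statistically unsupported change, such as the fourth toy entry, is not counted.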
The detection of ∼1500 proteins from fewer than 200 cells, together with the capture of differential expression, points to a potential use of the microPOTS-LC-MS method for exploring cell subpopulations within a tissue or tumor microenvironment, including T-cells, fibroblasts, and macrophages.

CONCLUSION
In this study, a recently developed proteomic method called microPOTS was applied to identify proteins and determine the changes in the proteome of ∼200 cells (including an isogenic cell panel being used for Barrett's esophageal studies) following radiation and physiological stress treatment. The results show that the microPOTS method is applicable to qualitative and quantitative proteomic studies where only low cell numbers are available. Ionizing radiation was used as one stress because it damages DNA and can activate p53 function. LCA was used because it is a component of bile acids that can affect acid reflux disease and cancer progression in this tissue. The results were highly reproducible (R > 0.9) between replicates, allowing us to investigate confidently the effect of stress on the cells. With the microPOTS method, ∼1500 unique proteins were quantified in all the samples. Moreover, differential expression analysis of the cells treated with LCA revealed 21 upregulated proteins and 13 downregulated proteins (CP-A LCA-dKO vs CP-A dKO), some of which include RALY, CSRP1, ASNS, ESTD, and FAM120A. Also, the comparison between CP-A X-ray-dKO and CP-A dKO revealed 32 significantly overexpressed proteins and 14 underexpressed proteins, including NONO, SKP1, HNRL2, PLEC.1, S100P, FKBP3, and AGR2. The results of the present study offer a basis for future studies that use clinical biopsies to interrogate more deeply the molecular mechanisms that underpin LCA induction of proteome changes, which could aid in uncoupling the distinct role of bile acids in the selection for specific genetic mutations in esophageal adenocarcinoma progression. Importantly, microPOTS, while not as sensitive as its companion technique nanoPOTS, was implemented here and will be used in future studies.

Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jproteome.0c00629. Figure S1: Venn diagram illustrating the shared number of identified proteins among different sample types, as well as proteins that are unique to each sample type; Figure S2: Number of missed cleavages for all replicates, expressed as percentages; Figure S3: Qualitative assessment of reproducibility of the microPOTS system; Figure S4: Quantitative assessment of reproducibility of the microPOTS system; Table S1: Median coefficient of variation (CV) for quantile normalized protein LFQ values for each sample type; Box S1: Sample preparation and LC-MS analysis for bulk proteomics data set; Box S2: Bulk data analysis and comparison to microPOTS data; Figure S5: Scatter plot with associated Pearson's correlation coefficient (R = 0.625) between microPOTS and bulk proteomics for all 1066 overlapping proteins that were identified; Figure S6: Assessment of physicochemical characteristics; Figure S7: Assessment of physicochemical characteristics; Figure S8: Subcellular localization; Table S2: List of differentially expressed proteins between CP-A dKO cells with or without LCA treatment; Table S3: List of differentially expressed proteins between CP-A dKO cells with or without X-ray treatment (PDF)
Ted R. Hupp − University of Gdansk, International Centre for