Utility of Proteomics in Emerging and Re-Emerging Infectious Diseases Caused by RNA Viruses

Emerging and re-emerging infectious diseases due to RNA viruses have major negative consequences for quality of life, public health, and overall economic development. Most of the RNA viruses causing illnesses in humans are of zoonotic origin. Zoonotic viruses can be transferred directly from animals to humans through adaptation, followed by human-to-human transmission, as for human immunodeficiency virus (HIV), severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome coronavirus (MERS-CoV), and, more recently, SARS coronavirus 2 (SARS-CoV-2), or they can be transferred through arthropod vectors, as in the case of Crimean-Congo hemorrhagic fever virus (CCHFV), Zika virus (ZIKV), and dengue virus (DENV). At present, there are no vaccines or antiviral compounds against most of these viruses. Because proteins possess a vast array of functions in all known biological systems, proteomics-based strategies can provide important insights into the investigation of disease pathogenesis and the identification of promising antiviral drug targets during an epidemic or pandemic. Mass spectrometry technology has provided the capacity required for the precise identification and the sensitive and high-throughput analysis of proteins on a large scale and has contributed greatly to unravelling key protein–protein interactions, discovering signaling networks, and understanding disease mechanisms. In this Review, we present an account of quantitative proteomics and its application in some prominent recent examples of emerging and re-emerging RNA virus diseases like HIV-1, CCHFV, ZIKV, and DENV, with more detail with respect to coronaviruses (MERS-CoV and SARS-CoV) as well as the recent SARS-CoV-2 pandemic.


■ INTRODUCTION
Emerging infectious diseases (EIDs) remain a significant cause of human and animal morbidity and mortality, especially those induced by viruses. Diseases related to emerging and re-emerging viruses have significant public health consequences and affect quality of life and overall economic development. Most of the causative agents of the recent examples of emerging or re-emerging diseases are RNA viruses such as Crimean-Congo hemorrhagic fever virus (CCHFV), Zika virus (ZIKV), dengue virus (DENV), severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome coronavirus (MERS-CoV), SARS coronavirus 2 (SARS-CoV-2), and human immunodeficiency virus type 1 (HIV-1). RNA viruses can adapt quickly to varying environments and hosts because of the high error rates of the viral polymerases that play a central role in viral genome replication. Most of the RNA viruses causing illnesses in humans are of zoonotic origin, that is, transmitted from animals. Zoonotic viruses can be transferred from animals to humans directly through adaptation; then, human-to-human transmission can occur, such as for HIV-1, SARS-CoV, MERS-CoV, and SARS-CoV-2. Alternatively, they can be transferred through arthropod vectors, such as for ZIKV, DENV, and CCHFV.
Figure 1. Schematic overview of quantitative proteomic strategies. (A) Data-dependent acquisition (DDA) represents the most common mass spectrometric (MS) analysis used in proteomics. A survey scan is performed for all peptides in the whole mass range (MS1), which are consecutively fragmented, and their fragment ions are analyzed in MS2 events producing sequence data on the peptides. (B) In label-free quantification (LFQ), the studied biological conditions are processed by LC-MS/MS separately but consecutively after MS1 acquisition. During the peak elution, an extracted ion chromatogram is constructed for quantification using areas or intensities for relative quantification, where MS2 events are intended only for peptide identification. (C) In multiplexing with isobaric labeling, digested proteins from n conditions are labeled with isobaric tags (same nominal mass) and are combined into one single sample, which is analyzed by liquid chromatography tandem mass spectrometry,
Because proteins possess a vast array of functions in all known biological systems, proteomics-based strategies have been very useful in characterizing the pathogens and understanding protein structure, regulation, phenotypes, and cellular development. Many analytical methods have been developed to detect and quantify proteins in biological samples to understand cell behavior, 1,2 to discover drug targets, and to improve diagnostics. 3 From the identification of entry receptors to altered host−cell pathways, proteomic approaches in viral disease studies can provide essential insights into the investigation of disease pathogenesis and the identification of promising antiviral drug targets. The purpose of this Review is to present an account of quantitative proteomics and its application in some prominent recent examples of emerging and re-emerging RNA virus diseases like HIV-1, CCHFV, ZIKV, and DENV.
We also present a more detailed account with respect to coronaviruses (MERS-CoV and SARS-CoV) and the more recent SARS-CoV-2 pandemic that has infected more than 28 million people with more than 900 000 deaths as of the writing of this Review (September 2020).

■ MASS-SPECTROMETRY-BASED PROTEOMICS
Mass spectrometry (MS) technology has provided the power required for the precise identification together with the sensitive and high-throughput analysis of proteins on a large scale. 4 MS has significantly contributed to unravelling key protein−protein interactions, discovering signaling networks, and understanding disease mechanisms. 5 Although intact proteins can be detected by MS, as in top-down proteomics, 6 the most successful approach is based on the analysis of peptides derived from proteins in bottom-up proteomics, which is discussed in this Review.
In general, bottom-up proteomics requires proteolytically digested proteins extracted from any biological sample, followed by the liquid chromatographic (LC) separation of the resulting peptides that are eluted and subjected to electrospray ionization (ESI). 7,8 In the tandem mass spectrometer (MS/MS), the peptide ions undergo two levels of detection: (i) A mass analyzer measures the mass-to-charge ratio (m/z) of peptide ions (or precursors) in the MS1 event and (ii) selected precursors are fragmented, and the subsequent fragment ions are measured to determine the amino acid sequences of the peptides in the MS2 event. 4 (See Figure 1A.) With the help of computational algorithms, peptide sequences can be mapped to infer the proteins. 9 The complete analysis of a multitude of peptides is usually referred to as LC-MS/MS. LC-MS/MS has become the gold standard method in proteomics largely due to its compatibility with many upstream separation techniques, 9 which allows for protein identification in both SDS-PAGE 10 and two-dimensional difference gel electrophoresis (2D-DIGE) samples 11 and the characterization of protein complexes obtained by coimmunoprecipitation (co-IP). 12 However, other MS strategies have also been applied to proteomics, including matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF), 13 which requires peptides mixed with a matrix. A laser beam efficiently desorbs ionized peptides, which are then accelerated and detected in a TOF mass analyzer. (For more details, see extensive reviews. 14−16)
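The MS1 measurement described above can be illustrated with a short calculation. The following is a minimal sketch, assuming an unmodified peptide and protonation as the only charge carrier; the function names `peptide_mass` and `precursor_mz` are hypothetical and are not taken from any proteomics software. It computes the monoisotopic peptide mass from standard residue masses and the expected precursor m/z as (M + z × proton mass)/z.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one water is added for the free N- and C-termini
PROTON = 1.007276   # mass of the charge carrier in positive-mode ESI


def peptide_mass(sequence: str) -> float:
    """Monoisotopic mass of an unmodified peptide (assumption: no PTMs)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def precursor_mz(sequence: str, charge: int) -> float:
    """Expected m/z of the [M + zH]z+ precursor ion observed in MS1."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


# Illustrative peptide only; it is not taken from the studies reviewed here.
print(f"{precursor_mz('PEPTIDE', 2):.3f}")  # prints 400.687
```

Search engines perform essentially this calculation, extended to fragment ions and modifications, when matching measured spectra to candidate peptide sequences.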

■ QUANTITATIVE PROTEOMICS
Most importantly, bottom-up proteomics can also offer quantitative analysis, 4 which has been extensively applied to quantify the effect of drugs, to study cellular differentiation processes, and to carry out patient cohort classification. 17 Because peptide sequences can be mapped to proteins, both the precursor and the fragment ion signal intensities can be used to assess relative changes in the abundance of the proteins. 4,17 Quantitative proteomics can be divided into two categories: nontargeted and targeted quantification. Nontargeted quantitative analysis can be achieved by label-free and chemical label-based methods. In label-free quantification (LFQ), peptide signals are detected at the MS1 level through their isotopic pattern (Figure 1B) and tracked across the retention time to reconstruct a chromatographic elution profile of the monoisotopic peptide mass (extracted ion chromatogram (XIC)). 18 For each peptide, the ion current is integrated and used as a quantitative estimate of the peptide in the sample. To assess the ratios between the conditions in a study, each condition is separately analyzed by LC-MS/MS, and the areas or intensities of the peptides are compared with each other. 19,20 The advantages of LFQ are (i) the simple design protocol that allows us to identify and quantify thousands of peptides per chromatographic run, (ii) the simple sample preparation methodology, and (iii) the "unlimited" number of conditions that can be analyzed. 4 The countless LFQ applications in the literature demonstrate the power of this approach, as illustrated by studies, for example, to explore the difference between highly pathogenic porcine viruses and their attenuated strains in pulmonary alveolar macrophages 21 and to identify essential proteins/pathways in organ regeneration. 22 Despite the advantages of LFQ, challenges related to the reproducibility of the chromatographic separation and the robustness of sample handling remain and must be overcome.
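The XIC-based comparison described above can be sketched as follows. This is a simplified illustration, assuming that the elution profile of one peptide has already been extracted as paired retention-time and intensity values; the helper names `xic_area` and `lfq_ratio` and the numbers are hypothetical. Real LFQ software additionally performs peak detection, retention-time alignment, and normalization.

```python
def xic_area(retention_times, intensities):
    """Area under an extracted ion chromatogram (trapezoidal rule)."""
    if len(retention_times) != len(intensities):
        raise ValueError("retention_times and intensities must match in length")
    area = 0.0
    for i in range(1, len(retention_times)):
        dt = retention_times[i] - retention_times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * dt
    return area


def lfq_ratio(xic_condition_a, xic_condition_b):
    """Relative abundance of one peptide between two separate LC-MS/MS runs."""
    return xic_area(*xic_condition_a) / xic_area(*xic_condition_b)


# Made-up elution profiles of the same peptide in two conditions.
rt = [10.0, 10.1, 10.2, 10.3, 10.4]          # retention time (min)
uninfected = (rt, [0, 100, 200, 100, 0])     # MS1 intensities, run 1
infected = (rt, [0, 200, 400, 200, 0])       # MS1 intensities, run 2

print(f"{lfq_ratio(infected, uninfected):.2f}")  # prints 2.00
```

Because each condition is measured in its own run, any drift in chromatography or ionization between runs enters this ratio directly, which is one reason reproducibility matters so much for LFQ.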
Additionally, LFQ is time-consuming, and isolation interference may affect the overall quantification in highly complex samples. 20 As an alternative to LFQ, strategies based on chemical labeling (metabolic and chemical isotopic labeling as well as isobaric tags) have gained attention in high-throughput applications because the labeling reactions are easy and fast. 23,24 Furthermore, multiplexing, that is, the ability to analyze multiple samples simultaneously, has drastically reduced the technical variability. 25 Isobaric labels, such as the isobaric tags for relative and absolute quantification (iTRAQ) 23 and the tandem mass tag (TMT), 24 are widely utilized to quantify proteins reliably and to answer diverse biological questions of clinical relevance. The iTRAQ or TMT tags, which have the same nominal mass (isobaric), incorporate a mass reporter, a spacer arm, and an amine-reactive ester group that can bind covalently to the N-termini and the side chains of lysine residues of peptides. During fragmentation in the MS2 event, the mass reporter is released, and its intensity correlates with the abundance of the given peptide in the sample (Figure 1C). 25 Isobaric labeling strategies have proven very versatile because they can be combined with prefractionation strategies 26,27 and are compatible with post-translational modification (PTM) enrichment protocols (e.g., for phosphorylation). 28 TMT allows for the simultaneous analysis of up to 16 biological samples, which ensures less analysis time, low technical variability, and a low number of missing values in the quantitative data. 29 TMT-based labeling is currently utilized in several biological contexts, such as the quantification of more than 11 000 proteins in large clinical cohorts, 30 the temporal quantitative analysis of host phosphoproteins in HIV-infected T cells, 31 and single-cell proteomics (SCP), an emerging methodology where multiplexing is necessary to increase the chance of identifying and quantifying a protein present in individual cells, with protein quantification at picogram levels. 32
Figure 1 (continued). (C, continued) where the qualitative information is extracted from MS2 events and the low m/z area contains the reporter ion intensities that serve as quantitative information. (D) In selected reaction monitoring (SRM), a peptide is analyzed in a triple quadrupole system (QQQ), where Q1 isolates the precursor, Q2 serves as the collision cell, and Q3 isolates the fragments to be analyzed, providing quantitative information extracted from the fragment profiles over the retention time. (E) Data-independent acquisition (DIA) is designed with MS1 survey scans of predefined scan envelopes (12−25 m/z), and all precursor ions are fragmented to generate highly multiplexed MS2 scans, which, with the help of algorithms, are deconvoluted to identify the peptides.
Although nontargeted quantitative proteomics has evident advantages, it still faces limitations regarding the reproducibility and the consistency of quantification, especially when individual proteins are contrasted with orthogonal methodologies. 33,34 To overcome these limitations, targeted methods provide more accurate and robust quantification in highly complex samples and for low-abundance proteins. 4 Targeted proteomics also takes advantage of MS1 and MS2 events but requires specific instrumentation. Selected reaction monitoring (SRM) 35,36 and parallel reaction monitoring (PRM) 37 are the most established methods for targeted quantification. SRM utilizes the properties of a triple quadrupole mass spectrometer (QQQ) to obtain quantitative information, isolating a peptide precursor ion in Q1, which is fragmented in Q2, and, finally, specific fragments are detected in Q3 (Figure 1D). A pair of precursor and fragment ions is usually referred to as a transition. When a complex matrix incorporates heavy-labeled (15N and 13C) reference peptides at known concentrations, SRM can be used to construct a calibration curve that provides absolute quantification because the instrument can resolve the signals of the heavy and light (endogenous) peptides. 4,35,36 PRM uses a similar approach to SRM but is performed on a Q-Orbitrap instrument, where Q2 and Q3 are replaced by a higher energy collisional dissociation (HCD) cell and an Orbitrap analyzer, respectively. Unlike SRM, PRM provides mass spectra of all transitions of a precursor ion in a single step. PRM can reach the parts per million (ppm) level of mass accuracy, which eliminates the background interference and false-positive events more efficiently than SRM and expands the quantification up to five to six orders of magnitude. 37,38 SRM/PRM has been shown to be highly valuable for hypothesis-driven research and clinical studies, in which a small number of targets (typically limited to a few dozen predefined peptides per run) need to be measured in large cohorts of patients (e.g., verification of biomarkers in plasma). 4,39
Recently, a fundamentally different technique has gained attention, which addresses the drawbacks of the stochastic data acquisition typical for the nontargeted quantification methods previously described. In data-independent acquisition (DIA), 40 multiplexed fragment ion spectra are systematically acquired through the use of predefined isolation windows that collectively encompass the entire mass range where most of the tryptic peptides are expected to be located (Figure 1E). 40,41 With the use of spectral libraries and targeted scoring algorithms, it is possible to deconvolute highly multiplexed DIA data and increase the number of detected and quantified proteins in complex samples. 42 In this regard, the SWATH-MS approach represents one of the most successful DIA-based methods implemented. It uses a targeted data extraction to query the acquired fragment ion maps for the detection and quantification of specific peptides by contrasting the information contained in spectral libraries.
In this strategy, the fragment ion signals, their relative intensities, and the peptide retention time with additional chromatographic information are accessed to uniquely identify a given peptide in the map. 4,43 More specific examples of quantitative proteomics applied to RNA viruses are discussed in detail in the following sections.
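The calibration-curve idea behind absolute quantification by SRM, described earlier in this section, can be sketched as follows. This is a minimal illustration, assuming a linear detector response and a heavy-labeled reference peptide spiked at several known concentrations; the function names, units, and all numbers are hypothetical and are not taken from any of the cited studies.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_concentration(peak_area, slope, intercept):
    """Invert the calibration line to convert a peak area to a concentration."""
    return (peak_area - intercept) / slope


# Made-up calibration series: heavy reference peptide at known concentrations
# (fmol/uL) and the integrated peak areas of its transition.
known_conc = [1.0, 5.0, 10.0, 50.0]
heavy_area = [2100.0, 10100.0, 20100.0, 100100.0]

slope, intercept = fit_line(known_conc, heavy_area)

# Peak area of the light (endogenous) peptide measured in the same matrix.
light_area = 30100.0
print(f"{estimate_concentration(light_area, slope, intercept):.1f}")  # prints 15.0
```

In practice, the usable range of such a curve is bounded by the limits of detection and quantification of the assay, so endogenous signals outside the calibrated range should not be extrapolated.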

■ QUANTITATIVE PROTEOMICS DATA ANALYSIS
The choice of tools for analyzing raw proteomics data can greatly affect the final data output. A plethora of review articles present comparisons and recent improvements in proteomics quantification methods. 44−48 However, computational resources for the downstream proteomics analysis of raw data are limited. Pipelines for analyzing proteomics data need to be developed, similar to what has been done for analyzing RNAseq data, where a large panel of tools is available. 49 Data are usually normalized using log transformation to stabilize the variance and reduce technical variation among biological samples. 50 However, there is no clear consensus about the best statistical method for detecting differential protein abundance. The t test and analysis of variance (ANOVA), combined with false discovery rate (FDR) control or with the Bonferroni correction for the family-wise error rate, are the most used methods. The risk of detecting false positives increases with the number of proteins tested, which is the main limitation of using the t test. 51,52 Other methods commonly used are modified versions of the ordinary t test, such as significance analysis of microarrays (SAM), 53 linear models for microarray data (LIMMA), 54 and reproducibility-optimized test statistic (ROTS). 55,56 Several pipelines are designed for one or more specific proteomics quantification methods. IsobariQ is software intended for iTRAQ and TMT that uses variance stabilizing normalization (VSN). 57 ProteoSign is based on LIMMA and works with TMT, stable isotope labeling with amino acids in cell culture (SILAC), and iTRAQ. 58 DeqMS, designed for TMT, is based on log2 normalization and LIMMA. 59 Isobar is an R package designed for TMT and iTRAQ. 60 Extensive benchmarking is still necessary to obtain consensus pipelines used by the proteomics community. Other statistical tools that are not tied to a single quantification method, such as MSstats, can also be used. 61
Journal of Proteome Research pubs.acs.org/jpr Reviews
As an example of how different pipelines affect the final data output, we analyzed a data set from our previous study on SARS-CoV-2. 62 In this study, we were interested in how SARS-CoV-2 infection changes the proteomic profile of the human hepatocarcinoma cell line Huh7. Triplicate samples of SARS-CoV-2-infected and uninfected Huh7 cells were collected and TMT-labeled, and data were acquired in a single run. We used five different analysis pipelines, namely, DeqMS, ROTS, QT_LIMMA, trimmed mean of M-values LIMMA (TMM_LIMMA), and TMM_EdgeR (Figure 2A). There is a substantial overlap between the results from ROTS, TMM_EdgeR, TMM_LIMMA, and QT_LIMMA. A total of 470 proteins were identified as differentially abundant between uninfected and SARS-CoV-2-infected cells by all four of these pipelines. Out of the 841 proteins identified by DeqMS, only 149 overlap with the other pipelines. Using the 149 proteins overlapping between DeqMS and the other pipelines, the triplicates cluster together per condition (Figure 2B). DeqMS was run directly on peptide abundance from the peptide-spectrum match (PSM), whereas the other pipelines were run on protein abundance. Another particularity is the log2 transformation of the data before DeqMS is applied. The different results highlight the influence of the processing steps and the statistical tool selection on the output. The choice of data analysis methods must be data-driven.
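The multiple-testing issue raised in this section can be made concrete with a small example. The following is a minimal sketch of the two corrections named above, the Bonferroni correction (family-wise error rate) and the Benjamini-Hochberg procedure (FDR), applied to per-protein p-values; the function names and the p-values are hypothetical, and the sketch is not the implementation used by any of the pipelines compared here.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each p is multiplied by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four proteins (e.g., from per-protein t tests).
p = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(p))
print(benjamini_hochberg(p))
```

With these four values, all proteins pass a 0.05 threshold after the Benjamini-Hochberg adjustment, whereas only two pass after the Bonferroni correction, which illustrates how the choice of correction alone can change the list of differentially abundant proteins.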

■ APPLICATION OF THE PROTEOMICS: UNDERSTANDING RNA VIRUSES
The development and advances of high-throughput proteomics screening technologies have revolutionized the field of virology. The continuous massive improvement of the older techniques and analysis algorithms and the development of newer technologies make it possible to apply the different proteomics-based approaches to delineate the complex host−pathogen interactions and the host response to the invading virus or to understand how the virus hijacks the host signaling pathways for its own gain. In the following section, we discuss how proteomics was used in some recent emerging and re-emerging epidemics of HIV-1, CCHFV, ZIKV, DENV, SARS-CoV, MERS-CoV, and SARS-CoV-2, where vaccines or definitive antivirals are lacking. Because of the smaller and simpler structures of these RNA viruses, proteomic studies have mostly focused on host−virus protein interactions, host antiviral responses, and diagnostic usage. Table 1 provides an overview of the viruses discussed in this Review and how proteomics was utilized to reveal the respective disease pathogenesis.

Human Immunodeficiency Virus (HIV)
The first patient with symptoms of a disease that we today know as acquired immunodeficiency syndrome (AIDS) was described in 1981. However, to date, there is still no cure to eradicate the AIDS-causing agents HIV-1 and HIV-2, viruses belonging to the family of Retroviridae. 63,64 Through the development and application of antiretroviral treatment (ART), HIV infection has become a manageable chronic disease, although HIV/AIDS remains a pandemic as an epidemiologic and global health phenomenon. 65 During this long period, a large number of studies applying MS-based proteomics have been conducted to describe HIV protein structures, host−viral interactions, and host immune responses. In the last 5 years, the accumulated knowledge created by proteomics studies has helped us to better understand HIV-1 pathogenesis. 66−77 More recently, body fluids other than plasma/serum and patients on long-term ART have been the focus of interest to understand non-AIDS-related comorbidities, which are more frequently seen in people living with HIV. 77 Several studies investigated the proteome of cerebrospinal fluid (CSF) in HIV-infected individuals or even the proteome of CSF extracellular vesicles using MS. These studies revealed an increase in inflammatory markers and proteins of the complement system in CSF, which was associated with the patients' neurocognitive status. 69,73,76 A recent study subjected the semen of HIV-infected men to MS analyses, together with blood samples of the same participants. Even though HIV-induced changes were less frequent in semen than in blood, 43 proteins were found to be unique in the semen of HIV-infected compared with HIV-uninfected men. The proteins were enriched in pathways such as the interleukin 17 (IL-17) signaling pathway and the complement and coagulation cascades. 71 The cervicovaginal lavage (CVL) of HIV-negative women has also been studied to better understand HIV susceptibility and infection risk.
Studies have reported that pregnant women have a different mucosal proteome, in particular, showing alterations in inflammatory pathways, compared with nonpregnant women, and elevated mucosal cytokines were associated with a higher risk of HIV acquisition. In contrast, glycoproteins in CVL bearing high mannose were associated with HIV-1 resistance. 66,68,72 SWATH-MS has proved successful in HIV in vitro studies using Jurkat cells (human T cells) 67 or human primary CD4+ T cells infected with HIV-1 in vitro 75 as well as in in vivo experiments using patient plasma 70 or CD4+ T cells from HIV-1-infected patients with paired samples before and after the initiation of ART. 75 In the in vitro studies using SWATH-MS, 117 novel proteins were found to be altered during HIV-1 infection based on the NIH HIV-1 human interaction database at the time when the study was conducted, and two new proteins, EZRIN and Y-box binding protein 1, were confirmed to interact with HIV-1 matrix protein, modulating the infection efficiency. 67,74 Another study combining in vitro infections of primary CD4+ T cells and

Crimean-Congo Hemorrhagic Fever Virus (CCHFV)
CCHF, caused by the nairovirus CCHFV, is a human tick-borne disease, vectored mainly by Hyalomma marginatum ticks and characterized by hemorrhagic manifestations, with a case fatality rate of up to 40%. In fact, among the medically significant tick-borne pathogens, CCHFV is the most geographically widespread. 78 Although the disease was first reported in 1944 and the virus was identified in the late 1960s, 79 there is still no available vaccine or vaccine candidate. Also, a selective antiviral drug for the treatment or prevention of the disease is not expected soon. In the search for new antivirals against CCHFV, several studies used proteomics approaches to identify host-cell proteins that interact with the CCHF viral glycoproteins that facilitate entry into target cells or the nucleocapsid protein that supports viral replication.
An interaction between the receptor-binding domain of one of the two envelope glycoproteins, namely, Gc, with the cell surface protein nucleolin was identified through co-immunoprecipitation (Co-IP) combined with MS peptide sequencing. 80 However, whether nucleolin indeed serves as an entry factor remains unclear. Despite the current advances in proteomic techniques, it remains unclear what the entry factors are by which the CCHF viral glycoproteins facilitate host-cell entry. Several studies investigated the interaction between the CCHFV nucleocapsid protein and the host-cell proteins.
Whereas previous studies used a targeted approach demonstrating the interaction of the CCHFV nucleoprotein (NP) with the host-cell protein actin or the interferon-induced antiviral protein MxA, 81,82 a later study employed immunoprecipitation with MS to identify NP interactions with members of the heat shock protein 70 family. 83 One study used two proteomics approaches to gain a better understanding of the CCHF viral pathogenesis in the liver, one of the critical organs that, when infected, could lead to liver failure and contribute to mortality. 84 Samples from mock- or CCHFV-infected HepG2 liver carcinoma cells were subjected to either 2D-DIGE or iTRAQ labeling, and both were followed by MS to identify alterations in the host protein expression levels. The majority of the 240 differentially regulated proteins between mock- and CCHFV-infected cells were associated with cell death, cellular growth and proliferation, and cellular movement. The expression levels of proteins involved in the clathrin-mediated endocytosis pathway, which is involved in the entry of CCHFV, as well as the Gc-interacting protein nucleolin were also altered. 84

Flaviviruses: Dengue Virus (DENV) and Zika Virus (ZIKV)
Flaviviruses are mostly transmitted by arthropods and are pathogenic to humans. 85 Diseases caused by flaviviruses, such as yellow fever and dengue fever, were reported as early as the 17th and 19th centuries. Most of the flaviviruses causing these diseases were identified in the first half of the 20th century. Among them, DENV (serotypes 1−4), West Nile virus (WNV), Japanese encephalitis virus (JEV), yellow fever virus (YFV), tick-borne encephalitis virus (TBEV), and, lately, ZIKV are globally significant, as is chikungunya virus (CHIKV), which is an arthropod-borne alphavirus rather than a flavivirus; together, they cause disease in hundreds of millions of people annually across half of the world. 85 Although they are most prevalent in the developing world, recent evidence suggests that known as well as novel flaviviruses have the potential to spread to other regions of the world and to cause epidemics. 86 Many of these flaviviruses can cause severe disease, as in DENV infection, whose severe forms, dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS), are associated with subsequent infections with heterologous serotypes. 87 Proteomic approaches for studying DENV and ZIKV have mostly focused on the interactions of nonstructural viral proteins with host-cell proteins, proteome changes after virus infection, and diagnostics.
The expression of dengue nonstructural protein NS1 in HepG2 cells followed by LFQ analysis revealed 107 host proteins to be differentially expressed. 88 These host-cell proteins are involved in processes like proteasomal protein degradation, apoptosis, and cellular stress. In another study, a comprehensive map of host-cell proteins that interact with the dengue virus NS1 protein was created using three different cell lines that were infected with a DENV replicon encoding only NS1, followed by MS analysis. 89 The authors further combined their data with a functional RNAi screen of selected NS1-interacting host factors to identify 34 host restriction factors and 58 host dependency factors. Finally, they validated some of these factors through immunoprecipitation experiments. Using these proteomic techniques, essential roles were revealed for receptor-activated C kinase 1 (RACK1), which is involved in RNA translation, and for components of the oligosaccharyltransferase complex and the cytosolic chaperonin-containing T complex in DENV replication through their interaction with NS1. 89 In another LC-MS/MS approach, host proteins that are differentially ubiquitylated upon DENV infection were identified through immunoprecipitation of mono-ubiquitylated proteins of mock- or DENV-infected cell lysates followed by MS analysis. 90 The authors also showed that dengue viral protein NS4A could interact with one of the identified ubiquitylated host-cell proteins, AUP1, in a co-IP experiment. Through this interaction, NS4A triggers the activity of AUP1 to induce lipophagy, which is a critical process in dengue viral replication. Other studies employed quantitative proteomic analysis on DENV-infected cells to reveal changes in the expression levels of host cell proteins involved in IFN responses, energy metabolism, lipid metabolism, RNA processing, apoptosis, and the cell cycle. 91,92
Another study investigated the changes in the secretome of DENV-infected cells and showed that DENV infection changes the proteolytic processing of secreted proteins. 93 A comprehensive interaction map between viral proteins and human or mosquito host cell proteins has been created for DENV and ZIKV. 94 Each open reading frame (ORF) of DENV and ZIKV was tagged and expressed in human or mosquito cells, and lysates were subjected to LC-MS/MS. The analysis revealed a conserved mechanism by which both DENV and ZIKV suppress interferon-stimulated genes and a ZIKV-specific interaction between viral NS4A and host cell protein ANKLE2 that affects the brain development in mosquitoes.
In the last several years, many other studies have mapped virus−host protein interactions and alterations in the host proteome upon ZIKV infection using various proteomics approaches. 95−102 These studies show that several cellular host proteins, organelles, or pathways seem to be implicated in ZIKV infection. One study thoroughly investigated the virus−host interactome and changes in the (phospho-) proteome upon ZIKV infection in neuronal cells. 103 For the interactome, each of the ZIKV proteins was expressed in a neuronal cell line, and combined immunoprecipitation with MS analysis was performed. A total number of 386 host cell proteins was identified that specifically interact with one of the ZIKV proteins. Several of these host cell proteins were previously reported to be linked to neurological development or differentiation. The effects of ZIKV infection or the ectopic expression of only ZIKV protein NS4B on undifferentiated or differentiated human neuronal progenitor cells were analyzed by mapping of the proteome through MS. LFQ phosphoproteomics was used to show that ZIKV deregulates signaling pathways involved in many cellular processes, including nervous system development and neurological diseases.
Altogether, these studies that utilize different proteomics approaches reveal important mechanisms by which ZIKV uses the host machinery for its replication or could cause neuropathogenesis. Proteomics approaches have also been used in flavivirus diagnostics. In a previous study, iTRAQ labeling with MS analysis was performed on plasma from dengue pediatric patients who either developed severe dengue disease at a later time or did not. 104 Angiotensinogen and antithrombin III protein levels were significantly increased in patients before they developed a severe disease state. In another study, a TMT-based quantitative proteomic approach was used to analyze plasma samples from patients with dengue fever, of whom several developed severe dengue disease. 105 Here, a set of seven proteins was identified as predictors of severe dengue disease. These identified proteins could serve as biomarkers to identify patients that are at risk for developing severe dengue disease.
Because DENV and ZIKV present similar clinical manifestations, accurately diagnosing which virus is the causative agent is crucial for providing the appropriate care for the infected individual, especially for infected pregnant women. After the acute phase of the infection has passed and RT-PCR techniques become less sensitive for discriminating ZIKV from DENV, plasma samples could be subjected to a DENV/ZIKV protein array to discriminate patients infected with ZIKV from those with DENV. 106 Recently, a more comprehensive single-shot diagnostic and typing assay has been developed to discriminate several flaviviruses, including DENV serotypes and ZIKV, in plasma samples. 107 In this targeted quantitative MS approach, the authors developed a PRM assay targeting the viral NS1 protein of all of these flavivirus strains. This method shows high specificity and sensitivity when analyzing patient samples ranging from 1 to 8 days postonset of symptoms, with similar performance when analyzing plasma samples from patients with secondary infections.

■ CORONAVIRUSES
The family of Coronaviridae has caused several epidemics/pandemics during the last two decades. In 2003, there was the outbreak of SARS-CoV in Asia, followed by MERS-CoV, 108 and, finally, the current SARS-CoV-2 pandemic causing coronavirus disease 2019 (COVID-19). 109 Proteomics studies have so far played a crucial role in characterizing viral proteins, in discovering the mechanisms of pathogenesis, including host−viral interactions and host immune responses, as well as in finding biomarkers to monitor the infection course during the COVID-19 pandemic.

Characterization of Host−Virus Protein Interactions for Coronaviruses
The new coronavirus strains appear to be more pathogenic than other members of the Coronaviridae family. One underlying reason for the increased pathogenicity is that these newly emerged coronaviruses cause infection not only in the upper respiratory tract, like other human coronaviruses, but also in the lower respiratory tract. Elucidating how the virus infects its target cells and hijacks the host cell machinery for its own replication is crucial in understanding their pathogenicity. First, the coronavirus spike protein is heavily glycosylated and has a key role in virus attachment to and entry into the host cell. To characterize SARS-CoV proteins, MS analyses were performed in Canada, where the largest SARS outbreak outside of Asia was seen. Because MS also enables the study of PTMs, such as glycosylation, a Canadian study was able to identify 12 glycosylation sites of the spike glycoprotein as well as some of the respective attached sugars by MALDI-TOF-MS. 110 Similarly, the N- and O-glycosylation pattern on the spike protein was recently mapped for SARS-CoV-2 using high-resolution LC-MS/MS. 111 By combining the immunoprecipitation of viral spike proteins of SARS-CoV and MERS-CoV with MS analyses, angiotensin-converting enzyme 2 (ACE2) and dipeptidyl peptidase 4 (DPP4, also CD26) were identified as the entry receptors for SARS-CoV and MERS-CoV, respectively. 112,113 Because of similarities between SARS-CoV and SARS-CoV-2, the use of ACE2 by the latter was rapidly confirmed with a targeted Western blot. 114 The discovery of the entry receptors ACE2 and DPP4, which are expressed in alveolar regions, explains why these viruses can infect the lower respiratory tract. 115,116 The knowledge about the glycan repertoire of the spike proteins and their entry receptors could potentially be used for the development of entry inhibitors. For example, human recombinant soluble ACE2 has been shown to be a promising drug to block the early stages of SARS-CoV-2 infections. 117
Other studies have used proteomic approaches to increase our understanding of the roles of other viral proteins in the replication of the coronaviruses and their interactions with host cell proteins. The previously mentioned Canadian study characterizing SARS-CoV proteins revealed a new nucleocapsid protein that showed only 32% identity with nucleocapsid proteins of known coronaviruses. Interestingly, the examined SARS-CoV nucleocapsid lacked the caspase cleavage motif that is found in other coronaviruses and plays a role in coronavirus elimination by infected cells. 110 Another study investigated the cellular pathways involved in coronavirus assembly by analyzing purified SARS-CoV virions with LC-MS/MS and protein kinase profiling. With this approach, 8 viral and 172 host proteins were identified, of which nonstructural protein 3 (nsp3) was found to be a conserved component of the viral protein processing machinery. 118 The study further found an interaction between the viral nucleocapsid protein and the host cyclophilin A that was previously predicted by a bioinformatics approach. 119 Later, different cyclophilins were also seen to interact with SARS-CoV nonstructural protein 1 (nsp1). In the latter study, a systems biology approach was used, where a genome-wide yeast two-hybrid interaction screen was employed to identify protein−protein interactions. 120 Regarding the MERS-CoV infection, a Co-IP MS screen revealed the interaction of viral accessory protein 4b with α-karyopherin proteins in Huh7 cells, and through this interaction, viral accessory protein 4b impairs the NF-κB-dependent antiviral response. 121 For SARS-CoV-2, a protein interaction map was created early in the course of the epidemic by cloning, tagging, and expressing 26 of the SARS-CoV-2 proteins in the human embryonic kidney (HEK) 293T cell line and subsequently using affinity-purification MS. 
With this approach, 332 high-confidence protein−protein interactions were found between the virus and host cells. 122 Another study used a similar approach and overexpressed SARS-CoV-2 genes with FLAG epitopes in HEK293 cells, purified viral protein complexes by affinity purification, and analyzed them by LC-MS/MS. They reported viral interactions with host proteins that are involved in translation, protein folding, and degradation pathways, which benefit viral growth and proliferation. 123 For in vitro studies, the SILAC strategy can be used, as was done by Zhang et al. 124 They developed a BHK21 (baby hamster kidney) cell line that expressed the SARS-CoV subgenomic replicon and grew the cells in light SILAC medium, while, in parallel, parental BHK21 cells were grown in heavy medium. The experiment was also performed with the labeling reversed. They were able to quantify 1081 host proteins in both experimental setups and identified 74 proteins with significantly altered levels. Of these, they chose BCL2-associated athanogene 3 (BAG3), which had higher levels in the SARS-CoV replicon cell line, for functional studies. Two regulatory molecules, cdc42 and RhoA, had lower levels in the replicon cell line. 124 No functional assays were performed in this study regarding these two proteins, but the cdc42-dependent regulation of the PI3K/AKT pathway was observed in the infection of VeroE6 cells with SARS-CoV. 125 In our recent proteo-transcriptomic study, where the SARS-CoV-2 infection was investigated in Huh7 cells over time, we also noticed the dysregulation of PI3K/AKT as well as of mTOR and MAPK signaling pathways. The proteins were quantified with TMT labeling and reversed-phase liquid chromatography tandem mass spectrometry (RPLC-MS/MS). 62 In another proteomics study with SARS-CoV-2, which also utilized TMT-based quantification, Caco-2 cells (human colorectal tumor cell line) were infected and studied over time. 
Cellular pathways, such as cholesterol metabolism, translation, splicing, and carbon metabolism, were identified to be reshaped during viral infection. 126 The given examples demonstrate the versatility of MS-based quantitative proteomics studies in a multitude of cell lines.
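To make the logic of such a paired SILAC design concrete, the following minimal Python sketch shows how heavy/light (H/L) ratios from the two labeling orientations could be combined. It assumes that the reversed experiment is a label swap (replicon cells heavy, parental cells light); the protein names, ratios, and the twofold threshold are hypothetical and are not taken from the cited study, whose actual statistical treatment may differ.

```python
import math

def log2_replicon_vs_parental(hl_forward, hl_reverse):
    """Convert H/L ratios into log2(replicon / parental) for both orientations.

    hl_forward: H/L ratio with replicon cells light and parental cells heavy
    hl_reverse: H/L ratio with the labels swapped (assumed design)
    """
    forward = -math.log2(hl_forward)  # replicon is in the light channel, so invert
    reverse = math.log2(hl_reverse)   # replicon is in the heavy channel
    return forward, reverse

def call_changes(ratios, threshold=1.0):
    """Classify each protein; threshold is in log2 units (1.0 = twofold, illustrative)."""
    calls = {}
    for protein, (hl_forward, hl_reverse) in ratios.items():
        fwd, rev = log2_replicon_vs_parental(hl_forward, hl_reverse)
        mean = (fwd + rev) / 2
        if fwd * rev > 0 and min(abs(fwd), abs(rev)) >= threshold:
            # Both orientations agree in direction and magnitude.
            calls[protein] = ("up" if mean > 0 else "down", mean)
        elif max(abs(fwd), abs(rev)) < threshold:
            calls[protein] = ("unchanged", mean)
        else:
            # A change that follows the label rather than the cell line.
            calls[protein] = ("inconsistent", mean)
    return calls

# Hypothetical H/L ratios: (forward, reverse)
example_ratios = {
    "PROT_UP": (0.25, 4.0),
    "PROT_DOWN": (4.0, 0.25),
    "PROT_FLAT": (1.0, 1.0),
    "PROT_ARTIFACT": (4.0, 4.0),
}
example_calls = call_changes(example_ratios)
for name, (call, mean_log2) in example_calls.items():
    print(f"{name}: {call} (mean log2 ratio {mean_log2:+.2f})")
```

The value of the label swap in this sketch is that a protein whose ratio tracks the isotope label instead of the cell line (here `PROT_ARTIFACT`) is flagged as inconsistent and not reported as regulated.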

Host Antiviral Response and Biomarker
The disease severity in individuals infected with the newly emerged coronaviruses is highly correlated with strong antiviral responses. 127,128 Therefore, it is also crucial to understand the host response to these infectious agents to unravel their pathogenesis. Studies using patient material, typically serum or plasma and peripheral blood mononuclear cells (PBMCs), and quantitative proteomic methods could be used to understand host antiviral responses against the newly emerged coronaviruses as well as to identify biomarkers for disease severity. Whereas for MERS-CoV infection, antibody-based technologies such as the enzyme-linked immunosorbent assay (ELISA) and the cytometric bead array (CBA) were commonly used to measure the levels of selected cytokines in patient bronchoalveolar lavage (BAL) and plasma samples, respectively, 129,130 a mix of both immunoassays and MS has been applied in COVID-19. One study measured the plasma concentration of 48 cytokines using the Bio-Plex Pro Human Cytokine Screening panel (Bio-Rad, US). They found that increased protein levels of interferon gamma-induced protein 10 (IP-10, also called CXCL10), monocyte chemotactic protein 3 (MCP-3), hepatocyte growth factor (HGF), monokine induced by gamma interferon (MIG, also called CXCL9), and macrophage inflammatory protein 1 alpha (MIP-1α, also called CCL3) were associated with disease severity. 131 Another study measured not only inflammatory/immunological biomarkers with the Architect i2000 immunoassay analyzer but also serum cancer biomarkers in SARS-CoV-2-infected patients. Besides the elevation of several inflammatory/immunological biomarkers, they also noticed a positive association between the levels of cancer biomarkers and disease severity. 132 In another study, the proteomic and metabolic profiling of patient plasma, including samples from patients before they progressed to severe disease, was carried out using TMT. 133
To predict whether a patient will develop severe disease, the data were used to train a machine learning model. The model classified severe patients with an accuracy of 93.5%. Furthermore, the MS data indicated the dysregulation of macrophages, platelet degranulation, and the complement system pathway in COVID-19 patients. 133 Changes in the host proteome were also surveilled in PBMC samples of patients to gain information about the host immune response. 123 Analyses by the TMT approach revealed 220 proteins that had significantly different levels in COVID-19 patients with mild symptoms compared with healthy controls. Of those, 115 proteins were increased, and 105 proteins were decreased in infected patients. Pathway enrichment analyses indicated that this patient group exhibited an active innate immune response against the virus (due to the enrichment of pathways such as neutrophil activation, type I IFN signaling, inflammatory response, and antigen processing and presentation). Next, PBMC proteome changes were compared between mild and severe SARS-CoV-2-infected individuals. A total of 553 proteins were statistically different, of which the majority (526 proteins) were decreased in the severe cases. Here, pathway enrichment analyses revealed that adaptive immunity was functionally reduced in severe cases, which was correlated with lower T- and B-cell populations. Altered pathways were, for example, T-cell costimulation and activation, T-cell receptor signaling, B-cell receptor (BCR) signaling, and complement activation. 123

From Proteomics to Drug Repurposing to Treat COVID-19
All of these studies that utilize various proteomic approaches to identify entry receptors, virus−host protein interactions, and host antiviral response pathways associated with SARS-CoV-2 contribute to the understanding of its pathogenicity and its disease, COVID-19. 
It is critical to quickly identify safe and effective drugs for emerging diseases such as COVID-19 to fight epidemics and pandemics. Information gathered through proteomic approaches provides unique opportunities for drug development through drug repurposing. The advantage of drug repurposing is that the compounds have already been tested for safety in animals, healthy persons, and humans with other diseases and can therefore rapidly be tested for their application in new diseases. For example, in a previously described study, 122 a host−viral protein interaction map was created with SARS-CoV-2 proteins in the 293T cell line, and among those interaction pairs, 66 human proteins were identified as targets of known drug compounds that are approved by the FDA, are in clinical trials, or are in preclinical tests. These drug compounds can easily be tested for their ability to inhibit SARS-CoV-2 replication in in vitro assays.
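The step from an interaction map to repurposing candidates is, at its core, a lookup of host interactors in a drug-target annotation. The Python sketch below illustrates this under simplifying assumptions: all protein names, compound names, and development stages are hypothetical placeholders, and the curation used in the cited study was considerably more involved than a plain intersection.

```python
from collections import defaultdict

def druggable_interactors(interactions, drug_targets):
    """Return host proteins that are both viral interactors and known drug targets.

    interactions: iterable of (viral_protein, host_protein) pairs
    drug_targets: dict mapping host_protein -> list of (compound, stage) tuples
    """
    baits_by_host = defaultdict(set)
    for viral_protein, host_protein in interactions:
        baits_by_host[host_protein].add(viral_protein)

    hits = {}
    for host_protein in sorted(baits_by_host):
        if host_protein in drug_targets:
            hits[host_protein] = {
                "viral_baits": sorted(baits_by_host[host_protein]),
                "compounds": sorted(drug_targets[host_protein]),
            }
    return hits

# Hypothetical interaction pairs and drug annotations (illustration only)
example_interactions = [
    ("VIRAL_1", "HOST_A"),
    ("VIRAL_2", "HOST_A"),
    ("VIRAL_1", "HOST_B"),
    ("VIRAL_3", "HOST_C"),
]
example_drug_targets = {
    "HOST_A": [("compound_x", "approved")],
    "HOST_C": [("compound_z", "clinical"), ("compound_y", "preclinical")],
    "HOST_D": [("compound_w", "approved")],  # druggable, but not an interactor
}

example_hits = druggable_interactors(example_interactions, example_drug_targets)
for host, info in example_hits.items():
    print(host, info["viral_baits"], info["compounds"])
```

Keeping the viral bait alongside each hit is a deliberate choice in this sketch, because it indicates which viral function a compound might interfere with when the candidates are taken into in vitro assays.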
Repurposing drugs that target host-cell cytokine pathways involved in SARS-CoV-2 replication and disease progression is a complementary strategy for developing therapies against COVID-19. As an example, elevated levels of cytokines, such as IL-6 and IL-8, have been identified by quantitative proteomic analysis and other studies in patients with severe COVID-19. 123,128,134 The cytokine storm during COVID-19 resembles those found after chimeric antigen receptor T-cell treatment or in patients with macrophage activation syndrome, with the release of IL-1, IL-6, IL-18, and IFN-γ. 135,136 Cytokine blocking agents, many of which were successfully developed for other inflammatory diseases, are effective treatments for these disorders. Therefore, great interest has arisen in also using such treatment approaches during the hyperactivation of the immune system frequently seen during the second phase of COVID-19. Several of these cytokine blocking agents, such as anakinra (IL-1 receptor antagonist), 137 tocilizumab and sarilumab (IL-6 receptor antagonists), 138−140 baricitinib (antagonist for JAK1 and JAK2), 141 and acalabrutinib (inhibitor of Bruton's tyrosine kinase (BTK)), 142 are promising as repurposable drug strategies to treat COVID-19.

■ CONCLUSIONS
Despite the challenges described, proteomics can provide rapid high-throughput analysis of proteins on a large scale, significantly contributing to unravelling key protein−protein interactions, discovering signaling networks, and understanding viral disease mechanisms. It can potentially help to identify candidates for drug repurposing during an ongoing epidemic or pandemic caused by viruses against which there are no available vaccines or antivirals. However, the results should be interpreted carefully, keeping in mind the limitations of the assays, and the claims should be validated preclinically before existing drugs are repurposed and clinically applied.
The manuscript was written through the contributions of all authors. All authors have approved the final version of the manuscript.

Funding
The study is funded by the Swedish Research Council Grant (2017-01330 and 2018-06156).

Notes
The authors declare no competing financial interest.

■ ACKNOWLEDGMENTS
The TOC graphic was created with Biorender.com.