A Network Module for the Perseus Software for Computational Proteomics Facilitates Proteome Interaction Graph Analysis


Journal of Proteome Research

Cite this: J. Proteome Res. 2019, 18, 5, 2052–2064
https://doi.org/10.1021/acs.jproteome.8b00927
Published April 1, 2019

Copyright © 2019 American Chemical Society. This publication is licensed under CC-BY.

Abstract


Proteomics data analysis strongly benefits from not studying single proteins in isolation but taking their multivariate interdependence into account. We introduce PerseusNet, the new Perseus network module for the biological analysis of proteomics data. Proteomics is commonly used to generate networks, e.g., with affinity purification experiments, but networks are also used to explore proteomics data. PerseusNet supports the biomedical researcher for both modes of data analysis with a multitude of activities. For affinity purification, a volcano-plot-based statistical analysis method for network generation is featured which is scalable to large numbers of baits. For posttranslational modifications of proteins, such as phosphorylation, a collection of dedicated network analysis tools helps in elucidating cellular signaling events. Co-expression network analysis of proteomics data adopts established tools from transcriptome co-expression analysis. PerseusNet is extensible through a plugin architecture in a multi-lingual way, integrating analyses in C#, Python, and R, and is freely available at http://www.perseus-framework.org.


Introduction


The study of complex systems (1) is concerned with the question of how the relationships between the parts of a system give rise to its collective behavior. Complex systems often generate emergent properties (2) which are not present in an obvious way in their parts. The interactions between the components of a complex system define a network of connections consisting of nodes and edges. Examples of such networks range over all disciplines of science, including social media networks, (3) scientific collaboration networks, (4) and, as a particularly interesting example, the human brain and its interconnected neurons. Much of the relevant content is concealed in the network constructed from these interactions and is not visible in the components themselves. For instance, the brain connectome, (5) and not the cellular content of the brain, is believed to make us who we are. (6,7) Similarly, the observation of cellular concentrations of biomolecules without considering their interactions would provide a limited picture that ignores potential emergent properties of the biomolecular complex system. Hence, biological systems, including the cellular concentrations of their biomolecules, need to be studied in the framework of network biology. (8)
At a fundamental level, all network connections between the cellular biomolecules are biochemical reactions, and their specification in biochemical pathways together with their subcellular spatial distribution would provide complete knowledge about the biological network state of the cell. This collective network of all biochemical reactions contains all metabolic reactions, the signaling cascades, gene regulatory networks, and all complex-forming non-covalent interactions between molecules, as for instance protein–protein interactions (PPIs). Due to the limitations of experimental and computational methods to map out this interaction network, we often obtain only partial knowledge about the complete biochemical reaction network from experiments. Networks are, however, not limited to describing fundamental physicochemical interactions between biomolecules. For instance, in a gene co-expression network analysis, (9) one looks for similarity of expression patterns of gene products over many samples. Strongly correlated expression implies that these genes have some kind of non-physical interaction; e.g., they are part of the same transcriptional regulatory program, or they share membership in the same pathway or protein complex. However, the exact relationship in terms of biochemical reactions remains unknown with these and other techniques. Hence, in these cases, networks describe a more coarsely grained level of detail, in which relationships between molecules are not necessarily biochemical reactions, but of a more general kind.
Computational proteomics is a mature data science that copes well with the large amounts of data produced in mass spectrometry (MS) experiments. (10) Perseus is an established framework for the downstream bioinformatics analysis of quantitative proteomics data. (11,12) The initial version of Perseus provided a comprehensive framework and set of activities to analyze data matrices originating from quantitative proteomics in a workflow environment. The main idea behind Perseus is to enable researchers in biomedical sciences to perform data analysis themselves. Here we describe how we extend this program to the analysis of biological networks in the context of proteomics. While Cytoscape (13) exists as the de facto standard for network analysis and visualization, it lacks many proteomics-specific tasks for the generation and analysis of networks, as well as workflow navigation. PerseusNet fills this gap and enables non-computational experts to perform complete network-based analysis of their data. We explicitly do not want to re-invent existing methods and algorithms. Instead, we designed an extensible framework that integrates with existing tools, like Cytoscape, and interoperates with existing code and scripts from the network analysis community that were written in diverse languages, like Python and R. The data structures within Perseus that hold the networks were set up in a way that facilitates studying dynamic changes in networks and finding differential network properties over complex experimental designs. Side-by-side analysis of networks with data matrices in a common workflow environment allows for a seamless transition between matrix-centric and network-centric approaches.
In the following we start with a general description of the new network framework in Perseus, including how it enables multilingual programming and usage of code resources from R and Python. We then introduce the new volcano-plot based analysis workflow scalable to large affinity purification-mass spectrometry (AP-MS) datasets. We describe how general and, more specifically, large-scale PPI networks are handled and curated in Perseus. A section on the analysis of posttranslational modification (PTM)-induced networks, like kinase–substrate relationships for phosphoproteomics, is next. Finally, we cover co-expression analysis in Perseus and its applications to clinical proteomics.

Experimental Section


Creating Interaction Networks from Pulldown Experiments

We created an interaction network from a pull-down screen. (14) First, .RAW files were obtained from PRIDE (PXD003758) and processed with MaxQuant version 1.6.2.10. Mouse protein sequences were downloaded from UniProt (release 2017_07). Parameters “matching between runs” and “LFQ” were selected in addition to the default parameters. Downstream analysis of the ‘proteinGroups.txt’ output table was performed in Perseus using the tools described in this Article. Columns for baits Eed, Ring1b, and Bap1 and their controls in the ESC and NPC cell lines were selected and log transformed. Quantitative profiles were filtered for missing values independently for each bait–control pair, retaining only proteins that were quantified in all three replicates of either the bait or the control pull-down. Missing values were imputed (width 0.3, down shift 1.8) before combining the tables and performing the multi-volcano analysis (Table S1). The s0 and FDR parameters of the multi-volcano analysis for Class A (higher confidence, s0 = 1, FDR = 0.01%) and Class B (lower confidence, s0 = 1, FDR = 0.2%) were chosen by visual inspection, aiming for a low number of significantly depleted proteins in any of the experiments. Class C interactions, which are based on profile correlation between bait and prey, were not considered in this network because the limited number of pull-downs in the dataset would result in inaccurate correlation estimates. Edges representing known protein complex interactions were annotated in the network. Because CORUM contains no mouse annotations for any of the baits, mouse annotations were obtained by mapping between mouse and human homologues as listed in the MGI database. (15)
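The valid-value filter and the imputation step can be sketched as follows. This is a minimal illustration and not the Perseus implementation: the function names are hypothetical, and the sketch assumes that missing values are drawn from a normal distribution whose mean is the observed mean shifted down by the down-shift times the observed standard deviation, and whose standard deviation is the observed one scaled by the width.

```python
import math
import random
import statistics


def keep_protein(bait, control, n_required=3):
    """Keep a protein quantified in all replicates of the bait OR the control."""
    def n_valid(values):
        return sum(not math.isnan(v) for v in values)
    return n_valid(bait) >= n_required or n_valid(control) >= n_required


def impute_column(values, width=0.3, down_shift=1.8, rng=None):
    """Replace NaNs in one log-intensity column by draws from a down-shifted normal.

    Assumption: mean = observed mean - down_shift * sd, sd = width * observed sd.
    """
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch reproducible
    observed = [v for v in values if not math.isnan(v)]
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)
    return [
        rng.gauss(mu - down_shift * sd, width * sd) if math.isnan(v) else v
        for v in values
    ]
```

Imputed values fall at the low end of the observed intensity range, reflecting the assumption that missing values mostly arise from low-abundance proteins.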

Approximately Scale-Free Topology of the STRING Interaction Network

We downloaded the human STRING interaction network (v10.5) from the STRING website. After filtering for high confidence interactions (combined score > 0.9), the scale-free fit index was calculated according to ref (16). Node degrees were calculated and plotted against their frequency distribution on a log–log scale. The R² of a linear fit in log–log space represents the scale-free fit index.
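The fit index described above can be sketched as an ordinary least-squares fit of log frequency against log degree. This is a simplified illustration with a hypothetical function name; it fits every distinct degree value directly, whereas the procedure of ref (16) may bin the degrees first, which is not reproduced here.

```python
import math
from collections import Counter


def scale_free_fit_index(edges):
    """Return (slope, R^2) of a linear fit of log10 p(k) against log10 k.

    `edges` is an iterable of (node_a, node_b) pairs of an undirected network.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    frequency = Counter(degree.values())  # degree k -> number of nodes with degree k
    n_nodes = len(degree)
    xs = [math.log10(k) for k in frequency]
    ys = [math.log10(count / n_nodes) for count in frequency.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return float("nan"), float("nan")  # fit undefined for a single degree value
    return sxy / sxx, sxy ** 2 / (sxx * syy)
```

A negative slope together with an R² close to 1 indicates an approximately scale-free degree distribution.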

Network Analysis of a Phosphoproteomic Dataset of EGF Stimulation

Two separate analysis tools, PHOTON and KSEA, were applied to the same experimental dataset of 9184 phosphorylation sites with high localization probability (>0.75) (17) (Table S2). Log2 fold-changes for EGF from two replicates were averaged. For PHOTON analysis, we first generated a high-confidence PPI network. We downloaded all interactions from HIPPIE and filtered them for high-confidence interactions (confidence > 0.72), additionally removing high-degree nodes so that only nodes with degree < 700 were retained. Nodes in the HIPPIE network are identified by their Entrez GeneID. Therefore, the experimental data were mapped from UniProt to Entrez GeneIDs before the nodes of the network were annotated. Phosphorylation sites with multiple GeneIDs were mapped to all matching nodes in the network. We then performed PHOTON analysis with adjusted default parameters. Network reconstruction with ANAT was enabled with the 100 highest scoring proteins and EGF anchor (GeneID 1950). Additionally, we increased the number of permutations to 100 000. The KSEA analysis was performed on the human site-specific kinase–substrate network from PhosphoSitePlus. (18) Data and network were matched on the basis of UniProt identifiers.
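The two network filters used for PHOTON can be sketched as below. The function name and the edge representation are hypothetical, and the sketch assumes that node degrees are counted after the confidence filter has been applied; the text does not state the order explicitly.

```python
from collections import Counter


def filter_network(edges, min_confidence=0.72, max_degree=700):
    """Keep confident edges whose endpoints both have degree < max_degree.

    `edges` is a list of (node_a, node_b, confidence) tuples.
    """
    confident = [(a, b) for a, b, score in edges if score > min_confidence]
    degree = Counter()
    for a, b in confident:
        degree[a] += 1
        degree[b] += 1
    # Dropping a high-degree node removes all of its edges from the network.
    return [
        (a, b) for a, b in confident
        if degree[a] < max_degree and degree[b] < max_degree
    ]
```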

Co-expression Analysis of a Clinical Proteomics Dataset

Protein quantification data and clinical annotation were obtained from Yanovic et al. (19) SILAC ratios were first transformed to log(light/heavy). The dataset was filtered for the 43 patients unique to ref (19). Using global hierarchical clustering of the patients, four outlier samples were identified and removed from the dataset. Additionally, proteins with fewer than 70% valid values were removed from the dataset, and the resulting patient profiles were Z-scored (Table S3). Following the WGCNA workflow, (16) the power parameter for the co-expression analysis was selected using the ‘Soft-threshold’ activity provided by PluginCoExpression. Co-expression analysis was performed in a signed network with biweight midcorrelation and the power parameter set to 10. The eigengene of each co-expression module was correlated with the provided clinical data using Pearson correlation and clustered using hierarchical clustering.
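The role of the power parameter in a signed co-expression network can be illustrated with the signed adjacency used in the WGCNA framework, a = ((1 + r) / 2)^power. The sketch below is an assumption-laden simplification: it substitutes plain Pearson correlation for the biweight midcorrelation used in the analysis, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation (stand-in for biweight midcorrelation)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def signed_adjacency(profiles, power=10):
    """Map each protein pair to ((1 + r) / 2) ** power.

    `profiles` maps a protein name to its expression profile across samples.
    """
    names = sorted(profiles)
    adjacency = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            adjacency[(a, b)] = ((1 + r) / 2) ** power
    return adjacency
```

With a power of 10, uncorrelated pairs receive an adjacency below 0.001, so only strongly positively correlated proteins remain connected with appreciable weight.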

PluginInterop Provides a Central Entry Point for All External Plugins

The PluginInterop project is written in the C# programming language and implements several Perseus plugin APIs. For users it provides a number of activities in Perseus for executing script files written in the Python and R languages. Upon selection of any of these activities, users will be prompted with a parameter window, allowing them to pass additional arguments to the script and requiring them to specify the executable that should be used for processing. Since Perseus does not include an installation of Python or R, users will have to install those and any other dependencies separately. PluginInterop aids the user by trying to automatically detect an existing installation and provide meaningful error messages in case of missing dependencies. Developers can additionally leverage the functionality implemented in PluginInterop as a basis for parametrized scripts. In general, developers are free to choose which external scripting language or program they would like to utilize. We found the R and Python scripting languages to be most useful, which is why we provide two companion libraries, ‘perseuspy’ and ‘PerseusR’, to be used alongside PluginInterop. These libraries aid the communication between Perseus and the script.
The communication between Perseus and external scripts is straightforward and is easily implemented for any tool of choice. In short, Perseus will persist all necessary data to the hard drive and call the specified tool with specific command-line arguments. The first arguments contain all the parameters specified by the user, at the developer's choice either in an XML format or simply separated by spaces. Second, the input data from the workflow is saved to a temporary location which is passed to the script. The final arguments specify the expected location of the output data. The external process can provide status and progress updates to the user, as well as detailed error reporting, by printing to stdout/stderr and indicating success or failure through the exit code. Once the process exits, Perseus will parse the output data from its expected location and insert it into the workflow. Any step in the pipeline is customizable for advanced scenarios, such as custom data formats.
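From the perspective of the external tool, this protocol might look like the following sketch. It assumes the simplest case of exactly three arguments (a parameter file, an input file, and an output file); the argument layout, the function name, and the pass-through processing are illustrative assumptions and do not reproduce the actual PluginInterop conventions or the Perseus file format.

```python
import sys


def main(argv):
    """Read the input written by the host, write the output, return an exit code."""
    if len(argv) != 4:
        print("usage: script.py <parameters> <input> <output>", file=sys.stderr)
        return 1
    _, param_path, in_path, out_path = argv  # param_path is unused in this sketch
    n_lines = 0
    try:
        with open(in_path, encoding="utf-8") as src, \
                open(out_path, "w", encoding="utf-8") as dst:
            for line in src:
                dst.write(line)  # placeholder for the actual analysis
                n_lines += 1
    except OSError as err:
        print(f"error: {err}", file=sys.stderr)  # error report for the user
        return 1
    print(f"processed {n_lines} lines")  # status update via stdout
    return 0

# A real script would end with: sys.exit(main(sys.argv))
```

Returning a nonzero exit code on failure lets the calling application distinguish a failed run from a successful one without parsing the printed messages.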
The PluginInterop binary is automatically included in the latest Perseus version. The source code was published under the permissive, open-source MIT license on Github (https://github.com/cox-labs/PluginInterop). The website also provides more information on how to develop plugins, including a video demonstration. The plugins presented in this Article are all developed on top of PluginInterop and the perseuspy and PerseusR companion libraries.

Library Support for Scripting Languages

We implemented libraries in R and Python which facilitate the interoperability of Perseus with external scripting languages. The main aim of these libraries is to map the data structures of Perseus to a counterpart native to the external language. Developers proficient in these languages will be more comfortable and productive with these native data structures. The largest benefit comes from the resulting integration with the existing data science ecosystem, all now available to Perseus plugin developers.
The ‘perseuspy’ module provides data mappings for the Python language. The Perseus expression matrix is mapped to the ‘DataFrame’ object of the popular ‘pandas’ module, which is tightly integrated with ‘numpy’, the de facto standard for numerical computations in Python. The Perseus network collection data type maps to a list of networks from the ‘networkx’ package. It features a variety of graph algorithms and interfaces well with other modules, due to its usage of standard Python dictionaries. ‘perseuspy’ is distributed via The Python Package Index (PyPI), allowing for easy installation of the module for developers and users alike. The code of ‘perseuspy’ is published under the permissive, open-source MIT license, and is available alongside usage examples and more information on https://github.com/cox-labs/perseuspy.
For the R language, we implemented the ‘PerseusR’ package. It provides a mapping of the Perseus expression matrix to a custom wrapper class around the R ‘data.frame’ object. The wrapping was necessary to represent Perseus-specific information such as annotation rows. Alternatively, developers can load data as a Bioconductor ‘ExpressionSet’ object, which enables interfacing with the entire Bioconductor bioinformatics suite. Currently there is no support for network collections in ‘PerseusR’, but we plan to implement it in the near future. ‘PerseusR’ is also published under the MIT license and its code is available on https://github.com/cox-labs/PerseusR. ‘PerseusR’ is easily installed directly from CRAN.

Implementation of PluginPHOTON

We implemented a Perseus plugin for the PHOTON tool on top of the functionality provided by PluginInterop and perseuspy. PHOTON was previously capable of running only a single experiment at a time with a fixed human PPI network. We expanded its implementation to allow for parallel processing of any number of experiments on any network. These changes make large datasets from any species directly amenable to PHOTON analysis. PluginPHOTON is published under the MIT license, its code is available on https://github.com/jdrudolph/photon, and it is included in the latest Perseus release.

Implementation of PluginCoExpression

We implemented parts of the WGCNA pipeline as a Perseus plugin. PluginCoExpression provides access to the WGCNA functions implemented in the R language via PluginInterop and PerseusR.

Implementation of KSEA in Perseus

KSEA analysis was implemented in Perseus and tested for correctness against the reference implementation.
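For orientation, one commonly used formulation of the KSEA score is a z-score that compares the mean log fold-change of a kinase's substrates with the mean over all quantified sites. The sketch below shows that formulation only; it is an assumption that it matches the scoring variant implemented in Perseus, and the function name is hypothetical.

```python
import math
import statistics


def ksea_zscore(all_changes, substrate_changes):
    """z = (mean of substrates - mean of all sites) * sqrt(m) / sd of all sites.

    `all_changes` holds log fold-changes of all quantified sites,
    `substrate_changes` those of the m known substrates of one kinase.
    """
    m = len(substrate_changes)
    background_mean = statistics.mean(all_changes)
    background_sd = statistics.stdev(all_changes)
    substrate_mean = statistics.mean(substrate_changes)
    return (substrate_mean - background_mean) * math.sqrt(m) / background_sd
```

A positive score suggests increased kinase activity, since the known substrates are on average more phosphorylated than the background.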

Results and Discussion


Workflow-Based Biological Network Analysis

PerseusNet was devised to fulfill the computational needs of proteomics researchers wishing to accomplish network analysis of their data. While it is extensible through a new plugin application programming interface (API), and hence any network analysis functionality can be implemented, most tools needed for proteomics research and connecting it to generic network analysis platforms are included in the software (Figure 1). Dedicated activities for analyzing AP-MS datasets and phosphoproteomics experiments in the context of kinase–substrate networks belong to the basic infrastructure of PerseusNet. The most common standard data formats (tab, txt, csv, gml, sif, json) are supported as input. An extended multi-language plugin API allows leveraging many existing tools in the analysis workflow. As an important example, co-expression clustering tools are integrated in this way.


Figure 1. Schematic overview of the new network functionality in Perseus. PerseusNet implements a number of processing and analysis steps facilitated by the network collection data type. While including proteomics-centric analyses, such as the analysis of interaction screens, the network module also provides a number of general-purpose tools, for instance for network annotation, filtering, and topology determination. With the extension of the Perseus plugin API to networks and furthermore to other programming languages, it becomes possible to integrate existing network analysis tools in Perseus. Networks are easily imported to and exported from Perseus, due to its support for standard formats.

To accommodate PerseusNet, we extended the Perseus framework with a new data type termed network collection (Figure 2) that represents a set of one or more networks which are analyzed jointly in the workflow. Different networks within the same network collection can, for instance, represent networks derived from different individuals (patients), experimental conditions, or biological replicates. All information in the network collection is organized in data tables, leveraging the existing augmented data matrix (11) in Perseus. General information on the networks in the collection is stored in the networks table, where each row represents an individual network. Here, sample-related annotations, such as calculated global network properties, can be stored to enable their usage in analysis activities operating on a network collection. For instance, if the samples correspond to different patients, the networks table can hold patient-specific information as derived from patient records or questionnaires. These variables can then be used as independent or confounding factors in statistical analysis of the networks.


Figure 2. Schematic representation of the network collection data type. User-facing information is displayed in tabular form with tables listing the networks in the collection, as well as providing detailed information on the nodes and edges of each network. Internally an auxiliary graph data structure aids in the implementation of graph algorithms. Node- and edge-mapping provide the required cross-references between the tabular and graph representation.

The nodes and edges of each individual network are stored in a pair of separate tables. The nodes table further describes the entities in the network, while the edges table provides details on the connections between the entities. The entities in the nodes table can be annotated with local network properties, such as the node degree. In case the entities correspond to proteins, biologically meaningful annotations could include membership in gene ontology terms, pathways, or protein complexes. Similarly, edges can be annotated in the edges table with properties of pairwise relationships between proteins, as, for instance, interaction confidence measures. All of these properties are then accessible to the network analysis tools. Furthermore, all mentioned tables can be sorted and searched, allowing all information to be browsed and inspected intuitively. Internally, a graph data structure for each network enables the efficient execution of graph algorithms. We did not aim to include generic graphical representation of networks as node-link diagrams, since this can be achieved in other tools such as Cytoscape, for which we provide simple adaptors for the transfer of networks. However, several activities include specialized visualizations tailored to specific analyses.
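The relationship between the tabular and the graph representation can be sketched with a small data structure. This is an illustrative model only, not the Perseus implementation: the class names, the column names ("Node", "Source", "Target", "Degree"), and the use of plain dictionaries as table rows are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    nodes: list  # nodes table: one dict per node, keyed by column name
    edges: list  # edges table: one dict per edge, keyed by column name
    adjacency: dict = field(init=False, repr=False)  # auxiliary graph structure

    def __post_init__(self):
        # Build the graph from the tables so that graph algorithms can run on it.
        self.adjacency = {row["Node"]: set() for row in self.nodes}
        for row in self.edges:
            self.adjacency.setdefault(row["Source"], set()).add(row["Target"])
            self.adjacency.setdefault(row["Target"], set()).add(row["Source"])

    def annotate_degree(self):
        """Write a local network property back into the nodes table."""
        for row in self.nodes:
            row["Degree"] = len(self.adjacency[row["Node"]])


@dataclass
class NetworkCollection:
    networks_table: list  # one row per network, e.g. sample annotations
    networks: dict        # network name -> Network
```

Keeping annotations in the tables and topology in the auxiliary graph mirrors the split described above: users browse and sort tables, while algorithms operate on the graph.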
In Perseus, all data analysis steps are performed within a graphical workflow (see Figure S1). Enabled by the newly implemented network collection, the Perseus workflow is now capable of all import, processing, and analysis steps in the side-by-side analysis of expression matrices and networks. All data imported into Perseus is represented as a separate entity in the workflow. Any matrix or network undergoing a processing step is not modified in place but rather becomes a new entity that gets connected to the original data in the workflow. By inspecting both input and output data, every step in the analysis is traceable and easily understood. Certain processing steps allow for the transformation of matrices into networks and vice versa, or the mapping of data between the two. As a result, any analysis performed in Perseus, potentially including several side-by-side processing steps of networks and matrices, always remains transparent to the user.

Multilingual Plugin Activities

The network collection data structure (Figure 2) and the extended Perseus workflow provide the foundation for enabling various network analyses, many of which are available in Perseus. In general, networks either originate from external sources or are created in a data-driven manner from within the workflow. To facilitate the import of external networks into the workflow, we implemented parsers for standard network formats, such as edge table (.tab|.txt|.csv), GraphML (.gml) (http://graphml.graphdrawing.org/), Cytoscape’s simple interaction format (.sif) (http://manual.cytoscape.org/en/stable/Supported_Network_File_Formats.html), and D3js’s JSONgraph (.json) (http://jsongraphformat.info/), which enable loading interactions from most popular network databases, including STRING, (20) BioGRID, (21) IntAct, (22) CORUM, (23) and PhosphoSitePlus. (18) Furthermore, specific quantitative expression data, such as AP-MS, drives the creation of novel PPI networks, and phosphoproteomics datasets allow for a more detailed view or construction of kinase–substrate relationship networks. Specialized visualizations of such networks are provided (see later sections), which allow for an intuitive visual inspection of the results of the analysis. Perseus is not limited to physical interaction networks: co-expression clustering provides a powerful alternative to regular hierarchical clustering for expression proteomics studies. Finally, any network collection can be exported from the workflow in a plain text file format (Supplementary Data 1) for sharing or use in any other external tools, such as Cytoscape. In order to accommodate these new capabilities in the Perseus plugin system, we extended the Perseus plugin API with new programming interfaces for the network collection and other associated data types, as well as the respective import, processing, and analysis interfaces (see Figure S2.) 
This fully featured API is available to all developers wishing to extend Perseus’s functionality with plugins. All analyses presented in this Article adhere to the new API.
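As an illustration of how simple such import parsers can be, the following sketch reads Cytoscape's simple interaction format, in which each line lists a source node, an interaction type, and one or more target nodes, and a line with a single name denotes an isolated node. The function name is hypothetical and the code is not the parser shipped with Perseus.

```python
def parse_sif(text):
    """Parse SIF text into a set of nodes and a list of (source, type, target)."""
    nodes, edges = set(), []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        # Tab-delimited if the line contains a tab, whitespace-delimited otherwise.
        fields = line.split("\t") if "\t" in line else line.split()
        source = fields[0]
        nodes.add(source)
        if len(fields) >= 3:
            interaction = fields[1]
            for target in fields[2:]:
                nodes.add(target)
                edges.append((source, interaction, target))
    return nodes, edges
```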
In order to better leverage the existing network analysis ecosystem, we additionally implemented a new mode of interoperability between Perseus and external tools (Figure 3). The PluginInterop project enables this functionality and allows the user to run external tools from within the Perseus workflow, most prominently scripts written in the popular R and Python languages. Open-source companion libraries for R (PerseusR, https://github.com/cox-labs/PerseusR) and Python (perseuspy, https://github.com/cox-labs/perseuspy) provide utilities for interfacing with Perseus. As a result, network analysis tools originally implemented in external tools can run from within the Perseus workflow with only minor adjustments. The implementations of the PHOTON and WGCNA plugins presented in this Article are based upon PluginInterop and its companion libraries. Instructions for interested developers on how to write scripts for Perseus or how to adapt existing tools can be found on the PluginInterop website (https://github.com/cox-labs/PluginInterop). In the following sections, we will present a number of network analyses which are now implemented in Perseus, with focus on their application to different types of proteomics data.


Figure 3. Schematic of the Perseus plugin system. Plugins written in C# are native to Perseus and implement their functionality directly on top of the application programming interfaces and data structures provided by the application framework. PluginInterop enables the execution of scripts in the Python and R languages, as well as other external programs. By communicating via the file system, data are transferred between Perseus and the external program. The companion libraries ‘perseuspy’ and ‘PerseusR’ enable developers to access the data science ecosystem in their language of choice. For custom graphical user interface elements and an improved user experience of external tools, developers can implement a thin C# wrapper class that extends the generic functionality of PluginInterop.

Affinity Enrichment MS Interactomics

Affinity purification or enrichment coupled to MS analysis has become a powerful tool for interrogating PPIs. (24,25) Not only is it able to provide a detailed view on proteins of interest, but it can also determine the basic building blocks for the assembly of large-scale PPI networks. (26,27) Historically, protein complex members were detected by subjecting the sample to a series of purification steps followed by MS identification. With the advent of quantitative MS, detecting even transient interactions has become possible by relying not on the identification itself, but instead on quantitative information. The sample is not purified but only enriched for the protein of interest and its interaction partners and then subjected to MS quantification. (28)
Confidently identifying bona fide interactions and distinguishing them from background binders, arising from off-target binding or contamination, require data analysis of replicate case and control measurements. Compared to purely fold-change-based methods, statistical tests provide a powerful way to compare case and control samples by calculating a test statistic and an associated p-value and limit the number of false-positives. For visual inspection of the results, the (negative logarithm of the) p-value can be plotted against the size of the effect, i.e., the difference between the means of logarithmic abundances, in a so-called volcano plot. Since one statistical test is performed for each protein, which amounts to a large number of tests performed simultaneously, the significance level needs to be adjusted to avoid increased numbers of false positives due to the multiple hypothesis testing problem. (29) A popular strategy to adjust for multiple testing is to control the false discovery rate (FDR), which can be achieved by permutation-based methods. Furthermore, in the volcano plot method it is necessary to define the functional form of the curves that separate significant from non-significant hits, either by straight lines or, in a more sophisticated way, introduced in the significance analysis of microarrays (SAM) method, (30) by modifying the t test statistic with the background variance parameter s0. This standard workflow is available in Perseus but becomes increasingly cumbersome for interaction screens with more than a handful of baits. Parameter values for s0 and the FDR thresholds are often applied separately for each pulldown, inviting overfitting and cherry-picking, and also requiring results be subsequently combined manually.
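The effect of the s0 parameter can be illustrated with a sketch of the modified test statistic, d = (mean difference) / (s + s0), where s is the standard error of the difference. The sketch assumes a pooled-variance two-sample setting as in the SAM approach; the function name is hypothetical, and the permutation-based FDR estimation is not shown.

```python
import math
import statistics


def s0_t_statistic(case, control, s0=1.0):
    """Return (difference of means, s0-modified t statistic) for log intensities."""
    n1, n2 = len(case), len(control)
    difference = statistics.mean(case) - statistics.mean(control)
    pooled_variance = (
        (n1 - 1) * statistics.variance(case)
        + (n2 - 1) * statistics.variance(control)
    ) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    # s0 > 0 shrinks the statistic, most strongly for proteins with small variance,
    # so that tiny but very reproducible differences are not called significant.
    return difference, difference / (standard_error + s0)
```

With s0 = 0 the statistic reduces to the ordinary two-sample t statistic; larger s0 values bend the significance threshold in the volcano plot away from the axis of small effect sizes.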
We implemented the interactive multi-volcano plot (Figure 4a) to analyze interaction screens with arbitrarily many baits and conditions simultaneously. Given the experimental design of the dataset, defined by baits and conditions, the analysis is applied to each experiment. FDR threshold and s0 parameters for the two confidence classes, Class A (high) and Class B (low), can be selected globally. For sufficiently large datasets, instead of dedicated control samples, an internal control can be assembled from the dataset for each pulldown consisting of pulldowns of other, unrelated baits. The results can be inspected through an interactive user interface. All volcano plots are displayed in the overview panel. A multi-functional detail panel shows more information on selected plots and provides zoom, protein selection, and labeling options. If a single plot is selected, the volcano plot is shown in the detail panel. When two plots are selected, the t test differences between the selected experiments are plotted against each other, highlighting changes in the enrichment of proteins between experiments (Figure 4b). Additionally, all data can be browsed in tabular form, making it easily searchable and allowing for rich styling options. Known interactors or gene ontology annotations matching the experiment can be used to highlight proteins in the plot and can serve as a positive control for the adjustment of test parameters. Since all test parameters are controlled on a global level, overfitting and cherry-picking of parameter values are effectively prevented. We integrated the multi-volcano analysis into the new network module. Results from PPI screens can be exported as network objects into the Perseus workflow. A specialized node-link visualization with multiple layers of information, based on the open-source cytoscape.js library, (31,32) allows for easy interpretation of the results (Figure 4c). A PPI network that was newly created in this way can be integrated with existing networks or exported in various formats using the functions available through the network module.

Figure 4. AP-MS. (a) The Hawaii plot provides an overview of an entire dataset, in this case consisting of three baits in two conditions (Table S1 and Experimental Section). (14) Each volcano plot displays the results of a pull-down of a specific bait (Bap1, Eed, Ring1b) in one of the ESC or NPC cell lines. Significant interactors are determined using a permutation-based FDR, and the resulting high-confidence Class A (solid line) and low-confidence Class B (dashed lines) thresholds are displayed in the plot. In this case, Class B interactions were found only in the Bap1 ESC pull-down. Class A interactors are displayed in dark gray, other proteins are shown in light gray. (b) Enrichment plot comparing the Eed pull-downs in ESC and NPC cell lines. Significant interactors in either of the two conditions are displayed in black, nonsignificant proteins are displayed in light gray. Proteins differentially enriched in one of the two conditions will be located far from the diagonal and can be identified visually. (c) Visualization of the resulting protein interaction network for both cell lines. Bait proteins are colored in green, and their interactors are colored in blue. Thick lines represent Class A interactions, thinner lines Class B. Interactions which were already annotated in the human CORUM database are highlighted in red.

As an example application, we obtained pull-down experiments of Polycomb group proteins from ref (14), covering the three baits Bap1, Eed, and Ring1b in mouse embryonic stem cells (ESCs) and neural progenitor cells (NPCs). The filtered dataset contained 2995 proteins (Table S1). Using the new multi-volcano analysis (Figure 4a), we obtained an interaction network connecting the bait proteins with their significantly enriched prey proteins. Bait proteins were identified by their gene name, as specified in the annotation rows of the dataset. In order to have a consistent representation, the protein groups of the preys are also identified by their gene names. The resulting network contained 134 nodes and 140 edges. The results were comparable to the original publication, with overlaps between 55% (Ring1b ESC) and 91% (Bap1 ESC) between the previously reported interactions and detected Class A interactions. Differences can be explained by the slightly different methodology used in this Article. We used the s0-modified t test with s0 set to 1.0, and FDRs of 0.01% and 0.2% for Class A and B, respectively, while the authors of ref (14) used individually chosen fold-change and p-value cutoffs for each experiment. No Class C interactions were included. Using the built-in visualization features, such as the enrichment between experiments, we identified several interactions that were conditional on the cell type (Figure 4b). By annotating the newly created protein-interaction network with known complex interactions from CORUM and inspecting the resulting node-link network visualization (Figure 4c), previously known and possibly novel interactions could be distinguished.
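The s0-modified t test mentioned above can be illustrated with a minimal Python sketch. It assumes the SAM-style statistic of ref (30), in which s0 is added to the pooled standard error in the denominator; the function name `s0_t_statistic` and the example intensities are hypothetical, and the permutation-based FDR step is omitted. This is not the Perseus implementation.

```python
import math

def s0_t_statistic(pulldown, control, s0=1.0):
    """Return (difference, statistic) for one protein.

    pulldown, control: lists of log-intensities from replicate samples.
    Assumed form: d = (mean1 - mean2) / (pooled_se + s0).
    """
    n1, n2 = len(pulldown), len(control)
    mean1 = sum(pulldown) / n1
    mean2 = sum(control) / n2
    # Sums of squared deviations within each group
    ss1 = sum((x - mean1) ** 2 for x in pulldown)
    ss2 = sum((x - mean2) ** 2 for x in control)
    # Pooled standard error of the difference of means
    pooled_se = math.sqrt((1 / n1 + 1 / n2) * (ss1 + ss2) / (n1 + n2 - 2))
    diff = mean1 - mean2
    return diff, diff / (pooled_se + s0)
```

With s0 = 0 the statistic reduces to the ordinary two-sample t statistic. A positive s0 shrinks the statistic of proteins with small differences and very small variance, so that they are less likely to pass the significance threshold.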
Further confidence in the existence of an interaction between a protein identified in a pulldown and the bait can be obtained by correlation analysis. The correlation of the intensity profiles over many pulldowns with the bait intensity profile is reported in the output tables, together with the volcano plot-derived significance of the interaction. When assembling the interaction network, a threshold is applied to this correlation in order to define an additional class of interactions (Class C), which might not have been found by volcano plot analysis (Class A and Class B). This workflow is especially appealing for interaction screens with a large number of bait proteins.
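The correlation-based class assignment can be sketched as follows. The sketch assumes a Pearson correlation between the prey and bait intensity profiles and a hypothetical threshold of 0.8; neither the correlation measure nor the threshold is specified in the text, and the function names are illustrative only.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long intensity profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(vx * vy)

def classify_interaction(volcano_class, correlation, threshold=0.8):
    """Keep volcano-derived classes; otherwise assign Class C by correlation.

    volcano_class: "A", "B", or None (not significant in the volcano plot).
    threshold: hypothetical cutoff on the bait-prey profile correlation.
    """
    if volcano_class in ("A", "B"):
        return volcano_class
    return "C" if correlation >= threshold else None
```

A prey that is not significant in any single volcano plot but whose profile over many pulldowns closely follows the bait would be reported as Class C under these assumptions.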

Importing, Curating, and Probing Large-Scale PPI Networks

While protein interaction screens can uncover novel or condition-specific interactions, a wealth of detected and predicted interactions are already stored in PPI databases. (33) Analyzing large-scale PPI networks jointly with other omics data has great promise. However, a major obstacle to performing systems-level analysis on these large-scale networks is the lack of easy-to-use software solutions to transparently handle the processing and analysis of these networks. Many studies under-utilize the existing resources and mostly report the interactions of a single protein as an afterthought. In the following, we introduce the new network capabilities of Perseus to assemble, filter, and understand large-scale PPI networks, which lay the foundation for any network analysis.
The first task is assembling a high-confidence interaction network. Many databases, such as STRING, (20) BioGRID, (21) or HIPPIE, (34) allow researchers to download all interactions in a tabular format, which can be easily loaded into Perseus, even at sizes of up to a few million interactions. Supporting information on the interactions, such as the interaction type or a measure of confidence, remains available at each step in the subsequent data analysis. Networks are not restricted to a single data source. Perseus provides all necessary tools to integrate information from any source, providing full control over the choice of identifier and the handling of duplication and ambiguity in the mapping. Conversely, generalized interaction networks such as STRING can be filtered by interaction type to generate a physical interaction network. Confidence measures often integrate diverse knowledge into a single score, derived from how often, and by which experimental technique, an interaction was detected, combined with more abstract measures, such as co-expression and literature co-occurrence of the interaction partners. (20) There are two approaches to interaction-confidence-aware network analysis (Figure 5a). Applying a cutoff to the confidence score removes low-confidence interactions from the network, which is especially useful when applying methods that treat all interactions equally. The cutoff can be chosen according to the confidence score distribution and the targeted network size (Figure 5b). Other methods operate on weighted networks and distinguish between interactions with high or low confidence. In this case, the confidence scores can be used as edge weights. In addition to static confidence scores, one can devise dynamic confidence scores from experimental data which reflect, e.g., changes in abundance or localization of any of the interactors.
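The cutoff-based approach can be sketched in a few lines of Python. The edge list format (source, target, score), the function names, and the example cutoffs are assumptions made for illustration; Perseus performs these steps through its own workflow activities.

```python
def filter_by_confidence(edges, cutoff):
    """Keep only edges whose confidence score reaches the cutoff.

    edges: list of (source, target, score) tuples.
    """
    return [(a, b, s) for a, b, s in edges if s >= cutoff]

def edges_remaining_per_cutoff(edges, cutoffs):
    """Number of edges that would survive each candidate cutoff.

    This mirrors the superimposed curve described for Figure 5b and can
    guide the choice of a cutoff for a targeted network size.
    """
    return {c: sum(1 for _, _, s in edges if s >= c) for c in cutoffs}
```

For weighted methods, no filtering is needed: the third element of each tuple can be passed on directly as the edge weight.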

Figure 5. Handling large-scale protein interaction databases in Perseus. (a) Interactions in PPI databases are often annotated with confidence scores derived from various sources. Perseus provides tools to load and combine confidence scores derived from a variety of data sources, including dynamic confidence adjustments based on condition-specific data. High-confidence networks can be obtained by removing edges below a given hard threshold or, alternatively, confidence scores can be utilized directly as so-called edge weights, thereby allowing for the inclusion of lower-confidence interactions. (b) Histogram of the combined confidence score from the human STRING PPI network. Superimposed in orange is the number of interactions in the filtered network if the edges with scores lower than the current value were removed. Filtering out low-confidence edges leads to a significant reduction in the number of edges in the final network. (c) Log–log plot of the node degree against the degree frequency generated from the human STRING PPI network. The R² value of the linear fit (orange) to these data represents the scale-free fit index.

A deeper understanding of the network requires a different perspective in addition to the interaction-centric view. Any list of interactions can be converted into a network collection with a single click. A dedicated set of network-specific processing activities is then available. While processing the list of interactions, the focus remains on the edges of the network. In the network view, the focus is shifted to the nodes. With the powerful identifier and data mapping mechanisms in Perseus, nodes are easily annotated with various annotations, such as gene ontology (GO), (35) or quantitative proteomics data. Any annotation can be subsequently used to filter the nodes of the network. One could, for example, extract a sub-network of proteins associated with a specific GO category and their interactions from the large-scale network. Using the data mapping from, e.g., deep proteomes of specific cell lines or tissues, condition-specific sub-networks can be created.
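Extracting an annotation-specific sub-network amounts to filtering the nodes and keeping the edges between the remaining nodes. The following Python sketch shows this under the assumption that annotations are available as a mapping from node name to a set of terms; the function name and data layout are hypothetical.

```python
def subnetwork_by_annotation(edges, node_annotations, term):
    """Return the edges whose two endpoints both carry the given term.

    edges: list of (source, target) tuples.
    node_annotations: dict mapping node name to a set of annotation terms.
    """
    keep = {node for node, terms in node_annotations.items() if term in terms}
    return [(a, b) for a, b in edges if a in keep and b in keep]
```

The same pattern applies to condition-specific sub-networks, where the retained node set would instead be defined by the proteins quantified in a given cell line or tissue.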
Further understanding is gained by studying the intrinsic properties of networks. By calculating node degrees, corresponding to the number of neighbors of each node in the network, hub nodes can be distinguished from peripheral nodes. By analyzing the distribution of the node degrees in the network, global network properties, such as approximate scale-freeness, (36,37) of the topology can be identified (Figure 5c). Furthermore, intrinsic local network properties, like the node degree, can be correlated with biological properties derived from protein annotations or experimental data. The proper construction of large-scale interaction networks and understanding of their basic properties are central to the successful application of more specialized analyses such as the integration of such networks with PTM data.
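Node degrees and the scale-free fit index described for Figure 5c can be computed as in the sketch below. It assumes an undirected edge list without self-loops or duplicate edges and fits an unbinned log–log regression of degree frequency against degree; the exact binning used by Perseus is not stated in the text, so this is an approximation with hypothetical function names.

```python
import math
from collections import Counter

def node_degrees(edges):
    """Number of neighbors of each node in an undirected edge list."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return dict(degrees)

def scale_free_fit_index(degrees):
    """R^2 of a linear fit of log10(frequency) against log10(degree)."""
    freq = Counter(degrees.values())
    xs = [math.log10(k) for k in freq]
    ys = [math.log10(count) for count in freq.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # fewer than two distinct degrees or frequencies
    return sxy ** 2 / (sxx * syy)
```

Hub nodes are those with the highest values in the dictionary returned by `node_degrees`; a fit index close to 1 indicates an approximately scale-free degree distribution.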

Network Analysis of PTM Data

The MS-based study of PTMs is now possible on a global scale for several types of modifications. The best known example is MS-based phosphoproteomics, (38) which is a powerful tool for interrogating signaling events on a large scale. However, drawing conclusions directly from phosphorylation changes is challenging, due to the mostly missing functional information on the inhibitory or excitatory action of a specific protein phosphorylation at a specific site. Network-based approaches for the analysis of phosphorylation data derive functional information on the protein level by interrogating the phosphorylation changes observed in the network neighborhood. (17,39,40)
We implemented the popular kinase–substrate enrichment analysis (39) (KSEA) tool for predicting kinase activities in Perseus. Site-specific kinase–substrate networks (Figure 6a) assign kinases to the experimentally observed phosphorylation sites. The core of the analysis is the calculation of a series of scores (mean, enrichment, Z-score, p-value, q-value) for each kinase, based on the quantitative phosphorylation changes of its substrates. These predicted kinase activities can be analyzed further to find differentially activated kinases. KSEA most often utilizes the curated kinase–substrate network from the PhosphoSitePlus database. (18,41,42) In order to extend the coverage of the network and thereby allow for the utilization of a larger fraction of the experimental data, the network can be supplemented with predicted kinase–substrate interactions from tools such as NetworKIN (43,44) or with low-specificity interactions derived from kinase target sequence motifs.
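The per-kinase scores can be illustrated with a short Python sketch. It assumes the commonly used KSEA formulation, in which the Z-score compares the mean change of a kinase's substrates with the mean of all quantified sites, scaled by the square root of the number of substrates and the standard deviation of all sites, and in which a one-sided p-value is taken from the normal distribution. The function name and data layout are hypothetical, and the enrichment and q-value columns are omitted.

```python
import math

def ksea_scores(site_changes, kinase_substrates):
    """Compute mean, Z-score and p-value per kinase.

    site_changes: dict mapping site identifier to log fold change.
    kinase_substrates: dict mapping kinase to a list of site identifiers.
    """
    values = list(site_changes.values())
    n = len(values)
    mean_all = sum(values) / n
    sd_all = math.sqrt(sum((v - mean_all) ** 2 for v in values) / (n - 1))
    scores = {}
    for kinase, sites in kinase_substrates.items():
        observed = [site_changes[s] for s in sites if s in site_changes]
        if not observed:
            continue  # no substrate of this kinase was quantified
        m = len(observed)
        mean_sub = sum(observed) / m
        z = (mean_sub - mean_all) * math.sqrt(m) / sd_all
        # One-sided tail probability of the standard normal distribution
        p = 0.5 * math.erfc(abs(z) / math.sqrt(2))
        scores[kinase] = {"mean": mean_sub, "z": z, "p": p}
    return scores
```

Kinases whose substrates are not covered by the data are skipped, which is why extending the kinase–substrate network increases the fraction of the dataset that contributes to the scores.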

Figure 6. Network analysis of an EGF stimulation phosphoproteomics study. (a) Comparison of network topologies used for the analysis of phosphoproteomics data. Nodes in the network are represented as gray circles or pie charts where each slice represents the observed phosphorylation changes at a specific site on the protein. Physical protein–protein interactions (left side) are present between all classes of proteins and are by definition undirected. In order to capture the enzymatic action of kinases more accurately, directed interactions (right side) from kinase to substrate are defined in a site-specific manner. (b) KSEA Z-score and PHOTON signaling functionality scores derived from phosphoproteomics data measured after EGF stimulation (Table S2) only weakly correlate with each other (Pearson correlation 0.52). Kinases annotated in GO with the term ‘Epidermal growth factor receptor signaling pathway’ are highlighted in red. Both methods assign high scores to central members of the expected pathway. (c) Signaling network reconstructed by PHOTON from the 100 highest scoring proteins anchored at EGF. The interactive visualization has an automatic layout and phosphorylation data overlay.

PHOTON, (17) now available in Perseus, is an alternative approach to KSEA that calculates more broadly defined signaling functionality scores for any protein, rather than activities for kinases only. A data-annotated large-scale PPI network now serves as the input (Figure 6a). The resulting signaling functionality scores for each experimental condition are based on the observed phosphorylation in the neighborhood of each protein and are assigned a significance by a permutation scheme. The scores can either be analyzed directly, to find proteins with differentially changing signaling functionality, or utilized in a second step of the PHOTON pipeline, in which signaling pathways that connect the proteins with significant signaling functionality are automatically reconstructed from the large-scale network. (17)
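The general idea of a neighborhood-based score with permutation significance can be sketched as follows. This is a deliberately simplified illustration and not the PHOTON algorithm of ref (17): it assumes an unweighted mean of the values of a protein's neighbors as the score and shuffles values across proteins to obtain an empirical p-value. All names and parameters are hypothetical.

```python
import random

def neighborhood_scores(neighbors, values, n_permutations=1000, seed=0):
    """Score each node by the mean value of its neighbors.

    neighbors: dict mapping node to a list of neighboring nodes.
    values: dict mapping node to a protein-level phosphorylation change.
    Returns a dict mapping node to (score, empirical two-sided p-value).
    """
    rng = random.Random(seed)
    nodes = list(values)
    pool = [values[n] for n in nodes]

    def mean_over_neighbors(node, vals):
        observed = [vals[n] for n in neighbors[node] if n in vals]
        return sum(observed) / len(observed) if observed else None

    scores = {n: mean_over_neighbors(n, values) for n in neighbors}
    scores = {n: s for n, s in scores.items() if s is not None}
    exceed = dict.fromkeys(scores, 0)
    for _ in range(n_permutations):
        rng.shuffle(pool)  # reassign values to nodes at random
        shuffled = dict(zip(nodes, pool))
        for n, s in scores.items():
            if abs(mean_over_neighbors(n, shuffled)) >= abs(s):
                exceed[n] += 1
    # Add-one correction keeps the empirical p-value strictly positive
    return {n: (s, (exceed[n] + 1) / (n_permutations + 1))
            for n, s in scores.items()}
```

Because the score depends only on the neighbors, a protein without any measured phosphorylation site of its own can still receive a score, which reflects the higher coverage that a neighborhood-based approach offers.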
The Perseus network module allows for performing both KSEA and PHOTON analysis on the same experimental data (17) and a choice of networks. (18,34) When applied to a phosphoproteomics dataset of EGF stimulation, (17) the trade-offs of both methods in terms of coverage can be compared at every step of the analysis by inspecting the matrices and network collections in the workflow. For KSEA, 583 (5.66%) phosphorylation sites could be mapped to 975 (9.43%) site-specific kinase substrate interactions. As expected, PHOTON provided more coverage, with 9148 (99.82%) sites mapped to 2070 (16.87%) nodes in the PPI network. Due to the differences in the utilized methodologies and the chosen networks, resulting scores will differ but can be compared with the analyses and visualizations provided by Perseus (Figure 6b). While KSEA is tailored to the analysis of phosphoproteomics data due to its focus on kinase–substrate interactions, PHOTON is not limited to phosphorylation. Any quantitative, large-scale PTM dataset can be mapped to the PPI network, signaling functionality scores can be calculated, and sub-networks can be reconstructed.
Both tools support the analysis of datasets with multiple conditions, effectively transforming the peptide-level phosphorylation data into protein-level scores. The entire well-established toolset for the analysis of protein quantification data can be applied to these scores, including hierarchical clustering, enrichment analysis, (17) and time-series analysis. To visualize PTM data in the context of any network, we implemented an interactive visualization directly in Perseus (Figure 6c) using the cytoscape.js library. (31) The visualization allows for the joint visual inspection of the networks, e.g., sub-networks reconstructed by PHOTON, and the measured data. Browsing the quantitative PTM data in a reduced and highly structured network view while also considering the signaling functionality scores allows for the generation of hypotheses that explain the signal transduction mechanistically.

Co-expression Clustering and Clinical Data

When performing co-expression analysis, the correlation matrix between the proteins in the dataset describes a fully connected, weighted network, in which the weight on each edge denotes the correlation between the quantitative profiles of the two proteins (Figure 7a). Hence, the actual network usually remains implicit. A hierarchical clustering of the co-expression network can utilize the network neighborhood of each protein and integrate it into the similarity calculation. (45) The cluster dendrogram and the detected co-expression modules are then transferred back to the original data, where their interpretation is equivalent to ordinary hierarchical clustering. In addition to the clustering, a representative expression profile for each of the clusters is generated, which is termed eigengene. This highly reduced view on the data can be correlated with clinical or phenotype data and clustered to gain a better understanding of the behavior of the detected cluster (Figure 7b). The described co-expression analysis is available in Perseus through the R language interface provided by PluginInterop, which interfaces directly with the established WGCNA library. (16)

Figure 7. Co-expression network analysis on clinical data. (a) The correlation matrix is an equivalent representation of a fully connected network with edge weights corresponding to the correlation between the proteins. (b) Co-expression clustering and identified co-expression modules annotate the original expression matrix. Phenotype data can be correlated with representative co-expression module profiles and provide a high-level interpretation of the modules. (c) Parameter selection of the power parameter for the Yanovich et al. (19) dataset (Table S3 and Experimental Section). The lowest power reaching close to a high scale-free fit index of 0.9 (red line) was selected. (d) Co-expression cluster dendrogram. Each color corresponds to one co-expression module. (e) Correlation heat map between module eigengenes and clinical parameters.

We applied the WGCNA co-expression analysis to parts of a cancer proteomics dataset (19) following the recommended workflow (http://www.peterlangfelder.com/wgcna-resources-on-the-web/) from within Perseus. Bi-weight midcorrelation, a robust alternative to Pearson correlation, was chosen to calculate correlations between all pairs of proteins. In order to obtain a scale-free co-expression network, a power parameter of 10 was selected (Figure 7c), leading to an approximately scale-free network with a scale-free fit index of 0.9. Hierarchical clustering of the co-expression network identified 30 modules (Figure 7d). The representative expression profiles of each of the modules, as provided by the corresponding module eigengene, were correlated with the available clinical annotations. This high-level overview of the data was then visualized in a heatmap (Figure 7e). Several modules showed high correlations with specific clinical annotations. The magenta module showed high correlation with the triple-negative subtype (TN) and was significantly enriched for the ‘interferon-gamma-mediated signaling pathway’ GO category (q = 1.12 × 10⁻⁵). The top module hub genes with kME > 0.8 were GBP1, TAP1, TAPBP, HLA-A, TAP2, STAT1, and EML4. The purple module showed high correlation with Stage III, but when inspecting the profile of its eigengene, we found it to have a single peak at one patient while being flat for all others. Hence, a set of proteins that are highly expressed in only a single patient dominates the purple module, thereby limiting the validity of the module.
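The role of the power parameter can be illustrated with a small Python sketch of soft thresholding. It assumes the unsigned adjacency |cor|^power and, for brevity, uses Pearson correlation instead of the bi-weight midcorrelation applied above; it does not call the WGCNA library, and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def soft_threshold_adjacency(profiles, power=10):
    """Weighted adjacency |cor|^power between all pairs of proteins.

    profiles: dict mapping protein name to its expression profile.
    """
    names = list(profiles)
    adjacency = {name: {} for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            weight = abs(pearson(profiles[a], profiles[b])) ** power
            adjacency[a][b] = weight
            adjacency[b][a] = weight
    return adjacency

def connectivity(adjacency):
    """Sum of edge weights per protein, the weighted analog of node degree."""
    return {name: sum(weights.values()) for name, weights in adjacency.items()}
```

Raising the correlation to a high power suppresses weak correlations while retaining strong ones. The distribution of the resulting connectivities is what the scale-free fit index in Figure 7c is evaluated on for each candidate power.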

Software Implementation, Download, and Maintenance

The Perseus network module PerseusNet is implemented in the C# programming language using Visual Studio 2017, like the whole Perseus software. PerseusNet is distributed with Perseus by default and can be downloaded from http://www.perseus-framework.org. The current version, which is described in this Article, is 1.6.2.3. The PluginInterop and PHOTON plugins are also included in the standard download. In the current release, it is recommended to use Windows as operating system, although Linux support is underway, realized in the same way as for the MaxQuant software, (46,47) by ensuring Mono compatibility. A plugin API enables external programmers to extend the functionality of PerseusNet and Perseus in general, by programming their own workflow activities. Plugin extensions by the user community will be linked from the plugin store at http://www.coxdocs.org/doku.php?id=perseus:user:plugins:store upon request. Context-specific documentation is linked from each activity (Figure S3). Step-by-step guides for the integration of external tools, such as Python or R, that have to be installed and configured separately from the main Perseus software, are available online (https://github.com/jdrudolph/PluginInterop). A help forum for Perseus and PerseusNet is available at https://groups.google.com/group/perseus-list. Bugs that are reproducible in the latest available software version should be reported at https://maxquant.myjetbrains.com/youtrack. All presented analyses and necessary installations take less than an hour altogether.

Conclusions

We introduced PerseusNet, the network analysis extension for the Perseus software. It enables proteomics researchers to perform most network analyses by themselves. PerseusNet is highly extensible through a plugin API and its extension to R and Python, which allows for the incorporation of a plethora of existing scripts and programs from the network community. We envision that a large part of the future programming will be done not by local developers but by the global community through the plugin API. Programmers can release their plugins under licenses of their choice.
We have implemented powerful proteomics-specific activities for AP-MS network generation and PTM-related network analysis, presumably the two main applications for networking in proteomics. We plan to extend PerseusNet in the near future with activities from other proteomics sub-domains, such as interaction determination by protein correlation profiling (48) and large-scale network generation from cross-linking experiments on whole-cell lysates. (49)

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jproteome.8b00927.

  • Figure S1, Graphical workflow combining matrix and network activities; Figure S2, Organization of the new Perseus plugin API for networks; Figure S3, Context-specific documentation; and explanations of Tables S1–S3 (PDF)

  • Table S1, AP-MS pull screen (TXT)

  • Table S2, phosphoproteomics of EGF stimulation (TXT)

  • Table S3, clinical proteomics dataset (TXT)

  • Supplementary Data 1, Perseus network collection data format: example of a network collection describing three small, randomly generated networks (ZIP)

Author Information

  • Corresponding Author
    • Jürgen Cox - Computational Systems Biochemistry, Max-Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany; Department of Biological and Medical Psychology, University of Bergen, Jonas Liesvei 91, 5009 Bergen, Norway; ORCID: http://orcid.org/0000-0001-8597-205X; Email: [email protected]
  • Author
    • Jan Daniel Rudolph - Computational Systems Biochemistry, Max-Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany; ORCID: http://orcid.org/0000-0003-4196-057X
  • Author Contributions

    J.D.R. and J.C. planned and performed the research, developed the software, and wrote the manuscript.

  • Notes
    The authors declare no competing financial interest.

Acknowledgments

We thank J. Sebastian Paez and Sung-Huan Yu for contributing to PerseusR, and Caroline Friedel and Tamar Geiger for helpful discussions. This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement no. 686547 and from the FP7 grant agreement GA ERC-2012-SyG_318987–ToPAG.

References

This article references 49 other publications.

  1. 1
    Bar-Yam, Y. General Features of Complex Systems. Knowledge Management, Organizational Intelligence and Learning, and Complexity, Vol. I; Encyclopedia of Life Support Systems, 1997; www.eolss.net
  2. 2
    O’Connor, T.; Wong, H. Y. Emergent Properties. Stanford Encyclopedia of Philosophy, Spring 2012 ed.; Metaphysics Research Lab, Stanford University, 2012; https://plato.stanford.edu/archives/spr2012/entries/properties-emergent/
  3. 3
    Grandjean, M. A social network analysis of Twitter: Mapping the digital humanities community. Cogent Arts Humanit. 2016, 3, 1171458,  DOI: 10.1080/23311983.2016.1171458
  4. 4
    Goffman, C. And What is Your Erdös Number?. Am. Math. Mon. 1969, 76, 791,  DOI: 10.1080/00029890.1969.12000324
  5. 5
    Sporns, O.; Tononi, G.; Kötter, R. The human connectome: A structural description of the human brain. PLoS Comput. Biol. 2005, 1, e42,  DOI: 10.1371/journal.pcbi.0010042
  6. 6
    Seung, S. I am my connectome. TED: Ideas Worth Spreading , 2010; https://www.ted.com/talks/sebastian_seung.
  7. 7
    Seung, H. S. Connectome: How the Brain’s Wiring Makes Us Who We Are; Penguin Books Ltd., 2013.
  8. 8
    Barabási, A.-L.; Oltvai, Z. N. Network biology: understanding the cell’s functional organization. Nat. Rev. Genet. 2004, 5, 101113,  DOI: 10.1038/nrg1272
  9. 9
    Butte, A. J.; Kohane, I. S. Mutual Information Relevance Networks: Functional Genomic Clustering Using Pairwise Entropy Measurements. Biocomputing 2000, 2000, 418429,  DOI: 10.1142/9789814447331_0040
  10. 10
    Sinitcyn, P.; Rudolph, J. D.; Cox, J. Computational Methods for Understanding Mass Spectrometry-Based Shotgun Proteomics Data. Annu. Rev. Biomed. Data Sci. 2018, 1, 207234,  DOI: 10.1146/annurev-biodatasci-080917-013516
  11. 11
    Tyanova, S. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat. Methods 2016, 13, 73140,  DOI: 10.1038/nmeth.3901
  12. 12
    Tyanova, S.; Cox, J. Methods in Molecular Biology Springer, 2018; Vol. 1711, pp 133148.
  13. 13
    Shannon, P. Cytoscape: A software Environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 24982504,  DOI: 10.1101/gr.1239303
  14. 14
    Kloet, S. L. The dynamic interactome and genomic targets of Polycomb complexes during stem-cell differentiation. Nat. Struct. Mol. Biol. 2016, 23, 682690,  DOI: 10.1038/nsmb.3248
  15. 15
    Smith, C. L.; Blake, J. A.; Kadin, J. A.; Richardson, J. E.; Bult, C. J. Mouse Genome Database (MGD)-2018: Knowledgebase for the laboratory mouse. Nucleic Acids Res. 2018, 46, D83642,  DOI: 10.1093/nar/gkx1006
  16. 16
    Langfelder, P.; Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf. 2008, 9, 559,  DOI: 10.1186/1471-2105-9-559
  17. 17
    Rudolph, J. D.; de Graauw, M.; van de Water, B.; Geiger, T.; Sharan, R. Elucidation of Signaling Pathways from Large-Scale Phosphoproteomic Data Using Protein Interaction Networks. Cell Syst. 2016, 3, 585593,  DOI: 10.1016/j.cels.2016.11.005
  18. 18
    Hornbeck, P. V. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 2015, 43, D51220,  DOI: 10.1093/nar/gku1267
  19. 19
    Yanovich, G. Clinical Proteomics of Breast Cancer Reveals a Novel Layer of Breast Cancer Classification. Cancer Res. 2018, 78, 60016010,  DOI: 10.1158/0008-5472.CAN-18-1079
  20. 20
    Szklarczyk, D. The STRING database in 2017: Quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 2017, 45, D362D368,  DOI: 10.1093/nar/gkw937
  21. 21
    Chatr-Aryamontri, A. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017, 45, D369D379,  DOI: 10.1093/nar/gkw1102
  22. 22
    Orchard, S. The MIntAct project - IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Res. 2014, 42, D35863,  DOI: 10.1093/nar/gkt1115
  23. 23
    Ruepp, A. CORUM: The comprehensive resource of mammalian protein complexes-2009. Nucleic Acids Res. 2010, 38, D497501,  DOI: 10.1093/nar/gkp914
  24. 24
    Gingras, A. C.; Gstaiger, M.; Raught, B.; Aebersold, R. Analysis of protein complexes using mass spectrometry. Nat. Rev. Mol. Cell Biol. 2007, 8, 645654,  DOI: 10.1038/nrm2208
  25. 25
    Dunham, W. H.; Mullin, M.; Gingras, A. C. Affinity-purification coupled to mass spectrometry: Basic principles and strategies. Proteomics 2012, 12, 1576,  DOI: 10.1002/pmic.201100523
  26. 26
    Hein, M. Y. A Human Interactome in Three Quantitative Dimensions Organized by Stoichiometries and Abundances. Cell 2015, 163, 712723,  DOI: 10.1016/j.cell.2015.09.053
  27. 27
    Huttlin, E. L. The BioPlex Network: A Systematic Exploration of the Human Interactome. Cell 2015, 162, 425440,  DOI: 10.1016/j.cell.2015.06.043
  28. 28
    Hubner, N. C. Quantitative proteomics combined with BAC TransgeneOmics reveals in vivo protein interactions. J. Cell Biol. 2010, 189, 739754,  DOI: 10.1083/jcb.200911091
  29. 29
    Noble, W. S. How does multiple testing correction work?. Nat. Biotechnol. 2009, 27, 11351137,  DOI: 10.1038/nbt1209-1135
  30. 30
    Tusher, V. G.; Tibshirani, R.; Chu, G. Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. U. S. A. 2001, 98, 51165121,  DOI: 10.1073/pnas.091062498
  31. 31
    Franz, M. Cytoscape.js: A graph theory library for visualisation and analysis. Bioinformatics 2015, 32, 309311,  DOI: 10.1093/bioinformatics/btv557
  32. 32
    Dogrusoz, U.; Giral, E.; Cetintas, A.; Civril, A.; Demir, E. A layout algorithm for undirected compound graphs. Inf. Sci. (N. Y.) 2009, 179, 980994,  DOI: 10.1016/j.ins.2008.11.017
  33. 33
    Pedamallu, C. S.; Ozdamar, L. A Review on protein-protein interaction network databases. In Springer Proceedings in Mathematics and Statistics; Springer, 2014; Vol. 73, pp 511519.
  34. 34
    Alanis-Lobato, G.; Andrade-Navarro, M. A.; Schaefer, M. H. HIPPIE v2.0: Enhancing meaningfulness and reliability of protein-protein interaction networks. Nucleic Acids Res. 2017, 45, D40814,  DOI: 10.1093/nar/gkw985
  35. 35
    Gene Ontology Consortium, C. Gene Ontology Consortium: going forward. Nucleic Acids Res. 2015, 43, D104956,  DOI: 10.1093/nar/gku1179
  36. 36
    Clauset, A.; Shalizi, C. R.; Newman, M. E. J. Power-law distributions in empirical data. SIAM Rev. 2009, 51, 661,  DOI: 10.1137/070710111
  37. 37
    Albert, R. Scale-free networks in cell biology. J. Cell Sci. 2005, 118, 4947,  DOI: 10.1242/jcs.02714
  38. 38
    Riley, N. M.; Coon, J. J. Phosphoproteomics in the Age of Rapid and Deep Proteome Profiling. Anal. Chem. 2016, 88, 7494,  DOI: 10.1021/acs.analchem.5b04123
  39. 39
    Casado, P. Kinase-substrate enrichment analysis provides insights into the heterogeneity of signaling pathway activation in leukemia cells. Sci. Signaling 2013, 6, rs6rs6,  DOI: 10.1126/scisignal.2003573
  40. 40
    Hernandez-Armenta, C.; Ochoa, D.; Gonçalves, E.; Saez-Rodriguez, J.; Beltrao, P. Benchmarking substrate-based kinase activity inference using phosphoproteomic data. Bioinformatics 2017, 33, 18451851,  DOI: 10.1093/bioinformatics/btx082
  41. 41
    Herranz, N. mTOR regulates MAPKAPK2 translation to control the senescence-associated secretory phenotype. Nat. Cell Biol. 2015, 17, 1205,  DOI: 10.1038/ncb3225
  42. 42
    Wilkes, E. H.; Terfve, C.; Gribben, J. G.; Saez-Rodriguez, J.; Cutillas, P. R. Empirical inference of circuitry and plasticity in a kinase signaling network. Proc. Natl. Acad. Sci. U. S. A. 2015, 112, 7719,  DOI: 10.1073/pnas.1423344112
  43. 43
    Linding, R. NetworKIN: A resource for exploring cellular phosphorylation networks. Nucleic Acids Res. 2007, 36, D69599,  DOI: 10.1093/nar/gkm902
  44. 44
    Wiredja, D. D.; Koyutürk, M.; Chance, M. R. The KSEA App: a web-based tool for kinase activity inference from quantitative phosphoproteomics. Bioinformatics 2017, 33, 3489,  DOI: 10.1093/bioinformatics/btx415
  45. 45
    Zhang, B.; Horvath, S. A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 2005, 4, 17,  DOI: 10.2202/1544-6115.1128
  46. 46
    Cox, J.; Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 2008, 26, 1367–1372,  DOI: 10.1038/nbt.1511
  47. 47
    Sinitcyn, P. MaxQuant goes Linux. Nat. Methods 2018, 15, 401,  DOI: 10.1038/s41592-018-0018-y
  48. 48
    Kristensen, A. R.; Foster, L. J. Protein correlation profiling-SILAC to study protein-protein interactions. Methods Mol. Biol. 2014, 1188, 263,  DOI: 10.1007/978-1-4939-1142-4_18
  49. 49
    Liu, F.; Lössl, P.; Scheltema, R.; Viner, R.; Heck, A. J. R. Optimized fragmentation schemes and data analysis strategies for proteome-wide cross-link identification. Nat. Commun. 2017, 8, 15473,  DOI: 10.1038/ncomms15473




  • Abstract

    Figure 1

    Figure 1. Schematic overview of the new network functionality in Perseus. PerseusNet implements a number of processing and analysis steps facilitated by the network collection data type. In addition to proteomics-centric analyses, such as the analysis of interaction screens, the network module provides a number of general-purpose tools, for instance, for network annotation, filtering, and topology determination. With the extension of the Perseus plugin API to networks and to other programming languages, it becomes possible to integrate existing network analysis tools into Perseus. Networks are easily imported into and exported from Perseus thanks to its support for standard formats.

    Figure 2

    Figure 2. Schematic representation of the network collection data type. User-facing information is displayed in tabular form with tables listing the networks in the collection, as well as providing detailed information on the nodes and edges of each network. Internally an auxiliary graph data structure aids in the implementation of graph algorithms. Node- and edge-mapping provide the required cross-references between the tabular and graph representation.

    Figure 3

    Figure 3. Schematic of the Perseus plugin system. Plugins written in C# are native to Perseus and implement their functionality directly on top of the application programming interfaces and data structures provided by the application framework. PluginInterop enables the execution of scripts in the Python and R languages, as well as other external programs. By communicating via the file system, data are transferred between Perseus and the external program. The companion libraries ‘perseuspy’ and ‘PerseusR’ enable developers to access the data science ecosystem in their language of choice. For custom graphical user interface elements and an improved user experience of external tools, developers can implement a thin C# wrapper class that extends the generic functionality of PluginInterop.

    Figure 4

    Figure 4. AP-MS. (a) The Hawaii plot provides an overview of an entire dataset, in this case consisting of three baits in two conditions (Table S1 and Experimental Section). (14) Each volcano plot displays the results of a pull-down of a specific bait (Bap1, Eed, Ring1b) in one of the ESC or NPC cell lines. Significant interactors are determined using a permutation-based FDR, and the resulting high-confidence Class A (solid line) and low-confidence Class B (dashed lines) thresholds are displayed in the plot. In this case, Class B interactions were found only in the Bap1 ESC pull-down. Class A interactors are displayed in dark gray; other proteins are shown in light gray. (b) Enrichment plot comparing the Eed pull-downs in ESC and NPC cell lines. Significant interactors in either of the two conditions are displayed in black; nonsignificant proteins are displayed in light gray. Proteins differentially enriched in one of the two conditions are located far from the diagonal and can be identified visually. (c) Visualization of the resulting protein interaction network for both cell lines. Bait proteins are colored in green, and their interactors are colored in blue. Thick lines represent Class A interactions, thinner lines Class B. Interactions that were already annotated in the human CORUM database are highlighted in red.

    Figure 5

    Figure 5. Handling large-scale protein interaction databases in Perseus. (a) Interactions in PPI databases are often annotated with confidence scores derived from various sources. Perseus provides tools to load and combine confidence scores derived from a variety of data sources, including dynamic confidence adjustments based on condition-specific data. High-confidence networks can be obtained by removing edges below a given hard threshold; alternatively, confidence scores can be used directly as so-called edge weights, thereby allowing for the inclusion of lower-confidence interactions. (b) Histogram of the combined confidence score from the human STRING PPI network. Superimposed in orange is the number of interactions that would remain in the filtered network if edges with scores lower than the current value were removed. Filtering out low-confidence edges substantially reduces the number of edges in the final network. (c) Log–log plot of the node degree against the degree frequency generated from the human STRING PPI network. The R² value of the linear fit (orange) to these data represents the scale-free fit index.

    Figure 6

    Figure 6. Network analysis of an EGF stimulation phosphoproteomics study. (a) Comparison of network topologies used for the analysis of phosphoproteomics data. Nodes in the network are represented as gray circles or as pie charts in which each slice represents the observed phosphorylation changes at a specific site on the protein. Physical protein–protein interactions (left side) are present between all classes of proteins and are by definition undirected. In order to capture the enzymatic action of kinases more accurately, directed interactions (right side) from kinase to substrate are defined in a site-specific manner. (b) KSEA Z-score and PHOTON signaling functionality scores derived from phosphoproteomics data measured after EGF stimulation (Table S2) correlate only weakly with each other (Pearson correlation 0.52). Kinases annotated in GO with the term ‘Epidermal growth factor receptor signaling pathway’ are highlighted in red. Both methods assign high scores to central members of the expected pathway. (c) Signaling network reconstructed by PHOTON from the 100 highest-scoring proteins anchored at EGF. The interactive visualization has an automatic layout and phosphorylation data overlay.

    Figure 7

    Figure 7. Co-expression network analysis on clinical data. (a) The correlation matrix is an equivalent representation of a fully connected network with edge weights corresponding to the correlation between the proteins. (b) Co-expression clustering and identified co-expression modules annotate the original expression matrix. Phenotype data can be correlated with representative co-expression module profiles and provide a high-level interpretation of the modules. (c) Parameter selection of the power parameter for the Yanovich et al. (19) dataset (Table S3 and Experimental Section). The lowest power reaching close to a high scale-free fit index of 0.9 (red line) was selected. (d) Co-expression cluster dendrogram. Each color corresponds to one co-expression module. (e) Correlation heat map between module eigengenes and clinical parameters.
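    The hard-threshold edge filtering and the scale-free fit index described in the captions of Figures 5 and 7 can be illustrated with a minimal sketch. It is not the Perseus or WGCNA implementation: the function names `filter_edges` and `scale_free_fit_index` and the toy network are hypothetical, and the fit is computed on the raw (unbinned) degree distribution, whereas established tools may bin degrees before fitting.

```python
import math
from collections import Counter


def filter_edges(edges, threshold):
    """Keep edges whose confidence score is at least `threshold` (hard cutoff)."""
    return [(u, v, score) for (u, v, score) in edges if score >= threshold]


def scale_free_fit_index(edges):
    """R^2 of a least-squares line through log10(frequency) vs. log10(degree)."""
    degree = Counter()
    for u, v, _score in edges:
        degree[u] += 1
        degree[v] += 1
    # Map each observed degree k to the number of nodes having that degree.
    frequency = Counter(degree.values())
    if len(frequency) < 2:
        raise ValueError("need at least two distinct node degrees")
    xs = [math.log10(k) for k in frequency]
    ys = [math.log10(n) for n in frequency.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if syy == 0:
        raise ValueError("all degrees are equally frequent; R^2 is undefined")
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    return sxy ** 2 / (sxx * syy)


# Hypothetical toy network: (node, node, confidence score) triples.
toy_network = [
    ("C", "A", 0.90), ("C", "B", 0.80), ("C", "D", 0.95), ("C", "E", 0.70),
    ("A", "F", 0.60), ("B", "G", 0.85), ("F", "G", 0.20),
]

high_confidence = filter_edges(toy_network, 0.5)
print(len(toy_network), "edges before filtering,", len(high_confidence), "after")
print("fit index, unfiltered:", scale_free_fit_index(toy_network))
print("fit index, filtered:  ", scale_free_fit_index(high_confidence))
```

    In this toy case, removing the single low-confidence edge changes the degree distribution enough to alter the fit index, which is why the choice of threshold (or of the soft-threshold power in co-expression analysis) is usually checked against this index.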

  • References


    This article references 49 other publications.

    1. 1
      Bar-Yam, Y. General Features of Complex Systems. Knowledge Management, Organizational Intelligence and Learning, and Complexity, Vol. I; Encyclopedia of Life Support Systems, 1997; www.eolss.net
    2. 2
      O’Connor, T.; Wong, H. Y. Emergent Properties. Stanford Encyclopedia of Philosophy, Spring 2012 ed.; Metaphysics Research Lab, Stanford University, 2012; https://plato.stanford.edu/archives/spr2012/entries/properties-emergent/
    3. 3
      Grandjean, M. A social network analysis of Twitter: Mapping the digital humanities community. Cogent Arts Humanit. 2016, 3, 1171458,  DOI: 10.1080/23311983.2016.1171458
    4. 4
      Goffman, C. And What is Your Erdös Number?. Am. Math. Mon. 1969, 76, 791,  DOI: 10.1080/00029890.1969.12000324
    5. 5
      Sporns, O.; Tononi, G.; Kötter, R. The human connectome: A structural description of the human brain. PLoS Comput. Biol. 2005, 1, e42,  DOI: 10.1371/journal.pcbi.0010042
    6. 6
      Seung, S. I am my connectome. TED: Ideas Worth Spreading , 2010; https://www.ted.com/talks/sebastian_seung.
    7. 7
      Seung, H. S. Connectome: How the Brain’s Wiring Makes Us Who We Are; Penguin Books Ltd., 2013.
    8. 8
      Barabási, A.-L.; Oltvai, Z. N. Network biology: understanding the cell’s functional organization. Nat. Rev. Genet. 2004, 5, 101113,  DOI: 10.1038/nrg1272
    9. 9
      Butte, A. J.; Kohane, I. S. Mutual Information Relevance Networks: Functional Genomic Clustering Using Pairwise Entropy Measurements. Biocomputing 2000, 2000, 418429,  DOI: 10.1142/9789814447331_0040
    10. 10
      Sinitcyn, P.; Rudolph, J. D.; Cox, J. Computational Methods for Understanding Mass Spectrometry-Based Shotgun Proteomics Data. Annu. Rev. Biomed. Data Sci. 2018, 1, 207234,  DOI: 10.1146/annurev-biodatasci-080917-013516
    11. 11
      Tyanova, S. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat. Methods 2016, 13, 73140,  DOI: 10.1038/nmeth.3901
    12. 12
      Tyanova, S.; Cox, J. Methods in Molecular Biology Springer, 2018; Vol. 1711, pp 133148.
    13. 13
      Shannon, P. Cytoscape: A software Environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 24982504,  DOI: 10.1101/gr.1239303
    14. 14
      Kloet, S. L. The dynamic interactome and genomic targets of Polycomb complexes during stem-cell differentiation. Nat. Struct. Mol. Biol. 2016, 23, 682690,  DOI: 10.1038/nsmb.3248
    15. 15
      Smith, C. L.; Blake, J. A.; Kadin, J. A.; Richardson, J. E.; Bult, C. J. Mouse Genome Database (MGD)-2018: Knowledgebase for the laboratory mouse. Nucleic Acids Res. 2018, 46, D83642,  DOI: 10.1093/nar/gkx1006
    16. 16
      Langfelder, P.; Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf. 2008, 9, 559,  DOI: 10.1186/1471-2105-9-559
    17. 17
      Rudolph, J. D.; de Graauw, M.; van de Water, B.; Geiger, T.; Sharan, R. Elucidation of Signaling Pathways from Large-Scale Phosphoproteomic Data Using Protein Interaction Networks. Cell Syst. 2016, 3, 585593,  DOI: 10.1016/j.cels.2016.11.005
    18. 18
      Hornbeck, P. V. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 2015, 43, D51220,  DOI: 10.1093/nar/gku1267
    19. 19
      Yanovich, G. Clinical Proteomics of Breast Cancer Reveals a Novel Layer of Breast Cancer Classification. Cancer Res. 2018, 78, 60016010,  DOI: 10.1158/0008-5472.CAN-18-1079
    20. 20
      Szklarczyk, D. The STRING database in 2017: Quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 2017, 45, D362D368,  DOI: 10.1093/nar/gkw937
    21. 21
      Chatr-Aryamontri, A. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017, 45, D369D379,  DOI: 10.1093/nar/gkw1102
    22. 22
      Orchard, S. The MIntAct project - IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Res. 2014, 42, D35863,  DOI: 10.1093/nar/gkt1115
    23. 23
      Ruepp, A. CORUM: The comprehensive resource of mammalian protein complexes-2009. Nucleic Acids Res. 2010, 38, D497501,  DOI: 10.1093/nar/gkp914
    24. 24
      Gingras, A. C.; Gstaiger, M.; Raught, B.; Aebersold, R. Analysis of protein complexes using mass spectrometry. Nat. Rev. Mol. Cell Biol. 2007, 8, 645654,  DOI: 10.1038/nrm2208
    25. 25
      Dunham, W. H.; Mullin, M.; Gingras, A. C. Affinity-purification coupled to mass spectrometry: Basic principles and strategies. Proteomics 2012, 12, 1576,  DOI: 10.1002/pmic.201100523
    26. 26
      Hein, M. Y. A Human Interactome in Three Quantitative Dimensions Organized by Stoichiometries and Abundances. Cell 2015, 163, 712723,  DOI: 10.1016/j.cell.2015.09.053
    27. 27
      Huttlin, E. L. The BioPlex Network: A Systematic Exploration of the Human Interactome. Cell 2015, 162, 425440,  DOI: 10.1016/j.cell.2015.06.043
    28. 28
      Hubner, N. C. Quantitative proteomics combined with BAC TransgeneOmics reveals in vivo protein interactions. J. Cell Biol. 2010, 189, 739754,  DOI: 10.1083/jcb.200911091
    29. 29
      Noble, W. S. How does multiple testing correction work?. Nat. Biotechnol. 2009, 27, 11351137,  DOI: 10.1038/nbt1209-1135
    30. 30
      Tusher, V. G.; Tibshirani, R.; Chu, G. Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. U. S. A. 2001, 98, 51165121,  DOI: 10.1073/pnas.091062498
    31. 31
      Franz, M. Cytoscape.js: A graph theory library for visualisation and analysis. Bioinformatics 2015, 32, 309311,  DOI: 10.1093/bioinformatics/btv557
    32. 32
      Dogrusoz, U.; Giral, E.; Cetintas, A.; Civril, A.; Demir, E. A layout algorithm for undirected compound graphs. Inf. Sci. (N. Y.) 2009, 179, 980994,  DOI: 10.1016/j.ins.2008.11.017
    33. 33
      Pedamallu, C. S.; Ozdamar, L. A Review on protein-protein interaction network databases. In Springer Proceedings in Mathematics and Statistics; Springer, 2014; Vol. 73, pp 511519.
    34. 34
      Alanis-Lobato, G.; Andrade-Navarro, M. A.; Schaefer, M. H. HIPPIE v2.0: Enhancing meaningfulness and reliability of protein-protein interaction networks. Nucleic Acids Res. 2017, 45, D40814,  DOI: 10.1093/nar/gkw985
    35. 35
      Gene Ontology Consortium, C. Gene Ontology Consortium: going forward. Nucleic Acids Res. 2015, 43, D104956,  DOI: 10.1093/nar/gku1179
    36.
      Clauset, A.; Shalizi, C. R.; Newman, M. E. J. Power-law distributions in empirical data. SIAM Rev. 2009, 51, 661,  DOI: 10.1137/070710111
    37.
      Albert, R. Scale-free networks in cell biology. J. Cell Sci. 2005, 118, 4947,  DOI: 10.1242/jcs.02714
    38.
      Riley, N. M.; Coon, J. J. Phosphoproteomics in the Age of Rapid and Deep Proteome Profiling. Anal. Chem. 2016, 88, 7494,  DOI: 10.1021/acs.analchem.5b04123
    39.
      Casado, P. Kinase-substrate enrichment analysis provides insights into the heterogeneity of signaling pathway activation in leukemia cells. Sci. Signaling 2013, 6, rs6,  DOI: 10.1126/scisignal.2003573
    40.
      Hernandez-Armenta, C.; Ochoa, D.; Gonçalves, E.; Saez-Rodriguez, J.; Beltrao, P. Benchmarking substrate-based kinase activity inference using phosphoproteomic data. Bioinformatics 2017, 33, 1845–1851,  DOI: 10.1093/bioinformatics/btx082
    41.
      Herranz, N. mTOR regulates MAPKAPK2 translation to control the senescence-associated secretory phenotype. Nat. Cell Biol. 2015, 17, 1205,  DOI: 10.1038/ncb3225
    42.
      Wilkes, E. H.; Terfve, C.; Gribben, J. G.; Saez-Rodriguez, J.; Cutillas, P. R. Empirical inference of circuitry and plasticity in a kinase signaling network. Proc. Natl. Acad. Sci. U. S. A. 2015, 112, 7719,  DOI: 10.1073/pnas.1423344112
    43.
      Linding, R. NetworKIN: A resource for exploring cellular phosphorylation networks. Nucleic Acids Res. 2007, 36, D695–99,  DOI: 10.1093/nar/gkm902
    44.
      Wiredja, D. D.; Koyutürk, M.; Chance, M. R. The KSEA App: a web-based tool for kinase activity inference from quantitative phosphoproteomics. Bioinformatics 2017, 33, 3489,  DOI: 10.1093/bioinformatics/btx415
    45.
      Zhang, B.; Horvath, S. A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 2005, 4, 17,  DOI: 10.2202/1544-6115.1128
    46.
      Cox, J.; Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 2008, 26, 1367–1372,  DOI: 10.1038/nbt.1511
    47.
      Sinitcyn, P. MaxQuant goes Linux. Nat. Methods 2018, 15, 401,  DOI: 10.1038/s41592-018-0018-y
    48.
      Kristensen, A. R.; Foster, L. J. Protein correlation profiling-SILAC to study protein-protein interactions. Methods Mol. Biol. 2014, 1188, 263,  DOI: 10.1007/978-1-4939-1142-4_18
    49.
      Liu, F.; Lössl, P.; Scheltema, R.; Viner, R.; Heck, A. J. R. Optimized fragmentation schemes and data analysis strategies for proteome-wide cross-link identification. Nat. Commun. 2017, 8, 15473,  DOI: 10.1038/ncomms15473
  • Supporting Information

    The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jproteome.8b00927.

    • Figure S1, Graphical workflow combining matrix and network activities; Figure S2, Organization of the new Perseus plugin API for networks; Figure S3, Context-specific documentation; and explanations of Tables S1–S3 (PDF)

    • Table S1, AP-MS pull screen (TXT)

    • Table S2, phosphoproteomics of EGF stimulation (TXT)

    • Table S3, clinical proteomics dataset (TXT)

    • Supplementary Data 1, Perseus network collection data format: example of a network collection describing three small, randomly generated networks (ZIP)

