IsoProt: A Complete and Reproducible Workflow To Analyze iTRAQ/TMT Experiments

Reproducibility has become a major concern in biomedical research. In proteomics, bioinformatic workflows can quickly consist of multiple software tools each with its own set of parameters. Their usage involves the definition of often hundreds of parameters as well as data operations to ensure tool interoperability. Hence, a manuscript’s methods section is often insufficient to completely describe and reproduce a data analysis workflow. Here we present IsoProt: A complete and reproducible bioinformatic workflow deployed on a portable container environment to analyze data from isobarically labeled, quantitative proteomics experiments. The workflow uses only open source tools and provides a user-friendly and interactive browser interface to configure and execute the different operations. Once the workflow is executed, the results including the R code to perform statistical analyses can be downloaded as an HTML document providing a complete record of the performed analyses. IsoProt therefore represents a reproducible bioinformatics workflow that will yield identical results on any computer platform.


■ INTRODUCTION
Lack of reproducibility in general, and in bioinformatics workflows specifically, is a growing concern. 1 Bioinformatic workflows in proteomics experiments often consist of multiple software tools, each with its own set of parameters. Seemingly small changes to a workflow, such as changing the details of a normalization method, can have dramatic effects on the final result. Because a complex workflow comprises many steps and settings, it is often impossible to fully document it in a research paper's methods section. Additionally, finding and using the exact same software versions later on is often a major obstacle when replicating bioinformatic analyses. Older versions may no longer be compatible with the available operating system or may be altogether unavailable. Therefore, fully reproducible workflows should not only record the exact software versions and parameters, but also preserve specific software versions and ensure that they will produce the same results in different computing environments.
Several projects exist to create reproducible bioinformatic workflows. Biocontainers 2 provides Docker containers to make bioinformatic tools available in a standardized way. Docker containers are lightweight virtual machines that, in the case of Biocontainers, ensure that a given software version performs identically on any operating system supported by Docker. Therefore, users do not have to install any software but only download the respective container. Galaxy 3 is a web-based platform for biomedical research mainly focused on genomics. It contains thousands of tools that can be joined together to create workflows and also supports tools for proteomics analyses. KNIME (http://www.knime.com, KNIME AG) is another workflow software focused on data analysis in general. All OpenMS 4 nodes were recently integrated in KNIME, making it possible to build complete proteomics workflows with it. ProteomeDiscoverer (Thermo Fisher) is also a workflow system but specifically targeting proteomics data analysis. Several academic research groups 4−6 are contributing to ProteomeDiscoverer making it usable for a wide variety of proteomics workflows. Finally, to a certain extent MaxQuant 7 with Perseus 8 allows the user to create a complete analysis workflow in a single software.
Nevertheless, all of these existing solutions have shortcomings that prevent the creation of complete, reproducible workflows. Biocontainers is a platform to supply bioinformatic tools in a standardized fashion but has no functionality to combine these tools into workflows. KNIME and Galaxy are very powerful analysis platforms that can be adapted to a wide variety of data analysis problems. This functionality comes at the cost of high complexity, and many nonexpert users will find it difficult to adapt Galaxy and KNIME to their needs. Additionally, neither KNIME nor Galaxy contains methods to take a snapshot of the external tools used to actually process the data. ProteomeDiscoverer also depends on external nodes. Therefore, to fully replicate an existing workflow the user again has to take care of locating and installing the exact same versions of these nodes. Moreover, new ProteomeDiscoverer versions generally come with significant changes, which require nodes to be specifically developed for a given version. Nodes developed for one version of ProteomeDiscoverer are generally incompatible with newer ones. Therefore, none of these existing solutions fulfills all requirements of a completely reproducible workflow environment.
Isobaric labeling has become one of the most common methods for quantitative mass spectrometry based proteomics experiments. A major advantage is that it allows researchers to multiplex samples and thereby reduce instrument runtime and eliminate variability caused by the mass spectrometer itself. The two methods currently available for these experiments, tandem mass tag (TMT 9 ) and multiplexed isobaric tagging technology for relative quantitation (iTRAQ 10 ) basically only differ in the reporter masses they generate but do not require dedicated software tools.
Even though isobaric labeling has become a standard method in many laboratories, dedicated, easy-to-use software solutions to analyze these data are still rare. This is particularly problematic when dealing with more complex experimental designs that include multiple runs on the mass spectrometer, such as multiple instances of differently labeled multiplexed samples. Existing dedicated software solutions, such as iQuant, 11 isobar, 12 MilQuant, 13 and IsobariQ 14 all require identification results from specific search engines and do not support complex experimental designs with more than two treatment groups or samples split across multiple iTRAQ/TMT runs. Therefore, many research groups rely on unpublished in-house scripts to process their experiments, which greatly hampers reproducibility.
In an effort to simplify proteomics data analysis and provide fully reproducible data analysis workflows, we launched the ProtProtocols project (https://protprotocols.github.io) under the umbrella of the European Bioinformatics Community (EuBIC). 15 On the basis of the Biocontainers project, 2 the protocols are shipped in containerized Docker images that include all necessary software tools. Docker containers are lightweight virtual machines that encapsulate all the software required for the protocol to run. This ensures that the version of all used software is linked to the protocol version and the user does not have to worry about installing any separate tools. Hence, 100% reproducibility can be achieved by using the same protocol version on any computer with a Docker environment.
Here, we present IsoProt which serves as a blueprint for the ProtProtocol concept. IsoProt is designed for the analysis of isobarically labeled experiments, which is one of the most commonly used methods for high-throughput proteomics. Next to a user-friendly web interface, IsoProt provides accurate statistical analyses for a wide range of common experimental designs.

■ EXPERIMENTAL PROCEDURES
Software Layout and Implementation
General Implementation. All software was installed in a Docker image to ensure full reproducibility on each computer system supported by Docker. To simplify the installation and usage of our protocols, we created the free, open-source "ProtProtocol docker-launcher" (https://github.com/ProtProtocols/docker-launcher). It provides an easy-to-use graphical user interface that can automatically install the protocol (once Docker is installed) and launch the image. As it is written in Java, it supports the major operating systems Windows, Mac OSX, and Linux. Therefore, many technical difficulties surrounding the use of Docker are hidden from the user. Detailed instructions on how to install and use all tools, as well as how to extend ProtProtocols, can be found at https://protprotocols.github.io/documentation/.

Journal of Proteome Research
The complete protocol is run through a Jupyter notebook (http://jupyter.org) corresponding to one web page in the browser. All relevant parameters can be set through common graphical user elements created through Jupyter widgets. Therefore, the user interface is highly similar to most available search engines. The complete source code as well as additional documentation of the protocol is freely available through https://protprotocols.github.io.
Proteomics Software. IsoProt handles the entire analysis pipeline from mass spectra given as peak lists to the set of differentially regulated proteins (Figure 1A). We used SearchGUI 16 and PeptideShaker 17 to perform peptide identification and validation, with MS-GF+ 18 as a database search engine. Proteins are summarized and quantified by R scripts based on the MSnbase R library. 19 R scripts furthermore generate figures for quality control and perform statistical tests (LIMMA library 20 ) according to the experimental design.

Input Files and Parameters
Input Files. The only files required for the analysis are mass spectra as peak lists (MGF format) and a FASTA file containing the protein sequences where we recommend the UniProt version of the FASTA format. Databases can already contain decoy sequences (following the SearchGUI instructions, http://compomics.github.io/projects/searchgui.html); otherwise, the decoy database is created automatically. The files can be copied into the Docker file structure or directly mirrored onto the /data folder as automatically done by our docker-launcher application.
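As a rough illustration of what automatic decoy creation involves, the following sketch appends one reversed copy of every protein sequence to a target database. The `_REVERSED` accession suffix and the function names `read_fasta` and `add_reversed_decoys` are assumptions chosen for illustration; this is not SearchGUI's actual implementation.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    entries = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous entry, if any.
            if header is not None:
                entries.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        entries.append((header, "".join(chunks)))
    return entries


def add_reversed_decoys(entries, suffix="_REVERSED"):
    """Return the target entries followed by one reversed decoy per target.

    The suffix is a hypothetical naming convention used only in this sketch.
    """
    decoys = []
    for header, sequence in entries:
        accession, _, description = header.partition(" ")
        decoy_header = (accession + suffix + " " + description).strip()
        decoys.append((decoy_header, sequence[::-1]))
    return entries + decoys
```

Because every target has exactly one decoy of the same length and composition, the number of decoy hits gives an estimate of the number of random target hits.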
Analysis Parameters. All parameters required for the data analysis can be changed through a graphical user interface integrated into the Jupyter notebook. In the first section, the user has to set database search related parameters such as precursor and fragment ion tolerance, the FASTA sequence database to use, the labeling agent used, and the fixed and variable modifications to consider.
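To make the meaning of a relative tolerance concrete, the sketch below converts a tolerance given in parts per million (ppm) into an absolute m/z window. The function names are hypothetical and the code only illustrates the arithmetic; it is not taken from IsoProt or the search engine.

```python
def ppm_window(mz, tolerance_ppm):
    """Return the (low, high) m/z bounds for a relative tolerance in ppm."""
    delta = mz * tolerance_ppm / 1e6
    return mz - delta, mz + delta


def within_ppm(observed_mz, theoretical_mz, tolerance_ppm):
    """Check whether an observed m/z lies inside the ppm window."""
    low, high = ppm_window(theoretical_mz, tolerance_ppm)
    return low <= observed_mz <= high
```

For a precursor at m/z 1000, a tolerance of 20 ppm corresponds to a window of about ±0.02, whereas a tolerance given in Da stays constant over the whole mass range.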
On the basis of the selected labeling method and detected folder structure, the interface to enter the experimental design is generated. The protocol currently supports two setups: (1) all MGF files are placed in the input directory and are part of the same (fractionated) run ( Figure 1B) or (2) MGF files from different runs are organized by placing them in different subdirectories ( Figure 1C). Next, the experimental design user interface allows the user to enter names for the sample groups (for example "treatment" and "control") and names for the samples (one name per channel and subdirectory) and assign each sample to one of the groups. Most importantly, the protocol supports up to 20 sample groups and can thereby model complex experimental designs.
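One way to picture the experimental design entered here is as a mapping from each (run, channel) pair to a sample name and a sample group. The sketch below is a hypothetical data structure, not IsoProt's internal representation; the channel labels and the function names are assumptions, and the limit of 20 groups mirrors the number stated above.

```python
from collections import defaultdict


def build_design(assignments, max_groups=20):
    """Build a design table from (run, channel, sample_name, group) tuples."""
    design = {}
    for run, channel, sample, group in assignments:
        key = (run, channel)
        if key in design:
            raise ValueError(f"channel {channel} in run {run} assigned twice")
        design[key] = {"sample": sample, "group": group}
    n_groups = len({info["group"] for info in design.values()})
    if n_groups > max_groups:
        raise ValueError(f"{n_groups} groups exceed the limit of {max_groups}")
    return design


def samples_by_group(design):
    """Collect sample names per group, ordered by run and channel."""
    groups = defaultdict(list)
    for _key, info in sorted(design.items()):
        groups[info["group"]].append(info["sample"])
    return dict(groups)
```

Keying the table by run and channel lets the same channel label recur in several subdirectories, which corresponds to the second setup described above.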
Finally, the user is asked to enter parameters related to the analysis of the quantitative data. Once all required information is entered, the search and analysis are directly controlled through buttons in the user interface.

Output Files and Quality Control
IsoProt provides figures and tables for the different steps of the analysis including peptide identifications, quantitative values of peptide-spectrum matches (PSMs), and proteins as well as a table for the statistical results from the significance analysis. Visual measures for quality control were implemented as R scripts and include total intensities of the reporter ion channels for each sample, violin plots at different stages of the analysis, principal component analysis, and volcano plots (Figure 2).

Test Data Sets
To evaluate the performance of our analysis workflow, we processed the data from three publicly available data sets using the same search parameters as in the original studies. We downloaded the respective RAW files from PRIDE Archive 22 and converted them into the MGF file format using ProteoWizard's msconvert tool 23 when no MGF peak list files were available.
Benchmark Data Set. D'Angelo et al. recently published a TMT benchmark data set containing an experiment where 12 human proteins were spiked into an Escherichia coli background 24 using various concentrations (PRIDE Archive identifier PXD005486). D'Angelo et al. used this data set to assess the number of proteins that were incorrectly identified as being regulated. As every protein was added using varying concentrations among the samples, a standard statistical analysis of the spiked-in proteins was not possible. Therefore, our analysis focuses on the accuracy of the derived quantitative estimates for the spiked proteins and the (unchanged) background E. coli proteins.
The complete analysis was performed using IsoProt version 0.2. Spectra were identified using MS-GF+ 18 through SearchGUI version 3.3.3. 16 The precursor tolerance was set to 20 ppm and the fragment tolerance to 0.03 Da. One missed cleavage was allowed. Carbamidomethylation, TMT 10-plex of K, and TMT 10-plex of peptide N-term were set as fixed modifications. Oxidation of M was set as variable modification. PSMs were filtered at a target false discovery rate (FDR) of 0.01 using the target-decoy approach. UniProt E. coli sequences (version August 2018) and the spiked human protein sequences, also from UniProt, were used for spectra identification.
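The target-decoy filtering mentioned above can be sketched in a few lines. This is a deliberately simplified illustration that assumes a single score per PSM where higher is better; the validation actually performed by PeptideShaker is more elaborate, and the function names are hypothetical.

```python
def target_decoy_qvalues(psms):
    """Compute q-values for PSMs given as (score, is_decoy) tuples.

    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR when accepting everything down to this score.
        fdrs.append(decoys / max(targets, 1))
    # The q-value is the smallest FDR at this or any more permissive threshold.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()
    return [(s, d, q) for (s, d), q in zip(ranked, qvalues)]


def filter_psms(psms, max_fdr=0.01):
    """Keep target PSMs whose q-value does not exceed max_fdr."""
    return [(s, q) for s, d, q in target_decoy_qvalues(psms)
            if not d and q <= max_fdr]
```

With `max_fdr=0.01`, the default used here, roughly one in a hundred accepted PSMs is expected to be a false match.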
Quantitative analysis was done using the R Bioconductor package MSnbase version 2.7.1. 19 Protein summarization was performed using the "medpolish" method as implemented by MSnbase. Modified peptides were not used for quantitation. Only proteins with at least two identified peptides were accepted for further analysis. Differential expression was assessed using the R Bioconductor package limma version 3.34. 20
Cerebral Malaria Pathogenesis. The study uses TMT6 labeling to compare blood from mice at different stages of cerebral malaria (d3, ECM) with blood from noninfected mice (NI). 25 Four replicates of each of the three sample types were arranged in two TMT6 sets and run separately. This corresponds to a case similar to Figure 1C, with three conditions distributed over two separate runs on the mass spectrometer. Peak list data files (MGF file format) were downloaded from PRIDE Archive (PXD003772).
The analysis was again performed using IsoProt version 0.2 (see above) with the precursor tolerance set to 10 ppm and the fragment tolerance to 0.05 Da. One missed cleavage was allowed. Carbamidomethylation, TMT 6-plex of K, and TMT 6-plex of peptide N-term were set as fixed modifications. Oxidation of M was set as variable modification. PSMs were filtered at a target FDR of 0.01 using the target-decoy approach. SwissProt sequences from mouse (January 2018) were used for spectra identification. Only proteins with at least two identified peptides were accepted for further analysis.
Nonmuscle Invasive and Muscle-Invasive Bladder Cancer. The study compares tumor tissue samples from nonmuscle invasive and muscle-invasive bladder cancer. 26 MGF files were downloaded from PRIDE Archive (PXD002170).
The analysis was again performed using IsoProt version 0.2 (see above) with the precursor tolerance set to 10 ppm and the fragment tolerance to 0.05 Da. One missed cleavage was allowed. Carbamidomethylation, iTRAQ 8-plex of K, iTRAQ 8-plex of Y, and iTRAQ 8-plex of peptide N-term were set as fixed modifications. Oxidation of M was set as variable modification. PSMs were filtered at a target FDR of 0.01 using the target-decoy approach. SwissProt sequences from human (January 2017) were used for spectra identification. Only proteins with at least two identified peptides were accepted for further analysis.

■ RESULTS
IsoProt allows users to run the full data analysis of iTRAQ/TMT experiments in a straightforward and reproducible way. The protocol supports different experimental designs including multiple runs on the mass spectrometer and differently labeled multiple samples. Additionally, the open layout of the protocol allows complex adjustments and modifications at all stages of the workflow.

A Fully Reproducible Environment
The protocol can be run on any computer with a functional Docker environment, by just downloading and running the available Docker image. This is fully automated through our "ProtProtocol docker-launcher" tool (https://github.com/ProtProtocols/docker-launcher). Hence, the protocol avoids all possible platform- and operating-system-specific installation issues and provides identical results independent of operating system, its configuration, and computer hardware.
Every IsoProt release has a stable version number that points to a specific Docker image. Therefore, by citing the used IsoProt version number, it will always be possible to exactly restore the used analysis environment, including the versions of all used software tools. Once the protocol has been executed, it is possible to save it, including all generated figures, as a standard HTML page. Therefore, the complete analysis workflow can be easily made available, for example, at the time of review, and be viewed with a standard web browser. Additionally, all user-entered parameters are stored in text files next to the analysis results, and these files can easily be reused for future projects (see https://protprotocols.github.io/documentation/isoprot/save_analysis for details). For an overview of the visualizations, see Figure 2.

Simple Example Workflow
IsoProt can be tested using an example data set that is small enough to run in under 10 min on a standard computer. The data set is part of the IsoProt Docker image, and the necessary parameter settings are preloaded when starting IsoProt. The database search via SearchGUI and validation via PeptideShaker result in a tab-delimited file containing detailed information on all PSMs. Search and output parameters are automatically saved for future reference. Additionally, a "methods" section is generated that can be included in a manuscript and describes all used settings. Each spectrum file is processed separately to match and quantify PSMs that passed the identification FDR (default 0.01). The mass distribution of all matched fragment ions allows control for critical channels with inefficient labeling (Figure 2A). All PSM quantifications are saved in a separate file (AllQuantPSMs.csv).
The outputs of all files from each run on the mass spectrometer are merged, normalized, and visualized for quality control. Violin plots of normalized PSM intensities compare the intensity distributions (Figure 2C). Channels with different distributions can identify problematic samples or changes within the entire proteome. Six different histograms counting PSM, peptide, protein, and protein group numbers allow protein coverage and uniqueness to be determined from the available mass spectra. Similarity between samples is assessed through scatter plots comparing all quantified spectra from all ion channels (Figure 2B).
Using the default parameters, the PSMs are summarized to proteins using median summarization after outlier removal requiring a minimum of 1 PSM per protein. In addition, the protocol supports iPQF, 27 mean expression, median expression (without outlier removal), and robust summarization as methods. A violin plot of protein ratios versus mean of all channels shows whether the analyzed samples exhibit similar distributions on the protein level ( Figure 2D).
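A minimal sketch of median summarization after outlier removal is given below. The text does not specify how outliers are detected, so the 1.5 × IQR rule used here is an assumption, as are the function names; the sketch summarizes the log intensities of one protein in one channel and is not the code used by IsoProt.

```python
import statistics


def remove_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    if len(values) < 4:
        # Too few points to estimate quartiles meaningfully; keep all.
        return list(values)
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def summarize_protein(psm_log_intensities, min_psms=1):
    """Median of the PSM log intensities after outlier removal.

    Missing values are given as None. Returns None when fewer than
    min_psms observed values are available.
    """
    observed = [v for v in psm_log_intensities if v is not None]
    if len(observed) < min_psms:
        return None
    return statistics.median(remove_outliers(observed))
```

In this sketch, a single PSM with a strongly deviating intensity no longer shifts the protein estimate, which is the motivation for removing outliers before taking the median.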
Quantifications from different runs (only one in the example) are merged and submitted to a principal component analysis (Figure 2E). This places all samples in a two-dimensional space and color codes different treatment groups. Studies where the samples of the different types are not placed as distinguishable groups are unlikely to provide differentially regulated proteins. Additionally, potential systematic biases can quickly be discovered using this plot.
The example set quantified a total of 221 protein groups. LIMMA statistical tests did not find any regulated proteins with FDR < 0.05, which is in agreement with the original results. p-Values and false discovery rates (p-values corrected for multiple testing) are visualized in histograms, volcano plots ( Figure 2F), and a figure counting the number of differentially regulated proteins over a range of FDRs. The latter can be used to identify a suitable combination of the confidence threshold and the number of significant proteins. It is advised to keep FDR < 0.1 as the number of false positives becomes critically high otherwise.
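The relation between p-values, FDR-adjusted values, and the count of significant proteins over a range of thresholds can be sketched as follows. The text does not state which multiple-testing correction was applied; the Benjamini-Hochberg procedure used here is an assumption, and the function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to enforce monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def count_significant(adjusted, thresholds):
    """Number of proteins with adjusted p-value below each threshold."""
    return {t: sum(1 for q in adjusted if q < t) for t in thresholds}
```

Tabulating `count_significant` over several thresholds gives the same kind of information as the figure described above, namely how many proteins would be reported at each confidence level.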

Performance Tests by Reprocessing Public Data
Benchmark Data Set. D'Angelo et al. performed a comparison of different approaches to analyze TMT data sets. 24 In their first data set, the authors spiked different concentrations of 12 human proteins into an E. coli background. They used this data set to assess the type-I error as the number of false positive proteins. Similar to the original study, we assigned the first five channels to one treatment group, and the second five channels to the second group. As expected, no proteins were identified as being significantly regulated. The estimated log-fold changes of the E. coli background proteins were all close to 0 ( Figure 3A).
Proteins were spiked twice using the same concentration in different channels and only once for the two highest concentrations. Therefore, only a single, or two replicate measurements at maximum are available when comparing two concentrations. Since this setup prevents a standard statistical evaluation, we focused on the accuracy of the estimated fold changes using the same error measurements as in the original manuscript. Similarly, we assessed the accuracy of our fold change estimates using the bias and the root-mean-square error (RMSE). Across most spiked fold-changes, we observed a comparable bias and RMSE ( Figure 3B).
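For reference, the two error measures can be computed as in the sketch below, which uses the usual definitions: bias as the mean difference between estimated and expected log fold change, and RMSE as the square root of the mean squared difference. Whether the original study used exactly these definitions is an assumption, and the function name is hypothetical.

```python
import math


def bias_and_rmse(estimated_log_fc, expected_log_fc):
    """Return (bias, RMSE) of fold-change estimates against the spiked value.

    Estimates given as None (not quantified) are skipped.
    """
    errors = [e - expected_log_fc for e in estimated_log_fc if e is not None]
    if not errors:
        raise ValueError("no fold-change estimates available")
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(err * err for err in errors) / len(errors))
    return bias, rmse
```

A bias near zero with a large RMSE indicates estimates that scatter symmetrically around the expected value, whereas a large bias indicates systematic over- or underestimation.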
For the highest spiked fold-change, we observed slightly higher average error rates than D'Angelo et al. This is most likely caused by the fact that D'Angelo et al. imputed missing values by taking the lowest observed intensity of the given PSM across all samples. Thereby, missing values were automatically interpreted as very low expression. As expected, the measured abundance of the lowest concentrations showed larger variation with several missing values. In our approach, these missing values were ignored thus leading to less stable average fold-changes. Imputing missing values like D'Angelo et al. did naturally reduced this variation leading to reduced error rates. However, when we, for example, estimated the fold change of the two highest protein concentrations (also a fold change of 2), the bias is 0 with an RMSE of 0.2 improving the error rates dramatically.
While D'Angelo et al.'s imputation approach is valid if values can be expected to be missing not at random (i.e., because of a concentration below the limit of detection), it is not valid for values missing (completely) at random (i.e., because of inefficient labeling). 28 Therefore, for the spiked proteins D'Angelo et al.'s approach should only have been applied to cases where the lowest amounts of protein were spiked. Since it is generally unknown why a value is missing in actual experiments, our pipeline does not use any imputation. Limma's model treats these values as "missing at random", which we feel is more appropriate for most biological studies.
To estimate the effect of these different approaches, we calculated the variance of the spiked-in proteins as the sum of the absolute difference between the duplicate measurements. Independent of the used protein summarization method, imputing missing values increased the variance of all but the duplicates with zero concentration of the respective proteins ( Figure 3C). In our opinion, this highlights the downside of using "blind" imputation for all missing values as this can result in increased noise levels or bias in the data set. The complete output of our pipeline can be found in Supplementary File 1.
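The two ingredients of this comparison can be sketched as follows: imputing a missing value with the lowest intensity observed for the same PSM, as described for D'Angelo et al., and measuring spread as the sum of absolute differences between duplicate measurements. The function names are hypothetical, and skipping pairs with a missing member is an assumption about how unimputed data would be handled.

```python
def impute_with_row_minimum(intensities):
    """Replace missing values (None) with the lowest observed intensity."""
    observed = [v for v in intensities if v is not None]
    if not observed:
        return list(intensities)
    floor = min(observed)
    return [floor if v is None else v for v in intensities]


def duplicate_spread(pairs):
    """Sum of absolute differences between duplicate measurements.

    Pairs in which either value is missing (None) are skipped.
    """
    return sum(abs(a - b) for a, b in pairs
               if a is not None and b is not None)
```

Applying `duplicate_spread` once to the raw pairs and once to pairs built from imputed intensities shows how an imputed low value can enlarge the difference between two duplicates that should agree.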
Cerebral Malaria Pathogenesis. The authors investigated differences in the plasma proteome between healthy and malaria-infected mice (two stages). The available two TMT 6-plex sets were considered to contain independent samples. IsoProt quantified more protein groups (324 versus 289) when requiring a minimum of 2 unique PSMs and an identification FDR < 1%. For the further comparison, we restricted the IsoProt output to the uniquely identified 214 proteins (no peptides shared with other proteins).
In the original study, statistical testing was carried out separately for the two TMT runs, yielding a total of 54 (more precisely 43 as 11 were detected in both runs) proteins found to be differentially regulated between Plasmodium berghei ANKA (PbA)-infected (d8 postinfection, labeled ECM) and noninfected (labeled NI) mice (Mann−Whitney U test, p ≤ 0.001). Since the authors did not correct p-values for multiple testing, these results cannot be considered significant. We found a total of 41 differentially regulated proteins (FDR < 0.01) and an overlap of only 20 proteins with the original study.
Given the different statistical procedures, we analyzed all proteins that were found differentially regulated by either one of the methods. All but four proteins found differentially regulated in the original study were quantified by IsoProt and showed similar abundances in both analyses ( Figure 4A). Proteins only deemed significant in the original study were not found significant by IsoProt mostly due to low fold-changes ( Figure 4B).
We further investigated the two proteins that mostly differed between the two types of analyses. Retinol-binding protein 4 (Q00724) was the protein with the lowest FDR within the proteins found differentially regulated only by IsoProt. Figure 4C shows PSM measurements for the 2 TMT runs of this protein (scaled for better comparison). Summarized protein abundances (thick lines) by median summarization with outlier removal show that the PSMs of peptides with less differential behavior were removed. By merging the observation of the two TMT runs, IsoProt increases its statistical power and thus provides evidence for regulatory behavior of this protein.
On the other hand, protein disulfide-isomerase (P09103) was the protein with the highest FDR (least significant) in IsoProt that was found significantly regulated in the original study (TMT-1, Figure 4D). Given only high abundances in one of the two ECM replicates in TMT-1, manual interpretation would discard this protein from being regulated (Figure 4D). The PSMs measured in the second run (TMT-2) confirm this observation. The complete output of our pipeline can be found in Supplementary File 1.
Nonmuscle Invasive and Muscle-Invasive Bladder Cancer. IsoProt quantified 1145 protein groups when restricting to a minimum of 2 unique peptides and 1% FDR, compared to 1092 in the original study (minimum of 2 peptides, Occam's razor principle for peptide inference and 1% FDR). Both analyses had an overlap of 662 proteins. Although the two analyses differ only in their bioinformatics workflows, the mean log-fold changes of proteins between the two cancer subtypes differed considerably (Figure 5A, Pearson's correlation of 0.78).
IsoProt found one differentially regulated protein (15-hydroxyprostaglandin dehydrogenase, FDR < 0.01) after correction for multiple testing, which was not carried out in the original study. In order to allow a comparison of both results, we therefore also used uncorrected p-values for the following analysis. This is not recommended as it is prone to greatly overestimate the number of regulated proteins. When comparing these uncorrected p-values, the majority of "significant" proteins were different between the two studies (Figure 5B,C, colored points indicate p < 0.05 in the other respective study).
This striking difference in the statistical results is due to different normalization approaches used. Their effect can be seen in the distribution of protein abundances ( Figure 5D,E). The authors of the original study normalized the ratios between cancer subtypes after protein summarization and averaging of replicates. The more common and in our opinion correct approach is to normalize the different channels (i.e., individual samples) on the (measured) PSM or (aggregated) peptide level prior to the aggregated analysis of these measurements on the protein level and, most importantly, prior to merging any independent (i.e., replicate) measurements. Strong deviations of individual channels which are visible on the peptide level were thus discarded in the original study. The complete output of our pipeline can be found in Supplementary File 1.
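To illustrate what normalizing channels on the PSM level means in practice, the sketch below centers every channel of a PSM table on its own median log intensity before any summarization takes place. Median centering is an assumption chosen for illustration; the text does not specify which normalization method IsoProt applies, and the function name is hypothetical.

```python
import statistics


def median_center_channels(psm_table):
    """Subtract each channel's median from its log intensities.

    psm_table is a list of rows (one per PSM), each row holding one log
    intensity per channel; missing values are given as None. Every channel
    is assumed to contain at least one observed value.
    """
    n_channels = len(psm_table[0])
    medians = []
    for channel in range(n_channels):
        column = [row[channel] for row in psm_table if row[channel] is not None]
        medians.append(statistics.median(column))
    return [
        [None if value is None else value - medians[channel]
         for channel, value in enumerate(row)]
        for row in psm_table
    ]
```

Because the correction is applied per channel, a sample that deviates strongly remains visible as an individual channel instead of being averaged into a group ratio first.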

■ DISCUSSION
IsoProt shows how the ProtProtocols framework can be used to create user-friendly, reproducible bioinformatic workflows. IsoProt makes it simple to include the complete bioinformatic data processing workflow as a supplementary file. Thereby, reviewers and other researchers can easily assess the used methods.
Encapsulating protocols into docker containers preserves the complete setup including all software versions which can be referenced through a single protocol version number. This allows anyone to replicate the results at any later stage without having to worry that older software might no longer work. Once a given version of the protocol is downloaded, users can be sure that it will behave in exactly the same way on all supported platforms.
The use of Docker makes the protocol highly portable. Docker currently supports Windows, Linux, and Mac OS, making our protocol truly multiplatform. The fact that the protocol can be installed through a single command makes it trivial to move the setup from one machine to another. With our "ProtProtocol docker-launcher" tool, the protocol can even be installed with the click of a single button. This should greatly reduce the effort in setting up a complex proteomics analysis environment. Unfortunately, Docker support for Windows is not yet fully stable. Therefore, several Windows users experienced issues when installing Docker which prevented them from using IsoProt. Even though this currently reduces the ease-of-use of ProtProtocols on Windows machines, we believe that this will quickly be improved since Microsoft recently became an official partner of Docker. 29
IsoProt's performance was tested on three publicly available data sets. The results highlight that subtle differences in the data analysis can lead to considerable differences in the final results. Such differences can only be identified by reproducing the complete environment of the analysis workflow, something that is very difficult to realize when only relying on information from a scientific paper. Thus, more complete and easily readable information on the used workflow and its parameters, or even the entire computational environment, will considerably improve paper reviews as well as reproducing and discussing results from already published studies. Such workflows will further increase quality and credibility of both scientific studies and the presenting journals. IsoProt enables users to easily provide such complete information on their analysis.
Our approach facilitates comparison with other data analysis pipelines or testing of robustness to parameter changes with minimal efforts requiring only peak list files, their relation to the experimental design and main parameters for identification and quantification.
All of these developments are available as free and open-source software. Thereby, we encourage other researchers to use the ProtProtocol infrastructure as a starting point to develop their own analysis workflows and make them available to the community. All our tools are modularized and prepared to support and simplify such external developments. Since Docker has become an industry standard for containerized applications, long-term support seems to be guaranteed for these developments.
In summary, we developed a user-friendly environment for fully reproducible data analysis and exemplified its use through a complete workflow for the analysis of data from isobarically labeled mass spectrometry experiments.

Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10