Communicating Mass Spectrometry Quality Information in mzQC with Python, R, and Java
- Chris Bielow*, Bioinformatics Solution Center, Institut für Mathematik und Informatik, Freie Universität Berlin, Takustrasse 9, 14195 Berlin, Germany
- Nils Hoffmann, Institute for Bio- and Geosciences (IBG-5), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
- David Jimenez-Morales, Department of Medicine, Stanford University School of Medicine, Stanford, California 94305, United States
- Tim Van Den Bossche, Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; VIB-UGent Center for Medical Biotechnology, VIB, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Juan Antonio Vizcaíno, European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
- David L. Tabb, European Research Institute for the Biology of Ageing, University Medical Center Groningen, Groningen 9713 AV, The Netherlands
- Wout Bittremieux, Department of Computer Science, University of Antwerp, Antwerpen 2020, Belgium
- Mathias Walzer*, European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
Abstract
Mass spectrometry is a powerful technique for analyzing molecules in complex biological samples. However, inter- and intralaboratory variability and bias can affect the data due to various factors, including sample handling and preparation, instrument calibration and performance, and data acquisition and processing. To address this issue, the Quality Control (QC) working group of the Human Proteome Organization’s Proteomics Standards Initiative has established the standard mzQC file format for reporting and exchanging information relating to data quality. mzQC is based on the JavaScript Object Notation (JSON) format and provides a lightweight yet versatile file format that can be easily implemented in software. Here, we present open-source software libraries to process mzQC data in three programming languages: Python, using pymzqc; R, using rmzqc; and Java, using jmzqc. The libraries follow a common data model and provide shared functionalities, including the (de)serialization and validation of mzQC files. We demonstrate use of the software libraries in a workflow for extracting, analyzing, and visualizing QC metrics from different sources. Additionally, we show how these libraries can be integrated with each other, with existing software tools, and in automated workflows for the QC of mass spectrometry data. All software libraries are available as open source under the MS-Quality-Hub organization on GitHub (https://github.com/MS-Quality-Hub).
This publication is licensed under a Creative Commons Attribution (CC BY) license.
License Summary*
You are free to share (copy and redistribute) this article in any medium or format and to adapt (remix, transform, and build upon) the material for any purpose, even commercially, within the parameters below:
Creative Commons (CC): This is a Creative Commons license.
Attribution (BY): Credit must be given to the creator.
*Disclaimer
This summary highlights only some of the key features and terms of the actual license. It is not a license and has no legal value. Carefully review the actual license before using these materials.
Special Issue
Published as part of Journal of the American Society for Mass Spectrometry virtual special issue “Asilomar: Computational Mass Spectrometry”.
Introduction
Methods
mzQC Software Libraries
Functionality | Software library | API |
---|---|---|
Read (deserialize): consume an mzQC file (optionally from a JSON string, a local file, or a remote file) and return a data object representing the file contents | pymzqc | MZQCFile.JsonSerialisable.FromJson(..) |
 | rmzqc | rmzqc::readMZQC(..); MzQC$fromData(..) |
 | jmzqc | Converter.of(..) |
Write (serialize): export an mzQC data object to a JSON file or JSON string | pymzqc | MZQCFile.JsonSerialisable.ToJson(..) |
 | rmzqc | rmzqc::writeMZQC(..); jsonlite::toJSON(..) |
 | jmzqc | Converter.toJsonString(..); Converter.toJsonFile(..) |
Syntactic validation: verify that an mzQC file conforms to the mzQC schema specification | pymzqc | SyntaxCheck().validate(..) |
 | rmzqc | rmzqc::validateFromFile(..); rmzqc::validateFromString(..); rmzqc::validateFromObj(..) |
 | jmzqc | Converter.validate(..) |
Semantic validation: verify that an mzQC file conforms to the mzQC semantic content constraints | pymzqc | SemanticCheck().validate(..) |
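The read, write, and syntactic validation steps listed in the table can be illustrated with a minimal sketch that uses only the Python standard library. It does not call pymzqc, rmzqc, or jmzqc. The document layout (a top-level `mzQC` object holding `runQualities` with `qualityMetrics`) is an assumption about the mzQC JSON structure, the `EX:0000001` accession is a placeholder rather than a real controlled vocabulary term, and `structure_problems` is a hypothetical, much simplified stand-in for validation against the actual mzQC schema.

```python
import json

# Hypothetical minimal document. The "EX:0000001" accession and the metric
# name are placeholders, not real PSI-MS controlled vocabulary terms.
example_doc = {
    "mzQC": {
        "version": "1.0.0",
        "creationDate": "2024-01-01T00:00:00Z",
        "runQualities": [
            {
                "metadata": {"inputFiles": [{"name": "run_01"}]},
                "qualityMetrics": [
                    {"accession": "EX:0000001", "name": "example metric", "value": 1200}
                ],
            }
        ],
    }
}


def write_mzqc_string(document):
    """Serialize a document (nested dicts and lists) to a JSON string."""
    return json.dumps(document, indent=2)


def read_mzqc_string(text):
    """Deserialize a JSON string back into nested dicts and lists."""
    return json.loads(text)


def structure_problems(document):
    """Return a list of problems found; an empty list means none were found.

    Simplified stand-in for schema validation, not the mzQC JSON schema.
    """
    body = document.get("mzQC")
    if not isinstance(body, dict):
        return ["missing top-level 'mzQC' object"]
    problems = []
    if "version" not in body:
        problems.append("missing 'version'")
    qualities = body.get("runQualities", []) + body.get("setQualities", [])
    if not qualities:
        problems.append("no 'runQualities' or 'setQualities' entries")
    for i, quality in enumerate(qualities):
        for j, metric in enumerate(quality.get("qualityMetrics", [])):
            for key in ("accession", "name"):
                if key not in metric:
                    problems.append(f"quality {i}, metric {j}: missing '{key}'")
    return problems


# Round trip: write to a string, read it back, then check the structure.
round_trip = read_mzqc_string(write_mzqc_string(example_doc))
print(round_trip == example_doc)
print(structure_problems(round_trip))
```

In practice, the library functions in the table would replace these helpers. The sketch only shows why a JSON-based format makes (de)serialization and structural checks straightforward to implement.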
Code availability
Software library | URL |
---|---|
pymzqc | PyPI: https://pypi.org/project/pymzqc/; GitHub: https://github.com/MS-Quality-Hub/pymzqc |
rmzqc | CRAN: https://cran.r-project.org/web/packages/rmzqc/index.html; GitHub: https://github.com/MS-Quality-Hub/rmzqc |
jmzqc | Maven Central: https://central.sonatype.com/artifact/org.lifs-tools/jmzqc; GitHub: https://github.com/MS-Quality-Hub/jmzqc |
Data
Results
Figure 1. mzQC processing workflow. Each software library is separately used to process different QC metrics, which are ultimately combined into a single QC report. Note that the mzQC software libraries do not calculate QC metric values themselves, but rather this functionality is provided by external scripts (“Java app”, “R script”, “Python script”) that subsequently use the respective mzQC library to produce the corresponding mzQC reports.
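The combination step in this workflow can be sketched as follows, assuming each partial report is already available as a parsed JSON document. This is an illustration under stated assumptions, not the authors' implementation: `combine_reports` and `make_report` are hypothetical helpers, runs are matched on the name of their first input file, and the `EX:` accessions and metric values are made-up placeholders.

```python
def make_report(run_name, accession, metric_name, value):
    """Build a hypothetical single-metric report for one MS run."""
    return {
        "mzQC": {
            "version": "1.0.0",
            "runQualities": [
                {
                    "metadata": {"inputFiles": [{"name": run_name}]},
                    "qualityMetrics": [
                        {"accession": accession, "name": metric_name, "value": value}
                    ],
                }
            ],
        }
    }


def combine_reports(reports):
    """Merge several reports, grouping quality metrics by run name."""
    combined = {}  # run name -> merged run quality entry
    for report in reports:
        for run_quality in report["mzQC"].get("runQualities", []):
            # Assumption: the first input file name identifies the run.
            run_name = run_quality["metadata"]["inputFiles"][0]["name"]
            entry = combined.setdefault(
                run_name,
                {"metadata": run_quality["metadata"], "qualityMetrics": []},
            )
            entry["qualityMetrics"].extend(run_quality["qualityMetrics"])
    return {"mzQC": {"version": "1.0.0", "runQualities": list(combined.values())}}


# Three partial reports, standing in for the output of the Java app,
# the R script, and the Python script (placeholder accessions and values).
java_report = make_report("run_01", "EX:0000001", "example metric A", 10)
r_report = make_report("run_01", "EX:0000002", "example metric B", 0.5)
python_report = make_report("run_02", "EX:0000003", "example metric C", 7)

combined_report = combine_reports([java_report, r_report, python_report])
for run in combined_report["mzQC"]["runQualities"]:
    name = run["metadata"]["inputFiles"][0]["name"]
    print(name, [m["accession"] for m in run["qualityMetrics"]])
```

Grouping by run keeps all metrics for one MS run together regardless of which tool computed them, which is the property the combined QC report relies on.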
Figure 2. Heatmap with dendrogram displaying QC metrics across eight MS runs (DMSO controls colored in green, sulforaphane samples colored in blue), clustered by MS runs on the horizontal axis and by QC metrics on the vertical axis. Colors in the heatmap represent percentile ranks calculated from the combined data set, with darker shades indicating lower percentile ranks and lighter shades indicating higher ranks. The QC metrics include the number of acquired MS/MS spectra, MS/MS identifications, peptide identifications, protein identifications, summed total ion current, and number of missed cleavages, among others. The metrics discussed in the text are highlighted in green. See the Jupyter Notebook on our GitHub repository (https://github.com/MS-Quality-Hub/mzqclib-manuscript/) for data analysis and code to generate the plot.
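The percentile ranks that drive the heatmap colors can be illustrated with a short sketch. The exact definition used in the linked notebook is not stated here, so the code assumes one common convention: the share of values below a value plus half the share of values tied with it. The run names, metric names, and values are made up for illustration.

```python
def percentile_ranks(values):
    """Percentile rank of each value within the list, on a 0 to 100 scale.

    Assumed convention: (count below + 0.5 * count equal) / n * 100.
    """
    n = len(values)
    ranks = []
    for v in values:
        below = sum(1 for other in values if other < v)
        equal = sum(1 for other in values if other == v)
        ranks.append(100.0 * (below + 0.5 * equal) / n)
    return ranks


# Hypothetical metric values for four runs (columns of the heatmap).
run_names = ["run_01", "run_02", "run_03", "run_04"]
metrics = {
    "example metric A": [10, 20, 30, 40],
    "example metric B": [0.5, 0.5, 0.9, 0.1],
}

# One row of percentile ranks per metric, aligned with run_names.
rank_table = {name: percentile_ranks(vals) for name, vals in metrics.items()}
for name, ranks in rank_table.items():
    print(name, dict(zip(run_names, ranks)))
```

Ranking each metric separately puts metrics with very different units and scales on a common 0 to 100 scale, which is what makes them comparable side by side in a single heatmap.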
Conclusion
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/jasms.4c00174.
Supplementary Table 1: List of QC metrics considered during the analysis and their accession numbers in the PSI-MS controlled vocabulary (PDF)
Acknowledgments
M.W. would like to acknowledge funding from the H2020 EPIC-XS grant [Grant number 823839], BBSRC ‘Proteomics DIA’ [BB/P024599/1], and from The Wellcome Trust [208391/Z/17/Z]. Additionally, J.A.V. would like to acknowledge EMBL core funding. This work was supported in part by the de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI) and ELIXIR-DE (Forschungszentrum Jülich and W-de.NBI-001, W-de.NBI-004, W-de.NBI-008, W-de.NBI-010, W-de.NBI-013, W-de.NBI-014, W-de.NBI-016, and W-de.NBI-022). T.V.D.B. acknowledges funding from the Research Foundation Flanders (FWO) [1286824N]. W.B. acknowledges support by the University of Antwerp Research Fund.
References
This article references 39 other publications.
- 1Bittremieux, W.; Tabb, D. L.; Impens, F.; Staes, A.; Timmerman, E.; Martens, L.; Laukens, K. Quality Control in Mass Spectrometry-Based Proteomics. Mass Spectrom. Rev. 2018, 37 (5), 697– 711, DOI: 10.1002/mas.21544Google Scholar1Quality control in mass spectrometry-based proteomicsBittremieux, Wout; Tabb, David L.; Impens, Francis; Staes, An; Timmerman, Evy; Martens, Lennart; Laukens, KrisMass Spectrometry Reviews (2018), 37 (5), 697-711CODEN: MSRVD3; ISSN:0277-7037. (John Wiley & Sons, Inc.)A review. Mass spectrometry is a highly complex anal. technique and mass spectrometry-based proteomics expts. can be subject to a large variability, which forms an obstacle to obtaining accurate and reproducible results. Therefore, a comprehensive and systematic approach to quality control is an essential requirement to inspire confidence in the generated results. A typical mass spectrometry expt. consists of multiple different phases including the sample prepn., liq. chromatog., mass spectrometry, and bioinformatics stages. We review potential sources of variability that can impact the results of a mass spectrometry expt. occurring in all of these steps, and we discuss how to monitor and remedy the neg. influences on the exptl. results. Furthermore, we describe how specialized quality control samples of varying sample complexity can be incorporated into the exptl. workflow and how they can be used to rigorously assess detailed aspects of the instrument performance.
- 2Baker, M. 1,500 Scientists Lift the Lid on Reproducibility. Nature 2016, 533 (7604), 452– 454, DOI: 10.1038/533452aGoogle Scholar21,500 scientists lift the lid on reproducibilityBaker, MonyaNature (London, United Kingdom) (2016), 533 (7604), 452-454CODEN: NATUAS; ISSN:0028-0836. (Nature Publishing Group)Survey sheds light on the 'crisis' rocking research.
- 3Rodriguez, H.; Snyder, M.; Uhlén, M.; Andrews, P.; Beavis, R.; Borchers, C.; Chalkley, R. J.; Cho, S. Y.; Cottingham, K.; Dunn, M.; Dylag, T.; Edgar, R.; Hare, P.; Heck, A. J. R.; Hirsch, R. F.; Kennedy, K.; Kolar, P.; Kraus, H.-J.; Mallick, P.; Nesvizhskii, A.; Ping, P.; Pontén, F.; Yang, L.; Yates, J. R.; Stein, S. E.; Hermjakob, H.; Kinsinger, C. R.; Apweiler, R. Recommendations from the 2008 International Summit on Proteomics Data Release and Sharing Policy: The Amsterdam Principles. J. Proteome Res. 2009, 8 (7), 3689– 3692, DOI: 10.1021/pr900023zGoogle Scholar3Recommendations from the 2008 International Summit on Proteomics Data Release and Sharing Policy: The Amsterdam PrinciplesRodriguez, Henry; Snyder, Mike; Uhlen, Mathias; Andrews, Phil; Beavis, Ronald; Borchers, Christoph; Chalkley, Robert J.; Cho, Sang Yun; Cottingham, Katie; Dunn, Michael; Dylag, Tomasz; Edgar, Ron; Hare, Peter; Heck, Albert J. R.; Hirsch, Roland F.; Kennedy, Karen; Kolar, Patrik; Kraus, Hans-Joachim; Mallick, Parag; Nesvizhskii, Alexey; Ping, Peipei; Ponten, Fredrik; Yang, Liming; Yates, John R.; Stein, Stephen E.; Hermjakob, Henning; Kinsinger, Christopher R.; Apweiler, RolfJournal of Proteome Research (2009), 8 (7), 3689-3692CODEN: JPROBS; ISSN:1535-3893. (American Chemical Society)Policies supporting the rapid and open sharing of genomic data have directly fueled the accelerated pace of discovery in large-scale genomics research. The proteomics community is starting to implement analogous policies and infrastructure for making large-scale proteomics data widely available on a precompetitive basis. On August 14, 2008, the National Cancer Institute (NCI) convened the "International Summit on Proteomics Data Release and Sharing Policy" in Amsterdam, The Netherlands, to identify and address potential roadblocks to rapid and open access to data. 
The six principles agreed upon by key stakeholders at the summit addressed issues surrounding (1) timing, (2) comprehensiveness, (3) format, (4) deposition to repositories, (5) quality metrics, and (6) responsibility for proteomics data release. This summit report explores various approaches to develop a framework of data release and sharing principles that will most effectively fulfill the needs of the funding agencies and the research community.
- 4Kinsinger, C. R.; Apffel, J.; Baker, M.; Bian, X.; Borchers, C. H.; Bradshaw, R.; Brusniak, M.-Y.; Chan, D. W.; Deutsch, E. W.; Domon, B.; Gorman, J.; Grimm, R.; Hancock, W.; Hermjakob, H.; Horn, D.; Hunter, C.; Kolar, P.; Kraus, H.-J.; Langen, H.; Linding, R.; Moritz, R. L.; Omenn, G. S.; Orlando, R.; Pandey, A.; Ping, P.; Rahbar, A.; Rivers, R.; Seymour, S. L.; Simpson, R. J.; Slotta, D.; Smith, R. D.; Stein, S. E.; Tabb, D. L.; Tagle, D.; Yates, J. R. I.; Rodriguez, H. Recommendations for Mass Spectrometry Data Quality Metrics for Open Access Data (Corollary to the Amsterdam Principles). Mol. Cell. Proteomics 2011, 10 (12), O111.015446, DOI: 10.1074/mcp.O111.015446Google ScholarThere is no corresponding record for this reference.
- 5Rudnick, P. A.; Clauser, K. R.; Kilpatrick, L. E.; Tchekhovskoi, D. V.; Neta, P.; Blonder, N.; Billheimer, D. D.; Blackman, R. K.; Bunk, D. M.; Cardasis, H. L.; Ham, A.-J. L.; Jaffe, J. D.; Kinsinger, C. R.; Mesri, M.; Neubert, T. A.; Schilling, B.; Tabb, D. L.; Tegeler, T. J.; Vega-Montoto, L.; Variyath, A. M.; Wang, M.; Wang, P.; Whiteaker, J. R.; Zimmerman, L. J.; Carr, S. A.; Fisher, S. J.; Gibson, B. W.; Paulovich, A. G.; Regnier, F. E.; Rodriguez, H.; Spiegelman, C.; Tempst, P.; Liebler, D. C.; Stein, S. E. Performance Metrics for Liquid Chromatography-Tandem Mass Spectrometry Systems in Proteomics Analyses. Mol. Cell. Proteomics 2010, 9 (2), 225– 241, DOI: 10.1074/mcp.M900223-MCP200Google Scholar5Performance metrics for liquid chromatography-tandem mass spectrometry systems in proteomics analysesRudnick, Paul A.; Clauser, Karl R.; Kilpatrick, Lisa E.; Tchekhovskoi, Dmitrii V.; Neta, Pedatsur; Blonder, Niksa; Billheimer, Dean D.; Blackman, Ronald K.; Bunk, David M.; Cardasis, Helene L.; Ham, Amy-Joan L.; Jaffe, Jacob D.; Kinsinger, Christopher R.; Mesri, Mehdi; Neubert, Thomas A.; Schilling, Birgit; Tabb, David L.; Tegeler, Tony J.; Vega-Montoto, Lorenzo; Variyath, Asokan Mulayath; Wang, Mu; Wang, Pei; Whiteaker, Jeffrey R.; Zimmerman, Lisa J.; Carr, Steven A.; Fisher, Susan J.; Gibson, Bradford W.; Paulovich, Amanda G.; Regnier, Fred E.; Rodriguez, Henry; Spiegelman, Cliff; Tempst, Paul; Liebler, Daniel C.; Stein, Stephen E.Molecular and Cellular Proteomics (2010), 9 (2), 225-241CODEN: MCPOBS; ISSN:1535-9484. (American Society for Biochemistry and Molecular Biology)A major unmet need in LC-MS/MS-based proteomics analyses is a set of tools for quant. assessment of system performance and evaluation of tech. variability. Here we describe 46 system performance metrics for monitoring chromatog. performance, electrospray source stability, MS1 and MS2 signals, dynamic sampling of ions for MS/MS, and peptide identification. 
Applied to data sets from replicate LC-MS/MS analyses, these metrics displayed consistent, reasonable responses to controlled perturbations. The metrics typically displayed variations less than 10% and thus can reveal even subtle differences in performance of system components. Analyses of data from interlab. studies conducted under a common std. operating procedure identified outlier data and provided clues to specific causes. Moreover, interlab. variation reflected by the metrics indicates which system components vary the most between labs. Application of these metrics enables rational, quant. quality assessment for proteomics and other LC-MS/MS anal. applications.
- 6Ma, Z.-Q.; Polzin, K. O.; Dasari, S.; Chambers, M. C.; Schilling, B.; Gibson, B. W.; Tran, B. Q.; Vega-Montoto, L.; Liebler, D. C.; Tabb, D. L. QuaMeter: Multivendor Performance Metrics for LC-MS/MS Proteomics Instrumentation. Anal. Chem. 2012, 84 (14), 5845– 5850, DOI: 10.1021/ac300629pGoogle Scholar6QuaMeter: Multivendor Performance Metrics for LC-MS/MS Proteomics InstrumentationMa, Ze-Qiang; Polzin, Kenneth O.; Dasari, Surendra; Chambers, Matthew C.; Schilling, Birgit; Gibson, Bradford W.; Tran, Bao Q.; Vega-Montoto, Lorenzo; Liebler, Daniel C.; Tabb, David L.Analytical Chemistry (Washington, DC, United States) (2012), 84 (14), 5845-5850CODEN: ANCHAM; ISSN:0003-2700. (American Chemical Society)LC-MS/MS-based proteomics studies rely on stable anal. system performance that can be evaluated by objective criteria. The National Institute of Stds. and Technol. (NIST) introduced the MSQC software to compute diverse metrics from exptl. LC-MS/MS data, enabling quality anal. and quality control (QA/QC) of proteomics instrumentation. In practice, however, several attributes of the MSQC software prevent its use for routine instrument monitoring. Here, we present QuaMeter, an open-source tool that improves MSQC in several aspects. QuaMeter can directly read raw data from instruments manufd. by different vendors. The software can work with a wide variety of peptide identification software for improved reliability and flexibility. Finally, QC metrics implemented in QuaMeter are rigorously defined and tested. The source code and binary versions of QuaMeter are available under Apache 2.0 License at http://fenchurch.mc.vanderbilt.edu.
- 7Pichler, P.; Mazanek, M.; Dusberger, F.; Weilnböck, L.; Huber, C. G.; Stingl, C.; Luider, T. M.; Straube, W. L.; Köcher, T.; Mechtler, K. SIMPATIQCO: A Server-Based Software Suite Which Facilitates Monitoring the Time Course of LC-MS Performance Metrics on Orbitrap Instruments. J. Proteome Res. 2012, 11 (11), 5540– 5547, DOI: 10.1021/pr300163uGoogle Scholar7SIMPATIQCO: A Server-Based Software Suite Which Facilitates Monitoring the Time Course of LC-MS Performance Metrics on Orbitrap InstrumentsPichler, Peter; Mazanek, Michael; Dusberger, Frederico; Weilnboeck, Lisa; Huber, Christian G.; Stingl, Christoph; Luider, Theo M.; Straube, Werner L.; Koecher, Thomas; Mechtler, KarlJournal of Proteome Research (2012), 11 (11), 5540-5547CODEN: JPROBS; ISSN:1535-3893. (American Chemical Society)While the performance of liq. chromatog. (LC) and mass spectrometry (MS) instrumentation continues to increase, applications such as analyses of complete or near-complete proteomes and quant. studies require const. and optimal system performance. For this reason, research labs. and core facilities alike are recommended to implement quality control (QC) measures as part of their routine workflows. Many labs. perform sporadic quality control checks. However, successive and systematic longitudinal monitoring of system performance would be facilitated by dedicated automatic or semiautomatic software solns. that aid an effortless anal. and display of QC metrics over time. We present the software package SIMPATIQCO (SIMPle AuTomatIc Quality COntrol) designed for evaluation of data from LTQ Orbitrap, Q-Exactive, LTQ FT, and LTQ instruments. A centralized SIMPATIQCO server can process QC data from multiple instruments. The software calcs. QC metrics supervising every step of data acquisition from LC and electrospray to MS. For each QC metric the software learns the range indicating adequate system performance from the uploaded data using robust statistics. 
Results are stored in a database and can be displayed in a comfortable manner from any computer in the lab. via a web browser. QC data can be monitored for individual LC runs as well as plotted over time. SIMPATIQCO thus assists the longitudinal monitoring of important QC metrics such as peptide elution times, peak widths, intensities, total ion current (TIC) as well as sensitivity, and overall LC-MS system performance; in this way the software also helps identify potential problems. The SIMPATIQCO software package is available free of charge.
- 8Bielow, C.; Mastrobuoni, G.; Kempa, S. Proteomics Quality Control: Quality Control Software for MaxQuant Results. J. Proteome Res. 2016, 15 (3), 777– 787, DOI: 10.1021/acs.jproteome.5b00780Google Scholar8Proteomics Quality Control: Quality Control Software for MaxQuant ResultsBielow, Chris; Mastrobuoni, Guido; Kempa, StefanJournal of Proteome Research (2016), 15 (3), 777-787CODEN: JPROBS; ISSN:1535-3893. (American Chemical Society)Mass spectrometry-based proteomics coupled to liq. chromatog. has matured into an automatized, high-throughput technol., producing data on the scale of multiple gigabytes per instrument per day. Consequently, an automated quality control (QC) and quality anal. (QA) capable of detecting measurement bias, verifying consistency, and avoiding propagation of error is paramount for instrument operators and scientists in charge of downstream anal. We have developed an R-based QC pipeline called Proteomics Quality Control (PTXQC) for bottom-up LC-MS data generated by the MaxQuant software pipeline. PTXQC creates a QC report contg. a comprehensive and powerful set of QC metrics, augmented with automated scoring functions. The automated scores are collated to create an overview heatmap at the beginning of the report, giving valuable guidance also to nonspecialists. Our software supports a wide range of exptl. designs, including stable isotope labeling by amino acids in cell culture (SILAC), tandem mass tags (TMT), and label-free data. Furthermore, we introduce new metrics to score MaxQuant's Match-between-runs (MBR) functionality by which peptide identifications can be transferred across Raw files based on accurate retention time and m/z. Last but not least, PTXQC is easy to install and use and represents the first QC software capable of processing MaxQuant result tables. PTXQC is freely available at https://github.com/cbielow/PTXQC.
- 9Chiva, C.; Olivella, R.; Borràs, E.; Espadas, G.; Pastor, O.; Solé, A.; Sabidó, E. QCloud: A Cloud-Based Quality Control System for Mass Spectrometry-Based Proteomics Laboratories. PLOS ONE 2018, 13 (1), e0189209 DOI: 10.1371/journal.pone.0189209Google Scholar9QCloud: A cloud-based quality control system for mass spectrometry-based proteomics laboratoriesChiva, Cristina; Olivella, Roger; Borras, Eva; Espadas, Guadalupe; Pastor, Olga; Sole, Amanda; Sabido, EduardPLoS One (2018), 13 (1), e0189209/1-e0189209/14CODEN: POLNCL; ISSN:1932-6203. (Public Library of Science)The increasing no. of biomedical and translational applications in mass spectrometrybased proteomics poses new anal. challenges and raises the need for automated quality control systems. Despite previous efforts to set std. file formats, data processing workflows and key evaluation parameters for quality control, automated quality control systems are not yet widespread among proteomics labs., which limits the acquisition of high-quality results, inter-lab. comparisons and the assessment of variability of instrumental platforms. Here we present QCloud, a cloud-based system to support proteomics labs. in daily quality assessment using a user-friendly interface, easy setup, automated data processing and archiving, and unbiased instrument evaluation. QCloud supports the most common targeted and untargeted proteomics workflows, it accepts data formats from different vendors and it enables the annotation of acquired data and reporting incidences. A complete version of the QCloud system has successfully been developed and it is now open to the proteomics community. QCloud system is an open source project, publicly available under a Creative Commons License Attribution- ShareAlike 4.0.
- 10Broadhurst, D.; Goodacre, R.; Reinke, S. N.; Kuligowski, J.; Wilson, I. D.; Lewis, M. R.; Dunn, W. B. Guidelines and Considerations for the Use of System Suitability and Quality Control Samples in Mass Spectrometry Assays Applied in Untargeted Clinical Metabolomic Studies. Metabolomics 2018, 14 (6), 72, DOI: 10.1007/s11306-018-1367-3Google Scholar10Guidelines and considerations for the use of system suitability and quality control samples in mass spectrometry assays applied in untargeted clinical metabolomic studiesBroadhurst David; Reinke Stacey N; Goodacre Royston; Reinke Stacey N; Kuligowski Julia; Wilson Ian D; Lewis Matthew R; Dunn Warwick B; Dunn Warwick B; Dunn Warwick BMetabolomics : Official journal of the Metabolomic Society (2018), 14 (6), 72 ISSN:1573-3882.BACKGROUND: Quality assurance (QA) and quality control (QC) are two quality management processes that are integral to the success of metabolomics including their application for the acquisition of high quality data in any high-throughput analytical chemistry laboratory. QA defines all the planned and systematic activities implemented before samples are collected, to provide confidence that a subsequent analytical process will fulfil predetermined requirements for quality. QC can be defined as the operational techniques and activities used to measure and report these quality requirements after data acquisition. AIM OF REVIEW: This tutorial review will guide the reader through the use of system suitability and QC samples, why these samples should be applied and how the quality of data can be reported. KEY SCIENTIFIC CONCEPTS OF REVIEW: System suitability samples are applied to assess the operation and lack of contamination of the analytical platform prior to sample analysis. Isotopically-labelled internal standards are applied to assess system stability for each sample analysed. 
Pooled QC samples are applied to condition the analytical platform, perform intra-study reproducibility measurements (QC) and to correct mathematically for systematic errors. Standard reference materials and long-term reference QC samples are applied for inter-study and inter-laboratory assessment of data.
- 11Stratton, K. G.; Webb-Robertson, B.-J. M.; McCue, L. A.; Stanfill, B.; Claborne, D.; Godinez, I.; Johansen, T.; Thompson, A. M.; Burnum-Johnson, K. E.; Waters, K. M.; Bramer, L. M. pmartR: Quality Control and Statistics for Mass Spectrometry-Based Biological Data. J. Proteome Res. 2019, 18 (3), 1418– 1425, DOI: 10.1021/acs.jproteome.8b00760Google Scholar11pmartR: Quality Control and Statistics for Mass Spectrometry-Based Biological DataStratton, Kelly G.; Webb-Robertson, Bobbie-Jo M.; McCue, Lee Ann; Stanfill, Bryan; Claborne, Daniel; Godinez, Iobani; Johansen, Thomas; Thompson, Allison M.; Burnum-Johnson, Kristin E.; Waters, Katrina M.; Bramer, Lisa M.Journal of Proteome Research (2019), 18 (3), 1418-1425CODEN: JPROBS; ISSN:1535-3893. (American Chemical Society)Prior to statistical anal. of mass spectrometry (MS) data, quality control (QC) of the identified biomol. peak intensities is imperative for reducing process-based sources of variation and extreme biol. outliers. Without this step, statistical results can be biased. Addnl., liq. chromatog.-MS proteomics data present inherent challenges due to large amts. of missing data that require special consideration during statistical anal. While a no. of R packages exist to address these challenges individually, there is no single R package that addresses all of them. The authors present pmartR, an open-source R package, for QC (filtering and normalization), exploratory data anal. (EDA), visualization, and statistical anal. robust to missing data. Example anal. using proteomics data from a mouse study comparing smoke exposure to control demonstrates the core functionality of the package and highlights the capabilities for handling missing data. In particular, using a combined quant. and qual. statistical test, 19 proteins whose statistical significance would have been missed by a quant. test alone were identified. 
The pmartR package provides a single software tool for QC, EDA, and statistical comparisons of MS data that is robust to missing data and includes numerous visualization capabilities.
- 12Naake, T.; Rainer, J.; Huber, W. MsQuality: An Interoperable Open-Source Package for the Calculation of Standardized Quality Metrics of Mass Spectrometry Data. Bioinformatics 2023, 39 (10), btad618, DOI: 10.1093/bioinformatics/btad618Google ScholarThere is no corresponding record for this reference.
- 13Kirwan, J. A.; Gika, H.; Beger, R. D.; Bearden, D.; Dunn, W. B.; Goodacre, R.; Theodoridis, G.; Witting, M.; Yu, L.-R.; Wilson, I. D. the metabolomics Quality Assurance and Quality Control Consortium (mQACC). Quality Assurance and Quality Control Reporting in Untargeted Metabolic Phenotyping: mQACC Recommendations for Analytical Quality Management. Metabolomics 2022, 18 (9), 70, DOI: 10.1007/s11306-022-01926-3Google Scholar13Quality assurance and quality control reporting in untargeted metabolic phenotyping: mQACC recommendations for analytical quality managementKirwan, Jennifer A.; Gika, Helen; Beger, Richard D.; Bearden, Dan; Dunn, Warwick B.; Goodacre, Royston; Theodoridis, Georgios; Witting, Michael; Yu, Li-Rong; Wilson, Ian D.; the metabolomics Quality Assurance and Quality Control ConsortiumMetabolomics (2022), 18 (9), 70CODEN: METAHQ; ISSN:1573-3890. (Springer)Abstr.: Background: Demonstrating that the data produced in metabolic phenotyping investigations (metabolomics/metabonomics) is of good quality is increasingly seen as a key factor in gaining acceptance for the results of such studies. The use of established quality control (QC) protocols, including appropriate QC samples, is an important and evolving aspect of this process. However, inadequate or incorrect reporting of the QA/QC procedures followed in the study may lead to misinterpretation or overemphasis of the findings and prevent future metanal. of the body of work. Objective: The aim of this guidance is to provide researchers with a framework that encourages them to describe quality assessment and quality control procedures and outcomes in mass spectrometry and NMR spectroscopy-based methods in untargeted metabolomics, with a focus on reporting on QC samples in sufficient detail for them to be understood, trusted and replicated. There is no intent to be proscriptive with regard to anal. best practices; rather, guidance for reporting QA/QC procedures is suggested. 
A template that can be completed as studies progress to ensure that relevant data is collected, and further documents, are provided as online resources. Key reporting practices: Multiple topics should be considered when reporting QA/QC protocols and outcomes for metabolic phenotyping data. Coverage should include the role(s), sources, types, prepn. and uses of the QC materials and samples generally employed in the generation of metabolomic data. Details such as sample matrixes and sample prepn., the use of test mixts. and system suitability tests, blanks and technique-specific factors are considered and methods for reporting are discussed, including the importance of reporting the acceptance criteria for the QCs. To this end, the reporting of the QC samples and results are considered at two levels of detail: "minimal" and "best reporting practice" levels.
- 14. Köfeler, H. C.; Ahrends, R.; Baker, E. S.; Ekroos, K.; Han, X.; Hoffmann, N.; Holčapek, M.; Wenk, M. R.; Liebisch, G. Recommendations for Good Practice in MS-Based Lipidomics. J. Lipid Res. 2021, 62, 100138, DOI: 10.1016/j.jlr.2021.100138
- 15. McDonald, J. G.; Ejsing, C. S.; Kopczynski, D.; Holčapek, M.; Aoki, J.; Arita, M.; Arita, M.; Baker, E. S.; Bertrand-Michel, J.; Bowden, J. A.; Brügger, B.; Ellis, S. R.; Fedorova, M.; Griffiths, W. J.; Han, X.; Hartler, J.; Hoffmann, N.; Koelmel, J. P.; Köfeler, H. C.; Mitchell, T. W.; O’Donnell, V. B.; Saigusa, D.; Schwudke, D.; Shevchenko, A.; Ulmer, C. Z.; Wenk, M. R.; Witting, M.; Wolrab, D.; Xia, Y.; Ahrends, R.; Liebisch, G.; Ekroos, K. Introducing the Lipidomics Minimal Reporting Checklist. Nat. Metab. 2022, 4 (9), 1086–1088, DOI: 10.1038/s42255-022-00628-3
- 16. Bittremieux, W.; Valkenborg, D.; Martens, L.; Laukens, K. Computational Quality Control Tools for Mass Spectrometry Proteomics. PROTEOMICS 2017, 17 (3–4), 1600159, DOI: 10.1002/pmic.201600159
- 17. Bittremieux, W.; Walzer, M.; Tenzer, S.; Zhu, W.; Salek, R. M.; Eisenacher, M.; Tabb, D. L. The Human Proteome Organization-Proteomics Standards Initiative Quality Control Working Group: Making Quality Control More Accessible for Biological Mass Spectrometry. Anal. Chem. 2017, 89 (8), 4474–4479, DOI: 10.1021/acs.analchem.6b04310
- 18. Deutsch, E. W.; Vizcaíno, J. A.; Jones, A. R.; Binz, P.-A.; Lam, H.; Klein, J.; Bittremieux, W.; Perez-Riverol, Y.; Tabb, D. L.; Walzer, M.; Ricard-Blum, S.; Hermjakob, H.; Neumann, S.; Mak, T. D.; Kawano, S.; Mendoza, L.; Van Den Bossche, T.; Gabriels, R.; Bandeira, N.; Carver, J.; Pullman, B.; Sun, Z.; Hoffmann, N.; Shofstahl, J.; Zhu, Y.; Licata, L.; Quaglia, F.; Tosatto, S. C. E.; Orchard, S. E. Proteomics Standards Initiative at Twenty Years: Current Activities and Future Work. J. Proteome Res. 2023, 22 (2), 287–301, DOI: 10.1021/acs.jproteome.2c00637
- 19. Mayer, G.; Montecchi-Palazzi, L.; Ovelleiro, D.; Jones, A. R.; Binz, P.-A.; Deutsch, E. W.; Chambers, M.; Kallhardt, M.; Levander, F.; Shofstahl, J.; Orchard, S.; Vizcaino, J. A.; Hermjakob, H.; Stephan, C.; Meyer, H. E.; Eisenacher, M. The HUPO Proteomics Standards Initiative- Mass Spectrometry Controlled Vocabulary. Database 2013, 2013, bat009, DOI: 10.1093/database/bat009
- 20. Röst, H. L.; Schmitt, U.; Aebersold, R.; Malmström, L. pyOpenMS: A Python-Based Interface to the OpenMS Mass-Spectrometry Algorithm Library. PROTEOMICS 2014, 14 (1), 74–77, DOI: 10.1002/pmic.201300246
- 21. Levitsky, L. I.; Klein, J. A.; Ivanov, M. V.; Gorshkov, M. Pyteomics 4.0: Five Years of Development of a Python Proteomics Framework. J. Proteome Res. 2019, 18 (2), 709–714, DOI: 10.1021/acs.jproteome.8b00717
- 22. Huber, F.; Verhoeven, S.; Meijer, C.; Spreeuw, H.; Castilla, E.; Geng, C.; van der Hooft, J.; Rogers, S.; Belloum, A.; Diblen, F.; Spaaks, J. Matchms - Processing and Similarity Evaluation of Mass Spectrometry Data. J. Open Source Softw. 2020, 5 (52), 2411, DOI: 10.21105/joss.02411
- 23. Bittremieux, W.; Levitsky, L.; Pilz, M.; Sachsenberg, T.; Huber, F.; Wang, M.; Dorrestein, P. C. Unified and Standardized Mass Spectrometry Data Processing in Python Using Spectrum_utils. J. Proteome Res. 2023, 22 (2), 625–631, DOI: 10.1021/acs.jproteome.2c00632
- 24. Smith, C. A.; Want, E. J.; O’Maille, G.; Abagyan, R.; Siuzdak, G. XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification. Anal. Chem. 2006, 78 (3), 779–787, DOI: 10.1021/ac051437y
- 25. Gatto, L.; Gibb, S.; Rainer, J. MSnbase, Efficient and Elegant R-Based Processing and Visualization of Raw Mass Spectrometry Data. J. Proteome Res. 2021, 20 (1), 1063–1069, DOI: 10.1021/acs.jproteome.0c00313
- 26. Barsnes, H.; Vaudel, M.; Colaert, N.; Helsens, K.; Sickmann, A.; Berven, F. S.; Martens, L. Compomics-Utilities: An Open-Source Java Library for Computational Proteomics. BMC Bioinformatics 2011, 12 (1), 70, DOI: 10.1186/1471-2105-12-70
- 27. Schmid, R.; Heuckeroth, S.; Korf, A.; Smirnov, A.; Myers, O.; Dyrlund, T. S.; Bushuiev, R.; Murray, K. J.; Hoffmann, N.; Lu, M.; Sarvepalli, A.; Zhang, Z.; Fleischauer, M.; Dührkop, K.; Wesner, M.; Hoogstra, S. J.; Rudt, E.; Mokshyna, O.; Brungs, C.; Ponomarov, K.; Mutabdžija, L.; Damiani, T.; Pudney, C. J.; Earll, M.; Helmer, P. O.; Fallon, T. R.; Schulze, T.; Rivas-Ubach, A.; Bilbao, A.; Richter, H.; Nothias, L.-F.; Wang, M.; Orešič, M.; Weng, J.-K.; Böcker, S.; Jeibmann, A.; Hayen, H.; Karst, U.; Dorrestein, P. C.; Petras, D.; Du, X.; Pluskal, T. Integrative Analysis of Multimodal Mass Spectrometry Data in MZmine 3. Nat. Biotechnol. 2023, 41 (4), 447–449, DOI: 10.1038/s41587-023-01690-2
- 28. Martens, L.; Chambers, M.; Sturm, M.; Kessner, D.; Levander, F.; Shofstahl, J.; Tang, W. H.; Römpp, A.; Neumann, S.; Pizarro, A. D.; Montecchi-Palazzi, L.; Tasman, N.; Coleman, M.; Reisinger, F.; Souda, P.; Hermjakob, H.; Binz, P.-A.; Deutsch, E. W. mzML─a Community Standard for Mass Spectrometry Data. Mol. Cell. Proteomics 2011, 10 (1), R110.000133, DOI: 10.1074/mcp.R110.000133
- 29. Griss, J.; Jones, A. R.; Sachsenberg, T.; Walzer, M.; Gatto, L.; Hartler, J.; Thallinger, G. G.; Salek, R. M.; Steinbeck, C.; Neuhauser, N.; Cox, J.; Neumann, S.; Fan, J.; Reisinger, F.; Xu, Q.-W.; del Toro, N.; Perez-Riverol, Y.; Ghali, F.; Bandeira, N.; Xenarios, I.; Kohlbacher, O.; Vizcaíno, J. A.; Hermjakob, H. The mzTab Data Exchange Format: Communicating Mass-Spectrometry-Based Proteomics and Metabolomics Experimental Results to a Wider Audience. Mol. Cell. Proteomics 2014, 13 (10), 2765–2775, DOI: 10.1074/mcp.O113.036681
- 30. Deutsch, E. W.; Bandeira, N.; Sharma, V.; Perez-Riverol, Y.; Carver, J. J.; Kundu, D. J.; García-Seisdedos, D.; Jarnuczak, A. F.; Hewapathirana, S.; Pullman, B. S.; Wertz, J.; Sun, Z.; Kawano, S.; Okuda, S.; Watanabe, Y.; Hermjakob, H.; MacLean, B.; MacCoss, M. J.; Zhu, Y.; Ishihama, Y.; Vizcaíno, J. A. The ProteomeXchange Consortium in 2020: Enabling “big Data” Approaches in Proteomics. Nucleic Acids Res. 2019, 48 (D1), D1145–D1152, DOI: 10.1093/nar/gkz984
- 31. Marshall, S. A.; Young, R. B.; Lewis, J. M.; Rutten, E. L.; Gould, J.; Barlow, C. K.; Giogha, C.; Marcelino, V. R.; Fields, N.; Schittenhelm, R. B.; Hartland, E. L.; Scott, N. E.; Forster, S. C.; Gulliver, E. L. The Broccoli-Derived Antioxidant Sulforaphane Changes the Growth of Gastrointestinal Microbiota, Allowing for the Production of Anti-Inflammatory Metabolites. J. Funct. Foods 2023, 107, 105645, DOI: 10.1016/j.jff.2023.105645
- 32. Hulstaert, N.; Shofstahl, J.; Sachsenberg, T.; Walzer, M.; Barsnes, H.; Martens, L.; Perez-Riverol, Y. ThermoRawFileParser: Modular, Scalable, and Cross-Platform RAW File Conversion. J. Proteome Res. 2020, 19 (1), 537–542, DOI: 10.1021/acs.jproteome.9b00328
- 33. Park, C. Y.; Klammer, A. A.; Käll, L.; MacCoss, M. J.; Noble, W. S. Rapid and Accurate Peptide Identification from Tandem Mass Spectra. J. Proteome Res. 2008, 7 (7), 3022–3027, DOI: 10.1021/pr800127y
- 34. The UniProt Consortium. UniProt: A Hub for Protein Information. Nucleic Acids Res. 2015, 43 (D1), D204–D212, DOI: 10.1093/nar/gku989
- 35. Lin, A.; See, D.; Fondrie, W. E.; Keich, U.; Noble, W. S. Target-Decoy False Discovery Rate Estimation Using Crema. PROTEOMICS 2024, 24 (8), 2300084, DOI: 10.1002/pmic.202300084
- 36. Côté, R. G.; Reisinger, F.; Martens, L. jmzML, an Open-Source Java API for mzML, the PSI Standard for MS Data. PROTEOMICS 2010, 10 (7), 1332–1335, DOI: 10.1002/pmic.200900719
- 37. Pluskal, T.; Hoffmann, N.; Du, X.; Weng, J.-K. Mass Spectrometry Development Kit (MSDK): A Java Library for Mass Spectrometry Data Processing. In New Developments in Mass Spectrometry; Winkler, R., Ed.; Royal Society of Chemistry: Cambridge, 2020; pp 399–405, DOI: 10.1039/9781788019880-00399
- 38. McKinney, W. Data Structures for Statistical Computing in Python. In Proceedings of the 9th Python in Science Conference; van der Walt, S.; Millman, J., Eds.; Austin, Texas, USA, 2010; pp 51–56, DOI: 10.25080/Majora-92bf1922-00a
- 39. Thomas, K.; Benjamin, R.-K.; Fernando, P.; Brian, G.; Matthias, B.; Jonathan, F.; Kyle, K.; Jessica, H.; Jason, G.; Sylvain, C.; Paul, I.; Damián, A.; Safia, A.; Carol, W.; Jupyter Development Team. Jupyter Notebooks - A Publishing Format for Reproducible Computational Workflows. In Positioning and Power in Academic Publishing: Players, Agents and Agendas; IOS Press, 2016; pp 87–90.
Figure 1. mzQC processing workflow. Each software library is separately used to process different QC metrics, which are ultimately combined into a single QC report. Note that the mzQC software libraries do not calculate QC metric values themselves, but rather this functionality is provided by external scripts (“Java app”, “R script”, “Python script”) that subsequently use the respective mzQC library to produce the corresponding mzQC reports.
Figure 2. Heatmap with dendrogram displaying QC metrics across eight MS runs (DMSO controls colored in green, sulforaphane samples colored in blue), clustered by MS runs on the horizontal axis and by QC metrics on the vertical axis. Colors in the heatmap represent percentile ranks calculated from the combined data set, with darker shades indicating lower percentile ranks and lighter shades indicating higher ranks. The QC metrics include the number of acquired MS/MS spectra, MS/MS identifications, peptide identifications, protein identifications, summed total ion current, and number of missed cleavages, among others. The metrics discussed in the text are highlighted in green. See the Jupyter Notebook on our GitHub repository (https://github.com/MS-Quality-Hub/mzqclib-manuscript/) for data analysis and code to generate the plot.
References
This article references 39 other publications.
- 1Bittremieux, W.; Tabb, D. L.; Impens, F.; Staes, A.; Timmerman, E.; Martens, L.; Laukens, K. Quality Control in Mass Spectrometry-Based Proteomics. Mass Spectrom. Rev. 2018, 37 (5), 697– 711, DOI: 10.1002/mas.215441Quality control in mass spectrometry-based proteomicsBittremieux, Wout; Tabb, David L.; Impens, Francis; Staes, An; Timmerman, Evy; Martens, Lennart; Laukens, KrisMass Spectrometry Reviews (2018), 37 (5), 697-711CODEN: MSRVD3; ISSN:0277-7037. (John Wiley & Sons, Inc.)A review. Mass spectrometry is a highly complex anal. technique and mass spectrometry-based proteomics expts. can be subject to a large variability, which forms an obstacle to obtaining accurate and reproducible results. Therefore, a comprehensive and systematic approach to quality control is an essential requirement to inspire confidence in the generated results. A typical mass spectrometry expt. consists of multiple different phases including the sample prepn., liq. chromatog., mass spectrometry, and bioinformatics stages. We review potential sources of variability that can impact the results of a mass spectrometry expt. occurring in all of these steps, and we discuss how to monitor and remedy the neg. influences on the exptl. results. Furthermore, we describe how specialized quality control samples of varying sample complexity can be incorporated into the exptl. workflow and how they can be used to rigorously assess detailed aspects of the instrument performance.
- 2Baker, M. 1,500 Scientists Lift the Lid on Reproducibility. Nature 2016, 533 (7604), 452– 454, DOI: 10.1038/533452a21,500 scientists lift the lid on reproducibilityBaker, MonyaNature (London, United Kingdom) (2016), 533 (7604), 452-454CODEN: NATUAS; ISSN:0028-0836. (Nature Publishing Group)Survey sheds light on the 'crisis' rocking research.
- 3. Rodriguez, H.; Snyder, M.; Uhlén, M.; Andrews, P.; Beavis, R.; Borchers, C.; Chalkley, R. J.; Cho, S. Y.; Cottingham, K.; Dunn, M.; Dylag, T.; Edgar, R.; Hare, P.; Heck, A. J. R.; Hirsch, R. F.; Kennedy, K.; Kolar, P.; Kraus, H.-J.; Mallick, P.; Nesvizhskii, A.; Ping, P.; Pontén, F.; Yang, L.; Yates, J. R.; Stein, S. E.; Hermjakob, H.; Kinsinger, C. R.; Apweiler, R. Recommendations from the 2008 International Summit on Proteomics Data Release and Sharing Policy: The Amsterdam Principles. J. Proteome Res. 2009, 8 (7), 3689–3692, DOI: 10.1021/pr900023z
- 4. Kinsinger, C. R.; Apffel, J.; Baker, M.; Bian, X.; Borchers, C. H.; Bradshaw, R.; Brusniak, M.-Y.; Chan, D. W.; Deutsch, E. W.; Domon, B.; Gorman, J.; Grimm, R.; Hancock, W.; Hermjakob, H.; Horn, D.; Hunter, C.; Kolar, P.; Kraus, H.-J.; Langen, H.; Linding, R.; Moritz, R. L.; Omenn, G. S.; Orlando, R.; Pandey, A.; Ping, P.; Rahbar, A.; Rivers, R.; Seymour, S. L.; Simpson, R. J.; Slotta, D.; Smith, R. D.; Stein, S. E.; Tabb, D. L.; Tagle, D.; Yates, J. R. I.; Rodriguez, H. Recommendations for Mass Spectrometry Data Quality Metrics for Open Access Data (Corollary to the Amsterdam Principles). Mol. Cell. Proteomics 2011, 10 (12), O111.015446, DOI: 10.1074/mcp.O111.015446
- 5. Rudnick, P. A.; Clauser, K. R.; Kilpatrick, L. E.; Tchekhovskoi, D. V.; Neta, P.; Blonder, N.; Billheimer, D. D.; Blackman, R. K.; Bunk, D. M.; Cardasis, H. L.; Ham, A.-J. L.; Jaffe, J. D.; Kinsinger, C. R.; Mesri, M.; Neubert, T. A.; Schilling, B.; Tabb, D. L.; Tegeler, T. J.; Vega-Montoto, L.; Variyath, A. M.; Wang, M.; Wang, P.; Whiteaker, J. R.; Zimmerman, L. J.; Carr, S. A.; Fisher, S. J.; Gibson, B. W.; Paulovich, A. G.; Regnier, F. E.; Rodriguez, H.; Spiegelman, C.; Tempst, P.; Liebler, D. C.; Stein, S. E. Performance Metrics for Liquid Chromatography-Tandem Mass Spectrometry Systems in Proteomics Analyses. Mol. Cell. Proteomics 2010, 9 (2), 225–241, DOI: 10.1074/mcp.M900223-MCP200
- 6. Ma, Z.-Q.; Polzin, K. O.; Dasari, S.; Chambers, M. C.; Schilling, B.; Gibson, B. W.; Tran, B. Q.; Vega-Montoto, L.; Liebler, D. C.; Tabb, D. L. QuaMeter: Multivendor Performance Metrics for LC-MS/MS Proteomics Instrumentation. Anal. Chem. 2012, 84 (14), 5845–5850, DOI: 10.1021/ac300629p
- 7. Pichler, P.; Mazanek, M.; Dusberger, F.; Weilnböck, L.; Huber, C. G.; Stingl, C.; Luider, T. M.; Straube, W. L.; Köcher, T.; Mechtler, K. SIMPATIQCO: A Server-Based Software Suite Which Facilitates Monitoring the Time Course of LC-MS Performance Metrics on Orbitrap Instruments. J. Proteome Res. 2012, 11 (11), 5540–5547, DOI: 10.1021/pr300163u
- 8. Bielow, C.; Mastrobuoni, G.; Kempa, S. Proteomics Quality Control: Quality Control Software for MaxQuant Results. J. Proteome Res. 2016, 15 (3), 777–787, DOI: 10.1021/acs.jproteome.5b00780
- 9. Chiva, C.; Olivella, R.; Borràs, E.; Espadas, G.; Pastor, O.; Solé, A.; Sabidó, E. QCloud: A Cloud-Based Quality Control System for Mass Spectrometry-Based Proteomics Laboratories. PLOS ONE 2018, 13 (1), e0189209, DOI: 10.1371/journal.pone.0189209
- 10. Broadhurst, D.; Goodacre, R.; Reinke, S. N.; Kuligowski, J.; Wilson, I. D.; Lewis, M. R.; Dunn, W. B. Guidelines and Considerations for the Use of System Suitability and Quality Control Samples in Mass Spectrometry Assays Applied in Untargeted Clinical Metabolomic Studies. Metabolomics 2018, 14 (6), 72, DOI: 10.1007/s11306-018-1367-3
- 11. Stratton, K. G.; Webb-Robertson, B.-J. M.; McCue, L. A.; Stanfill, B.; Claborne, D.; Godinez, I.; Johansen, T.; Thompson, A. M.; Burnum-Johnson, K. E.; Waters, K. M.; Bramer, L. M. pmartR: Quality Control and Statistics for Mass Spectrometry-Based Biological Data. J. Proteome Res. 2019, 18 (3), 1418–1425, DOI: 10.1021/acs.jproteome.8b00760
- 12. Naake, T.; Rainer, J.; Huber, W. MsQuality: An Interoperable Open-Source Package for the Calculation of Standardized Quality Metrics of Mass Spectrometry Data. Bioinformatics 2023, 39 (10), btad618, DOI: 10.1093/bioinformatics/btad618
- 13. Kirwan, J. A.; Gika, H.; Beger, R. D.; Bearden, D.; Dunn, W. B.; Goodacre, R.; Theodoridis, G.; Witting, M.; Yu, L.-R.; Wilson, I. D.; the metabolomics Quality Assurance and Quality Control Consortium (mQACC). Quality Assurance and Quality Control Reporting in Untargeted Metabolic Phenotyping: mQACC Recommendations for Analytical Quality Management. Metabolomics 2022, 18 (9), 70, DOI: 10.1007/s11306-022-01926-3
- 14. Köfeler, H. C.; Ahrends, R.; Baker, E. S.; Ekroos, K.; Han, X.; Hoffmann, N.; Holčapek, M.; Wenk, M. R.; Liebisch, G. Recommendations for Good Practice in MS-Based Lipidomics. J. Lipid Res. 2021, 62, 100138, DOI: 10.1016/j.jlr.2021.100138
- 15. McDonald, J. G.; Ejsing, C. S.; Kopczynski, D.; Holčapek, M.; Aoki, J.; Arita, M.; Arita, M.; Baker, E. S.; Bertrand-Michel, J.; Bowden, J. A.; Brügger, B.; Ellis, S. R.; Fedorova, M.; Griffiths, W. J.; Han, X.; Hartler, J.; Hoffmann, N.; Koelmel, J. P.; Köfeler, H. C.; Mitchell, T. W.; O’Donnell, V. B.; Saigusa, D.; Schwudke, D.; Shevchenko, A.; Ulmer, C. Z.; Wenk, M. R.; Witting, M.; Wolrab, D.; Xia, Y.; Ahrends, R.; Liebisch, G.; Ekroos, K. Introducing the Lipidomics Minimal Reporting Checklist. Nat. Metab. 2022, 4 (9), 1086–1088, DOI: 10.1038/s42255-022-00628-3
- 16. Bittremieux, W.; Valkenborg, D.; Martens, L.; Laukens, K. Computational Quality Control Tools for Mass Spectrometry Proteomics. PROTEOMICS 2017, 17 (3–4), 1600159, DOI: 10.1002/pmic.201600159
- 17. Bittremieux, W.; Walzer, M.; Tenzer, S.; Zhu, W.; Salek, R. M.; Eisenacher, M.; Tabb, D. L. The Human Proteome Organization-Proteomics Standards Initiative Quality Control Working Group: Making Quality Control More Accessible for Biological Mass Spectrometry. Anal. Chem. 2017, 89 (8), 4474–4479, DOI: 10.1021/acs.analchem.6b04310
- 18. Deutsch, E. W.; Vizcaíno, J. A.; Jones, A. R.; Binz, P.-A.; Lam, H.; Klein, J.; Bittremieux, W.; Perez-Riverol, Y.; Tabb, D. L.; Walzer, M.; Ricard-Blum, S.; Hermjakob, H.; Neumann, S.; Mak, T. D.; Kawano, S.; Mendoza, L.; Van Den Bossche, T.; Gabriels, R.; Bandeira, N.; Carver, J.; Pullman, B.; Sun, Z.; Hoffmann, N.; Shofstahl, J.; Zhu, Y.; Licata, L.; Quaglia, F.; Tosatto, S. C. E.; Orchard, S. E. Proteomics Standards Initiative at Twenty Years: Current Activities and Future Work. J. Proteome Res. 2023, 22 (2), 287–301, DOI: 10.1021/acs.jproteome.2c00637
- 19. Mayer, G.; Montecchi-Palazzi, L.; Ovelleiro, D.; Jones, A. R.; Binz, P.-A.; Deutsch, E. W.; Chambers, M.; Kallhardt, M.; Levander, F.; Shofstahl, J.; Orchard, S.; Vizcaino, J. A.; Hermjakob, H.; Stephan, C.; Meyer, H. E.; Eisenacher, M. The HUPO Proteomics Standards Initiative-Mass Spectrometry Controlled Vocabulary. Database 2013, 2013, bat009, DOI: 10.1093/database/bat009
- 20. Röst, H. L.; Schmitt, U.; Aebersold, R.; Malmström, L. pyOpenMS: A Python-Based Interface to the OpenMS Mass-Spectrometry Algorithm Library. PROTEOMICS 2014, 14 (1), 74–77, DOI: 10.1002/pmic.201300246
- 21. Levitsky, L. I.; Klein, J. A.; Ivanov, M. V.; Gorshkov, M. Pyteomics 4.0: Five Years of Development of a Python Proteomics Framework. J. Proteome Res. 2019, 18 (2), 709–714, DOI: 10.1021/acs.jproteome.8b00717
- 22. Huber, F.; Verhoeven, S.; Meijer, C.; Spreeuw, H.; Castilla, E.; Geng, C.; van der Hooft, J.; Rogers, S.; Belloum, A.; Diblen, F.; Spaaks, J. Matchms - Processing and Similarity Evaluation of Mass Spectrometry Data. J. Open Source Softw. 2020, 5 (52), 2411, DOI: 10.21105/joss.02411
- 23. Bittremieux, W.; Levitsky, L.; Pilz, M.; Sachsenberg, T.; Huber, F.; Wang, M.; Dorrestein, P. C. Unified and Standardized Mass Spectrometry Data Processing in Python Using Spectrum_utils. J. Proteome Res. 2023, 22 (2), 625–631, DOI: 10.1021/acs.jproteome.2c00632
- 24. Smith, C. A.; Want, E. J.; O’Maille, G.; Abagyan, R.; Siuzdak, G. XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification. Anal. Chem. 2006, 78 (3), 779–787, DOI: 10.1021/ac051437y
- 25. Gatto, L.; Gibb, S.; Rainer, J. MSnbase, Efficient and Elegant R-Based Processing and Visualization of Raw Mass Spectrometry Data. J. Proteome Res. 2021, 20 (1), 1063–1069, DOI: 10.1021/acs.jproteome.0c00313
- 26. Barsnes, H.; Vaudel, M.; Colaert, N.; Helsens, K.; Sickmann, A.; Berven, F. S.; Martens, L. Compomics-Utilities: An Open-Source Java Library for Computational Proteomics. BMC Bioinformatics 2011, 12 (1), 70, DOI: 10.1186/1471-2105-12-70
- 27. Schmid, R.; Heuckeroth, S.; Korf, A.; Smirnov, A.; Myers, O.; Dyrlund, T. S.; Bushuiev, R.; Murray, K. J.; Hoffmann, N.; Lu, M.; Sarvepalli, A.; Zhang, Z.; Fleischauer, M.; Dührkop, K.; Wesner, M.; Hoogstra, S. J.; Rudt, E.; Mokshyna, O.; Brungs, C.; Ponomarov, K.; Mutabdžija, L.; Damiani, T.; Pudney, C. J.; Earll, M.; Helmer, P. O.; Fallon, T. R.; Schulze, T.; Rivas-Ubach, A.; Bilbao, A.; Richter, H.; Nothias, L.-F.; Wang, M.; Orešič, M.; Weng, J.-K.; Böcker, S.; Jeibmann, A.; Hayen, H.; Karst, U.; Dorrestein, P. C.; Petras, D.; Du, X.; Pluskal, T. Integrative Analysis of Multimodal Mass Spectrometry Data in MZmine 3. Nat. Biotechnol. 2023, 41 (4), 447–449, DOI: 10.1038/s41587-023-01690-2
- 28. Martens, L.; Chambers, M.; Sturm, M.; Kessner, D.; Levander, F.; Shofstahl, J.; Tang, W. H.; Römpp, A.; Neumann, S.; Pizarro, A. D.; Montecchi-Palazzi, L.; Tasman, N.; Coleman, M.; Reisinger, F.; Souda, P.; Hermjakob, H.; Binz, P.-A.; Deutsch, E. W. mzML─a Community Standard for Mass Spectrometry Data. Mol. Cell. Proteomics 2011, 10 (1), R110.000133, DOI: 10.1074/mcp.R110.000133
- 29. Griss, J.; Jones, A. R.; Sachsenberg, T.; Walzer, M.; Gatto, L.; Hartler, J.; Thallinger, G. G.; Salek, R. M.; Steinbeck, C.; Neuhauser, N.; Cox, J.; Neumann, S.; Fan, J.; Reisinger, F.; Xu, Q.-W.; del Toro, N.; Perez-Riverol, Y.; Ghali, F.; Bandeira, N.; Xenarios, I.; Kohlbacher, O.; Vizcaíno, J. A.; Hermjakob, H. The mzTab Data Exchange Format: Communicating Mass-Spectrometry-Based Proteomics and Metabolomics Experimental Results to a Wider Audience. Mol. Cell. Proteomics 2014, 13 (10), 2765–2775, DOI: 10.1074/mcp.O113.036681
- 30. Deutsch, E. W.; Bandeira, N.; Sharma, V.; Perez-Riverol, Y.; Carver, J. J.; Kundu, D. J.; García-Seisdedos, D.; Jarnuczak, A. F.; Hewapathirana, S.; Pullman, B. S.; Wertz, J.; Sun, Z.; Kawano, S.; Okuda, S.; Watanabe, Y.; Hermjakob, H.; MacLean, B.; MacCoss, M. J.; Zhu, Y.; Ishihama, Y.; Vizcaíno, J. A. The ProteomeXchange Consortium in 2020: Enabling “big Data” Approaches in Proteomics. Nucleic Acids Res. 2019, 48 (D1), D1145–D1152, DOI: 10.1093/nar/gkz984
- 31. Marshall, S. A.; Young, R. B.; Lewis, J. M.; Rutten, E. L.; Gould, J.; Barlow, C. K.; Giogha, C.; Marcelino, V. R.; Fields, N.; Schittenhelm, R. B.; Hartland, E. L.; Scott, N. E.; Forster, S. C.; Gulliver, E. L. The Broccoli-Derived Antioxidant Sulforaphane Changes the Growth of Gastrointestinal Microbiota, Allowing for the Production of Anti-Inflammatory Metabolites. J. Funct. Foods 2023, 107, 105645, DOI: 10.1016/j.jff.2023.105645
- 32. Hulstaert, N.; Shofstahl, J.; Sachsenberg, T.; Walzer, M.; Barsnes, H.; Martens, L.; Perez-Riverol, Y. ThermoRawFileParser: Modular, Scalable, and Cross-Platform RAW File Conversion. J. Proteome Res. 2020, 19 (1), 537–542, DOI: 10.1021/acs.jproteome.9b00328
- 33. Park, C. Y.; Klammer, A. A.; Käll, L.; MacCoss, M. J.; Noble, W. S. Rapid and Accurate Peptide Identification from Tandem Mass Spectra. J. Proteome Res. 2008, 7 (7), 3022–3027, DOI: 10.1021/pr800127y
- 34. The UniProt Consortium. UniProt: A Hub for Protein Information. Nucleic Acids Res. 2015, 43 (D1), D204–D212, DOI: 10.1093/nar/gku989
- 35. Lin, A.; See, D.; Fondrie, W. E.; Keich, U.; Noble, W. S. Target-Decoy False Discovery Rate Estimation Using Crema. PROTEOMICS 2024, 24 (8), 2300084, DOI: 10.1002/pmic.202300084
- 36. Côté, R. G.; Reisinger, F.; Martens, L. jmzML, an Open-Source Java API for mzML, the PSI Standard for MS Data. PROTEOMICS 2010, 10 (7), 1332–1335, DOI: 10.1002/pmic.200900719
- 37. Pluskal, T.; Hoffmann, N.; Du, X.; Weng, J.-K. Mass Spectrometry Development Kit (MSDK): A Java Library for Mass Spectrometry Data Processing. In New Developments in Mass Spectrometry; Winkler, R., Ed.; Royal Society of Chemistry: Cambridge, 2020; pp 399–405, DOI: 10.1039/9781788019880-00399
- 38. McKinney, W. Data Structures for Statistical Computing in Python. In Proceedings of the 9th Python in Science Conference; van der Walt, S.; Millman, J., Eds.; Austin, Texas, USA, 2010; pp 51–56, DOI: 10.25080/Majora-92bf1922-00a
- 39. Kluyver, T.; Ragan-Kelley, B.; Pérez, F.; Granger, B.; Bussonnier, M.; Frederic, J.; Kelley, K.; Hamrick, J.; Grout, J.; Corlay, S.; Ivanov, P.; Avila, D.; Abdalla, S.; Willing, C.; Jupyter Development Team. Jupyter Notebooks - A Publishing Format for Reproducible Computational Workflows. In Positioning and Power in Academic Publishing: Players, Agents and Agendas; IOS Press, 2016; pp 87–90.
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/jasms.4c00174.
Supplementary Table 1: List of QC metrics considered during the analysis and their accession numbers in the PSI-MS controlled vocabulary (PDF)