
EPIFANY: A Method for Efficient High-Confidence Protein Inference

  • Julianus Pfeuffer*
    Applied Bioinformatics, Department of Computer Science, University of Tübingen, 72076 Tübingen, Germany
    Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
    Algorithmic Bioinformatics, Department of Bioinformatics, Freie Universität Berlin, 14195 Berlin, Germany
    *E-mail: [email protected] (J.P.).
  • Timo Sachsenberg
    Applied Bioinformatics, Department of Computer Science, University of Tübingen, 72076 Tübingen, Germany
    Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
  • Tjeerd M. H. Dijkstra
    Biomolecular Interactions, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
  • Oliver Serang*
    Department of Computer Science, University of Montana, Missoula, Montana 59812, United States
    *E-mail: [email protected] (O.S.).
  • Knut Reinert
    Algorithmic Bioinformatics, Department of Bioinformatics, Freie Universität Berlin, 14195 Berlin, Germany
  • Oliver Kohlbacher*
    Biomolecular Interactions, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
    Applied Bioinformatics, Department of Computer Science, University of Tübingen, 72076 Tübingen, Germany
    Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
    Institute for Translational Bioinformatics, University Hospital Tübingen, 72076 Tübingen, Germany
    Quantitative Biology Center, University of Tübingen, 72076 Tübingen, Germany
    *E-mail: [email protected] (O.K.).
Cite this: J. Proteome Res. 2020, 19, 3, 1060–1072
Publication Date (Web): January 24, 2020
https://doi.org/10.1021/acs.jproteome.9b00566
Copyright © 2020 American Chemical Society


    Abstract


    Accurate protein inference in the presence of shared peptides is still one of the key problems in bottom-up proteomics. Most protein inference tools employing simple heuristic inference strategies are efficient but exhibit reduced accuracy. More advanced probabilistic methods often exhibit better inference quality but tend to be too slow for large data sets. Here, we present a novel protein inference method, EPIFANY, combining a loopy belief propagation algorithm with convolution trees for efficient processing of Bayesian networks. We demonstrate that EPIFANY combines the reliable protein inference of Bayesian methods with significantly shorter runtimes. On the 2016 iPRG protein inference benchmark data, EPIFANY is the only tested method that finds all true-positive proteins at a 5% protein false discovery rate (FDR) without strict prefiltering on the peptide-spectrum match (PSM) level, yielding an increase in identification performance (+10% in the number of true positives and +14% in partial AUC) compared to previous approaches. Even very large data sets with hundreds of thousands of spectra (which are intractable with other Bayesian and some non-Bayesian tools) can be processed with EPIFANY within minutes. The increased inference quality including shared peptides results in better protein inference results and thus increased robustness of the biological hypotheses generated. EPIFANY is available as open-source software for all major platforms at https://OpenMS.de/epifany.
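The convolution-tree idea named in the abstract can be illustrated with a minimal sketch. This is not EPIFANY's implementation. It only shows the general technique of obtaining the distribution of a sum of independent discrete variables by convolving their distributions pairwise in a balanced tree. The function names (`bernoulli`, `convolve`, `sum_distribution`) and the example probabilities are hypothetical, and a naive quadratic convolution is assumed where a production implementation would more likely use a faster one.

```python
from typing import List, Sequence


def bernoulli(p: float) -> List[float]:
    """PMF of a 0/1 variable: index 0 is P(absent), index 1 is P(present)."""
    return [1.0 - p, p]


def convolve(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Naive convolution: PMF of X + Y for independent X ~ a and Y ~ b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def sum_distribution(pmfs: Sequence[Sequence[float]]) -> List[float]:
    """PMF of the sum of independent variables, merged pairwise layer by layer."""
    if not pmfs:
        return [1.0]  # an empty sum is 0 with certainty
    layer = [list(p) for p in pmfs]
    while len(layer) > 1:
        merged = []
        # Convolve neighbouring pairs; this forms one level of the tree.
        for k in range(0, len(layer) - 1, 2):
            merged.append(convolve(layer[k], layer[k + 1]))
        # An unpaired last node is carried up unchanged.
        if len(layer) % 2 == 1:
            merged.append(layer[-1])
        layer = merged
    return layer[0]


# Hypothetical example: three candidate proteins with assumed presence probabilities.
counts = sum_distribution([bernoulli(0.2), bernoulli(0.5), bernoulli(0.9)])
```

Here `counts[k]` is the probability that exactly `k` of the three hypothetical proteins are present. The tree layout keeps the intermediate distributions at each node, which is the property that makes this structure useful inside message-passing schemes.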


    Supporting Information


    The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jproteome.9b00566.

    • extendedGridAUCsGreedyB001.xlsx: Supplementary Table S5 with actual entrapment AUCs of an extended grid search of our parameters (XLSX)

    • scripts_and_workflows.zip: Supplementary Data 1 with a collection of scripts and workflows used in the preparation of this manuscript (ZIP)

    • Tables S1 (Comet) and S2 (MSGFPlus), extracted search engine-specific features for Percolator; Figures S1–S11, more detailed information on the iPRG2016 results for different samples and PSM cutoffs; Figures S12–S16, guided example of the message passing algorithm; Figures S17 and S18, visualization of potentially oscillating parts of a graph; Figure S19, effect of differently shuffled decoy databases on EPIFANY’s parameter estimation; Tables S3 and S4, runtimes of the different methods on the iPRG2016 A + B sample for a measure of scalability; Figure S20, runtimes versus sizes of connected components; Figure S21, alternative results on the 1% protein entrapment FDR level; detailed descriptions of the model, algorithm, convergence, and parameter estimation; summarization of Figures S1–S11 (PDF)


