Large-Scale Reanalysis of Publicly Available HeLa Cell Proteomics Data in the Context of the Human Proteome Project
- Thibault Robin: CALIPHO Group, SIB Swiss Institute of Bioinformatics, CMU, Rue Michel-Servet 1, CH-1211 Geneva, Switzerland; Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CMU, Rue Michel-Servet 1, CH-1211 Geneva, Switzerland; Computer Science Department, University of Geneva, CH-1211 Geneva, Switzerland; Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CH-1211 Geneva, Switzerland
- Amos Bairoch: CALIPHO Group, SIB Swiss Institute of Bioinformatics, CMU, Rue Michel-Servet 1, CH-1211 Geneva, Switzerland; Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CH-1211 Geneva, Switzerland
- Markus Müller: Vital-IT Group, SIB Swiss Institute of Bioinformatics, Genopode Building, Quartier Sorge, CH-1015 Lausanne, Switzerland
- Frédérique Lisacek: Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CMU, Rue Michel-Servet 1, CH-1211 Geneva, Switzerland; Computer Science Department, University of Geneva, CH-1211 Geneva, Switzerland; Section of Biology, University of Geneva, CH-1211 Geneva, Switzerland
- Lydie Lane*: CALIPHO Group, SIB Swiss Institute of Bioinformatics, CMU, Rue Michel-Servet 1, CH-1211 Geneva, Switzerland; Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CH-1211 Geneva, Switzerland. *E-mail: [email protected]. Tel: +41 (0) 22 379 58 41.
Abstract
The practice of data sharing in the proteomics field took off and quickly spread in recent years as a result of collective effort. Nowadays, most journal editors mandate the submission of the original raw mass spectra to one of the databases of the ProteomeXchange consortium. With the exception of large institutional initiatives such as PeptideAtlas or the GPMDB, however, few new studies are based on the reanalysis of mass spectrometry data. A wealth of information is thus left unexploited in public databases and repositories. Here, we present the large-scale reanalysis, using a custom workflow, of 41 publicly available data sets corresponding to experiments carried out on the HeLa cancer cell line. In addition to the search for new post-translational modification sites and “missing proteins”, our main goal is to identify single amino acid variants and evaluate their impact on protein expression and stability through the spectral counting quantification approach. The X!Tandem software was selected to search a total of 56 363 701 tandem mass spectra against a customized variant protein database, compiled by applying the in-house MzVar tool to HeLa-specific somatic and genomic variants retrieved from the COSMIC cell line project. After filtering the resulting identifications with a 1% FDR threshold computed at the protein level, 49 466 unique peptides were identified in 7266 protein entries, allowing the validation of 5576 protein entries in accordance with the HPP guidelines version 2.1. A new “missing protein” was observed (FRAT2, NX_O75474, chromosome 10), and 189 new phosphorylation sites and 392 new protein N-terminal acetylation sites were identified. Twenty-four variant peptides were also identified, corresponding to 21 variants in 21 proteins. For three of the nine heterozygous cases where both the variant peptide and its wild-type counterpart were detected, a two-tailed sign test showed a significant difference in the abundance of the two peptide versions.
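As an illustration of the kind of comparison described above, the following Python sketch runs an exact two-tailed sign test on paired spectral counts of a variant peptide and its wild-type counterpart. The function name, the choice to drop tied pairs, and the example counts are assumptions made for illustration only; they are not taken from the study, whose exact procedure is not given in this abstract.

```python
from math import comb


def two_tailed_sign_test(pairs):
    """Exact two-tailed sign test on paired observations.

    `pairs` is a list of (variant_count, wild_type_count) tuples,
    e.g. one pair per data set. Tied pairs are dropped (an assumption
    of this sketch). Returns the two-tailed p-value under the null
    hypothesis that either sign is equally likely.
    """
    n_pos = sum(1 for v, w in pairs if v > w)
    n_neg = sum(1 for v, w in pairs if v < w)
    n = n_pos + n_neg
    if n == 0:
        return 1.0  # no informative pairs
    k = min(n_pos, n_neg)
    # One-sided binomial tail P(X <= k) with X ~ Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical spectral counts (variant, wild type), one pair per data set
counts = [(3, 9), (2, 7), (5, 11), (1, 6), (4, 4),
          (0, 5), (2, 8), (6, 3), (1, 4), (3, 10)]
p_value = two_tailed_sign_test(counts)
print(f"two-tailed sign test p-value: {p_value:.4f}")
```

The sign test only uses the direction of each paired difference, not its size, which makes it a conservative choice for low and noisy spectral counts.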