Large-Scale Computational Mass Spectrometry and Multi-Omics
Editorial
Challenges in Large-Scale Computational Mass Spectrometry and Multiomics
Oliver Kohlbacher*, Olga Vitek*, and Susan T. Weintraub*
This publication is free to access through this site. Learn More
Perspectives
From Correlation to Causality: Statistical Approaches to Learning Regulatory Relationships in Large-Scale Biomolecular Investigations
Robert O. Ness, Karen Sachs, and Olga Vitek*
This publication is Open Access under the license indicated. Learn More
ACS Editors' Choice® is a collection designed to feature scientific articles of broad public interest. Read the latest articles
Causal inference, the task of uncovering regulatory relationships between components of biomolecular pathways and networks, is a primary goal of many high-throughput investigations. Statistical associations between observed protein concentrations can suggest an enticing number of hypotheses regarding the underlying causal interactions, but when do such associations reflect the underlying causal biomolecular mechanisms? The goal of this perspective is to provide suggestions for causal inference in large-scale experiments, which utilize high-throughput technologies such as mass-spectrometry-based proteomics. We describe in nontechnical terms the pitfalls of inference in large data sets and suggest methods to overcome these pitfalls and reliably find regulatory associations.
Research Articles
Reproducibility of Differential Proteomic Technologies in CPTAC Fractionated Xenografts
David L. Tabb*, Xia Wang, Steven A. Carr, Karl R. Clauser, Philipp Mertins, Matthew C. Chambers, Jerry D. Holman, Jing Wang, Bing Zhang, Lisa J. Zimmerman, Xian Chen, Harsha P. Gunawardena, Sherri R. Davies, Matthew J. C. Ellis, Shunqiang Li, R. Reid Townsend, Emily S. Boja, Karen A. Ketchum, Christopher R. Kinsinger, Mehdi Mesri, Henry Rodriguez, Tao Liu, Sangtae Kim, Jason E. McDermott, Samuel H. Payne, Vladislav A. Petyuk, Karin D. Rodland, Richard D. Smith, Feng Yang, Daniel W. Chan, Bai Zhang, Hui Zhang, Zhen Zhang, Jian-Ying Zhou, and Daniel C. Liebler
This publication is Open Access under the license indicated. Learn More
The NCI Clinical Proteomic Tumor Analysis Consortium (CPTAC) employed a pair of reference xenograft proteomes for initial platform validation and ongoing quality control of its data collection for The Cancer Genome Atlas (TCGA) tumors. These two xenografts, representing basal and luminal-B human breast cancer, were fractionated and analyzed on six mass spectrometers in a total of 46 replicates divided between iTRAQ and label-free technologies, spanning a total of 1095 LC–MS/MS experiments. These data represent a unique opportunity to evaluate the stability of proteomic differentiation by mass spectrometry over many months of time for individual instruments or across instruments running dissimilar workflows. We evaluated iTRAQ reporter ions, label-free spectral counts, and label-free extracted ion chromatograms as strategies for data interpretation (source code is available from http://homepages.uc.edu/~wang2x7/Research.htm). From these assessments, we found that differential genes from a single replicate were confirmed by other replicates on the same instrument from 61 to 93% of the time. When comparing across different instruments and quantitative technologies, using multiple replicates, differential genes were reproduced by other data sets from 67 to 99% of the time. Projecting gene differences to biological pathways and networks increased the degree of similarity. These overlaps send an encouraging message about the maturity of technologies for proteomic differentiation.
Pladipus Enables Universal Distributed Computing in Proteomics Bioinformatics
Kenneth Verheggen, Davy Maddelein, Niels Hulstaert, Lennart Martens*, Harald Barsnes, and Marc Vaudel
This publication is Open Access under the license indicated. Learn More
The use of proteomics bioinformatics substantially contributes to an improved understanding of proteomes, but this novel and in-depth knowledge comes at the cost of increased computational complexity. Parallelization across multiple computers, a strategy termed distributed computing, can be used to handle this increased complexity; however, setting up and maintaining a distributed computing infrastructure requires resources and skills that are not readily available to most research groups. Here we propose a free and open-source framework named Pladipus that greatly facilitates the establishment of distributed computing networks for proteomics bioinformatics tools. Pladipus is straightforward to install and operate thanks to its user-friendly graphical interface, allowing complex bioinformatics tasks to be run easily on a network instead of a single computer. As a result, any researcher can benefit from the increased computational efficiency provided by distributed computing, hence empowering them to tackle more complex bioinformatics challenges. Notably, it enables any research group to perform large-scale reprocessing of publicly available proteomics data, thus supporting the scientific community in mining these data for novel discoveries.
MaRaCluster: A Fragment Rarity Metric for Clustering Fragment Spectra in Shotgun Proteomics
Matthew The and Lukas Käll*
This publication is Open Access under the license indicated. Learn More
Shotgun proteomics experiments generate large amounts of fragment spectra as primary data, normally with high redundancy between and within experiments. Here, we have devised a clustering technique to identify fragment spectra stemming from the same species of peptide. This is a powerful alternative method to traditional search engines for analyzing spectra, specifically useful for larger scale mass spectrometry studies. As an aid in this process, we propose a distance calculation relying on the rarity of experimental fragment peaks, following the intuition that peaks shared by only a few spectra offer more evidence than peaks shared by a large number of spectra. We used this distance calculation and a complete-linkage scheme to cluster data from a recent large-scale mass spectrometry-based study. The clusterings produced by our method have up to 40% more identified peptides for their consensus spectra compared to those produced by the previous state-of-the-art method. We see that our method would advance the construction of spectral libraries as well as serve as a tool for mining large sets of fragment spectra. The source code and Ubuntu binary packages are available at https://github.com/statisticalbiotechnology/maracluster (under an Apache 2.0 license).
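The rarity intuition in this abstract can be illustrated with a small sketch. It is not MaRaCluster's actual metric or code: spectra are assumed to be reduced to sets of integer peak bins, the weight `-log(fraction of spectra containing the peak)` and the weighted Jaccard distance are stand-ins chosen for illustration, and all function names and the threshold are hypothetical.

```python
import math
from itertools import combinations


def rarity_weights(spectra):
    """Weight each peak bin by -log of the (smoothed) fraction of spectra containing it."""
    n = len(spectra)
    counts = {}
    for peaks in spectra:
        for p in peaks:
            counts[p] = counts.get(p, 0) + 1
    # n + 1 in the denominator keeps every weight strictly positive (assumption).
    return {p: -math.log(c / (n + 1)) for p, c in counts.items()}


def rarity_distance(a, b, weights):
    """One minus the rarity-weighted Jaccard similarity of two peak sets."""
    shared = sum(weights[p] for p in a & b)
    total = sum(weights[p] for p in a | b)
    return 1.0 - shared / total if total else 1.0


def complete_linkage(spectra, threshold):
    """Merge clusters while the largest pairwise distance between them is <= threshold."""
    weights = rarity_weights(spectra)
    clusters = [[i] for i in range(len(spectra))]

    def linkage(c1, c2):
        # Complete linkage: cluster distance is the maximum member-to-member distance.
        return max(rarity_distance(spectra[i], spectra[j], weights) for i in c1 for j in c2)

    while len(clusters) > 1:
        (x, y), d = min(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return sorted(sorted(c) for c in clusters)


# Toy data: bins 100 and 200 occur in every spectrum, so they carry little weight.
spectra = [
    {100, 200, 305, 410},
    {100, 200, 305, 411},
    {100, 200, 520, 630},
    {100, 200, 520, 631},
]
clusters = complete_linkage(spectra, threshold=0.8)
```

In this toy example the ubiquitous bins contribute little to the similarity, so the spectra group by the rarer shared bins 305 and 520.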
Mining Large Scale Tandem Mass Spectrometry Data for Protein Modifications Using Spectral Libraries
Oliver Horlacher, Frederique Lisacek*, and Markus Müller*
This publication is Open Access under the license indicated. Learn More
Experimental improvements in post-translational modification (PTM) detection by tandem mass spectrometry (MS/MS) have allowed the identification of vast numbers of PTMs. Open modification searches (OMSs) of MS/MS data, which do not require prior knowledge of the modifications present in the sample, further increased the diversity of detected PTMs. Despite much effort, there is still a lack of functional annotation of PTMs. One possibility to narrow the annotation gap is to mine MS/MS data deposited in public repositories and to correlate the PTM presence with biological meta-information attached to the data. Since the data volume can be quite substantial and contain tens of millions of MS/MS spectra, the data mining tools must be able to cope with big data. Here, we present two tools, Liberator and MzMod, which are built using the MzJava class library and the Apache Spark large-scale computing framework. Liberator builds large MS/MS spectrum libraries, and MzMod searches them in an OMS mode. We applied these tools to a recently published set of 25 million spectra from 30 human tissues and present tissue-specific PTMs. We also compared the results to the ones obtained with the OMS tool MODa and the search engine X!Tandem.
Application of de Novo Sequencing to Large-Scale Complex Proteomics Data Sets
Arun Devabhaktuni and Joshua E. Elias*
This publication is Open Access under the license indicated. Learn More
Dependent on concise, predefined protein sequence databases, traditional search algorithms perform poorly when analyzing mass spectra derived from wholly uncharacterized protein products. Conversely, de novo peptide sequencing algorithms can interpret mass spectra without relying on reference databases. However, such algorithms have been difficult to apply to complex protein mixtures, in part due to a lack of methods for automatically validating de novo sequencing results. Here, we present novel metrics for benchmarking de novo sequencing algorithm performance on large-scale proteomics data sets and present a method for accurately calibrating false discovery rates on de novo results. We also present a novel algorithm (LADS) that leverages experimentally disambiguated fragmentation spectra to boost sequencing accuracy and sensitivity. LADS improves sequencing accuracy on longer peptides relative to that of other algorithms and improves discriminability of correct and incorrect sequences. Using these advancements, we demonstrate accurate de novo identification of peptide sequences not identifiable using database search-based approaches.
New Method for Joint Network Analysis Reveals Common and Different Coexpression Patterns among Genes and Proteins in Breast Cancer
Francesca Petralia, Won-Min Song, Zhidong Tu*, and Pei Wang*
This publication is Open Access under the license indicated. Learn More
We focus on characterizing common and different coexpression patterns among RNAs and proteins in breast cancer tumors. To address this problem, we introduce Joint Random Forest (JRF), a novel nonparametric algorithm to simultaneously estimate multiple coexpression networks by effectively borrowing information across protein and gene expression data. The performance of JRF was evaluated through extensive simulation studies using different network topologies and data distribution functions. Advantages of JRF over other algorithms that estimate class-specific networks separately were observed across all simulation settings. JRF also outperformed a competing method based on Gaussian graphical models. We then applied JRF to simultaneously construct gene and protein coexpression networks based on protein and RNAseq data from the CPTAC-TCGA breast cancer study. We identified interesting common and differential coexpression patterns among genes and proteins. This information can help to cast light on the potential disease mechanisms of breast cancer.
moCluster: Identifying Joint Patterns Across Multiple Omics Data Sets
Chen Meng, Dominic Helm, Martin Frejno, and Bernhard Kuster*
This publication is Open Access under the license indicated. Learn More
Increasingly, multiple omics approaches are being applied to understand the complexity of biological systems. Yet, computational approaches that enable the efficient integration of such data are not well developed. Here, we describe a novel algorithm, termed moCluster, which discovers joint patterns among multiple omics data. The method first employs a multiblock multivariate analysis to define a set of latent variables representing joint patterns across input data sets, which is further passed to an ordinary clustering algorithm in order to discover joint clusters. Using simulated data, we show that moCluster’s performance is not compromised by issues present in iCluster/iCluster+ (notably, the nondeterministic solution) and that it operates 100× to 1000× faster than iCluster/iCluster+. We used moCluster to cluster proteomic and transcriptomic data from the NCI-60 cell line panel. The resulting cluster model revealed different phenotypes across cellular subtypes, such as doubling time and drug response. Applying moCluster to methylation, mRNA, and protein data from a large study on colorectal cancer patients identified four molecular subtypes, including one characterized by microsatellite instability and high expression of genes/proteins involved in immunity, such as PDL1, a target of multiple drugs currently in development. The other three subtypes have not been discovered before using single data sets, which clearly illustrates the molecular complexity of oncogenesis and the need for holistic, multidata analysis strategies.
Integrative Omics Analysis Reveals Post-Transcriptionally Enhanced Protective Host Response in Colorectal Cancers with Microsatellite Instability
Qi Liu and Bing Zhang*
This publication is Open Access under the license indicated. Learn More
Microsatellite instability (MSI) is a frequent and clinically relevant molecular phenotype in colorectal cancer. MSI cancers have favorable survival compared with microsatellite stable cancers (MSS), possibly due to the pronounced tumor-infiltrating lymphocytes observed in MSI cancers. Consistent with the strong immune response that MSI cancers trigger in the host, previous transcriptome expression studies have identified mRNA signatures characteristic of immune response in MSI cancers. However, proteomics features of MSI cancers and the extent to which the mRNA signatures are reflected at the protein level remain largely unknown. Here, we performed a comprehensive comparison of global proteomics profiles between MSI and MSS colorectal cancers in The Cancer Genome Atlas (TCGA) cohort. We found that protein signatures of MSI are also associated with increased immunogenicity. To reliably quantify post-transcription regulation in MSI cancers, we developed a resampling-based regression method by integrative modeling of transcriptomics and proteomics data sets. Compared with the popular simple method, which detects post-transcriptional regulation by either identifying genes differentially expressed at the mRNA level but not at the protein level or vice versa, our method provided a quantitative, more sensitive, and accurate way to identify genes subject to differential post-transcriptional regulation. With this method, we demonstrated that post-transcriptional regulation, coordinating protein expression with key players, initiates de novo and enhances protective host response in MSI cancers.
Technical Notes
Proteomics Quality Control: Quality Control Software for MaxQuant Results
Chris Bielow*, Guido Mastrobuoni, and Stefan Kempa*
This publication is Open Access under the license indicated. Learn More
Mass spectrometry-based proteomics coupled to liquid chromatography has matured into an automatized, high-throughput technology, producing data on the scale of multiple gigabytes per instrument per day. Consequently, an automated quality control (QC) and quality analysis (QA) capable of detecting measurement bias, verifying consistency, and avoiding propagation of error is paramount for instrument operators and scientists in charge of downstream analysis. We have developed an R-based QC pipeline called Proteomics Quality Control (PTXQC) for bottom-up LC–MS data generated by the MaxQuant [1] software pipeline. PTXQC creates a QC report containing a comprehensive and powerful set of QC metrics, augmented with automated scoring functions. The automated scores are collated to create an overview heatmap at the beginning of the report, giving valuable guidance also to nonspecialists. Our software supports a wide range of experimental designs, including stable isotope labeling by amino acids in cell culture (SILAC), tandem mass tags (TMT), and label-free data. Furthermore, we introduce new metrics to score MaxQuant’s Match-between-runs (MBR) functionality by which peptide identifications can be transferred across Raw files based on accurate retention time and m/z. Last but not least, PTXQC is easy to install and use and represents the first QC software capable of processing MaxQuant result tables. PTXQC is freely available at https://github.com/cbielow/PTXQC.
Ursgal, Universal Python Module Combining Common Bottom-Up Proteomics Tools for Large-Scale Analysis
Lukas P. M. Kremer, Johannes Leufken, Purevdulam Oyunchimeg, Stefan Schulze, and Christian Fufezan*
This publication is Open Access under the license indicated. Learn More
Proteomics data integration has become a broad field with a variety of programs offering innovative algorithms to analyze increasing amounts of data. Unfortunately, this software diversity leads to many problems as soon as the data is analyzed using more than one algorithm for the same task. Although it was shown that the combination of multiple peptide identification algorithms yields more robust results [1−3], it is only recently that unified approaches are emerging [4,5]; however, workflows that, for example, aim to optimize search parameters or that employ cascaded style searches [6] can only be made accessible if data analysis becomes not only unified but also, most importantly, scriptable. Here we introduce Ursgal, a Python interface to many commonly used bottom-up proteomics tools and to additional auxiliary programs. Complex workflows can thus be composed in the Python scripting language with a few lines of code. Ursgal is easily extensible, and we have made several database search engines (X!Tandem [7], OMSSA [8], MS-GF+ [9], Myrimatch [10], MS Amanda [11]), statistical postprocessing algorithms (qvality [12], Percolator [13]), and one algorithm that combines statistically postprocessed outputs from multiple search engines (“combined FDR” [14]) accessible as an interface in Python. Furthermore, we have implemented a new algorithm (“combined PEP”) that combines multiple search engines employing elements of “combined FDR” [14], PeptideShaker [2], and Bayes’ theorem.
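As a rough illustration of how Bayes’ theorem can merge posterior error probabilities (PEPs) reported by several search engines for the same peptide-spectrum match, the sketch below applies a naive Bayes combination. It is not taken from Ursgal and does not use its API: the function name `combine_peps`, the assumption that engines err independently, and the equal prior for correct and incorrect matches are all illustrative assumptions.

```python
from math import prod


def combine_peps(peps):
    """Combine per-engine PEPs for one match, assuming independent engines and equal priors."""
    if not peps:
        raise ValueError("at least one PEP is required")
    if any(not 0.0 <= p <= 1.0 for p in peps):
        raise ValueError("each PEP must lie in [0, 1]")
    all_wrong = prod(peps)                       # every engine is wrong about the match
    all_right = prod(1.0 - p for p in peps)      # every engine is right about the match
    if all_wrong + all_right == 0.0:
        raise ValueError("contradictory inputs: PEPs of both 0 and 1")
    return all_wrong / (all_wrong + all_right)


# Two engines that each consider the match fairly reliable reinforce one another.
combined = combine_peps([0.1, 0.2])
```

Under these assumptions, agreement between engines lowers the combined error probability below either individual value, whereas a PEP of 0.5 from one engine leaves the other engine's estimate unchanged.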
PGx: Putting Peptides to BED
Manor Askenazi*, Kelly V. Ruggles, and David Fenyö*
This publication is Open Access under the license indicated. Learn More
Every molecular player in the cast of biology’s central dogma is being sequenced and quantified with increasing ease and coverage. To bring the resulting genomic, transcriptomic, and proteomic data sets into coherence, tools must be developed that do not constrain data acquisition and analytics in any way but rather provide simple links across previously acquired data sets with minimal preprocessing and hassle. Here we present such a tool: PGx, which supports proteogenomic integration of mass spectrometry proteomics data with next-generation sequencing by mapping identified peptides onto their putative genomic coordinates.
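The idea of mapping an identified peptide onto genomic coordinates can be sketched as follows. This is not PGx's implementation: it assumes a plus-strand transcript whose coding exons are given as 0-based half-open genomic intervals starting at the first codon, and the function names and example coordinates are hypothetical. The output follows the standard 12-column BED layout.

```python
def peptide_to_genomic_blocks(coding_exons, pep_start, pep_len):
    """Map a peptide (0-based residue offset and length) to genomic blocks.

    coding_exons: list of (start, end) genomic intervals, 0-based half-open,
    plus strand, in transcript order, covering only coding sequence (assumption).
    """
    cds_start = 3 * pep_start              # first nucleotide of the peptide in the CDS
    cds_end = cds_start + 3 * pep_len      # one past its last nucleotide
    blocks = []
    offset = 0                             # CDS position of the current exon's first base
    for ex_start, ex_end in coding_exons:
        ex_len = ex_end - ex_start
        lo = max(cds_start, offset)
        hi = min(cds_end, offset + ex_len)
        if lo < hi:                        # the peptide overlaps this exon
            blocks.append((ex_start + lo - offset, ex_start + hi - offset))
        offset += ex_len
    if offset < cds_end:
        raise ValueError("peptide extends past the coding sequence")
    return blocks


def to_bed12(chrom, blocks, name):
    """Format the blocks as one tab-separated BED12 line on the plus strand."""
    start, end = blocks[0][0], blocks[-1][1]
    sizes = ",".join(str(e - s) for s, e in blocks)
    starts = ",".join(str(s - start) for s, _ in blocks)
    fields = [chrom, start, end, name, 0, "+", start, end, 0, len(blocks), sizes, starts]
    return "\t".join(str(f) for f in fields)


# A 5-residue peptide starting at residue 8 spans the junction between two exons.
exons = [(1000, 1030), (2000, 2060)]
blocks = peptide_to_genomic_blocks(exons, pep_start=8, pep_len=5)
bed_line = to_bed12("chr1", blocks, "PEPTIDE")
```

A peptide that crosses a splice junction yields two blocks, which is the case a simple start/end interval could not represent.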
Human Proteomic Variation Revealed by Combining RNA-Seq Proteogenomics and Global Post-Translational Modification (G-PTM) Search Strategy
Anthony J. Cesnik, Michael R. Shortreed, Gloria M. Sheynkman, Brian L. Frey, and Lloyd M. Smith*
This publication is Open Access under the license indicated. Learn More
Mass-spectrometry-based proteomic analysis underestimates proteomic variation due to the absence of variant peptides and posttranslational modifications (PTMs) from standard protein databases. Each individual carries thousands of missense mutations that lead to single amino acid variants, but these are missed because they are absent from generic proteomic search databases. Myriad types of protein PTMs play essential roles in biological processes but remain undetected because of increased false discovery rates in variable modification searches. We address these two fundamental shortcomings of bottom-up proteomics with two recently developed software tools. The first consists of workflows in Galaxy that mine RNA sequencing data to generate sample-specific databases containing variant peptides and products of alternative splicing events. The second tool applies a new strategy that alters the variable modification approach to consider only curated PTMs at specific positions, thereby avoiding the combinatorial explosion that traditionally leads to high false discovery rates. Using RNA-sequencing-derived databases with this Global Post-Translational Modification (G-PTM) search strategy revealed hundreds of single amino acid variant peptides, tens of novel splice junction peptides, and several hundred posttranslationally modified peptides in each of ten human cell lines.
Letters
Testing and Validation of Computational Methods for Mass Spectrometry
Laurent Gatto, Kasper D. Hansen, Michael R. Hoopmann, Henning Hermjakob, Oliver Kohlbacher, and Andreas Beyer*
This publication is Open Access under the license indicated. Learn More
High-throughput methods based on mass spectrometry (proteomics, metabolomics, lipidomics, etc.) produce a wealth of data that cannot be analyzed without computational methods. The impact of the choice of method on the overall result of a biological study is often underappreciated, but different methods can result in very different biological findings. It is thus essential to evaluate and compare the correctness and relative performance of computational methods. The volume of the data as well as the complexity of the algorithms render unbiased comparisons challenging. This paper discusses some problems and challenges in testing and validation of computational methods. We discuss the different types of data (simulated and experimental validation data) as well as different metrics to compare methods. We also introduce a new public repository for mass spectrometric reference data sets (http://compms.org/RefData) that contains a collection of publicly available data sets for performance evaluation for a wide range of different methods.
Research Articles
Serum Metabolite Profiles Are Altered by Erlotinib Treatment and the Integrin α1-Null Genotype but Not by Post-Traumatic Osteoarthritis
Beata Mickiewicz, Sung Y. Shin, Ambra Pozzi, Hans J. Vogel, and Andrea L. Clark*
The risk of developing post-traumatic osteoarthritis (PTOA) following joint injury is high. Furthering our understanding of the molecular mechanisms underlying PTOA and/or identifying novel biomarkers for early detection may help to improve treatment outcomes. Increased expression of integrin α1β1 and inhibition of epidermal growth factor receptor (EGFR) signaling protect the knee from spontaneous OA; however, the impact of the integrin α1β1/EGFR axis on PTOA is currently unknown. We sought to determine metabolic changes in serum samples collected from wild-type and integrin α1-null mice that underwent surgery to destabilize the medial meniscus and were treated with the EGFR inhibitor erlotinib. Following 1H nuclear magnetic resonance spectroscopy, we generated multivariate statistical models that distinguished between the metabolic profiles of erlotinib- versus vehicle-treated mice and the integrin α1-null versus wild-type mouse genotype. Our results show the sex-dependent effects of erlotinib treatment and highlight glutamine as a metabolite that counteracts this treatment. Furthermore, we identified a set of metabolites associated with increased reactive oxygen species production, susceptibility to OA, and regulation of TRP channels in α1-null mice. Our study indicates that systemic pharmacological and genetic factors have a greater effect on serum metabolic profiles than site-specific factors such as surgery.
Interactions between the Powdery Mildew Effector BEC1054 and Barley Proteins Identify Candidate Host Targets
Helen G. Pennington, Dana M. Gheorghe, Annabelle Damerum, Clara Pliego, Pietro D. Spanu, Rainer Cramer, and Laurence V. Bindschedler*
This publication is Open Access under the license indicated. Learn More
There are over 500 candidate secreted effector proteins (CSEPs) or Blumeria effector candidates (BECs) specific to the barley powdery mildew pathogen Blumeria graminis f.sp. hordei. The CSEP/BEC proteins are expressed and predicted to be secreted by biotrophic feeding structures called haustoria. Eight BECs are required for the formation of functional haustoria. These include the RNase-like effector BEC1054 (synonym CSEP0064). In order to identify host proteins targeted by BEC1054, recombinant BEC1054 was expressed in E. coli, solubilized, and used in pull-down assays from barley protein extracts. Many putative interactors were identified by LC-MS/MS after subtraction of unspecific binders in negative controls. Therefore, a directed yeast-2-hybrid assay, developed to measure the effectiveness of the interactions in yeast, was used to validate putative interactors. We conclude that BEC1054 may target several host proteins, including a glutathione-S-transferase, a malate dehydrogenase, and a pathogen-related-5 protein isoform, indicating a possible role for BEC1054 in compromising well-known key players of defense and response to pathogens. In addition, BEC1054 interacts with an elongation factor 1 gamma. This study already suggests that BEC1054 plays a central role in barley powdery mildew virulence by acting at several levels.
Quantitative Proteomic Analysis Reveals Populus cathayana Females Are More Sensitive and Respond More Sophisticatedly to Iron Deficiency than Males
Sheng Zhang*, Yunxiang Zhang, Yanchun Cao, Yanbao Lei, and Hao Jiang
Previous studies have shown that there are significant sexual differences in the morphological and physiological responses of Populus cathayana Rehder to nitrogen and phosphorus deficiencies, but little is known about the sex-specific differences in responses to iron deficiency. In this study, the effects of iron deficiency on the morphology, physiology, and proteome of P. cathayana males and females were investigated. The results showed that iron deficiency (25 days) significantly decreased height growth, photosynthetic rate, chlorophyll content, and tissue iron concentration in both sexes. A comparison between the sexes indicated that iron-deficient males had less height inhibition and photosynthesis system II or chloroplast ultrastructural damage than iron-deficient females. iTRAQ-based quantitative proteomic analysis revealed that 144 and 68 proteins were decreased in abundance (e.g., proteins involved in photosynthesis, carbohydrate and energy metabolism, and gene expression regulation) and 78 and 39 proteins were increased in abundance (e.g., proteins involved in amino acid metabolism and stress response) according to the criterion of ratio ≥1.5 in females and males, respectively. A comparison between the sexes indicated that iron-deficient females exhibited a greater change in the proteins involved in photosynthesis, carbon and energy metabolism, the redox system, and stress responsive proteins. This study reveals females are more sensitive and have a more sophisticated response to iron deficiency compared with males and provides new insights into differential sexual responses to nutrient deficiency.
Proteome Scale-Protein Turnover Analysis Using High Resolution Mass Spectrometric Data from Stable-Isotope Labeled Plants
Kai-Ting Fan, Aaron K. Rendahl, Wen-Ping Chen, Dana M. Freund, William M. Gray, Jerry D. Cohen, and Adrian D. Hegeman*
Protein turnover is an important aspect of the regulation of cellular processes for organisms when responding to developmental or environmental cues. The measurement of protein turnover in plants, in contrast to that of rapidly growing unicellular organismal cultures, is made more complicated by the high degree of amino acid recycling, resulting in significant transient isotope incorporation distributions that must be dealt with computationally for high throughput analysis to be practical. An algorithm in R, ProteinTurnover, was developed to calculate protein turnover with transient stable isotope incorporation distributions in a high throughput automated manner using high resolution MS and MS/MS proteomic analysis of stable isotopically labeled plant material. ProteinTurnover extracts isotopic distribution information from raw MS data for peptides identified by MS/MS from data sets of either isotopic label dilution or incorporation experiments. Variable isotopic incorporation distributions were modeled using binomial and beta-binomial distributions to deconvolute the natural abundance, newly synthesized/partial-labeled, and fully labeled peptide distributions. Maximum likelihood estimation was performed to calculate the distribution abundance proportion of old and newly synthesized peptides. The half-life or turnover rate of each peptide was calculated from changes in the distribution abundance proportions using nonlinear regression. We applied ProteinTurnover to obtain half-lives of proteins from enriched soluble and membrane fractions from Arabidopsis roots.
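The last step described here, deriving a half-life from the changing proportion of old peptide, can be sketched under a simple first-order decay model. This is not the ProteinTurnover package: the abstract mentions nonlinear regression, whereas the sketch below swaps in a least-squares fit on the log scale for brevity, assumes the old fraction equals 1 at time zero, and uses a hypothetical function name and made-up time points.

```python
import math


def fit_half_life(times, old_fractions):
    """Estimate a half-life assuming old_fraction(t) = exp(-k * t).

    Fits k by least squares on log(old_fraction) with the line forced through
    the origin (a log-linear simplification, not nonlinear regression).
    """
    if len(times) != len(old_fractions):
        raise ValueError("times and old_fractions must have the same length")
    if any(not 0.0 < f <= 1.0 for f in old_fractions):
        raise ValueError("old fractions must lie in (0, 1]")
    denominator = sum(t * t for t in times)
    if denominator == 0.0:
        raise ValueError("at least one nonzero time point is required")
    numerator = sum(t * math.log(f) for t, f in zip(times, old_fractions))
    k = -numerator / denominator           # first-order turnover rate constant
    if k <= 0.0:
        raise ValueError("no decay detected in the old fraction")
    return math.log(2) / k


# Synthetic, noise-free example: the old fraction halves every 8 time units.
times = [0.0, 4.0, 8.0, 16.0]
old = [2 ** (-t / 8.0) for t in times]
half_life = fit_half_life(times, old)
```

With real data the old fractions would come from the fitted mixture proportions at each time point, and a weighted or nonlinear fit would usually be preferable because the log transform inflates noise at small fractions.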
NMR-Based Lipidomic Approach To Evaluate Controlled Dietary Intake of Lipids in Adipose Tissue of a Rat Mammary Tumor Model
Lobna Ouldamer*, Lydie Nadal-Desbarats, Stephan Chevalier, Gilles Body, Caroline Goupille, and Philippe Bougnoux
The fatty acid composition of adipose tissue may provide information on the nutritional part of the risk or evolution of breast cancer. To determine whether 1H NMR of adipose tissue provides information on the nature of the diet consumed, a dietary intervention with increasing percentage of polyunsaturated n-3 docosahexaenoic acid (DHA 22:6n-3, provided as DHASCO oil) was applied to a rat model of N-nitroso-N-methylurea-induced mammary tumors. Spectra of the lipid extracts were obtained from adipose tissues in five groups of Sprague–Dawley rats fed with a diet containing 7% peanut/rapeseed enriched with 8% (w/w) of an oil without (palm oil) or with low (1%), moderate (3%), or high (8%) DHASCO content. A control group received a basal diet with 15% peanut/rapeseed representative of the “Western” diet. After 5 months of those five controlled diets, adipose tissue was collected for analysis of the lipid extract using both 1H NMR analysis on an 11.7 T spectrometer and gas chromatography, considered the gold standard. 1H NMR analysis showed a dose-dependent increase in DHA in the lipid extract of adipose tissues and a commensurate decrease in n-6 polyunsaturated fatty acids in the three DHA groups, which allowed one to follow n-6/n-3 ratio changes. The highest n-6/n-3 ratio was observed in the control Western diet group compared to the other diet groups. The integrated spectral regions showed separation between groups, thereby documenting a specific NMR lipid profile corresponding to each dietary intervention. Those diet-dependent NMR lipid profiles were consistent with those obtained by gas chromatography analyses of the same samples. This study is a proof of concept highlighting the potential use of the 1H NMR approach to evaluate dietary intervention in biopsies of adipose tissues.
Global Proteome Analyses of Lysine Acetylation and Succinylation Reveal the Widespread Involvement of both Modification in Metabolism in the Embryo of Germinating Rice Seed
Dongli He - ,
Qiong Wang - ,
Ming Li - ,
Rebecca Njeri Damaris - ,
Xingling Yi - ,
Zhongyi Cheng - , and
Pingfang Yang *
Regulation of rice seed germination has been shown to occur mainly at post-transcriptional levels, among which changes in proteome status are a major component. Lysine acetylation and succinylation are two prevalent protein post-translational modifications (PTMs) involved in multiple biological processes, especially metabolism regulation. To investigate the potential mechanism controlling metabolism regulation in rice seed germination, we performed the lysine acetylation and succinylation analyses simultaneously. Using high-accuracy nano-LC–MS/MS in combination with the enrichment of lysine acetylated or succinylated peptides from digested embryonic proteins of 24 h after imbibition (HAI) rice seed, a total of 699 acetylated sites from 389 proteins and 665 succinylated sites from 261 proteins were identified. Among these modified lysine sites, 133 sites on 78 proteins were modified by both PTMs. The overlapping PTM sites were more likely to be in polar acidic/basic amino acid regions and exposed on the protein surface. Both the acetylated and the succinylated proteins cover nearly all aspects of cellular functions. Ribosome complex and glycolysis/gluconeogenesis-related proteins were significantly enriched in both acetylated and succinylated protein profiles through KEGG enrichment and protein–protein interaction network analyses. The acetyl-CoA and succinyl-CoA metabolism-related enzymes were found to be extensively modified by both modifications, implying a functional interaction between the two PTMs. This study provides a rich resource to examine the modulation of the two PTMs on the metabolism pathway and other biological processes in germinating rice seed.
Universal Solid-Phase Reversible Sample-Prep for Concurrent Proteome and N-Glycome Characterization
Hui Zhou - ,
Samantha Morley - ,
Stephen Kostel - ,
Michael R. Freeman - ,
Vivek Joshi - ,
David Brewster - , and
Richard S. Lee *
We describe a novel solid-phase reversible sample-prep (SRS) platform that enables rapid sample preparation for concurrent proteome and N-glycome characterization for nearly all protein samples. SRS utilizes a uniquely functionalized, silica-based bead that has strong affinity toward proteins with minimal to no affinity for peptides and other small molecules. By leveraging this inherent size difference between proteins and peptides, SRS permits high-capacity binding of proteins, rapid removal of small molecules (detergents, metabolites, salts, peptides, etc.), extensive manipulation including enzymatic and chemical treatments on bead-bound proteins, and easy recovery of N-glycans and peptides. SRS was evaluated in a wide range of samples including glycoproteins, cell lysate, murine tissues, and human urine. SRS was also coupled to a quantitative strategy to investigate the differences between DU145 prostate cancer cells and its DIAPH3-silenced counterpart. Previous studies suggested that DIAPH3 silencing in DU145 induced transition to an amoeboid phenotype that correlated with tumor progression and metastasis. In this pilot study we identified distinct proteomic and N-glycomic alterations between them. A metastasis-associated tyrosine kinase receptor ephrin-type-A receptor (EPHA2) was highly up-regulated in DIAPH3-silenced cells, indicating a possible connection between EPHA2 and DIAPH3. Moreover, distinct alterations in the N-glycome were identified, suggesting cross-links between DIAPH3 and glycosyltransferase networks.
Free-Flow Electrophoresis of Plasma Membrane Vesicles Enriched by Two-Phase Partitioning Enhances the Quality of the Proteome from Arabidopsis Seedlings
Roberto de Michele - ,
Heather E. McFarlane - ,
Harriet T. Parsons - ,
Miranda J. Meents - ,
Jeemeng Lao - ,
Susana M. González Fernández-Niño - ,
Christopher J. Petzold - ,
Wolf B. Frommer - ,
A. Lacey Samuels - , and
Joshua L. Heazlewood *
The plant plasma membrane is the interface between the cell and its environment, undertaking a range of important functions related to transport, signaling, cell wall biosynthesis, and secretion. Multiple proteomic studies have attempted to capture the diversity of proteins in the plasma membrane using biochemical fractionation techniques. In this study, two-phase partitioning was combined with free-flow electrophoresis to produce a population of highly purified plasma membrane vesicles that were subsequently characterized by tandem mass spectrometry. This combined high-quality plasma membrane isolation technique produced a reproducible proteomic library of over 1000 proteins with an extended dynamic range including plasma membrane-associated proteins. The approach enabled the detection of a number of putative plasma membrane proteins not previously identified by other studies, including peripheral membrane proteins. Utilizing multiple data sources, we developed a PM-confidence score to provide a value indicating association to the plasma membrane. This study highlights over 700 proteins that, while seemingly abundant at the plasma membrane, are mostly unstudied. To validate this data set, we selected 14 candidates and transiently localized 13 to the plasma membrane using a fluorescent tag. Given the importance of the plasma membrane, this data set provides a valuable tool to further investigate important proteins. The mass spectrometry data are available via ProteomeXchange, identifier PXD001795.
Untargeted Lipidomics Reveals Differences in the Lipid Pattern among Clinical Isolates of Staphylococcus aureus Resistant and Sensitive to Antibiotics
Weronika Hewelt-Belka *- ,
Joanna Nakonieczna - ,
Mariusz Belka - ,
Tomasz Bączek - ,
Jacek Namieśnik - , and
Agata Kot-Wasik
Staphylococcus aureus resistance to antibiotics is a significant clinical problem worldwide. In this study, an untargeted lipidomics approach was used to compare the lipid fingerprints of S. aureus clinical isolates that are resistant and sensitive to antibiotics. High-performance liquid chromatography coupled with time-of-flight mass spectrometry was employed to rapidly and comprehensively analyze bacterial lipids. Chemometric and statistical analyses of the obtained lipid fingerprints revealed variations in several lipid groups between S. aureus strains resistant and sensitive to tested antibiotics including methicillin, gentamicin, ciprofloxacin, erythromycin, and fusidic acid. The levels of identified monoglycosyldiacylglycerol, phosphatidylglycerol, and diglycosyldiacylglycerol lipid groups were found to be upregulated in antibiotic-resistant S. aureus strains, whereas the levels of diacylglycerol lipid groups were downregulated. Differences in the lipid patterns between sensitive and resistant S. aureus strains suggest that antibiotic susceptibility may be associated with the lipid composition of bacterial cells. The lipids that were found to significantly differ between antibiotic-resistant and antibiotic-sensitive clinical isolates are involved in the biosynthesis of major S. aureus membrane lipids and lipoteichoic acid. This study indicates that S. aureus lipid biosynthesis pathways should be explored further to better understand the mechanism of antibiotic resistance in S. aureus strains.
Comparative Assessment of Glycosylation of a Recombinant Human FSH and a Highly Purified FSH Extracted from Human Urine
Hong Wang - ,
Xi Chen - ,
Xiaoxi Zhang - ,
Wei Zhang - ,
Yan Li - ,
Hongrui Yin - ,
Hong Shao - , and
Gang Chen *
Glycosylation is an important PTM and is critical for the manufacture and efficacy of therapeutic glycoproteins. Glycans significantly influence the biological properties of human follicle-stimulating hormone (hFSH). Using a glycoproteomic strategy, this study compared the glycosylation of a putative highly purified FSH (uhFSH) obtained from human urine with that of a recombinant human FSH (rhFSH) obtained from Chinese hamster ovary (CHO) cells. Intact and subunit masses, N-glycans, N-glycosylation sites, and intact N- and O-glycopeptides were analyzed and compared by mass spectrometry. Classic and complementary analytical methods, including SDS-PAGE, isoelectric focusing, and the Steelman–Pohley bioassay, were also employed to compare their intact molecular weights, charge variants, and specific activities. Results showed that highly sialylated, branched, and macro-heterogeneous glycans are more predominant in uhFSH than in rhFSH. The O-glycopeptides of both hFSHs, which have not been described previously, were characterized herein. A high degree of heterogeneity was observed in the N-glycopeptides of both hFSHs. The differences in glycosylation provide useful information for elucidating and further investigating the critical glycan structures of hFSH.
Proteomic Profile of Unstable Atheroma Plaque: Increased Neutrophil Defensin 1, Clusterin, and Apolipoprotein E Levels in Carotid Secretome
Gemma Aragonès - ,
Teresa Auguet - ,
Esther Guiu-Jurado - ,
Alba Berlanga - ,
Marta Curriu - ,
Salomé Martinez - ,
Ajla Alibalic - ,
Carmen Aguilar - ,
Esteban Hernández - ,
María-Luisa Camara - ,
Núria Canela - ,
Pol Herrero - ,
Xavier Ruyra - ,
Vicente Martín-Paredero - , and
Cristóbal Richart *
Because of the clinical significance of carotid atherosclerosis, the search for novel biomarkers has become a priority. The aim of the present study was to compare the protein secretion profile of the carotid atherosclerotic plaque (CAP, n = 12) and nonatherosclerotic mammary artery (MA, n = 10) secretomes. We used a nontargeted proteomic approach that incorporated tandem immunoaffinity depletion, iTRAQ labeling, and nanoflow liquid chromatography coupled to high-resolution mass spectrometry. In total, 162 proteins were quantified, of which 25 showed statistically significant differences in secretome levels between carotid atherosclerotic plaque and nondiseased mammary artery. We found increased levels of neutrophil defensin 1, apolipoprotein E, clusterin, and zinc-alpha-2-glycoprotein in CAP secretomes. Results were validated by ELISA assays. Also, differentially secreted proteins are involved in pathways such as focal adhesion and leukocyte transendothelial migration. In conclusion, this study provides a subset of identified proteins that are differently expressed in secretomes of clinical significance.
Proteome Profiling and Ultrastructural Characterization of the Human RCMH Cell Line: Myoblastic Properties and Suitability for Myopathological Studies
Laxmikanth Kollipara - ,
Stephan Buchkremer - ,
Joachim Weis - ,
Eva Brauers - ,
Mareike Hoss - ,
Stephan Rütten - ,
Pablo Caviedes - ,
René P. Zahedi - , and
Andreas Roos *
Studying (neuro)muscular disorders is a major topic in biomedicine with a demand for suitable model systems. Continuous cell culture (in vitro) systems have several technical advantages over in vivo systems and have become widely used tools for discovering physiological/pathophysiological mechanisms in muscle. In particular, myoblast cell lines are suitable model systems to study complex biochemical adaptations occurring in skeletal muscle and cellular responses to altered genetic/environmental conditions. Whereas most in vitro studies use extensively characterized murine C2C12 cells, a comprehensive description of an equivalent human cell line, not genetically manipulated for immortalization, is lacking. Therefore, we characterized human immortal myoblastic RCMH cells using scanning (SEM) and transmission electron microscopy (TEM) and proteomics. Among more than 6200 identified proteins, we confirm the known expression of proteins important for muscle function. Comparing the RCMH proteome with two well-defined nonskeletal muscle cell lines (HeLa, U2OS) revealed a considerable enrichment of proteins important for muscle function. SEM/TEM confirmed the presence of agglomerates of cytoskeletal components/intermediate filaments and a prominent rough ER. In conclusion, our results indicate RCMH as a suitable in vitro model for investigating muscle function-related processes such as mechanical stress burden and mechanotransduction, EC coupling, cytoskeleton, muscle cell metabolism and development, and (ER-associated) myopathic disorders.
Identification of Palmitoylated Transitional Endoplasmic Reticulum ATPase by Proteomic Technique and Pan Antipalmitoylation Antibody
Caiyun Fang - ,
Xiaoqin Zhang - ,
Lei Zhang - ,
Xing Gao - ,
Pengyuan Yang - , and
Haojie Lu *
Protein palmitoylation plays a significant role in a wide range of biological processes such as cell signal transduction, metabolism, apoptosis, and carcinogenesis. For high-throughput analysis of protein palmitoylation, approaches based on the acyl-biotin exchange or metabolic labeling of azide/alkynyl-palmitate analogs are commonly used. No palmitoylation antibody has been reported. Here, the palmitoylated proteome of the human colon cancer cell line SW480 was analyzed via a TS-6B-based method. In total, 151 putative palmitoylated sites on 92 proteins, including 100 novel sites, were identified. In addition to 3 known palmitoylated transmembrane proteins, ATP1A1, ZDHHC5, and PLP2, some important proteins including kinases, ion channels, receptors, and cytoskeletal proteins were also identified, such as CLIC1, PGK1, PPIA, FKBP4, exportin-2, etc. More importantly, a pan antipalmitoylation antibody was developed and verified for the first time. Our homemade pan antipalmitoylation antiserum could clearly distinguish protein palmitoylation in mouse brain membrane fraction and SW480 cells, which affords a new technique for analyzing protein palmitoylation by detecting the palmitic acid moiety directly. Furthermore, the candidate protein transitional endoplasmic reticulum ATPase (VCP) identified in SW480 cells was validated to be palmitoylated by Western blotting with anti-VCP antibody and the homemade pan antipalmitoylation antibody.
Cinnamaldehyde Characterization as an Antibacterial Agent toward E. coli Metabolic Profile Using 96-Blade Solid-Phase Microextraction Coupled to Liquid Chromatography–Mass Spectrometry
Fatemeh Mousavi - ,
Barbara Bojko - ,
Vincent Bessonneau - , and
Janusz Pawliszyn *
This publication is Open Access under the license indicated.
Sampling and sample preparation play an important role in untargeted analysis because they influence the final composition of the analyzed extract and, consequently, how well it reflects the metabolome. In the current work, the mechanism of bactericidal action of cinnamaldehyde (CA) against Escherichia coli (E. coli) during bacterial growth was investigated by applying high-throughput solid-phase microextraction in direct immersion mode coupled to a high-performance liquid chromatography–mass spectrometry system. Numerous discriminant metabolites due to CA addition to the bacteria culture were mapped in the E. coli metabolic pathways. We propose new metabolic pathways confirming that CA acts as an oxidative stress agent against E. coli. The results of the current research demonstrate that CA changes the bacterial metabolism through interactions with different biochemical families such as proteins, nucleic acids, lipids, and carbohydrates, a finding that needs further validation by proteomics and transcriptomics studies. The results presented here show the great potential of the novel approach in drug discovery and food safety.
Quantitation and Identification of Thousands of Human Proteoforms below 30 kDa
Kenneth R. Durbin - ,
Luca Fornelli - ,
Ryan T. Fellers - ,
Peter F. Doubleday - ,
Masashi Narita - , and
Neil L. Kelleher *
Top-down proteomics is capable of identifying and quantitating unique proteoforms through the analysis of intact proteins. We extended the coverage of the label-free technique, achieving differential analysis of whole proteins <30 kDa from the proteomes of growing and senescent human fibroblasts. By integrating improved control software with more instrument time allocated for quantitation of intact ions, we were able to collect protein data between the two cell states, confidently comparing 1577 proteoform levels. To then identify and characterize proteoforms, our advanced acquisition software, named Autopilot, employed enhanced identification efficiency in identifying 1180 unique Swiss-Prot accession numbers at 1% false-discovery rate. This coverage of the low mass proteome is equivalent to the largest previously reported but was accomplished in 23% of the total acquisition time. By maximizing both the number of quantified proteoforms and their identification rate in an integrated software environment, this work significantly advances proteoform-resolved analyses of complex systems.
g2pDB: A Database Mapping Protein Post-Translational Modifications to Genomic Coordinates
Sarah Keegan - ,
John P. Cortens - ,
Ronald C. Beavis *- , and
David Fenyö
Large-scale proteomics has made it possible to broadly screen samples for the presence of many types of post-translational modifications, such as phosphorylation, acetylation, and ubiquitination. This type of data has allowed the localization of these modifications to either a specific site on a proteolytically generated peptide or to within a small domain on the peptide. The resulting modification acceptor sites can then be mapped onto the appropriate protein sequences and the information archived. This paper describes the usage of a very large archive of experimental observations of human post-translational modifications to create a map of the most reproducible modification observations onto the complete set of human protein sequences. This set of modification acceptor sites was then directly translated into the genomic coordinates for the codons for the residues at those sites. We constructed the database g2pDB using this protein-to-codon site mapping information. The information in g2pDB has been made available through a RESTful-style API, allowing researchers to determine which specific protein modifications would be perturbed by a set of observed nucleotide variants determined by high-throughput DNA or RNA sequencing.
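The protein-to-codon mapping described in this abstract can be illustrated with a minimal Python sketch. It is not the g2pDB implementation or its API: the function names, the 1-based inclusive coordinate convention, and the representation of a coding sequence as a list of coding segments are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Assumption: each coding segment is a 1-based, inclusive genomic (start, end) pair.
Segment = Tuple[int, int]


def codon_genomic_positions(cds_segments: List[Segment], strand: str, residue: int) -> List[int]:
    """Return the three genomic positions that encode a 1-based residue index."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # Order the coding segments in transcript direction.
    ordered = sorted(cds_segments, reverse=(strand == "-"))
    # Flatten all coding bases in transcript order (fine for a sketch).
    coding: List[int] = []
    for start, end in ordered:
        bases = range(start, end + 1)
        coding.extend(bases if strand == "+" else reversed(bases))
    first = 3 * (residue - 1)
    if residue < 1 or first + 3 > len(coding):
        raise IndexError("residue lies outside the coding sequence")
    return coding[first:first + 3]


def sites_hit_by_variants(site_codons: Dict[str, List[int]], variant_positions: Set[int]) -> List[str]:
    """List the modification sites whose codon overlaps any variant position."""
    return sorted(site for site, codon in site_codons.items()
                  if variant_positions.intersection(codon))
```

A codon that spans a splice junction comes out as non-contiguous positions, which is why the sketch returns three explicit coordinates rather than a single interval.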
N-Glycoproteomics of Human Seminal Plasma Glycoproteins
Mayank Saraswat - ,
Sakari Joenväärä - ,
Anil Kumar Tomar - ,
Sarman Singh - ,
Savita Yadav - , and
Risto Renkonen *
Seminal plasma aids sperm by inhibiting premature capacitation, helping in the intracervical transport and formation of an oviductal sperm reservoir, all of which appear to be important in the fertilization process. Epitopes such as Lewis x and y are known to be present on seminal plasma glycoproteins, which can modulate the maternal immune response. Multiple studies suggest that seminal plasma glycoproteins play important, largely undiscovered roles in the process of fertilization. We have devised a strategy to analyze glycopeptides from a complex, unknown mixture of protease-digested proteins. This analysis provides identification of the glycoproteins, glycosylation sites, glycan compositions, and proposed structures from the original sample. This strategy has been applied to human seminal plasma total glycoproteins. We have elucidated glycan compositions and proposed structures for 243 glycopeptides belonging to 73 N-glycosylation sites on 50 glycoproteins. The majority of the proposed glycan structures were complex type (83%) followed by high-mannose (10%) and then hybrid (7%). Most of the glycoproteins were either sialylated, fucosylated, or both. Many glycans bearing Lewis x/a and y/b epitopes were found, suggesting immune-modulating epitopes on multiple seminal plasma glycoproteins. The study also shows that large-scale N-glycosylation mapping is achievable with current techniques and the depth of the analysis is roughly proportional to the prefractionation and complexity of the sample.
Protein-Specific Differential Glycosylation of Immunoglobulins in Serum of Ovarian Cancer Patients
L. Renee Ruhaak *- ,
Kyoungmi Kim - ,
Carol Stroble - ,
Sandra L. Taylor - ,
Qiuting Hong - ,
Suzanne Miyamoto - ,
Carlito B. Lebrilla - , and
Gary Leiserowitz
Previous studies indicated that glycans in serum may serve as biomarkers for diagnosis of ovarian cancer; however, it was unclear to which proteins these glycans belong. We hypothesize that protein-specific glycosylation profiles of the glycans may be more informative of ovarian cancer and can provide insight into biological mechanisms underlying glycan aberration in serum of diseased individuals. Serum samples from women diagnosed with epithelial ovarian cancer (EOC, n = 84) and matched healthy controls (n = 84) were obtained from the Gynecologic Oncology Group. Immunoglobulin (IgG, IgA, and IgM) concentrations and glycosylation profiles were quantified using multiple reaction monitoring mass spectrometry. Differential and classification analyses were performed to identify aberrant protein-specific glycopeptides using a training set. All findings were validated in an independent test set. Multiple glycopeptides from immunoglobulins IgA, IgG, and IgM were found to be differentially expressed in serum of EOC patients compared with controls. The protein-specific glycosylation profiles showed their potential in the diagnosis of EOC. In particular, IgG-specific glycosylation profiles are the most powerful in discriminating between EOC cases and controls. Additional studies of protein- and site-specific glycosylation profiles of immunoglobulins and other proteins will allow further elaboration on the characteristics of biological functionality and causality of the differential glycosylation in ovarian cancer and thus ultimately lead to increased sensitivity and specificity of diagnosis.
Testis-Specific Y-Centric Protein–Protein Interaction Network Provides Clues to the Etiology of Severe Spermatogenic Failure
Naser Ansari-Pour *- ,
Zahra Razaghi-Moghadam - ,
Farnaz Barneh - , and
Mohieddin Jafari
Pinpointing causal genes for spermatogenic failure (SpF) on the Y chromosome has been a persistently daunting challenge, with setbacks during the past decade. Since complex diseases result from the interaction of multiple genes and also display considerable missing heritability, network analysis is more likely to explicate an etiological molecular basis. We therefore took a network medicine approach by integrating interactome (protein–protein interaction (PPI)) and transcriptome data to reconstruct a Y-centric SpF network. Two sets of seed genes (Y genes and SpF-implicated genes (SIGs)) were used for network reconstruction. Since no PPI was observed among Y genes, we identified their common immediate interactors. Interestingly, 81% (N = 175) of these interactors not only interacted directly with SIGs but were also enriched for differentially expressed genes (89.6%; N = 43). The SpF network, formed mainly by the dys-regulated interactors and the two seed gene sets, comprised three modules enriched for ribosomal proteins and nuclear receptors for sex hormones. Ribosomal proteins generally showed significant dys-regulation with RPL39L, thought to be expressed at the onset of spermatogenesis, strongly down-regulated. This network is the first global PPI network pertaining to severe SpF and, if experimentally validated on independent data sets, can lead to more accurate diagnosis and potential fertility recovery of patients.
A Description of the Clinical Proteomic Tumor Analysis Consortium (CPTAC) Common Data Analysis Pipeline
Paul A. Rudnick *- ,
Sanford P. Markey - ,
Jeri Roth - ,
Yuri Mirokhin - ,
Xinjian Yan - ,
Dmitrii V. Tchekhovskoi - ,
Nathan J. Edwards - ,
Ratna R. Thangudu - ,
Karen A. Ketchum - ,
Christopher R. Kinsinger - ,
Mehdi Mesri - ,
Henry Rodriguez - , and
Stephen E. Stein
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these data sets were produced from 2D liquid chromatography–tandem mass spectrometry analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data according to the following: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false-discovery rate-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the data sets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level (“rolled-up”) precursor peak areas and spectral counts for label-free or reporter ion log-ratios for 4plex iTRAQ. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data to enable comparisons between different samples and cancer types as well as across the major omics fields.
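Step (4) of the pipeline above, false-discovery rate-based filtering, can be sketched as follows. The abstract does not say how the CDAP estimates the FDR, so this example assumes the common target-decoy strategy; the `PSM` class, the score convention (higher is better), and the default 1% threshold are illustrative assumptions, not details of the CDAP.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PSM:
    spectrum_id: str
    score: float    # assumption: higher scores mean better matches
    is_decoy: bool  # True if the match came from the decoy database


def filter_by_fdr(psms: List[PSM], max_fdr: float = 0.01) -> List[PSM]:
    """Keep target PSMs above the lowest score cutoff whose estimated FDR <= max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs that pass
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR at this rank: decoy hits divided by target hits.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p.is_decoy]
```

Taking the deepest rank at which the estimate is still acceptable, rather than stopping at the first violation, mirrors the usual q-value style of thresholding.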
Novel Elements of the Chondrocyte Stress Response Identified Using an in Vitro Model of Mouse Cartilage Degradation
Richard Wilson *- ,
Suzanne B. Golub - ,
Lynn Rowley - ,
Constanza Angelucci - ,
Yuliya V. Karpievitch - ,
John F. Bateman - , and
Amanda J. Fosang
The destruction of articular cartilage in osteoarthritis involves chondrocyte dysfunction and imbalanced extracellular matrix (ECM) homeostasis. Pro-inflammatory cytokines such as interleukin-1α (IL-1α) contribute to osteoarthritis pathophysiology, but the effects of IL-1α on chondrocytes within their tissue microenvironment have not been fully evaluated. To redress this we used label-free quantitative proteomics to analyze the chondrocyte response to IL-1α within a native cartilage ECM. Mouse femoral heads were cultured with and without IL-1α, and both the tissue proteome and proteins released into the media were analyzed. New elements of the chondrocyte response to IL-1α related to cellular stress included markers for protein misfolding (Armet, Creld2, and Hyou1), enzymes involved in glutathione biosynthesis and regeneration (Gstp1, Gsto1, and Gsr), and oxidative stress proteins (Prdx2, Txn, Atox1, Hmox1, and Vnn1). Other proteins previously not associated with the IL-1α response in cartilage included ECM components (Smoc2, Kera, and Crispld1) and cysteine proteases (cathepsin Z and legumain), while chondroadherin and cartilage-derived C-type lectin (Clec3a) were identified as novel products of IL-1α-induced cartilage degradation. This first proteome-level view of the cartilage IL-1α response identified candidate biomarkers of cartilage destruction and novel targets for therapeutic intervention in osteoarthritis.
High-Throughput LC–MS/MS Method for Direct Quantification of Glucuronidated, Sulfated, and Free Enterolactone in Human Plasma
Natalja P. Nørskov *- ,
Cecilie Kyrø - ,
Anja Olsen - ,
Anne Tjønneland - , and
Knud Erik Bach Knudsen
Sulfation and glucuronidation constitute a major pathway in humans and may play an important role in biological activity of metabolites including the enterolignan, enterolactone. Because the aromatic structure of enterolactone has similarities to steroid metabolites, it was hypothesized that enterolactone may protect against hormone-dependent cancers. This led to numerous epidemiological studies. In this context, there has been a demand for rapid, sensitive, high-throughput methods to measure enterolactone in biofluids. Different methods have been developed using GC–MS, HPLC, LC–MS/MS and a fluoroimmunoassay; however, most of these methods measure the total concentration of enterolactone, without any specification of its conjugation pattern. Here for the first time we present a high-throughput LC–MS/MS method to quantify enterolactone in its intact form as glucuronide, sulfate, and free enterolactone. The method has shown good accuracy and precision at low concentration and very high sensitivity, with LLOQ for enterolactone sulfate at 16 pM, enterolactone glucuronide at 26 pM, and free enterolactone at 86 pM. The short run time of 2.6 min combined with simple sample cleanup and high sensitivity make this method attractive for the high sample throughput needed for epidemiological studies. Finally, we applied the new method to quantify enterolactone and its conjugates in 3956 plasma samples from an epidemiological study. We found enterolactone glucuronide to be the major conjugation form and the conjugation pattern to be similar between men and women.
Comparative Analysis of the Endogenous Peptidomes Displayed by HLA-B*27 and Mamu-B*08: Two MHC Class I Alleles Associated with Elite Control of HIV/SIV Infection
Miguel Marcilla *- ,
Iñaki Alvarez - ,
Antonio Ramos-Fernández - ,
Manuel Lombardía - ,
Alberto Paradela - , and
Juan Pablo Albar
Indian rhesus macaques are arguably the most reliable animal models in AIDS research. In this species the MHC class I allele Mamu-B*08, among others, is associated with elite control of SIV replication. A similar scenario is observed in humans where the expression of HLA-B*27 or HLA-B*57 has been linked to slow or no progression to AIDS after HIV infection. Despite having large differences in their primary structure, it has been reported that HLA-B*27 and Mamu-B*08 display peptides with sequence similarity. To fine-map the Mamu-B*08 binding motif and assess its similarities with that of HLA-B*27, we affinity purified the peptidomes bound to these MHC class I molecules and analyzed them by LC-MS, identifying several thousand endogenous ligands. Sequence analysis of both sets of peptides revealed a degree of similarity in their binding motifs, especially at peptide position 2 (P2), where arginine was present in the vast majority of ligands of both allotypes. In addition, several differences emerged from this analysis: (i) ligands displayed by Mamu-B*08 tended to be shorter and to have lower molecular weight, (ii) Mamu-B*08 showed a higher preference for glutamine at P2 as a suboptimal binding motif, and (iii) the second major anchor position, found at PΩ, was much more restrictive in Mamu-B*08. In this regard, HLA-B*27 efficiently bound peptides with aliphatic, aromatic (including tyrosine), and basic C-terminal residues while Mamu-B*08 preferred peptides with leucine and phenylalanine in this position. Finally, in silico estimations of binding efficiency and competitive binding assays to Mamu-B*08 of several selected peptides revealed a good correlation between the characterized anchor motif and binding affinity. These results deepen our understanding of the molecular basis of the presentation of peptides by Mamu-B*08 and can contribute to the detection of novel SIV epitopes restricted by this allotype.
Quantitative Profiling of Combinational K27/K36 Modifications on Histone H3 Variants in Mouse Organs
Yanyan Yu - ,
Jiajia Chen - ,
Yuan Gao - ,
Jun Gao - ,
Rijing Liao - ,
Yi Wang - ,
Counde Oyang - ,
En Li - ,
Chenhui Zeng - ,
Shaolian Zhou - ,
Pengyuan Yang *- ,
Hong Jin *- , and
Wei Yi *
The coexisting post-translational modifications (PTMs) on histone H3 N-terminal tails are known to crosstalk with each other, indicating their interdependency in epigenetic regulation pathways. H3K36 methylation, an important activating mark, was recently reported to antagonize PRC2-mediated H3K27 methylation, with a possible crosstalk mechanism during the transcription regulation process (1). On the basis of our previous studies, we further integrated RP/HILIC liquid chromatography with MRM mass spectrometry to quantify histone PTMs from various mouse organs, especially the combinatorial K27/K36 marks for all three major histone H3 variants. Despite their subtle difference in physicochemical properties, we successfully obtained decent separation and high detection sensitivity for both histone H3.3 specific peptides and histone H3.1/3.2 specific peptides. In addition, the overall abundance of H3.3 can be quantified simultaneously. We applied this method to investigate the pattern of the combinatorial K27/K36 marks for all three major histone H3 variants across five mouse organs. Intriguing distribution differences were observed not only between different H3 variants but also between different organs. Our data provide new insights into histone code functions in epigenetic regulation during cell differentiation and development.
Technical Notes
Fast and Reliable Quantitative Peptidomics with labelpepmatch
Rik Verdonck*,
Wouter De Haes,
Dries Cardoen,
Gerben Menschaert,
Thomas Huhn,
Bart Landuyt,
Geert Baggerman,
Kurt Boonen,
Tom Wenseleers, and
Liliane Schoofs*
The use of stable isotope tags in quantitative peptidomics offers many advantages, but the laborious identification of matching sets of labeled peptide peaks is still a major bottleneck. Here we present labelpepmatch, an R-package for fast and straightforward analysis of LC–MS spectra of labeled peptides. This open-source tool offers fast and accurate identification of peak pairs alongside an appropriate framework for statistical inference on quantitative peptidomics data, based on techniques from other -omics disciplines. A relevant case study on the desert locust Schistocerca gregaria proves our pipeline to be a reliable tool for quick but thorough explorative analyses.
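The core task named in this abstract, finding matching pairs of labeled peaks, can be illustrated with a small sketch. This is not the labelpepmatch implementation (which is an R package); it is a simplified Python illustration that assumes each peak carries an m/z, a retention time, a charge, and an intensity, and that the light and heavy labels differ by a known mass `label_delta`. The names `Peak` and `find_peak_pairs` and the tolerance defaults are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Peak(NamedTuple):
    mz: float         # mass-to-charge ratio
    rt: float         # retention time (minutes)
    charge: int       # charge state
    intensity: float  # peak intensity


def find_peak_pairs(
    peaks: List[Peak],
    label_delta: float,
    mz_tol: float = 0.01,
    rt_tol: float = 0.5,
) -> List[Tuple[Peak, Peak]]:
    """Pair each light peak with heavy peaks of the same charge whose m/z is
    shifted by label_delta / charge and which elute at a similar time."""
    ordered = sorted(peaks, key=lambda p: p.mz)
    pairs = []
    for i, light in enumerate(ordered):
        expected_mz = light.mz + label_delta / light.charge
        for heavy in ordered[i + 1:]:
            if heavy.mz > expected_mz + mz_tol:
                break  # peaks are sorted by m/z, so no later peak can match
            if (
                heavy.charge == light.charge
                and abs(heavy.mz - expected_mz) <= mz_tol
                and abs(heavy.rt - light.rt) <= rt_tol
            ):
                pairs.append((light, heavy))
    return pairs
```

The intensity ratio of each returned pair would then feed into the statistical analysis; peptides carrying more than one label would need the mass shift multiplied by the label count, which this sketch does not handle.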
RePLiCal: A QconCAT Protein for Retention Time Standardization in Proteomics Studies
Stephen W. Holman,
Lynn McLean, and
Claire E. Eyers*
This study introduces a new reversed-phase liquid chromatography retention time (RT) standard, RePLiCal (Reversed-phase liquid chromatography calibrant), produced using QconCAT technology. The synthetic protein contains 27 lysine-terminating calibrant peptides, meaning that the same complement of standards can be generated using either Lys-C or trypsin-based digestion protocols. RePLiCal was designed such that each constituent peptide is unique with respect to all eukaryotic proteomes, thereby enabling integration into a wide range of proteomic analyses. RePLiCal has been benchmarked against three commercially available peptide RT standard kits and outperforms all in terms of LC gradient coverage. RePLiCal also provides a higher number of calibrant points for chromatographic retention time standardization and normalization. The standard provides stable RTs over long analysis times and can be readily transferred between different LC gradients and nUHPLC instruments. Moreover, RePLiCal can be used to predict RTs for other peptides in a timely manner. Furthermore, it is shown that RePLiCal can be used effectively to evaluate trapping column performance for nUHPLC instruments using trap-elute configurations, to optimize gradients to maximize peptide and protein identification rates, and to recalibrate the m/z scale of mass spectrometry data post-acquisition.
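As a rough illustration of how a set of calibrant peptides can be used for retention time standardization, the sketch below fits a straight line between reference RTs and the RTs observed for the same calibrants in a new run, then maps other observed RTs back onto the reference scale. The linear model is an assumption made here for simplicity; the abstract does not state which model the authors use, and the function names are hypothetical.

```python
def fit_rt_calibration(reference_rts, observed_rts):
    """Least-squares fit of observed_rt = slope * reference_rt + intercept.

    Both lists hold the RTs of the same calibrant peptides, in the same order.
    """
    if len(reference_rts) != len(observed_rts) or len(reference_rts) < 2:
        raise ValueError("need at least two matched calibrant RTs")
    n = len(reference_rts)
    mean_ref = sum(reference_rts) / n
    mean_obs = sum(observed_rts) / n
    sxx = sum((x - mean_ref) ** 2 for x in reference_rts)
    if sxx == 0:
        raise ValueError("reference RTs must not all be identical")
    sxy = sum(
        (x - mean_ref) * (y - mean_obs)
        for x, y in zip(reference_rts, observed_rts)
    )
    slope = sxy / sxx
    if slope == 0:
        raise ValueError("observed RTs must vary with the reference RTs")
    intercept = mean_obs - slope * mean_ref
    return slope, intercept


def to_reference_scale(observed_rt, slope, intercept):
    """Map an RT observed in the new run onto the reference RT scale."""
    return (observed_rt - intercept) / slope
```

With 27 calibrants spread across the gradient, a piecewise or nonlinear fit may describe real gradients better than a single line; the straight-line fit is only the simplest option.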
Site-Specific Identification of Lysine Acetylation Stoichiometries in Mammalian Cells
Tong Zhou,
Ying-hua Chung,
Jianji Chen, and
Yue Chen*
Functional characterization of the lysine acetylation pathway requires quantitative measurement of the modification abundance at the stoichiometry level. Here, we developed a systematic workflow for global untargeted identification of site-specific Lys acetylation stoichiometries in mammalian cells. Our strategy includes an optimized protocol for in vitro chemical labeling of unmodified lysine with stable isotope-encoded acetyl-NHS ester, deep proteomic profiling with a high resolution mass spectrometer, and a new software tool for quantitative analysis and stoichiometry determination. The workflow was validated using in vitro chemically labeled BSA and synthetic peptides with multiple Lys acetylations at various positions. In the proof-of-concept study, we applied the strategy to analyze the proteome of HeLa cells and determined the stoichiometries of over 600 acetylation sites with good reproducibility. Sodium butyrate treatment induced a significant increase of acetylation stoichiometries in HeLa cells. Analysis of site-specific stoichiometry dynamics revealed the coregulation of closely positioned acetylation sites on histones H3 and H4 upon treatment.
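One way to read the labeling strategy above: if every unmodified lysine is chemically acetylated with a heavy isotope-encoded acetyl group, then for a given site the endogenously acetylated form appears as a light peak and the originally unmodified form as a heavy peak, so the stoichiometry is the light fraction of the total signal. The sketch below encodes that ratio under the assumptions of complete labeling and equal response of both forms. It is not the authors' software tool, and the function name is hypothetical.

```python
def acetylation_stoichiometry(light_intensity, heavy_intensity):
    """Fraction of a lysine site that was endogenously acetylated.

    light_intensity: signal of the endogenous (light) acetylated form.
    heavy_intensity: signal of the chemically labeled (heavy) form, taken
                     here to represent the originally unmodified site.
    """
    if light_intensity < 0 or heavy_intensity < 0:
        raise ValueError("intensities must be non-negative")
    total = light_intensity + heavy_intensity
    if total == 0:
        raise ValueError("no signal for this site")
    return light_intensity / total
```

For example, a site with light and heavy intensities in a 1:3 ratio would be assigned a stoichiometry of 0.25.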
Additions and Corrections
Correction to “Glycosylation of Human Plasma Clusterin Yields a Novel Candidate Biomarker of Alzheimer’s Disease”
Hui-Chung Liang,
Claire Russell,
Vikram Mitra,
Raymond Chung,
Abdul Hye,
Chantal Bazenet,
Simon Lovestone,
Ian Pike, and
Malcolm Ward
Mastheads
Issue Editorial Masthead
Issue Publication Information