Integrative Metabolomics and Proteomics Allow the Global Intracellular Characterization of Bacillus subtilis Cells and Spores

Reliable and comprehensive multi-omics analysis is essential for researchers to understand and explore complex biological systems more completely. Bacillus subtilis (B. subtilis) is a model organism for Gram-positive spore-forming bacteria, and in-depth insight into the physiology and molecular basis of spore formation and germination in this organism requires advanced multilayer molecular data sets generated from the same sample. In this study, we evaluated two monophasic methods for polar and nonpolar compound extraction (acetonitrile/methanol/water; isopropanol/water, and 60% ethanol) and two biphasic methods (chloroform/methanol/water, and methyl tert-butyl ether/methanol/water) on coefficients of variation of analytes, identified metabolite composition, and the quality of proteomics profiles. The 60% EtOH protocol proved to be the easiest in sample processing and was more amenable to automation. Collectively, we annotated 505 and 484 metabolites and identified 1665 and 1562 proteins in B. subtilis vegetative cells and spores, respectively. We also show differences between vegetative cells and spores from a multi-omics perspective and demonstrate that an integrative multi-omics analysis can be implemented from one sample using the 60% EtOH protocol. The results obtained by the 60% EtOH protocol provide comprehensive insight into differences in the metabolic and protein makeup of B. subtilis vegetative cells and spores.


■ INTRODUCTION
Bacillus subtilis (B.subtilis) is a prevalent Gram-positive bacterium responsible for a wide range of food spoilage in the food industry. 1As an endospore-forming bacterium, B. subtilis is found in many adverse environments, such as soil, the gut of terrestrial and aquatic animals including mammals, industrial installations, and healthcare facilities. 2 Survival within these extreme conditions is accomplished by the integration of adaptive responses at the protein and metabolite levels leading to the generation of spores with their unique multilayered structural 3 and resistance features. 4B. subtilis is a go-to model species in work to elucidate the fundamental principles of spore formation. 5However, while B. subtilis sporulation has been studied for many years, it is only recently that researchers have combined multilevel high-throughput data to understand spores' survival mechanisms and adaptation to extreme environments at the molecular level. 6,7ith the advancement of high-throughput technologies, large-scale molecular omics data sets have been widely utilized in biological research and have made possible cost-efficient, highthroughput analysis of biologic molecules. 8The combination of multi-omics allows researchers to understand the functional changes of organisms caused by genes or environments through systematic biological information flow (DNA, RNA, protein, metabolite, and lipid), rather than just using a single omics content to find the reason for the change. 9Recently, genomics and transcriptomics research has made tremendous progress, but this information alone cannot predict or deduce essential information at the protein, metabolite, or lipid levels. 10For instance, changes in gene expression levels do not always correlate with translated protein abundance due to temporal shifts in the transcriptome and proteome. 11However, in multiomics studies that include transcriptomics or proteomics, or both, there are very few studies that also examine the metabolome. 12Importantly, metabolomics is a powerful tool for deciphering microbial metabolism and bridging the phenotype−genotype gap because it amplifies proteomic changes and provides better characterization of biological phenotypes than any other method. 13ulti-omics experiments often contain many different steps, each of which strongly affects the results obtained.As the first step of multi-omics research, sample preparation becomes particularly important, as the collection of reliable and highly reproducible data is crucial for drawing correct conclusions later.Due to the chemical diversity and complexity of biological components in an organism, the efficiency of the applied extraction method greatly affects the omics data, even apart from sample quality. 14Additionally, the amounts of samples are often limited, which means that it is desirable to obtain as much molecular information from a single sample as possible.The use of the same samples for multi-omics analyses will also increase consistency and comparability and decrease effort inherent in different parallel sample handling. 15Multi-omics studies often use methods originally developed to extract either metabolites or lipids, where protein precipitation also occurs.For instance, the Folch 16 and Matyash 17 methods are biphasic extraction methods in which samples are homogenized with two immiscible lipophilic and hydrophilic solvents to simultaneously extract polar and nonpolar compounds, separating them into two solvent layers.The aqueous phase contains hydrophilic metabolites, the organic phase contains lipids and other hydrophobic metabolites, and protein is precipitated in the interphase or at the bottom of the sample.Monophasic extracts isolate water-soluble metabolites or lipids preferentially using polar solvent(s) or nonpolar solvent(s), respectively.The biphasic methods allow the extraction of polar and lipid metabolites from the same sample, unlike monophasic extraction.However, monophasic protocols avoid the need to remove the biphasic liquid layers, which is difficult to automate and likely increases technical variation. 18ecent examples of use of biphasic extraction methods in pursuit of multi-omics analysis is the use of chloroform/ methanol/water (CHCl 3 /MeOH/H 2 O) to extract 1967 metabolites, 424 lipids and 1849 proteins, from a single Arabidopsis sample, 19 while SIMPLEX extraction used methyl-tert-butylether/methanol/water (MTBE/MeOH/H 2 O) to identify 75 metabolites, 360 lipids, and 3327 proteins in mesenchymal stem cells. 20However, simultaneous extraction of DNA (genomics), RNA (transcriptomics), proteins (proteomics), metabolites (metabolomics), and lipids (lipidomics) from a single biological sample is rare.This may be caused by the fact that optimal buffers, solutions, and protocols for extracting these very different molecules may be mutually exclusive, and longer extraction protocols can lead to the molecular degradation of more labile molecular species.
In microbial metabolomics of Gram-positive bacteria such as B. subtilis, sampling accuracy of the metabolome is key, as the highly dynamic nature necessitates fast sampling and quenching to preserve the metabolic state.The energy status of the B. subtilis vegetative cells, characterized by its adenylate energy charge (EC), commonly 0.8−0.85 in actively growing cells, can be used for comparing extraction conditions and downstream analytic methods. 21These authors found that 60% cold EtOH extraction of rapidly cooled B. subtilis cells isolated by vacuum filtration followed by cold water extraction gave an EC value of 0.81 ± 0.03.However, the EC in spores of several Bacillus species is ≤0.2 with little if any ATP present in these dormant and thought of as metabolically inactive bacterial survival structures. 22,23Metabolic and lipidomic profiling of Bacillus vegetative cells has been reported. 21,24,25−30 No studies to date globally compared the spore and vegetative cellular composition utilizing mass spectrometrybased metabolomics in conjunction with proteomics for multiomics analyses.Therefore, in this study, we compare several biphasic and monophasic extraction protocols to find a reproducible and robust methodology to obtain simultaneous intracellular metabolome and proteome data profiles for B. subtilis spore and vegetative cell integrative multi-omics analysis.We evaluated two biphasic methods (CHCl 3 /MeOH/H 2 O, MTBE/MeOH/H 2 O), and two monophasic extraction methods, i.e., 60% ethanol (EtOH) and the combination of separate extractions using acetonitrile/methanol/water (ACN/MeOH/ H 2 O) and isopropanol/water (IPA/H 2 O) for polar and apolar compound extraction.The quality and repeatability, as well as quantity and identity of metabolites extracted by different extraction methods, were investigated by profiling intracellular metabolites with liquid chromatography−mass spectrometry (LC-MS).Furthermore, we compared the proteome coverage of vegetative cells and spore samples to control samples prepared by standard proteomics sample preparation procedures.Together, these results provide a basis for the detailed multiomics profiling of B. subtilis spores and growing cells, which may lead to a deeper understanding of molecular mechanisms of spore formation and their unparalleled stress resistance properties.While our method is generally comprehensive and robust, we acknowledge that improvements can still be made as, compared to classical boiling propanol extraction and in contrast to expectation, 3-PGA and nucleoside triphosphates were not robustly detected in spores and cells respectively.

Vegetative Cell Culturing and Sporulation
A B. subtilis PY79 single colony was picked from an LB plate, 31 inoculated in 3−5 mL of LB medium (pH 7.5), and grown at 37 °C in a shaking incubator at 200 rpm until early log phase (OD 600 0.3−0.4).For vegetative cell preparation, 1 mL of early log-phase cells was inoculated into a 100 mL flask and shaken at 200 rpm, 37 °C.Upon reaching midexponential (OD 600 0.6− 0.7) growth in the LB medium, the cells were subsequently harvested.For sporulation, serial dilutions of the early log-phase cells were made in 5 mL of defined sporulation minimal medium, described previously 32 and incubated at 37 °C overnight.The culture with an OD 600 of 0.3−0.4 was diluted with the prewarmed MOPS medium and grown at 37 °C in a 500 mL flask shaking at 200 rpm for 3 days, and spores were harvested by centrifugation.The harvested spores were washed three times with chilled milliQ-water to reduce remaining vegetative cells, and Histodenz gradient centrifugation 28 was used to remove vegetative cells and phase dark cells.Only samples with >95% of phase bright spores as examined under a microscope were used for further research.

Experimental Design
There are four metabolite extraction methods used in this study, the monophasic 60% (w/v) EtOH, 33 a combination of two monophasic extractions (TMEC) 34 and the biphasic MTBE 15 and Bligh and Dyer 35 extractions.For each method, there were three biological replicates, each analyzing samples of OD 600 = 20.

Extraction Methods
Monophasic Method: 60% (w/v) EtOH Extraction.For metabolite and protein extraction and cell disruption, the harvested vegetative cell and spore pellets were transferred into 15 mL Falcon tubes containing 1 mL of cold 60% EtOH and quenched in liquid nitrogen.The whole extraction process must be carried out on ice or at 4°.Subsequently, samples were thawed on ice and split into three bead-beating tubes containing 0.5 mL of 0.1 mm zirconium-silica beads (BioSpec Products, Bartlesville, OK, USA).−38 In addition, bead beating has been demonstrated to yield superior extraction results compared to heat-based methods using detergent-based buffers for bacterial samples. 39,40The samples extracted from each replicate were combined in a 15 mL Falcon tube following cell disruption, the zirconium-silica beads were thoroughly washed two times with 1 mL of cold 60% EtOH to collect metabolites and cell debris from the beads.The washing solutions were combined with the metabolite extracts and centrifuged for 5 min at 4 °C and 8000 rpm to harvest the supernatant for metabolomics analysis and cell debris and precipitated proteins for proteomics analysis, and both were dried under nitrogen flow.The resulting samples can be stored at −80 °C until analysis.
Biphasic Method: Extraction with MTBE/MeOH/H 2 O. Twenty OD 600 units of vegetative cells and spores were suspended in 1 mL of cold MeOH separately and vortexed well for 30 s. Afterward, the quenching, bead beating, and washing steps were as described in the 60% EtOH extraction method, except for the different organic solvents.Then, 10 mL of MTBE was added, and samples were incubated on an orbital shaker at 100 rpm for 45 min at 4 °C.To induce phase separation, 2.5 mL of water was added to each sample tube and then vortexed vigorously.Samples were centrifuged at 10,000 g for 5 min at 4 °C, and the protein pellet was in the bottom of the tube.The upper phase, lipid-containing and lower phase (polar and semipolar metabolites) were transferred to a glass tube, and its subsequent evaporation in a speed-vacuum concentrator or nitrogen evaporator.After the supernatant was removed, the protein pellet was dried by evaporation and stored at −80 °C until analysis.
Biphasic Method: Extraction with CHCl 3 /MeOH/H 2 O.This extraction method was performed using the same procedure as the MTBE/MeOH/H 2 O extraction, with the difference that the MTBE was replaced by CHCl 3 and a ratio of CHCl 3 /MeOH/H 2 O of (2:2:1.8v/v/v).Subsequently the upper polar metabolites, the lower apolar metabolites, and the interphases containing protein precipitate were collected.Both different phases and pellet were dried in a centrifugal vacuum evaporator or a nitrogen evaporator and stored at −80 °C.

Extraction of Metabolites from Dormant Spores Using Boiling Propanol
To evaluate whether the bead milling procedure to disrupt spores and subsequent extraction using 60% EtOH had any effect on the composition of the metabolome extracted we also extracted 20 OD 600 units of spores as described previously 22 through direct extraction in boiling 80% 1-propanol for 5 min without mechanical disruption of spores.Extracts were lyophilized and then dissolved in cold water.Afterward, the fluids were centrifuged at 14,000g for 1 min, and all supernatant fractions were stored at −80 °C for analysis.

Single Tube Solid Phase Sample Preparation (SP3)
The vegetative cell or spore protein pellets obtained above were dissolved in 300 μL of 1% SDS in 100 mM ABC and vortexed thoroughly.The BCA assay was used to determine the concentration of protein according to the manual, and TCEP and CAA were added to 10 and 30 mM, respectively, and the mix was incubated for 0.5 h at room temperature.Samples were processed using the SP3 protein clean up, 41 and trypsin (protease/protein, 1:50, w/w) was added and protein was digested at 37 °C overnight.The supernatant was acidified to decouple peptides from the beads with formic acid (FA) (1% final concentration and a pH ∼ 2) and centrifuged at 3000g at room temperature, the supernatant was moved to a clean tube, the centrifugation was repeated, and the supernatant was collected for LC-MS analysis.

LC-MS/MS Analysis
Metabolomics.The dry apolar metabolites obtained by extractions by monophasic methods were reconstituted in 200 μL water while by biphasic methods were reconstituted in 200 μL IPA/ACN/water (4:3:1) and all the dry polar metabolites were in 200 μL ACN/water (1:1).10 μL was injected for nontargeted metabolomics onto a CSH-C18 column (100 mm × 2.1 mm, 1.7 μm particle size, Waters, Massachusetts, USA) for apolar metabolites analysis or a BEH-Amide column (100 mm × 2.1 mm, 1.7 μm particle size, Waters, Massachusetts, USA) for polar metabolites by an Ultimate 3000 UHPLC system (Thermo Scientific, Dreieich Germany).Using a binary solvent system (A: 0.1% FA in water, B: 0.1% FA in ACN) metabolites were separated on the CSH-C18 column by applying a linear gradient from 1 to 99% B in 18 min at a flow rate of 0.4 mL/min, while metabolites were separated on the BEH-Amide column by applying a linear gradient from 99% B to 40% B in 6 min and then to 4% B in 2 min at a flow rate of 0.4 mL/min.Eluting analytes were electrosprayed into a hybrid trapped-ion-mobilityspectrometry quadrupole time-of-flight mass spectrometer (TIMS-TOF Pro, Bruker, Bremen Germany), using a capillary voltage of 4500 V in positive mode and 3500 V in negative mode, with source settings as follows: end plate offset 500 V, dry temp 250 °C, dry gas 8 l/min and nebulizer set at 3 bar both using Journal of Proteome Research nitrogen gas.Mass spectra were recorded using a datadependent acquisition approach in the range from m/z 20− 1300 for polar and 100−1350 for the apolar metabolites in positive and negative ion mode using nitrogen as collision gas.Auto MS/MS settings are as follows: quadrupole ion energy 5 eV, quadrupole low mass 60 m/z, and collision energy 7 eV.Active exclusion was enabled for 0.2 min, reconsidering precursors if the ratio current/previous intensity >2.
Proteomics.Peptides were dissolved in 6 μL of water containing 0.1% FA and 3% ACN and then 1 μL of 200 ng/μL (measured by a NanoDrop at a wavelength of 215 nm) of the peptide was injected by an Ultimate 3000 RSLCnano UHPLC system (Thermo Scientific, Germeringen, Germany).Following injection, the peptides were loaded onto a 75 μm × 250 mm analytical column (C18, 1.6 μm particle size, Aurora, Ionopticks, Australia) kept at 50 °C and flow rate of 400 nl/min at 3% solvent B for 1 min (solvent A: 0.1% FA, solvent B: 0.1% FA in ACN).Subsequently, a stepwise gradient of 2% solvent B at 5 min, followed by 17% solvent B at 24 min, 25% solvent B at 29 min, 34% solvent B at 42 min, and 99% solvent B at 33 min, was held until 40 min, returning to initial conditions at 40.1 min and equilibrating until 58 min.Eluting peptides were sprayed by the emitter coupled to the column into a captive spray source (Bruker, Bremen Germany) that was coupled to a timsTOF Pro mass spectrometer.The TIMS-TOF was operated in the PASEF mode of acquisition for standard proteomics.In PASEF mode, the quad isolation width was 2 Th at 700 m/z and 3 Th at 800 m/ z, and the values for collision energy were set from 20 to 59 eV over the TIMS scan range.Precursor ions in an m/z range between 100 and 1700 with a TIMS range of 0.6 and 1.6 Vs/cm 2 were selected for fragmentation.Ten PASEF MS/MS scans were triggered with a total cycle time of 1.16 s, with target intensity of 2e 4 , intensity threshold of 2.5e 3 , and a charge state range of 0−5.Active exclusion was enabled for 0.4 min, reconsidering precursors if the ratio current/previous intensity >4.

LC-MS/MS Data Processing
The metabolite mass spectrometry raw files were submitted to MetaboScape 5.0 (Bruker Daltonics, Germany) used to perform data deconvolution, peak-picking, and alignment of m/z features using the TReX 3D peak extraction and alignment algorithm (EIC correlation set at 0.8).All spectra were recalibrated on an internal lockmass segment (NaFormate clusters), and peaks were extracted with a minimum peak length of 12 spectra (True for recursive extraction) and an intensity threshold of 500 counts for peak detection.In the negative mode ion deconvolution setting, [M − H] − was set for the primary ion, seed ions were  42 Features were annotated using SMARTFORMULA (narrow threshold, 3.0 mDa, mSigma:15; wide threshold, 5.0 mDa, mSigma:30), to calculate a molecular formula.Spectral libraries including Bruker MetaboBASE 3.0, Bruker HDBM 2.0, MetaboBASE 2.0 in silico, MSDIAL LipidDBs, MoNA VF NPL QTOF, AND GNPS export were used for feature annotation (narrow threshold, 2.0 mDa, mSigma 10, msms score 900, wide threshold 5.0 mDa, mSigma:20 msms score 800).Analyte lists containing 667 compounds with a retention time (RT) (narrow threshold, 1.0 mDa, 0.05 min, mSigma: 10, msms score 900; wide threshold, 5.0 mDa, 0.1 min, mSigma 50, msms score 700) was also used to annotate deconvoluted features.An annotated feature was considered to be of high confidence if more than two green boxes were present in the Annotation Quality column of the program and low confidence if fewer than two green boxes were present.Resulting data were exported for further analysis with MetaboAnalyst 5.0 (https:// www.metaboanalyst.ca/MetaboAnalyst/home.xhtml).At first, the data showing a poor variation were filtered on Inter Quartile Range (IQR) and then features were normalized by median normalization, scaled by auto scaling, and transformed to a logarithmic scale (base of 2).
Generated mass spectra for pellets (vegetative cells and spores) were analyzed with Maxquant (ver.1.6.14)for feature detection and protein identification.Tims-DDA was set in type of group specific, and other parameters were set as the default.Searches included variable modifications of methionine oxidation, and a fixed modification of cysteine carbamidomethyl and the proteolytic enzyme was trypsin with a maximum of two missed cleavages.A B. subtilis database (version 2019 downloaded from Uniprot) was used for database searches.To improve the mass accuracy of matching precursors, the "match between runs" option was applied within a match window time of 0.2 min and a match ion mobility window of 0.05.Proteins of label free quantification (LFQ) calculated for each pellet represented normalized peptide intensities correlated with protein abundances.Finally, all the quantification and annotation information were summed in the output proteinGroup.txt.
Perseus (1.6.15.0) was processed for analysis of Maxquant results (proteinGroup.txt).Briefly, the LFQ intensity of each sample was selected as the main data matrix and then potential contaminants, reverse and only identified by site were removed.Afterward, the LFQ intensity of proteins of which the unique peptides >1 was transformed to log2[x], and proteins in at least two of the three replicates were further analyzed.Normalization was achieved using a z-score with matrix access by rows (rows are proteins, columns are samples).K means was used for the hierarchical clustering of rows (proteins) and significant protein expression differences between different extractions were identified using P-value <0.05 from Student's t test.
Calculation and plotting for principal component analysis (PCA) of the metabolome profiles were implemented with SIMCA software.Kyoto encyclopedia of Genes and Genomes (KEGG) enrichment analysis of differentially expressed proteins was implemented by the ClusterProfiler R package.For differentially expressed metabolites, MetaboAnalyst 5.0 (https://www.metaboanalyst.ca/)was applied for KEGG pathway enrichment.Gene Ontology (GO) enrichment analysis was applied with geneontology (http://geneontology.org/).For metabolites and proteins correlation network calculation and visualization, R packages WGCNA and Cytoscape (Version 3.8.2) were used.

Comparison of the Metabolome and Lipidome Extracted from Cells and Spores
Given that extraction methods can alter the measurable metabolome and lipidome, we first set out to compare how different extraction methods affect mass spectrometry-based analysis of cells and spores of B. subtilis.To assess the four extraction methods, monophasic (EtOH, TMEC) and biphasic (MeOH/MTBE, MeOH/CHCl 3 ), we assessed the number of metabolite features extracted, the variability of quantitation, and Values between brackets show an interval of 90% of the CVs of an extraction method based on three biological replicates.   1 and Figure S1).Among the detected m/z features, some were putatively annotated with a molecular structure (Table 1).Following removal of redundancy (features being assigned the same putative structure), the four protocols shared 338 cellular metabolites of 598 total and 281 spore metabolites of 532 total (Figure 1A,B).Of the extraction methods, the monophasic EtOH method had the most putatively annotated metabolites in both vegetative cells (505) and spores (484) as well as most that were only found in this extraction method (34 in cells and 50 in spores).
To assess whether there was any specific bias in the type of molecules extracted by the different methods, we assigned the nonredundant putative annotations into molecular categories 43 (Figure 1C,D).Only metabolites identified in at least two replicates were considered in the following analysis.Strikingly both monophasic methods show a markedly higher number of molecules in the categories of organic acids and derivatives and nucleosides, nucleotides, and analogues in both spores and cells compared to the biphasic methods.On the other hand, the biphasic methods had a larger number of molecules in the category of lipids and lipid-like molecules in vegetative cells but not spores.We examined the categories of the lipids extracted by the four different methods from cell and spore samples more closely (Figure 1E,F).
The biphasic methods found a larger number of lipids (124 vs 97) in vegetative cells, while lipids detected differed little (43 vs  46) in spores between the biphasic and monophasic methods.Strikingly diacyl-and triacyl-glycerols (DAG and TAG) were only detected in samples extracted by biphasic methods, and phosphatidylethanolamines (PE) were mainly found in samples extracted by monophasic methods in spores (15 vs 4) and by biphasic methods in vegetative cells (30 vs 23).In terms of solvent polarity, MTBE and CHCl 3 are more nonpolar than EtOH and IPA, explaining why biphasic methods show a better coverage of lipid and lipid-like molecules.This was less obvious in spore samples, even though, as was shown in Bacillus licheniformis, the total lipids of vegetative cells (2.9% of the dry weight) is not very different of that of spores 44 (2.1% of the dry weight).We additionally compared the 60% EtOH extraction to extraction in boiling propanol, which is a method that has been used previously in small molecule analysis of Bacillus spores. 22e assessed whether mechanical spore disruption or residual enzyme activity had any large influence on the quantitation of tentatively annotated molecules from these spores, but no large bias was obvious, although on average nucleotide extraction seemed better with boiling propanol (Figure S2 and Table S1).Noticeably, compared to boiling propanol extraction and in contrast to expectation, 3-PGA and nucleoside triphosphates were not well detected in spores and cells, respectively.
Overall, depending on the background (spores or cells), different extraction methods seem to provide different benefits and drawbacks in terms of reproducibility, number of features detected, and types of molecules annotated without one clearcut best approach.For multi-omics, however, small molecule extraction is only one part of the performance to take into consideration; the other is protein extraction and proteome coverage.

Proteome Coverage of B. subtilis Cells and Spores Using Different Extraction Methods
Since the four extraction protocols assayed in the above were originally established for the extraction of small molecules, we next focused our efforts toward assessing their utility in extracting proteins for proteomics analysis.To establish an optimal protocol for multi-omics analyses from a single sample, we investigated if protein fractions isolated by the two monophasic and biphasic extraction procedures yield similar results compared to extracting proteins directly with 1% SDS (n = 3 for all), which is a common lysis buffer in proteomics.To do so we also dissolved the protein fractions of B. subtilis cells and spores in 1% SDS and used SP3 to clean up all samples prior to digestion with trypsin.Analysis of the protein fractions from the different small molecule extraction methods shows similar or more proteins identified and quantified in cells or spores (detected in all three replicates) compared to direct extraction using 1% SDS (Table 2).
Notably, most proteins identified were from the EtOH extraction in cells (1785 proteins) and from the IPA extraction of TMEC in spores (1659 proteins).Obtaining reproducible quantitative information is critical to identify differential protein expression for downstream analysis in proteomics studies.Therefore, we compared the variance in protein quantitation between the different methods, and median CV showed little difference (ranging between 0.10 and 0.14) between the Values between brackets shows interval of 90% of the CVs of an extraction method based on three biological replicates.different extraction methods compared to the standard extraction using 1% SDS (Table 2 and Figure S3A,B).As such, these methods seem on par or better when compared to the standard extraction method in proteomics when regarding these metrics.Apart from reproducible quantitation, the method of extraction can change the composition of the proteome under study, while digesting proteins using different buffers, surfactants, or denaturing agents can influence the proteome coverage. 45The extraction reagents used in this study are all organic solvents (varying in polarity), which means their precipitating effect on proteins is mediated by increasing the attractive forces between the protein molecules and causing a dehydration effect on the proteins to facilitate the interaction between them. 46,47The overlap of proteins identified from all different extraction methods is very high (Figure S3C,D In cells, 60% EtOH was highly similar in protein quantitation as 1% SDS in cluster 1 but had lower intensities of proteins in cluster 2, which was enriched in proteins of the cell periphery and membrane, and higher amounts of proteins in cluster 3, which was enriched for ribosomal proteins (Figure 2B).There was little difference in the other four extraction methods.In spores, the 1% SDS extracted a larger relative quantity of proteins in clusters 1 and 3, which are enriched in the GO-terms ribosomal subunit and cell wall but less in cluster 2 (Figure 2D).Overall, there is no significant evidence that the metabolite extraction methods affected the coverage of the proteome in our experiments, indicating it is feasible to do simultaneous metabolomic and proteomic analyses from the same sample in B. subtilis cells and spores.Considering performance on these multiple omics and convenience in sample preparation, we

Metabolome of Vegetative Cells and Spores
Many studies have examined the properties of B. subtilis that show significant changes in all kinds of resistance with the change of physiological form. 48However, no studies have tried to globally compare the metabolite content between vegetative cells and spores to obtain a complete view of the molecular composition of these two distinct states of the bacterial life cycle.Comparison of the m/z features that had a putative annotation of a molecular structure showed that 213 metabolites were detected in both data sets, with 292 metabolites uniquely found in cells and 271 metabolites only found in spores (Figure 3A).Metabolites that were annotated in vegetative cells but not in spores or vice versa are likely predominant in cells and spores and potentially important, we further referred to these as cell or spore specific metabolites.However, differences in dynamic range of the metabolome between these two distinct forms of B. subtilis can also lead to nondetection of some of these metabolites in one of the two sample types.Among the spore specific molecules with a putative annotation was dipicolinic acid which is a well-known and abundant molecule involved in spore resistance to a variety of stresses.
Next, we examined the differential expression of these 213 shared metabolites by performing a pair wise comparison.Log2 fold changes greater than 1 (or less than −1) and a −log10 Pvalue greater than 1.31 indicate metabolites that were predominant in cells (red) or spores (blue, Figure 3B).A total of 117 metabolites were differentially present in vegetative cells or spores.We performed KEGG pathway enrichment analysis with all the differentially present metabolites and specific metabolites.Enrichment analysis showed that the altered molecules belonged to 15 pathways, including arginine biosynthesis and purine metabolism (Figure 3C).The detailed relative levels of these metabolites in vegetative cells and spores of these KEGG pathways are shown in Figure 3D.These data show that some small molecules exhibit a significantly higher accumulation in spores compared to bacteria with active metabolism, presumably stored during sporulation.

Analysis of the Cell and Spore Proteome of B. subtilis
To further explore the differences between vegetative cells and spores, we quantitatively investigated the proteomes by the analysis of proteins precipitated during EtOH extraction.Following analysis, 1665 proteins were identified and quantified in vegetative cells and 1562 in spores, which is a distinct improvement over our prior study of B. subtilis spores and cells where we quantified 1086 spore and cellular proteins. 7ollowing stringent filtering, 568 proteins were found to be specific for vegetative cells and 465 were uniquely identified in spores (Figure 4A).Among the uniquely identified in spores unsurprisingly many structural spore and spore coat proteins were found as well as proteins related to spore revival, while the proteins detected in vegetative cells were related to cellular mobility among others.Of the 1097 proteins detected in both stages of the lifecycle of B. subtilis, 693 were predominant in vegetative cells (log2 fold change > 1, P-value < 0.05) while 94 proteins were predominant in spores (log2 fold change <−1, Pvalue < 0.05, Figure 4B).These life cycle specific and predominant proteins were searched for functional enrichment of GO-terms to obtain an overview of molecular function (MF), cellular component (CC), and biological process (BP) differences when Bacilli transition from vegetative cells to spores (Figure 4D).This identified the leading terms as mostly associated with cellular metabolic processes and CCs.To complement these analyses, we also looked for enrichment in the KEGG database to explore the unique and common functional pathways.The KEGG enriched data were classified into three groups (i.e., FDR < 0.05, P-value < 0.05, not significant) from which the top 10 pathways, (red dots Figure 4C) were used for further analysis.Starting from these significantly enriched pathways, a heatmap was constructed to show the expression level in vegetative cells and spores.Results showed that most proteins significantly enriched with the term "the biosynthesis of secondary metabolites" (Figure 4E) had a higher expression in vegetative cells.

Integral Analysis of Omics from Vegetative Cells and Spores
To illustrate the value of multiple omics measurements on the same sample, we performed integrative pathway analysis of differentially expressed metabolites and proteins found in the analysis based on metabolomics and proteomics results of vegetative cells versus spores.This analysis comprised three common KEGG pathways (pyrimidine metabolism, glycine, serine, and threonine metabolism, and methane metabolism), and a Spearman correlation between metabolomics and proteomics data sets was calculated.Only the pyrimidine metabolism pathway showed a Spearman correlation >0.7 and P-value <0.05 between changes in metabolites and protein abundance in vegetative cells versus spores (Figure 5A).The pyrimidine metabolism pathway was highly correlated at both the level of metabolite and proteins expression changes with correlation coefficient of 0.89 and P-value of 0.018.The correlation network shown in Figure 5B consists of 133 edges containing 28 proteins and 10 metabolites, demonstrating a significant coherency between changes in the composition of metabolites and proteins between these two states.
Through studying the comparison between vegetative cells and spores, the results show that 60% EtOH extraction is suitable to identify the cross talk between metabolites and proteins.The correlative effects between metabolome and proteome give additional information on the intermolecular regulation of biological pathways, which is not possible by using only a single omics analysis (Figure 5).

■ DISCUSSION
Although multi-omics studies have significantly increased researchers' potential toward understanding complex biological systems, at the same time, multi-omics studies face challenges in different aspects, such as study design and variable depth of analysis of different omics and the complexity of subsequent integrative data analysis.Among the types of integrated multiomics studies published, the combination of transcriptomics and proteomics is the most common, followed by the integration between transcriptomics and genomics. 49Metabolomics is a young field compared to other omics, in which untargeted metabolomics provides an unbiased, high-coverage metabolome, important for the discovery phase of research.However reproducibility of analysis can be problematic, and it requires complex data processing. 50Therefore, there are few studies integrating metabolomics, and none to date comparing vegetative cells and spores.However, for the most direct information about the physiological state or phenotype, integrating large-scale quantification of proteins and metabolites provides the best description of the biological system.Thus, finding an optimal and robust sample preparation method that can sufficiently address both the proteome and metabolome is a first essential step for robust multi-omics analysis.
This study provides a comprehensive and comparative analysis of the metabolome and proteome of B. subtilis vegetative cell and spore samples.We compared four different extraction methods regarding the composition of metabolites and proteins found as well as the reproducibility of quantification and the number of identified metabolites and proteins.We propose extraction with 60% EtOH, of which the advantages are (i) easy operation and avoid contamination between different polarity layers, (ii) fewer procedures leading to low sampling errors, (iii) a higher number of identified metabolites, (iv) low variation of the proteome, (v) no qualitative losses of the proteome composition, and (vi) no toxicity in line with lab safety and an environmental sustainability.In-depth classification with lipid composition indicated that the method identified fewer lipids than the biphasic approaches.Nonpolar solvents combined with polar solvents have the most potential to extract lipids. 51,52The neutral lipids are dissolved by nonpolar solvents, but neutral lipids, which are partly associated with polar lipids via hydrogen bonding, are extracted to a lesser degree, a polar solvent is used to break these hydrogen bonds extracting these lipids along with polar lipid from the sample. 53Here, the 60% EtOH extraction does extract lipids but is somewhat less suited for studies focused on lipid detection.
The application of 60% EtOH and identification of a close correlation between small molecules and protein abundance were shown by studying the composition of B. subtilis vegetative cells and spores.In contrast to the study of the transcriptome, proteome or both, far fewer studies have focused on combining intracellular metabolomics with proteomics in a multi-omics approach. 12,54However, as important as it is to correlate gene and protein expression to study post-transcriptional regulation and phenotypic-function, the availability of cofactors (e.g., calcium, zinc, ATP, NAD), flux and activities of each enzyme also need to be considered to fully describe the regulation of biological processes. 55Therefore, we attempted to elucidate the difference between B. subtilis vegetative cells and spores within the broader context of both protein and small molecular composition to gain a more complete overview of the interaction between the metabolite and protein content.Here, we profiled the B. subtilis vegetative cell and spore proteome and metabolome by 60% EtOH extraction, followed by LC-MS analysis.Our data showed low technical variation, enabling insight into metabolite and protein composition within spores and cells.We found three common differentially expressed pathways in both metabolomics and proteomics and one prominent example was the pyrimidine metabolism pathway, which showed high coherency in changes of metabolites and protein quantities.Through the construction of the association network between metabolites and proteins, we have attracted our attention the attention of metabolites to O31801(yncF) and P19079 (cdd), which were negatively correlated with metabolites, attracted our attention.In nucleotide metabolism, deoxyuridine 5′-triphosphate pyrophosphatase (YncF) produces 2'-deoxyuridine 5-monophosphate (dUMP) and decreases the intracellular concentration of dUTP, preventing DNA uracil incorporation.Uracil, however, can originate from cytosine deamination and is one of the most frequent erroneous bases in DNA. 56, and cytidine deaminase (CDA, encoded by the cdd gene) is responsible for deaminating cytidine to uridine.
As structural components of several key molecules, pyrimidines play an important role in a wide range of cellular functions, including DNA and RNA synthesis, as a component of triphosphates (UTP and CTP). 57Vegetative cells express a high level of metabolites and proteins involved in pyrimidine metabolism, which is in accordance with previous findings. 7To complete bacterial replication, vegetative cells growing logarithmically require considerable amounts of energy and nucleotides.In contrast, there has not been any report concerning the higher levels of the yncf and cdd enzymes in spores.The transition from vegetative cells to spores reflects the differences that affect developmental programs (such as sporulation), suggesting that these pathways are plastic in nature and may be under greater environmental selection. 58verall, we compared multi-omics sample extraction protocols for B. subtilis vegetative cells and spores, which provides a window on best practices for future systemic integrative studies on sporulation.Our research concluded that 60% EtOH extraction worked easily with more identified metabolites and low variation for an integrative metabolomics and proteomics study of B. subtilis vegetative cells and spores.While the current study did not utilize methanol/MTBE and methanol/CHCl 3 -based protocols, their application is highly relevant in metabolomics investigations that predominantly target lipids or other apolar metabolites.We are aware of the limitation of the 60% EtOH extraction method not resolving, in contrast to expectation, 3-PGA and nucleoside triphosphates in spores and cells, respectively.Nonetheless, we are convinced that this large-scale comparative study of four different extraction approaches for the extraction provides a comprehensive and detailed data set in B. subtilis multi-omics methodology studies that will be beneficial for further study on multimolecular cellular processes during spore formation and spore revival.

Data Availability Statement
All mass spectral data have been deposited into Massive (https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp), with the identifier MSV000092140.

Figure 1 .
Figure 1.Comparison of metabolites identified by different extraction methods in B. subtilis cells and spores.(A, B) Venn-diagram showing the overlapping (shared) and unique metabolites identified in the cells and spores in different methods.(C, D) Bar graphs of metabolites showing the difference in metabolite classes in cells and spores extracted by different methods.(E, F) Comparison of distribution of annotated lipids in major lipid classes identified by different extraction methods in B. subtilis cells and spores.

Figure 2 .
Figure 2. Differences in proteomic profiles of different extraction methods.(A,C) Heatmap of the quantified proteins from the different extraction methods.Proteins from cells and spores are clustered in three clusters (K1, K2, K3) in K-means cluster analysis.(B and D) Bar graphs of fractions of the cluster classified to GOCC categories.Only the top three categories are shown in the graphs.Colors of the bars correspond to the clusters in heatmaps.

Figure 3 .
Figure 3. Analysis of metabolome profiles of B. subtilis vegetative cells and spores.(A) Venn-diagram illustrating the total numbers of annotated compounds uniquely found in cells and spores and the shared metabolites.(B) Volcano plot of comparison between cell and spore metabolomes.Student's t test (P-value < 0.05) and fold changes >1 or <−1 were in red or blue dots.(C) KEGG pathway enrichment analysis of significant metabolites.The x-axis represents pathway impact, and the y-axis represents the−log10(P-value).The dot size represents the metabolite numbers in the pathway.The red color indicates the enriched pathways (FDR < 0.05).(D) Heatmap showing the expression levels of 15 enriched metabolic pathways in cells and spores.The color legend indicates the z-score transformed intensity of the pathway.
) with 1231 proteins and 1157 proteins in vegetative cells and spores found irrespective of extraction method.While 92 (cells) and 33 (spores) proteins were specifically obtained by one of the metabolite extraction methods compared to 1% SDS extraction, only 4 proteins were unique to the 1% SDS extraction.Proteins quantified following the different extraction methods are shown in a heatmap to show the overall differences in representation of individual proteins in the data sets (Figure 2A,C).The quantified proteins are classified into three clusters according to K-means clustering, and these clusters were also annotated by Gene Ontology terms regarding Cellular Component (GOCC).

Figure 4 .
Figure 4. Analysis of proteomics profiles of B. subtilis vegetative cells and spores.(A) Venn-diagram illustrating the total numbers of proteins identified uniquely or shared in cells and spores.(B) Volcano plot of comparison between the cell and spore proteomes.Student t test (P-value < 0.05) and fold changes >1 or <−1 were in red or blue dots.(C) X-axis represents fold enrichment, and the y-axis represents the −log10(FDR).The dot size represents the protein numbers in the pathway.The red color indicates the enriched pathways (FDR < 0.05).(D) GO analysis of differentially and unique proteins showing the items with FDR < 0.05.(E) Heatmap showing the expression levels of 10 enriched metabolic pathways.The color legend indicates the z-score transformed intensity of proteins.

Figure 5 .
Figure 5. Integrative pathway enrichment analysis of multi-omics data extracted by 60% EtOH.(A) Correlation of the expressions for the pyrimidine metabolism pathway between proteomics and metabolomics.(B) Correlation network of proteins and metabolites in the pyrimidine metabolism pathway.Each node represents one protein or one metabolite.The connections were established by Spearman's correlation (Student's t test, P-value < 0.05, and absolute Spearman's correlation value >0.7).

Table 1 .
Molecular Features Detected Using Different Extraction Methods a a compared if there is any bias in classes of detected molecules.For bacterial cells, the biphasic methods led to the detection of most features (MeOH/MTBE > MeOH/CHCl 3 > EtOH > TMEC), while in spores, monophasic methods (TMEC > EtOH > MeOH/CHCl 3 > MeOH/MTBE) detected most features when comparing m/z features detected in all three biological replicates (Table1).Reproducibility of detection of m/z features was also best for biphasic extractions in bacterial cells, with MeOH/CHCl 3 having the lowest median CV (0.24) and TMEC (0.31) being the highest.Conversely, the monophasic extraction TMEC had the lowest median CV (0.15) in bacterial spores for m/z features detected in all three replicates (Table

Table 2 .
Proteins Detected Using Different Extraction Methods a a 10.1021/acs.jproteome.3c00386.Evaluation of metabolite data quality using different extraction methods; relative representation of molecules extracted by 60% EtOH to 1-propanol in spores; and evaluation of proteome data quality using different extraction methods (PDF) Comparison of annotated metabolites extracted by 60% EtOH and 1-propanol (XLSX) Proteins identified and quantified in B. subtilis cells (XLSX) Proteins identified and quantified in B. subtilis spores (XLSX) Annotated metabolic features in B. subtilis cells and spores