QComics: Recommendations and Guidelines for Robust, Easily Implementable and Reportable Quality Control of Metabolomics Data

The implementation of quality control strategies is crucial to ensure the reproducibility, accuracy, and meaningfulness of metabolomics data. However, this pivotal step is often overlooked within the metabolomics workflow and frequently relies on the use of nonstandardized and poorly reported protocols. To address current limitations in this respect, we have developed QComics, a robust, easily implementable and reportable method for monitoring and controlling data quality. The protocol operates in various sequential steps that aim to (i) correct for background noise and carryover, (ii) detect signal drifts and “out-of-control” observations, (iii) deal with missing data, (iv) remove outliers, (v) monitor quality markers to identify samples affected by improper collection, preprocessing, or storage, and (vi) assess overall data quality in terms of precision and accuracy. Notably, this tool considers important issues often neglected during quality control, such as the need to handle missing values and truly absent data separately to avoid losing relevant biological information, as well as the large impact that preanalytical factors may have on metabolomics results. Altogether, the guidelines compiled in QComics might contribute to establishing gold standard recommendations and best practices for quality control within the metabolomics community.


■ INTRODUCTION
The metabolome encompasses a multitude of metabolites with diverse physicochemical properties, including substrates and end-products that participate in the endogenous metabolism, metabolites derived from absorption and biotransformation of exogenous compounds from diet, lifestyle habits, and environmental pollution (i.e., the exposome), and microbiota-related metabolites. 1,2 Accordingly, mass spectrometry (MS)-based metabolomics generates vast and complex data, comprising hundreds to thousands of molecular features that exhibit wide concentration ranges and large interindividual variability. The acquisition of reproducible and meaningful data requires the application of standard operating procedures (SOPs) to minimize human errors (e.g., errors in pipetting during sample processing), random errors (e.g., fluctuations attributable to intrinsic method precision and other analytical limitations), and systematic errors (e.g., biases that persist throughout the analytical process). −5 On the one hand, metabolite levels can be influenced by a myriad of preanalytical factors related to sample collection and preprocessing, conditions that must be tightly controlled to guarantee the metabolic integrity of biological samples under study. 6 Moreover, although metabolomics normally employs straightforward extraction protocols, the chemical complexity of biological matrices requires, at least, the removal of proteins and potential interferences (e.g., salts and lipids) and other additional method-specific steps (e.g., preconcentration and derivatization). 7 Despite numerous efforts made for its automation, sample preparation still largely relies on human handling in many laboratories, thus being considered to be one of the most error-prone steps in metabolomics. Finally, instrumental stability issues and fluctuations in analytical performance also have a negative impact on reproducibility and accuracy, especially when dealing with large-scale epidemiological studies or complex samples. Over the course of the experiment, the MS system may suffer from significant drifts in sensitivity, mass accuracy, retention time (RT), and peak resolution due to different reasons: (i) contamination with components from the matrix (e.g., lipids), mobile phase (e.g., buffers), or other impurities (e.g., plasticizers); (ii) deterioration and clogging of chromatographic columns; (iii) uncontrolled temperature in the laboratory and/or instrument components (e.g., autosampler trays), which may provoke degradation and precipitation of samples and mobile phases, as well as strongly influence the performance of separation, ionization, and detection processes. 4−16
In conventional targeted analysis, variability factors are typically addressed by using spiked internal standards (ISs), most often consisting of isotopically labeled compounds. However, this approach is not viable in untargeted metabolomics as samples may contain thousands of a priori unknown metabolites. A compromise solution commonly applied in large-scale targeted metabolomics is to use a set of ISs representing the physicochemical space of the method coverage (i.e., at least one IS per metabolite class). 2,17 As an alternative, the analysis of biological QC samples is nowadays the gold standard in metabolomics, as first proposed by Sangster et al. in 2006. 13 These QC samples are usually prepared by pooling equal volumes of all samples under investigation, although surrogate QCs (e.g., commercially available biological samples or certified reference materials) can also be employed when the pooling strategy is not practicable (e.g., large epidemiological cohorts). Then, QC samples are analyzed at the beginning of the analytical run to equilibrate the instrument, as well as at intermittent points throughout the experiment to monitor system stability and correct experimental variability sources during data processing. 18,19 Furthermore, some authors have proposed the use of serially diluted QC samples to discard molecular features lacking a linear correlation between the MS response and the relative concentration, which are expected to be artifacts rather than true biological signals. 20 However, it should be noted that correlation analysis is inherently sensitive to the dilution strategy (e.g., inclusion of diluted QCs at very high/low concentrations). As metabolites can be present in very different and wide concentration ranges, this strategy should consider enough dilution points to properly address linearity, thereby considerably increasing analysis times and hindering subsequent data processing. −16
The main limitation of these previously published QC protocols is their subjective nature and lack of harmonization between laboratories, since predefined quality criteria and standard procedures have not yet been established, as recently highlighted by the mQACC. 11 Furthermore, it should be noted that these methods mainly focus on addressing analytical variability (e.g., drifts along the sequence run) but frequently underestimate the great impact that preanalytical factors may have on metabolomics results. Considering the growing interest in the exposome and its relationship with health outcomes, it also becomes essential to adapt current QC practices for dealing with heterogeneous data sets comprising both endogenous metabolites and xenobiotics, which are normally present at very different concentration levels in the organism.
Herein, we present "QComics", a comprehensive protocol for QC assessment of metabolomics data based on a sequential multistep workflow: (i) initial data exploration (i.e., detection of contaminants, batch drifts, and out-of-control measurements), (ii) handling missing values and truly absent data, (iii) removal of outlying samples, (iv) monitoring quality markers to address preanalytical errors, and (v) final data quality assessment. The quality criteria employed in QComics have been adapted from guidelines for validating conventional bioanalytical methods 21 and from existing literature on current QC practices in metabolomics. 3,11 To simplify its implementation, this QC protocol can easily be performed in software that is available to most researchers (e.g., Microsoft Excel and the MetaboAnalyst webtool), without the need for advanced statistical and programming skills.

■ EXPERIMENTAL SECTION
Blank and Quality Control Samples. The implementation of QComics requires procedural blanks and QC samples, obtained as follows. Blank samples must be prepared by replacing the biological sample under study with water during the extraction process but using the same chemicals, labware, and SOPs as for real samples. In the case of simple extraction protocols (e.g., protein precipitation with organic solvents), blank extraction solvents can instead be employed for this purpose. On the other hand, the QC sample is prepared by mixing equal aliquots of each of the samples under investigation or by using a bulk representative sample when the pooling strategy is not viable. Before analysis, the QC sample must be treated by applying the same extraction procedure used for real samples. Optionally, study samples can also be spiked with a set of ISs, as traditionally done in targeted analysis.
Metabolomics Analysis. To develop and validate QComics, we leveraged metabolomics data that were generated using the untargeted approach described by González-Domínguez et al. as a case study. 22 Briefly, plasma samples were treated with cold acetonitrile for protein precipitation, and metabolite extracts were then analyzed by reversed-phase ultrahigh-performance liquid chromatography coupled to high-resolution mass spectrometry (UHPLC-MS), using the operating conditions described elsewhere. 22 The injection order of samples in the MS system should follow this sequence:
(1) Inject five consecutive procedural blank samples to stabilize the system (e.g., operating temperatures and chromatographic pressure) and to check the background noise.
(2) Inject several consecutive QC samples to condition the system for the study matrix (i.e., stable chromatographic pressure, reproducible RT, peak area, and peak shape for selected metabolites). This conditioning step usually requires at least five QC samples, although this number might be increased (e.g., 10 injections) when studying complex matrices (e.g., tissues) and when applying less robust analytical approaches (e.g., hydrophilic interaction liquid chromatography, HILIC).
(3) Analyze real samples in random order and intercalate QCs across the sequence (e.g., one QC after every 10 samples). If the sample size is small, the frequency of QC injection may be increased to ensure a minimum of 10% QC samples across the analytical run.
(4) Inject five procedural blank samples at the end of the sequence run to assess carryover.
We do not recommend intercalating blank samples throughout the sequence as this may result in partial deconditioning of the system (e.g., shifts in RT and peak symmetry) due to differences in matrix composition. Several reconditioning QCs would then have to be injected after the blanks before continuing with the analysis of real samples, thereby lengthening total run times.
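The injection sequence described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `build_run_sequence`, the labels `"BLANK"` and `"QC"`, and the choice to close the sample block with a final QC are assumptions made for the example, not part of the published protocol.

```python
import random

def build_run_sequence(sample_ids, n_blanks=5, n_conditioning_qc=5,
                       qc_every=10, seed=0):
    """Return an injection list following steps (1)-(4).

    Hypothetical helper: labels and defaults are illustrative only.
    """
    rng = random.Random(seed)          # fixed seed so the order is reproducible
    samples = list(sample_ids)
    rng.shuffle(samples)               # step (3): real samples in random order

    run = ["BLANK"] * n_blanks         # step (1): opening procedural blanks
    run += ["QC"] * n_conditioning_qc  # step (2): conditioning QC injections
    for position, sample in enumerate(samples, start=1):
        run.append(sample)
        # Intercalate one QC after every `qc_every` samples; also close the
        # sample block with a QC (an assumption of this sketch).
        if position % qc_every == 0 or position == len(samples):
            run.append("QC")
    run += ["BLANK"] * n_blanks        # step (4): closing blanks for carryover
    return run
```

For small studies, lowering `qc_every` increases the proportion of QC injections in the run.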
After MS-based analysis, a set of metabolites that can regularly be detected in QC samples (hereinafter referred to as "chemical descriptors") must be selected to assess method reproducibility and data quality. These metabolites should preferably belong to different chemical classes representing the analytical coverage of the MS method, have diverse molecular weights (MW) and peak intensities, and be well-distributed along the chromatographic run. In targeted experiments, spiked ISs can be used as additional quality markers. Herein, for the particular case of reversed-phase UHPLC-MS analysis of plasma samples, we propose the set of chemical descriptors listed in Table 1. Note that m/z and RT values correspond to those obtained by applying the metabolomics method described elsewhere; 22 users should adapt the set of chemical descriptors according to the analytical performance of their own methods.
Implementation of QComics. The QComics protocol operates through a multistep workflow to sequentially address various challenges that may strongly influence data quality, as detailed in Figure 1. In the next sections of this article, we discuss recommendations and guidelines for properly dealing with background noise and potential contaminants, batch drifts and "out-of-control" measurements, missing values and truly absent data, outlying samples, improper handling/storage of biological samples, and overall quality assessment.

■ RESULTS AND DISCUSSION
Initial Data Exploration. The first step in QComics involves a preliminary exploratory data analysis to check for potential contaminants, carryover, trends according to the run order, and "out-of-control" measurements.
Inspection of Procedural Blank Samples. Metabolomics data, especially when applying untargeted approaches designed to detect as many molecular features as possible, are prone to contain artifact signals originating from sources other than the biological matrix under investigation, such as additives and preservatives incorporated during sample collection and processing, ghost peaks derived from sample preparation (e.g., derivatization), impurities present in solvents and reagents, and contaminants coming from labware and the MS system (e.g., plasticizers or column bleeding). Furthermore, these contaminants and other matrix components may accumulate in the instrument (e.g., in the autosampler or in the column) as a result of inadequate washing between sample injections, leading to the appearance of carryover signals from the foregoing samples in subsequent injections.
To minimize the impact of this background noise, procedural blank samples must be injected at the beginning and at the end of the sequence run to identify potential artifacts in the data set. 23 As the exclusion criterion, QComics flags molecular features as a "potential contaminant" when their mean values in real samples do not exceed three times the mean values detected in blanks, as recently reported by the mQACC. 11 However, we recommend not removing these peaks prior to data analysis as some of them can be biologically relevant (e.g., free fatty acids that are frequently employed as slip agents in plastic consumables). Instead, if the "contaminant" is selected as of potential interest after data analysis, the researcher should determine to what extent the blank contribution might influence the quality of results. To this end, we propose discarding only those features with an RSD value higher than 15% in blank injections, as low blank-related variability is not expected to differentially affect group comparisons or statistical testing.
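As a minimal sketch of the two blank-related criteria above (the threefold sample-to-blank ratio and the 15% RSD limit in blanks), the following Python snippet evaluates a single molecular feature. The function names and the returned dictionary are hypothetical, and the RSD is computed here with the sample standard deviation, which is an assumption.

```python
from statistics import mean, stdev

def rsd_percent(values):
    """Relative standard deviation (%) based on the sample standard deviation."""
    m = mean(values)
    return 100.0 * stdev(values) / m if m else float("inf")

def assess_blank_contribution(sample_values, blank_values,
                              ratio=3.0, max_blank_rsd=15.0):
    """Flag a feature as a potential contaminant and decide on discarding it.

    Hypothetical helper illustrating the criteria in the text:
    - flagged when mean(samples) does not exceed `ratio` x mean(blanks);
    - discarded only if flagged AND the blank RSD exceeds `max_blank_rsd`.
    """
    flagged = mean(sample_values) <= ratio * mean(blank_values)
    discard = flagged and rsd_percent(blank_values) > max_blank_rsd
    return {"potential_contaminant": flagged, "discard": discard}
```

In practice, the discard decision would only be evaluated for flagged features that turn out to be of interest after data analysis, as recommended above.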
Detection of Batch Drifts and "Out-of-Control" Measurements. The most frequent origin of systematic errors in metabolomics is deficient instrumental stability (e.g., loss in MS sensitivity or deterioration of chromatographic columns), which is ultimately mirrored in signal drifts according to the run order. To verify the presence of gradual changes along the analytical run, and thus evaluate the need for data normalization approaches, 19 the peak intensity of each chemical descriptor should be plotted against the run order to check for time-related trends in the data (Figure 2A,B). Then, the examination of PCA scores plots makes it possible to explore whether samples show a continuous drift (Figure 2C) or a homogeneous distribution (Figure 2D) in the PCA space. If prepared by pooling, QC samples should ideally cluster in the center of the plot.
Besides the abovementioned gradual drifts in signal, the analytical system can also experience sudden deterioration (e.g., column clogging), thereby resulting in "out-of-control" measurements that are hard to correct through normalization approaches. In that case, the implementation of Shewhart control charts facilitates the detection of abnormal QC samples to determine whether a batch is acceptable or not (Figure 2E). To detect "out-of-control" measurements, create time series plots with upper and lower critical limits (±SD, ±2SD, ±3SD) for each chemical descriptor and apply the control rules proposed by Westgard et al., 24 which flag a measurement when (i) one QC sample exceeds the ±3SD limit; (ii) two consecutive QC samples exceed the ±2SD limit in the same direction of the control chart (i.e., above or below the mean); (iii) four consecutive QC samples exceed the ±SD limit in the same direction of the control chart (i.e., above or below the mean); or (iv) 10 consecutive QC samples fall in the same direction of the control chart (i.e., above or below the mean). If any of these rules is triggered, neighboring study samples should be scrutinized to evaluate whether they need to be discarded and reanalyzed.
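The four control rules listed above can be checked programmatically for each chemical descriptor. The sketch below is one possible implementation under stated assumptions: the function names and rule labels (`"1_3s"`, `"2_2s"`, `"4_1s"`, `"10_x"`) are illustrative, and when no reference mean and SD are supplied they are estimated from the QC series itself.

```python
from statistics import mean, stdev

def _same_side_runs(z_scores, run_length, limit):
    """Indices that close a run of `run_length` consecutive z-scores lying
    all above +limit or all below -limit."""
    hits = []
    for end in range(run_length - 1, len(z_scores)):
        window = z_scores[end - run_length + 1 : end + 1]
        if all(z > limit for z in window) or all(z < -limit for z in window):
            hits.append(end)
    return hits

def westgard_flags(qc_values, center=None, sd=None):
    """Evaluate rules (i)-(iv) on QC values ordered by injection.

    `center` and `sd` default to the mean and sample SD of `qc_values`
    (an assumption of this sketch; predefined limits can be passed instead).
    """
    center = mean(qc_values) if center is None else center
    sd = stdev(qc_values) if sd is None else sd
    z = [(value - center) / sd for value in qc_values]
    return {
        "1_3s": _same_side_runs(z, 1, 3.0),   # (i) one QC beyond 3SD
        "2_2s": _same_side_runs(z, 2, 2.0),   # (ii) two consecutive beyond 2SD
        "4_1s": _same_side_runs(z, 4, 1.0),   # (iii) four consecutive beyond 1SD
        "10_x": _same_side_runs(z, 10, 0.0),  # (iv) ten on one side of the mean
    }
```

Each returned list holds the positions (in QC injection order) at which the corresponding rule is triggered, which points to the neighboring study samples that deserve closer inspection.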
Handling Missing Values and Truly Absent Data. Metabolomics data sets are typically characterized by a high frequency of missing values (ca. 20−30% of overall data), which poses additional challenges during QC assessment. The origin of missing values in MS-based metabolomics can be attributed to a myriad of instrumental reasons, including sensitivity limitations (i.e., metabolite levels below the analytical limit of detection), technical issues (e.g., matrix effects or coelution), and random errors (e.g., temporary reduction in ionization performance). This results in diverse types of missing values, including missing completely at random (MCAR, i.e., missing data is independent of observed and unobserved data), missing at random (MAR, i.e., missing data is dependent on observed data), and missing not at random (MNAR, i.e., missing data is dependent on unobserved data). Moreover, missing data can also arise from true absence due to biological reasons (e.g., xenobiotics that are exclusively detected in exposed individuals). Nevertheless, common strategies for dealing with missing values in metabolomics do not account for this heterogeneity and simply rely on a two-step process: filtering variables containing a high proportion of missing values and subsequently imputing the remaining data. 25,26 This may provoke the loss of relevant biological information during the filtering step (e.g., exogenous metabolites with low detection rate) and lead to inaccurate and biased results due to suboptimal imputation.
The QComics tool comprises a novel protocol for differentially addressing missing and truly absent values, of particular interest in exposomics and nutritional metabolomics. First, we compute the rate of missing values per study sample to confirm a consistent distribution along the entire analytical run and to discard data with abnormally low detection rates. This and further steps should be performed separately in each study group to account for group-specific metabolite occurrences (e.g., xenobiotics coming from an intervention trial). Data must then be scrutinized with the aim of distinguishing missing values from potentially truly absent variables to differentially treat them in subsequent steps. For this purpose, we assume that a low proportion of missing values in any variable is likely to indicate false absence due to analytical issues, while a high frequency of missing values could be attributed to true absence, as previously reported by Armitage et al. 27 Using synthetic data sets, they found that the proportion of missing values strongly affects subsequent statistical testing, so that the true-positive rate was drastically reduced as the number of missing values increased, but it was rapidly restored with more than 70% missing values. In that scenario, a compromise between true presence and true absence might be parametrized based on the information that is not lost (i.e., false negatives) nor gained (i.e., false positives) during the imputation process. Based on this rationale, it is proposed that molecular features with more than 70% missing values could actually be regarded as metabolites likely to contain real zero values. This categorization is even simpler in targeted metabolomics, where the identities of analytes are known a priori. In that particular case, we assume that missing values in endogenous metabolites may have a plausible technical origin, as they are expected to be regularly detected in all samples analyzed. In contrast, missing values in exogenous metabolites (e.g., dietary compounds, drugs, or pollutants) could be regarded as truly absent data.
Once this categorization is accomplished, QComics handles missing values in various steps. For metabolites likely to be truly absent (i.e., molecular features with >70% missing values in untargeted metabolomics, exogenous metabolites in targeted metabolomics), missing data are replaced with real zero values. In such cases, special care must be taken during subsequent statistical analyses to properly deal with zero-inflated data. For the rest of the data, variables containing more than 20% missing values in all the study groups should be removed to discard spurious signals. Finally, the remaining missing values are imputed by using the method of choice (e.g., kNN or Random Forest). Although rather simplistic, as the unequivocal differentiation between missing values and absent data is difficult in practice, this categorization-based approach could represent a complementary alternative to traditional imputation strategies, 25,26 especially with the aim of gaining deeper insights into the role of the exposome in health status. This is of particular interest when studying pollutants and toxicants, which are usually present at extremely low concentrations in biological specimens and are hardly detectable using metabolomics approaches. In that case, traditional imputation methods based on filtering variables that contain high proportions of missing values (e.g., 80% rule) are expected to remove most exposome-related features from metabolomics data sets, while keeping these missing data as real zero values would prevent losing relevant information.
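The categorization logic for untargeted data can be sketched as follows. This is one possible reading of the rules above rather than a reference implementation: the 70% cutoff is applied per study group, the 20% filter is applied to the groups that are not classified as truly absent, missing entries are encoded as `None`, and all function names are hypothetical. The final imputation step (e.g., kNN or Random Forest) is left to the method of choice and is not shown.

```python
def missing_rate(values):
    """Fraction of missing entries (encoded as None) in a list of intensities."""
    return sum(v is None for v in values) / len(values)

def plan_feature(groups, absent_cutoff=0.70, filter_cutoff=0.20):
    """Decide how to treat one molecular feature.

    `groups` maps a study-group name to the feature's intensities in that group.
    Assumed reading of the protocol:
    - a group with > `absent_cutoff` missing values is treated as truly absent
      ("zero_fill"); otherwise its gaps are candidates for imputation;
    - the feature is removed when every non-absent group has more than
      `filter_cutoff` missing values.
    """
    rates = {name: missing_rate(values) for name, values in groups.items()}
    per_group = {name: "zero_fill" if rate > absent_cutoff else "impute"
                 for name, rate in rates.items()}
    impute_rates = [rates[name] for name, action in per_group.items()
                    if action == "impute"]
    remove = bool(impute_rates) and all(r > filter_cutoff for r in impute_rates)
    return {"remove": remove, "per_group": per_group, "rates": rates}

def zero_fill(values):
    """Replace missing entries with real zeros (truly absent data)."""
    return [0.0 if v is None else v for v in values]
```

Features kept with zero-filled groups yield zero-inflated distributions, so downstream statistics should be chosen accordingly, as noted above.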
Detection and Removal of Outliers. The detection and removal of outliers (i.e., data points that significantly deviate from the remaining observations) are crucial steps in MS-based metabolomics, as outliers can originate from multiple sources of analytical and biological variability and consequently lead to inaccurate results. A great number of methods have been proposed for these purposes based on the Mahalanobis distance, 28 Grubbs's test, 29 and nonlinear regression. 30 Among them, PCA and the Hotelling T² test have emerged as the gold standard. 31 As implemented in QComics, the analysis of PCA scores using the Hotelling T² statistic makes it easy to identify outliers as those study samples located far outside the 95% Hotelling T² ellipse in the PCA scores plot (Figure 3A) and those with extreme values in the T² range plot (Figure 3B). Complementarily, observations showing high residual variance unexplained by the PCA model can be identified using the DmodX plot (Figure 3C). However, we recommend excluding potential outliers only when noticeable analytical or chemical anomalies in raw data could explain these behaviors, e.g., a significantly different number of detected peaks, which could be indicative of human errors during sample preparation, technical issues during MS-based analysis (e.g., inefficient ionization), or sample contamination.
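For readers who prefer scripting over dedicated chemometrics software, the Hotelling T² value of each sample can be computed from PCA scores as sketched below. The snippet assumes NumPy is available, autoscales the variables (an assumption; the scaling used in the original workflow is not specified here), and uses a hypothetical function name. The 95% critical limit, which is conventionally derived from the F distribution, is not computed in this sketch.

```python
import numpy as np

def hotelling_t2(data, n_components=2):
    """Hotelling T2 per sample from the first `n_components` PCA scores.

    `data` is a samples x features array without missing values or
    constant columns (both are assumptions of this sketch).
    """
    X = np.asarray(data, dtype=float)
    X = X - X.mean(axis=0)                 # mean-center each feature
    X = X / X.std(axis=0, ddof=1)          # unit-variance (auto)scaling
    # PCA through singular value decomposition of the scaled matrix
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    score_variance = scores.var(axis=0, ddof=1)
    # T2 = sum over components of squared score / variance of that component
    return (scores ** 2 / score_variance).sum(axis=1)
```

Samples with the largest T² values are the candidates to inspect in the raw data before any exclusion, in line with the recommendation above.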
Quality Markers for Addressing Preanalytical Factors. The preanalytical phase is well-recognized to be a major source of variability and errors, as collection and preprocessing of biological samples is frequently performed in clinical settings by staff with limited research experience. This is especially critical in multicenter and biobank-based studies, where samples are collected at different laboratories and over long time periods. Accordingly, the implementation of SOPs for proper sampling and preprocessing is crucial to avoid contamination, degradation, and metabolic alteration of biological samples, 32,33 thereby ensuring that subsequent metabolomics analysis provides an accurate reflection of the actual in vivo metabolic profile. This requires strict control of every step along the entire preanalytical phase, including sample collection, preprocessing (e.g., centrifugation), aliquoting, transport, storage, and thawing cycles. However, common strategies for QC assessment typically focus on addressing analytical quality, 13−16 without considering the impact of preanalytical factors on the metabolic integrity of samples under investigation.
Inappropriate quenching of biological samples may result in ex vivo metabolic reactions that are mediated by residual enzymatic activities (e.g., release of protease-derived peptides, hydrolysis of lipids). 34,35 Furthermore, exposure to air and light has been associated with chemical transformations in readily oxidizable and labile metabolites. 36,37−39 In this respect, hemolysis may also impact the serum/plasma metabolome as a result of the release of intracellular metabolites and the exacerbation of metabolic reactions triggered by erythroid enzymes. 40 On the other hand, it has repeatedly been reported that urine samples are prone to suffer from profound metabolic alterations caused by bacterial overgrowth and chemical degradation. 41,42 On this basis, we propose here a panel of metabolites known to be strongly influenced by the abovementioned preanalytical errors (Table 2), which can be monitored as markers of sample quality as a part of the QComics protocol. The visualization of data in the form of box plots facilitates the detection of observations showing abnormal levels for these quality markers (i.e., peak intensities beyond ±3 × IQR), which could be indicative of improper handling/storage of the biological sample. As these metabolites can be influenced by a myriad of physiological and pathological stimuli, we recommend monitoring several of them before considering the exclusion of samples. Although this panel of markers has been designed for blood and urine, which are the most commonly employed biological matrices in metabolomics, this QC strategy can easily be adapted by the user to other tissue-specific metabolites.
Data Quality Assessment. After data processing and cleaning as explained above (i.e., flagging background noise and carryover signals, monitoring drifts and "out-of-control" measurements, handling missing values and truly absent data, outlier detection, and removal of samples affected by preanalytical factors), the last step in QComics involves evaluating overall data quality in terms of precision and accuracy. To this end, exploratory PCA can first be applied to the whole data set to confirm a tight clustering of QC samples in the scores plot as an indicator of system stability along the analytical run (Figure 4A). Then, method precision and accuracy must be estimated by applying the following acceptance criteria to chemical descriptors detected in QC samples: (i) RSD < 30% for peak intensity, (ii) RSD < 2% for RT, (iii) RSD < 15% for peak width, (iv) m/z error < 10 ppm (the latter only for experiments conducted in high-resolution MS). 3,11 Note that these acceptance criteria can be tailored according to the technical MS specifications. Additionally, time series plots with predefined tolerance windows can be used to visualize the reproducibility and stability of every quality metric across the experiment (Figure 4B−E), thereby facilitating the identification of abnormal data points for potential exclusion or reanalysis. To conclude, the precision estimators can also be employed to discard peaks with a technical variation exceeding biological interindividual differences, as these molecular features are likely to introduce great variability in data and, consequently, hinder subsequent statistical modeling.
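The acceptance criteria for chemical descriptors in QC samples reduce to three RSD calculations and a mass-error calculation, as in the following sketch. The function names are hypothetical, the RSD uses the sample standard deviation, and the m/z criterion is applied to the largest absolute error observed across QC injections; these choices are assumptions rather than part of the published protocol.

```python
from statistics import mean, stdev

def rsd_percent(values):
    """Relative standard deviation (%) based on the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)

def ppm_error(measured_mz, theoretical_mz):
    """Mass error in parts per million relative to the theoretical m/z."""
    return 1e6 * (measured_mz - theoretical_mz) / theoretical_mz

def check_descriptor(intensity, rt, width, measured_mz, theoretical_mz,
                     max_rsd_intensity=30.0, max_rsd_rt=2.0,
                     max_rsd_width=15.0, max_ppm=10.0):
    """Apply criteria (i)-(iv) to one chemical descriptor across QC injections."""
    worst_ppm = max(abs(ppm_error(mz, theoretical_mz)) for mz in measured_mz)
    checks = {
        "intensity_rsd": rsd_percent(intensity) < max_rsd_intensity,  # (i)
        "rt_rsd": rsd_percent(rt) < max_rsd_rt,                       # (ii)
        "width_rsd": rsd_percent(width) < max_rsd_width,              # (iii)
        "mz_error": worst_ppm < max_ppm,                              # (iv)
    }
    checks["pass"] = all(checks.values())
    return checks
```

The default limits mirror the values given above and can be overridden to match the technical specifications of a given MS platform.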

■ CONCLUSIONS
The inherent complexity and variability of metabolomics data demand the application of robust QC strategies. However, despite numerous efforts made to increase awareness and promote best working practices among the metabolomics community, there is a considerable lack of standardization in QC workflows. To address this gap, we have developed a sequential multistep QC protocol, termed QComics, that aims to manage the most important challenges influencing data quality, including the correction of background noise and carryover, detection of gradual signal drifts and "out-of-control" measurements, dealing with missing values and truly absent data, detection and removal of outliers, monitoring of sample quality markers to address preanalytical errors, and overall data quality assessment. This tool generates easily interpretable outputs in the form of figures (e.g., Figures 2−4) and tables (e.g., tabulated RSD estimations), which could be annexed to scientific publications for consistent reporting of the data quality. Although it was initially designed for MS-based metabolomics, it should be noted that QComics can also be used to manage other omics and MS data. In this sense, we would like to stress that some features of this QC protocol have successfully been applied in recent metabolomics and metallomics studies, 43−50 which highlights its reliability in dealing with complex and heterogeneous data. Therefore, we strongly believe that QComics will facilitate the implementation of good QC practices, at both application and reporting levels, and thus become a gold standard in metabolomics research.

Figure 2. Detection of batch drifts and "out-of-control" measurements. (A) Time series plot showing signal drift in quality control samples; (B) time series plot showing stable signal in quality control samples; (C) principal component analysis scores plot showing signal drift in quality control samples; (D) principal component analysis scores plot showing stable signal in quality control samples; (E) Shewhart control chart for detecting "out-of-control" measurements.

Figure 4. Data quality assessment. (A) Principal component analysis scores plot showing tight clustering of quality control samples; (B) time series plot with predefined tolerance windows for peak intensity; (C) time series plot with predefined tolerance windows for retention time; (D) time series plot with predefined tolerance windows for peak width; (E) time series plot with predefined tolerance windows for m/z error.

Table 1. Set of Chemical Descriptors to Assess the Method Reproducibility and Data Quality

Table 2. Panel of Quality Markers to Assess the Influence of Preanalytical Errors. Normal concentration ranges were obtained from the Human Metabolome Database. Arrows indicate the effect that preanalytical factors have been reported to elicit on metabolite expression (↑ increased levels, ↓ decreased levels).