Development and Application of Ultra-Performance Liquid Chromatography-TOF MS for Precision Large Scale Urinary Metabolic Phenotyping

ABSTRACT: To better understand the molecular mechanisms underpinning physiological variation in human populations, metabolic phenotyping approaches are increasingly being applied to studies involving hundreds and thousands of biofluid samples. Hyphenated ultra-performance liquid chromatography−mass spectrometry (UPLC-MS) has become a fundamental tool for this purpose. However, the seemingly inevitable need to analyze large studies in multiple analytical batches for UPLC-MS analysis poses a challenge to data quality which has been recognized in the field. Herein, we describe in detail a fit-for-purpose UPLC-MS platform, method set, and sample analysis workflow, capable of sustained analysis on an industrial scale and allowing batch-free operation for large studies. Using complementary reversed-phase chromatography (RPC) and hydrophilic interaction liquid chromatography (HILIC) together with high resolution orthogonal acceleration time-of-flight mass spectrometry (oaTOF-MS), exceptional measurement precision is exemplified with independent epidemiological sample sets of approximately 650 and 1000 participant samples. Evaluation of molecular reference targets in repeated injections of pooled quality control (QC) samples distributed throughout each experiment demonstrates a mean retention time relative standard deviation (RSD) of <0.3% across all assays in both studies and a mean peak area RSD of <15% in the raw data. To more globally assess the quality of the profiling data, untargeted feature extraction was performed followed by data filtration according to feature intensity response to QC sample dilution. Analysis of the remaining features within the repeated QC sample measurements demonstrated median peak area RSD values of <20% for the RPC assays and <25% for the HILIC assays. These values represent the quality of the raw data, as no normalization or feature-specific intensity correction was applied. 
While the data in each experiment were acquired in a single continuous batch, instances of minor time-dependent intensity drift were observed, highlighting the utility of data correction techniques even when the dependency on them for generating high quality data is reduced. These results demonstrate that the platform and methodology presented herein are fit for use in large scale metabolic phenotyping studies, challenging the assertion that such screening is inherently limited by batch effects. Details of the pipeline used to generate high quality raw data and mitigate the need for batch correction are provided.
The measurement of low molecular weight metabolites in biofluids is a fundamental tool for understanding human physiological phenotypic variation 1 and has wide application in molecular epidemiology and personalized healthcare. 2 The field has been propelled by advances in the analytical technology and data processing methods required to capture and interpret data derived from the metabolic pathways of complex biochemical systems. 3,4 Metabolic profiling, unlike conventional clinical chemistry analyses, is not intended to be selective and therefore generates simultaneous measurement of both expected and potentially uncharacterized metabolites, making the approach particularly fruitful in biomarker discovery. 5 Multiplatform profiling approaches are essential in the pursuit of achieving comprehensive analytical coverage of the human metabolic phenome, including supra-organismal metabolites from human and associated gut microbial action on nutrients, xenobiotics, and environmental contaminants. 6 Metabolic profiles reflect human individuality, 7,8 and phenotypic variation is linked to complex interactions of a person's genetically coded metabolic machinery with their environment. 9−11 Despite this metabolic individuality, the resulting phenotypes show similarities among groups of people with common genetic or environmental (e.g., dietary, gut microbiome) backgrounds when observed en masse. 2,12 In recent years, this study of phenotypic variation has taken the form of molecular epidemiology, whereby metabolic biomarkers of disease risk are sought via analysis of biofluids, typically urine and blood products, from large cohorts of samples such as those derived from biobanks. Application of statistical approaches, in particular metabolome-wide association studies (MWAS), 2,13,14 has been useful in elucidating subtle metabolic signatures of disease and disease risk because of the statistical power afforded by population-level sample collection and analysis. 15,16 When paired with broad metabolite profiling, these large-scale analyses are able to generate unprecedented power in phenotypic comparisons of populations. 13 The demand for biofluid profiling in biobanking and other large scale applications is therefore increasing rapidly.
Consequently, there is a fundamental requirement for high precision analytical data generation for the most widely sampled biofluids. Nuclear magnetic resonance (NMR) spectroscopy has long been a favored analytical platform for the generation of metabolic profiles with quantitative values and high reproducibility facilitating comparisons among individuals or groups of individuals within populations. 17−20 Nevertheless, the technique is limited when used for rapid profiling both in terms of its ability to discern individual molecules in complex mixtures and its sensitivity. Liquid chromatography coupled to mass spectrometry (LC-MS) offers a complementary approach for biofluid analysis, boasting multidimensional high resolution separations and sensitive detection across a broad range of chemical species. 21,22 However, exploratory LC-MS platforms can suffer from batch variations, run order effects, and lack of reproducibility, due in part to the complexity of the hyphenated system involving the distinct processes of high pressure liquid separation followed by analyte ionization and finally mass spectrometric separation and detection. High coefficients of variation from LC-MS measurements have been observed when attempting analysis of samples from large patient cohorts 23 indicating the difficulty in achieving stable metabolic signatures in large-scale analysis. Despite this, the allure of epidemiological-scale metabolic data sets continues to drive the development of LC-MS approaches for large-scale biofluid characterization 24,25 as well as the development of informatic approaches 22,26−30 to combat seemingly inevitable 24,31,32 analytical imprecision (e.g., sample batch effects).
Both blood products (i.e., plasma and serum) and urine are commonly collected and available in biobanks for molecular epidemiology studies. While the analysis of blood products has been the subject of recent advances fit for the purpose of large-scale application, 22 the ultra-performance liquid chromatography−mass spectrometry (UPLC-MS) approaches used for analysis of human urine have not been sufficiently demonstrated for this purpose. Here, we report the adaptation of UPLC and orthogonal acceleration time-of-flight mass spectrometry (oaTOF-MS) systems for urine analysis, delivering a platform capable of continuous operation which minimizes the need for collection of data in distinct batches, maximizes efficiency, and produces high quality data with high analytical precision. Complementary reversed-phase chromatography (RPC) and hydrophilic interaction liquid chromatography (HILIC) separations coupled to oaTOF-MS are individually the most common LC-MS techniques used for metabolic phenotyping of urine 31 and were developed here for high precision chromatographic separation within a window of time defined by the practical constraints imposed by a high throughput working laboratory. The achievable degree of reproducibility is exemplified using two independent sets of human urine samples from distinct large-scale epidemiological studies: the Alzheimer's Disease Multimodal Biomarkers study (ALZ) 33,34 and the Airwave Health Monitoring study (AW) of police officers and staff in Great Britain. 35

■ EXPERIMENTAL SECTION
Since quality control (QC) is essential to the development and demonstration of this protocol, multiple reference materials were developed and used to monitor and evaluate data quality. Details of the chemical composition and sources of the method reference (MR) and internal standards (IS) chemical mixtures, study reference (SR) and long-term reference (LTR) pooled urine samples, sample preparation, and data acquisition procedures and parameters are contained within the Supporting Information. Key aspects of chromatographic and UPLC-MS system adaptation and control are described here.
Sample Preparation and Method Timing Considerations. The 96-well plate format is a standard utilized in many high throughput/high volume processes including cell screening, PCR amplification, and immunoassays (e.g., ELISA) and was therefore adopted as the fundamental block unit for biofluid sample preparation. To establish a regular period for preparation and analysis of sample plates, an analytical cycle of exactly 15 min was selected which allowed the analysis of a single block of 96 samples in 24 h. Further blocks may then be appended at convenient and regular intervals to facilitate continuous analysis.
UPLC-MS System Configuration. All UPLC-MS systems used herein for development and sample analysis utilized the following three components. The sample handling component was a Waters 2777C sample manager (Waters Corp., Milford, MA, USA) equipped with a 25 μL Hamilton syringe, a 2 μL loop used for full-loop injections of prepared sample, and a 3-drawer sample chamber thermostatted at 4°C with a constant flow of dry nitrogen gas to prevent the buildup of condensation. The LC component was an ACQUITY UPLC (Waters Corp., Milford, MA, USA) composed of a binary solvent manager and column heater/cooler module. Finally, the MS component was a Xevo G2-S oaTOF MS (Waters Corp., Manchester, UK) coupled to the UPLC via a Zspray electrospray ionization (ESI) source.

Analytical Chemistry
Reversed-Phase Chromatography (RPC) Method Development. The stationary phase and mobile phase conditions previously reported by Wong et al. 36 were adopted for use in biofluid profiling due to their established suitability for the retention of small polar species in highly aqueous environments and separation of less polar species over the course of a gradient elution. 21 Water and acetonitrile, each supplemented with 0.1% formic acid (mobile phases A and B, respectively), were chosen for the mobile phase because of their ease of volumetric preparation or direct commercial availability in large batches, mitigating concern for solvent preparation as a cause for batch effects. A 2.1 × 150 mm HSS T3 column thermostatted at 45°C was used with a mobile phase flow rate of 0.6 mL/min, generating a maximum pressure of approximately 12 000 psi (80% of the maximum achievable system pressure of 15 000 psi) in a water/acetonitrile gradient. The chosen flow rate represents a balance between chromatographic performance (maximized in UPLC at high mobile phase linear velocities 37) and observed MS intensity which is concentration sensitive and therefore inversely related to effluent flow rate. 38,39 The flow rate was well tolerated by the ESI source, even when the effluent was highly aqueous.
After a 0.1 min isocratic separation at initial conditions (99% A), a linear gradient elution (99% A to 45% A in 9.9 min) was applied, generating the data-rich portion of the separation, followed by a more rapid gradient (45% A to 0% A in 0.7 min) to final conditions. In the latter stage, the mobile phase flow rate was simultaneously increased to 1.0 mL/min, allowing faster column washing. Due to the relatively low viscosity and high volatility of the organic component of the mobile phase, no problems with LC system pressure or desolvation in the MS interface were observed during increased flow conditions. Changes in flow rate were applied gradually in order to utilize available system pressure without introducing large fluctuations. The duration of column equilibration was adjusted to provide sufficient retention and chromatographic precision of early eluting species in subsequent analyses at the minimal expense of time. The final gradient conditions for the RPC separation are summarized in Table SI-1, and an accompanying representative LC system pressure trace is provided in Figure SI.
Hydrophilic Interaction Liquid Chromatography (HILIC) Method Development. The chromatographic retention and separation of small polar molecules was conducted using a 2.1 × 150 mm Acquity BEH HILIC column (Waters Corp., Milford, MA, USA) thermostatted at 40°C. The solvent system chosen was acetonitrile with 0.1% formic acid (A) and 20 mM ammonium formate in water with 0.1% formic acid (B). The flow rate of 0.6 mL/min established for the RPC separation was found to be equally well suited for application in the HILIC separation and was therefore used during sample loading and gradient elution. After a 0.1 min isocratic separation at initial conditions (95% A), a two-stage gradient was conducted to achieve approximately uniform peak shape in the elution of urinary analytes. First, a shallow linear gradient between 95% and 80% A was used followed by a more rapid gradient from 80% to 50% A in order to improve the peak shape of late eluting analyte species which otherwise appeared as broad and sometimes tailing peaks. Following a return to initial composition, the flow rate was increased to 1.0 mL/min to expedite equilibration of the chromatographic system, providing sufficient retention and chromatographic precision of early eluting species in subsequent analyses. The final gradient conditions for the HILIC separation are summarized in Table SI-2, and an accompanying representative LC system pressure trace is provided in Figure SI. As HILIC methods are often reported to benefit from longer equilibration times relative to RPC, 31,40−42 the method herein was specifically designed to allow for extended equilibration without precluding the option for increased throughput. This was accomplished by ensuring the complete elution of analytes by 7.5 min (half of the total run time), making the method compatible with specialized chromatographic systems capable of column switching, mating two independent separations to a single mass spectrometer in staggered parallel operation. While capable of doubling the productivity of a single mass spectrometer, such a system was not implemented in the studies reported here.
Optimization of UPLC-MS System Configuration. The ESI source used to couple the UPLC to the MS allowed for adjustable angular positioning of the sample probe in relation to the inlet cone. The potential for accumulation of sample-derived residue on the inlet cone and guard was minimized by using the most orthogonal setting that still allowed for a near-maximum signal detected for reference standards within the RPC MR mixture. A setting of 7 mm on the adjustment micrometer was used for all assays. To further protect the cone from residue accumulation during operation, the cone gas flow was set to 150 L/h, representing the highest flow achievable while maintaining near maximal signal intensity for RPC MR standards. Optimized ion source and ion guide settings were established to maximize the observed signal and minimize fragmentation of small molecules using the standards within the RPC MR. These probe position, source, and ion guide settings were standardized for use within the laboratory for all assay types applicable to urine analysis. Furthermore, as standard practice, each instrument was tuned to achieve high resolution at maximum sensitivity immediately prior to conducting each assay (see the Supporting Information for details of tuning procedure and resolution values achieved). RPC analysis was performed in both positive and negative ion modes (RPC+ and RPC−, respectively) while the HILIC assay was performed in the positive ion mode only (HILIC+). Instrument-specific details are provided in the Supporting Information.
MS Detector Gain Control. A prototype software algorithm was developed to maintain consistent electron multiplier gain during sustained use. During instrument setup, the applied detector voltage was adjusted to give an optimum signal-to-noise ratio for digitized signals arising from individual ion arrivals, ensuring efficient recording of the majority of ion arrivals. The relative detector gain was measured at this optimum voltage, establishing the target value to be maintained throughout the series of sample analyses. This value was determined by calculating the ratio of summed intensities between mass spectral data digitized using an analogue-to-digital converter and mass spectral data acquired in an ion counting mode of operation. This ratio provides a measure of the mean area of the signal arising from ion arrivals over a given mass-to-charge ratio (m/z) range after digitization and amplification. To compensate for a loss of gain observed under sustained operation, the algorithm automatically adjusts the applied detector voltage in response to relative gain measurements taken immediately prior to each new sample injection. By interpolating between the current relative detector gain value and the value measured after an arbitrary increase in detector voltage (+25 V), a new detector voltage value may be calculated prior to each injection such that the relative gain is maintained at the reference value throughout the analysis. The overall gain of an electron multiplier, under constant operating conditions, is dependent on ion velocity and hence on the m/z of the incident ions. Therefore, measuring relative gain over a wide mass range gives a value dependent on the relative population of ions at each m/z value. With this in mind, the chemical noise arising from ionization of LC effluent during chromatographic equilibration was used as a stable and reproducible signal when performing gain measurements. 
This method negates the requirement for separate introduction of a reference sample, providing highly reproducible spectra while avoiding the need for effluent diversion during measurement. In order to maintain a regular analytical cycle and to avoid variation in column equilibration times between sample injections (potentially detracting from analytical reproducibility), the algorithm was designed to run for a fixed time of 2 min, regardless of (typically in excess of) the time required to complete calculation and adjustment. This time corresponds to the column equilibration period, and therefore, this functionality did not extend the overall experiment time.
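The voltage update described above can be illustrated with a minimal sketch. It assumes a simple linear interpolation between the two gain measurements; the source states only that interpolation is used, and the function name, argument names, and example values below are hypothetical rather than taken from the prototype algorithm.

```python
def next_detector_voltage(v_now, gain_now, gain_stepped, gain_target, step_v=25.0):
    """Estimate the detector voltage that restores the target relative gain.

    v_now        -- currently applied detector voltage (V)
    gain_now     -- relative gain measured at v_now
    gain_stepped -- relative gain measured at v_now + step_v
    gain_target  -- reference gain established during instrument setup
    step_v       -- size of the trial voltage increase (+25 V in the text)

    Assumption: gain is treated as locally linear in voltage over the step.
    """
    slope = (gain_stepped - gain_now) / step_v  # gain change per volt
    if slope <= 0:
        raise ValueError("gain did not increase with voltage; cannot interpolate")
    return v_now + (gain_target - gain_now) / slope


# Hypothetical numbers: gain has sagged to 0.95 against a target of 1.00.
print(next_detector_voltage(2800.0, 0.95, 1.05, 1.00))
```

In this sketch the correction is recomputed before every injection, so any error from the linear approximation would be small and would not accumulate across a long run.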
Human Population Studies for Method Exemplification. The methods and UPLC-MS system configuration described herein were applied to selected sample sets from two independent epidemiological studies in order to exemplify the quality of data produced across distinct experiments. The first set (ALZ) was derived from a UK epidemiological study of Alzheimer's disease and conversion from mild cognitive impairment. Urine samples were collected from Alzheimer's disease patients (n = 200), participants with mild cognitive impairment (n = 575), and control participants (n = 200). A subset of these samples (n = 655) was provided for molecular phenotyping by both HILIC and RPC methods. All samples remained unaliquoted and frozen (never thawed) at −80°C for an average duration of approximately three years (range = 1 to 10 years) following collection. The second set (AW) was derived from an epidemiological study of microwave radiation exposure from terrestrial trunked radio (TETRA) use by police personnel within the UK. Samples from 1040 (RPC) and 1000 (HILIC) participants were selected for use in this study. Urine samples were collected within various clinics local to participating UK police forces, aliquoted into 2 mL cryovials, shipped to a central laboratory at 4°C within 24 h, and then placed at −80°C and later at −180°C in liquid nitrogen for long-term storage. The maximum duration of storage was eight years, and during this time, samples were never thawed. No preservatives were added to the urine samples of either set. Biological end points of the ALZ and AW studies were considered to be outside the scope of analytical performance assessment and are therefore not presented.
Feature Extraction, Data Filtration, and Quality Assessment. For each study and each analysis type (RPC+, RPC−, and HILIC+), the quality of the data set produced was assessed using data extracted from the repeated regular injections of SR and LTR urine samples, alternating every 5 study sample injections, supplemented with the chromatographic method-appropriate RPC or HILIC MR mixture. This approach enables the complementary evaluation of both specific molecular targets and the aggregate of all detectable small molecule features. Targeted detection and integration of MR and IS analyte peaks were performed across all QC and QC dilution series samples using TargetLynx (MassLynx 4.1) and manually reviewed for accuracy. Untargeted peak detection, alignment, grouping, integration, and deisotoping were performed on each full data set (all study samples, QC samples, and QC dilution series samples) using Progenesis QI 2.1 software (Waters Corp., Manchester, UK). Inclusion of the study samples can potentially affect the alignment and grouping of features in untargeted feature extraction (versus performing the procedure on the QC samples only), and therefore while the study sample data were not used for the evaluation of analytical precision, they were included in the preprocessing. This was done in order to be consistent with the manner in which a profiling data set would be processed for biological interpretation. Parameters for both targeted and untargeted feature detection procedures are provided in the Supporting Information. Detected and integrated chromatographic peaks belonging to the same spectral feature of the same chemical species across all samples are referred to herein as "feature groups".
In an approach adapted from Croixmarie et al., 43 the response of each feature group (extracted by both untargeted and targeted means) to sample dilution was assessed by calculation of Pearson correlation coefficients between the QC sample dilution factor and the extracted signal intensity for the pre- and post-sample analysis dilution series. The motivation for this approach is to identify feature groups that are not correlated to the gradient of concentration generated by the dilution series and therefore should not be considered as reliable features. Thus, feature groups with a correlation coefficient of less than 0.8 were removed from the data set. The resulting data are presented without application of data correction procedures (e.g., normalization or curve fitting) in order to facilitate clear reporting and evaluation of the data quality produced by the optimized analytical platform.
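As an illustration of the dilution series filter, the following sketch computes a Pearson correlation coefficient for each feature group and retains those at or above 0.8. The data layout, function names, and example intensities are assumptions made for illustration; the sketch treats a single dilution series and does not reproduce how the pre- and post-analysis series were combined.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def dilution_filter(features, relative_conc, r_min=0.8):
    """Keep feature groups whose intensity tracks the QC dilution series.

    features      -- dict mapping feature name to a list of intensities,
                     one per dilution sample
    relative_conc -- relative QC concentration of each dilution sample
    r_min         -- features with r below this value are removed (0.8 in the text)
    """
    return {name: values for name, values in features.items()
            if pearson_r(relative_conc, values) >= r_min}


# Hypothetical dilution levels and intensities.
conc = [1.0, 0.8, 0.6, 0.4, 0.2, 0.1]
features = {
    "metabolite_like": [1000 * c for c in conc],  # scales with concentration
    "solvent_background": [50.0] * len(conc),     # constant chemical noise
    "erratic": [5, 9, 4, 8, 6, 7],                # no relation to dilution
}
print(sorted(dilution_filter(features, conc)))
```

Because only QC dilution samples enter the calculation, a filter of this kind does not depend on the study design.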

■ RESULTS AND DISCUSSION
Consideration of the Importance of a Regular Analytical Cycle for Continuous Operation. The regularity of analysis achieved using 96-well plates paired with a 15 min analytical cycle grants efficiency to the measurement platform, producing a dependable schedule of sample preparation and analysis which can be easily managed to allow continuous batch-free operation of large sample sets (ca. 1000 samples). Such a cycle helps to avoid situations where variable numbers of sample plates must be prepared and submitted to ensure analysis does not stop outside of working hours. By avoiding the need for reactive sample management, the variation in sample age (between preparation and analysis) can be more easily controlled, limiting time-dependent chemical changes as a source of variance in the observed profiles. Therefore, cycle times of 15 min (1 plate per day), 10 min (3 plates per 2 days), 7.5 min (2 plates per day), 5 min (3 plates per day), and so on are advocated for ease of maintenance, allowing continuous batch-free analysis. Here, a 15 min analytical cycle was selected as providing both a generous amount of time for chromatographic separation of the complex biofluid sample and a minimal range of sample ages.
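The advocated cycle times follow directly from the 96-sample plate and the 24 h day. A small sketch of the arithmetic is given below; the function and constant names are illustrative only.

```python
from fractions import Fraction

SAMPLES_PER_PLATE = 96
MINUTES_PER_DAY = 24 * 60


def plates_per_day(cycle_minutes):
    """Exact number of 96-well plates analysed per 24 h for a given cycle time."""
    return Fraction(MINUTES_PER_DAY) / (Fraction(cycle_minutes) * SAMPLES_PER_PLATE)


for cycle in (15, 10, 7.5, 5):
    rate = plates_per_day(cycle)
    # A whole-number numerator and a small denominator indicate a schedule that
    # repeats at the same time of day, e.g. 3/2 means 3 plates every 2 days.
    print(f"{cycle} min cycle: {rate.numerator} plate(s) per {rate.denominator} day(s)")
```

A cycle time that does not divide the day evenly in this way (for example, 12 min) would give a plate boundary that drifts relative to working hours.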
Chromatographic Performance of RPC and HILIC Methods. Working within the constraint of a 15 min analytical cycle, RPC and HILIC methods were developed for separation performance and precision. A special emphasis was placed on resolution of early eluting species, using 150 mm columns in both chromatographic modes to improve chromatographic efficiency (relative to 100 or 50 mm lengths more traditionally used for UPLC applications) in the earliest region of the urine chromatogram where the separation is occurring under virtually isocratic conditions and feature density is high with small polar analytes (particularly in RPC analyses). Base peak intensity chromatograms of LTR sample analyses by each UPLC-MS method (RPC+, RPC−, HILIC+) are shown in Figure 1, illustrating the separation and distribution of analytes within the urine matrix. The retention time precision of each chromatographic method was assessed using the MR standards and IS within the QC samples of each study as exemplary molecular targets. SR and LTR samples were assessed collectively, as retention time is expected to be independent of matrix composition. Feature-specific measurements across all 130 QC samples (ALZ study, all assays) and 208/200 QC samples (AW study, RPC/HILIC assays, respectively) are summarized in Table SI-3. When considering these results, it is important to recall that QC samples were spread evenly throughout 780 total sample injections for the ALZ study and 1248/1200 injections for the AW study (RPC± and HILIC+ analyses, respectively). The retention time relative standard deviation (RSD) was less than 1% for all individual reference standards, and the mean RSD of all method reference standards for each assay did not exceed 0.30%. These results were achieved despite the long duration of the studies and the consequential need to regularly supplement mobile phase buffer and solvent with freshly prepared or newly opened stock. 
While the RPC method utilizes solvent formulations that are directly commercially available and obtainable from a single manufacturing batch, preparation of HILIC solvents is typically more operator dependent due to their tailored composition. Separation of the HILIC mobile phase into unblended aqueous and organic components (consistent with the recent work of Jacob et al. 44 but deviating from most other published HILIC implementations for urine analysis 21,31,45,46 ) and reliance on the UPLC hardware to establish a precise initial mixture of organic solvent with a small aqueous component (5%) resulted in a simple and repeatable mobile phase preparation procedure, contributing to the high degree of retention time precision observed.
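The QC counts quoted above for the AW study are consistent with one QC injection for every 5 study samples, alternating between SR and LTR. A minimal sketch of such a run order follows; placing the QC after each block of 5, and the sample labels used, are assumptions for illustration.

```python
from itertools import cycle


def build_run_order(n_study, block=5, qc_types=("SR", "LTR")):
    """Interleave study samples with alternating QC injections.

    One QC sample is injected after every `block` study samples,
    cycling through `qc_types` in order.
    """
    qc_source = cycle(qc_types)
    order = []
    for i in range(1, n_study + 1):
        order.append(f"S{i:04d}")  # hypothetical study sample label
        if i % block == 0:
            order.append(next(qc_source))
    return order


for n_study in (1040, 1000):  # AW study: RPC and HILIC sample counts
    run = build_run_order(n_study)
    n_qc = sum(1 for label in run if label in ("SR", "LTR"))
    print(n_study, "study samples ->", n_qc, "QC,", len(run), "total injections")
```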
Raw UPLC-MS Data Quality of Molecular Targets. The quality of the raw UPLC-MS data was assessed by integrating the extracted ion chromatogram (EIC) peak areas of the MR and IS molecular targets and calculating their RSD values within each QC sample type. These results are summarized in Table 1.
Separate treatment of the SR and LTR samples is representative of their potential use as independent control and validation QC samples, the merits of which have been discussed previously. 29 Consistent precision was observed between SR and LTR sample sets for each analysis type. Within the RPC analyses, negative mode detection generally produced slightly higher measurement variation than positive mode detection. As the samples analyzed on each RPC system were split aliquots from the same set of prepared samples, the difference in assay performance appears to be a consequence of either interinstrument variance or the mode of operation, the latter being consistent with previous reports of positive mode ESI-MS outperforming negative mode when using the same chromatographic method. 23,47 Nevertheless, the overall results demonstrate a high degree of analytical precision within the raw data across all assays and both independent large studies.
Feature Selection and Variance in UPLC-MS Profiling Data. The feature-extracted data set produced by automatic peak detection, grouping, and integration will invariably contain noise from the analytical system (e.g., signals from mobile phase chemical contaminants) as well as potential artifacts from the feature extraction process. Depending on the parameters used for feature extraction, such signals can amount to a substantial portion of the total number of features detected and [...] 48,49 and are therefore not well suited to large epidemiological studies of populations without distinct subgroupings. We utilize an alternate strategy in which the intensity of features must be correlated to the matrix concentration in a series of diluted SR samples in order to be retained for further analysis. This filter has no dependency on the study design and is therefore applicable in both small discriminant studies and large phenotyping efforts. Beyond its use as a system noise removal tool, the dilution series filter also helps to ensure that the observed signal of a given feature and its relative concentration in the sample are positively correlated, benefiting the interpretation of profiling data. The effects of dilution series filtration are shown using a representative feature set (AW RPC+) as a selected example, illustrating the distribution of eliminated and passing signals, including the removal of chemical noise (Figure SI-3).

Table 1. Peak Area Precision of Reference Standards within the Alzheimer's (ALZ) and AIRWAVE (AW) Studies for Reversed-Phase Chromatography (RPC) and Hydrophilic Interaction Liquid Chromatography (HILIC) Analyses a
a Results are expressed in terms of relative standard deviation (RSD%), and measurements with variance greater than 15% RSD are highlighted in blue. Different labeled versions of taurine and creatine were utilized between the ALZ and AW studies (see the Supporting Information for details).

Data sets may be further refined by removal of feature groups that do not meet an arbitrary threshold of peak area measurement precision prior to downstream analysis. This approach, utilizing RSD values derived from repeated measurements of a pooled QC sample, is becoming increasingly mainstream in molecular profiling literature. 50 However, such an approach fails to account for the relationship between the observed analytical and total (including biological) variation in each chemical species measured. Using the same selected example data set (AW RPC+), the comparison of analytical and total variance observed in the data set was assessed by juxtaposition of the RSD values calculated for each feature group within the SR, LTR, and study samples. The results are shown in Figure 2 (results from AW RPC− and HILIC+ assays are included as Figures SI-4 and SI-5, respectively). The observed analytical variance is generally consistent between SR and LTR sample measurements for each feature group, and for the vast majority of features, the variance observed within the study samples is greater than the variance observed within either QC sample type. This gives confidence that the study sample measurements contain information potentially relevant to phenotypic variation. For this selected case, the 30% RSD threshold often adopted in the literature as a data filter does appear to be sensible, albeit arbitrary, for eliminating feature groups that do not demonstrate substantial variance in study samples beyond that measured in the QC samples (indicating low biologically relevant variation).
However, elimination of feature groups according to the ratio of observed variance between the QC samples and study samples may ultimately prove to be a more relevant criterion, allowing the RSD threshold to be adaptive with respect to the degree of biological variance implicit within each feature group. This approach is therefore proposed here as an alternative to data filtration based on a static QC sample RSD value.
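The contrast between a static RSD threshold and a ratio-based criterion can be illustrated as follows. This is a hedged sketch rather than the procedure applied in this work: the function names, the `max_ratio` cutoff, and the example peak areas are hypothetical, and positive peak areas are assumed.

```python
from statistics import mean, stdev

def rsd_percent(areas):
    """Relative standard deviation (%) of a list of positive peak areas."""
    return 100.0 * stdev(areas) / mean(areas)

def passes_static_filter(qc_areas, max_rsd=30.0):
    """Static filter: QC RSD must not exceed a fixed threshold."""
    return rsd_percent(qc_areas) <= max_rsd

def passes_ratio_filter(qc_areas, study_areas, max_ratio=0.5):
    """Adaptive filter: QC RSD must be small relative to study sample RSD.

    max_ratio is a hypothetical cutoff used only for illustration.
    """
    return rsd_percent(qc_areas) <= max_ratio * rsd_percent(study_areas)

# Hypothetical feature A: precise in QC samples, variable in study samples.
qc_a = [100, 102, 98, 101, 99]
study_a = [60, 140, 90, 120, 95]

# Hypothetical feature B: QC RSD is below 30%, yet the study samples vary
# no more than the QC samples do.
qc_b = [100, 130, 75, 120, 80]
study_b = [95, 125, 80, 115, 90]
```

In this example both features pass the static 30% filter, but only feature A passes the ratio filter, since feature B shows no variance in the study samples beyond that of the QC samples.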
Raw UPLC-MS Quality of Global Profiling Data. To complement the targeted assessment of data quality with a more global interrogation of measurement precision, the distribution of RSD values 51 for all features passing the dilution series filter was calculated for the SR samples. Figure 3 illustrates the distributions for each analytical method. For visual clarity, the x-axes have been truncated at 80% RSD; however, all data were incorporated in the calculation of quality statistics. The data have also been subdivided on the basis of the mean feature group intensity in SR samples, into the bottom, top, and middle-two quartiles (shown in blue, red, and green) to illustrate the relation between feature intensity and precision. The median RSD values for the RPC+ method were 16.1% and 12.7% for the ALZ and AW studies, respectively, 19.3% and 18.3% for the RPC− assay, and 21.4% and 24.3% for the HILIC+ assay. It is notable that RPC remains the more robust and dependable chromatographic method, justifying efforts at expanding its molecular coverage of small polar molecules.
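The intensity-stratified summary described above can be sketched as follows. This is illustrative only: the function names and data layout are assumptions, and the quartile boundaries come from Python's default `statistics.quantiles` method, which may differ from the calculation used for Figure 3.

```python
from statistics import mean, median, quantiles, stdev

def rsd_percent(areas):
    """Relative standard deviation (%) of a list of positive peak areas."""
    return 100.0 * stdev(areas) / mean(areas)

def median_rsd_by_intensity(sr_features):
    """Median SR RSD overall and per intensity stratum.

    sr_features: dict mapping a feature name to its peak areas in the
        repeated SR injections (hypothetical layout).
    Features are split by mean SR intensity into the bottom quartile,
    the middle two quartiles, and the top quartile.
    """
    means = {name: mean(a) for name, a in sr_features.items()}
    rsds = {name: rsd_percent(a) for name, a in sr_features.items()}
    q1, _, q3 = quantiles(list(means.values()), n=4)
    strata = {"bottom": [], "middle": [], "top": []}
    for name, m in means.items():
        key = "bottom" if m < q1 else "top" if m > q3 else "middle"
        strata[key].append(rsds[name])
    summary = {"all": median(rsds.values())}
    # Report only strata that contain at least one feature.
    summary.update({k: median(v) for k, v in strata.items() if v})
    return summary
```

All features contribute to the summary statistics; truncating a plot axis for display, as in Figure 3, would not affect these values.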
When considering these values in the greater context of values reported in the literature for metabolic profiling of urine, it is important to note that they differ from other reports where data correction, normalization, or selection is performed prior to reporting data set quality. 23,29,30,44,45,47,52−59 While these approaches may be fit for the purposes of comparing the combined analytical and informatic precision of the final data sets for biological interpretation, they make direct comparison of raw data quality difficult. Nevertheless, viewed as a whole, these results clearly illustrate a high degree of raw analytical precision throughout the feature measurements among all assays in two independent large studies, demonstrating the quality of data generated by the analytical platform and methods.

MS Detector Performance and Gain Control. Throughout each experiment, automatic control of the voltage applied to the MS detector was used to mitigate changes in detector gain during sustained analysis. The mean change in applied voltage was approximately 0.15 V per sample across all assays in each experiment (an exemplar scatter plot of detector voltage across an experiment is provided in Figure SI-6). In addition to contributing to the precision in peak area described previously, the stabilizing effect of this adaptive control system helps to maintain the fundamental ion detection efficiency of the instrument, mitigating the potential drift and eventual loss of low intensity signals below the limit of detection. This approach is conceptually different from data normalization or other methods of correction which are applied after data acquisition to correct for longitudinal trends in signal intensity, as the latter cannot correct for changes in relative response with respect to absolute signal intensity which arise from inefficient ion detection and they cannot restore lost signals. It is also important to note that the magnitude of voltage adjustment required per sample is calculated on background chemical noise from the LC effluent during column equilibration and is therefore reliant neither on the composition or the concentration of a specific analyte nor on the experimental data itself.
Relevance of Data Refinement Approaches. Reducing dependency on data treatment procedures in order to elucidate subtle effects in large populations is a key aim in the development of high precision analytical assays. Here, we have demonstrated that achieving batch-free analysis is key in producing high precision raw data, mitigating the need for computationally intense data correction analytics and thereby limiting the potential for overcorrection and artifact introduction. Nevertheless, the development of mathematical analytics for correcting sample run-order and batch effects remains necessary and beneficial to molecular profiling studies. While the work presented herein demonstrates a high degree of raw data quality achievable, it also confirms the persistence of analyte-specific longitudinal trends (drift) in such analyses.
A detailed investigation of the slightly increased variance observed in the negative mode analysis (relative to positive mode) indicated greater longitudinal trends in feature intensity as the cause of the higher RSD values, rather than increased random variance in measurement alone. Indeed, feature-specific trends are commonly observed throughout the data sets investigated here despite the high degree of overall precision. The behavior of labeled L-leucine in the ALZ RPC− data set (having the highest RSD value in the LTR samples) is illustrated in Figure 4 as an exemplary case. Feature-specific LC-MS data correction procedures such as curve fitting 29 and LOESS regression 22 may be appropriately applied and expected to perform well, further improving the data precision in a feature-specific manner. To illustrate this in the selected example, a simple cubic spline was fitted to the SR samples and interpolated to correct the independent LTR samples, eliminating the longitudinal drift and reducing the measurement RSD from 13.8% to 3.8% in the LTR samples (Figure 4). Additionally, it is important to note that the analytical continuity achieved here is subject to unexpected disturbance, as hardware or software errors may still introduce batch effects into an otherwise high precision analysis. In these instances, batch correction tools that produce high precision data sets without erroneously constraining meaningful biological variance remain valuable assets.
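The idea of fitting a trend to the SR samples and using it to correct independent LTR samples can be sketched as below. Note that this sketch substitutes a least-squares cubic polynomial for the cubic spline used in this work, to keep the example dependency-free. The function names, the rescaling to the mean SR area, and the data layout are assumptions, and the sketch does not reproduce the values reported for Figure 4.

```python
from statistics import mean

def fit_cubic(x, y):
    """Least-squares cubic polynomial fit; returns coefficients c0..c3.

    Requires at least four distinct x values.
    """
    n = 4
    # Normal equations (A^T A) c = A^T y for the basis 1, x, x^2, x^3.
    ata = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    aty = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, n):
            f = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= f * ata[col][c]
            aty[r] -= f * aty[col]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        s = sum(ata[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (aty[i] - s) / ata[i][i]
    return coef

def correct_drift(sr_order, sr_area, ltr_order, ltr_area):
    """Correct LTR peak areas using a trend fitted to the SR samples only.

    Run order is scaled to [0, 1] for numerical stability. Each LTR area is
    divided by the fitted trend at its run order and rescaled to the mean
    SR area (an illustrative choice of reference level).
    """
    span = max(max(sr_order), max(ltr_order))
    coef = fit_cubic([t / span for t in sr_order], sr_area)

    def trend(t):
        return sum(c * (t / span) ** k for k, c in enumerate(coef))

    ref = mean(sr_area)
    return [a * ref / trend(t) for t, a in zip(ltr_order, ltr_area)]
```

Because the trend is estimated from the SR samples alone, the LTR samples remain an independent check on whether the correction has improved precision.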

■ CONCLUSIONS
These results demonstrate the feasibility of collecting exploratory UPLC-MS data suitable for the elucidation of subtle metabolic effects within epidemiological studies. The system described is capable of continuous analysis, producing data with exceptional precision when applied at a large scale. Minimization of signal loss is pursued throughout the development of the UPLC-MS platform configuration and assay set, maximizing sensitivity in order to effectively trade it for longitudinal stability. This is achieved in part by adjustment of the MS ion optics and ion source parameters to maximize sensitivity with commensurate minimization of sample material injected. Together, these steps limit signal loss due to source and ion optic contamination with biological matrix and eliminate the need for related intervention and maintenance (with consequential batch effects) during large scale experiments. Additional measurement stability is provided by automatic MS detector gain control, adaptively compensating for trends in instrument performance without reliance on the experimental data. By mitigating signal loss end-to-end, the UPLC-MS system becomes a robust platform for molecular profiling of the imprinted metabolic processes observable in human urine. Acquisition of high precision data reduces the need for informatic correction but does not eliminate it entirely as longitudinal trends and batches due to software/hardware failure can still pose threats, highlighting system robustness as a key factor in large scale phenotyping applications.

Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.analchem.6b01481. Experimental section; gradient conditions for the RPC separation; system pressure traces; gradient conditions for the HILIC separation; optimization of the UPLC-MS system configuration; retention time precision of reference standards; illustration of SR sample dilution