Capturing Biochemical Diversity in Cassava (Manihot esculenta Crantz) through the Application of Metabolite Profiling

Cassava (Manihot esculenta Crantz) is the predominant staple food in Sub-Saharan Africa (SSA) and an industrial crop in South East Asia. Despite focused breeding efforts for increased yield, resistance, and nutritional value, cassava breeding has not advanced at the same pace as that of other staple crops. In the present study, metabolomic techniques were implemented to characterize the chemotypes of selected cassava accessions and assess potential resources for the breeding program. The metabolite data were used to describe the biochemical diversity available in the panel, identifying South American accessions as the most diverse. Genotypes with distinct phenotypic traits showed a representative metabolite profile and could be clearly identified, even if the phenotypic trait was a root characteristic, e.g., high amylose content.


■ INTRODUCTION
Cassava (Manihot esculenta Crantz) is a woody perennial shrub with edible storage roots (further referred to as roots) which provide a major source of calories for many populations, especially those in Sub-Saharan Africa. 1 Cassava plants are able to grow on marginal soils and provide feasible yields under drought and other stresses. 2 Breeding for cassava varieties with improved yield, biotic and abiotic resistance, and, more recently, biofortification has been ongoing since 1937. 3 Despite these attempts to deliver new improved varieties, progress has been limited in comparison to other global staple crops. This is in part due to the disadvantages associated with clonal propagation, the common cultivation practice for cassava, which limits germplasm diversity. 4 In addition, improvements in one trait often adversely affect other traits; for example, higher yield has been shown to decrease protein content, and high carotenoid content to lower the content of starch, the most important bioproduct of the crop. 5,6 A focus of the CGIAR Research Program is to combine generic tools and resources that facilitate the implementation of modern breeding techniques, the lack of which is a limiting factor in the development of better root, tuber, and banana bearing plants. For example, new breeding strategies, combining next-generation sequencing techniques and metabolite profiling, have been shown to increase the breeding efficiency and the identification of candidate genes and marker-defined regions for varieties with multiple traits. 7,8 Furthermore, the screening of available germplasm and wild/diverse genetic resources and the incorporation of these plants into the breeding strategies increases the genetic gain and enriches the diversity of currently favored varieties. 9 This approach was successfully applied in other crops, such as tomato, resulting in the identification of quantitative trait loci. 10
The organism's chemical phenotype, regulated by its genetic background, comprises all present chemical end-products associated with cellular processes and can be measured with metabolomics techniques. 11 The chemical properties of metabolites present in an organism, and even in a specific tissue, can vary distinctly, which demands the use of several analytical platforms for a broad view of the metabolome. Therefore, the primary objective of this study was to establish chemical screening methods for primary and secondary metabolites that can be widely applied to classify and assess cassava accessions. The second objective was to identify potential biomarker metabolites from the screening methods and link them to characteristic traits, in effect elucidating the relationship between genotype and phenotype/traits. For this purpose, cassava varieties originating in the native regions Central America/Caribbean and South America as well as improved varieties from Africa were included in a panel. The plants were analyzed as in vitro plantlets and the chemotypic differences elucidated. A comparison between five cassava accessions highlighted the difficulties of comparing plants grown in vitro with plants cultivated using field practices.

■ MATERIALS AND METHODS
Plant Cultivation and Material Generation. Twenty-three cassava varieties (Table 1) were harvested twice from in vitro stocks maintained at the cassava genetics laboratory at CIAT. The plantlets were grown on 4E medium, which included Murashige and Skoog (MS) salts, 12 0.04 mg/L 6-benzylaminopurine (BAP), 0.05 mg/L gibberellic acid (GA3), 0.02 mg/L α-naphthaleneacetic acid (NAA), 1 mg/L thiamine, 100 mg/L myo-inositol, and 2% sucrose, at pH 5.7−5.8. 13 Six meristem apexes per cassava accession were harvested and individual meristems placed back on 4E medium (150 mL) in a 500 mL glass jar. A total of 138 jars (6 × 23) were placed in a completely randomized design on a 12 h photoperiod for a 12-week growth period. The in vitro plantlets were carefully removed from the medium, frozen in liquid nitrogen, and lyophilized.
Five cassava landraces (BRA1A, COL22, COL638, CUB23, and PER183) were taken to the field and grown under CIAT's standard field conditions. Ten months after planting, leaves, stems, and roots were harvested, immediately frozen in liquid nitrogen and lyophilized.
Extraction of Metabolites. Freeze-dried tissue (approximately 200 g) was ground into a fine powder. Samples, including quality controls (QC, a pool of all samples), were weighed (10 ± 0.5 mg) into plastic tubes and extracted as described previously. 14 Metabolites were separated, according to their hydrophilic properties, into a polar and a nonpolar phase, and aliquots of each phase were dried down immediately after extraction.
GC-MS Analysis of Polar Extracts. An aliquot of the polar phase (200 μL) was removed and internal standard ([D4]succinic acid, 10 μg) added before dry down. Dried samples were derivatized by methoxymation and silylation as previously described 14 and analyzed by GC-MS based on the literature, 15 using a 10:1 split mode and a heat gradient of 70−325 °C. Metabolites were identified with respect to an in-house library (Supplementary Table S1) based on retention time, retention indices, and mass spectrum 14 and quantified relative to the internal standard.
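Relative quantification against the internal standard can be sketched as follows. This is a minimal illustration and not the authors' processing script: the function name and the peak-area values are hypothetical, and an equal detector response for analyte and internal standard is assumed. Only the 10 μg of [D4]succinic acid is taken from the text.

```python
def relative_quantity(peak_area, is_peak_area, is_amount_ug=10.0):
    """Return the analyte amount in ug internal-standard equivalents.

    Assumption: analyte and internal standard give the same detector
    response per unit mass, so the area ratio scales the known IS amount.
    """
    if is_peak_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return (peak_area / is_peak_area) * is_amount_ug


# Hypothetical peak areas for one metabolite in one aliquot.
example = relative_quantity(peak_area=2.5e6, is_peak_area=5.0e6)
```

The result is a relative value in the analyzed aliquot; comparing samples additionally relies on the uniform sample weight (10 ± 0.5 mg) described above.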
LC-MS Analysis of Polar Extracts. Each dried aliquot of the polar phase (700 μL) was resuspended in methanol/water (1:1, 100 μL) and internal standard (homogentisic acid, 5 μg) added. Samples were filtered using a syringe filter (nylon, 0.45 μm) before analysis based on a previously published method. 15 All solvents for analysis were of LC-MS grade. Solvent A (water and 0.1% formic acid) was held at 100% for 1 min, followed by a gradient up to 35% solvent B (acetonitrile and 0.1% formic acid) until 18 min and to 95% B until 19 min. Solvent B was then held at 95% for 4 min and the column returned to the initial conditions (100% A) within 1 min and equilibrated for 5 min. Detection of eluting compounds was performed in a high resolution ESI-q-TOF Bruker maXis mass spectrometer. MS data were collected in negative centroid mode, from 50 to 1200 m/z for 0.5 s. Source settings were as follows: end plate offset and capillary voltages at −500 and 3500 V, respectively; nebulizer gas (nitrogen) at 1.3 bar; dry gas at 8 L/min; and dry temperature at 195 °C. Transfer settings were as follows: ISCID, quadrupole, and collision cell energies at 0, 5, and 5 eV, respectively; funnel, multipole, and collision rf at 200 Vpp; ion cooler rf at 40 Vpp; transfer time at 40 μs; and prepulse storage at 1 μs. Calibration was conducted at the end of each run. Data analysis was performed based on the R package metaMS 16,17 using the default settings with a retention time window match set to 0.5 min.

Table 1 note: Zxx adaptation refers to the adaptation of cassava plants to a specific edapho-climatic zone. Varieties indicated by * were grown in vitro and in the field.

Journal of Agricultural and Food Chemistry

Chromatographic Analysis with UPLC-DAD. For analysis of carotenoids and chlorophylls, an aliquot of the nonpolar phase (350 μL) was dried, resuspended in ethyl acetate/acetonitrile (1:9), and analyzed as previously described. 14 Metabolites were identified through specific retention time and UV/visible light spectrum and quantified from dose−response curves. 18
Data Processing and Statistical Analysis. Principal component analysis (PCA) with Pareto scaling was performed with Simca P 13.0.3.0 (Umetrics, Sweden). All other statistical analyses were performed using XLSTAT 2017 (Addinsoft, Paris, France). For all statistical analyses, the number of biological replicates varied between three and nine depending on the materials provided (Supplementary Table 2). Discriminant analysis was based on traits as dependent variables and metabolites as explanatory variables and included validation with half the data set. As additional settings for discriminant analysis, within-class covariance matrices were assumed to be equal and prior probabilities were used to describe the classification functions. Partial least-squares (PLS) regression was used to correlate traits and metabolites. This included jackknife (leave-one-out, LOO) cross validation and validation with a random selection of half the data set. The correlation between five selected varieties, grown both in vitro and in the field, was confirmed with the RV coefficient using the P-value computation option Extrapolation. For one-to-one comparison between varieties, significant metabolite changes in leaf and root were established through pairwise comparison with Student's t test (P < 0.05) and overlaid with biochemical pathways constructed specifically with BioSynLab (Royal Holloway University of London).
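For readers unfamiliar with Pareto scaling, the preprocessing step applied before PCA can be sketched as below. This is an illustrative stand-alone version, not the Simca P implementation: the function name and the example data are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption. Each metabolite column is mean-centered and divided by the square root of its standard deviation, which reduces the dominance of abundant metabolites while retaining more of the original data structure than unit-variance scaling.

```python
import math


def pareto_scale(matrix):
    """Mean-center each column and divide by the square root of its
    sample standard deviation (rows = samples, columns = metabolites)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    scaled = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        mean = sum(column) / n_rows
        variance = sum((v - mean) ** 2 for v in column) / (n_rows - 1)
        divisor = math.sqrt(math.sqrt(variance))  # sqrt of the standard deviation
        for i in range(n_rows):
            # Constant columns carry no information and are left at zero.
            scaled[i][j] = (column[i] - mean) / divisor if divisor > 0 else 0.0
    return scaled


# Hypothetical data: 3 samples (rows) x 2 metabolites (columns).
data = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
scaled = pareto_scale(data)
```

The scaled matrix would then be passed to a PCA routine; the PCA itself is not reproduced here.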

■ RESULTS AND DISCUSSION
The metabolite composition of various cassava varieties (Table 1) was analyzed with three different platforms in order to provide a comprehensive range of metabolites. All 23 varieties were analyzed at the in vitro plantlet stage. Five varieties were chosen for analysis of leaf, stem, and root material of cassava plants grown under field conditions. The methods for LC-MS, GC-MS, and UPLC-DAD analysis were adapted for cassava samples (extract volume dried, injection volumes, and dilutions) to create a metabolite profiling approach amenable for cassava plants. Targeted analysis included identified metabolites from all three platforms used.
Diversity Detected in in Vitro Plantlets. Over 9000 features were detected in the untargeted analysis by LC-MS of in vitro plantlet samples (Supplementary Table S2) and were scaled to the internal standard and the quality controls. The combined untargeted data were displayed by PCA (Figure 1) and showed a trend of the varieties on the basis of geographical origin. African varieties were located toward the center, surrounded by varieties from Central America/Caribbean and South America. The varieties from South America showed the widest distribution in the score plot, which indicates a higher variance of metabolites and corresponds with several studies describing the Amazon basin as the origin of cassava and crop domestication patterns. 6,19,20 Three varieties, BRA488, PER496, and VEN77, were excluded from the combined untargeted analysis as their data were only available from one in vitro harvest, which made it impossible to confirm whether detected metabolite features were part of the chemotype or a result of the growth conditions.
Targeted analysis included analysis by GC-MS and UPLC-DAD as well as identification of metabolites from untargeted LC-MS analysis. The resulting data set included over 100 metabolites representing amino acids, sugars, organic acids of the TCA cycle, phenylpropanoids, isopentenyl pyrophosphate derived pigments (IPP), and metabolites of other chemical groups (e.g., linamarin) (Supplementary Table S3). The data set was subjected to (i) PCA to show the similarities/differences between cassava varieties and (ii) discriminant analysis with PLS regression to link the traits of varieties to metabolites/metabolite groups. The PCA score plot indicated that some traits have a more similar metabolite composition than others (Figure 2). Discriminant analysis showed that 76% of the identified metabolites were significantly different (P < 0.05) between traits and that each trait could be identified by its metabolite profile (Supplementary Table S4a−c and Supplementary Figure S5).
Several varieties of the cassava panel were chosen to present contrary trait properties (e.g., high/low or resistant/susceptible), leading to the identification of metabolic similarities/differences between these varieties. In the case of amylose content, sugar content, carotene content, and zone adaptation, the respective varieties were located closest to each other in the PCA (Figure 2). Additionally, the four varieties with amylose and sugar content root traits were situated very close together, which indicates a similar metabolic leaf phenotype for those two traits. This would be expected, as both traits were bred for their root carbohydrate content, forcing changes in the same biosynthetic pathways. 21,22 Nevertheless, a clear separation between the amylose and sugar traits was observed in the PCA plot, indicating the metabolic difference between phenotypes with bound or free sugar contents. Regression analysis was applied to verify the correlation between traits and metabolites and showed a lack of distinct correlations for varieties with sugar content root traits. The varieties with a higher amylose root content were correlated with higher levels of TCA cycle intermediates (Supplementary Figure S5). The waxy potential variety showed no similarity to the low amylose content variety, which suggests that the processes for amylose-free starch in waxy varieties involve different mechanism(s) compared to low amylose varieties. 23 In the case of thrips, bacteriosis, and cassava mosaic disease (CMD) traits, the respective resistant and susceptible varieties clustered away from each other (Figure 2). This could be related to the yet to be elucidated stress responses, which can comprise constitutive and/or induced mechanisms. 24 If the biotic stress traits are constitutive, then the metabolic composition of resistant varieties can be distinguished from that of susceptible varieties even without the presence of the biotic stress.
Only two biotic stress traits were associated with a particular group of metabolites. Regression analysis established a positive correlation of glycosylated phenolics with bacteriosis resistance and of free catechin and epicatechin levels with thrips susceptibility (Supplementary Figure S5). Catechin and epicatechin are polymer units for condensed tannins, which act as feeding deterrents. 25,26 In the case of the thrips susceptible variety COL2436, free catechin/epicatechin levels were about five times higher compared to the resistant variety PAN139 (Supplementary Table S3). This could indicate a reduced level of condensed tannins present as a constitutive response and, therefore, the susceptibility of this variety. 24 Two other traits with respective phenotypes were carotene content and culinary quality. The high carotene trait was correlated with higher levels of IPP and showed a negative correlation to lipid/cell wall precursors and amino acids. The exceptions were amino acids involved in arginine biosynthesis, which had a positive correlation with the high carotene trait (Supplementary Figure S5) and are linked to IPP biosynthesis via glutamic acid in chloroplasts. 27,28 In cassava, culinary quality is related to soluble sugars and linamarin content, which define the bitterness of the root. 29 The variety with high culinary quality correlated with higher levels of monosaccharides and intermediates of the TCA cycle, which would suggest increased levels of glucose and fructose for transport to starch biosynthesis in the roots, as observed previously. 22 This leads to the hypothesis that in vitro leaves can be used to screen for root phenotypes.
Ascertain the Correlation between in Vitro and Field Leaf Material of Five Varieties. Environmental differences, such as sunlight, soil properties, and watering regime, can directly and indirectly influence metabolite levels/composition of leaves through photosynthetic processes and nutritional uptake. 30 The plants in the present study were grown in vitro and in the field, two very different conditions depicting a sterile, controlled environment and an environment of fluctuating properties. Therefore, a comparison of the leaf tissue of five field varieties (BRA1A, COL22, COL638, CUB23, and PER183) was implemented to elucidate whether conclusions can be drawn from in vitro to field plants.

The leaf tissue of both growth conditions showed expected metabolic differences. Quantitatively, these differences included changes of up to 10-fold in a single metabolite and varied between metabolites (e.g., glutamic acid, valine, and trehalose/turanose). Many metabolites showed no significant difference between leaf material from in vitro and field grown plants (Supplementary Table S6).
Some metabolites were detected in only one of the two growth conditions. Hence, only metabolites present in both conditions were used for further analysis. Due to the difference in individual metabolite quantities, a separate PCA was chosen to show the distribution of the five varieties within each growth condition (Figure 3a,b). COL22 was located in the center of both clusters, but while PER183 clustered away in the field data set, the in vitro data showed a close metabolic similarity between PER183 and COL22. The correlation between the overall metabolite data measured for in vitro and field conditions was significant (RV coefficient = 0.455, P < 0.0002); in an individual comparison of each variety, only PER183 showed a significant correlation (RV coefficient = 0.726, P = 0.028).
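The RV coefficient used here measures the similarity of two data tables recorded on the same samples and ranges from 0 (no shared structure) to 1 (identical configuration up to rotation and overall scaling). A minimal sketch of the statistic is given below; it is not the XLSTAT routine, the function names and example values are hypothetical, and the P-value computation is not reproduced. Rows are assumed to be the same samples in the same order in both tables, and both tables are assumed to vary (non-constant columns).

```python
import math


def center_columns(matrix):
    """Subtract the column mean from every entry (rows = samples)."""
    n = len(matrix)
    means = [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def cross_product(matrix):
    """Return the n x n matrix X X^T for a row-wise sample matrix X."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix] for r1 in matrix]


def trace_product(a, b):
    """trace(a b) for symmetric a and b, i.e. the sum of elementwise products."""
    return sum(p * q for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def rv_coefficient(x, y):
    """RV = tr(Sx Sy) / sqrt(tr(Sx Sx) * tr(Sy Sy)), with Sx = X X^T, Sy = Y Y^T."""
    sx = cross_product(center_columns(x))
    sy = cross_product(center_columns(y))
    return trace_product(sx, sy) / math.sqrt(
        trace_product(sx, sx) * trace_product(sy, sy)
    )


# Hypothetical metabolite levels: 4 samples x 2 metabolites per condition.
in_vitro = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
field = [[2.0, 1.0], [1.0, 4.0], [5.0, 2.0], [3.0, 3.0]]
rv = rv_coefficient(in_vitro, field)
```

Because the statistic is invariant to overall scaling, a table compared with a rescaled copy of itself gives an RV of 1, which is a convenient sanity check.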
The PCA plots of field and in vitro material highlighted a difference between metabolite compositions. Nevertheless, photosynthesis related pigments and phenylpropanoids showed a predominant influence on the cluster trends of varieties under both growth conditions (Figure 3). These quantitative differences were partially expected, as the field grown leaf material was subjected to varying light conditions and unknown biotic/abiotic variables, both influencing quantities and composition of secondary metabolites. 31,32 The location of COL638 in both score plots was linked to phenylpropanoids, which have previously been associated with the bacteriosis resistance of this variety and seem to be a phenotypic characteristic from an early growth stage. 33 Primary metabolites (amino acids, organic acids of the TCA cycle, and sugars) were associated with the same varieties in in vitro plantlets and field leaf material (Figure 3). Amino acids clustered with PER183 and intermediates of the TCA cycle with BRA1A. The association to sugars varied slightly and showed a correlation to COL638 and CUB23 under in vitro conditions and to CUB23 and COL22 under field conditions.
Despite the influence of different environmental growth conditions, several metabolic similarities between the growth stages could be detected. This suggests that some varieties have endogenous genetic mechanisms influencing the metabolism in a similar manner throughout plant development, regardless of the environment. 34,35 Overall, however, the results suggest that direct conclusions about the leaf metabolite composition cannot be drawn from in vitro to field plants.
Tissue Function Influences Metabolite Composition in Leaf, Stem, and Root. The targeted analysis of the field material revealed clear differences in the metabolite composition of leaf, stem, and root (Figure 4a) and specific metabolite groups associated with one or more plant tissues (Supplementary Table S9, Supplementary Figures S7 and S8). Leaf and stem material both showed a more diverse metabolite composition between the five varieties. Photosynthesis related pigments and cell wall precursors clustered with leaf samples, whereas phenylpropanoids and linamarin clustered with stem samples, which is consistent with housekeeping characteristics of these tissue types. 36 Amino acids and organic acids of the TCA cycle showed an even influence on the leaf and stem samples, and sugars were associated with all three plant parts. The root material showed the least diversity between the five varieties, reflecting the main function of roots, carbohydrate storage. 21 The location of the varieties relative to each other was different within the leaf, stem, and root clusters (Figure 4a). However, the direct comparison of leaf and root material (Figure 4b) revealed a very similar allocation of the varieties, with PER183 and CUB23 as two comparative extremes. A predictive comparison from leaf to root material was attempted, but due to the small number of varieties used, the only predictive distinction that could be made was between PER183 and CUB23.
The stem is the transport organ between the leaves and roots. Hence, its metabolic composition varies throughout the day and growth stage 21 and can be influenced by the stem flow rate and the starch accumulation in the roots. 22 These findings were reflected in the metabolite data and emphasize the stem as an unreliable source of information for profiling purposes when analyzing only a snapshot of the metabolism. 37,38
Characteristic Traits Show Significant Differences in Leaf over Root. The metabolite profiling method used facilitates direct comparisons between varieties to elucidate specific metabolic differences regarding single metabolites or metabolite pathways. Three of the varieties grown in the field have opposing traits regarding postharvest physiological deterioration (PPD) properties and β-carotene content. PPD is a stress response of storage roots to a burst of reactive oxygen species (ROS) after mechanical damage during harvest. The metabolic response is activated within 72 h and, similar to a wound healing response, does not, in general, show a distinct chemotype before the damage. 39 Nevertheless, the degree of PPD susceptibility/tolerance is influenced by the endemic content of linamarin (ROS production through HCN release) and of scavenging metabolites, e.g., β-carotene. 40 The variety PER183 is PPD tolerant and has a low β-carotene content. Hence, it can be compared to COL22, a PPD susceptible variety, and BRA1A, a variety with high β-carotene content. The identified metabolites of those varieties were compared within the leaf and root materials and significant changes shown with a pathway display (Figure 5, Supplementary Table S10). The metabolite comparison showed a higher number of significant changes in the leaf compared to the root material for both the PPD and the carotene trait.
The comparison between COL22 and PER183 (Figure 5a,b) did not reveal many changes in the root material. The root material of COL22 had lower levels of glutamic acid, serine, and glycine as well as itaconic acid, turanose, fructose, and glycerol. About a third of the metabolites identified in the leaf material differed between the two varieties, and the majority of differences were lower metabolite levels in COL22. These metabolites comprised five amino acids including phenylalanine, the precursor of phenolic compounds, half of the organic acids of the TCA cycle, and intermediates of the feruloyl-malate pathway. In the case of COL22 and PER183, no significant difference was detected between the levels of two of the metabolites influencing the PPD reaction, linamarin and β-carotene. This would suggest either that the metabolic composition in the roots before mechanical damage does not influence the PPD reaction and that resistance/susceptibility is related to regulatory processes after mechanical damage, or that metabolites other than linamarin and β-carotene are responsible for the PPD reaction in cassava roots.
Interestingly, even though BRA1A had a higher β-carotene content in roots, the carotenoid and chlorophyll B contents in the leaf material were significantly lower compared to PER183 (Figure 5c,d). Other differences in BRA1A roots were higher levels of linamarin, fumaric acid, and GABA and lower levels of serine and glutamic acid. The higher content of β-carotene and linamarin in BRA1A, both influencing the effects of PPD in an adverse manner, raises the question of how BRA1A would perform under PPD induction in comparison with PER183. 40 The comparison to COL22 also showed that PER183 had higher levels of ferulic acid and -malates, trans-caffeic acid, and neochlorogenic acid. These differences might be the determining feature for frogskin disease resistance. Ferulic acid and caffeic acid are classified as phenolic compounds with repelling/inhibiting properties against herbivores and serve as precursors for mechanical structures strengthening the leaf surface/cuticle against virus transmission from feeding insects. 41,42
In conclusion, the present metabolomics approaches have illustrated the utility of the methodologies to chemically differentiate cassava accessions, as a means of (i) classifying diverse and redundant genotypes in overpopulated gene banks, complementing/validating the use of genotyping approaches, and (ii) characterizing parental materials used in future breeding efforts of the CGIAR Research Program.

Notes
The authors declare no competing financial interest. All relevant data are available within the manuscript and Supporting Information. For materials, please contact the corresponding author.

■ ACKNOWLEDGMENTS
The authors would like to thank Chris Gerrish and Sarah Duchmann for their excellent technical assistance.