Combining paired analytical metabolomics and common garden trial to study the metabolism and gene variation of Ginkgo biloba L. cultivated varieties

Secondary metabolites play a pivotal role in plant physiology and medicinal function. Much work has been carried out to uncover the genetic basis of plant secondary metabolite variation, but direct screening of gene variation in the whole genome is extremely time- and labor-consuming. Predicting a candidate locus of a single nucleotide polymorphism (SNP) will save much time and resources. In this work, we combined paired analytical metabolomics with a common garden trial to bridge the association between plant metabolism and related gene variation in Ginkgo cultivated varieties. Firstly, the leaves of 30 cultivated varieties of Ginkgo biloba L. grown in the same garden since 1990 were analyzed by UHPLC-QQQ MS/MS. Thirty-six metabolites in the flavonoid biosynthetic pathway were quantified. The biosynthetic rate of flavonoids in different cultivated varieties could reflect variation in the related enzyme genes, since the environmental influence was minimized. Thus, the role of SNPs in possibly varied genes was further associated with flavonoid synthesis. Results showed that after long-term environmental and artificial influence, different accessions of G. biloba showed a varied ability in flavonoid aglycone synthesis due to some gene polymorphism; this difference may be heritable but not obvious. Compared with previous methods, this strategy is advantageous in accuracy, low sample requirement and easier operation, providing effective information on phenotype–genotype association, and it can also be used in the heritability study of artificial breeding before large-area introduction.


Introduction
Ginkgo biloba is one of the oldest living gymnosperms. The leaves and standard extracts are consumed in both eastern and western countries as a best-selling functional food. 1 G. biloba is widely distributed in China, and the long-term effects of natural conditions as well as artificial selection have resulted in many cultivated varieties or strains. Generally, G. biloba was cultivated to show certain morphological traits, such as the size or shape of the kernel and leaf, as well as the texture of the wood. 2 However, its commercial value depends on the content of the functional flavonoids, which play key roles in plant physiology as attractants that enhance pollination and seed dispersal and as part of the plant defense mechanism, and which also possess excellent health-promoting abilities. [3][4][5][6] In China, much effort has been put into breeding strains of G. biloba that accumulate a higher concentration of total flavonoids. In G. biloba, most flavonoids are glycosides of quercetin, kaempferol and isorhamnetin, and structural modifications of the flavonoid skeleton produce different flavonoids. The types and concentrations of flavonoids may vary among different cultivated varieties. Thus, it is crucial to know whether these cultivated varieties differ in their abilities to form target flavonoid metabolites, and whether this trait is determined by genetic variation or environmental conditions, because only inheritable traits can be introduced into other cultivars. 4, 5
Direct screening of gene variation in the whole genome is extremely time- and labor-consuming. If evidence is available to predict the candidate locus of a single nucleotide polymorphism (SNP), then simpler methods such as direct sequencing, TaqMan markers or restriction fragment length polymorphism PCR can be used to save time and resources. 8,9 Metabolomics can reveal differentially expressed metabolites that are reflective of plants grown in different environments or affected by disease. 10,11 However, untargeted metabolomics may not provide detailed information on a pathway of interest because its conditions are not optimized for individual compounds. Metabolic flux analysis is an interesting tool to facilitate a better understanding of physiological processes. Isotopic enrichment of the metabolites is necessary to obtain the metabolic flux, and this is an accurate way to map the overall cellular functions; 12 however, this experiment is still difficult to perform. Because of concerns about the matrix effect, few analyses have been conducted to quantify plant metabolites, although in most cases the matrix effect of a plant extract is much lower than that of biological samples. In addition, the matrix effect can be minimized by using structurally similar internal standards, and with a paired analytical metabolomics approach that calculates and compares the product/substrate ratio of the enzymes in a pathway, flux mapping might also be conducted in an accurate and easy way.
In the present work, we proposed a reliable strategy to refine the application of metabolomics in a plant genome study; an overview of our scheme is shown in Fig. 1. This study carried out a common garden trial in which target plants from different origins were grafted in the same garden to minimize environmental influence. A "paired analysis" of targeted metabolomics, which comprehensively considers the conversion rate of each biosynthetic reaction in a pathway, was used to analyze the samples. The different cultivated varieties of G. biloba L. grafted from different origins since 1990 were selected as a case study of our experimental design. Thirty-six metabolites in the flavonoid biosynthesis pathway were profiled in each variety. Using the proposed paired analytical strategy, the metabolic flux and possibly varied genes were predicted. Then the relationship between SNPs and flavonoid synthesis was further examined by partial least squares regression analysis and homology analysis. The results demonstrated that Ginkgo cultivated varieties could be divided into 3 clusters based on flavonoid synthesis ability, and possibly varied genes were predicted. Compared with a genome sequencing study, this method proved accurate in method validation and has a low sample requirement, which suggests that targeted metabolomics could also be applied as a convenient strategy in studies of plant physiology.

Plant materials
The tested materials were collected from the National Ginkgo seed base in Pizhou (34°10′–34°40′ N, 117°42′–118°10′ E), Jiangsu, China. Tree branches with superior economic characters have been introduced since 1990 from the main Ginkgo producing areas, including Jiangsu, Shandong, Guangxi and Guizhou provinces. Seedlings 2-3 m high and 5 years old were used as rootstocks, with the grafting site starting at 1.8-2 m; each rootstock was grafted with 3-4 scions, and 3-6 strains were grafted for each clone. The origin of the grafted strains is shown in Table S1.† Fresh leaves were collected at the same time from three branches of the 30 accessions, and the specimens were stored in the State Key Laboratory of Natural Medicines, China Pharmaceutical University.

Metabolomics analysis of avonoid biosynthetic pathway by UHPLC-QQQ-MS/MS
The powder of dried Ginkgo leaves was extract with 70% aqueous methanol, and diluted before MS analysis. The metabolomics analysis was performed by a Shimadzu LC-30AD UHPLC tandem Shimadzu 8050 (Shimadzu, Japan) QQQ-MS/ MS. The mass spectrometer was operated in negative ion mode. The ion source parameters were set as: nebulizing gas ow, 3 L min À1 ; heating gas ow, 10 L min À1 ; interface temperature, 300 C; DL temperature, 250 C; heat block temperature, 500 C; drying gas ow, 10 L min À1 . Shimadzu Labsolution provided instrument control, data acquisition, and data processing. Analytes were quantied in multiple reaction monitoring modethe selection of precursor ion, product ion, collision energy, Q1/Q3 pre bias was automatically optimized by Labsolution. The parameters are listed in Table 1. The sample vials were maintained at 4 C in a thermostatic autosampler. The chromatographic separation was achieved on an Agilent

Paired analysis of avonoid/modied avonoid compared with individual compound evaluation
In order to analyze the conversion rate of each avonoid biosynthetic reaction, the ratio of both avonoid/modied avonoid needed quantitation. Thus, we conducted an analysis of a nearly whole panel of avonoid biosynthesis (Fig. 2). Conversion rates of eleven reactions were monitored, including: avonoid reduction reaction at position 2-3 catalyzed by FLS (including the reactions between dihydrokaempferol/kaempferol, dihydroquercetin/quercetin and dihydromyricetin/myricetin); methylation of phenolic hydroxyl catalyzed by OMT (for quercetin/isorhamnetin, myricetin/syringetin), hydroxylation at position 3 catalyzed by F3H (the reaction between naringenin/dihydrokaempferol), hydroxylation at the benzene ring catalyzed by 4-coumarate 3hydroxylase, Coum3H, (the reaction between p-coumaric acid/ caffeic acid) and glycosidation catalyzed by FGlcT including the form of mono glycosides belonging to isorhamnetin, quercetin, kaempferol and apigenin.
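The paired analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' processing script: the concentrations are hypothetical, the function name `conversion_ratios` is invented for the example, and the product/substrate orientation of the ratio is an assumption taken from the Introduction. A ratio is left undefined when the substrate is not detected.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical concentrations (ng/mL) for one accession; not data from the study.
concentrations: Dict[str, float] = {
    "dihydrokaempferol": 2.0,
    "kaempferol": 8.0,
    "quercetin": 5.0,
    "isorhamnetin": 0.0,  # treated as "not detected"
    "Isor-3-G": 12.0,
}

# (substrate, product) pairs; each pair stands for one monitored reaction.
reactions: List[Tuple[str, str]] = [
    ("dihydrokaempferol", "kaempferol"),  # FLS
    ("quercetin", "isorhamnetin"),        # OMT
    ("isorhamnetin", "Isor-3-G"),         # FGlcT
]


def conversion_ratios(
    conc: Dict[str, float], pairs: List[Tuple[str, str]]
) -> Dict[str, Optional[float]]:
    """Return product/substrate ratios; None when the substrate is not detected."""
    ratios: Dict[str, Optional[float]] = {}
    for substrate, product in pairs:
        s = conc.get(substrate, 0.0)
        p = conc.get(product, 0.0)
        # A ratio is only meaningful when the substrate was actually measured.
        ratios[f"{product}/{substrate}"] = p / s if s > 0 else None
    return ratios


ratios = conversion_ratios(concentrations, reactions)
```

Because both members of each pair come from the same extract of the same plant, the ratio acts as a self-controlled readout that is less sensitive to plant-to-plant differences in absolute concentration.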

Single nucleotide polymorphism (SNP) analyses
The total DNA was isolated from the Ginkgo leaves described above using the Plant Genomic DNA Kit (Tiangen Biotech Co., Ltd, Beijing, China). The assay principle is described online. 13 The complete CDS of FLS was derived from the published data (GenBank: GQ994432.1). 6,7 The primers used were purchased from GENEWIZ, Inc. (Beijing, China) as: FLS-ex1-for:

Data analysis
The deduced amino acid sequences of all exons in FLS from the various varieties were aligned using MEGA 5.1 and ClustalX. 14 The figures were constructed with GraphPad Prism 6.01. Multiple alignments and DNA translation were performed with DNAMAN 6.03. The concentration of flavonoid metabolites was illustrated by a heatmap using HemI 1.0.3. 15 To reveal the relationship between SNPs and the GbFLS activity, a partial least squares regression (PLSR) method was used to calculate the coefficients between 14 SNPs and the relative ratios of dihydrokaempferol/kaempferol, dihydroquercetin/quercetin and dihydromyricetin/myricetin in 30 Ginkgo cultivated varieties (Minitab 17.1.0, Minitab Inc., USA). PLSR finds a linear regression model by projecting the predictor variables and the response variables into a new space. In this model, an SNP locus with one amino acid was set as "0", whereas an SNP locus with two heterozygous amino acids was set as "1". The ratios of dihydrokaempferol/kaempferol, dihydroquercetin/quercetin and dihydromyricetin/myricetin were chosen as the "response".
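The 0/1 coding of SNP loci and the regression step can be illustrated with a small sketch. It is not the Minitab model used in the study: it fits a one-component PLS regression for a single response (PLS1, first NIPALS component) in plain Python, and the genotype calls, response values and function names are all hypothetical.

```python
import math
from typing import List


def encode_locus(call: str) -> int:
    """Return 0 for a single amino acid at the locus, 1 for a heterozygous call such as 'F/L'."""
    residues = {r.strip() for r in call.split("/") if r.strip()}
    return 1 if len(residues) > 1 else 0


def pls1_one_component(X: List[List[float]], y: List[float]) -> List[float]:
    """Regression coefficients of a one-component PLS1 model on mean-centered data."""
    n, m = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in predictor space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        return [0.0] * m  # no covariance between any predictor and the response
    w = [v / norm for v in w]
    # Scores of each sample on the component, then regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]
    q = sum(yc[i] * t[i] for i in range(n)) / sum(v * v for v in t)
    return [wj * q for wj in w]


# Hypothetical calls at two loci for four accessions, and a hypothetical response ratio.
calls = [["F", "V"], ["F/L", "V"], ["F", "F/V"], ["F/L", "F/V"]]
X = [[float(encode_locus(c)) for c in row] for row in calls]
y = [1.0, 2.0, 1.0, 2.0]
coefficients = pls1_one_component(X, y)
```

In this toy data set the response changes only with the first locus, so the first coefficient carries the association and the second stays at zero. A real analysis with 14 predictors and three responses would normally rely on a validated statistics package and cross-validation to choose the number of components.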

Targeted metabolomics proling of Ginkgo avonoid biosynthesis pathway
Based on the scheme of avonoid biosynthesis, a target metabolomics analysis of 36 avonoids was achieved by UHPLC-QQQ MS, covering the most vital positions in the avonoid as well as anthocyanidin and hydroxycinnamate biosynthesis. As we know, many factors will inuence the accuracy of metabolomics. Among them, experimental design, controlled quenching of the metabolism, extraction from plant tissue and subsequent preparation remain the most crucial steps in carrying out a meaningful metabolomics study. 16 In order to standardize the varied techniques and experimental designs for such studies, the metabolomics standards initiative has provided guidelines for metabolomics experiments. 17 However, no analytical method can provide an absolute unbiased result because uncontrollable errors may happen in some necessary procedures. For the experimental design, individual differences must be minimized by a large amount of samples; usually hundreds to thousands of samples are needed. If the number of samples is limited, the outcome could be a false result. The metabolite extraction and sample preparation procedure can also easily result in errors. 18 Untargeted metabolomics usually provides a comprehensive identication of all the metabolites. It is limited in the sense that it is practically impossible to optimize the conditions for each metabolite. These include extraction solvents, solvents for redissolution and MS or NMR parameters. For instance, polar and non-polar compounds may not be extracted from tissues with the same solvent.
Thus, a target metabolomics method was chosen for the avonoid biosynthesis study, and a total run of 32 min was carried out for quantitative analysis all 36 analytes. The method was well validated for each compound. As shown in Table S2, † each calibration curve was linear over the studied concentration ranges with satisfactory correlation coefficients (R 2 > 0.99). The proposed method was sensitive with LOQs ranging from 0.001 to 16 ng mL À1 (Table S2 †). Precision was evaluated using quality control solutions at three concentrations. The RSD values across the various concentrations were less than 13.97% for intra-day precision analysis while the inter-day precisions were less than 14.5%. The repeatability RSD values of all the compounds were less than 8.78%, indicating that the method was reliable and repeatable. The overall recoveries fell within the range of 85.15-113.38% aer spiking high, middle and low amounts of analytes before extraction, with RSDs less than 6.69%.
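For readers less familiar with these validation figures, the two quantities reported above can be computed as follows. This is a generic sketch using the usual definitions (RSD as sample standard deviation over mean, recovery as the found fraction of the spiked amount); the function names and example numbers are illustrative and are not taken from Table S2.

```python
import statistics
from typing import Sequence


def rsd_percent(values: Sequence[float]) -> float:
    """Relative standard deviation (%) of replicate measurements."""
    mean = statistics.mean(values)
    return statistics.stdev(values) / mean * 100.0


def recovery_percent(spiked_found: float, unspiked_found: float, spiked_amount: float) -> float:
    """Recovery (%) of an analyte spiked into the sample before extraction."""
    return (spiked_found - unspiked_found) / spiked_amount * 100.0


# Hypothetical replicate QC measurements (ng/mL) and one hypothetical spiking experiment.
qc_replicates = [9.0, 10.0, 11.0]
qc_rsd = rsd_percent(qc_replicates)
spike_recovery = recovery_percent(spiked_found=15.0, unspiked_found=5.0, spiked_amount=10.0)
```

Intra-day and inter-day precision use the same RSD formula; they differ only in whether the replicates were measured within one day or across several days.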
Our strategy incorporates several considerations to minimize the errors in metabolomics. Firstly, when a pathway is known to be significant for a plant's physiological or commercial value, targeted metabolomics is superior in accuracy to comprehensive metabolomics. In this work, 70% aqueous methanol was used to extract flavonoids of high polarity. MS data were acquired in the negative ion mode because flavonoids deprotonate readily, leading to higher responses in this mode. All the methods were strictly validated to ensure that the quantitation of each metabolite was reliable, stable, repeatable and precise. Secondly, a self-controlled trial can provide a great improvement in credibility and reduce the necessary sample number. In disease or diagnostic metabolomics, a paired experimental design is recommended over an unpaired comparison to reduce the number of samples required. 19 The chromatograms of 36 flavonoids in G. biloba leaves are shown in Fig. 3A. In Fig. 3B, the results showed that flavonol glycosides and biflavonoids were the major compounds in all leaves, while few aglycones and organic acids were present. The individual levels of flavonoids also differed among Ginkgo accessions; for example, TM-2, TC231 and SNFS showed a higher content of flavonol glycosides such as rutin, while YZYX and GL-9 contained more biflavonoids.

Differences in conversion rate of flavonoid biosynthesis among Ginkgo cultivated varieties
In a plant physiology study, environmental influence or genetic change takes considerable time, so a self-controlled trial is difficult to conduct. In this work, we evaluated the conversion rate of flavonoid biosynthesis by calculating the relative ratio of flavonoid/modified flavonoid in many flavonoid biosynthetic reactions, and all the data compared were from the same plant. This is similar to a paired design that minimizes individual errors. Fig. 4 shows the individual levels and relative ratios of the flavonoids involved in the reactions catalyzed by FLS, OMT and FGlcT. It shows that the absolute amount of flavonoids varies greatly among different samples (Fig. 4A-C), so it is difficult to find their internal relationship. However, as each flavonoid metabolite was formed by the catalysis of a certain enzyme, the conversion rate in a reaction may indirectly reflect the activity of the target enzyme. Fig. 4A shows the individual levels of aglycones and their glycosides, and their relative ratios are presented in Fig. 4B. The two results are not always consistent. For example, in Fig. 4A and B, a sample with a higher glycoside level does not always have a higher glycoside/aglycone ratio: accession 9 had a middle level of Kaem-7-G, but its ratio versus kaempferol was the highest, which indicated that this accession may have a high ability to synthesize glycosides. In addition, some reactions, such as those catalyzed by FGlcT (Fig. 4A and B), FLS (Fig. 4C and D) and OMT (Fig. 4E and F), have more than one substrate. The paired analysis indicated that when the ratios for different substrates changed consistently, the activity or expression level of that enzyme may simply have increased or decreased. If the change differed between substrates in a sample, we might predict that the substrate selectivity varied. The phenomenon may be similar to the turnover of neurotransmitters in some human diseases. 20 Fig. 4E and F indicate that the selectivity of the methyltransferase (OMT) for quercetin and myricetin was unchanged in most samples, but the selectivity of FLS and FGlcT differed in many cases. Conventional metabolomics evaluates the compound level based only on the absolute concentration or response. However, because of individual plant differences, a higher response of a metabolite cannot be directly attributed to different enzyme catalytic abilities or further associated with gene variation. A systematic analysis of the conversion rate is necessary to find the relationship between the metabolomics result and gene variation.

Clustering of Ginkgo cultivated varieties based on the flavonoid forming rate
As discussed earlier, the conversion rate may give more reliable information than the single compound level. In order to show the metabolic flux change in flavonoid synthesis, Ginkgo cultivated varieties were clustered based on these 11 monitored reactions (K-means clustering). Three clusters were assumed in this work; the final cluster membership and centers are listed in Tables 2 and 3. The accessions in cluster 1 were characterized by lower levels of flavonoid glycosides, with the ratio of glycoside/aglycone ranging from 1.29 to 5.73. Correspondingly, these accessions showed high levels of aglycone due to lower glycosylation. Accessions in cluster 2 showed moderate flavonoid synthetic ability for most reactions. Cluster 3 contained only one accession; because only trace amounts of aglycone were detected, it was associated with extremely high glycoside/aglycone ratios ranging from 14.47 to 27.77. (The ratio of Isor-3-G/Isor cannot be calculated because Isor was not detected.) Clustering showed a metabolic flux preference for glycoside production and also a lower reverse rate of aglycone/dihydroflavonoid, as in the sample in cluster 3, which indicated a high expression or enzyme activity of FGlcT and FRhaT. This difference probably resulted from gene variations, since the accessions were grown in the same garden. Ginkgo is a highly conserved species, but it has been planted for thousands of years in extremely different environments, and long-term environmental influence may result in some gene variations. 21
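The clustering step can be illustrated with a short K-means sketch (Lloyd's algorithm) in plain Python. It is an illustration only: the study does not state which software performed the clustering, the one-dimensional glycoside/aglycone ratios below are hypothetical, and the initial centers are supplied by hand so that the result is deterministic.

```python
from typing import List, Sequence, Tuple


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: Sequence[Sequence[float]],
    init_centers: Sequence[Sequence[float]],
    max_iter: int = 100,
) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm: alternate assignment and center update until labels stop changing."""
    centers = [list(c) for c in init_centers]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: _sq_dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical glycoside/aglycone ratios for five accessions (one reaction only).
ratios = [(1.5,), (2.0,), (8.0,), (9.0,), (20.0,)]
labels, centers = kmeans(ratios, init_centers=[(1.0,), (8.0,), (20.0,)])
```

With all 11 reaction ratios as coordinates, each accession becomes an 11-dimensional point; scaling the ratios before clustering is a common precaution when their ranges differ widely, although the text does not say whether this was done.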

Single nucleotide polymorphism (SNP) analyses of possible varied gene
We intended to investigate whether these Ginkgo cultivated varieties had inheritable variations in flavonoid accumulation. Since the influence of the environment was minimized by using a common garden trial, we assumed that there may be genetic variations in the corresponding enzymes. Based on this assumption, GbFLS was used as a case study because FLS plays a central role at the branch point of flavonol derivative metabolism. 22,23 The GbFLS gene in different cultivated varieties was sequenced and the coding SNPs (cSNPs) were analyzed. As predicted, 39, 9 and 5 SNPs were found, respectively, in the three exons of GbFLS. Compared with the reported sequences, most of them were consistent in the test samples and turned out to be heterozygous mutations. These results were consistent with the report that FLS is encoded by a multi-copy gene in plants. 24 Since most of these SNPs were heterozygous mutations, the sequence was only partly reported. 25 After alignment, 34, 4 and 2 mutations were proven to be "truly" SNPs in these cultivated varieties. Among these SNPs, 12 and 2 were non-synonymous mutations in exons 1 and 2 of the GbFLS gene, respectively. No SNP in exon 3 caused an amino acid difference.
To define the role of these amino acid substitutions, the GbFLS cDNA sequence was translated into an amino acid sequence and then aligned with Scutellaria baicalensis FLS (SbFLS), Vitis vinifera FLS (VvFLS), Citrus unshiu FLS (CuFLS) and Arabidopsis thaliana FLS (AtFLS) (Fig. 5). GbFLS is highly homologous with the described proteins. The structure of FLS is composed of a β jelly-roll fold surrounded by ten α-helices of different lengths. The iron ion may be ligated by a bidentate group of the cosubstrate 2-oxoglutarate (2OG) and by 3 side-chain residues, H228, D230 and H282, which are conserved in all FLS proteins. Hydrogen bonds may form between the other side of 2OG and 3 conserved residues, Y211, R292 and S294. The substrate might be located near the iron ion, at the possible residues H137, F139, K207, F298 and E300. 26 These substrate-binding and catalytic residues may play a vital role in the function of GbFLS. From Fig. 5 we can see that no SNPs occurred at these positions or at neighboring positions, which indicates that these gene variations might not heavily influence the enzyme function.
Actually, 12 of the 14 SNPs were found in exon 1, which is far from the substrate-binding and catalytic residues of GbFLS in exon 2 and exon 3 (this region is still conserved). However, some of those SNPs located in the α-helices of exon 1 might affect the GbFLS structure and activity, including F62/L62, E71/Q71, P85/S85, F97/V97, N169/K169 and R170/G170, so this indicated the probability of an enhancement or inhibition of enzyme activity in the formation of quercetin, kaempferol and isorhamnetin. The plot of the PLSR coefficients against the predictors (predictors 1-14 represent the 14 SNPs in the order shown in Fig. 5) is shown in Fig. 6. No SNP showed a much higher coefficient than the others, which may be due to the absence of SNPs in the catalytic or substrate-binding residues. Among them, predictors 1, 10, 13 and 14 showed a relatively significant influence on the synthesis of myricetin and kaempferol; these represent the SNPs I12/F12/S12, F97/V97, N169/K169 and R170/G170, respectively. Three of these SNPs are located in regions of the protein that form α-helices, which may have a higher possibility of influencing the structure and function of GbFLS. This may contribute to FLS displaying variable substrate preferences and loose catalytic activities. For instance, in Fig. 6, predictors 1 and 2, which correspond to the mutations from I12 to F12/S12 and from C50 to F50, showed a positive influence on converting dihydrokaempferol to kaempferol and dihydromyricetin to myricetin, but negatively regulated the conversion of dihydroquercetin to quercetin. This mechanism may be similar to a number of mutations of C. unshiu FLS reported in in vitro studies. 27 This FLS had a higher affinity for dihydrokaempferol than for dihydroquercetin. A functional FLS cloned from Z. mays also showed different Km values for converting dihydrokaempferol and dihydroquercetin. 28 The substrate-binding residues of the FLS protein are evolutionarily conserved, including the residues that ligate the iron ion, 2-oxoglutarate and the flavonoid. Mutations occurring at these residues may significantly affect the function of GbFLS. An SNP in these residues could be used as a general marker in Ginkgo breeding for improving nutritional quality and the formation of target flavonoids. 29,30 In addition, technologies such as genome editing can be applied to these potentially functional SNPs as a novel means of precisely altering native genomes. 31,32 Genome-editing technology may be used to specifically edit a single nucleotide, resulting in a change of the evolutionarily conserved GbFLS proteins in Ginkgo cultivated varieties. This could then trigger functional flavonoid biosynthesis without any visible deleterious effect on plant performance.
This journal is © The Royal Society of Chemistry 2017

Conclusions
In summary, compared with untargeted metabolomics, a paired analytical targeted metabolomics study can provide higher accuracy through individual method optimization and self-controlled calibration. Comprehensive analysis of the conversion rates of biochemical synthesis can provide more information on plant physiology, similar to a metabolic flux analysis. In addition, combining a common garden trial with accurate quantitation of a complete pathway panel made it easier to associate gene variation with metabolite levels. In this work, we showed that after long-term environmental and artificial influence, different accessions of G. biloba had a varied ability in flavonoid aglycone synthesis. The proposed strategy can be used as a pre-experiment or a complementary study to genome screening in plant physiology, and it is useful in heritability studies of artificially bred varieties before large-area introduction. Based on the potentially functional SNPs found by this method, genome editing can be performed to precisely control the alteration of artificial varieties for different purposes.

Conflicts of interest
There are no conicts to declare.