Overexpression of a serine hydroxymethyltransferase increases biomass production and reduces recalcitrance in the bioenergy crop Populus

Cite this: Sustainable Energy Fuels, 2019, 3, 195
† Electronic supplementary information (ESI): DOI 10.1039/c8se00471d
BioEnergy Science Center and Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA. E-mail: chenj@ornl.gov; mucherow@ornl.gov; zhangj1@ornl.gov; mili.2010usa@gmail.com; acbryan11@gmail.com; yooc@ornl.gov; jawdys@ornl.gov; gunterle@ornl.gov; englenl@ornl.gov; yangx@ornl.gov; tschaplinstj@ornl.gov; puy1@ornl.gov; ragauskasaj@ornl.gov; tuskanga@ornl.gov; Fax: +865-576-9939; Tel: +865-574-9094
Center for Bioenergy Innovation, Oak Rid… 37831, USA
ArborGen Inc., Ridgeville, SC, USA. E-mail: …gmail.com; cmcolli@arborgen.com
U.S. Department of Energy Joint Genome I… E-mail: vrsingan@lbl.gov; EALindquist@lbl.gov; KW…
HudsonAlpha Institute for Biotechnology, … hudsonalpha.org
Department of Chemical and Biomolecula… Knoxville, TN, USA
Department of Forestry, Wildlife, and Fi… University of Tennessee Institute of Agricultu…


Introduction
Serine hydroxymethyltransferase (SHMT, EC 2.1.2.1) is a pyridoxal phosphate-dependent enzyme that plays an important role in cellular one-carbon (C1) pathways by catalyzing the reversible conversion of L-serine to glycine and, simultaneously, of tetrahydrofolate (THF) to 5,10-methylene THF. 1 Prokaryotes have a single gene encoding SHMT. In animals and fungi, two SHMT isoforms (cytosolic and mitochondrial) are encoded by two distinct genes. 2 The structure of the SHMT monomer is similar across prokaryotes and eukaryotes, whereas its functional form is a dimer in prokaryotes and a tetramer in eukaryotes. 1 The structure of SHMT directly affects its function. For example, in sheep liver cytosolic SHMT (scSHMT), mutation of D227 leads to the formation of inactive dimers, E74 controls the conversion between the "open" and "closed" forms, and K256 plays a crucial role in maintaining the tetrameric structure. 1 In addition, the function of SHMT is affected by posttranslational modification, such as ubiquitination and sumoylation. Anderson et al. (2012) reported that mammalian SHMT1 interacts with an E2 conjugase, Ubc13, which mediates competitive ubiquitination or sumoylation. Ubiquitination is required for SHMT1 nuclear export and increases its stability within the nucleus, whereas sumoylation of SHMT1 is involved in nuclear degradation. 3 Moreover, SHMT plays a role in transcriptional regulation. For example, human cytoplasmic SHMT (hcSHMT) protein can bind mRNA and displays increased affinity for the 5′ untranslated region of its mRNA. 4
To date, SHMT in plants has been reported to function in the photorespiratory pathway and in stress responses. In the photorespiratory pathway, SHMT is associated with the glycine decarboxylase complex (GDC), 5 which catalyzes the decarboxylation of photorespiratory Gly to yield NH3, CO2 and a C1 unit that is transferred to THF. 6 Mutants of mitochondrial SHMT or GDC exhibit symptoms of chlorosis. 7 In Arabidopsis, the circadian clock regulates the expression of genes encoding mitochondrial components of the photorespiratory pathway, including AtSHMT1 and AtSHMT4. 8 Potato plants with antisense-suppressed SHMT have lower photosynthetic capacity and accumulate glycine in the light. 9 In addition, SHMT is involved in biotic and abiotic stress responses. The Arabidopsis shmt1-1 mutant is more susceptible to infection with necrotrophic and biotrophic pathogens, and salicylic acid-induced genes and H2O2 detoxification-related genes are constitutively activated in shmt1-1 plants. Under abiotic stress, shmt1-1 plants exhibit hypersensitivity to salt. 10 SHMT is degraded by the 26S proteasome, but it can be stabilized by a ubiquitin-specific protease, UBP16, through deubiquitination of SHMT1 in Arabidopsis. 11
So far, the function of SHMT in cell wall-related processes has not been reported. Recently, several other enzymes in C1 metabolism have been reported to play important roles in cell wall-related processes. In C1 metabolism, methylenetetrahydrofolate reductase (MTHFR) catalyzes the conversion of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate, while folylpolyglutamate synthase (FPGS) catalyzes the addition of glutamate moieties to folate and folate derivatives. Both MTHFR and FPGS can affect lignin biosynthesis in maize. 12,13 Cystathionine γ-synthase (CGS) catalyzes the formation of cystathionine from cysteine and an activated derivative of homoserine, whereas S-adenosylhomocysteine hydrolase (SAHH) is responsible for the reversible hydration of S-adenosyl-L-homocysteine into adenosine and homocysteine. In switchgrass, down-regulation of CGS enhanced lignin biosynthesis, whereas down-regulation of SAHH reduced lignin biosynthesis and increased cell wall saccharification efficiency. 14
In vascular plants, secondary cell walls (SCWs) are the most abundant renewable plant biomass and are widely used for many applications including energy, textiles, pulping and paper-making. SCWs are composed of lignin, cellulose and hemicelluloses. The deposition of SCWs allows cells to function as mechanical tissues for structural support and protection. 15 Many transcription factors (TFs) have been identified that control SCW biosynthesis in a hierarchical manner. In the first layer of the SCW regulatory network, SND1 and NST1 are master regulators of secondary wall biosynthesis in fibers. 16 The second-layer master switches include two functionally redundant MYB genes, MYB46 and MYB83, which activate the expression of the third-layer TFs and of enzymes that are directly involved in SCW biosynthesis. 17 MYB4 is a transcriptional repressor that negatively regulates C4H and itself in plants. 18 The Arabidopsis myb4 mutant accumulates sinapate esters in leaves. 19 In switchgrass, overexpression of the orthologous PvMYB4 reduced the lignin content and the ester-linked p-CA : FA ratio and increased sugar release. 20 It has been reported that the expression of MYB4 can be induced by glucose. 18 However, whether other metabolites or enzymes in metabolic pathways are involved in the SCW regulatory pathway is largely unknown.
In this study, we characterized an SHMT gene in Populus encoded by the locus Potri.001G320400 (PtSHMT2), which is strongly expressed in the developing xylem during poplar wood formation. We overexpressed this gene in P. deltoides 'WV94' and found that it enhanced plant growth and altered cell wall composition and sugar release in transgenic plants. By analyzing transcriptomic and metabolomic data from PdSHMT2 overexpression lines, we examined the potential molecular mechanisms of action of PdSHMT2 in secondary cell wall biosynthesis and revealed the association of PdSHMT2 with the transcriptional master regulators of secondary cell wall biosynthesis. Collectively, these findings provide new insights into the design of genetic engineering strategies leading to cost-effective conversion of biomass into biofuels.

SHMT family in Populus
To study the function of SHMT in Populus, we first analyzed the whole SHMT gene family in Populus and other plant species. The SHMT genes were queried using BLAST in the Populus trichocarpa genome and in 11 other plant species, including three other woody species (Salix purpurea, Eucalyptus grandis and Vitis vinifera), three annual dicots (Arabidopsis thaliana, Medicago truncatula and Glycine max), four monocots (Oryza sativa, Zea mays, Brachypodium distachyon and Sorghum bicolor), and a moss (Physcomitrella patens). A total of 95 SHMT genes were identified in these 12 species, and the SHMT family in dicots (7-15 members) is larger than that in monocots (4-7 members) (Fig. 1 and Table S1 in the ESI †). Among the nine SHMTs identified in Populus, namely PtSHMT1-9, three paralogous pairs (PtSHMT2/9, PtSHMT3/5 and PtSHMT7/8) were likely generated by a Salicoid genome duplication and rearrangement event (Fig. S1 in the ESI †).
On the basis of expression data from different tissues of P. trichocarpa (Populus Gene Atlas; https://phytozome.jgi.doe.gov), we compared the expression patterns of the nine PtSHMT genes. Notably, only PtSHMT2 and its paralog PtSHMT9 showed high abundance across poplar tissues, and PtSHMT2 was strongly expressed in the stem (Fig. 1B). Phylogenetic (Fig. 1A) and sequence similarity (Fig. S2 and S3 in the ESI †) analyses indicated that PtSHMT2 clusters closely with AtSHMT4 (At4g13930). Pairwise correlation of the nine PtSHMT expression patterns indicated that two PtSHMT paralogous pairs (PtSHMT2/9 and PtSHMT7/8) have higher positive correlation coefficients (r = 0.72 and 0.95, respectively) than the pair PtSHMT3/5 (r = 0.66) (Fig. 1D). In contrast, the promoter similarity of the pair PtSHMT3/5 is higher than that of the other two pairs (Fig. S4 in the ESI †). This divergence between expression and promoter sequence implies that the expression of PtSHMT paralogous pairs may be controlled by different mechanisms. Furthermore, the expression patterns of the nine PtSHMTs, obtained from the AspWood gene expression database, were compared in nm-scale stem tissues of P. tremula. 21 Among the nine PtSHMTs, PtSHMT2 has the highest abundance and was induced in expanding xylem (Fig. 1C), implying that it may play an important role in SCW-related processes.
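The pairwise comparison of paralog expression patterns can be illustrated with a correlation between two tissue-expression vectors. The following is a minimal sketch, assuming a Pearson coefficient (the text reports r but does not name the method) and assuming each gene is represented by normalized expression values listed in the same tissue order. The function name and the toy values are illustrative and are not taken from the Populus Gene Atlas.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares about the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical expression values for two paralogs across five tissues.
gene_a = [10.0, 20.0, 30.0, 40.0, 50.0]
gene_b = [12.0, 18.0, 33.0, 41.0, 48.0]
print(round(pearson_r(gene_a, gene_b), 3))
```

Applying such a function to every pair of the nine genes would give a correlation matrix of the kind summarized in Fig. 1D.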
To explore the possible regulatory mechanism of PtSHMT2 expression, we analyzed the cis-acting elements of the promoter region (3000 bp upstream of the translation initiation site). Based on the functional annotation, the identified cis-acting elements were classified into four groups: development, hormone, stress and others. Notably, seven circadian elements (involved in circadian control), three ABREs (involved in ABA responsiveness), two CAT-box elements (related to meristem expression) and two MYB binding sites (MBS and MRE) were identified in the promoter of PtSHMT2 (Fig. S5 in the ESI †). In Arabidopsis, the shmt1-1 mutant is defective in mitochondrial SHMT activity and displays a lethal photorespiratory phenotype when grown at the ambient CO2 level. 22 The expression of genes of the photorespiratory pathway has been shown to be regulated by the circadian clock. Arabidopsis SHMT1 and SHMT4 exhibit circadian oscillation in mRNA abundance. 8 This is consistent with the high abundance of circadian elements in the promoter of PtSHMT2.
We then constructed a co-expression network of PtSHMT2 based on the global gene expression patterns across various tissues and under different stresses in poplar. According to the functional classification, a number of cell wall-related genes were identified in the co-expression network (Fig. S6 and S7 in the ESI †). These include several key transcription factors (TFs) such as NST1, SND3, MYB46, MYB103, NAC073, NAC075, KNAT7 and WOX14, and important enzymes such as LAC10, LAC17, PRX44, PRX53, MAN6, IRX2, XTHs and UGT80A2, which are known regulators of SCW biosynthesis. 15,23 These results support a role for PtSHMT2 in cell wall-related processes.

Overexpression of PdSHMT2 increases plant growth
To directly examine the potential function of SHMT2 in Populus, we cloned the full-length open reading frame of SHMT2 from P. deltoides 'WV94' (PdSHMT2) and overexpressed it in 'WV94' using the UBIQUITIN3 constitutive promoter. Eight independent transgenic lines were generated, and two independent lines (#1 and #2) with high expression of PdSHMT2 were selected for further analysis (Fig. S8 in the ESI †). Compared to the control plants, the two transgenic lines showed increases in aboveground biomass: diameter increased by ~25%, height by ~46%, leaf area by 64% and diameter² × height (D²H) by ~133% (Fig. 2A-E). Increasing the yield of bioenergy crops is one of the major goals of bioenergy engineering. 24 These results demonstrate that overexpression of PdSHMT2 can effectively increase feedstock yield in Populus.

Cell wall chemical composition and sugar release
Cell wall composition is directly associated with the efficiency of biomass conversion to biofuels. To explore whether cell wall composition is affected by PdSHMT2, we investigated the carbohydrate composition of stem tissues by ion chromatography after a two-step sulfuric acid hydrolysis procedure. Compared to control plants, the two transgenic lines (#1 and #2) had significantly higher glucose content (7.37% and 8.72% increase, respectively) and lower lignin content (6.05% and 7.57% decrease, respectively), whereas there were no significant changes in the contents of arabinose, galactose, xylose or mannose (Fig. 2F). The reduced lignin content in the transgenic lines was mainly due to a reduction in acid-insoluble lignin (AIL) (Fig. 2F).
To assess the sugar release performance of the PdSHMT2 overexpression lines, glucose and xylose release during enzymatic hydrolysis was monitored. At 6 h of hydrolysis, the two transgenic lines already showed higher glucose release than the control plants. At the final time point of 72 h, glucose release had increased by 9.38% and 12.4% in #1 and #2, respectively, compared to the control (Fig. 2G). More xylose release was detected in #1 at 6 h, and it increased by 31.4% at 12 h (Fig. 2H). The total sugar release in the two transgenic lines was consistently higher than in the control plants from 6 h onward and had increased by 9.4% and 11.4% at 72 h (Fig. 2I) for lines #1 and #2, respectively.
Structural information on lignin, such as the lignin S/G ratio, the content of lignin subunits and lignin interunit linkages, was analyzed by nuclear magnetic resonance (NMR) to examine the biomass characteristics. As shown in Fig. 3, lignin syringyl (S) and guaiacyl (G) subunits, p-hydroxybenzoate (PB), and the dominant lignin interunit linkages β-aryl ether (β-O-4′), phenylcoumaran (β-5′), and resinols (β-β′) were identified in the aromatic and aliphatic regions of PtSHMT2 overexpression and control lines. The lignin S/G ratio is viewed as one of the factors affecting recalcitrance. 25,26 The HSQC analysis revealed that the two PdSHMT2 overexpression lines had a lignin S/G ratio similar to that of the control plants (Fig. 3), suggesting that the higher sugar release observed in the transgenic lines without pretreatment in this study was not due to the lignin S/G ratio (Fig. 2). In the two PdSHMT2 overexpression lines, the β-5′ content was relatively higher and the β-β′ content relatively lower than in the control plants (Fig. 3). These small variations in the lignin 2D NMR spectra suggest that the lignin structures in the transgenic lines are similar to those in the control plants and that overexpression of PdSHMT2 in Populus had little effect on the lignin compositional units and side-chain linkages. Sugar release has been related to both the lignin content and the lignin S/G ratio as well as to cell wall structure-related factors. 27 Collectively, the reduced lignin content (and increased glucose content), rather than lignin structural variations, likely contributed to the enhanced sugar release observed in the PdSHMT2 transgenic plants.

Overexpression of PdSHMT2 leads to changes in the transcriptome
To further reveal the molecular mechanism of SHMT2 involvement in poplar cell wall modification, we performed RNA-Seq using mature leaves of two PdSHMT2 overexpression lines and control plants. Compared to control plants, a total of 483 (230 up and 253 down) and 1159 (467 up and 692 down) differentially expressed genes (DEGs) were identified in line #1 and line #2, respectively (Table S2 in the ESI †). To explore the potential function of DEGs affected by SHMT2, MapMan analysis was conducted to classify the DEGs combined from the two transgenic lines (Fig. S9 in the ESI †). According to the MapMan classification system, a total of 23 up- and 20 down-DEGs belong to cell wall-related processes, 3 up- and 20 down-DEGs belong to secondary metabolism, 7 up- and 13 down-DEGs are involved in development, and 13 up- and 18 down-DEGs play roles in signaling (Fig. 4A). In the metabolic pathway, cell wall component biosynthetic genes (cellulose, hemicellulose and lignin biosynthesis) were down-regulated, whereas degradation-related genes (cellulases and beta-1,4-glucanases, mannan-xylose-arabinose-fucose, pectate lyases and polygalacturonases) were up-regulated. In addition, cell wall precursor synthesis-related genes, lignin biosynthesis-related genes, simple phenol-related genes, and raffinose-related genes were down-regulated (Fig. 4B). We further classified the genes on the basis of their descriptions and the functional descriptions of their Arabidopsis orthologs. In total, 132 up- and 163 down-DEGs related to the cell wall, secondary metabolism, cell cycle or cell growth, development, hormones, proteolysis, TFs, etc. were selected as "core-DEGs" for further analysis (Fig. 4C and Table S3 in the ESI †).
In the two transgenic lines, eight genes encoding glycosyl hydrolases (GHs) were significantly up-regulated, whereas nine lignin biosynthesis-related genes (one C3H, two C4H, two F5H, one MAX1, one CCoAMT and two CCoAOMT) and eight cellulose biosynthesis-related genes (CesA4, CesA7, CesA8, CSLG2 and four RIC4 homologs) were down-regulated. Moreover, six cell cycle-related genes (CYCA2;1, two CYCD3;1, CYCD3;2, CYCD6;1 and CNGC15) and eight cell growth-related genes (DWARF1, GRF7, PRX33 and five EXPA members) were up-regulated in the PdSHMT2 overexpression lines (Fig. 4C). GHs comprise a large family of enzymes with a broad range of structures and substrate specificities; they are ubiquitous in plants and play essential roles in various biological processes. 28 GHs are known to be involved in the degradation of biomass components such as cellulose, hemicellulose, and starch. 29 For instance, KOR1, a GH9 family endo-1,4-β-glucanase, can degrade β-glucan with an unbranched β-(1→4)-linked backbone, which is the basic structure of cellulose. 30 The reduced cell wall recalcitrance in PdSHMT2 overexpression lines might be partially explained by the induced GH expression. The decreased lignin content and accelerated growth rates in PdSHMT2 overexpression lines are consistent with the down-regulated lignin biosynthesis genes and up-regulated cell cycle-related genes, respectively.
To identify potential hub TFs that play key roles in controlling gene expression in the PdSHMT2 overexpression lines, we analyzed the TF binding sites (TFBSs) in the promoter regions of the core-DEGs. Based on TFBS enrichment in the promoters of the core-DEGs, a total of 27 TFs were identified as hub TFs, including four NST1 homologs (NST1a, NST1b, NST1c, and NST1d), two CIB1 homologs (CIB1a and CIB1b), one MYB4, and one TT8 (Fig. 4D and E).
To validate the RNA-Seq results, we selected eight hub TFs (MYB4, CIB1b, WRKY6, NST1a, NST1b, NST1c, NST1d and TT8) and seven TFs (MYB46, MYB83, MYB63, MYB7, NAC056, GT2 and HBI1) from the DEG list for expression analysis using qRT-PCR. Most of these 15 selected TFs are known to be involved in the transcriptional regulation of secondary cell wall biosynthesis. NST1 is a primary regulator of SCW formation, 31 which directly regulates the SCW biosynthetic master switches MYB46 and MYB83. 23 MYB63 is a direct target of MYB46/83 and directly regulates lignin biosynthesis. 32 In contrast, MYB4 is a repressor that can bind to the C4H promoter and to its own promoter. 33 As shown in Fig. 5, the expression of activator genes such as the NST1s, MYB46, MYB83, and MYB63 was repressed, whereas the expression of the repressor gene MYB4 was enhanced in PdSHMT2 overexpression lines. CIB1 is a cryptochrome-interacting basic helix-loop-helix type TF that interacts with CRY2 in a blue light-specific manner in Arabidopsis. 34 The up-regulation of CIB1 in PdSHMT2 overexpression lines is consistent with the potential involvement of PdSHMT2 in circadian-related processes. Taken together, these results suggest that overexpression of PdSHMT2 leads to changes in the expression of these TFs, which in turn regulate the expression of cell wall structural genes, resulting in altered cell wall chemistry.

Overexpression of PdSHMT2 leads to changes in metabolite profiling
To investigate the impact of PdSHMT2 overexpression at the metabolic level, we performed metabolomic analysis using the PdSHMT2 transgenic lines. Compared to the control plants, a total of 33 up-regulated and 17 down-regulated metabolites were identified in the two transgenic lines (Fig. 6A). Based on systematic pathway analysis of the metabolome, the most relevant pathway for these 50 differentially accumulated metabolites is "Galactose metabolism" (Fig. 6B). In the galactose metabolic pathway, the content of six metabolites was significantly changed in the two PdSHMT2 overexpression lines. Sucrose, glucose, galactose and fructose contents were increased, whereas galactinol and raffinose contents were decreased in the PdSHMT2 overexpression lines (Fig. 6C). These results suggest that PdSHMT2 mainly affects the galactose metabolic pathway at the metabolic level. We then combined the transcriptomic and metabolomic results to explain the metabolite changes in the galactose metabolic pathway. As shown in Fig. 6D, the decrease in galactinol content was caused by the down-regulation of three galactinol synthase genes, which directly catalyze the biosynthesis of galactinol. Reduced galactinol content restricted the conversion of sucrose to raffinose, so sucrose accumulated and raffinose decreased. SUS4 controls the reversible catalysis between UDP-Glc and sucrose, and the Arabidopsis sus4 mutant has high leaf sucrose content in the dark and high root sucrose content under both light and dark conditions. 35 With the increase in sucrose content, its downstream products fructose and glucose also accumulated. It has been reported that sugars such as sucrose and glucose can function as signaling molecules to trigger gene expression in plants. In Arabidopsis, 82 TFs from 22 families, including bHLH, MYB and AP2, are responsive to the glucose signal with greater than three-fold changes. 36 The expression of MYB4 can be activated by 3% glucose after dark adaptation. 37 The up-regulation of MYB4 in PdSHMT2 overexpression lines is consistent with the increased glucose content. Taken together, the results from the transcriptomic and metabolomic analyses support a role for PdSHMT2 in cell wall-related processes.

Conclusions
We describe an alternative strategy for increasing biomass production and reducing recalcitrance in the bioenergy crop Populus by manipulating the expression of a serine hydroxymethyltransferase. On the basis of our findings, we propose a regulatory model in which PdSHMT2 is associated with the repression of activators, including the first-layer master switches (NST1/2/3) and the second-layer master switches (MYB46/83), in the regulatory network of SCW biosynthesis. In addition, PdSHMT2 affects metabolism to enhance the accumulation of sucrose and glucose, which triggers the expression of the repressor MYB4 to inhibit the expression of lignin biosynthetic genes (Fig. 7). A field trial on agronomic performance demonstrated that cell wall modifications via PdSHMT2 overexpression did not compromise first-year performance under field conditions. 38 Collectively, genetically engineered poplar plants with increased glucose content, reduced lignin content and increased sugar release resulting from PdSHMT2 overexpression have the potential to serve as a promising biomass feedstock for efficient conversion into biofuels.

Bioinformatics analysis
To identify Populus SHMT proteins, the full-length amino acid sequences of Arabidopsis SHMT proteins were subjected to a BLASTp search against the Populus trichocarpa genome from Phytozome (https://phytozome.jgi.doe.gov). In addition, SHMT proteins were searched in 11 other plant species for phylogenetic analysis, including three woody species (Salix purpurea, Eucalyptus grandis and Vitis vinifera), three dicots (Arabidopsis thaliana, Medicago truncatula and Glycine max), four monocots (Oryza sativa, Zea mays, Brachypodium distachyon and Sorghum bicolor), and a moss (Physcomitrella patens). The SHMT candidates were searched against the Pfam database to validate the presence of the SHMT motif (PF00464). Multiple sequence alignment of the full-length amino acid sequences was performed using Clustal X2. 39 The phylogenetic trees were constructed using the neighbor-joining method in the MEGA package V5.1 40 with bootstrap values from 1000 replicates indicated at each node. Three-dimensional structure prediction was performed with the I-TASSER (iterative threading assembly refinement, v4.4) toolkit. 41 Normalized expression values of the PtSHMT genes from various tissues and organs were obtained from the Populus Gene Atlas database (https://phytozome.jgi.doe.gov/phytomine/aspect.do?name=Expression). The nm-scale expression of PtSHMTs was obtained from the AspWood gene expression database. 21 The co-expression network of PtSHMT2 was created using data obtained from the co-expressed biological process database for poplar. 42 Cytoscape 43 was used to visualize the resulting network.

Generation of transgenic poplar
The full-length open reading frame of PtSHMT2 was amplified from Populus deltoides genotype WV94 and cloned into the pAGW560 binary vector, in which expression was driven by the Arabidopsis UBIQUITIN3 promoter. Agrobacterium-mediated transformation into WV94 was conducted at ArborGen Inc. (Ridgeville, SC) as described previously. 44 The plants were transferred to and grown in greenhouses at Oak Ridge National Laboratory (Oak Ridge, TN) at a constant temperature of 25 °C with a 16/8 h light/dark photoperiod. To estimate stem cylinder volume, the plant height and stem base diameter of six-month-old plants were measured. For plant height, we measured the primary stem length from the stem base (6 cm above the soil surface) to the shoot apex; the diameter of the stem base was measured using calipers.
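The stem volume index used in the growth comparison, diameter² × height (D²H), follows directly from these two measurements. The sketch below is illustrative only: the function names and the unit-sized control plant are assumptions, and the 25% and 46% increments are the approximate line-level increases reported in the results, not individual plant data.

```python
def d2h(diameter, height):
    """Stem volume index: squared stem-base diameter multiplied by plant height."""
    return diameter ** 2 * height

def percent_increase(control, transgenic):
    """Relative change of a transgenic value over the control value, in percent."""
    return (transgenic - control) / control * 100.0

# Hypothetical control plant in arbitrary units, and a plant whose diameter is
# 25% larger and whose height is 46% greater than the control.
control_index = d2h(1.0, 1.0)
transgenic_index = d2h(1.25, 1.46)
print(percent_increase(control_index, transgenic_index))
```

With these inputs the index rises by roughly 128%, which is in the same range as the ~133% increase reported for D²H. The two figures need not coincide, because the reported values are averages over plants and the average of a product generally differs from the product of averages.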

Chemical composition analysis
Poplar samples were reduced to 40 mesh using a Wiley mill (Thomas Scientific, Swedesboro, NJ) and Soxhlet-extracted with ethanol/toluene (1 : 2, v/v) for 24 h. The extractives-free sample was analyzed by a two-step sulfuric acid (H2SO4) hydrolysis method according to published literature. 45 In the first step, the extractives-free sample was hydrolyzed with 72% (w/w) H2SO4 at 30 °C for 1 h. In the second step, the hydrolyzed sample was diluted to a final concentration of 4% (w/w) H2SO4, followed by autoclaving at 121 °C for 1 h. The hydrolysate was filtered from the solid residue. The filtered liquid fraction was examined using a Dionex ICS-3000 ion chromatography system (Thermo Fisher Scientific, Sunnyvale, CA) to quantify the sugar content. The total lignin content was quantified as the sum of acid-soluble and acid-insoluble lignin, determined from the hydrolysate and the solid residue, respectively. Acid-soluble lignin was measured in the liquid fraction at a wavelength of 240 nm using UV/Vis spectroscopy. Acid-insoluble lignin was quantified from the filtered solid residue as described in the NREL procedure. All analyses were performed in technical duplicate for the reported values and statistical analysis.
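The dilution from 72% to 4% (w/w) H2SO4 in the second hydrolysis step can be checked with a simple mass balance. The following is a rough sketch, assuming that only the acid solution and the added water contribute to the mass (the biomass itself is ignored); the function name and the 1 g example are illustrative and are not part of the cited procedure.

```python
def water_to_add(acid_solution_mass_g, start_pct=72.0, target_pct=4.0):
    """Mass of water (g) that dilutes an acid solution from start_pct to target_pct (w/w)."""
    if not 0 < target_pct <= start_pct:
        raise ValueError("target must be positive and no greater than the start concentration")
    # The mass of pure acid is conserved during dilution.
    acid_mass = acid_solution_mass_g * start_pct / 100.0
    final_solution_mass = acid_mass / (target_pct / 100.0)
    return final_solution_mass - acid_solution_mass_g

# Water needed per gram of 72% (w/w) acid solution.
print(water_to_add(1.0))
```

Under this assumption, each gram of 72% acid solution requires about 17 g of water to reach 4%.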

Sugar release test through enzymatic hydrolysis
The released glucose and xylose were measured according to published literature. 46 Dried and Wiley-milled (40 mesh) stems of the Populus control and transgenic plants were used for sugar release measurement. The dried sample (250 mg) was loaded into 50 mM citrate buffer solution (pH 4.8) with Novozymes CTec2 (gift product with 70 mg protein per gram of biomass from Novozymes, Franklinton, NC). The enzymatic hydrolysis was carried out at 50 °C and 200 rpm in an incubator shaker. Liquid hydrolysate was collected at 0, 3, 6, 12, 24, 48, and 72 h, and enzymes in the hydrolysate were deactivated in boiling water for 10 min before carbohydrate analysis. The released sugars in each hydrolysate were measured using a Dionex ICS-3000 ion chromatography system. Each analysis was conducted in duplicate.

Heteronuclear single quantum coherence (HSQC) NMR analysis
NMR spectra were recorded at 30 °C on a Bruker Avance III HD 500 spectrometer, and spectral processing was carried out using Bruker Topspin 3.5 (Mac) software. The whole cell wall NMR samples were prepared according to the literature. 47 Poplar stem powders (40 mesh) were extracted with an ethanol/toluene mixture (1 : 2, v/v) for 24 h. The extractives-free samples (air-dried) were loaded into a 50 mL ZrO2 grinding jar (including 10 × 10 ball bearings) in a Retsch Ball Mill PM 100. The biomass was then ball milled at 600 rpm in 5 min intervals with 5 min pauses in between for a total of 2 h. The ball-milled whole cell wall sample (~60 mg) and an NMR solvent mixture of 4/1 (v/v) DMSO-d6/HMPA-d18 (~0.4 mL) were loaded into a 5 mm NMR tube. The biomass and NMR solvents were mixed by vortexing to form uniform gel-state samples, which were then placed in a sonicator for 1-2 h before the analysis. The 2D HSQC NMR experiments were carried out with a standard Bruker pulse sequence (hsqcetgpspsi2.2) with the following acquisition parameters: a spectral width of 12 ppm in the F2 (1H) dimension with 1024 data points (acquisition time 85.2 ms) and 166 ppm in the F1 (13C) dimension with 256 increments (acquisition time 6.1 ms), a 1.0 s delay, a 1J(C-H) of 145 Hz, and 128 scans. The relative abundances of lignin compositional subunits and interunit linkages were estimated using volume integration of contours in the HSQC spectra. 47,48 For monolignol measurements of S, G, H, and p-hydroxybenzoate (PB), the S2/6, G2, H2/6, and PB2/6 contours were used for relative quantitation. The contours of the Cα signals were integrated for interunit linkage estimation.

RNA sequencing and data analysis
For RNA-Seq, mature leaves from PtSHMT2 overexpression and control poplars were collected for RNA extraction. For each line, two biological replicates were used for library construction and sequencing. A total of 4 μg of total RNA was used to generate a cDNA library for Illumina sequencing. The RNA-Seq library was generated by a standard method as described in the Illumina sequencing sample preparation protocol (San Diego, CA, USA). After filtering out low-quality reads, RNA-Seq reads were aligned to the P. trichocarpa genome (version 3.0, https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Ptrichocarpa) using TopHat2. 49 Differentially expressed genes (DEGs) were identified using the R package DESeq2. 50 Raw P-values were adjusted for multiple comparisons using the q-value (false discovery rate) method. 51 The cutoff for significant DEGs was set as an absolute fold change (FC) > 2 and a q-value < 0.05. MapMan 52 and Gene Ontology (GO) were used for functional classification of DEGs. GO enrichment analysis was performed using agriGO. 53 For transcription factor binding site (TFBS) prediction, a 2 kb sequence upstream of the translation start site of the identified DEGs was analyzed using PlantPAN. 54
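The DEG cutoff described above can be expressed as a small filter. This is a minimal sketch under stated assumptions: the absolute fold change criterion is read symmetrically (a linear ratio above 2 or below 0.5), the q-values are taken as already computed, and the function names and records are hypothetical placeholders rather than output of the actual pipeline.

```python
import math

def is_deg(fold_change, q_value, fc_cutoff=2.0, q_cutoff=0.05):
    """True when a gene passes both cutoffs: absolute fold change > 2 and q < 0.05.

    fold_change is the linear ratio transgenic/control and must be positive.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    # Working on the log2 scale makes up- and down-regulation symmetric.
    return abs(math.log2(fold_change)) > math.log2(fc_cutoff) and q_value < q_cutoff

def split_degs(results):
    """Split (gene, fold_change, q_value) records into up- and down-regulated gene lists."""
    up, down = [], []
    for gene, fold_change, q_value in results:
        if is_deg(fold_change, q_value):
            (up if fold_change > 1 else down).append(gene)
    return up, down

# Hypothetical records; gene names and values are placeholders.
records = [
    ("geneA", 4.0, 0.01),   # passes, up-regulated
    ("geneB", 0.2, 0.001),  # passes, down-regulated
    ("geneC", 1.5, 0.001),  # fold change too small
    ("geneD", 8.0, 0.2),    # q-value too large
]
print(split_degs(records))
```

Counting the two returned lists per line gives up/down tallies of the kind reported in the results.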

Metabolomics analysis
For metabolite profiling, 25 mg of leaf tissue from each transgenic line and the controls, lyophilized and ground with a Wiley mill, was extracted twice with 2.5 mL of 80% ethanol overnight; the extracts were then combined, and a 0.50 mL aliquot was dried in a nitrogen stream. As an internal standard, 75 μL of sorbitol at 1.0 mg mL⁻¹ was added to the first extract. Dried extracts were dissolved in acetonitrile, followed by TMS derivatization, and analyzed by GC-MS according to Li et al. 55 Metabolite peaks were extracted using a characteristic mass-to-charge (m/z) ratio and quantified by area integration, and the concentrations were normalized to the quantity of the internal standard (sorbitol) recovered and the amount of sample extracted, derivatized and injected. A large user-defined database of mass spectral electron impact ionization fragmentation patterns of TMS-derivatized compounds (~2300 signatures) was used to identify the metabolites of interest. Unidentified metabolites were represented by their retention time and key m/z ratios. The metabolite data were presented as fold changes of the transgenic line vs. the average of the control lines from three biological replicates. Student's t-tests were used to determine whether differences were statistically significant (P ≤ 0.05).
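The normalization and fold-change steps can be sketched as follows. This is a simplified illustration, assuming that normalization amounts to dividing each peak area by the recovered internal-standard area and by the extracted sample mass; the aliquot and injection volumes mentioned above are treated as constant across samples and omitted. Function names and numbers are hypothetical.

```python
from statistics import mean

def normalized_abundance(peak_area, istd_area, sample_mass_mg):
    """Peak area scaled by the recovered internal standard and by the extracted sample mass."""
    if istd_area <= 0 or sample_mass_mg <= 0:
        raise ValueError("internal standard area and sample mass must be positive")
    return peak_area / istd_area / sample_mass_mg

def fold_change(transgenic_values, control_values):
    """Ratio of the transgenic mean to the mean of the control replicates."""
    return mean(transgenic_values) / mean(control_values)

# Hypothetical (peak area, sorbitol area) pairs for three replicates of 25 mg each.
transgenic = [normalized_abundance(a, s, 25.0) for a, s in [(900.0, 100.0), (1100.0, 110.0), (950.0, 95.0)]]
control = [normalized_abundance(a, s, 25.0) for a, s in [(500.0, 100.0), (520.0, 104.0), (480.0, 96.0)]]
print(fold_change(transgenic, control))
```

A fold change above 1 indicates accumulation in the transgenic line; significance would still need to be assessed with the t-test described in the text.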

qRT-PCR analysis
One μg of total RNA was used to generate cDNA using the RevertAid reverse transcriptase following the manufacturer's instructions (Thermo Fisher Scientific, Hudson, NH). Gene-specific primers were designed using Primer3 software (http://frodo.wi.mit.edu/primer3/input.htm) with an annealing temperature of 58-60 °C and an amplicon of 150-250 bp. qRT-PCR was performed using a Maxima SYBR Green/ROX qPCR master mix (Thermo Fisher Scientific) according to the manufacturer's instructions. The relative gene expression was calculated by the 2^(−ΔΔCt) method 56 using PtUBQ10b as an internal control. All experiments were performed using three biological replicates and three technical replicates. The primers used in this study are listed in Table S4 in the ESI. †
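The relative expression calculation can be written out explicitly. The sketch below assumes the usual form of the method, in which the target Ct is first normalized to the internal control within each sample and then compared with the control plant, with amplification efficiency taken as 100%. The function name and Ct values are illustrative.

```python
def relative_expression(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^(-ddCt) method.

    ct_ref_* are the Ct values of the internal control gene in the same reactions.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample      # dCt in the transgenic sample
    delta_ct_control = ct_target_control - ct_ref_control   # dCt in the control plant
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values: the target crosses the threshold two cycles earlier
# in the transgenic sample, while the internal control is unchanged.
print(relative_expression(22.0, 18.0, 24.0, 18.0))
```

A result above 1 indicates higher expression than in the control, and a result below 1 indicates repression.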

Statistical analysis
Statistical significance was determined by Student's t-tests of paired samples. Asterisks in each figure indicate a significant difference compared to control samples (P < 0.05).

Authors' contributions
JZ performed experiments, analyzed data and wrote the manuscript. ML, CGY, YP and AJR performed chemical compositional analysis, lignin NMR analysis, and sugar release determination. WR, KAW and CMC generated Populus transgenic lines. VS, EAL, KB and JS generated and analyzed RNA-Seq data. ACB, SSJ and LEG measured biomass production. NLE and TJT generated and analyzed metabolomics data. XY designed the construct for Populus transformation. GAT, WM and JGC conceived the study, coordinated research and contributed to experimental design and data interpretation. All authors read and approved the final manuscript.

Conflicts of interest
There are no conflicts of interest.
published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).