Sterol 3β-glucosyltransferase biocatalysts with a range of selectivities, including selectivity for testosterone

The main objectives of this work were to characterise a range of purified recombinant sterol 3β-glucosyltransferases and to show that rational sampling of the diversity that exists within sterol 3β-glucosyltransferase sequence space can result in a range of enzyme selectivities. In our study the catalytically active domain of the Saccharomyces cerevisiae 3β-glucosyltransferase was used to mine putative sterol 3β-glucosyltransferases from the databases. Selected diverse sequences were expressed in and purified from Escherichia coli and shown to have different selectivities for the 3β-hydroxysteroids ergosterol and cholesterol. Surprisingly, three enzymes were also selective for testosterone, a 17β-hydroxysteroid. This study therefore reports for the first time sterol 3β-glucosyltransferases with selectivity for both 3β- and 17β-hydroxysteroids and is also the first report of recombinant 3β-glucosyltransferases with selectivity for steroids with a hydroxyl group at positions other than C-3. These enzymes could therefore find utility in the pharmaceutical industry for the green synthesis of a range of glycosylated compounds of medicinal interest.


Introduction
Many lead compounds under development by the pharmaceutical industry are poorly water soluble and hence have non-ideal pharmacokinetic properties. One strategy to overcome this problem is to link the drug substance to a water soluble moiety such as a carbohydrate, as the drug can be cleaved from this adduct by normal metabolic processes, ideally at the desired site of action. For example, the oral pro-drug testosterone-glucoside is currently under investigation by ProStrakan as an alternative to testosterone for patients suffering from hypogonadism. 1 Glycosidic derivatives are also of interest because they are commonly encountered in the in vivo metabolism of xenobiotic compounds as more water soluble products that are more suitable for excretion. Consequently, synthesis of appropriate carbohydrate derivatives might be employed either to provide more effective medicinal compounds or as a means to investigate drug pharmacokinetics through the absorption, distribution, metabolism and excretion (ADME) of a drug substance. 2 Chemical methods of glycosylation are challenging and usually require a laborious protection-deprotection strategy, which drives down the overall chemical yield, especially with complex aglycones. 3,4 In addition, this approach proves industrially unattractive due to the formation of intractable mixtures of side-products. 3,4 In contrast, the chemo- and regioselective glycosylation of hydroxyl functionality is a common biotransformation and several glycosyltransferase (GTase) biocatalysts that perform efficient and selective conversions have been characterised. 5 The majority of characterised GTases catalyze the transfer of monosaccharide units from an activated nucleotide donor, such as a uridine diphosphate sugar, to specific acceptor molecules, forming a glycosidic bond. 6
Glucosylation of steroids with a hydroxyl group at the C-3 position is a common biotransformation catalyzed by GTases known as sterol 3β-glucosyltransferases (systematic name, uridine diphosphate-glucose:sterol 3-O-β-D-glucosyltransferases; abbreviation, UDPG-SGTases; EC number, 2.4.1.173). UDPG-SGTases are members of GTase family 1 (GT1) of the Carbohydrate Active Enzymes database, which classifies enzymes that degrade, modify, or create glycosidic bonds into distinct protein sequence homology-based families. 7 Characterized examples of UDPG-SGTases include UGT51 from Saccharomyces cerevisiae and UGT51B1 from Pichia pastoris. 8-11 Warnecke et al. 8 also showed, by successively deleting the N-terminus of UGT51, followed by in vitro analysis, that only the GT1 GT-like conserved domain is responsible for UDPG-SGTase activity.
There are several reports of characterised UDPG-SGTases, but many of these characterisations were performed with crude homogenate, purified membrane or partially purified enzyme preparations that may have contained mixtures of UDPG-SGTases. 12 Unfortunately, there are only a few reports of characterised recombinant UDPG-SGTases. 12 Our work therefore aimed to characterise a range of recombinant UDPG-SGTase catalytic domains, identified via sequence similarity to that of UGT51. Additionally, we aimed to show that rational sampling of the diversity that exists within sequence space, i.e. the number of possible amino acid sequences that can generate a particular enzyme activity, results in a range of enzyme selectivities. We show that a range of selectivities, including specificity for testosterone, a 17β-hydroxysteroid, was present among the six recombinant UDPG-SGTases that were characterised in this study.

Identification of putative GT1 GT-like domains
The N-terminally truncated catalytic domains of several UDPG-SGTases, comprising only the GT1 GT-like domain, were therefore expressed in E. coli. Candidate sequences were identified by conducting a BLASTP search using the amino acid sequence of the shortest C-terminal domain of the S. cerevisiae UGT51 that encoded a catalytically active UDPG-SGTase. 8 The BLASTP search revealed similarities with the previously characterised UDPG-SGTase, UGT51B1 from P. pastoris, and several uncharacterised protein products. Fig. 1 shows the ClustalW2 (version 2.1) multiple sequence alignment of the deduced N-terminally truncated domains of the sequences selected from the BLASTP search. The catalytic UDPG-SGTase domain of the S. cerevisiae UGT51, termed UGT51′, comprises a polypeptide of 475 amino acids. 8,11 This domain has 61-64% identity to the C-terminal domains of the selected sequences: UGT51B1 of P. pastoris (termed UGT51B1′), A7KAK6 of P. angusta (termed A7KAK6′), A5DNB9 of M. guilliermondii (termed A5DNB9′), Q6BN88 of D. hansenii (termed Q6BN88′) and Q6CUV2 of K. lactis (termed Q6CUV2′). The selected proteins show a good level of sequence space diversity, i.e. 58-66% identity amongst themselves. Gene fragments corresponding to deletions of 723, 747, 770, 1065, 1030 and 718 codons from the 5′ end of the sequences encoding UGT51, UGT51B1, A7KAK6, A5DNB9, Q6BN88 and Q6CUV2, respectively, were amplified, cloned and expressed in E. coli. The N-terminal residues of these sequences showed low levels of sequence identity (between 18 and 35%).
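The relationship between the codon deletions quoted above and the residue ranges given in the Fig. 1 caption can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the numbers are taken from the text, while the names `TRUNCATIONS` and `domain_span` are hypothetical and introduced here for the example. It assumes that deleting n codons from the 5′ end makes residue n + 1 the first residue of the retained domain.

```python
# Codons deleted from the 5' end and the last residue of each retained domain.
# Values are those quoted in the text and in the Fig. 1 caption; the dictionary
# layout and the function name are illustrative assumptions.
TRUNCATIONS = {
    "UGT51": (723, 1198),
    "UGT51B1": (747, 1211),
    "A7KAK6": (770, 1241),
    "A5DNB9": (1065, 1599),
    "Q6BN88": (1030, 1574),
    "Q6CUV2": (718, 1209),
}


def domain_span(deleted_codons, last_residue):
    """Return (first_residue, length) of the domain left after truncation."""
    # Assumption: removing n codons leaves residue n + 1 as the new first residue.
    first_residue = deleted_codons + 1
    length = last_residue - first_residue + 1
    return first_residue, length


if __name__ == "__main__":
    for name, (deleted, last) in TRUNCATIONS.items():
        first, length = domain_span(deleted, last)
        print(f"{name}': residues {first}-{last}, {length} amino acids")
```

For UGT51 this gives residues 724 to 1198, i.e. 475 amino acids, which matches the domain length stated above.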

Expression of putative UDPG-SGTases using a range of E. coli BL21(DE3) strains
The N-terminally hexa-histidine tagged protein products, with calculated molecular masses of 54.6 kDa for UGT51′, 53.5 kDa for UGT51B1′, 54.2 kDa for A7KAK6′, 60.6 kDa for A5DNB9′, 61.8 kDa for Q6BN88′ and 55.9 kDa for Q6CUV2′, were expressed in E. coli BL21(DE3) strains. In all cases the largest amount of soluble protein expression was achieved after 18 h of post-induction growth at 23 °C, and no expression was observed in the absence of induction with isopropyl-1-thio-β-D-galactopyranoside. In two cases, UGT51′ and A5DNB9′, low expression levels in E. coli BL21(DE3) were seen (data not shown). To maximise expression of these genes, E. coli BL21(DE3) CodonPlus strains were transformed with the LIC-constructs encoding UGT51′ and A5DNB9′. Following this, an increase in recombinant protein expression levels was observed for the A5DNB9′ and UGT51′ encoding plasmids transformed in E. coli BL21(DE3) CodonPlus (RP) and E. coli BL21(DE3) CodonPlus (RIPL), respectively (data not shown). All gene fragments encoded soluble protein products (with yields between 10 and 100 mg L⁻¹), which were purified using immobilised metal ion (Ni²⁺) chromatography. Multiple non-redundant peptides were identified by peptide fragmentation fingerprinting of UGT51′, UGT51B1′, A7KAK6′, A5DNB9′, Q6BN88′ and Q6CUV2′, confirming the identity of the purified proteins (data not shown).

Identification of UDPG-SGTases with different selectivities
Each of the six recombinant proteins was assayed for activity against the following steroidal acceptors: ergosterol, cholesterol, estrone, estradiol, androstenedione and testosterone (Table 1). UGT51′ from S. cerevisiae was shown to be active against the 3β-hydroxysteroids cholesterol and ergosterol (Fig. 2A and B, respectively), confirming the results seen by Warnecke et al. 8 and revealing that activity is also observed with a purified N-terminally hexa-histidine tagged version of the enzyme. UGT51B1′ from P. pastoris was also active against the 3β-hydroxysteroids (Fig. 2A and B), again confirming the results seen by Warnecke et al. 8 and revealing that only the GT1 GT-like domain of the P. pastoris enzyme is responsible for UDPG-SGTase activity. The N-terminally truncated form of the previously uncharacterised enzyme A7KAK6 from P. angusta was also shown to be active against the 3β-hydroxysteroids (Fig. 2A and B), adding further support to the suggestion that only the GT1 GT-like domain of UDPG-SGTases is responsible for activity. The N-terminally truncated form of the previously uncharacterised enzyme Q6CUV2 from K. lactis was shown to be active against cholesterol (Fig. 2A) but not against ergosterol (data not shown). Interestingly, UGT51′, UGT51B1′ and A7KAK6′ were also shown to be active against the 17β-hydroxysteroid testosterone (Fig. 2C), and to our knowledge this is the first report of UDPG-SGTases with activity against both 3β- and 17β-hydroxysteroids. In fact, reports of UDPG-SGTases with specificity towards glucosylation of steroids at positions other than C-3 are limited and, importantly, none involves the characterisation of a recombinant enzyme. For example, Madina et al. 14 have shown that a nearly pure 27β-hydroxy glucosyltransferase from the cytosolic fraction of the medicinal plant Withania somnifera showed activity against testosterone; Poppenberger et al. 15 have shown that wild-type seedlings of Arabidopsis converted the brassinosteroid brassinolide to the 23-O-glucoside, but in the transgenic plants silenced in UGT73C5 expression no 23-O-glucoside was detected; and O'Reilly et al. 16 have shown that a baculovirus ecdysteroid UDP-glucosyltransferase produces ecdysone 22-O-β-D-glucopyranoside.
As expected, the UDPG-SGTases in this study exhibited no activity towards β-naphthol, a non-steroidal aromatic alcohol; estrone and estradiol, hydroxysteroids with the OH group at the C-3 position in a flat configuration; and androstenedione, a steroid with a similar structure to testosterone but with no OH group at the C-17 position (data not shown). Additionally, the other two N-terminally truncated UDPG-SGTases expressed in this study, A5DNB9′ from M. guilliermondii and Q6BN88′ from D. hansenii, did not demonstrate activity towards any of the acceptors used, even though they have previously been implicated in autophagy and related processes. 17,18 UDP-galactose was not accepted as the sugar donor by any of the UDPG-SGTases with any of the screened acceptors. This is in accordance with previous studies in which similar specificity for donor sugars has been reported for S. cerevisiae and other eukaryotic GTases. 8 It is worth noting that, with testosterone as the steroidal acceptor, the glucosylated products had similar R f values to an authentic β-testosterone glucoside standard (data not shown). This was further confirmed by co-spotting of reaction products with the standard (data not shown).
LC-MS analysis was used to confirm the results from the TLC experiments. Characteristic signals for the monoprotonated testosterone (m/z = 289 with a retention time of 2.0 min) and monoprotonated testosterone glucoside (m/z = 451 with a retention time of 1.4 min) were determined using authentic standards. Both were observed in the reaction mixture after 2 h (Fig. 3), but after an overnight incubation only the peak for the testosterone glucoside could be observed. This was also supported by TLC analysis of an overnight incubation with testosterone (data not shown) and was also the case for cholesterol and ergosterol overnight incubations (data not shown). An added advantage of this analytical method was that testosterone and the corresponding glucosylated compound were separated by a chromatographic method with a run time of 5 min at room temperature, an assay which compares favourably with previous studies where separation was achieved with longer retention times at elevated column temperatures. 19,20

Fig. 1 Clustal W2 multiple sequence alignment of the deduced N-terminally truncated catalytic domains of the UDPG-SGTases expressed in this study. UGT51B1′, P. pastoris UDPG-SGTase; A7KAK6′, P. angusta UDPG-SGTase; A5DNB9′, M. guilliermondii UDPG-SGTase; Q6BN88′, D. hansenii UDPG-SGTase; UGT51′, S. cerevisiae UDPG-SGTase; Q6CUV2′, K. lactis UDPG-SGTase. Black boxes indicate identical amino acids in all 6 sequences. Grey boxes indicate identical amino acids in 4 or 5 sequences. The alignment was conducted with residues 724 to 1198 of UGT51, 748 to 1211 of UGT51B1, 771 to 1241 of A7KAK6, 1066 to 1599 of A5DNB9, 1031 to 1574 of Q6BN88, and 719 to 1209 of Q6CUV2.

Table 2 Kinetic parameters of the UDPG-SGTases. C, cholesterol; T, testosterone. Results are the means of triplicate determinations ± standard error. K m values of the enzymes against testosterone were derived at less than saturating substrate concentrations due to the lack of solubility of testosterone at higher concentrations in the aqueous reaction mixture

Kinetic analysis of UDPG-SGTases
Kinetic analysis was performed for all active UDPG-SGTases using cholesterol and testosterone as acceptors and UDP-[¹⁴C]glucose as the donor. All reactions demonstrated classic Michaelis-Menten kinetics, with goodness-of-fit statistical analysis of the linear trend lines of the resulting Lineweaver-Burk plots producing R² values between 0.8052 and 0.9599 (data not shown). For enzymes active against cholesterol and testosterone, the binding capacity of the enzymes for cholesterol was higher (≥4.5-fold) than that for testosterone (Table 2). Similarly, the reaction rate of the enzymes active against cholesterol was higher (≥1.8-fold) than that against testosterone (Table 2). It should be noted that the apparent K m values of the enzymes with testosterone were derived at less than saturating substrate concentrations due to the lack of solubility of testosterone at higher concentrations in the aqueous reaction mixture. UGT51B1′ from P. pastoris displayed the highest level of apparent catalytic efficiency (k cat /K m value) on both substrates (Table 2) and its catalytic efficiency against testosterone was comparable to that of the plant 27β-hydroxy glucosyltransferase against testosterone. 21 The presence of Mn²⁺ ions was shown not to affect the activity of the enzymes in this study (data not shown), which is in agreement with Madina et al., 14 who reported that the presence of Mn²⁺ ions did not affect the activity of the plant 3β-glucosyltransferase.
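To make the kinetic analysis concrete, the sketch below fits a Lineweaver-Burk (double-reciprocal) plot by ordinary least squares and derives apparent K m, V max, R² and k cat /K m from it. It is an illustration under stated assumptions, not the authors' analysis pipeline: the function names, the enzyme concentration and the substrate and rate values are all hypothetical, since the raw data are not given here.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


def lineweaver_burk(substrate, rates):
    """Estimate apparent Km and Vmax from 1/v = (Km/Vmax)(1/[S]) + 1/Vmax."""
    inv_s = [1.0 / s for s in substrate]
    inv_v = [1.0 / v for v in rates]
    slope, intercept, r_squared = linear_fit(inv_s, inv_v)
    vmax = 1.0 / intercept   # intercept on the 1/v axis is 1/Vmax
    km = slope / intercept   # slope is Km/Vmax
    return km, vmax, r_squared


def catalytic_efficiency(vmax, enzyme_conc, km):
    """kcat/Km, with kcat = Vmax / [E]; units follow from the inputs."""
    return (vmax / enzyme_conc) / km


if __name__ == "__main__":
    # Hypothetical, noise-free data generated from Km = 50 uM, Vmax = 2.0 uM/min.
    substrate = [10.0, 25.0, 50.0, 100.0, 200.0]        # uM
    rates = [2.0 * s / (50.0 + s) for s in substrate]   # uM/min
    km, vmax, r2 = lineweaver_burk(substrate, rates)
    print(f"Km = {km:.1f} uM, Vmax = {vmax:.2f} uM/min, R^2 = {r2:.4f}")
    # Hypothetical enzyme concentration of 0.1 uM.
    print(f"kcat/Km = {catalytic_efficiency(vmax, 0.1, km):.3f} per uM per min")
```

Because the double-reciprocal transform gives the low-substrate points a disproportionate influence on the fitted line, direct nonlinear fitting of the Michaelis-Menten equation is often preferred; the linear form is sketched here only because it is the approach described above.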

Conclusions
We have shown that the approach of rational sampling of the diversity that exists within sequence space, identified via a BLAST search, generated a range of UDPG-SGTases with the same activity but different selectivity. We suggest that enzymes of this class could become valuable biocatalysts in the pharmaceutical industry (and fine chemical industry) due to their ability to glycosylate a range of sterols without the need to use protection-deprotection strategies.

Identification and cloning of UDPG-SGTase encoding genes
A BLASTP (version 2.2.27) 22 search of the databases was conducted using UGT51′ (residues 724 to 1198 of UGT51) as a query to identify similar sequences. Gene sequences encoding N-terminally truncated versions of the proteins identified were selected by rationally sampling the diversity that exists within sequence space. The selected genes (see Table 3 for UniProtKB accession numbers) were then cloned and expressed in Escherichia coli. Genomic DNA was prepared from overnight cultures using the DNeasy Blood and Tissue Kit [Qiagen, UK], and the primer pairs [Eurofins MWG Operon, Germany] shown in Table 3 were used to amplify the gene fragments via PCR using KOD Hot Start DNA Polymerase [Merck Chemicals, UK]. PCR was performed for 5 cycles at an annealing temperature of 50 °C, then for 25 cycles at an annealing temperature of 73 °C. The amplified products were inserted into pET-YSBLIC (kindly provided by Dr Mark Fogg of the York Structural Biology Laboratory, York University, UK) via ligation independent cloning, as described by Bonsor et al. 23

Protein expression, purification and identification
Recombinant proteins were expressed from different E. coli BL21(DE3) strains as indicated in Table 3. Cells were grown at 37 °C with shaking at 200 rpm in 1 L LB medium supplemented with 100 µg mL⁻¹ kanamycin for LIC-constructs transformed in E. coli BL21(DE3), and 100 µg mL⁻¹ kanamycin plus 34 µg mL⁻¹ chloramphenicol for LIC-constructs transformed in E. coli BL21(DE3) CodonPlus (RP/RIPL), to an absorbance of 0.6 at 600 nm. Induction was performed by the addition of isopropyl-1-thio-β-D-galactopyranoside to a concentration of 240 µg mL⁻¹, followed by further incubation for 3 h and 18 h at 23 °C, 30 °C and 37 °C, 100 rpm. Cells were harvested by centrifugation (15 min, 4000 × g, 4 °C) and resuspended in 5 mL of 50 mM Na₂HPO₄-HCl, 0.5 M NaCl, 10 mM imidazole, pH 7.4, followed by cell disruption by sonication (6 × 10 s at 14 microns). N-terminally hexa-histidine tagged protein products were purified via immobilised metal ion (Ni²⁺) chromatography (GE Healthcare, UK) using a linear gradient of 10 to 500 mM imidazole. Concentration of the purified proteins and buffer exchange into 5 mM Tris-HCl (pH 8.0) were achieved using 30 kDa cut-off concentrator units [Vivaproducts, USA]. The purity of UDPG-SGTases was judged by SDS-PAGE and Coomassie blue staining. 24 Peptide fragmentation fingerprinting of purified proteins was used to confirm their identity.

This journal is © The Royal Society of Chemistry 2013

Fig. 3 LC-chromatogram and MS spectrum of extracted reaction products. Extraction from the reaction catalysed by P. angusta UDPG-SGTase A7KAK6′ following a 2 h incubation. A characteristic peak at m/z = 451 [M + H]⁺ at a retention time (RT) of 1.38 min confirms the synthesis of testosterone glucoside (inset). The peak at a retention time of 2.06 min (m/z = 289 [M + H]⁺) shows the residual testosterone in the reaction mixture. Selected ion monitoring mode of m/z = 288.0-290.0 and 450.0-452.0 was used.
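The m/z values reported for the LC-MS analysis (Fig. 3) follow from simple mass arithmetic: glucosylation is a condensation, so the glucoside is testosterone plus glucose minus one water, and the observed ion is the singly protonated species. The sketch below reproduces that calculation. The molecular formulae and monoisotopic masses are standard values rather than data from this study, and the helper names (`combine`, `mh_plus`) are hypothetical, introduced only for illustration.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646  # mass of H+ (Da)

TESTOSTERONE = {"C": 19, "H": 28, "O": 2}
GLUCOSE = {"C": 6, "H": 12, "O": 6}
WATER = {"H": 2, "O": 1}


def combine(a, b, sign=1):
    """Add formula b to formula a (sign=1) or subtract it (sign=-1)."""
    out = dict(a)
    for element, count in b.items():
        out[element] = out.get(element, 0) + sign * count
    return out


def monoisotopic_mass(formula):
    """Monoisotopic mass of a molecule given as {element: count}."""
    return sum(MONOISOTOPIC[el] * n for el, n in formula.items())


def mh_plus(formula):
    """m/z of the singly protonated ion [M + H]+."""
    return monoisotopic_mass(formula) + PROTON


# Glycosidic bond formation: aglycone + glucose - H2O.
GLUCOSIDE = combine(combine(TESTOSTERONE, GLUCOSE), WATER, sign=-1)

if __name__ == "__main__":
    for name, formula in (("testosterone", TESTOSTERONE),
                          ("testosterone glucoside", GLUCOSIDE)):
        print(f"{name}: [M + H]+ = {mh_plus(formula):.2f}")
```

Both calculated values fall inside the selected ion monitoring windows quoted in the Fig. 3 caption (288.0-290.0 and 450.0-452.0) and round to the nominal m/z values of 289 and 451.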

Table 1
Chemical structures of steroids that were used in this study

Table 3
Primer pairs used for PCR and E. coli (DE3) strains used for expression. LIC-specific ends are shown in bold