Biochemical characterization of a cyanobactin arginine-N-prenylase from the autumnalamide biosynthetic pathway

Cyanobactins are linear and cyclic post-translationally modified peptides. Here we show that the prenyl-d-Arg-containing autumnalamide A is a member of the cyanobactin family. Biochemical assays demonstrate that the AutF prenyltransferase targets the guanidinium moiety in arginine and homoarginine and is a useful tool for biotechnological applications.

Libraries were prepared with the Nextera DNA Flex library prep kit (recently renamed Illumina DNA Prep) and Illumina MiSeq sequencing was carried out using the MiSeq Reagent Kit v3 (600 cycle). The sequences obtained were trimmed to remove adapters using Trimmomatic v0.39 [S1] and assemblies were prepared from the trimmed fastq files using SPAdes v3.12.0 with the --careful option. [S2] The resulting assembly was then processed for taxonomic classification using Kraken v2 [S3] and contaminating scaffolds were removed with ZEUSS v1.0.2. [S4] Lastly, scaffolding and gap closing were done using Platanus 1.2.4 [S5] to yield a 6.74 Mb assembly with 171 scaffolds.

16S ribosomal RNA phylogenetic analysis
A phylogenetic tree based on the 16S ribosomal RNA gene was generated using 75 previously published cyanobacterial sequences selected from representative cyanobacterial orders to illustrate the phylogenetic position of Phormidium autumnale CCAP1446/10 (Figure S1). The 76 sequences were aligned with MUSCLE in MEGA11 using default parameters. [S6] A default BIC calculation with the program jModelTest v2.1.2 indicated that the evolutionary model HKY+I+G best fitted the data set. [S7] This model was used for Bayesian inference with the program MrBayes v3.2.7a, run for 5,000,000 generations with the default number of runs and chains. [S8] The posterior probabilities were calculated by the Markov chain Monte Carlo method implemented in the program. The tree was visualized and clades were collapsed in FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/) and the figure was optimized with Inkscape 1.1 (https://inkscape.org).
Cells were extracted in 1 mL 100 % methanol and homogenized using 200 μL of cell disruption media (0.5 mm glass beads, Scientific Industries, Inc.) by mechanical crushing with a FastPrep cell disruptor (Bio 101, Thermo Electron Corporation, Qbiogene, Inc.) at a speed of 6.5 m s⁻¹ for 45 s. Suspensions were centrifuged at 20,000 g for 5 min, filtered through a 0.2 μm filter (13 mm syringe filter, PTFE, VWR International) and analyzed by mass spectrometry (MS).
Samples were analyzed by LC-MS (Waters Acquity I-class UPLC and QTOF, SYNAPT G2-Si, Waters) (Figure S2, Table S1). The mobile phase consisted of solutions A (0.1 % formic acid in water) and B (0.1 % formic acid in acetonitrile/2-propanol (1/1)). Two methods were used. In the first method, the column was a Kinetex® C8 LC column (50 x 2.1 mm, 1.7 µm, 100 Å, Phenomenex). The gradient started at 5 % B, which was increased linearly to 100 % in 5 min, kept at 100 % until 7 min, and returned to 5 % from 7.1 to 10 min. In the second method, the column was the same except that its length was 100 mm. The gradient started at 5 % B, which was increased linearly to 100 % in 10 min, kept at 100 % until 14 min, and returned to 5 % from 14.1 to 20 min. The injection volume varied from 0.1 to 1 μL.
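The two gradient programs can be written down as breakpoint tables, which makes it easy to check the solvent composition at any time point. The following is a minimal Python sketch, not part of the original method files. It assumes that the switch from 100 % B back to 5 % B (7 to 7.1 min and 14 to 14.1 min) is a linear ramp; the names METHOD_1, METHOD_2 and percent_b are illustrative only.

```python
from bisect import bisect_right

# (time in min, % B) breakpoints transcribed from the two methods in the text.
# Assumption: the 0.1 min return to 5 % B is treated as a linear ramp.
METHOD_1 = [(0.0, 5.0), (5.0, 100.0), (7.0, 100.0), (7.1, 5.0), (10.0, 5.0)]
METHOD_2 = [(0.0, 5.0), (10.0, 100.0), (14.0, 100.0), (14.1, 5.0), (20.0, 5.0)]


def percent_b(program, t_min):
    """Return % B at time t_min by linear interpolation between breakpoints."""
    times = [t for t, _ in program]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the gradient program")
    i = bisect_right(times, t_min) - 1
    if i == len(program) - 1:
        # t_min is exactly the final breakpoint
        return program[-1][1]
    (t0, b0), (t1, b1) = program[i], program[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0.0, 2.5, 6.0, 8.0):
        print(f"method 1, t = {t} min: {percent_b(METHOD_1, t):.1f} % B")
```

The percentage of solution A at the same time point is simply 100 minus the returned value.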
The QTOF was calibrated using sodium formate and Ultramark® 1621. Leucine enkephalin was used at 10 s intervals as a lock mass reference compound. Data were accumulated in positive electrospray ionization Resolution Mode at a scan range of m/z 50-2000. Additional MS parameters were as follows: polarity, ES+; capillary, 3.0 kV; source temperature, 120 °C; sampling cone, 40.0; source offset, 80.0; source gas flow, 0.0 mL/min; desolvation temperature, 600 °C; cone gas flow, 50.0 L/h. Identification of autumnalamide B was based on the theoretical molecular weights calculated from the predicted core peptides of candidate strains using ChemBioDraw.
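Theoretical masses of this kind can also be approximated directly from a core peptide sequence. The sketch below is an illustration in Python and is not the ChemBioDraw calculation used here. It sums standard monoisotopic residue masses, treats a head-to-tail cyclic peptide as the residue sum without the terminal water, and adds C5H8 for each prenyl group. The function names are hypothetical, and the example sequence is the synthetic peptide cyclo[-LGPFRFD] (7) described in the peptide synthesis part of this document, not autumnalamide B itself.

```python
# Standard monoisotopic residue masses (Da) of the 20 proteinogenic amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, present only in linear peptides
PROTON = 1.007276   # charge carrier in positive ESI
PRENYL = 68.062600  # C5H8 gained per prenylation


def peptide_mass(sequence, cyclic=True, n_prenyl=0):
    """Monoisotopic neutral mass of a peptide given in one-letter code.

    Lower-case letters (D-residues in this document's notation) have the
    same mass as their L-forms, so the sequence is upper-cased first.
    """
    mass = sum(RESIDUE_MASS[aa] for aa in sequence.upper())
    if not cyclic:
        mass += WATER
    return mass + n_prenyl * PRENYL


def mz(neutral_mass, charge=1):
    """m/z of the [M + nH]n+ ion."""
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    for n in (0, 1):
        m = peptide_mass("LGPFRFD", cyclic=True, n_prenyl=n)
        print(f"cyclo[-LGPFRFD], {n} prenyl: [M+H]+ = {mz(m):.4f}")
```

C-terminal amides, protecting groups and other modifications are not handled by this sketch and would need their own mass corrections.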
The stereochemistry of autumnalamide B was determined by comparing its retention time with those of the different synthetic stereochemical variants (1-4) (Figure S3). Selected ion recording (SIR) chromatograms were acquired by injecting each of the synthetic stereochemical variants 1-4 six times and autumnalamide B from the Phormidium autumnale CCAP1446/10 extract three times into a Waters Acquity Premier UPLC system equipped with a PDA UV detector and a QDa mass detector, and the retention time was recorded in each run. In addition, we compared the MS fragmentation pattern of autumnalamide B in the extract with those of the synthetic stereochemical variants 1-4 (Figure S4). Marfey analysis of purified autumnalamide B was also carried out (Figure S5).

Prediction and annotation of autumnalamide biosynthetic gene cluster
The 9.6-kb autumnalamide (aut) biosynthetic gene cluster was predicted through tBLASTn [S9] searches using AcyA, AcyG and AcyF protein sequences from the anacyclamide biosynthetic gene cluster [S10] as query sequences against a standalone BLAST database of the Phormidium autumnale CCAP1446/10 draft genome (Table S2). The genes encoded in the aut biosynthetic gene cluster were predicted using GLIMMER as implemented in Artemis. [S11] Start sites were predicted and proteins annotated manually using a combination of searches against the Conserved Domain Database and protein classification resources at NCBI and InterProScan searches and BLASTp [S9] searches against the non-redundant database at NCBI (Table S2). The annotated sequence of the aut biosynthetic gene cluster from Phormidium autumnale CCAP1446/10 was deposited in GenBank under accession number JAIGNI000000000.
The AutF prenyltransferase was aligned with 36 other cyanobactin prenyltransferases using MUSCLE (https://www.ebi.ac.uk/Tools/msa/muscle/). A total of 127 positions were retained and used for the construction of a phylogenetic tree using PHYML (Figure S6). The tree was constructed using the WAG amino acid substitution model, four substitution rate categories, an estimated proportion of invariable sites of 0.013, and an estimated γ-distribution shape parameter of 2.415. The stability of the in-group relations was assessed with 1000 bootstrap replicates. The resulting phylogenetic tree was midpoint-rooted using RETREE and visualized using TREEVIEW.
The AutF prenyltransferase was used to query the non-redundant database at NCBI. This analysis identified three complete cyanobactin biosynthetic pathways that encoded close homologs of the AutF prenyltransferase from the anacyclamide and piricyclamide biosynthetic pathways (Figure S7A). Inspection of these cyanobactin biosynthetic gene clusters identified 2-4 precursor proteins that encode either Lys or Arg in the core region (Figure S7B).

Expression of AutF in E. coli and purification
The full-length autF gene was amplified by PCR using Phusion™ High-Fidelity DNA polymerase and the gDNA of cyanobacterial strain CCAP1446/10 as template. The gene was purified from agarose gel using the QIAquick Gel Extraction Kit and cloned into the pEHISTEV-SUMO plasmid (gift from Dr Huanting Liu, University of St Andrews) in frame with an N-terminal Tobacco etch virus (TEV) protease-cleavable His6SUMO tag using the In-Fusion® cloning kit (Takara Bio) after linearisation of the vector by PCR (Figure S8A). Primers used in PCR reactions are listed in Table S3. The protein was expressed in Escherichia coli BL21 (DE3) cells. A seed culture was grown overnight in Luria-Bertani (LB) medium containing 50 µg/mL kanamycin at 37°C with shaking at 200 rpm, and an aliquot (10 mL) was used to inoculate each litre of LB medium. Cultures were grown at 37°C with shaking at 200 rpm until the OD600 reached 0.4-0.5. Cultures were then cooled to room temperature and IPTG was added to a final concentration of 0.5 mM. Induced cultures were further incubated at 30°C for 6 hours.
Cells were harvested by centrifugation at 4,000g, 4°C for 15 min and resuspended in lysis buffer (200 mM NaCl, 20 mM Tris (pH 8.0), 20 mM imidazole and 3 mM β-mercaptoethanol (BME)) supplemented with complete EDTA-free protease inhibitor tablets (Thermo Scientific) and 0.4 mg DNase (Sigma) per g of wet cells. They were then lysed using a STANSTED SPCH-EP-10 pressure cell homogeniser (Homogenising Systems Ltd, UK) and the lysate was cleared by centrifugation (40,000g, 4°C, 45 min) followed by filtration through a 0.45 µm membrane filter and loaded onto a Ni-Sepharose 6 FF column (Cytiva, Sweden) equilibrated in lysis buffer. The column was washed with 20 volumes of washing buffer containing 200 mM NaCl, 20 mM Tris (pH 8.0), 20 mM imidazole and 3 mM BME, and AutF was eluted with 250 mM imidazole in the same buffer. The elution peak was loaded onto a HiPrep™ 26/10 Desalting column (Cytiva, Sweden) equilibrated in 200 mM NaCl, 10 mM HEPES pH 8.0, 1 mM tris(2-carboxyethyl)phosphine hydrochloride (TCEP) and 10% glycerol. The protein was concentrated to 150 µM using Vivaspin concentrators with a 10 kDa molecular weight cut-off. Integrity and identity were confirmed by SDS-PAGE (Figure S8B) and MS. The purified protein was stored as flash-frozen aliquots at -80°C until used.
Site-directed mutagenesis was carried out using the In-Fusion® cloning kit (Takara Bio) following the manufacturer's protocol, after PCR amplification using Phusion™ High-Fidelity DNA polymerase and the primers listed in Table S3. The mutant protein was expressed and purified as above.
Linear peptides and cyclic precursor peptides were prepared using the standard Fmoc-based solid-phase peptide synthesis (SPPS) strategy on a Liberty Blue™ Automated Microwave Peptide Synthesizer (CEM Corporation, USA) at a 0.1 mmol scale. An Fmoc-Rink Amide ProTide (LL) resin (0.19 mmol/g) was used for the synthesis of all these peptides. Initial deprotection of the Fmoc protecting group from the resin and all subsequent deprotections were performed with programmed cycles using a solution of 20% piperidine in DMF. Coupling cycles in the synthesizer were performed, after deprotection, with solutions of 0.2 M Fmoc-amino acids, 1 M DIC and 1 M Oxyma in DMF. Most Fmoc-amino acids underwent one coupling cycle during attachment. Fmoc-arginine residues underwent two coupling cycles before deprotection of the Fmoc group. For all linear and cyclic precursor peptides, the N-terminal Fmoc group was cleaved at the end of the synthesis.
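The amount of resin needed for a given synthesis scale follows from the resin loading. As a minimal Python sketch (the function name resin_mass_mg is illustrative and not from any instrument software), the 0.1 mmol scale and the 0.19 mmol/g loading quoted above correspond to roughly 0.53 g of resin:

```python
def resin_mass_mg(scale_mmol, loading_mmol_per_g):
    """Mass of resin (mg) that carries scale_mmol of reactive sites."""
    if scale_mmol <= 0 or loading_mmol_per_g <= 0:
        raise ValueError("scale and loading must be positive")
    return scale_mmol / loading_mmol_per_g * 1000.0


if __name__ == "__main__":
    # Values from the text: 0.1 mmol scale, Rink Amide resin at 0.19 mmol/g.
    print(f"{resin_mass_mg(0.1, 0.19):.0f} mg resin")
```

The same relation applies to the other resins used in this work once their loadings are substituted.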
Cyclo[-LGPFRFD] (7) and cyclo[-LGPFrFD] (6) peptides were synthesized on a Fmoc-Asp(NovaSyn TGA)-OAll resin (0.17 mmol/g) under the same conditions as the other linear peptides with the exception of no final deprotection step being performed on the N-terminal leucine. The resin was then transferred to a fritted SPE column (Hicrom, Avantor UK) and treated with Pd(PPh3)4 (0.1 equiv.) and phenylsilane (20 equiv.) in DCM. The mixture was allowed to react for 40 min at RT on an orbital shaker (Heidolph, 1350 rpm) to remove the allyl ester protecting group from the C-terminal carboxylic acid. [S12] The resin was then washed with DMF, 5% sodium diethyldithiocarbamate trihydrate solution in DMF, 1% DIPEA in DMF and DMF again to remove any Pd remaining from the deprotection cocktail. [S12] Fmoc deprotection was performed manually on the resin using 20% piperidine in DMF for 30 min on a shaker (1350 rpm) at RT. [S12, S13] A Kaiser test was performed to confirm full deprotection. Cyclisation was performed manually on-resin by treatment with HATU (2 equiv.) and DIPEA (2 equiv.) in DMF for 19 hrs. on a shaker (1350 rpm) at RT. [S13] After cyclisation, a Kaiser test was performed to confirm cyclisation.
All peptides not containing methionine were deprotected and cleaved from the resins by treatment with a cleavage cocktail of TFA/TIS/H2O (95:2.5:2.5) for 3 hrs. at RT on an orbital shaker (Heidolph, 1350 rpm). The cleavage mixture was concentrated under a stream of air. Peptides containing a methionine residue were treated with a cocktail solution of TFA/TIS/DODT/H2O (92.5:2.5:2.5:2.5) and then concentrated under N2 gas. The peptides were precipitated using cold diethyl ether, placed in a -20 °C freezer overnight, washed with ether (3x) and dried under vacuum to give a crude solid. The crude peptides were purified by reversed-phase HPLC on an Agilent Technologies 1260 Infinity using a C18 column (ACE 5 C18-HL, 5 μm, 10 x 250 mm, 100 Å) with an acetonitrile (+0.1% TFA)/water (+0.1% TFA) gradient (see Table S4 for details). The collected fractions were subsequently lyophilized on a LaboGene CoolSafe freeze dryer to give the pure solid. Identity and purity were confirmed by HPLC-MS analysis.
Cyclo[-TLrESTAMYp] (2) and cyclo[-TLRESTAMYp] (3) were synthesized on a Cl-TCP(Cl) ProTide resin (0.5 mmol/g) under the same conditions as conventional synthesis, with the exception of a special coupling performed on the C-terminal alanine, which was coupled to the resin using a double coupling cycle and a base of 1.0 M DIPEA and 0.125 M KI solution in DMF. 0.1 eq DIPEA was also added to the Oxyma activator solution. After synthesis, the resin was transferred to a fritted SPE column and washed 6x with DCM (placed on the shaker for the third and last washes). The resin was then treated with a cleavage solution of 1% TFA in DCM 5x for 2 min each on the shaker at RT. After each treatment, the solution was filtered into a solution of 10% pyridine in MeOH. All filtrates were combined and concentrated under vacuum. Ice-cold H2O was added to the solution, causing the protected peptide to precipitate. This suspension was subsequently lyophilized to give a white powder. The identity was confirmed by LCMS. The protected peptides were cyclized in solution at a concentration of 2 mM with 3 eq of PyAOP and 5 eq of DIPEA at RT for two days, using HPLC to monitor reaction completion. Once the reaction was complete, the solution was concentrated under vacuum and lyophilized.
The protection groups on the peptides were cleaved off using a cocktail solution of TFA/TIS/DODT/H2O (92.5:2.5:2.5:2.5) for 3 hrs. at RT on an orbital shaker. The mixture was concentrated under a stream of N2 and precipitated out using cold diethyl ether. The solid was washed, dried and purified under the same conditions as the linear peptides.
Macrocyclization reaction mixtures were then subjected to solid-phase extraction using Strata C18-E (55 µm, 70 Å; 2 g/12 mL, Giga tubes; Phenomenex) and the organic components were eluted with 100% methanol and concentrated under vacuum using a rotary evaporator.
The eluate was then purified on the semipreparative HPLC C18 column (ACE 5 C18-HL) and the identity of the purified cyclic peptide products was confirmed by LCMS.
Prenylation reactions with AutF contained 100 µM peptide substrate, 20 µM purified enzyme, 12 mM MgCl2, 1 mM dimethylallyl pyrophosphate (DMAPP) and 1% DMSO in buffer containing 150 mM NaCl, 10 mM HEPES pH 7.5 and 3 mM TCEP. Samples were analysed by LC-MS as described above (Figures S9-S32). Control samples were prepared by incubation of the peptide substrates in the aforementioned buffer. No prenylated peptides were detected in control samples, indicating that no spontaneous prenylation occurred independently of the enzyme.
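The final concentrations above translate into pipetting volumes through the dilution relation C1 x V1 = C2 x V2. The Python sketch below illustrates this for a reaction of hypothetical total volume. Only the 150 µM enzyme stock is taken from this document (the concentration reached after purification); the peptide, MgCl2 and DMAPP stock concentrations and the 50 µL reaction volume are assumptions made for illustration. A 10 mM peptide stock in DMSO is assumed because it would account for the stated 1% DMSO.

```python
def stock_volume_ul(final_conc, stock_conc, total_volume_ul):
    """Volume of stock (µL) needed so that C_stock * V_stock = C_final * V_total.

    final_conc and stock_conc must be given in the same unit.
    """
    if final_conc > stock_conc:
        raise ValueError("stock is more dilute than the requested final concentration")
    return final_conc / stock_conc * total_volume_ul


TOTAL_UL = 50.0  # hypothetical reaction volume

# name: (final concentration in µM, stock concentration in µM)
COMPONENTS = {
    "peptide substrate": (100.0, 10_000.0),  # assumed 10 mM stock in DMSO
    "AutF": (20.0, 150.0),                   # 150 µM stock, from the purification
    "MgCl2": (12_000.0, 1_000_000.0),        # assumed 1 M stock
    "DMAPP": (1_000.0, 50_000.0),            # assumed 50 mM stock
}

if __name__ == "__main__":
    used = 0.0
    for name, (final, stock) in COMPONENTS.items():
        vol = stock_volume_ul(final, stock, TOTAL_UL)
        used += vol
        print(f"{name}: {vol:.2f} µL")
    print(f"buffer to {TOTAL_UL:.0f} µL: {TOTAL_UL - used:.2f} µL")
```

The sketch ignores the salts, glycerol and TCEP carried over with the enzyme storage buffer, which would have to be accounted for when preparing the reaction buffer.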

Kinetics
The AutF steady-state kinetic parameters were assessed using 2 mM DMAPP and variable concentrations of the two substrates Fmoc-L-Homoarginine-OH (0.01-0.5 mM) and H-FRFDLGPAYD-NH2 (0.05-1 mM) on the basis of product yield under different conditions as monitored by LCMS. The assays were conducted in duplicate and all rates were confirmed to be linear. The kinetic curves were fitted to the Michaelis-Menten equation (Figure S33) using Prism 5.04 (GraphPad Software, Inc., La Jolla, CA 92037, USA).
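For readers without access to Prism, the Michaelis-Menten relation v = Vmax[S]/(Km + [S]) can be fitted with a few lines of code. The Python sketch below is not the procedure used here: Prism performs nonlinear regression, whereas this dependency-free illustration uses the Hanes-Woolf linearization, [S]/v = [S]/Vmax + Km/Vmax, solved by ordinary least squares. The data points are synthetic, noise-free values generated from placeholder parameters (Vmax = 1.0, Km = 0.2 mM) over the 0.05-1 mM range and are not measurements from this study.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (Vmax, Km) from the line [S]/v = [S]/Vmax + Km/Vmax."""
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


if __name__ == "__main__":
    # Synthetic illustration only: placeholder parameters, no experimental data.
    conc_mm = [0.05, 0.1, 0.2, 0.4, 0.6, 1.0]
    rates = [michaelis_menten(s, vmax=1.0, km=0.2) for s in conc_mm]
    vmax, km = fit_hanes_woolf(conc_mm, rates)
    print(f"Vmax = {vmax:.3f} (arbitrary units), Km = {km:.3f} mM")
```

Linearized fits weight experimental error differently from nonlinear regression, so with real, noisy data the two approaches can give somewhat different estimates.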
Marfey derivatives of the autumnalamide B acid hydrolysate and amino acid standards (from Sigma, except methionine sulfone, which was from Fluka; 1 µL injections) were analyzed by UPLC-UV/MS using a Kinetex® C8 column (100 x 2.1 mm, 1.7 µm, 100 Å) eluted at 0.3 mL min⁻¹ at 40 °C with 0.1% HCOOH (solvent A) and acetonitrile + 0.1% HCOOH (solvent B). The gradient started from 95/5 (A/B), went to 40/60 in 10 min, then to 0/100 in 0.1 min, was held there for 3.9 min and then returned to 95/5 in 0.1 min, with a total run time of 20 min. Compounds were detected with the mass detector using full-scan negative ionization.
As the arginine Marfey derivatives were not resolved on the reversed-phase C8 column, a HILIC method was used. Samples (1 µL) were injected onto an Acquity UPLC® BEH Amide column (2.1 x 100 mm, 1.7 µm) eluted at 0.3 mL min⁻¹ at 40 °C with 0.2% ammonium formate (solvent A) and acetonitrile (solvent B). The gradient started from 90/10 (A/B), went to 40/60 in 9 min, was held there for 1 min and then returned to 90/10 in 0.1 min, with a total run time of 16 min. Results are shown in Figure S5.

Figure S1: Bayesian inference tree based on the 16S rRNA genes from 76 cyanobacterial strains constructed with 5,000,000 generations. Accession numbers of the sequences are shown in parentheses and node labels represent the posterior probability values. The strain Phormidium autumnale CCAP1446/10 is shown in bold.

Models were generated using the SWISS-MODEL web tool [S16] with the PagF structure (PDB 5TU6) as a template.