Chemical synthesis of the EPF-family of plant cysteine-rich proteins and late-stage dye attachment by chemoselective amide-forming ligations

Chemical protein synthesis can provide well-defined modified proteins. Herein, we report the chemical synthesis of plant-derived cysteine-rich secretory proteins and late-stage derivatization of the synthetic proteins. The syntheses were achieved with distinct chemoselective amide-bond-forming reactions: epidermal patterning factor (EPF) 2 by native chemical ligation (NCL), EPF1 by the α-ketoacid-hydroxylamine (KAHA) ligation, and fluorescent functionalization of their folded variants by potassium acyltrifluoroborate (KAT) ligation. The chemically synthesized EPFs exhibit bioactivity on stomatal development in Arabidopsis thaliana. Comprehensive synthesis of EPF derivatives allowed us to identify suitable fluorescent variants for bioimaging of the subcellular localization of EPFs.

SPPS was performed on aminomethyl polystyrene resin, HMPB-ChemMatrix resin, or 2-chlorotrityl polystyrene resin. Manual loading of the first amino acid residue onto the resin and subsequent Fmoc-SPPS followed established standard protocols, summarized as follows. Fmoc deprotections were performed with 20% piperidine in DMF (2 × 8 min). Couplings were performed with Fmoc-amino acid (4.0 equiv relative to resin substitution), HCTU (3.8 equiv), and NMM (8.0 equiv) in DMF for 60 min. If required, the coupling step was repeated (double coupling) and LiCl washes (0.8 M LiCl in DMF) were performed before Fmoc deprotection and coupling. After coupling, unreacted free amine was capped by treatment with 20% acetic anhydride and 10% NMM in DMF for 10 min. Amino acid residues prone to epimerization, such as cysteine, were coupled using preformed HOBt esters. In a typical procedure, Fmoc-Cys(Acm)-OH (4.0 equiv relative to resin loading) was dissolved in DMF, and HOBt (4.0 equiv) and DIC (4.0 equiv) were added. The mixture was added to the resin and allowed to react for 2 h.
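The equivalents quoted above convert to reagent amounts once the resin scale is fixed. The following is a minimal sketch of that arithmetic; the function name and the example resin mass and loading are illustrative assumptions and are not taken from the procedure.

```python
# Equivalents as stated in the standard coupling protocol (relative to resin substitution).
COUPLING_EQUIV = {"Fmoc-amino acid": 4.0, "HCTU": 3.8, "NMM": 8.0}


def coupling_amounts_mmol(resin_mass_g, loading_mmol_per_g, equiv=COUPLING_EQUIV):
    """Return the mmol of each reagent needed for one coupling step."""
    scale_mmol = resin_mass_g * loading_mmol_per_g  # total reactive sites on the resin
    return {name: round(eq * scale_mmol, 4) for name, eq in equiv.items()}


# Hypothetical example: 0.25 g of resin at 0.40 mmol/g, i.e. a 0.10 mmol scale.
amounts = coupling_amounts_mmol(0.25, 0.40)
for reagent, mmol in amounts.items():
    print(f"{reagent}: {mmol} mmol")
```

Multiplying each value by the molar mass of the specific reagent gives the mass to weigh out; molar masses are deliberately left to the user because they depend on the protected amino acid being coupled.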

b) Manual coupling of special amino acids
Protected Fmoc-Val--ketoacid, Boc-(S)-5-oxaproline, Fmoc-Orn HA, and Gly-Ser isoacyl dipeptide were coupled manually. The monomer (1.5 equiv) was dissolved in a minimal amount of DMF (minimal concentration of monomer: 0.1 M), HATU (1.5 equiv) and NMM (3.0 equiv) were added. After a brief period of preactivation (2 min), the solution was added to the resin and allowed to react for 2 h. If required, the coupling was repeated with 1.0 equiv of monomers, 1.0 equiv of HATU, and 2.0 equiv of NMM.
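Because the monomer concentration must not fall below 0.1 M, the resin scale also sets an upper limit on the DMF volume. A minimal sketch of this calculation follows; the function name and the 0.10 mmol example scale are assumptions made for illustration.

```python
def manual_coupling(scale_mmol, monomer_equiv=1.5, hatu_equiv=1.5,
                    nmm_equiv=3.0, min_conc_molar=0.1):
    """Reagent amounts (mmol) and the largest DMF volume (mL) that keeps
    the monomer at or above the minimum concentration."""
    monomer_mmol = monomer_equiv * scale_mmol
    return {
        "monomer_mmol": monomer_mmol,
        "HATU_mmol": hatu_equiv * scale_mmol,
        "NMM_mmol": nmm_equiv * scale_mmol,
        # 1 M equals 1 mmol/mL, so mmol divided by M gives mL.
        "max_DMF_mL": monomer_mmol / min_conc_molar,
    }


# Hypothetical example: first coupling on a 0.10 mmol scale.
first = manual_coupling(0.10)
# Repeat coupling with the reduced equivalents given in the text.
repeat = manual_coupling(0.10, monomer_equiv=1.0, hatu_equiv=1.0, nmm_equiv=2.0)
print(first)
print(repeat)
```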

c) Mutations and protecting groups
Norleucine substitution: All methionine (Met) residues in the protein sequence were replaced by norleucine (Nle) to avoid oxidation during handling, storage, and refolding.

General HPLC analysis and purification
Peptides and protein segments were analyzed and purified by reverse-phase high-performance liquid chromatography (RP-HPLC) on Jasco analytical and preparative instruments equipped with dual pumps, a mixer, an in-line degasser, and a variable-wavelength UV detector (simultaneous monitoring of the eluent at 220 nm, 254 nm, and 301 nm) or on a Gilson preparative instrument fitted with a 10 mL injection loop. If required, the columns were preheated using a column heater or a water bath. The mobile phases for RP-HPLC were Milli-Q water containing 0.1% TFA and HPLC-grade CH3CN containing 0.1% TFA. In the described HPLC analyses and purifications, TFA was always used as the solvent modifier. Column: Jupiter C4 (5 μm, 300 Å pore size, 21.2 mm I.D. × 250 mm). The following type of method was used: the column was pre-equilibrated at the starting solvent composition for typically 10 min. After injection of the sample, the solvent composition was ramped to the final solvent composition (e.g., 50% CH3CN) over the gradient run time. After the gradient, the solvent composition was changed to 95% CH3CN within 1 min and the column was flushed for 5-7 min. Within 1 min, the solvent composition was then changed to 10% CH3CN and the run ended. For simplicity, only the gradient time and the starting and final eluent compositions are stated for the individual experiments, although all runs included the full cycle described above.
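The full cycle described above can be written out as a list of time/composition breakpoints. The sketch below is one way to expand the abbreviated "gradient time, start, end" notation; the function name, the 6 min flush (chosen within the stated 5-7 min range), and the example gradient are assumptions.

```python
def gradient_program(start_pct, end_pct, gradient_min, equil_min=10.0, flush_min=6.0):
    """Return (time_min, %CH3CN) breakpoints; time 0 is the injection."""
    points = [(-equil_min, start_pct), (0.0, start_pct)]  # pre-equilibration
    t = gradient_min
    points.append((t, end_pct))   # linear gradient to the final composition
    t += 1.0
    points.append((t, 95.0))      # ramp to 95% CH3CN within 1 min
    t += flush_min
    points.append((t, 95.0))      # column flush
    t += 1.0
    points.append((t, 10.0))      # return to 10% CH3CN within 1 min, run ends
    return points


# Hypothetical example: 10 to 50% CH3CN over 30 min.
program = gradient_program(10.0, 50.0, 30.0)
for time_min, pct in program:
    print(f"{time_min:6.1f} min  {pct:5.1f}% CH3CN")
```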

Characterization
MALDI-MS data were obtained on a Bruker Microflex MALDI-TOF spectrometer using 4-hydroxy-α-cyanocinnamic acid as the matrix. High-resolution mass spectra were recorded by the Molecular Structure Center at ITbM, Nagoya University, on a Thermo Scientific™ Exactive™ Plus Orbitrap mass spectrometer.

Synthesis of reduced EPFL9 protein 1a
The reduced EPFL9 1a was synthesized on 2-chlorotrityl chloride resin preloaded with Fmoc-Arg-OH (0.5 g, 0.39 mmol loading capacity). After automated SPPS, the resin was washed several times with DMF followed by CH2Cl2, dried, and subjected to cleavage using a mixture of 95:2.5:2.5 TFA:TIPS:H2O (20 mL/g resin) for 2 h at room temperature. The crude TFA solution was separated from the resin by filtration and the filtrate was concentrated under reduced pressure.
The residue was triturated with cooled Et2O (ca. 25 mL/g resin) and centrifuged, and the supernatant was removed by decantation. This trituration/washing step was repeated twice (the peptide precipitated). The crude peptide was purified by preparative RP-HPLC using Phenomenex

Synthesis of reduced EPFL9 protein 1b
The reduced EPFL9 1b was synthesized on 2-chlorotrityl chloride resin preloaded with Fmoc-
After completion of the SPPS, the resin was dried and placed in a glass vial and mixture of

General procedure of Acm deprotection
Cys(Acm)-protected proteins were dissolved in 50% aq. acetic acid (v/v) containing 1% (w/v) AgOAc (linear protein concentration: 1 mM) and the mixture was stirred at 45 °C for 2 h.
The mixture was quenched with 10% DTT in 50% aq. acetic acid (w/v/v), and the precipitate was separated by centrifugation. The precipitate was washed with 50% aq. acetic acid (v/v), and the combined supernatants were purified by preparative RP-HPLC.
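The stated concentrations fix the reaction volume and the amount of AgOAc for a given quantity of protein. A minimal sketch of that conversion is shown here; the function name and the 2.0 μmol example are assumptions for illustration.

```python
def acm_deprotection_setup(protein_umol, protein_conc_mM=1.0, agoac_pct_wv=1.0):
    """Return (solvent volume in mL, AgOAc mass in mg) for the Acm removal."""
    # 1 mM equals 1 umol/mL, so umol divided by mM gives mL.
    volume_mL = protein_umol / protein_conc_mM
    # 1% (w/v) equals 1 g per 100 mL, i.e. 10 mg/mL.
    agoac_mg = agoac_pct_wv * 10.0 * volume_mL
    return volume_mL, agoac_mg


# Hypothetical example: 2.0 umol of Cys(Acm)-protected protein.
volume_mL, agoac_mg = acm_deprotection_setup(2.0)
print(f"50% aq. AcOH: {volume_mL:.1f} mL, AgOAc: {agoac_mg:.0f} mg")
```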

Synthesis of reduced EPF2 protein 6a
The reduced peptide 6a was synthesized according to the general procedure 5.

Synthesis of reduced EPF2 protein 6b
The reduced peptide 6b was synthesized according to the general procedure 5.

Synthesis of reduced EPF1 protein 12a
The reduced protein 12a was synthesized according to the general procedure 5.1 using Cys(Acm)-protected linear peptide 11a (10 mg, 1.6 μmol, 1.0 equiv).

Synthesis of reduced EPF1 protein 12b
The reduced protein 12b was synthesized according to the general procedure 5.

Synthesis of EPFL9 protein 2a
The folded EPFL9 protein 2a was synthesized according to general procedure 5.6 using reduced linear protein 1a (10 mg, 1.9 μM).

Synthesis of EPFL9 protein 2b
The folded EPFL9 protein 2b was synthesized according to general procedure 5.6 using reduced linear protein 1b (

General procedure of Dye-KAT synthesis by Huisgen cycloaddition
Alkyne-functionalized dyes 14a-e (1.0 equiv), azido ethoxy ethyl KAT 15 (1.0 equiv), and copper iodide (1.0 equiv) were dissolved in 50% aqueous CH3CN, triethylamine (3.0 equiv) was added, and the mixture was stirred at 65 °C. After 16 h, the reaction mixture was cooled to room temperature, aqueous KF solution (0.5 mL) was added, and stirring was continued for another 15 min at room temperature. The crude mixture was diluted with brine and extracted with CH2Cl2 or ethyl acetate. The organic extracts were combined, dried over Na2SO4, and evaporated in vacuo. The crude residue was purified by column chromatography on silica gel (eluting with acetone/CH2Cl2 or CH3CN/water) to give 16a-e as solids.

Synthesis of pyrene-KAT 16c
The product 16c was synthesized according to the general procedure 6.

UV absorption spectra of Dye-KATs
The UV absorption was measured in 50% aqueous CH 3

Synthesis of coumarin-EPFL9 17b
The coumarin conjugated EPFL9 17b was synthesized according to general procedure 7.

Synthesis of pyrene-EPFL9 17c
The pyrene conjugated EPFL9 17c was synthesized according to general procedure 7.

Synthesis of pyrene-EPF2 18c
The pyrene conjugated EPF2 18c was synthesized according to general procedure 7.1 using folded photo-protected hydroxylamine EPF2 7b (

For each genotype or chemical treatment, a sample size of 7 to 10 was used and over a thousand epidermal cells were counted to provide statistical robustness. Statistical analysis of stomatal density (ANOVA followed by Tukey's HSD test) was performed using RStudio (www.rstudio.com) version 1.4.1717.
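To make the first step of the statistical comparison concrete, the sketch below computes a one-way ANOVA F statistic in plain Python. It is not the authors' RStudio script, it does not implement the Tukey HSD post hoc test, and the numbers in the example are arbitrary placeholders rather than measured stomatal densities.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder values only; three hypothetical treatment groups.
f_stat, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0],
                                      [2.0, 3.0, 4.0],
                                      [3.0, 4.0, 5.0]])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A p-value and the pairwise Tukey comparisons require the F and studentized range distributions, which are available in dedicated statistics packages and are outside the scope of this sketch.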