Towards a prebiotic chemoton – nucleotide precursor synthesis driven by the autocatalytic formose reaction

The formose reaction is often cited as a prebiotic source of sugars and remains one of the most plausible forms of autocatalysis on the early Earth. Herein, we investigated how cyanamide and 2-aminooxazole, molecules proposed to be present on early Earth and precursors for nonenzymatic ribonucleotide synthesis, mediate the formose reaction using HPLC, LC-MS and 1H NMR spectroscopy. Cyanamide was shown to delay the exponential phase of the formose reaction by reacting with formose sugars to form 2-aminooxazole and 2-aminooxazolines thereby diverting some of these sugars from the autocatalytic cycle, which nonetheless remains intact. Masses for tetrose and pentose aminooxazolines, precursors for nucleotide synthesis including TNA and RNA, were also observed. The results of this work in the context of the chemoton model are further discussed. Additionally, we highlight other prebiotically plausible molecules that could have mediated the formose reaction and alternative prebiotic autocatalytic systems.


References
Electronic Supplementary Material (ESI) for Chemical Science. This journal is © The Royal Society of Chemistry 2023

Standard Curves of DNPH-Derivatised Formaldehyde and Glycolaldehyde
The formaldehyde stock solution was diluted to 100, 50, 25, 12.5 and 6.25 mM, and glycolaldehyde to 10, 5, 2.5, 1.25 and 0.625 mM, with Milli-Q water. For the construction of standard curves, an 80 µL aliquot of each standard solution was added to 120 µL of Milli-Q water, followed by 800 µL of the derivatisation mixture described above. The derivatisation reaction was allowed to proceed at room temperature for a minimum of 30 minutes before HPLC analysis as described in the main text.
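
The dilution implied by these volumes can be checked with a short calculation. The sketch below assumes the volumes are simply additive (80 + 120 + 800 µL = 1000 µL); the function and constant names are illustrative and do not come from the original work.

```python
# Volumes from the procedure above, in microlitres.
ALIQUOT_UL = 80.0
WATER_UL = 120.0
DERIVATISATION_MIX_UL = 800.0


def final_concentration_mm(stock_mm: float) -> float:
    """Concentration of a standard after dilution into the derivatisation
    mixture, assuming additive volumes."""
    total_ul = ALIQUOT_UL + WATER_UL + DERIVATISATION_MIX_UL
    return stock_mm * ALIQUOT_UL / total_ul


# Standard concentrations listed in the text (mM).
formaldehyde_mm = [100, 50, 25, 12.5, 6.25]
glycolaldehyde_mm = [10, 5, 2.5, 1.25, 0.625]

for name, series in (("formaldehyde", formaldehyde_mm),
                     ("glycolaldehyde", glycolaldehyde_mm)):
    diluted = [final_concentration_mm(c) for c in series]
    print(name, diluted)
```

Under this assumption, every standard is diluted 12.5-fold in the vial that is injected.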

DsCl Derivatisation of Cyanamide in Formose Samples
110 μL of each timepoint sample was transferred to a 1.5 mL microcentrifuge tube containing 10 μL of 2 M HCl for quenching, followed by the addition of 80 μL of 0.4 M Na2CO3/NaHCO3 buffer (pH 9). The mixture was mixed by pipetting up and down, after which 200 μL of 10 mg mL⁻¹ DsCl in HPLC-grade acetone was added using a 500 μL glass syringe (Innovative Labor Systeme GmbH). The resulting mixture was vortexed for 1 minute. The derivatisation reaction was then heated to ~45 °C for approximately 70 minutes, followed by centrifugation for 1 minute. The supernatant was then transferred to HPLC vials for analysis as described in the main text.
Table S1. Composition of formaldehyde-containing reaction mixtures A. To investigate each condition (column 1), the respective components were added in order from left to right into a 15 mL centrifuge tube. Following the addition of 1 M NaOH, the mixtures were vortexed until homogeneous (~4 seconds).

Standard Curve of Dansyl Chloride-Derivatised Cyanamide
A stock solution of cyanamide was serially diluted to 15, 7.5, 5, 2.5, 1.25 and 0.625 mM with Milli-Q water. For the construction of standard curves, a 110 µL aliquot of each concentration was added to 10 µL of Milli-Q water followed by 80 µL of the 0.4 M NaHCO3/Na2CO3 (pH 9) buffer. The derivatisation reaction was heated at 45 °C for 70 minutes, and the derivatised mixtures were transferred to HPLC vials for analysis. The derivatised samples were stored at 5 °C in the autosampler to inhibit decomposition.
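
A standard curve of this kind is typically used by fitting a straight line to peak area against concentration and inverting it for unknown samples. The following is a minimal sketch of that idea, assuming a linear detector response. The peak areas are placeholders generated from an arbitrary line, not measured data, and the function names are hypothetical.

```python
import numpy as np


def fit_standard_curve(conc_mm, peak_area):
    """Least-squares line: peak_area = slope * conc + intercept."""
    slope, intercept = np.polyfit(conc_mm, peak_area, 1)
    return slope, intercept


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to estimate a concentration in mM."""
    return (area - intercept) / slope


# Cyanamide standard concentrations from the text (mM).
conc = np.array([15, 7.5, 5, 2.5, 1.25, 0.625])
# Placeholder areas generated from an arbitrary line; NOT measured data.
areas = 2000.0 * conc + 50.0

slope, intercept = fit_standard_curve(conc, areas)
# Area that the placeholder line would give for a 6 mM sample.
print(concentration_from_area(2000.0 * 6.0 + 50.0, slope, intercept))
```

Real calibration data would carry scatter, so the fit quality (for example R²) should be inspected before the line is used for quantification.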

HPLC Analysis of DsCl-Derivatised Formose Samples
HPLC analysis was carried out using a Shimadzu Nexera 40 Series UPLC system with PDA detector (Kyoto, Japan). An aliquot of 1 μL of each derivatised sample was injected into the HPLC and eluted with a 1 mL min⁻¹ binary gradient consisting of 10 mM NaH2PO4 (solvent A) and neat ACN (solvent B) for 30 minutes. The gradient started at 45% B and ramped up to 80% in 14 minutes. From 14 minutes to 15 minutes, the concentration of B went from 80% to 90% and remained at 90% for 7 minutes. From 22 minutes to 23 minutes, the concentration returned to 45%. Solvent B stayed at 45% for the rest of the run. The same binary gradient was used as a wash after every sample run. The stationary phase was a Shimadzu Shim-pak GIST C18 column (5 µm particle size, 4.6 mm I.D., and 150 mm length) with the oven temperature maintained at 25 °C. The derivatised samples were stored at 5 °C in the autosampler to inhibit decomposition.
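
The gradient described above can be written as a table of breakpoints. The sketch below encodes it that way and interpolates the solvent B percentage at any time, assuming linear ramps between breakpoints (the text does not state the ramp shape explicitly); the names are illustrative.

```python
import numpy as np

# Breakpoints of the gradient as described above: time (min) and % solvent B.
GRADIENT_TIME_MIN = np.array([0.0, 14.0, 15.0, 22.0, 23.0, 30.0])
GRADIENT_PERCENT_B = np.array([45.0, 80.0, 90.0, 90.0, 45.0, 45.0])


def percent_b(t_min):
    """Percentage of solvent B at time t_min, assuming linear ramps
    between the listed breakpoints."""
    return np.interp(t_min, GRADIENT_TIME_MIN, GRADIENT_PERCENT_B)


for t in (0, 7, 14, 14.5, 20, 22.5, 30):
    print(f"{t:>5} min: {percent_b(t):.1f}% B")
```

The percentage of solvent A at any time is 100 minus the value returned for B.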

Standard Curve of 2-NH2Ox
A stock solution of cyanamide was serially diluted to 1.25, 0.625, 0.5, 0.3125, 0.25, 0.0625 and 0.03125 mM with Milli-Q water. 100 μL of each sample was transferred to a 1.5 mL microcentrifuge tube containing 900 μL of Milli-Q water. The formose samples were stored at 5 °C for 2-3 days to allow the cyanamide and 2-NH2Ox adducts with formose intermediates to reach equilibrium in the diluted condition before LC-MS analysis.

Data Analysis
Microsoft Excel was used to plot HPLC chromatograms. Origin 2022b was used to smooth and plot LC-MS extracted ion chromatograms (EICs). The EICs were smoothed using an adjacent-averaging method with a window of 30 points and without weighted averaging.
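
For readers without access to Origin, unweighted adjacent averaging is essentially a moving average. The sketch below is an approximation under stated assumptions, not a reproduction of Origin's routine: the window is placed as evenly as possible around each point and is truncated at the ends of the trace, which may differ from how Origin treats boundaries.

```python
import numpy as np


def adjacent_average(y, window=30):
    """Unweighted moving average over `window` points.

    Assumption: for an even window the span is i - window//2 to
    i + window//2 - 1, and the window shrinks at the edges of the trace.
    """
    y = np.asarray(y, dtype=float)
    before = window // 2          # points taken before index i
    after = window - before       # points taken from index i onwards
    out = np.empty_like(y)
    for i in range(len(y)):
        lo = max(0, i - before)
        hi = min(len(y), i + after)
        out[i] = y[lo:hi].mean()
    return out


# Example: a noisy synthetic peak (illustrative signal only).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 500)
trace = np.exp(-((x - 5) ** 2)) + rng.normal(0, 0.05, x.size)
smoothed = adjacent_average(trace, window=30)
print(trace.std(), smoothed.std())
```

A wide window suppresses noise but also broadens and lowers narrow peaks, so smoothed EICs are better suited to showing trends than to integration.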

Data analysis
MestreNova (MestreLab Research, Santiago de Compostela, Spain) was used to analyse and compile NMR spectra.

Time Course Experiments of Formose Reaction with 0, 7 and 9 mM Cyanamide
Concentrations of formaldehyde and glycolaldehyde over time in triplicates of the formose reaction without (0 mM) cyanamide. Concentrations of formaldehyde (middle) and glycolaldehyde (bottom) were determined through derivatisation with DNPH followed by analysis with HPLC (see Figure S12 for extracted chromatograms at 360 nm). The regions are colour-coded based on their kinetic phase: lag phase (blue), exponential (green), and degradation (yellow). The concentrations were calculated based on the respective standard curves (see Figures S2 and S3). At 0 mM initial cyanamide, the exponential phase started almost immediately after heating.

Figure S10. Concentrations of formaldehyde and glycolaldehyde over time in triplicates of the formose reaction containing 7 mM cyanamide. Concentrations of formaldehyde (middle) and glycolaldehyde (bottom) were determined through derivatisation with DNPH followed by analysis with HPLC (see Figure S13 for extracted chromatograms at 360 nm). The regions are colour-coded based on their kinetic phase: lag phase (blue), exponential (green), and degradation (yellow). The concentrations were calculated based on the respective standard curves (see Figures S2 and S3). At 7 mM initial cyanamide, the exponential phase was delayed to ~6 minutes after heating.

Figure S11. Concentrations of formaldehyde and glycolaldehyde over time in triplicates of the formose reaction containing 9 mM cyanamide. Concentrations of formaldehyde (middle) and glycolaldehyde (bottom) were determined through derivatisation with DNPH followed by analysis with HPLC (see Figure S14 for extracted chromatograms at 360 nm). The regions are colour-coded based on their kinetic phase: lag phase (blue), exponential (green), and degradation (yellow). The concentrations were calculated based on the respective standard curves (see Figures S2 and S3). At 9 mM initial cyanamide, the exponential phase was further delayed to ~8 minutes after heating.

Figure S14. Extracted HPLC chromatograms at 360 nm absorbance of formose reaction timepoints containing 9 mM initial cyanamide. Formose reaction timepoints were derivatised with DNPH followed by HPLC analysis. The chromatograms are colour-coded based on which phase of the formose reaction they belong to. Blue chromatograms represent the lag phase, green the exponential phase, and yellow the degradation phase following the yellowing point. With 9 mM initial cyanamide, the lag phase of the formose reaction was extended by ~6 minutes. The amounts of C3-C6 sugars produced during the exponential phase were noticeably lower than in the formose reaction without cyanamide (Figure S12) but comparable to the reaction with 7 mM cyanamide (Figure S13). After the yellowing point (~12 min), the peaks gradually decreased in intensity. All chromatographic peaks shown display an absorption spectrum consistent with DNPH derivatisation.

Figure S15. Concentration of cyanamide over time in triplicates of the formose reaction containing 7 mM initial cyanamide. The concentration of cyanamide was determined through derivatisation with dansyl chloride followed by analysis with HPLC at 350 nm (see Figure S16 for extracted chromatograms). The concentration was calculated based on the respective standard curve (see Figure S4). The concentration of cyanamide prior to heating was determined to be ~6 mM instead of 7 mM, which could be accounted for by its reaction with glycolaldehyde and other formose products that start to form immediately after mixing. The consumption of cyanamide was observed to reflect the kinetic phase of the formose reaction. Cyanamide concentration started decreasing significantly at the onset of the exponential phase (~6 minutes). Cyanamide consumption slowed down at ~10-12 minutes (the yellowing point) and the concentration stabilised afterwards.

Figure S16. Extracted HPLC chromatograms at 350 nm absorbance of formose reaction timepoints containing 7 mM initial cyanamide. Formose reaction timepoints were derivatised with dansyl chloride followed by HPLC analysis. The chromatograms are colour-coded based on which phase of the formose reaction they belong to. Blue chromatograms represent the lag phase, green the exponential phase, and yellow the degradation phase. Integration of the cyanamide peak revealed that cyanamide consumption reflected the kinetic phase of the formose reaction.

Figure S25. Extracted ion chromatograms of C4H8N2O3 (C3-cyanamide adduct) in formose reaction timepoints containing 7 mM cyanamide and 1 mM initial glycolaldehyde-1,2-13C2. A mass range of +/-10 ppm, based on found mass, was applied for each EIC. The structures shown in the scheme above are proposed and consistent with the observed masses. See Table S1 for proposed structures of aminooxazole derivatives. The concentration of the doubly 13C-labelled species (observed m/z 135.0664), which formed predominantly at the start, decreased continuously throughout the experiment. The unlabelled species (observed m/z 133.0599) started at 0 and reached maximum concentration during the exponential phase (~8 min). For the rest of the experiment, the unlabelled species decreased, which could explain the increase of the anhydrate, which is more stable due to aromaticity.

Discussion. The 1H NMR spectra (Figure S36) show the peaks for 2-NH2Ox and what is likely its hemiaminal product with formaldehyde. Figure S37 shows the mass chromatograms for m/z values of 2-NH2Ox and its hemiaminal and imine products with formaldehyde. Dilution is observed to shift the equilibrium towards free formaldehyde and 2-NH2Ox. These results suggest that some fraction of formose-derived aldehydes likely exists as hemiaminals during the formose reaction, a smaller fraction of which undergoes dehydration to form the respective imines. Therefore, HPLC and LC-MS data of formose reaction mixtures likely reflect higher concentrations of free aldehydes and amines than are present during the formose reaction. However, the concentration of hemiaminal observed is relatively small, and equilibria between free aldehydes and hemiaminals are reversible. As a consequence, the HPLC and LC-MS data on formose reaction mixtures nonetheless represent the effective concentrations of free aldehydes and amines that are ultimately available in the reaction system.

Figure S2. Standard curve of formaldehyde derivatised with DNPH determined by HPLC analysis. Top) Reaction scheme of CH2O derivatisation. Middle) HPLC chromatograms at 360 nm showing the different concentrations of formaldehyde standards derivatised with DNPH. Bottom) Standard curve of DNPH-derivatised formaldehyde.

Figure S3. Standard curve of glycolaldehyde derivatised with DNPH determined by HPLC analysis. Top) Reaction scheme of glycolaldehyde derivatisation. Middle) HPLC chromatograms at 360 nm showing the different concentrations of glycolaldehyde standards derivatised with DNPH. Bottom) Standard curve of DNPH-derivatised glycolaldehyde.

Figure S4. Standard curve of cyanamide derivatised with DsCl determined by HPLC analysis. Top) Reaction scheme of cyanamide derivatisation. Middle) HPLC chromatograms at 350 nm showing the different concentrations of cyanamide standards derivatised with DsCl. Bottom) Standard curve of DsCl-derivatised cyanamide.

Figure S6. Arabinose aminooxazoline standard determined by LC-MS/MS analysis. Top) Procedure of arabinose aminooxazoline LC-MS/MS analysis. The standard was diluted in 20 mM NaOH to mimic the NaOH concentration of formose reaction samples after dilution. Middle) Total ion chromatogram of m/z 175.0719 MS/MS. Bottom) MS/MS spectrum of arabinose aminooxazoline at RT 1.023 min.

Figure S7. Ribose aminooxazoline standard determined by LC-MS/MS analysis. Top) Procedure of ribose aminooxazoline LC-MS/MS analysis. The standard was diluted in 20 mM NaOH to mimic the NaOH concentration of formose reaction samples after dilution. Middle) Total ion chromatogram of m/z 175.0719 MS/MS. Bottom) MS/MS spectrum of ribose aminooxazoline at RT 1.023 min.

Figure S8. Xylose aminooxazoline standard determined by LC-MS/MS analysis. Top) Procedure of xylose aminooxazoline LC-MS/MS analysis. The standard was diluted in 20 mM NaOH to mimic the NaOH concentration of formose reaction samples after dilution. Middle) Total ion chromatogram of m/z 175.0719 MS/MS. Bottom) MS/MS spectrum of xylose aminooxazoline at RT 1.023 min.

Figure S12. Extracted HPLC chromatograms at 360 nm absorbance of formose reaction timepoints containing 0 mM initial cyanamide. Formose reaction timepoints were derivatised with DNPH followed by HPLC analysis. The chromatograms are colour-coded based on which phase of the formose reaction they belong to. Blue chromatograms represent the lag phase, green the exponential phase, and yellow the degradation phase following the yellowing point. Without initial cyanamide, the formose reaction quickly reached the exponential phase, producing noticeable amounts of C3-C6 sugars, evident in the increased peak areas. After the yellowing point (~6 min), the peaks gradually decreased in intensity. All chromatographic peaks shown display an absorption spectrum consistent with DNPH derivatisation.

Figure S13. Extracted HPLC chromatograms at 360 nm absorbance of formose reaction timepoints containing 7 mM initial cyanamide. Formose reaction timepoints were derivatised with DNPH followed by HPLC analysis. The chromatograms are colour-coded based on which phase of the formose reaction they belong to. Blue chromatograms represent the lag phase, green the exponential phase, and yellow the degradation phase following the yellowing point. With 7 mM initial cyanamide, the lag phase of the formose reaction was extended by ~4 minutes. The amounts of C2-C6 sugars produced during the exponential phase were noticeably lower than in the formose reaction without cyanamide (Figure S12). After the yellowing point (~10 min), the peaks gradually decreased in intensity. All chromatographic peaks shown display an absorption spectrum consistent with DNPH derivatisation.

Figure S21. Extracted ion chromatograms (EICs) of C3H4N2O (2-NH2Ox) with 0, 1, and 2 13C isotopes in formose reaction timepoints containing 7 mM initial cyanamide and 100 mM 13CH2O. A mass range of +/-10 ppm, based on found mass, was applied for each EIC. Non-labelled C3H4N2O (observed m/z 85.0396) increased only slightly from the start to the end of the observation window. In contrast, the concentration of doubly 13C-labelled C3H4N2O (observed m/z 87.0456) started at 0 and increased to become the dominant species, demonstrating that the majority of C3H4N2O was produced from formose-derived glycolaldehyde. Trace amounts of singly 13C-labelled C3H4N2O (observed m/z 86.0422) were also detected.
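
As a rough illustration of what a +/-10 ppm extraction window means at these masses, the sketch below computes the window around a found m/z and the nominal shift expected per 13C label for a singly charged ion. The function names are hypothetical, and the 13C/12C mass difference (about 1.00335 Da) is a standard value rather than something taken from this work.

```python
PPM = 1e-6
C13_SHIFT = 1.0033548  # mass difference between 13C and 12C, in Da


def eic_window(found_mz, tol_ppm=10.0):
    """Lower and upper m/z bounds of an EIC centred on the found mass."""
    half_width = found_mz * tol_ppm * PPM
    return found_mz - half_width, found_mz + half_width


def expected_labelled_mz(unlabelled_mz, n_13c):
    """m/z expected after replacing n_13c carbons with 13C (charge 1)."""
    return unlabelled_mz + n_13c * C13_SHIFT


# Observed m/z values quoted in the Figure S21 caption.
for label, mz in (("unlabelled", 85.0396),
                  ("singly labelled", 86.0422),
                  ("doubly labelled", 87.0456)):
    lo, hi = eic_window(mz)
    print(f"{label}: {lo:.4f} to {hi:.4f}")
```

At m/z values near 100 the window is under 0.001 on either side of the found mass, narrow enough that the three isotopologues, spaced roughly 1 apart, are extracted into separate traces.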

Figure S23. Extracted ion chromatograms of C3H6N2O2 (interpreted as the 2-NH2Ox hydrate) in formose reaction timepoints containing 7 mM cyanamide and 1 mM initial glycolaldehyde-1,2-13C2. A mass range of +/-10 ppm, based on found mass, was applied for each EIC. The concentration of the doubly 13C-labelled species (observed m/z 105.0561), which formed predominantly at the start, decreased continuously throughout the experiment. The unlabelled species (observed m/z 103.0493) started at 0 and reached maximum concentration during the exponential phase (~8 min). For the rest of the experiment, the unlabelled species decreased, which could explain the increase in 2-NH2Ox, which is more stable due to aromaticity.

Figure S24. Extracted ion chromatograms of C4H6N2O2 (C3-cyanamide adduct anhydrate) in formose reaction timepoints containing 7 mM cyanamide and 1 mM initial glycolaldehyde-1,2-13C2. A mass range of +/-10 ppm, based on found mass, was applied for each EIC. The structures shown in the scheme above are proposed and consistent with the observed masses. See Table S1 for proposed structures of aminooxazole derivatives. The concentration of the doubly 13C-labelled species (observed m/z 117.0561), which formed predominantly at the start, remained relatively constant. The unlabelled species (observed m/z 115.0497) started at 0 and increased significantly at the onset of the exponential phase. After the yellowing point, the rate of increase slowed down.

Figure S26. Extracted ion chromatograms of C5H8N2O3 (C4-cyanamide adduct anhydrate) in formose reaction timepoints containing 7 mM cyanamide and 1 mM initial glycolaldehyde-1,2-13C2. A mass range of +/-10 ppm, based on found mass, was applied for each EIC. The concentration of the doubly 13C-labelled species (observed m/z 147.0663), which formed predominantly at the start, remained relatively constant. The unlabelled species (observed m/z 145.0592) started at 0 and increased significantly at the onset of the exponential phase. After 10 minutes, the concentration plateaued.

Figure S28. Extracted ion chromatograms of C6H10N2O4 (C5-cyanamide adduct anhydrate) in formose reaction timepoints containing 7 mM cyanamide and 1 mM initial glycolaldehyde-1,2-13C2. A mass range of +/-10 ppm, based on found mass, was applied for each EIC. See Table S1 for proposed structures of aminooxazole derivatives. The concentration of the doubly 13C-labelled species (observed m/z 177.0776), which formed predominantly at the start, remained relatively constant. The unlabelled species (observed m/z 175.0702) started at 0 and increased significantly at the onset of the exponential phase. After 10 minutes, the rate of increase slowed down.

Figure S31. Extracted HPLC chromatograms at 360 nm absorbance of formose reaction timepoints initially containing 4 mM 2-aminooxazole. The chromatograms are colour-coded based on which phase of the formose reaction they belong to. Blue chromatograms represent the lag phase, green the exponential phase, and yellow the degradation phase. With 4 mM initial 2-NH2Ox, the changes in product distribution as a function of time were similar to those of the formose reaction without any cyanamide. After the yellowing point (~6 min), the peaks gradually decreased in intensity.

Figure S32. Extracted ion chromatograms of C3H4N2O (2-NH2Ox) in formose reaction timepoints initially containing 4 mM 2-aminooxazole. A mass range of +/-10 ppm, based on found mass, was applied to each EIC. [2-NH2Ox] started to increase after 2 minutes of heating and plateaued at 6 minutes. It is hypothesised that the increase in [2-NH2Ox] resulted from the breakdown of unstable adducts of 2-NH2Ox and components of the formose reaction.

Figure S33. Extracted ion chromatograms of C5H8N2O3 (2-NH2Ox-glycolaldehyde adduct) in formose reaction timepoints initially containing 4 mM 2-aminooxazole. A mass range of +/-10 ppm, based on found mass, was applied to each EIC. The intensity of all observed peaks plummeted once the mixture was heated, suggesting that the adducts were unstable. Only the peak at ~1.1 minutes retention time was observed to gradually increase over the course of the experiment.

Figure S35. Time-course experiment of formose reaction mixtures initially containing 100 mM CH2O, 1 mM glycolaldehyde, 0.2 M NaOH, 30 mM calcium acetate, and 10 mM NaCN, heated to 50 °C. Sodium cyanide was mixed with CH2O first in the first time course (top row) and with glycolaldehyde first in the second (bottom row). The mixtures in both experiments reached the yellowing point at ~8-9 minutes, suggesting that this concentration of sodium cyanide does not significantly affect the formose reaction kinetics.

Table S2. Composition of glycolaldehyde-containing mixtures B. The components were added in order from left to right into a 15 mL centrifuge tube. Following the addition of 1 M Ca(OAc)2, the mixtures were vortexed until homogeneous (~4 seconds).

Table S3. Structures of aminooxazole and aminooxazoline derivatives proposed to form from the cyanamide-mediated formose reaction.