Pseudouridine and N1-methylpseudouridine as potent nucleotide analogues for RNA therapy and vaccine development

Modified nucleosides are integral to modern drug development, serving as crucial building blocks for creating safer, more potent, and more precisely targeted therapeutic interventions. Nucleobase modifications often confer antiviral and anti-cancer activity on monomers. When incorporated into nucleic acid oligomers, they increase stability against enzymatic degradation, extending the drugs’ lifetime within the body. Moreover, modification strategies can mitigate potential toxic effects and reduce immunogenicity, making drugs safer and better tolerated. In particular, N1-methylpseudouridine modification improved the efficacy of the mRNA coding for the spike protein of SARS-CoV-2, a crucial step in developing the COVID-19 vaccines used during the pandemic that began in 2020. This makes N1-methylpseudouridine, and its “parent” analogue pseudouridine, potent nucleotide analogues for future RNA therapy and vaccine development. This review focuses on the structure and properties of pseudouridine and N1-methylpseudouridine. Compared with DNA, RNA has greater structural versatility, different conformations, and different chemical reactivity. RNA does not strictly follow Watson–Crick pairing and contains more unusual base pairs and base triplets. This calls for detailed structural studies and structure–activity relationship analyses of RNA, including when modifications are incorporated. We review recent successes in this direction and in the use of pseudouridine and N1-methylpseudouridine in mRNA drug candidates. We also highlight the remaining challenges that need to be solved to develop new mRNA vaccines and therapies.


Introduction
Deoxyribo- and ribonucleic acids (DNA and RNA) represent a class of natural compounds central to life on Earth. Their main functions are to store and transmit genetic information and to allow species to adapt, survive, and evolve.
Nucleosides are the monomers of DNA and RNA (the latter illustrated in Fig. 1A). They have two distinct structural elements: a nitrogen heterocycle, called the nucleobase, and a ribose carbohydrate. Each element serves as a scaffold for multiple non-covalent interactions that are vital to the biological performance of nucleic acids. The nucleobase provides π-stacking and hydrogen-bonding interactions, which form nucleic acid helices and underlie the universal genetic code.
Given the crucial biological importance of nucleic acids, interference with their synthesis and function creates powerful therapeutic interventions. This can be achieved via chemical modification of nucleosides.
Multiple nucleoside drugs with nucleobase modifications have been developed. Several examples are shown in Fig. 1B. Modified nucleoside monomers include drugs for treating viral infections (ribavirin, idoxuridine, trifluridine, brivudine) and cancer (azacitidine, fludarabine, 6-mercaptopurine, decitabine), and the immunosuppressive drug azathioprine, used in organ transplantation and autoimmune diseases.1 Due to their altered nucleobase structure, these drugs interfere with DNA synthesis, transcription and/or translation, leading to cell death.1 In oligonucleotide drugs, it is mostly ribose and phosphate backbone modifications that have been incorporated. Chemical modifications in these drugs result in enhanced stability, improved pharmacokinetics, reduced toxicity, increased target specificity, or resistance to degradation, thereby improving their therapeutic potential.2
Over the last two decades, there has been growing interest in RNA drug development. RNA differs from DNA by having a 2′-hydroxyl group in the ribose sugar, which gives rise to the unique conformational features, specific hydration, and electrostatic properties of RNA (Fig. 1A).1 Another structural feature of RNA is that the uracil nucleobase lacks the major-groove 5-methyl group present in the DNA nucleotide thymidine. As a result, RNA duplexes are more compact than B-DNA helices. Ribose nucleotides tend to adopt the C3′-endo sugar pucker, resulting in compact A-form RNA duplexes with eleven base pairs per turn.1 RNA has various alternative base pairings, such as the G:U wobble pair and the A:C pair, which differ from those in DNA. The G:U pair has an exocyclic amino group that acts as a prominent hydrogen bond donor in the minor groove, important for recognition by proteins and other ligands (Fig. 1C).3 The G:U pair occurs frequently, for example in the codon–anticodon interaction between mRNA and tRNA that reads the genetic code. The hydrogen bonding of G:U pairs depends strongly on the neighbouring pairs, ranging from very weak in a tandem configuration to strong at the end of a double strand.4 G:A pairs are ubiquitous at the termini of RNA helices and in loops and folds of tertiary structure. Among G:A, G:G and A:A pairs, sheared G:A pairs that include a Hoogsteen interaction are most common (Fig. 1C).3 RNA nucleotides form multiple interactions with nucleobases and sugars. G-tetrads and G-quartets are similar in DNA and RNA, but more diversity is found in RNA. Quadruple interactions can be formed in highly folded RNA molecules as well.3
RNA folding is complex and sensitive to temperature and buffer composition. Metal ions such as Mg²⁺ and K⁺ stabilize RNA tertiary structure.3 Because folded structures are stable, misfolding of RNA can become a severe problem, e.g. for therapeutics and vaccine candidates.3
Pseudouridine (Ψ) is an abundant post-transcriptional modification of mRNA which, together with N1-methylpseudouridine (N1-Me-Ψ), became a milestone in developing mRNA drug candidates. This review describes the Ψ and N1-Me-Ψ nucleobase analogues, their impact on RNA structure and their appealing properties in RNA vaccines and therapeutic candidates. We also highlight the most potent applications of modified mRNA achieved so far and discuss future directions and remaining challenges.

Pseudouridine
The first known and one of the most abundant RNA modifications is pseudouridine (Ψ), or 5-ribosyluracil, a post-transcriptional modification and an isomer of uridine (U) (Fig. 2A). Pseudouridine contains a C–C base–sugar bond: the uracil base is attached to the sugar by a C1′–C5 bond rather than a C1′–N1 glycosidic linkage (Fig. 2), which enhances base rotation. Moreover, its N1 ring nitrogen carries an imino proton, which acts as an additional hydrogen bond donor.5,6 The replacement of U by Ψ promotes a C3′-endo sugar conformation and increases local base stacking, thermodynamically stabilizing RNA duplexes.5,7,8,10–13 Intracellular Ψ formation is catalyzed by a class of enzymes known as pseudouridine synthases.14 Pseudouridine synthases can be classified into two main families: stand-alone pseudouridine synthases and pseudouridine synthase domains within larger proteins. Stand-alone pseudouridine synthases include TruA, found in bacteria, and TruB, found in bacteria and yeast. In eukaryotes, several pseudouridine synthase domains have been found. Eukaryotic H/ACA box small nucleolar ribonucleoproteins (snoRNPs) have a dyskerin (Cbf5) component which catalyzes pseudouridylation in rRNA, snRNA, and telomerase RNA. Nop10 is another component of H/ACA snoRNPs which is involved in the pseudouridylation activity.14
In humans, multiple enzymes catalyze pseudouridylation. The RPUSD (RNA pseudouridylate synthase domain-containing) family consists of several enzymes involved in Ψ formation. RPUSD1 and RPUSD2 are involved in mitochondrial RNA pseudouridylation. The PUS family consists of enzymes with pseudouridine synthase activity towards multiple RNA species, represented by PUS1 and PUS7.
Quantitative detection of Ψ remains challenging. Therefore, its biological role is still not completely understood.

Review RSC Chemical Biology
Incorporated into mRNA, Ψ increases its stability and modifies various cellular and biological processes, such as transcription, pre-mRNA splicing and mRNA translation.14 Ψ was also found in stop codons.15
Pseudouridylation is a crucial post-transcriptional modification that influences the structural stability and function of transfer RNA (tRNA) as well. Nanopore sequencing has shown that Ψ modifications are present at diverse positions in tRNA.16 As a result of a modification in the hisT gene, the tRNA of E. coli and Salmonella typhimurium lacks Ψ38, Ψ39 and Ψ40, and the cells may show a reduced polypeptide chain elongation rate (20–25%) and a longer cell division time (30%). In addition, inhibited formation of Ψ38 and Ψ39, as a result of DEG1 gene disruption, leads to a reduced growth rate in S. cerevisiae.9
Notably, pseudouridylation of tRNA affects its methylation at other positions. In tRNAPhe of S. cerevisiae, Ψ55 positively influences the introduction of the methylated nucleotides m5U54 and m1A58.16 Aberrant pseudouridylation has also been linked to human disease.18–20 For example, mutations in the PUS1 gene, which encodes a pseudouridine synthase, are associated with mitochondrial myopathy and sideroblastic anaemia.19 Abnormal pseudouridylation has been implicated in neurological disorders.20 Altered tRNA modifications, including pseudouridylation, have been observed in various cancer types.9,10,17,18
Nucleic acid function is guided by non-covalent interactions between polynucleotides and with the cellular protein machinery.3 Therefore, the 2D and 3D structure of RNA is central to its interactions and function. Regarding the impact on RNA structure, Ψ has two hydrogen bond donors, the N1 and N3 imino protons, and two hydrogen bond acceptors (Fig. 2B). Therefore, Ψ acts as a universal nucleobase, enhancing stability when it pairs with A, G, U or C in a double helix. The relationship between Ψ-modified mRNA structure and stability is supported by multiple examples of in vitro-synthesised Ψ-containing mRNAs with higher stability, and by the increased half-life of natural Ψ-containing mRNAs, as in the eukaryotic parasite Toxoplasma gondii.21,22 NMR structural studies found that Ψ stacks better than U because the C1′–C5 glycosidic bond in Ψ has higher rotational lability than the C1′–N1 bond in U.5,23 Ψ enhances local RNA base stacking in both single- and double-stranded conformations and promotes an increase in neighbouring stacking stabilization at the nucleotide level.7,9,26,27
To investigate further the impact of Ψ on RNA structure, thermodynamic studies were conducted.5 The data revealed that Ψ stabilizes RNA duplexes when U is replaced and Ψ-A, Ψ-G, Ψ-U and Ψ-C pairs are formed. The effect of pseudouridylation is more significant at internal positions of the RNA structure, where it usually yields a more negative free energy change. The thermodynamic stabilization by Ψ is around 0.5 kcal mol⁻¹, except in 5′-CΨG/3′-GAC and 5′-GΨC/3′-CGG, where nearly 2.5 and 1.5 kcal mol⁻¹ are recorded, respectively. The contribution of Ψ is, in most cases, weaker than a typical hydrogen bond (2–10 kcal mol⁻¹), which indicates that the enhanced stability results from a different mechanism.28 This supports the hypothesis that the contribution of Ψ to stability arises from the properties of the C1′–C5 bond. Current NMR data suggest that the N1H in the 5′-CΨG/3′-GAC trimer is in a weak hydrogen bond, while there is no evidence for hydrogen bonding in the 5′-GΨC/3′-CGG trimer.5 The biological roles of Ψ, such as nonsense suppression of stop codons and non-synonymous translation, might be better understood by identifying contexts where N1H is in an active hydrogen bond.29 The 3D structures of these outlier duplexes would be interesting to investigate, also computationally, since little is known about the interactions that favour stacking.
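To put the free energy differences quoted above into perspective, a stabilization ΔΔG can be converted into a fold change in the duplex association constant through K = exp(−ΔG/RT). The following Python sketch does this for the approximate values of 0.5, 1.5 and 2.5 kcal mol⁻¹. The function name, the temperature of 37 °C and the assumption of a simple two-state duplex equilibrium are illustrative choices and are not taken from the cited studies.

```python
import math

# Gas constant in kcal mol^-1 K^-1
R_KCAL = 1.987204e-3


def fold_change_in_association(ddg_kcal_per_mol: float, temp_c: float = 37.0) -> float:
    """Fold increase in the duplex association constant K for a stabilization
    of ddg_kcal_per_mol (positive means more stable), using K = exp(-dG / RT).

    Assumes a two-state duplex equilibrium (illustrative assumption).
    """
    temp_k = temp_c + 273.15
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temp_k))


# Approximate stabilizations quoted in the text (kcal mol^-1)
examples = [
    ("typical pseudouridine substitution", 0.5),
    ("outlier context, about 1.5", 1.5),
    ("outlier context, about 2.5", 2.5),
]

for label, ddg in examples:
    print(f"{label}: {fold_change_in_association(ddg):.1f}-fold higher K")
```

Under these assumptions, a stabilization of about 0.5 kcal mol⁻¹ corresponds to roughly a twofold increase in the association constant at 37 °C, whereas the two outlier contexts correspond to increases of more than tenfold.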
Ψ-A, Ψ-G, Ψ-U and Ψ-C base pairs have shown enhanced base stacking as well as sequence context dependence.5,6,30 High-level quantum mechanical (QM) methods have shown a clear dependence of the change in base stacking energies on the sequence context, with a range from −1.59 to 0.23 kcal mol⁻¹. However, the change from a U-A to a Ψ-A base pair does not always stabilize the stacking interactions of the duplex.6,30 Yet experimental data for internal and terminal Ψ-A pairs have shown greater stability than predicted data for U-A pairs in the same duplexes, on average by 1.7 and 1.0 kcal mol⁻¹, respectively.30
Terminal Ψ-A, -C, -G and -U pairs show stability enhancement at both the 5′ and 3′ termini, in a similar way to U-A, -C, -G and -U pairs, respectively. Curiously, Ψ pairs at the 3′ terminus reach slightly higher energies than the U pairs, except for the Ψ-A pair. The Ψ-C pair contributes more stability than the U-C pair only when added at the 3′ terminus. On the other hand, only the Ψ-U pair is more stable than the corresponding U pair at the 5′ terminus. Similarly, Ψ-G pairs give higher stabilization at the 3′ terminus (3.22 kcal mol⁻¹) than U-G pairs (2.44 kcal mol⁻¹). Finally, any internal Ψ pair enhances the thermodynamic stability of RNA duplexes. Mainly Ψ-A and Ψ-G pairs show large energy gains, within 0.4–0.8 kcal mol⁻¹, when compared to A-U and G-U pairs, perhaps due to the favourable base stacking and hydrogen bonds of Ψ.5,31

RNA N1-methylpseudouridine structure
N1-Methylpseudouridine (N1-methyl-Ψ) is another naturally occurring RNA nucleotide analogue,32 which carries a methyl group at the N1 position of the nucleobase (Fig. 2C). This N1-modified structure can be found in natural 18S rRNA and tRNA, not only in humans but also in archaea and eukaryotes. Research shows that RNAs of Thermococcales and Nanoarchaea33 include N1-methyl-Ψ.34,35 The N1-methyl-Ψ biosynthetic pathway begins with the conversion of U to Ψ, catalyzed by the pseudouridine synthase Pus10.36,37 The S-adenosylmethionine (SAM)-dependent pseudouridine N3-methyltransferase YbeA in eubacteria38,39 and the N1-specific pseudouridine methyltransferase Nep1 are two enzymes that can further methylate Ψ. Nep1 catalyses the N1-specific pseudouridine methylation of position 1191 (Saccharomyces cerevisiae numbering, nucleotide 913 in M. jannaschii) in RNA.40,41
Ψ- and N1-methyl-Ψ-modified RNA duplexes have different stabilities, which has been confirmed by multiple thermal denaturation studies.5 N1-Methyl-Ψ differs from Ψ by having a methyl group in place of the extra hydrogen at the N1 position, so it no longer has the universal base character of Ψ (Fig. 2). N1-Methyl-Ψ forms only the traditional Watson–Crick pair, but both Ψ and N1-methyl-Ψ have the C5–C1′ bond that allows rotation between the nucleobase and the sugar, achieving better base pairing, base stacking and duplex stability.42 A recent molecular dynamics study shows that N1-methyl-Ψ stabilizes dsRNA more strongly than Ψ, due to stronger stacking and base-pair interactions. Moreover, the N1-methyl-Ψ:A pair has a stronger binding interaction than both the U:A and Ψ:A pairs in the majority of neighbouring contexts.43 Overall, N1-methyl-Ψ behaves like uridine in translational coding but, like Ψ, keeps the mRNA from triggering the immune response, which is of critical importance to RNA vaccines and therapies, as described below.

mRNA drug development
mRNA has the potential to treat a vast spectrum of diseases and to act as a vaccine, by producing the desired protein in vivo and acting as an adjuvant.44 Key steps in mRNA drug development include sequence design, chemical modification, formulation, testing in vitro and in vivo, and finally, clinical trials.
Currently there are two approved mRNA vaccines, by Moderna and Pfizer.45,46 According to the FDA (January 2024), there are 56 mRNA drugs in the clinical pipeline worldwide, with R&D mainly focused on vaccines, which account for about 84%, while therapeutic drugs account for about 16%. Except for the mRNA COVID-19 vaccines, which were brought to market urgently, most are in the early stages.47
Over the last decade, therapeutic mRNAs and mRNA vaccines have encountered several major obstacles. First, long single-stranded (ss) or double-stranded (ds) RNAs in the cytosol are commonly derived from the genomes of RNA viruses or from intermediates generated during viral replication, and they therefore trigger an immune response.48 Introducing Ψ and N1-Me-Ψ has been a breakthrough in overcoming this challenge. Immune responses to RNA and their inhibition with Ψ are illustrated in Fig. 3. The Toll-like receptors TLR7 and TLR8 recognize ssRNA. These TLRs preferentially recognize polyuridine (polyU) and guanosine/uridine-rich (GU-rich) sequences. TLR7 and TLR8 also recognize RNA degradation products and require free guanosines and uridines, respectively, for maximum activation. Ψ and N1-methyl-Ψ can optimise mRNA performance through reduced immunogenicity and effective protein translation, with Ψ reducing binding to TLRs.
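Because polyU and GU-rich stretches are the preferred TLR7/8 ligands, a first look at a candidate sequence often starts with simple uridine statistics. The Python sketch below counts uridines, reports the GU fraction and lists polyU runs. The function name, the example sequence and the run-length threshold of four are hypothetical choices made for illustration; the output is descriptive only and is not a validated predictor of TLR activation.

```python
import re


def uridine_summary(rna: str, min_run: int = 4) -> dict:
    """Summarise the uridine content of an RNA sequence.

    min_run is an arbitrary illustrative threshold for reporting polyU runs.
    DNA-style input is accepted and T is read as U.
    """
    seq = rna.upper().replace("T", "U")
    if not seq or set(seq) - set("ACGU"):
        raise ValueError("expected a non-empty sequence over A, C, G, U")

    # Each run is reported as (0-based start position, run length)
    runs = [
        (m.start(), len(m.group()))
        for m in re.finditer(rf"U{{{min_run},}}", seq)
    ]
    u_count = seq.count("U")
    return {
        "length": len(seq),
        "u_count": u_count,
        "u_fraction": u_count / len(seq),
        "gu_fraction": (u_count + seq.count("G")) / len(seq),
        "poly_u_runs": runs,
    }


# Hypothetical toy sequence, not taken from any real transcript
summary = uridine_summary("AUGUUUUCGGAUUUUUUAA")
for key, value in summary.items():
    print(f"{key}: {value}")
```

The uridine count is also the number of positions that would carry Ψ or N1-methyl-Ψ if every uridine in the transcript were substituted.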
Delivery has been another main challenge for mRNA therapeutics and vaccines, one that could not be solved solely by chemical modification of mRNA. mRNA is rapidly degraded in vivo and poorly taken up by cells. Delivery systems such as lipid nanoparticles (LNPs) have high efficiency and low toxicity and are applicable to various cell types.46 LNPs have encapsulation efficiencies reaching 90% and can be targeted through surface modification.49,50
Ongoing research on LNP mRNA covers multiple potential applications. The main direction in current trials is cancer vaccines. Very recently, Moderna and Merck published successful results of a phase IIb study of a melanoma cancer vaccine.51 However, it is a limiting factor that the MC3 (DLin-MC3-DMA, also known as C12-200) cationic formulation used in the COVID-19 vaccine is patented and cannot be broadly used for other mRNA drug candidates. To overcome this, there are attempts to develop alternative formulations. Cationic lipid formulations using DOTAP (1,2-dioleoyl-3-trimethylammonium-propane), PEGylated lipids and DOPE (dioleoylphosphatidylethanolamine) are being tested.52,53
The delivery tools for mRNA are not limited to LNPs. Naturally occurring vesicles, such as exosomes, can be used to carry mRNA.52 These vesicles are derived from cells and, besides encapsulating mRNA, can facilitate cell-specific targeting. The encapsulation efficiency of exosomes is lower than that of LNPs; however, exosomes are less toxic and have higher target specificity.52
Polymeric nanoparticles and dendrimers have been actively explored for mRNA delivery.53 Polymeric nanoparticles, such as those made from polyethyleneimine (PEI), poly(lactic-co-glycolic acid) (PLGA), or chitosan, can form complexes with siRNA through electrostatic interactions. These nanoparticles protect mRNA from degradation and can enhance cellular uptake.53,54,56–58 They are extensively reviewed elsewhere.
Overall, there is no universal solution to all nucleic acid delivery tasks. Choosing the most suitable delivery system depends on the specific requirements of the therapeutic application, including the targeted tissue, desired release kinetics, and safety considerations.

N1-Methylpseudouridine impact on COVID-19 mRNA vaccine
A key principle of mRNA-based vaccination is that, at low dosage, non-modified mRNA encodes the antigen while also acting as an adjuvant (Fig. 4). Restricted to a maximum dose of 12 μg by patients' tolerance in late-stage clinical trials, the unmodified mRNA vaccine CVnCoV achieved only 48% efficacy, regardless of disease severity.45 In contrast, the N1-methyl-Ψ-modified mRNA vaccines from Pfizer-BioNTech (30 μg) and Moderna (100 μg) demonstrated a protection rate of around 95% against COVID-19.45 Vaccines composed of unmodified and modified mRNA were compared after injection into the muscle of the upper arm.45 An immune response is created in both cases, and N1-methylpseudouridylated mRNA demonstrates over 90% efficacy against COVID-19 symptoms, much higher than the unmodified mRNA, whose efficacy is below 50%.
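The efficacy percentages compared above follow the standard definition of vaccine efficacy as the relative reduction in attack rate, VE = 1 − (attack rate in vaccinated)/(attack rate in placebo). The Python sketch below illustrates the arithmetic. The case counts and group sizes are hypothetical numbers chosen to reproduce efficacies of about 95% and 48%; they are not the data of the trials cited here, and real trials typically work with person-time incidence and report confidence intervals.

```python
def vaccine_efficacy(cases_vaccine: int, n_vaccine: int,
                     cases_placebo: int, n_placebo: int) -> float:
    """Vaccine efficacy as 1 minus the ratio of attack rates.

    Simplified sketch: uses cumulative attack rates rather than
    person-time incidence.
    """
    if n_vaccine <= 0 or n_placebo <= 0:
        raise ValueError("group sizes must be positive")
    if cases_placebo <= 0:
        raise ValueError("efficacy is undefined without placebo cases")
    attack_rate_vaccine = cases_vaccine / n_vaccine
    attack_rate_placebo = cases_placebo / n_placebo
    return 1.0 - attack_rate_vaccine / attack_rate_placebo


# Hypothetical counts: (cases_vaccine, n_vaccine, cases_placebo, n_placebo)
HYPOTHETICAL_TRIALS = {
    "modified mRNA (hypothetical counts)": (10, 10_000, 200, 10_000),
    "unmodified mRNA (hypothetical counts)": (104, 10_000, 200, 10_000),
}

for name, counts in HYPOTHETICAL_TRIALS.items():
    print(f"{name}: {vaccine_efficacy(*counts):.0%}")
```

With equal group sizes, the efficacy reduces to one minus the ratio of case counts, which makes clear how large the gap between roughly 48% and 95% protection is in terms of cases.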
Clearly, N1-methyl-Ψ has a different impact on protein translation than Ψ. According to Kim et al. (2022), N1-methyl-Ψ does not necessarily change the decoding accuracy in a reconstituted system.59 It neither increases the probability of miscoded peptides nor stabilises the mismatched RNA duplex formed. It only shows a slight tendency to increase errors during reverse transcription.

Remaining challenges in therapeutic mRNA development
mRNA development needs to be supported by research on mRNA structure, since structure is closely related to mRNA function. Nonetheless, mRNA structure prediction is still unreliable, even when thermodynamic stability and evolutionary covariation information are combined.60 Conserved mRNA structure could be predicted by combining three features: significant covariation, negative evolutionary information, and a plethora of probabilistic folding algorithms which incorporate the positive covariations into a single structure.39
Another challenge is how to produce long exogenous mRNA on a large scale while incorporating N1-methyl-Ψ flexibly. Taking inspiration from nature and utilizing enzymes is an effective method. In this approach, fragments can become the building blocks of plasmids for COVID-19 vaccine synthesis. According to past studies, T7 polymerase can effectively synthesize RNAs longer than 20 000 nucleotides without error61,62 while tolerating non-natural NTPs. m1Ψ triphosphate can be incorporated into RNA by polymerases, providing long modified RNA molecules (Fig. 5).63 This allows the production of large quantities of long modified RNA, which is of tremendous benefit to RNA drug development and commercialization.

Conclusions
Overall, mRNA therapy is a cutting-edge medical approach that uses synthetic mRNA molecules to prevent and treat various conditions. This technology has gained significant attention and recognition, particularly due to its role in the development of COVID-19 vaccines.45,65,66 By introducing synthetic mRNA into the body, it is possible to instruct cells to produce specific proteins that can correct or combat disease. As to vaccine development, mRNA technology gained widespread attention during the COVID-19 pandemic, when companies like Pfizer-BioNTech and Moderna developed highly effective COVID-19 vaccines using this approach.45 The vaccines contain mRNA that encodes the spike protein of the SARS-CoV-2 virus, which triggers an immune response, providing protection against the virus.45
One of the advantages of mRNA therapy is its rapid development and adaptability. Creating synthetic mRNA for a new target can be faster and more scalable than traditional vaccine or drug development processes.66 Despite its promise, mRNA therapy faces challenges, including ensuring the stability and delivery of mRNA molecules to target cells, managing immune responses, and addressing potential side effects. These pitfalls can be overcome by using chemical analogues of RNA nucleotides. Pseudouridine (Ψ) is an important modification that is naturally conserved in RNA structure. Studies showed that Ψ modification allows mRNA to evade intrinsic immune responses,67 and Ψ derivatives could further improve mRNA properties, such as stability and translation efficiency.
To apply this technology to vaccine and therapy development, deeper knowledge of the impact of Ψ and its analogues on RNA structure is required. Computational approaches, thermodynamics and structural investigations with, e.g., NMR would be significant next steps in this direction.

Fig. 1 Chemical structures of natural RNA nucleotides (A), examples of nucleoside drugs (year of FDA approval) (B), and wobble RNA base pairs: GU and GA (C).

Fig. 2 Chemical structure of Ψ (A), its base pairing with G, A and U nucleobases (B); chemical structure of N1-methyl-Ψ (C).