Route Efficiency Assessment and Review of the Synthesis of β-Nucleosides via N-Glycosylation of Nucleobases

[…] of β-nucleosides by N-glycosylation […] sustainability assessment of these routes via an E-factor analysis. Our […] the current methods and protocols […] in general, laborious and inefficient. […] yields achieved in many cases, at the cost of long routes, leading to high overall E-factors (primarily composed of solvent contributions). Shorter routes using fewer protecting groups perform better regarding their route E-factors, […] yields in […] available protecting […] methods bypass these limitations but suffer from poor substrate solubility and unfavorable reaction equilibria. To enable more efficient and sustainable nucleoside synthesis via N-glycosylation, future efforts should focus on using non-chromatographic purification steps, running shorter routes and higher substrate loading to minimize (solvent) waste accumulation.


Introduction
Nucleosides are highly functionalized biomolecules essential to life on earth and were among the first organic molecules on our planet. [1] Now, β-nucleosides serve as the building blocks of DNA and RNA, as part of cellular energy transfer systems, and as enzymatic cofactors in all known organisms on earth. In recent years, nucleoside analogs have become indispensable as pharmaceutical agents against various cancers as well as viral infections and as molecular biology tools. [2][3][4] For example, fluorinated nucleoside analogs such as floxuridine and islatravir are used for the treatment of colorectal cancer and HIV infections, respectively. [5,6] Alkyne-containing nucleosides such as 5-ethynyluridine have been broadly applied for the labelling of nucleic acids, including the analysis of RNA synthesis and visualization of cellular localization. [4] Consequently, the demand for these molecules in nearly all areas of life science has necessitated the development of chemical methods for their synthesis. More than six decades of research in the field have yielded a variety of robust methods to access these compounds.
Nucleoside synthesis is generally performed in a convergent manner via N-glycosylation of a nucleobase, which installs a ribosyl moiety on a heterocyclic base (Scheme 1). Although the glycosylation of a nucleobase as a key step may appear rather simple at first glance, it is complicated by challenges in regio- and diastereoselectivity. [7] These issues typically arise from the low nucleophilicity of nucleobases as well as the density of functional groups decorating the ribosyl moiety. Thus, the desired linkage of a nucleobase to the anomeric center of a ribosyl moiety to yield a β-nucleoside often competes with several side reactions, including unselective nucleophilic attack (forming the α-nucleoside) and attack of other nucleophilic functional groups, affording complex mixtures of products. To address these obstacles, a variety of creative approaches have been developed to prepare β-nucleosides in high yield and selectivity. However, these methods vary drastically regarding their strategy, number of steps, yield, reagents, and conditions employed, making it difficult to compare and evaluate different approaches.
Scheme 1. Convergent synthesis of nucleosides with N-glycosylation as the key step. N-heterocyclic nucleobases such as pyrimidines and purines with variable substitution patterns (blue circles) can be accessed directly from cheap precursors via condensation reactions while sugar synthons need to be prepared by multistep routes from unprotected sugars.
Driven by the increasing global effort to establish a sustainable economy, most branches of chemistry have questioned their practices and aimed to design "greener" processes and reagents. [8,9] Among others, these efforts include the use of sustainably sourced solvents (and recycling thereof), the development of more concise and high-yielding routes to pharmaceuticals and the establishment of waste-minimizing cascade syntheses, to name just a few. The pharmaceutical industry (and related fields) has been quick to adopt green chemistry principles and environmental concerns are increasingly recognized in this area. [10] Subsequent research has made a wealth of information publicly available to benchmark and predict various metrics of sustainability and efficiency of chemical syntheses. [11][12][13][14][15][16][17][18][19] Several comprehensive assessments of different routes for the preparation of pharmaceutically relevant target molecules have been published, which have provided further insights into the pitfalls of some synthetic approaches and highlighted successful strategies from a sustainability perspective. [20][21][22][23][24][25] Beyond that, assessments of individual newly published reactions or approaches versus established methods have become a common sight in the literature and continue to provide a valuable and critical evaluation of the state of the art. [26][27][28][29][30][31][32][33][34][35][36][37][38][39][40][41][42][43][44][45] However, most of the above approaches focus either on i) drug-like molecules and, consequently, the heterocycle chemistry commonly involved or ii) single transformations. To the best of our knowledge, a comprehensive assessment of sustainability or efficiency for glycosylation-type chemistry is missing from the literature. 
Spurred by this lack of information, we were curious which of the available methods for nucleoside synthesis would be most compatible with a low-waste economy and most efficient in the sense of resource usage.
Since nucleosides will undoubtedly continue to be central to all areas of life science, we aimed to provide an honest evaluation and specifically investigated which routes to nucleosides would yield the most efficient and sustainable synthesis. Rather than starting an assessment at a randomly chosen synthon, we opted to consider the entire route necessary for a given strategy. Since preparation of the sugar synthon is generally the most time- and material-intensive part of these routes, [7] we assumed that all routes had to start from unprotected readily available materials. To this end, we surveyed the literature for glycosylation methods for the synthesis of β-nucleosides and extracted experimental data and protocols for several nucleoside examples to calculate representative environmental factors (E-factors, EF) [9,46–50] for the entire synthetic routes. The E-factor is a mass-based metric to assess the amount of waste produced during a synthetic process or route, where $EF = m_{\mathrm{waste}} / m_{\mathrm{product}}$, with $m_{\mathrm{product}}$ being the weight of the pure product and $m_{\mathrm{waste}}$ being the mass of all materials involved in a synthesis that are not the product. In our analysis, we considered both the simple E-factor (sEF) which, as pioneered by Sheldon, [9] only considers the reagents used, as $sEF = (\sum m_{\mathrm{reagents}} - m_{\mathrm{product}}) / m_{\mathrm{product}}$, as well as the complete E-factor (cEF) [9] which considers all materials used, as $cEF = (\sum m_{\mathrm{reagents}} + \sum m_{\mathrm{auxiliaries}} + \sum m_{\mathrm{solvents}} - m_{\mathrm{product}}) / m_{\mathrm{product}}$, where $\sum m_{\mathrm{reagents}}$ is the sum of the masses of all reagents (including starting materials), $\sum m_{\mathrm{auxiliaries}}$ is the sum of the masses of all auxiliary materials (such as inorganic salts and silica gel, for example) and $\sum m_{\mathrm{solvents}}$ is the sum of the masses of all solvents, including organic solvents and water. Considering the number of reactions and routes we aimed to assess, as well as their heterogeneity, we herein opted for the use of the E-factor as a simple and accessible metric. A full life cycle assessment [51][52][53] would have far surpassed the scope of this work and, given the (sometimes) incomplete literature data, also proven unrealistic.
Likewise, we did not consider energy consumption (e.g. for inclusion in an E⁺-factor [54]) as these data were not available from the literature.
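The two metrics can be illustrated with a short calculation. The following is a minimal sketch, assuming that all masses are totals over the entire route and share one unit; the class name `RouteMasses`, the function names and the example numbers are hypothetical and are not taken from our dataset.

```python
from dataclasses import dataclass


@dataclass
class RouteMasses:
    """Total input masses for one route, all in the same unit (e.g. grams)."""
    reagents: float     # reagents, including starting materials
    auxiliaries: float  # auxiliary materials, e.g. inorganic salts, silica gel
    solvents: float     # organic solvents and water
    product: float      # isolated pure product


def simple_e_factor(m: RouteMasses) -> float:
    """sEF: only reagents count as inputs; the product mass is not waste."""
    return (m.reagents - m.product) / m.product


def complete_e_factor(m: RouteMasses) -> float:
    """cEF: reagents, auxiliaries and solvents all count as inputs."""
    total_input = m.reagents + m.auxiliaries + m.solvents
    return (total_input - m.product) / m.product


# Hypothetical masses, for illustration only.
example = RouteMasses(reagents=12.0, auxiliaries=30.0, solvents=2500.0, product=1.0)
print(simple_e_factor(example))    # 11.0
print(complete_e_factor(example))  # 2541.0
```

In this made-up example, the same route looks modest by sEF and very wasteful by cEF, because only the latter counts solvents and auxiliaries.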
This article reviews the state of the art for β-nucleoside synthesis via N-glycosylation and provides an assessment of the performance and efficiency of all available methods and their routes. Since the last comprehensive review of nucleoside synthesis by Vorbrüggen, [7] there have been some notable additions to the toolbox, which we briefly introduce along with some general considerations. We further present a collection of 80 route E-factors (covering up to 11 total steps) which were extracted from over 30 papers using 12 different N-glycosylation methods. Our data highlight prominent sources of waste, reveal the inefficiency of some strategies and underscore that route and reaction design tremendously impact the overall E-factor of the synthesis. Based on these findings, we outline current obstacles and bottlenecks which future synthetic efforts should seek to address. Lastly, our freely available data allow straightforward benchmarking of future syntheses to evaluate their efficiency.

Nucleoside synthesis via N-glycosylation
All methods for N-glycosylation of nucleobases are united in employing a reactive (activated) glycosyl intermediate that is subjected to nucleophilic attack by a nucleobase (Scheme 2A). All approaches published to date proceed via one of three key intermediates for attack by the nucleobase (Scheme 2B). Selective attack at the anomeric center is encouraged either by i) generation of a 1,3-dioxolane cation through recruitment of the neighboring protecting group, ii) formation of a reactive glycosyl cation with charge delocalization across the ring oxygen or iii) employment of a good leaving group that sets the right configuration upon SN2-type substitution at the anomeric position. These intermediates can be accessed from various synthons, all of which typically bear one or more protecting groups and need to be prepared from (deoxy)ribose in 1−7 steps (Scheme 2C). Despite these shared basic strategies, conditions and methodologies employed by the available methods differ significantly and are generally guided by the sugar synthon employed for N-glycosylation. It should be noted that the following overview only includes methods which predominantly yield the β-nucleoside in reasonable yield and selectivity. Therefore, nucleoside syntheses which favor α-nucleosides, afford minimal yields and/or poor selectivity, or do not employ an N-glycosylation step were not included. Furthermore, methods whose reports did not include sufficient detail to reconstruct the experimental procedures are also not included below or in our E-factor assessment.
Scheme 2. Synthesis of nucleosides via N-glycosylation, including key intermediates and sugar synthons. Under varying conditions, a sugar synthon is transformed to a reactive intermediate primed for nucleophilic attack by an (activated) nucleobase, which furnishes a β-nucleoside. LG = leaving group, R = protecting group, B = nucleobase, SM = starting material, *EPP = (phenylethynylphenyl)phenyl.

Glycosyl Acetates
Vorbrüggen's classic synthesis of nucleosides built on silyl Hilbert-Johnson conditions and was originally only described for pyrimidine nucleosides. [55] A fully protected glycosyl acetate is subjected to Lewis acid catalysis, yielding a glycosyl cation intermediate upon displacement of the anomeric acetate (Scheme 3). This labile species is then reacted with a silylated nucleobase to afford a nucleoside after global deprotection. Its exceptional substrate scope, easy adaptability and reliability have made it the most popular method for nucleoside synthesis in academia and industry. Nonetheless, downsides of this method include the need for silylated nucleobases, harsh reaction conditions and the stoichiometric amounts of Lewis acid generally used. [56][57][58][59]
Scheme 3. Nucleobase glycosylation with glycosyl acetates.

Halogenoses
Direct glycosylation of nucleobases with halogenoses can be achieved through nucleophilic substitution at the anomeric center. This method is particularly popular for the synthesis of 2'-deoxy nucleosides that are otherwise difficult to access due to the lack of anchimeric assistance. Halogenoses can be accessed from the respective methoxyriboside and are subject to direct nucleophilic attack by the nucleobase (Scheme 4). To this end, different methods for nucleobase activation, including silylation and deprotonation by strong bases, have been employed. [60][61][62][63] However, the functional group tolerance of this approach is hampered by the harsh conditions needed for this transformation. Further, the regioselectivity for purine nucleoside synthesis, as well as glycosylation yield, are generally limited. Despite its shortcomings, glycosylation with halogenoses offers a favorable atom economy compared to other methods.
Scheme 4. Nucleobase glycosylation with halogenoses.

o-Hexynylbenzoates
Spurred on by the difficulty of glycosylating purine bases via Vorbrüggen-type conditions, Yu and colleagues developed a glycosylation method based on ortho-hexynylbenzoic esters. [64] Under gold catalysis, the benzoic ester at the anomeric position reversibly rearranges to an isocoumarin scaffold and yields a glycosyl cation (Scheme 5). This effectively minimizes competition of the leaving group with the nucleobase for attack at the anomeric center, enabling productive attack by weak nucleophiles such as purine nucleobases. Although this method allows glycosylation under mild conditions without the use of stoichiometric activating reagents, it demands a long reaction sequence and extensive use of protecting groups on both the sugar and the nucleobase.
Scheme 5. Nucleobase glycosylation with o-hexynylbenzoates.

o-(1-Phenylvinyl)benzoates
A similar strategy that eliminates competition of the leaving group with the nucleobase relies on irreversible sequestration of a vinylbenzoic ester (Scheme 6). [65] Esterification of tribenzoylated ribose with o-(1-phenylvinyl)benzoic acid gives access to a stable sugar synthon. When subjected to an iodine source, this synthon affords a glycosyl cation which can be intercepted by a silylated nucleobase. This method provides excellent glycosylation yields and regioselectivity under mild conditions. However, the need for a long reaction sequence with multiple protecting group manipulations makes this approach rather laborious.

Pentenyl Glycosides
Sequestration of an n-pentenyloxy group as an iodomethyl tetrahydrofuran to avoid competition of the leaving group with weakly nucleophilic nucleobases is a strategy developed recently by Fraser-Reid and colleagues. [67] Starting from a perbenzoylated methyl riboside, an n-pentenyl orthoester is prepared through recruitment of the 2-benzoyl group. Similar to the above approaches, treatment of this ester with an iodonium source and a silylated nucleobase affords the corresponding β-nucleoside after global deprotection (Scheme 8). While this approach also suffers from a rather long reaction sequence (7 steps from unprotected starting materials) and a strict limitation to 2'-hydroxy nucleosides, it does provide excellent stereoselectivity and few problems with regioselectivity. Nonetheless, the substrate scope is limited and the need for the extensive use of protecting groups raises concerns from an efficiency perspective.

Propargyl-1,2-orthoesters
A similar methodology to Fraser-Reid et al. was developed by Rao and coworkers [68] to facilitate high yields in the N-glycosylation of nucleobases. A propargyl-1,2-orthoester can be obtained from the respective perbenzoylated riboside through base-promoted attack of the alcohol to enable subsequent glycosylation under mild conditions. Addition of a silylated pyrimidine nucleobase under Lewis acid catalysis affords exclusively the β-anomer of the nucleoside (Scheme 9). Despite the method's promise of excellent stereo- and regioselectivity, it has only been applied to two nucleobases to date, presumably since the preparation of the sugar synthon is rather lengthy and labor-intensive.

Trifluoroacetimidate Glycosides
N-glycosylation under mild conditions can also be achieved through activation of glycosyl donors as trifluoroacetimidates, which are excellent leaving groups. [69,70] Nucleophilic attack of a silylated nucleobase provides near-quantitative yields of the thermodynamically favored β-nucleoside under Lewis acid catalysis (Scheme 10). However, this method has only been demonstrated for a small selection of pyrimidine nucleosides thus far. The extensive use of protecting groups, stoichiometric application of activating agents and unfavorable atom economy make the viability of this approach rather questionable, despite its impressive yields.

Thioglycosides
Thioglycosides have been employed in carbohydrate synthesis for various glycosylations and are notable for their versatility. Their use in nucleoside synthesis is rather rare, yet there are examples in the literature. [71] Similar to other approaches, this method relies on in situ formation of a charged five-membered ring through recruitment of the protecting group at the 2-position after treatment of the thioglycoside with a triflate source (Scheme 11). This reactive intermediate is then intercepted by an activated nucleobase to generate the favored β-anomer. While yields of the glycosylation step are good to excellent, this approach requires a rather lengthy synthesis of the sugar synthons (4−5 steps) and suffers from similar drawbacks concerning protecting groups and silylating agents to analogous methods. It is noteworthy that even 2'-deoxynucleosides can be accessed with this approach, although not in high yield or anomeric selectivity.
Scheme 11. Nucleobase glycosylation with thioglycosides.

Anhydroses
Building on Mitsunobu glycosylation conditions, Hocek and colleagues developed [72] and optimized [73] a glycosylation strategy that relies on in situ tributylphosphine-mediated formation of a monoprotected anhydrose. Subsequent nucleophilic attack of a deprotonated nucleobase at the electronically favored 1-position exclusively provides the β-anomer of the nucleoside (Scheme 12). This concise route profits from employing only one protecting group (which can be installed in one high-yielding step from D-ribose), a respectable substrate scope and moderate to good yields of only the desired anomer, which significantly simplifies purification. Drawbacks of this method are few and mainly comprise the strict limitation to 2-hydroxy sugars.

1-Phosphates
The biocatalytic synthesis of nucleosides via nucleoside phosphorylases (NPs) is well established and has recently attracted renewed interest. [74] NP-catalyzed nucleoside transglycosylations employ an easily accessible nucleoside, such as natural uridine or thymidine, as the synthetic starting point. [75] A pentose-1-phosphate is generated by phosphorolytic cleavage of the starting nucleoside and then serves as the glycosyl donor to a second nucleobase, which furnishes the nucleoside of interest (Scheme 13). [76][77][78][79][80][81][82] These reactions capitalize on their promise of mild reaction conditions, perfect regio- and stereoselectivity of NPs, and overall excellent functional group tolerance. Both ribosyl and 2'-deoxyribosyl nucleosides with various nucleobases can be accessed easily with this methodology. While the substrate scope is somewhat limited by the capabilities of the available enzymes, further expansion of the substrate scope by enzyme engineering can be expected. [83] Recent progress in the thermodynamic characterization of these reactions has further enabled robust optimization of reaction conditions by leveraging principles of thermodynamic reaction control. [79,84] Thus, considerable progress in this field may be anticipated.

Protected Nucleosides
Transglycosylation can also be achieved through heat- and Lewis acid-mediated dissociation of the nucleobase from a donor nucleoside (Scheme 14). [85][86][87] This yields an unstable glycosyl cation which can be intercepted by a silylated nucleobase, affording the nucleoside of interest after global deprotection. Owing to several drawbacks of this approach, literature examples are rare and date back to the late 20th century. These include, among others, the need to protect every functional group, the limitation to purine nucleosides, poor control of the configuration at the anomeric center, as well as exceptionally harsh and hazardous reaction conditions.

E-factor assessment of glycosylation methods for nucleobases
Considering the route length, methodology and yield of the individual steps of the available approaches, it is to be expected that these routes for nucleobase glycosylation would differ significantly regarding their efficiency and waste production.

Data collection
We sought to provide a transparent and honest assessment of the efficiency and sustainability of the available methods for nucleobase glycosylation. Therefore, we surveyed the literature for examples of applications for these approaches and calculated the E-factor for the entire routes. In all cases, performing these calculations for every example in the literature (or even every example in a given publication) would have far surpassed the scope of this work. Thus, we selected representative examples that cover multiple pyrimidine and purine nucleosides each with various functional groups, as far as available from the literature (please see Chart S1 for an overview of all nucleosides considered herein).
To generate a level playing field for all methods, we opted to have all routes start from readily available unprotected starting materials. Consequently, we assumed that all routes started either from (deoxy)ribose or a natural nucleoside. Whenever possible and provided in the literature, we considered the E-factor for the synthesis of the sugar donor employed for glycosylation based on the route and procedures reported by the authors of that paper. However, this information was not available in most cases (i.e. the explicit methodology for synthon preparation was not always reported). Therefore, we assumed that whatever synthon these authors used for their synthesis was prepared according to classic literature procedures. [88][89][90][91][92][93][94][95][96][97][98][99][100][101] Similarly, many nucleoside syntheses ended with the protected nucleoside and in those cases, we assumed that a suitable classic deprotection protocol from the literature was used. [97] To avoid favoring one method over another, we applied these same assumptions equally across the board to all methods that built on a given synthon or required a given deprotection.

Calculations
To calculate the sEF and cEF of each route over all steps, we extracted experimental details and procedures from the reports of these methods. However, many reports across different journals did not provide a sufficient description of the experimental procedures to allow a precise reconstruction of the protocol. To still permit a calculation of cEF (which includes, for example, reaction solvents and solutions for quenching and extractions), several quantities had to be estimated. We based these estimates on previous reports of this kind, [37] original papers on the matter [36,102] as well as our own experience with typical laboratory procedures. A complete and transparent description of all calculations and estimates is given in the Supporting Information.

Route cEF and sEF
As expected, the sEF and cEF of the available routes differed significantly, both within and between methods. The sEFs in our dataset of 80 route E-factors (please see the Supporting Information for details) were as low as 1.8 and as high as 73.3, with most methods scoring between 10 and 30 (all in kg waste per kg product, a unit that is omitted from here on for clarity). In contrast, the cEFs covered more than 2 orders of magnitude, with values from 165 to 42499 (Figure 1). Interestingly, some methods had E-factors in a rather narrow range, whereas others displayed significant variation between substrates/routes. Downey et al.'s anhydrose-based method is a good example of the latter. While their originally reported procedure [72] had quite unfavorable E-factors in a broad range (sEF = 11−53.8, cEF = 20195−42499, routes N1−N7 in the Supporting Information), the subsequently published improved protocol [73] fared much better, but still displayed considerable variability (sEF = 1.8−9.6, cEF = 1760−6017, routes N8−N13). On the other hand, alternative methods such as Fraser-Reid and colleagues' n-pentenyl orthoester-based procedure [67] showed little E-factor variation (sEF = 22.2−29.7, cEF = 10590−14495), except for substrates where glycosylation yield suffered tremendously (N32−N37).
In general, we were surprised to find how high most of these E-factors were. Both well-established and newly developed methods, and even biocatalytic approaches, typically had cEFs in the range of 5000−10000. This significantly surpasses many other types of transformations employed in industrial settings that typically have cEFs of less than 100 per step. [15,16] While some of this may be ascribed to the fact that nucleosides are complex molecules with a high density of sensitive functional groups, these E-factors are still comparatively high, considering the high demand and broad applicability of these compounds. Notably, none of the available methods performed well for all nucleobase substrates and/or delivered significantly lower E-factors than all other methods. Motivated by this lack of true efficiency [103] in the sense of resource usage (and consequently high waste production), we were curious to find the sources of these high E-factors and identify areas where improvement is needed. At the same time, we sought to identify strategies that worked particularly well and may be employed by future "greener" nucleoside syntheses.

Yield
Most methods in the literature for nucleoside synthesis focused on optimization of glycosylation yield as the key metric. To this end, several strategies have been developed that employ highly reactive sugar synthons or disable competition of the leaving group with the nucleobase for (re)attack at the anomeric position. In many cases, these strategies succeeded in achieving glycosylation yields upwards of 90%. However, while the yield of an individual step is certainly a critical variable, it should always be viewed in light of the entire synthesis. Indeed, we found no correlation between glycosylation yield and sEF or cEF (Figure 2A). Many of the strategies that sought to optimize glycosylation yield also required quite lengthy routes and employed large leaving groups. Consequently, both the total yield and the atom economy of these routes suffered immensely, which is reflected by these E-factors.
In contrast to glycosylation yield, we found that total yield (over the entire route) correlated negatively with route sEF and cEF, albeit only moderately (Figure 2B). This comes as no surprise, as one would generally expect a higher efficiency for high-yielding routes versus those that barely generated any product. Nonetheless, there were some interesting outliers in the literature. The glycosylation and total yields of Downey's improved anhydrose-based approach were modest by most standards (in the range of 30% and 20%, respectively), yet this method displayed some of the lowest sEFs in our entire dataset (sEF = 1.8−9.6, see the grey diamonds in Figure 2B). [73] By employing a concise route (2 steps) and managing the atom economy by using only one protecting group that could be cleaved in situ, they were able to outweigh the rather moderate yields. Obviously, these sEFs could have been even lower with higher yields under the same conditions. However, these yields were still sufficient to achieve what could be considered an efficient (lower waste) synthesis. Other methods that had higher glycosylation and total yields, for example those employing trifluoroacetimidates (> 88% glycosylation yield, N38−N43, please see the Supporting Information for details) or propargyl-orthoesters [68] (> 85% glycosylation yield, N44−N47), also necessitated longer routes and had more unfavorable atom economies, leading to much higher sEFs of 18−40. Clearly, yield is an important variable, but only to a certain extent. Even excellent yields are generally offset by cumulative reagent usage across a long route. Although yield of the key glycosylation step constitutes a bottleneck for some nucleobases, it appears that, from an efficiency standpoint, chasing maximum yields is not a fruitful strategy if it entails following longer routes.
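The interplay between step yields, route length and cumulative reagent usage can be sketched with a toy mass balance. The model below is a deliberate simplification and not the procedure used for our dataset: it assumes mass-based yields (molecular weight changes are ignored) and a fixed reagent mass per gram of substrate in every step. The function names and all numbers are hypothetical.

```python
from math import prod


def total_yield(steps):
    """Overall route yield as the product of the fractional step yields."""
    return prod(step_yield for step_yield, _ in steps)


def toy_route_sef(steps, start_mass=1.0):
    """Toy sEF for a linear route.

    steps: list of (fractional yield, reagent mass added per gram of substrate).
    Assumption: yields are mass-based, i.e. molecular weight changes are ignored.
    """
    substrate = start_mass
    inputs = start_mass
    for step_yield, reagents_per_gram in steps:
        inputs += substrate * reagents_per_gram  # reagents consumed in this step
        substrate *= step_yield                  # material carried to the next step
    product = substrate
    return (inputs - product) / product


# Hypothetical routes, for illustration only.
short_route = [(0.70, 2.0), (0.30, 2.0)]                    # 2 steps, modest yields
long_route = [(y, 2.0) for y in (0.90, 0.90, 0.85, 0.90,
                                 0.85, 0.90, 0.90)]         # 7 steps, high yields

for name, route in (("short", short_route), ("long", long_route)):
    print(f"{name}: total yield {total_yield(route):.2f}, toy sEF {toy_route_sef(route):.1f}")
```

Under these assumptions, the long route reaches roughly twice the total yield of the short route but still ends up with the higher toy sEF, because reagents are consumed in every one of its seven steps.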

Route length
The length of a route, as given by the number of total steps, [104] varied drastically among different glycosylation methods and appeared to dictate the lower bounds for possible route E-factors. The shortest available routes relied on nucleoside phosphorylases for biocatalytic glycosylation and had only one step, whereas the longest routes had around 9 to 11 steps, with several protecting group transformations. The cumulative sEFs and cEFs of all methods covered a broad range over all route lengths, and trended upwards with increasing route length (Figure 3). These data highlight two important points for consideration. Routes employing only one step tended to perform rather favorably (sEF < 11), probably because the potential for waste accumulation in a one-step route is quite limited. On the other end of the spectrum, long routes with 9−11 steps all had sEFs higher than 40 and cEFs well above 10000. While these routes still displayed great heterogeneity, it appears that waste accumulation to a certain extent was a natural consequence of the number of transformations. It should be noted, however, that short routes do not guarantee lower E-factors, as our dataset featured 22 routes with 2 steps (all from Downey et al.'s method or biocatalytic) which mostly performed favorably regarding their E-factor, but also included a few outliers with cEFs above 15000 (Figure 3). These data demonstrate that shorter routes do not necessarily translate to lower E-factors, but clearly have the potential to perform more efficiently than longer alternatives, especially if several protecting group transformations are involved.

E-factor contributions
Irrespective of yield and route length, the cEFs of all routes were mainly composed of solvent contributions (Figure 4). The reagents used throughout a synthesis, as well as inorganics, contributed only marginally to the route cEFs, as contributions from organic solvents and (for biocatalytic routes) water typically made up more than 95% of the cEF. This observation is somewhat intuitive as nucleobases and many nucleosides are generally poorly soluble in all solvents and solvents are known to be the main determinant of E-factors if they are included in the calculation. [11,28,36,37] Nonetheless, we were surprised to find that even routes which employed heterogeneous steps were dominated by solvent contributions. Biocatalytic routes were particularly plagued by the low water solubility of many nucleobases, which has so far largely restricted these syntheses to working concentrations in the low millimolar range. However, some routes which sought to prepare especially insoluble guanosine derivatives used this to their advantage to realize the lowest cEFs in our dataset. Zuffi et al.'s [81] (cEF = 200, route N29) and Ubiali et al.'s [82] (cEF = 165, route N31) one-step syntheses of 2'-deoxyguanosine from thymidine via transglycosylation employed a substrate loading which was an order of magnitude higher than other biocatalytic routes and profited from the target compound readily precipitating from the reaction mixture. Thus, the higher substrate loading in heterogeneous reaction steps appears particularly attractive for sugar or nucleobase transformations. Based on our data it could be reasoned that (beyond heterogeneous reactions) any strategy that allows higher substrate/reagent loading will result in lower E-factors. Yet even if one or multiple steps of a route can be realized heterogeneously, or with otherwise high substrate loading, solvent contributions from other steps may still be the main contributors to the cEF of that route.
Again, this underscores that the demands and opportunities of a single step need to be considered as part of the entire route, and that shorter routes offer more potential for minimizing E-factors. It should also be noted that we did not consider recycling of any solvents in this analysis. Clearly, some solvents can be and are recycled in industrial settings to reduce the net waste arising from a reaction, especially in the case of low-boiling solvents such as dichloromethane or hexane. However, other solvents like pyridine, water or acetonitrile may be harder and much more energy-intensive to recover, purify and reuse. Thus, given the heterogeneity of solvents employed and their role in the respective syntheses, we opted to consider all solvents as waste without any recycling.
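How a cEF splits into its contributions can be sketched as follows. This is a minimal illustration that treats all solvents as waste without recycling, as stated above; assigning the product mass to the reagent fraction is our assumption for this sketch, and the function name and numbers are hypothetical.

```python
def cef_contributions(reagents, auxiliaries, solvents, product):
    """Return the share of total waste attributable to each class of input.

    All masses are route totals in the same unit. The product mass is
    subtracted from the reagents (an assumption made for this sketch).
    """
    waste = {
        "reagents": reagents - product,
        "auxiliaries": auxiliaries,
        "solvents": solvents,  # no recycling considered
    }
    total_waste = sum(waste.values())
    return {name: mass / total_waste for name, mass in waste.items()}


# Hypothetical route totals in grams, for illustration only.
shares = cef_contributions(reagents=12.0, auxiliaries=30.0, solvents=2500.0, product=1.0)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

With these made-up masses, solvents account for more than 95% of the waste, which mirrors the pattern observed across the routes.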

Chromatography
Solvent contributions for many steps originated to a large extent from chromatographic purification steps. In fact, the number of chromatography steps was as good at setting the lower bound for the cEF as the total number of steps in a route, irrespective of the transformations, yields and types of workup performed (Figure 5). It should be noted that very few syntheses in our dataset included quantities for their chromatography solvents, which required us to estimate these for most of the routes discussed herein. Like Hollmann and colleagues, [37] we estimated 500 mL of solvent per gram of crude product and calculated the E-factor contributions via the density of the chromatography solvents employed (which are generally reported). Please see the Supporting Information for procedures and references. Depending on the solvent used and the crude product, this equaled a cEF contribution of around 500−1500 per chromatography step in most cases. Considering published experimental data on this issue, [36,102] and the fact that most chromatography steps in our dataset were done to isolate material from complex mixtures, this is a very conservative estimate for most syntheses analyzed here. Still, chromatography solvents dominated the cEFs of all routes that employed chromatographic purifications. Naturally, longer routes featured more chromatography steps, which is reflected in their cEFs (see e.g. N50−N52 or N56−N59). Conversely, the two routes that did not employ any chromatography steps (N29 and N31, see above) had by far the lowest cEFs. Admittedly, chromatography probably cannot be avoided altogether given the nature of the transformations required for nucleoside synthesis, but limiting chromatography steps should be a primary goal to achieve more efficient and "greener" nucleoside synthesis.
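The estimate described above can be sketched in a few lines. This is an illustration of the arithmetic only, assuming a single eluent and normalizing to the product mass isolated from that step (a simplification). The function name `chromatography_cef`, the density table and the example masses are hypothetical, and the densities are approximate room-temperature values.

```python
# Approximate solvent densities in g/mL (rounded room-temperature values).
APPROX_DENSITY_G_PER_ML = {
    "dichloromethane": 1.33,
    "ethyl acetate": 0.90,
    "hexane": 0.66,
}


def chromatography_cef(crude_mass_g, product_mass_g, solvent,
                       solvent_ml_per_g_crude=500.0):
    """Estimate the cEF contribution of one chromatography step.

    Uses the 500 mL of solvent per gram of crude product stated in the
    text and converts volume to mass via the solvent density.
    """
    if crude_mass_g <= 0 or product_mass_g <= 0:
        raise ValueError("masses must be positive")
    density = APPROX_DENSITY_G_PER_ML[solvent]
    # Solvent volume scales with the crude mass loaded onto the column.
    solvent_mass_g = solvent_ml_per_g_crude * crude_mass_g * density
    # E-factor contribution: mass of solvent waste per mass of product.
    return solvent_mass_g / product_mass_g


# Illustrative case: 1.0 g crude material giving 0.8 g purified product.
for name in APPROX_DENSITY_G_PER_ML:
    print(f"{name}: cEF contribution ~ {chromatography_cef(1.0, 0.8, name):.0f}")
```

Because the solvent volume scales with the crude mass while the E-factor is normalized to the isolated product, steps with low recovery from complex mixtures push the contribution toward the upper end of the range.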

Protecting groups
All non-biocatalytic syntheses considered herein employed protected sugar synthons, whose synthesis constitutes the most labor- and resource-intensive part of the route. Most sugar synthons need to be accessed in 3−7 steps from (deoxy)ribose through selective protection and introduction of the anomeric leaving group. Thus, the synthesis of these synthons accumulates a considerable E-factor even before the key glycosylation step. Even though yields for the required transformations are generally high to excellent, reagent usage and purification throughout these routes are reflected in the high sEFs and cEFs (Chart 1). Please note that these E-factors are somewhat skewed by the high molecular weights of the protected synthons and may not directly translate to full route E-factors, since the E-factor is a mass-based metric and the nucleoside products are generally much lighter than these synthons. Exceptions to these observations are presented by biocatalytic routes, [76−82] which require no protecting groups (shortening these syntheses by all protecting and deprotecting steps), and Downey's method, [72,73] which employs only one (albeit large) protecting group that can be installed in one step and cleaved in situ after the reaction. The only other method that consistently delivered E-factors close to these two approaches is halogenose-based glycosylation, which uses an easy-to-prepare synthon with a small leaving group. Clearly, the non-biocatalytic synthesis of nucleosides requires at least some protecting groups due to the complex arrangement of reactive functional groups in the target compounds. However, the choice of protecting groups and synthon for glycosylation should be made based on the most concise and efficient route to that synthon. Every protecting and deprotecting step that can be avoided in a synthesis typically results in lower waste production through a better atom economy and less purification effort.

Transitioning to more efficient nucleoside synthesis

Benchmarks
Based on these observations, we propose some benchmarks that a nucleoside synthesis should meet to be termed "efficient". Future synthetic efforts should seek to achieve sEFs below 10 and cEFs below 2000 in a route that takes four steps or fewer. We explicitly opt against inclusion of any recommendations regarding glycosylation or total yield, protecting groups, solvent usage or chromatographic purifications. However, a balance and improvement of all these metrics will be reflected in the E-factor. We chose to include route length as a relevant parameter, since the average step for nucleoside synthesis took roughly one day (Figure S1) and time investment in a synthesis is certainly a relevant factor. Selected routes in our dataset already meet these benchmarks (N11 and N29−N31), although only for some purines. We believe that there is potential for nucleoside synthesis to become more efficient in general by striving to meet these proposed benchmarks. To this end, there are some areas which require and deserve attention by researchers to effect immediate improvement.
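As a simple illustration, the three proposed thresholds could be checked programmatically when screening a set of routes. The sketch below is only one possible encoding: the `Route` class, the function `meets_benchmarks` and the example values are hypothetical and do not correspond to entries in the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str    # route label (hypothetical here)
    steps: int   # total number of synthetic steps
    sef: float   # simple E-factor (no solvents or water)
    cef: float   # complete E-factor (solvents and water included)


def meets_benchmarks(route, max_sef=10.0, max_cef=2000.0, max_steps=4):
    """True if the route has sEF below 10, cEF below 2000 and at most 4 steps."""
    return (route.sef < max_sef
            and route.cef < max_cef
            and route.steps <= max_steps)


# Made-up example routes for illustration only.
candidates = [
    Route("short-example", steps=1, sef=5.0, cef=200.0),
    Route("long-example", steps=7, sef=8.0, cef=1500.0),
    Route("solvent-heavy-example", steps=3, sef=6.0, cef=4500.0),
]
for route in candidates:
    verdict = "meets" if meets_benchmarks(route) else "misses"
    print(f"{route.name}: {verdict} the proposed benchmarks")
```

The E-factor limits are treated as strict upper bounds ("below"), whereas the step count is inclusive ("four steps or fewer"), following the wording of the benchmarks.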

Areas for improvement
The above data illustrate that nucleoside synthesis is currently hampered by several bottlenecks that manifest themselves in inefficient routes with high E-factors. Most notably, chromatographic purification steps present a significant source of waste in the form of solvent. Although some of this solvent may be recycled to reduce the net waste from these steps, they remain notoriously inefficient separation processes from a sustainability perspective. However, at least one chromatography step will probably be required for most target nucleosides to achieve sufficient purity, since N-glycosylation is a non-trivial transformation that (beyond the desired nucleoside) often yields several hard-to-separate byproducts. Thus, reduction of additional chromatography steps should be a central aim for all synthetic routes to the relevant sugar synthons. Whenever possible, precipitation or recrystallization steps allow tremendously lower resource investment and, therefore, lower E-factors. For some of the routes outlined above, it would also be worth considering whether some of the steps required for either sugar donor synthesis or post-glycosylation deprotection could be performed in a one-pot manner to avoid intermediary purifications. This may also help to cut down the use of solvents like dichloromethane or hexane, which serve as popular extraction and purification solvents but are recognized as environmentally concerning. [105] To facilitate these aims, applied routes should be as short as possible, since additional steps such as protecting group manipulations on the sugar moiety have the potential to render the entire route inefficient, irrespective of metrics such as glycosylation or total yield. Therefore, glycosylation approaches which employ few protecting groups and do not rely on large leaving groups (which themselves necessitate prior installation) appear to have the most potential going forward. 
Future efforts to optimize existing methods or devise new methods may therefore focus on avoiding chromatography, shortening routes and/or performing multiple transformations in one pot. Furthermore, heterogeneous reactions present an attractive strategy to cut down non-chromatography solvent waste, particularly since nucleosides are generally poorly soluble. This holds especially true for biocatalytic approaches, which are currently severely hampered by the low water solubility of both the starting materials and products, as well as unfavorable reaction equilibria for some nucleosides. To overcome these obstacles, strategies to enable increased substrate loading and equilibrium shifts in favor of the target nucleosides should be key research goals.

Conclusions
Nucleosides and their analogs are indispensable biomolecules in nearly all areas of life science. However, the available methods to prepare these compounds via N-glycosylation of nucleobases suffer from severe drawbacks, which render these routes laborious and inefficient. Our comprehensive literature survey and E-factor analysis revealed that glycosylation methods for nucleoside synthesis cover an extended range of route E-factors and that glycosylation yield is an overrated metric of efficiency. Solvents, predominantly from chromatographic purification steps, are the main contributors to cEF and a heavy reliance on protecting groups tremendously increased both the sEF and cEF of most routes. Future syntheses should seek to address these bottlenecks to enable more efficient and sustainable nucleoside syntheses.

Conflict of Interest
A. K. is CEO of the biotech company BioNukleo GmbH. F. K. is a scientist at BioNukleo GmbH and P. N. is a member of the advisory board. These affiliations constitute no conflict of interest with the results presented and discussed in this report.