CHEM21 selection guide of classical- and less classical-solvents †

A selection guide of common solvents has been elaborated, based on a survey of publicly available solvent selection guides. In order to rank less classical solvents, a set of Safety, Health and Environment criteria is proposed, aligned with the Globally Harmonized System (GHS) and European regulations. A methodology based on a simple combination of these criteria gives an overall preliminary ranking of any solvent. This enables in particular a simplified greenness evaluation of bio-derived solvents.


Introduction
The Innovative Medicines Initiative (IMI)-CHEM21 public-private partnership is a European consortium which promotes sustainable biological and chemical methodologies. 1 It comprises six pharmaceutical companies from the European Federation of Pharmaceutical Industries and Associations (EFPIA), 2 ten universities and five small to medium enterprises. CHEM21 financially supports research projects and will provide a training package to ensure that the principles of sustainable manufacturing are embedded in the education of future scientists. This education task is the main mission of CHEM21 work package 5 (WP5). For example, the training package will include a set of metrics for comparing the "greenness" of processes or syntheses. 3 In a drug substance synthesis, solvents represent at least half of the material used in a chemical process. 4 Therefore, limiting their amount and selecting the "greenest" solvents 5 are the most efficient levers to reduce the environmental impact of an active pharmaceutical ingredient. A preceding paper 6 describes a survey of publicly available solvent selection guides, 7 often from pharmaceutical companies. The data given in these guides were compiled, and where possible combined, in order to allow a ranking comparison. For the 51 classical solvents considered, an acceptable alignment could be reached, permitting a ranking into four categories: recommended, problematic, hazardous and highly hazardous. 17 solvents could not be ranked by this simplified methodology, reflecting differences in the weighting of criteria between the institutions (Table 1). Carbon disulfide (CS₂; CAS 75-15-0; volatile and highly flammable) and hexamethylphosphoramide (HMPA; CAS 680-31-9; carcinogen), which are highly hazardous and rarely used nowadays, were added to the list in order to prevent their use in laboratories.
These rankings are defined as follows:
- Recommended (or preferred): solvents to be tested first in a screening exercise, provided there is no chemical incompatibility under the process conditions.
- Problematic: these solvents can be used in the lab or in the kilo-lab, but their implementation in the pilot plant or at the production scale will require specific measures, or significant energy consumption.
- Hazardous: the constraints on scale-up are very strong. The substitution of these solvents during process development is a priority.
- Highly hazardous: solvents to be avoided, even in the laboratory.
The boundary between hazardous and highly hazardous cannot be clearly established, given that not all pharmaceutical companies and institutions have identical lists of prohibited solvents. 7c,8 This survey may be very useful for the quick selection of a solvent, in particular in academic institutions or companies which do not have their own solvent selection guides. Nonetheless, the ambition of CHEM21 is to develop a solvent guide which is not limited to a final ranking, but also presents explicit Safety, Health and Environment (SH&E) criteria, and encompasses newer solvents, such as bio-derived solvents. 9 This new guide would aid in the ranking of the seventeen "intermediate" solvents using these criteria. This task was given to a sub-team of WP5, and the outcomes are reported in this paper.

Elaboration of safety, health & environmental criteria
This ranking of the most common solvents is based on a benchmark of existing guides. Before integrating neoteric and bio-derived solvents into the CHEM21 guide, a set of criteria is needed to assess the desirability of any solvent, and this assessment must be consistent with the ranking of the commonly used solvents at the industrial level, which are fully registered in REACh. 10 Classically, solvent selection guides are mainly based on a set of SH&E criteria, to which Industrial or Regulatory constraints can be added. In this work, for simplicity, the criteria were combined in order to limit their number to three, resulting in one Safety, one Health and one Environment criterion, each scored from 1 to 10, with 10 representing the highest hazard in each category. A colour code was associated with this scoring: green for 1-3, yellow for 4-6, and red for 7-10. A final combination of these three SH&E scores should also allow a direct preliminary ranking in the three categories: recommended, problematic and hazardous. At this level, one cannot make the distinction between hazardous and highly hazardous solvents, even if a solvent with any score of 10 is a good candidate for the latter category. The safety scoring system was the easiest to establish. As process chemists are expected to know or check the compatibility of the solvent with the reagents, reactivity was not taken into account. Thus, the main hazard considered is flammability. In the European Community, the Globally Harmonized System (GHS) 11 has been integrated into the Classification, Labelling & Packaging (CLP) regulation. 12 In this system, the fire hazard is mainly based on the flash point (FP), combined with the boiling point (BP) when FP < 24°C. The safety score presented in this guide is aligned with GHS/CLP, but instead of considering the boiling point for solvents with FP < 24°C, it makes a finer distinction by introducing three subcategories (Table 2).
In order to take into account other hazards, the safety score is incremented by one if the solvent has a low auto-ignition temperature (AIT < 200°C), if it accumulates electrostatic charges (resistivity > 10⁸ Ω m) or if it easily forms explosive peroxides (hazard statement EUH019 in CLP). As noted under Table 2, any solvent with a high energy of decomposition (>500 J g⁻¹), like nitromethane, 13 would be scored 10. For example, diethyl ether, with a flash point of −45°C, an AIT of 160°C, a resistivity of 3 × 10¹¹ Ω m and a EUH019 hazard statement, has a combined safety score of 10. The health scoring system reflects the occupational hazard. The ideal would be to link it with the occupational exposure limits imposed by authorities or agencies. However, these limits are only established for the most widely used solvents or reagents, so relying on them would narrow the applicability and scope of the guide. Moreover, threshold limit values are not unified, even in Europe (Table 3). For a simplified analysis, a health scoring based on the hazard statements in the GHS/CLP system is sufficient. Even if the natures of the hazards are not directly comparable, at least the hazard level is clearly integrated in the system, as illustrated by the acute toxicity by inhalation: H330 (lethal) is worse than H331 (toxic), which is worse than H332 (harmful). Also, H314 (causes severe skin burns and eye damage) reflects a higher hazard level than H318 (causes serious eye damage), which in turn is higher than H315 (causes skin irritation). As such, a simple health scoring system based by default on the CLP statements and the GHS pictograms has been constructed. The health score of any solvent is equal to the figure corresponding to the highest hazard according to Table 4, to which one is added if the solvent's boiling point is lower than 85°C.
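As a rough illustration of the two scoring rules just described, the sketch below applies the safety increments and the boiling-point adjustment of the health score. It is not the authors' spreadsheet. The function and parameter names are hypothetical. The base safety score (Table 2) and the highest-hazard figure (Table 4) are taken as inputs because those tables are not reproduced in this text, and capping both scores at 10 is an assumption. The base score of 7 used for diethyl ether is inferred from its stated total of 10 minus three increments.

```python
from typing import Optional


def safety_score(
    base_score: int,
    ait_c: Optional[float] = None,
    resistivity_ohm_m: Optional[float] = None,
    forms_peroxides: bool = False,
    decomposition_energy_j_per_g: Optional[float] = None,
) -> int:
    """Add the three increments to a flash-point-based base score (Table 2 input)."""
    # A decomposition energy above 500 J/g is scored 10 outright.
    if decomposition_energy_j_per_g is not None and decomposition_energy_j_per_g > 500:
        return 10
    score = base_score
    if ait_c is not None and ait_c < 200:  # low auto-ignition temperature
        score += 1
    if resistivity_ohm_m is not None and resistivity_ohm_m > 1e8:  # static build-up
        score += 1
    if forms_peroxides:  # EUH019
        score += 1
    return min(score, 10)  # assumption: the scale is capped at 10


def health_score(highest_hazard_figure: int, boiling_point_c: float) -> int:
    """Highest Table 4 figure (input), plus one if the boiling point is below 85 °C."""
    bonus = 1 if boiling_point_c < 85 else 0
    return min(highest_hazard_figure + bonus, 10)  # assumption: capped at 10


# Diethyl ether, using the data quoted in the text and an inferred base score of 7.
print(safety_score(7, ait_c=160, resistivity_ohm_m=3e11, forms_peroxides=True))  # 10
```

Passing the table figures in as arguments keeps the sketch independent of the exact table contents, which should be read from the original Tables 2 and 4.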
This adjustment allows a scoring of 10 for the carcinogens benzene and 1,2-dichloroethane, 18 reflecting the higher occupational risk linked with the use of volatile solvents. In this system, water's health score is one, a value which can also be assigned to any other solvent with a BP ≥ 85°C that does not have any H3xx statements after full REACh registration. 19 It is important to bear in mind that H3xx statements are not assigned to chemicals unless toxicological data are available. In order to exclude bias toward solvents with incomplete toxicological data, a score of 5 is attributed to them by default. The proposed scoring system for the environmental impact of a solvent is still incomplete. Such an assessment should include acute toxicity towards aquatic life, bioaccumulation, the ability to generate harmful Volatile Organic Compounds (VOC), and a metric to evaluate the CO₂ impact of its synthesis, recycling and disposal. Such data are often not available, as shown by the debate on the energy balance of the so-called "bio-fuels". 20 Life cycle analysis systems have been proposed, based on multiple effects (eutrophication, global warming potential, cumulated energy demand, acute toxicity, etc.), which are sometimes combined. 21 As some of the life cycle impacts are linked to human health, and thus already integrated into the health score, we preferred to focus on criteria which are solely linked to environmental issues (ozone layer depletion, acute ecotoxicity, bio-accumulation, volatility, recyclability). As a basis for the environment ranking, a set of criteria is proposed, each scored between 1 and 10, with the highest scoring criterion dictating the final score (Table 5). The lowest score, one, is assigned to water. Decontamination of water following contact with reagents and solvents can be tedious and energy-demanding 23 but at least, when properly treated, the effluents are safe.
At the other end of the scale, solvents which are hazardous to the atmospheric ozone layer 24 (H420 in GHS: carbon tetrachloride, trichloroethylene) are scored 10.
In order to illustrate the qualitative nature of the system, only three intermediate figures are used: 3, 5 and 7. The boiling point plays an important role in the environmental impact. A low boiling solvent will generate VOCs, but on the other hand, a high boiling solvent cannot easily be recycled, and complicates the work-up and downstream unit operations such as product drying. The ideal temperature range has been set between 70 and 139°C.
The acute environmental toxicity and the bio-accumulation potential are highlighted by H4xx statements in the GHS. If such labels are present, they give a score of 5 or 7. In the absence of data, Quantitative Structure-Activity Relationship (QSAR) modelling can give an estimate of eco-toxicity, 25 for example with the freely available ECOSAR tool. 26 We did not make this choice, as the accuracy of the toxicity values generated strongly depends on how well the molecule matches the training set used. Without full environmental toxicity data, a score of 5 is set by default. If the solvent, after full REACh registration, does not have any H4xx statement, the corresponding score will be 3. Other criteria linked to the environment have not yet been included in the scoring, for the sake of simplification. For example, water solubility has not been taken into account, considering that a high solubility in water is not per se an environmental issue. The most eco-toxic solvents (heptane, cyclohexane) are scarcely soluble in water. The recycling of a water miscible solvent may require a high energy demand, whereas a benign solvent-water solution can often be treated in a water treatment plant (e.g. alcohols, acetone), even if this can be problematic. Volatile solvents can partition into air, and high concentrations of readily biodegradable solvents can lead to a high chemical oxygen demand which can be deleterious to degrading organisms. The renewable origin of the solvent also deserves to be considered, but an in-depth analysis is needed, as often, solvents which can be bio-derived are currently mostly synthesized by the petrochemical industry (e.g. methanol, n-butanol). It would also be ideal to have a simple metric to evaluate the environmental impact involved in manufacturing solvents, such as the CO₂ footprint (in kg kg⁻¹) or the Cumulative Energy Demand (CED, in MJ kg⁻¹). Both can be calculated by software such as Ecosolvent® for some solvents. 22 However, a benchmark analysis of our companies' data gave very divergent figures in some cases, and an in-depth comparison of existing computation systems is needed before integration into the scoring. The only unambiguous result is that in all simulations the synthesis of THF is the most energy demanding.
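The environment scoring rules described above (water scored 1, H420 scored 10, and otherwise the highest of a boiling-point criterion and an ecotoxicity criterion, each taking the value 3, 5 or 7) can be sketched as follows. This is only an illustration with hypothetical names. The text gives the ideal boiling range (70 to 139°C) and the defaults of 5 (incomplete data) and 3 (fully registered, no H4xx), but not the outer boiling-point bands or the split of H4xx statements between 5 and 7. The values used for those are assumptions recalled from the published Table 5 and should be checked against it.

```python
from typing import Iterable

# Assumption: split of H4xx statements between scores 7 and 5 (see Table 5).
SCORE_7_STATEMENTS = {"H400", "H410", "H411"}
SCORE_5_STATEMENTS = {"H412", "H413"}


def boiling_point_score(bp_c: float) -> int:
    """Score volatility/recyclability from the boiling point in °C."""
    if 70 <= bp_c <= 139:  # ideal range stated in the text
        return 3
    if 50 <= bp_c < 70 or 139 < bp_c <= 200:  # assumed intermediate bands
        return 5
    return 7  # assumed: below 50 °C or above 200 °C


def ecotoxicity_score(h_statements: Iterable[str], fully_registered: bool) -> int:
    """Score from H4xx statements, with the defaults stated in the text."""
    found = set(h_statements)
    score = 3 if fully_registered else 5  # 5 by default without full data
    if found & SCORE_5_STATEMENTS:
        score = max(score, 5)
    if found & SCORE_7_STATEMENTS:
        score = 7
    return score


def environment_score(
    bp_c: float,
    h_statements: Iterable[str] = (),
    fully_registered: bool = True,
    is_water: bool = False,
) -> int:
    """The highest-scoring criterion dictates the final environment score."""
    statements = set(h_statements)
    if is_water:
        return 1
    if "H420" in statements:  # hazardous to the ozone layer
        return 10
    return max(boiling_point_score(bp_c), ecotoxicity_score(statements, fully_registered))


print(environment_score(77, ["H420"]))  # 10
print(environment_score(100, fully_registered=False))  # 5
```

Taking the maximum rather than a sum mirrors the stated principle that the highest-scoring criterion dictates the result.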
Criteria concerning industrial issues which are not directly linked with SH&E have not been included, such as the cost, the security of commercial supply if the solvent has a single source, and the freezing point (some solvents are solid at 20°C and have to be melted before charging). These safety, health and environment scores can be combined in order to give a ranking by default of any solvent. As a combination based on the sum of the scores could underestimate a major issue, a ranking based on the most stringent criteria is proposed (Table 6).
This simplified analysis does not make a distinction between "hazardous" and "highly hazardous" solvents. The decision to blacklist a solvent can only be made by a company or institution after appraisal of all the available data and internal policy. It is important to note that CMR solvents of category 1 (H340, H350 or H360) have a health score of 9 or 10, which ranks them directly as "hazardous" by default. This is consistent with the CMR regulation which imposes the substitution of such solvents, or the justification for their use if substitution is not possible.
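A "most stringent criteria" combination of the three scores might look like the sketch below. The colour bands (green 1-3, yellow 4-6, red 7-10) come from the text. The combination rules themselves are in Table 6, which is not reproduced here, so the rules coded below (one score of 8 or more, or two red scores, gives "hazardous"; one score of 7, or two yellow scores, gives "problematic") are an assumption recalled from the published guide and should be verified against it. Function names are hypothetical.

```python
def colour(score: int) -> str:
    """Map a 1-10 score onto the colour code given in the text."""
    if 1 <= score <= 3:
        return "green"
    if 4 <= score <= 6:
        return "yellow"
    if 7 <= score <= 10:
        return "red"
    raise ValueError(f"score out of range: {score}")


def default_ranking(safety: int, health: int, environment: int) -> str:
    """Ranking by default from the three SH&E scores (assumed Table 6 rules)."""
    scores = (safety, health, environment)
    colours = [colour(s) for s in scores]
    # Most severe outcomes are tested first so that one major issue is never diluted.
    if max(scores) >= 8 or colours.count("red") >= 2:
        return "hazardous"
    if max(scores) == 7 or colours.count("yellow") >= 2:
        return "problematic"
    return "recommended"


print(default_ranking(3, 9, 3))  # hazardous
print(default_ranking(5, 3, 5))  # problematic
print(default_ranking(4, 1, 3))  # recommended
```

Under these assumed rules a health score of 9 or 10 yields "hazardous" whatever the other two scores are, which matches the statement above about category 1 CMR solvents.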
This methodology has been applied to the 53 common solvents (Table 7). Of the 36 solvents which have a clear ranking in the survey, the ranking by default coincides with 29 of them (81%). For 2 solvents (anisole and sulfolane 27), the ranking by default gives a more severe ranking, and for 5 solvents (1,4-dioxane, chloroform, acetonitrile, DMSO and TEA), a less severe ranking. Moreover, for the 17 solvents which did not have a clear ranking in the survey, the ranking by default is always close, except in one case (pyridine). Thus, the ranking methodology described here gives a very satisfactory alignment with the former results.
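The counts quoted above can be cross-checked with a few lines of arithmetic. The variable names are illustrative only; every number is taken from the text.

```python
# Figures quoted in the text.
surveyed = 51          # classical solvents in the survey
unranked = 17          # no clear ranking in the survey
added = 2              # carbon disulfide and HMPA, added as highly hazardous

clear = surveyed - unranked + added
agree, more_severe, less_severe = 29, 2, 5

# The three outcomes must account for every clearly ranked solvent.
assert agree + more_severe + less_severe == clear

total = clear + unranked
agreement_pct = round(100 * agree / clear)

print(clear, total, agreement_pct)  # 36 53 81
```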
However, this simplified system sometimes underestimates the health hazard, as illustrated by the cases of acetonitrile, nitromethane and pyridine: the health scores of these solvents, based on the H3xx statements, do not reflect their low occupational threshold values (Table 3) or ICH limits. 28 In-depth discussions within the CHEM21 solvent sub-team were sometimes needed to assess the ranking of the 53 solvents (Table 7). As a general rule, we decided not to modify the clear rankings given by the survey (Table 1), except in the case of sulfolane which was "recommended". Its ranking was changed to "hazardous", as a reproductive study in rats suggests that this solvent could affect the development of the unborn child. 29 As a result, sulfolane has recently been labelled H360. 30 Interestingly, though THF and Me-THF were both ranked as "problematic", the scoring methodology indicates that Me-THF offers advantages in terms of health and environment.
For the "intermediate" solvents tert-butanol, benzyl alcohol, ethylene glycol, MEK, MIBK, methyl acetate, MTBE, cyclohexane, DCM, formic acid, acetic acid and acetic anhydride, we confirmed the ranking by default. Methanol was finally ranked as "recommended", though it is ranked as "problematic" by default. As a matter of fact, despite alarming H3xx statements, the current occupational exposure limits for methanol are relatively high and consistent between authorities (Table 3), as is its ICH limit (3000 ppm). Besides, its synthesis is very short and has a low energy demand. 31 In the ketone family, acetone was ranked as "recommended" in contrast with its ranking by default. Acetone generates VOCs, but is not toxic and is readily biodegradable. Cyclohexanone was ranked as "problematic", given that its synthesis via benzene and cyclohexane is not sustainable, and in order to favour the other ketones. Pyridine and TEA were ranked as "hazardous", on the basis of their low occupational limit values.
The CHEM21 solvent guide developed is relatively well balanced, with 14 recommended, 17 problematic, and 22 hazardous or highly hazardous solvents. Furthermore, these rankings are generally (81%) in agreement with the SH&E scorings given by the simple methodology proposed.
In this methodology, the safety score may appear underestimated, with some highly flammable solvents such as acetone having a moderate safety score of 5. This does not mean that the fire hazard is neglected. The manufacture of active pharmaceutical ingredients requires very high levels of containment, controlled nitrogen blanketing of reactors, careful aspiration of solvent vapours, grounding of all pieces of equipment and high levels of process safety evaluation before any scale-up. Increasing the safety score would have given a less satisfactory alignment with the existing solvent guides. The final ranking is given by the most stringent combination.

Extension to less common solvents
As this methodology allows a satisfactory preliminary greenness assessment of classical solvents in the context of the pharmaceutical industry, it can also be used to evaluate other solvents, even those not yet described in any guide. An Excel® table, available in the ESI, † automatically gives the SH&E scorings and the ranking by default, using the physical data and hazard statements extracted from Safety Data Sheets. The "neoteric" or less common solvents have been listed by CHEM21 to be of potential interest in the synthesis of pharmaceutical intermediates, and some of them are being actively employed in CHEM21 projects 33 (Table 8). In this round, supercritical fluids and gas expanded liquids 34 have not been included, although CHEM21 involves some projects using supercritical carbon dioxide. A number of other emerging solvent classes proposed as greener solutions, such as ionic liquids, 35 high molecular weight glymes, 36 polyethylene glycols (PEGs), 37 fluorinated solvents, 38 switchable solvents 39 and deep eutectic solvents 40 have also been omitted. Ionic liquids and switchable solvents have so far made little penetration in pharmaceutical synthesis, although they are being used as process liquids in other sectors. The same can be said of PEGs, which are more widely used in the formulation sectors. Likewise, trifluorotoluene and fluorous phase solvents have made no impact in pharmaceutical synthesis, and their synthesis is far from being green. While we have not included these materials in the current analysis, there is no reason why the methodology described here could not be used to rank them. New ethers have been developed and proposed to circumvent the issues of the classical ethers (low flash point, volatility, solubility in water and persistence in the environment). Ethyl-tert-butyl ether (ETBE) is produced using bio-ethanol, and replaces MTBE as a gasoline additive. 41 Cyclopentylmethyl ether (CPME) is obtained from dicyclopentadiene, via cyclopentene. 42 In a similar way, tert-amyl-methyl ether (TAME, or methoxypentane®) derives from C5 distillation fractions of naphtha. 43 The syntheses of ETBE, CPME and TAME are short, atom efficient (addition of an alcohol to an alkene) and thus moderately energy consuming. To this list can be added 2-methyl-tetrahydrofuran (Me-THF), a bio-derived ether, 44 which has now entered the club of classical solvents. Their solubility in water is comparable (ca. 1%), as are their boiling points and their toxicity. The ranking by default (Table 8) clearly indicates that these solvents offer advantages compared to MTBE (Table 7, hazardous). The ranking of CPME and ETBE as "problematic" mainly reflects their resistivity, associated with the low auto-ignition point (180°C) of the former and the ability of the latter to form peroxides. Nevertheless, these hazards are manageable in most industrial facilities. This illustrates the importance of not limiting a solvent guide to a simple ranking, and of analyzing the criteria supporting such conclusions.
There is an increasing interest in bio-derived solvents in the Green Chemistry community, 9 which aims to benefit from a sustainable source of solvents to help circumvent potential fossil fuel shortages in the future.
In line with recent European standards 45 applied to lubricants, which can be marketed as bio-derived if more than 25% of the carbon is from a renewable resource (assessed by ¹⁴C content), 46 a similar bio-derivability criterion is to be applied to solvents. 47 A three band system is proposed: band A if more than 95% of carbon is bio-based, band B between 50 and 95% and band C between 25 and 50% (e.g. ETBE: 33%). Below 25%, solvents are considered as petrochemically derived (e.g. CPME: 17%). The other neoteric solvents discussed here can theoretically be obtained at scale as band A. Such a standard will make it possible to establish whether they are fully, or only partly, bio-derived, which is not always obvious.
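The proposed three band system can be expressed as a small classification function. This is a minimal sketch with a hypothetical function name. The text does not say which band the exact boundary values (25, 50 and 95%) fall into, so their assignment below is an assumption.

```python
def bio_band(bio_carbon_pct: float) -> str:
    """Classify a solvent by the percentage of its carbon that is bio-based."""
    if not 0 <= bio_carbon_pct <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if bio_carbon_pct > 95:
        return "A"
    if bio_carbon_pct >= 50:  # assumption: exactly 95% and 50% fall in band B
        return "B"
    if bio_carbon_pct >= 25:  # assumption: exactly 25% falls in band C
        return "C"
    return "petrochemical"


print(bio_band(33))  # ETBE example from the text: C
print(bio_band(17))  # CPME example from the text: petrochemical
```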
Nowadays, most ethanol is prepared by fermentation (bio-ethanol). 48 Other commonly used solvents could be produced from biomass: n-butanol, isobutanol, isoamyl alcohol 49 as well as their related acetates, acetone, diethyl succinate, 50 etc. when it becomes economic to do so. Other solvents are solely obtained from natural sources: glycerol (from oils and fats), turpentine (from pine resin), and limonene 51 (from citrus waste). γ-Valerolactone, 52 Me-THF, tetrahydrofurfuryl alcohol and dihydrolevoglucosenone 53 (cyrene) are produced from ligno-cellulosic biomass. Lactic acid is obtained by fermentation of starch, 54 and gives access to ethyl lactate. 55 Isomerization and dehydrogenation of limonene offers a bio-derived route to p-cymene, 56 although it is currently only commercially available from petrochemical feedstocks. Some solvents derive from carbon dioxide, such as dimethyl carbonate, ethylene carbonate and propylene carbonate.
Table 8 notes: Only the hazard statements given in the REACh dossiers 15 are included, except in the case of TH-furfuryl alcohol, for which the ECHA harmonized classification 32 is more recent. a n.a.: not available (no full REACh registration); only the highest scoring H3xx statements (cf. Table 4) are shown. The lowest figure is given when there is more than one H3xx statement in the highest scoring category. b TAME, CPME and ETBE are estimated to be as resistive as MTBE (ρ = 5 × 10⁹ Ω m); the hydrocarbons even more so (ρ > 10¹¹ Ω m). c Water sensitive. d Solid at 20°C.
Tetrahydrofurfuryl alcohol is ranked "hazardous" because it has recently been classified as toxic to the unborn child. Many bio-derived solvents are ranked "problematic" by default, as a result of their high boiling point, which reflects the difficult separation of the product and recycling of the solvent. Additionally, a number of new solvents are only produced on a relatively small scale, or only as intermediates (γ-valerolactone), and have not yet come under consideration by REACh, resulting in a default scoring of at least 5 in the Health and Environment criteria. D-Limonene, turpentine and p-cymene are also ranked as problematic, with relatively high boiling points and aquatic toxicity for D-limonene and turpentine. Besides, the first two are also prone to oxidation.
The carbonate solvents show a remarkable range of polarity, dimethyl carbonate being a potential replacement for MEK, ethyl acetate, MIBK, butyl acetate and most other ketones and glycol ethers. Cyclic carbonates such as ethylene and propylene carbonate are much more polar and could replace undesirable aprotic polar solvents such as DMF. 57 According to this assessment, dimethyl carbonate seems to be the greenest carbonate. As it is also considered a mild methylating/carboxymethylating agent, 58 a careful check of the reaction compatibility is necessary before any scale-up.
The solvent sub-team decided to confirm the ranking by default as the final ranking in the CHEM21 solvent guide for all these less common solvents. This ranking may evolve on the basis of new toxicology or ecotoxicity studies, especially for solvents which are not yet registered in REACh.

Conclusion
The solvent sub-group of CHEM21 has elaborated a selection guide based on a survey of publicly available solvent guides for the pharmaceutical industry. As this survey was based on the most classical solvents, there was a need to expand this guide to neoteric solvents, and particularly bio-derived solvents. A model was elaborated, allowing a hazard-driven scoring of Safety, Health and Environment of any solvent, and an overall ranking by default into three categories (recommended, problematic or hazardous). As this model gave a satisfactory alignment with the classical solvents, it can be used to make a preliminary greenness assessment of newer solvents. This ranking methodology is consistent with the CMR and atmospheric ozone regulations, and aligned with the Globally Harmonized System. It is based on easily available physical properties and toxicological/eco-toxicological data given in the solvent's REACh dossier. When these data are not published, the solvent is ranked as at least "problematic" by default, which reflects well the difficulty of implementing such a solvent on an industrial scale. Lack of available data can indeed be a key detractor from the uptake of new solvents in the pharmaceutical industry. We would urge solvent suppliers to publish data on toxicity to allow a ranking in the ICH Guidelines.
Some good examples include the data published for Me-THF and CPME. 59 The methodology described here cannot be presented as an expert system. In the timeframe of CHEM21, we could not elaborate a health scoring based on occupational threshold limit values. This choice was made because the latter are not available for newer solvents, and often not unified for classical solvents. Also, the environmental scoring should include a life-cycle impact analysis of solvent manufacture, or at least the carbon footprint or total energy demand. Such a simplified system only gives a preliminary ranking which needs to be challenged case by case by the solvent experts of each institution, as we did to assign the final rankings of the CHEM21 solvent guide. This is also why existing solvent guides will continue to be used in the corresponding pharmaceutical companies.
However, the strength of our methodology is that it can easily be maintained by chemists using data and hazard statements available in Safety Data Sheets. The highest hazards are highlighted by the system, which gives a satisfactory alignment with the existing solvent guides. Besides, the methodology is versatile enough to accept further improvement.
This solvent guide will be the cornerstone of the CHEM21 training package on solvents dedicated to students and chemists in the pharmaceutical or fine chemical industry.
It cannot be presented as a universal solvent guide, because it was developed to give solvent rankings adapted to the pharmaceutical industry. But the field of green chemistry is wider, and the same methodology can be applied to design solvent guides for other applications such as coating, formulation, consumer products, agrochemicals etc., by changing the selection and combination of criteria. For example, for some of these applications in which the solvent is not recovered, the boiling point impact can be revised. In the same way, the flash point impact needs to be scored more severely for applications using solvents in open air such as paint stripping, coating, etc.
This will reflect the high interest of some bio-derived solvents for such applications, whereas these solvents often appear as "problematic" for pharmaceutical chemistry, as a result of their high boiling points which complicate the recovery and downstream processing on scale, or require the use of new process technologies.

Remark
The conclusions reached in this paper are the collective opinion of the authors who contribute to the CHEM21 consortium and do not reflect, at time of publishing, official policy of any individual company or institution.