Open Access Article
Gaopeng
Ren
a,
Austin M.
Mroz
ab,
Frederik
Philippi
a,
Tom
Welton
a and
Kim E.
Jelfs
*a
aDepartment of Chemistry, Imperial College London, White City Campus, London, W12 0BZ, UK. E-mail: k.jelfs@imperial.ac.uk
bI-X Centre for AI in Science, Imperial College London, White City Campus, London, W12 0BZ, UK
First published on 21st January 2026
Ionic liquids (ILs) are salts that are liquids at ambient conditions (typically below 373 K) and are known for their many unique properties, including low volatility and high thermal stability. Despite the promise of ILs, their targeted design is challenging for several reasons, including (i) the vast number of candidate ions that could be synthesised as components of an IL, (ii) the lack of predictive methods to determine what ion combinations will yield ILs with desired melting points, and (iii) experimentally known ILs possess limited chemical diversity. In this work, we present a data-driven framework for designing novel low-melting-point ILs. We model ILs as bipartite graphs and apply a link prediction algorithm to identify promising cation–anion pairs, expanding the collected IL database more than 30-fold, while prioritising low melting points. To further explore chemical space, we trained variational auto-encoders (VAEs) to generate new IL candidates through learning a latent space that enables modelling the data distribution. A thermodynamics-inspired classification model is subsequently employed to filter out ILs predicted to melt above 373 K. Finally, molecular dynamics simulations validate our approach, confirming that 18 out of 20 generated ILs have melting points below 373 K.
Due to the extensive range of possible cation–anion combinations, the chemical space of ILs is vast, and therefore it would not be possible to screen all possible ILs experimentally.10 Further, the complex, and often non-linear, structure–property relationships limit the utility of design rules for informed IL design, and frequently means that experiments are performed in a trial-and-improvement manner. Computational design methods provide a tractable alternative to experimental efforts. The computational design of ILs typically involves two key components: proposing new IL systems and developing models to calculate their properties. A straightforward method for constructing IL systems is via high-throughput screening and candidate enumeration. Here, new combinations of known cations and anions are enumerated, resulting in a large number of possible ILs.11,12 To further expand the IL chemical space, some studies decompose cations and anions into cores and substituents (for example, side chains) and recombine them to create new structures.13–18 These approaches allow the creation of extensive IL libraries, which can then be screened using conventional simulations or property prediction models.
Two widely used methods for calculating IL properties are density functional theory (DFT) and molecular dynamics (MD) simulations. Although these techniques can provide accurate results, such calculations are typically computationally expensive and often unsuitable for high-throughput screening. Recently, machine learning (ML) models have emerged as an efficient alternative for IL property prediction.5,19–23 These data-driven models can learn structure–property relationships from data, and predict various IL properties, such as melting point,20,21,24 viscosity,25,26 heat capacity,27 and CO2 solubility.28,29 By combining IL construction strategies with property prediction models, there is the potential to identify ILs optimised for specific applications.
Despite the promise of property prediction models, the IL chemical space in the aforementioned workflows is limited, as ILs are constructed by combining existing cations and anions or manually combining different molecular substituents onto ions. In recent years, generative ML models have emerged as a powerful tool for exploring chemical space.30,31 The goal of generative models is to create new samples that closely resemble existing ones.32 Variational auto-encoders (VAEs) are among the most popular generative methods. In VAEs, input data is encoded into a latent space that follows a normal distribution, and the decoder reconstructs the data from the latent space. These models have been widely applied to molecular generation, with examples including ChemicalVAE,33 GraphVAE,34 and JT-VAE.35 These VAEs can efficiently learn smooth, continuous latent spaces that capture molecular features, enabling the generation of novel, diverse, and valid molecular structures. Yet, these VAEs are mainly focused on small, organic molecules, for which there are extensive datasets available. Applying VAEs to molecular materials like ILs poses significant challenges, particularly due to data scarcity. Indeed, the number of known ILs is limited. For instance, the NIST ILThermo database36 contains only ∼3000 unique ILs, far fewer than the 10
000 examples typically required to train robust VAE models.37 Furthermore, IL datasets that include experimental properties are even scarcer. This greatly hinders the development of robust ML models for accurate IL property prediction.
There are several methods that have been introduced to address the issue of data scarcity. For example, Liu et al.38,39 employed particle swarm optimization to simultaneously optimise a property prediction model and a VAE. They introduced a normative score to guide the VAE towards generating valid ILs and a property score to conditionally generate ILs with desired properties. However, this approach does not fully resolve the data scarcity issue, as no additional data or external information is introduced into the workflow. As a result, the IL chemical space to be explored will remain fairly limited. Another strategy to mitigate the data scarcity issue is transfer learning. Specifically, since large datasets of general organic compounds are available, the VAE model can be pre-trained on such datasets and then fine-tuned using the smaller IL datasets. However, Beckner et al.40 found that while transfer learning improves the generation of valid molecules, the dominance of neutral organic compounds in the training data reduces sample efficiency, as the majority of generated molecules are neutral, not the desired ions. Furthermore, neither the optimisation-based nor the transfer learning methods typically explicitly consider the melting points of generated ILs. As a result, they do not guarantee the production of ILs with low melting points, limiting practical application. In addition to post-processing and transfer learning, data augmentation is another commonly used technique to address data scarcity. For ILs, a straightforward approach is to combine all available cations and anions to expand the dataset, given the sparsity of existing cation–anion pairs (Fig. 1). However, not all of these combinations would produce ILs with low melting points. For example, the salt, NaCl, composed of Na+ as the ‘IL cation’ and Cl− as the ‘IL anion’ has a melting point up to 801 °C, which is far from that of typical ILs. Another problem with all combinations is the low training efficiency. The training examples will increase significantly, while the unique ion structures do not increase. Additionally, the generated examples will include many high-melting-point cation–anion combinations.
Here, we proposed a link prediction method to address the issue of data scarcity in using VAEs for IL generation. We formulated the task of identifying suitable cation–anion pairs as a link prediction problem, which predicts whether two nodes in a network are likely to be connected.41 In the IL scenario, this task aims to identify new cation–anion combinations that are likely to form low-melting-point ILs. Using this method, we generated over 250
000 new cation–anion combinations, expanding the collected database more than 30-fold. A VAE was then trained on the expanded IL database, and achieved good performance in terms of novelty, validity, uniqueness, and reconstruction accuracy. However, the generated ILs are not inherently guaranteed to have low melting points. To address this, we developed a thermodynamics-based classification model to filter out undesirable systems likely to have melting points above 373 K. Finally, we validated the expanded ILs from link prediction and the generated ILs from the VAE by performing MD simulations to calculate their melting points. The results showed that the IL samples from both link prediction and the VAE exhibited low melting points, with most below 373 K, demonstrating the effectiveness of our workflow.
Link prediction aims to identify potential links based on existing ones. In the context of ILs, a link could be interpreted as a cation–anion combination that can form an IL. As shown in Fig. 2, we constructed a bipartite graph using the cations and anions in the collected IL database as heterogeneous nodes. We used extended-connectivity fingerprints (ECFP) descriptors47 as the initial node embeddings. Then, a graph neural network (GNN), GraphSAGE,48 was applied to update the node embeddings iteratively. The computation process of GNNs consists of two main phases: (i) an aggregation phase to gather information from node neighbours, and (ii) an update phase to update node embeddings according to the gathered information. In the GraphSAGE model, the node embeddings are updated through:
![]() | (1) |
are node embeddings with d dimensions.
By stacking multiple GNN layers, nodes can incorporate information from increasingly distant neighbours, enriching their representations. To determine the likelihood of a cation–anion pair forming an IL, we computed the dot product similarity of their updated embeddings, followed by a Sigmoid activation function to output a probability:
| sij = σ(xiTxj), | (2) |
Before training, we split the existing links into training, validation, and test sets with a ratio of 85
:
5
:
10. Negative samples were generated by randomly selecting non-existent links. To ensure a balanced dataset, the number of negative samples is equivalent to the number of existing links. Note that the non-existent links may include combinations capable of forming ILs, but this probability is likely low, given the very small ratio of existing ILs to non-existent ILs (0.4%). During training, we optimised the GNN using cross-entropy loss between the predicted probabilities and the true link labels. We set the number of training epochs to 100, and used the Adam49 optimiser with a learning rate of 0.01 to optimise the model parameters. To expand the collected IL database as much as possible, we conducted multiple iterative training rounds. In each round, the newly predicted links were added to the existing links, progressively increasing the database.
| z = µ + σ·ε, | (3) |
. The decoder then reconstructs the input data from the latent representation. After training the VAE, we can generate new data by sampling random noise from a standard normal distribution and passing it through the decoder.
Here, we applied gated recurrent units (GRUs) for both the encoder and the decoder. SMILES strings are used to represent the cations and anions that comprise the ILs. First, we built a vocabulary by collecting the unique characters from the SMILES strings in the database and adding special tokens (i.e., start, end, and padding). Each SMILES string can be encoded to a series of integers that represent the indices of characters in the vocabulary. These integer sequences are padded to a maximum length of 256 to ensure uniform input size and are further projected to high-dimensional embeddings by an embedding layer. The GRU encoder takes these embeddings as inputs. Here, the encoder and decoder each consist of three GRU layers, and the dimension of the latent representations is set to 128.
The training objective of a VAE is to maximise the evidence lower bound (ELBO) which consists of two components: the reconstruction loss
to regulate the recovered data to be close to the initial input data, and the Kullback–Leibler (KL) divergence to encourage the latent representation follows a standard normal distribution. To stabilise the training process (specifically to avoid the KL vanishing), we adopted the loss function from Fu et al.:50
![]() | (4) |
D versus 1/T. Assuming Arrhenius behaviour with a constant diffusion activation energy within each phase,61–63 two linear regimes corresponding to the solid and liquid states were identified. The intersection of these regimes was taken as the melting temperature.64 Further details of the MD validation process are provided in Section S3 of the SI. The code for the MD simulation is available at https://github.com/fate1997/MD4IL.
076. To further analyse the dataset expansion, we visualised the frequency of occurrence of cations and anions in both the collected and expanded IL datasets, Fig. 3c and d. In the collected IL database, the distributions of cations and anions were highly imbalanced, with only a few species being dominant. After expansion, the distributions became more uniform while still maintaining a trend similar to the collected dataset.
To further validate the performance of the link prediction model, we randomly selected 10 ILs predicted by the model and conducted MD simulations to calculate their melting points. As shown in Fig. 4 and S6, 6 out of the 10 new ILs had predicted melting points less than 373 K, while a further 3 had melting points near 373 K (within 15 K deviation). Only 1 of the 10 ILs had a relatively high melting point (560 K). These results demonstrate that the link prediction approach is an effective method for identifying potential cation–anion combinations with low melting points, even without a filtering step to check the melting point prediction. This effectiveness can be attributed to the ability of GNNs to capture and transfer knowledge across different nodes, enabling the link prediction model to recognise intrinsic patterns that determine whether a cation–anion combination can form an IL. Finally, the expanded IL database (virtual database) contains a large number of ILs with low melting points, which is essential for training an effective generative model for IL generation.
000 ILs for evaluation. Unlike the above metrics, reconstruction accuracy is assessed based on the test set. It represents the percentage of SMILES strings from the test set that the model can successfully reconstruct. To evaluate the reconstruction accuracy of the VAE model, we excluded 10% of the total data points for use as a separate test set.
Here, we evaluate the generation performance of the VAE model using three different input datasets: the collected IL database (VAE-collected), the expanded IL database generated through link prediction (VAE-link), and a dataset containing all possible cation–anion combinations (VAE-all). The results are summarised in Table 1. The VAE-collected model demonstrates high novelty but suffers from low validity and reconstruction accuracy. This limitation arises because of the small dataset, making it challenging for the VAE model to learn the correct SMILES grammar effectively. In contrast, the VAE-all model exhibits strong performance in terms of validity, uniqueness, and reconstruction accuracy, but has very low novelty, indicating that most generated ILs are already present in the training dataset. Among the three models, VAE-link performs well across all metrics, showing strong validity and reconstruction accuracy. The high validity and reconstruction accuracy of the VAE-link model suggest that this model effectively learns the correct SMILES grammar, enabling the generation of valid molecular structures. However, the novelty of the VAE-link is slightly lower than that of the VAE-collected, primarily because the expanded IL database does not introduce new ion structures. Despite this minor decrease in novelty, the VAE-link model excels in sample efficiency, defined as the number of novel valid ILs generated among all the sampled SMILES strings. It significantly outperforms both the VAE-collected and the VAE-all in this regard, demonstrating its effectiveness in generating diverse and novel IL structures.
| Model | Data points | Validity | Unique | Novelty | Recon. |
|---|---|---|---|---|---|
| VAE-collected | 7507 | 31% | 95% | 89% | 11% |
| VAE-all | 1 640 507 |
97% | 100% | 10% | 93% |
| VAE-link | 262 076 |
84% | 99% | 78% | 90% |
Using the melting point classification model, we divided the generated ILs from the VAE-link model into two categories: generated low-melting-point ILs and generated high-melting systems. To compare the distribution of generated ILs with the collected ILs, we selected 2000 novel generated ILs and 2000 ILs from the collected database.51 We then computed the ECFP features for both sets and applied t-SNE to reduce the dimensionality for visualisation. As shown in Fig. 5a, the chemical space of the generated low-melting-point ILs largely overlaps with that of the collected ones, indicating that the generated low-melting-point ILs are more diverse, while still covering the original distribution. In contrast, the chemical space visualisation of higher melting systems (Fig. 5b) shows that the number of generated higher melting systems (175) is significantly smaller than that of the collected higher melting systems (875). Furthermore, 90% of generated ILs are classified as low-melting-point ILs. These results suggest that the expanded IL dataset enhances the diversity of generated ILs while ensuring a preference for those with low melting points.
![]() | ||
| Fig. 5 t-SNE plots of (a) low-melting-point ILs (<373 K) and (b) high-melting systems (>373 K) in both the generated and collected IL datasets. | ||
We employed MD simulations to estimate the melting points of novel ILs. To validate the MD workflow, we tested it on 20 ILs with experimentally known melting points (ranging from 246 K to 421 K) and found it aligns well with the experimental values (Fig. S5). We then extended the simulations to additional ILs, including newly expanded and generated ones. Representative figures illustrating the determination of melting points are presented in Fig. 6. The melting points were identified as the intersection of the fitted solid-phase and liquid-phase lines. All the plots to determine the melting point are shown in Section S4 and S5 in the SI.
We randomly selected 20 generated ILs from the VAE-link model and estimated their melting points using the MD workflow. The structures and predicted melting points of 10 generated low-melting-point ILs are shown in Fig. 4, with more examples provided in Fig. S9. Our analysis shows that the generated ILs exhibit low melting points, with 18 out of 20 classified as low-melting-point ILs (melting points below 373 K). The remaining 2 ILs have melting points (385 K) slightly above this threshold, but do not exceed it significantly. The average melting point of the 20 generated ILs is 339 K, which is lower than the average melting point of the IL melting point database (361 K). Several factors contribute to these results. First, the link prediction model effectively identifies combinations of cation–anion that are likely to have low melting points. Second, the VAE model generates structures similar to those in the input dataset, ensuring that the produced cations and anions resemble those found in the collected IL database. Third, the melting point classification model accurately differentiates between low-melting-point ILs and higher melting systems by leveraging thermodynamic knowledge.
The low melting points of the newly designed ILs (both from link prediction and VAE) can be rationalised based on the intermolecular interactions between ion pairs. In particular, localised and directional interactions, such as hydrogen bonds, tend to lower melting points significantly.66 For example, ILs composed of imidazolium cations and amide anions (IL 9, 10, 15 in Fig. 4) exhibit extensive hydrogen-bond networks involving C–H and N–H bonds, which explains their low melting points.67 Hydrogen bonding is also present in IL examples 1, 3–8, 11–14, also contributing to reduced melting points.68–71 Another key factor is the anion size: larger anions weaken ion–ion interactions, which also leads to lower melting points.52,72 Large anions are observed in IL examples 1, 4–6, 8–10, and 15, further explaining their reduced melting behaviour.
076 ILs. The expanded dataset enables the effective training of a VAE with improved sample efficiency. The generated ILs are further screened using a melting point filter, to test whether their melting points are predicted to be below 373 K. Finally, MD simulations validated that 90% of generated ILs meet this criterion. We believe our strategy can also facilitate the discovery of other molecular mixtures, such as deep eutectic solvents. However, there are still limitations to our approach; the expansion of the database has not increased the diversity of the individual molecular components, which may restrict exploration of the IL chemical space. Additionally, synthetic feasibility, an important factor for the practical application of low-melting-point ILs, was not explicitly considered in this study. Future work will focus on broadening the range of ion structures to enhance their diversity, while considering synthetic accessibility and the prediction of other properties, beyond the melting point, of relevance to the IL community. In parallel, we will pursue the synthesis of selected predicted novel ILs and experimentally validate their melting points.
Supplementary information (SI) is available: melting point distribution in the database, melting point classification model, MD for melting point estimation, generated ILs from link prediction, generated ILs from VAE, and melting point distribution of generated ILs. See DOI: https://doi.org/10.1039/d5dd00282f.
| This journal is © The Royal Society of Chemistry 2026 |