Pablo Díaz-Rodríguez,a John C. Cancilla,a Natalia V. Plechkova,b Gemma Matute,a Kenneth R. Seddon b and José S. Torrecilla*a
a Department of Chemical Engineering, Complutense University of Madrid, E-28040 Madrid, Spain. E-mail: jstorre@ucm.es; Fax: +34 91 394 42 43; Tel: +34 91 394 42 44
b QUILL Centre, School of Chemistry and Chemical Engineering, The Queen's University of Belfast, Belfast, BT9 5AG, UK
First published on 28th October 2013
Statistical models have been used to estimate the refractive index of 72 imidazolium-based ionic liquids, using the electronic polarisability of their ions as input data for two different mathematical approaches: artificial neural networks, in the form of multi-layer perceptrons, and multiple linear regression models. Although both the artificial neural networks and the linear models were able to accomplish this task, the multi-layer perceptron model proved to be the more accurate method, thanks to its ability to determine non-linear relationships between the dependent and independent variables. Additionally, the multiple linear regression shows a systematic deviation in the estimated refractive index values, which indicates that it is an inappropriate model for this system.
In order to choose a suitable ionic liquid for a specific use, insight into its structure, knowledge of its properties, and an understanding of the structure–property relationships are required.12 Only a handful of ionic liquids are available commercially at an acceptable purity level: indeed, a recent paper published extensive thermophysical property data (density, speed of sound, refractive index, and viscosity) on eleven ionic liquids, which were selected because they were commercial samples.13 Properties of several hundred synthetic ionic liquids have been reported,14 and there is an online NIST database,15 but few of the published studies report coherent data sets on related series of ionic liquids, and purity (especially water and halide content) is often ignored, despite the marked effect that impurities have upon physical properties.16 Of the millions of possible candidates for a specific purpose, the published data represent only a drop in the ionic liquid sea. It is for this reason that reliable models for property estimation are so essential – without them we are reduced to intuition, monkeys and typewriters,17 or searching the contents of the Library of Babel.18
Mathematical tools that can aid in the estimation of these physicochemical properties would favour the use of these promising and novel compounds, both in research and in industry. An interesting measurable optical property is the refractive index (RI), which can be related to the electronic polarisability of chemical compounds such as ionic liquids.19 Polarisability is a very important electrical property of solvents, and determines how the molecular electron cloud behaves under the influence of an electric field. Its value is defined by the strength with which the positive charge of the atomic nuclei acts on the surrounding electrons. This property is highly relevant for ionic liquids due to the low electrostatic interactions which give them their surprising and desirable characteristics.20–22
The refractive indices, as well as other physicochemical properties of ionic liquids (such as density or viscosity), are highly dependent on the purity level, and can be drastically modified when impurities (typically halides or water) are present, even at trace levels.16,21,22 Additionally, there is a physicochemical relationship between refractive index and polarisability, which is governed by the Lorentz–Lorenz relationship, eqn (1).
$\dfrac{n^{2}-1}{n^{2}+2}=\dfrac{4\pi}{3}\,\dfrac{\alpha_{\mathrm{mol}}}{V_{\mathrm{mol}}}$ | (1) |
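To make the link between refractive index and polarisability concrete, the following minimal sketch solves the Lorentz–Lorenz relationship for the refractive index. It assumes the common Gaussian-units form, (n² − 1)/(n² + 2) = (4π/3)·α/V, with the polarisability α and the volume V expressed per ion pair in the same volume unit (for example Å3). The function names are illustrative only and do not come from the original work.

```python
import math


def refractive_index(alpha, volume):
    """Solve (n^2 - 1)/(n^2 + 2) = (4*pi/3) * alpha/volume for n.

    Assumption: alpha and volume share the same volume unit
    (e.g. cubic angstroms per ion pair).
    """
    x = (4.0 * math.pi / 3.0) * alpha / volume
    if not 0.0 <= x < 1.0:
        raise ValueError("alpha/volume ratio outside the solvable range")
    # Rearranging the relationship gives n^2 = (1 + 2x) / (1 - x).
    return math.sqrt((1.0 + 2.0 * x) / (1.0 - x))


def polarisability_ratio(n):
    """Inverse relation: the alpha/volume ratio implied by a refractive index n."""
    return 3.0 / (4.0 * math.pi) * (n * n - 1.0) / (n * n + 2.0)
```

Because the left-hand side grows monotonically with n, a larger α/V ratio always corresponds to a larger refractive index, so the two functions invert each other.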
Due to the huge potential these compounds have, it is indispensable to develop reliable models to estimate the different properties of an ionic liquid (like toxicity),23 or to define the optimal ionic liquid for a specific task. One of the most widely applied mathematical tools at present is the artificial neural network (ANN), which is formed by various algorithms that are able to determine non-linear correlations between data points so as to offer an estimation of the targeted value. To develop an ANN, databases are required to define the modelled system and limit its range. Models of this kind have already been applied to estimate physicochemical properties of ionic liquids,21,22,24–27 and good results were obtained, showing that ANNs are powerful estimative tools.
In this work, an ANN model has been developed with the goal of estimating the refractive index of seventy-two different ionic liquids by using the polarisability values of their different structural fragments (cations and anions) as input information (independent variables). This model has been compared with a multiple linear regression (MLR) approach, and found to perform better because of its ability to determine both linear and non-linear relationships between dependent and independent variables.
In order to optimise the linear models and the ANNs, the total database was randomly separated into training and verification datasets, which were formed by 85% and 15% of the database, respectively. It is important to highlight that the verification dataset is contained within the range of the learning dataset. This is an essential requirement of the ANN model, because it works correctly when the input data are inside this range (interpolating), but generates large errors when it is forced to estimate values which do not fulfil this condition (extrapolating).28
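The interpolation requirement can be illustrated with a short sketch. It draws a random 85%/15% split and redraws until every verification value lies inside the range spanned by the training data. The redraw strategy, the function name and its parameters are assumptions made for illustration; the original text does not state how the condition was enforced.

```python
import random


def split_within_range(rows, verification_fraction=0.15, seed=0, max_tries=1000):
    """Randomly split rows (tuples of numeric inputs) into training and
    verification sets, redrawing until each verification value lies within
    the per-column minimum and maximum of the training set."""
    rng = random.Random(seed)
    n_verification = round(len(rows) * verification_fraction)
    for _ in range(max_tries):
        shuffled = list(rows)
        rng.shuffle(shuffled)
        verification = shuffled[:n_verification]
        training = shuffled[n_verification:]
        columns = list(zip(*training))
        lows = [min(column) for column in columns]
        highs = [max(column) for column in columns]
        inside = all(
            low <= value <= high
            for row in verification
            for value, low, high in zip(row, lows, highs)
        )
        if inside:
            return training, verification
    raise RuntimeError("no split satisfied the range condition")
```

With 72 rows, the default fraction gives 61 training and 11 verification entries.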
For every data point, the polarisability values for the cation and the anion which form the ionic liquid were calculated (independent variables, vide infra) to create the mathematical models considered. To do so, the software included in the package Marvin Suite version 5.11.5 and Chemicalize.org were employed, both developed by the ChemAxon company.29 To calculate the polarisability, these programmes base their calculations on the Thole model.30
The software applied to create the MLP and perform the experimental design to optimise its parameters (ESI†) was Matlab version 7.0.1.24704 (R14).33
In this case, just like with the ANN model, the refractive index value was the dependent or the estimated variable, while the independent variables were the polarisabilities of the ions (cations and anions) of the ionic liquids analysed, which were previously obtained (vide supra) and then normalised. Regarding the independent variable “anion type”, the ionic liquids were classified into different groups related to their anions, obtaining seven groups (which correspond to borate-, sulfonamide-, halide-, sulfate-, phosphate-, dicyanamide-, and succinate-based anions) where the different ionic liquids were ordered by their specific anion polarisability values.
Mathematical and statistical calculations, as well as the designs, were performed with the Statgraphics Centurion XVI.1.11 software.
Optimised parameters of the MLP | |
---|---|
Transfer function | Sigmoid |
Learning function | TrainBR |
Number of input nodes | 8 |
Number of hidden neurons | 3 |
Number of output neurons | 1 |
Lc | 0.001 |
Lcd | 0.499 |
Lci | 100 |
Statistical results | |
---|---|
MPE (%) | 0.18 |
R² | 0.99 |
In order to test the level of generalisation of the designed ANN, the k-fold cross-validation method has been followed, obtaining the results and average statistical values summarised in Table 2. This validation method relies on the random subdivision of the analysed database into K datasets (K = 6, in this case) to verify that all data points provide accurate results when simulating the mathematical model. This is achieved by employing a different dataset to verify the model in each one of the MLPs designed.
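The subdivision described above can be sketched as follows. This is a generic k-fold index generator, not the authors' Matlab implementation; the function name and the seed are illustrative assumptions.

```python
import random


def k_fold_indices(n_points, k=6, seed=0):
    """Yield (training, verification) index lists for k-fold cross-validation.

    The indices are shuffled once and dealt into k folds; each fold serves
    as the verification set exactly once.
    """
    rng = random.Random(seed)
    indices = list(range(n_points))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, verification in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, verification
```

For a database of 72 entries and K = 6, each fold holds 12 entries, so every model is trained on 60 entries and verified on the remaining 12.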
MLP | R² | MPE (%) |
---|---|---|
1 | 0.99 | 0.20 |
2 | 0.99 | 0.27 |
3 | 0.98 | 0.30 |
4 | 0.99 | 0.22 |
5 | 0.98 | 0.17 |
6 | 0.96 | 0.28 |
Average value | 0.98 | 0.24 |
Deviation | 0.01 | 0.05 |
The presence of three outer data points in the verification dataset in the second, third and sixth models implies a worse estimation capability of the designed MLP, since data extrapolation is required. Accordingly, the results obtained with the model in those three cases were less accurate than in the others. However, their R² and MPE values still indicate a decent statistical performance, even when extrapolation takes place.
These results indicate that the network should be able to estimate, with high accuracy and using only polarisability values, the refractive index of a given imidazolium-based ionic liquid within the range considered in the training dataset (Table SI-1, ESI†). To extend these results to other kinds of ionic liquids, such as pyridinium-, ammonium- or pyrrolidinium-based ones, or to compounds containing different anions, like arsenates, nitrates or carboxylates, these would first have to be included in the learning database. An additional advantage is that the purity of an ionic liquid could be assessed by comparing the refractive index value given by the designed network with the literature or experimentally obtained value.
Now that it is clear that ANNs can correctly define this system, simpler mathematical approaches have also been tested. Linear models, and more specifically MLRs, have been employed to try to create an adequate and easier model.
MLR model | C | Cation | A1 | A2 | A3 | A4 | A5 | A6 | A7 | R² | MPE (%) |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0.414 | 0.0064 | −0.223 | −0.222 | 0.585 | 0.144 | −0.449 | 0.248 | 0.099 | 0.96 | 0.36 |
2 | 0.410 | 0.003 | −0.212 | −0.201 | 0.617 | 0.146 | −0.440 | 0.231 | 0.103 | 0.86 | 1.13 |
3 | 0.472 | 0.0356 | −0.552 | −0.342 | 0.505 | 0.061 | −0.546 | 0.178 | 0.027 | 0.18 | 1.00 |
4 | 0.431 | 0.0011 | −0.270 | −0.250 | 0.588 | 0.125 | −0.567 | 0.240 | 0.084 | 0.97 | 0.47 |
5 | 0.392 | −0.002 | −0.176 | −0.158 | 0.634 | 0.170 | −0.409 | 0.284 | 0.123 | 0.78 | 0.66 |
6 | 0.391 | 0.024 | −0.201 | −0.191 | 0.623 | 0.198 | −0.404 | 0.264 | 0.108 | 0.78 | 0.67 |
Average value | 0.42 | 0.01 | −0.3 | −0.23 | 0.59 | 0.14 | −0.47 | 0.24 | 0.09 | 0.76 | 0.72 |
Deviation | 0.03 | 0.02 | 0.1 | 0.06 | 0.05 | 0.05 | 0.07 | 0.04 | 0.03 | 0.3 | 0.3 |
As can be seen in Table 3, an average model has been calculated in order to describe the behaviour of the MLRs over the whole range of the data points. The deviation values show how much the individual coefficient values fluctuate relative to the average value for each variable. While low deviation values indicate a tendency towards a specific result, high values imply randomness during the linear fit.
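As a quick consistency check, the averages and deviations of the R² and MPE columns of Table 3 can be recomputed from the six individual models. The sketch below assumes that "Deviation" denotes the sample standard deviation; the source does not define it, but this choice reproduces the tabulated values after rounding.

```python
from statistics import mean, stdev

# R² and MPE (%) of the six MLR models, copied from Table 3.
r2_values = [0.96, 0.86, 0.18, 0.97, 0.78, 0.78]
mpe_values = [0.36, 1.13, 1.00, 0.47, 0.66, 0.67]


def summarise(values):
    """Return (average, sample standard deviation) of a list of numbers."""
    return mean(values), stdev(values)


r2_avg, r2_dev = summarise(r2_values)
mpe_avg, mpe_dev = summarise(mpe_values)
print(f"R2:  {r2_avg:.3f} +/- {r2_dev:.3f}")
print(f"MPE: {mpe_avg:.3f} +/- {mpe_dev:.3f}")
```

The large spread in R² is driven mainly by the third model (R² = 0.18), which is the kind of fluctuation the deviation row is meant to expose.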
Fig. 1 Estimated refractive index values vs. literature refractive index values (Table SI-1, ESI†). Grey circles correspond with the training data points and black squares with the verification data points. The results of the MLPs are represented in (a) to (f), and the MLRs are represented in (g) to (l). The line represents an ideal estimation in which the estimated refractive indices are the same as the ones found in the literature.
As can be seen in Tables 2 and 3, the average correlation coefficient R² is greater for the MLPs than for the MLRs (0.98 versus 0.76), and the average MPE is lower when the ANN model is applied (0.24% versus 0.72%). In addition, Fig. 1 shows clearly that the MLP models are better suited to model the given data than the MLRs.
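The two statistics used in this comparison can be sketched as below. The definitions are assumptions, since they are not spelled out in this text: MPE is taken as the mean absolute relative error in percent, and R² as the squared Pearson correlation between estimated and literature values.

```python
def mean_prediction_error(estimated, reference):
    """Mean absolute relative error in percent (assumed definition of MPE)."""
    total = sum(abs(e - r) / r for e, r in zip(estimated, reference))
    return 100.0 * total / len(reference)


def r_squared(estimated, reference):
    """Squared Pearson correlation between estimated and reference values."""
    n = len(reference)
    mean_e = sum(estimated) / n
    mean_r = sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimated, reference))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov * cov / (var_e * var_r)
```

Note that, under these definitions, a model whose estimates are all offset by a constant factor still reaches R² = 1 while its MPE is non-zero, which is one reason to report both figures together.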
When the residual error graphs (Fig. 2) are studied, it can be noticed that, in the case of the MLR models, the error dots follow a specific trend, Fig. 2(g–l). On the other hand, this behaviour is not observed in the corresponding graphs for the MLPs, Fig. 2(a–f), where the residuals show a random distribution in every case.
This trend in the MLR models indicates a systematic error during the fitting, meaning that this model is not able to adequately describe the studied system, that is, to determine the relationships between the refractive index and the polarisability of the imidazolium-based ionic liquids. On the other hand, the random distribution of errors in the case of the MLP models, as well as their lower values, shows that the previously mentioned non-linear relationship exists, and, therefore, ANNs are able to process it, while conventional linear models are not.
Interestingly, a closer look at the ionic liquids reveals two different tendencies. Within a group of ionic liquids that share a common anion type but have different cations, changing the position or length of the substituent chains on the imidazolium cation has one of two effects: the refractive index either rises as the chain length of the substituents increases or, in contrast, decreases with this growth. The first tendency is observed when the ionic liquid has borate, sulfonamide, phosphate or succinate as its anion, and the second one when halides, sulfates or dicyanamides are present. These tendencies have already been studied and recognised by Bica et al.,20 where they appear to be linked to the relationship between the polarisability and the molar volume of the ionic liquid ions, as shown in the Lorentz–Lorenz relationship, eqn (1). An increase in the alkyl chain length on the imidazolium ring leads to a rise of both the polarisability and the molar volume of the ionic fragment, and the behaviour of the refractive index depends on the ratio of these two properties. If the contribution of each CH2-unit added to the chain to the overall polarisability is higher than the corresponding volume contribution, the refractive index will rise with increasing chain length, and vice versa. The ionic liquids which follow the first tendency have anions which contain a great number of fluorine atoms (e.g. [BF4]− or [PF6]−). Fluorine atoms possess a low polarisability (0.40 Å3) and an elevated atomic volume (18 Å3),20 so the addition of CH2-units to the cation would increase the ratio αmol/Vmol, see eqn (1), thus increasing the refractive index value.
In the case of the succinate anion, the polarisability and atomic volume values of oxygen are relatively similar to those of fluorine (0.64 Å3 and 11.9 Å3, respectively).20 In contrast, the cations of the ionic liquids which follow the second tendency have higher polarisability values than the anions they are paired with. Therefore, an increase in the chain length does not affect the molecular polarisability value as much as it does the volume of the molecule, and the rise of the molecular volume leads to a decrease of the refractive index value.
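The chain-length argument can be written as a short derivation. As a sketch, assume that an ion pair has base polarisability $\alpha_0$ and volume $V_0$, and that each of the $k$ added CH2-units contributes fixed increments $\Delta\alpha$ and $\Delta V$. These four symbols are introduced here for illustration and do not appear in the original work.

```latex
\[
  r(k) = \frac{\alpha_0 + k\,\Delta\alpha}{V_0 + k\,\Delta V},
  \qquad
  \frac{\mathrm{d}r}{\mathrm{d}k}
    = \frac{\Delta\alpha\,V_0 - \alpha_0\,\Delta V}{\left(V_0 + k\,\Delta V\right)^{2}}
\]
\[
  \frac{\mathrm{d}r}{\mathrm{d}k} > 0
  \;\Longleftrightarrow\;
  \frac{\Delta\alpha}{\Delta V} > \frac{\alpha_0}{V_0}
\]
```

Since $(n^2-1)/(n^2+2)$ increases monotonically with $n$, the refractive index follows the ratio $r(k)$. Under these assumptions, lengthening the chain raises the refractive index only when the polarisability-to-volume ratio of the added CH2-unit exceeds that of the starting ion pair, which is the situation described for anions with low polarisability and large volume, and lowers it otherwise.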
In addition to this discussion, it is worth mentioning that there are other approaches to determine refractive indices. For instance, it is possible, through the use of ab initio methods, to calculate the polarisability of different chemical compounds, including ionic liquids.35 This prediction allows the indirect calculation of the refractive index of particular molecules. For three specific ionic liquids, the refractive indices obtained by Izgorodina et al.35 give an MPE of 0.97% when compared with the literature, which is higher than the values obtained with the ANN model.
The previously mentioned behaviours are difficult to process with a single linear regression model, such as the designed MLRs. Nevertheless, the capability of the ANNs to determine non-linear relationships makes them suitable candidates to identify the different tendencies which the ionic liquids present. From a statistical point of view, the MLP model has proved to be a more accurate approach for the estimation of the refractive index of the considered ionic liquids using polarisability values in comparison to the MLR model.
It is also important to note that the purity of an imidazolium-based ionic liquid can be indirectly assessed by comparison of the estimated value obtained by the MLP and the experimental one.22
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c3cp53685h |
This journal is © the Owner Societies 2014 |