Open Access Article
Kan Hatakeyama-Sato,*ᵃ Yasuei Uchima,ᵃ Takahiro Kashikawa,ᵇ Koichi Kimuraᵇ and Kenichi Oyaizu*ᵃ
ᵃ Department of Applied Chemistry, Waseda University, Tokyo 169-8555, Japan. E-mail: oyaizu@waseda.jp
ᵇ Fujitsu Ltd, Kanagawa 211-8588, Japan
First published on 15th May 2023
Data-driven optimal structure exploration has become a hot topic in materials for energy-related devices. However, this method is still challenging due to the insufficient prediction accuracy of material properties and large exploration space for candidate structures. We propose a data trend analysis system for materials using quantum-inspired annealing. Structure–property relationships are learned by a hybrid decision tree and quadratic regression algorithm. Then, ideal solutions to maximize the property are explored by a Fujitsu Digital Annealer, which is unique hardware that can quickly extract promising solutions from the ample search space. The system's validity is investigated with an experimental study examining solid polymer electrolytes as potential components for solid-state lithium-ion batteries. A new trithiocarbonate polymer electrolyte offers a conductivity of 10⁻⁶ S cm⁻¹ at room temperature, even though it is in a glassy state. Molecular design through data science will enable accelerated exploration of functional materials for energy-related devices.
Artificial intelligence-based material screening consists of three main steps: (1) database construction, (2) structure exploration, and (3) experimental investigation (Fig. 1a).3,18,19 The most challenging part of informatics is efficient structure exploration. Promising candidates must be extracted from the limited number of existing material datasets. An accurate prediction model must be constructed, and the exploration system must find reasonable structures from a huge search space of materials. For example, there are potentially more than 10⁶⁰ types of structures for organic compounds, but regular computers cannot explore each candidate in a reasonable time.20–23
Fig. 1 (a) Scheme of material exploration by artificial intelligence. (b) Protocol to extract candidate molecular structures by quantum-inspired annealing in this work.
Exploring an ideal material structure, X, with a given parameter, y, is called an inverse problem.23,24 It has become easy to construct a normal prediction model y = f(X) with conventional machine learning models implemented in, for example, scikit-learn libraries.25 In contrast, the construction of the inverse function, f⁻¹, is still difficult due to scientific and technological challenges. First, the relationship between X and y is not bijective; a specific material structure X determines property y, but many materials may have the same y.23,26 Therefore, most studies use the forward function f and examine different X to search for better structures with optimal y.1,2,18 The second problem is that even defining the prediction model f is difficult, especially in experimental projects where trainable databases are smaller.3 Every model contains prediction errors due to insufficient trainable data and the approximation of prediction algorithms. Third, extensive forward prediction with different structures X takes a long time due to the astronomical number of candidates, and thus the exploration space must be carefully reduced.20,21
Several algorithms have been proposed to solve inverse problems for materials. Deep reinforcement learning searches for suitable structures that satisfy specific target properties, and these algorithms are often developed for organic molecules, especially drugs.27 The exploration space is often reduced by specifying the pre-trained molecular structures in the deep learning models.21,27 Another deep learning algorithm, chemical autoencoder, reversibly converts chemical structures to continuous vectors,12,28,29 and structure exploration becomes an optimization problem in the multidimensional vector space.12 Recently, quantum annealing and the hardware that it has inspired have been used to explore large search spaces efficiently by superparallel computing.20,21,30,31
Despite major algorithmic and hardware developments, most technologies still have gaps for experimental molecular screening projects. The difficulties arise not only from insufficient prediction accuracy of a target parameter, but also from molecule filtering by practical criteria.21 Molecules must satisfy good synthetic accessibility, stability, safety, cost, and other parameters.21 Individual modules have been developed to evaluate the parameters, especially for drugs, but effective integration has been difficult due to problems with prediction accuracy and computational cost.32,33 Considering the current technical level of molecular screening for materials science, the suggestions may be used as indicators rather than exact solutions.
Here, we demonstrate a molecule exploration system that provides a near-ideal design for molecules to improve a specific parameter, y (Fig. 1b). A forward prediction model of y = f(X) was constructed from a structure–property relationship database of organic functional molecules. Then, the ideal vector, X, was searched for by a Fujitsu Digital Annealer, which is quantum-inspired annealing hardware.34,35 The hardware overcomes the intrinsic difficulties of real quantum machines, such as the limited dimensions of the search space and the restricted options during computation, while keeping the computation time reasonable (about 1 to 10 s). Comparing ideal X with existing molecules revealed essential features for improving the performance of materials. The exploration system was used to design a new solid-state polymer electrolyte for lithium-ion batteries. An emerging trithiocarbonate design for the electrolyte yielded a room-temperature conductivity of over 10⁻⁶ S cm⁻¹ and a high glass transition temperature of 100–125 °C. The quantum-inspired annealing algorithm may help researchers design other functional materials for energy-related devices and accelerate research and development.
The chemical structures, recorded as simplified molecular input line entry system (SMILES) strings, were converted into about 200-dimensional numerical descriptors by the RDKit module.36 During data processing, up to six compounds were extracted in order of weight per electrolyte, resulting in an approximately 200 × 6 = 1200-dimensional vector. The compound information expressed as the vector was treated as X and the logarithmic ionic conductivity was set as y. For machine learning, the dimensionality of X was reduced to about 200 by dropping the columns whose values were identical across all records. The variables were standardized as z-scores.
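The column-dropping and standardization steps described above can be sketched in a few lines. This is a minimal illustration on a made-up 3 × 4 table, not the authors' pipeline: the helper names `drop_constant_columns` and `standardize` are hypothetical, the descriptor calculation with RDKit is omitted, and the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def drop_constant_columns(rows):
    """Return (filtered_rows, kept_indices), removing columns with a single unique value."""
    n_cols = len(rows[0])
    kept = [j for j in range(n_cols) if len({row[j] for row in rows}) > 1]
    return [[row[j] for j in kept] for row in rows], kept

def standardize(rows):
    """Convert each column to z-scores (zero mean, unit population standard deviation)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

# Toy descriptor table: 3 records x 4 descriptors (values are made up).
X_raw = [
    [1.0, 0.0, 5.0, 2.0],
    [2.0, 0.0, 5.0, 4.0],
    [3.0, 0.0, 5.0, 9.0],
]
X_reduced, kept = drop_constant_columns(X_raw)  # columns 1 and 2 are constant
X_z = standardize(X_reduced)
```

Dropping the constant columns first also avoids a division by zero in the z-score step, because every remaining column has a non-zero spread.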
Quantum annealing hardware solves quadratic unconstrained binary optimization (QUBO) problems. Solutions that give the minimum in eqn (1) should be explored.30,34,35
[Eqn (1): equation image not recovered]
If we build a quadratic regression model according to eqn (1), the quantum annealer finds a solution, Xideal, that maximizes y predicted in the equation. However, Xideal is not easily found with normal nonlinear models, such as RFR.
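To make the link between a quadratic regression model and annealing concrete, the sketch below evaluates a generic quadratic model on binary inputs and finds its maximizer by exhaustive enumeration. The functional form (pairwise terms J, linear terms h, constant c) is the standard QUBO shape and is assumed here; the symbols h and c, the toy coefficients, and the brute-force search standing in for the annealer are all illustrative rather than taken from the paper.

```python
from itertools import product

def quadratic_model(x, J, h, c=0.0):
    """y = sum_{(i,j)} J[i,j]*x_i*x_j + sum_i h[i]*x_i + c for a binary vector x."""
    pair_term = sum(w * x[i] * x[j] for (i, j), w in J.items())
    linear_term = sum(hi * xi for hi, xi in zip(h, x))
    return pair_term + linear_term + c

def brute_force_argmax(J, h, c=0.0):
    """Enumerate all 2^n bit strings; feasible only for tiny n."""
    n = len(h)
    # An annealer minimizes an energy, so maximizing y means minimizing E = -y.
    return min(product((0, 1), repeat=n),
               key=lambda x: -quadratic_model(x, J, h, c))

# Made-up coefficients for a three-bit toy problem.
J = {(0, 1): 2.0, (1, 2): -3.0}
h = [1.0, -0.5, 2.0]
x_ideal = brute_force_argmax(J, h)
y_ideal = quadratic_model(x_ideal, J, h)
```

Enumeration scales as 2^n, which is why dedicated annealing hardware is needed at the bit counts discussed in this work.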
Quadratic terms Ji,j in the quadratic model were generated by randomly selecting 500 interactions between xi and xj. The Ridge algorithm was selected to determine the coefficients. Although the factorization machine algorithm can prepare adequate quadratic interaction terms with low computational costs,31 it cannot be used in this work because continuous variables are used for xi, whereas the algorithm only uses binary data as the inputs.
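The random choice of interaction terms can be illustrated as follows. This is a sketch under assumptions: the helper names, the fixed seed, and the toy input vector are hypothetical, and the subsequent Ridge fit on the expanded features (for example with scikit-learn's `Ridge`) is left out.

```python
import random
from itertools import combinations

def sample_interactions(n_features, n_pairs, seed=0):
    """Randomly pick n_pairs distinct index pairs (i, j) with i < j."""
    all_pairs = list(combinations(range(n_features), 2))
    rng = random.Random(seed)
    return rng.sample(all_pairs, n_pairs)

def quadratic_features(x, pairs):
    """Append the selected products x_i * x_j to the original features."""
    return list(x) + [x[i] * x[j] for i, j in pairs]

# About 200 variables give 19 900 possible pairs; 500 of them are kept.
pairs = sample_interactions(n_features=200, n_pairs=500)
x = [0.01 * k for k in range(200)]  # made-up input vector
phi = quadratic_features(x, pairs)  # 200 linear + 500 quadratic features
```

A linear regressor fitted on `phi` yields one coefficient per linear term and one per selected pair, which map directly onto the linear and quadratic coefficients of the model.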
Quantum annealers also only use binary inputs. Thus, a continuous variable, xi, was expressed as 10 bit data (unary method, eqn (2)). Constant cbin,k was set so that xi could take values between the minimum and maximum of the training data.
[Eqn (2): equation image not recovered]
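A unary encoding of this kind might look like the sketch below. It assumes that all 10 bits carry the same weight, (x_max − x_min)/10, and that the value is recovered as x_min plus that weight times the number of set bits; the paper's constants cbin,k may be defined differently, and the function names and range are hypothetical.

```python
def unary_encode(x, x_min, x_max, n_bits=10):
    """Map a continuous value to n_bits unary bits (the first `level` bits are 1)."""
    step = (x_max - x_min) / n_bits
    level = round((x - x_min) / step)
    level = max(0, min(n_bits, level))  # clip to the training range
    return [1] * level + [0] * (n_bits - level)

def unary_decode(bits, x_min, x_max):
    """Recover the quantized value from the number of set bits."""
    step = (x_max - x_min) / len(bits)
    return x_min + step * sum(bits)

# Made-up z-score range of [-2, 2] for one descriptor.
bits = unary_encode(0.33, x_min=-2.0, x_max=2.0)
x_back = unary_decode(bits, -2.0, 2.0)
```

With 10 bits the variable takes 11 evenly spaced levels, so the quantization error is at most half a step.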
Normal quantum annealers have difficulty in solving these problems because the number of assigned variables is large, and the applicable interactions, (i,j), are limited due to hardware problems.39 In contrast, the third-generation digital annealer can solve up to 100 000 bits with arbitrary interactions,34 which is suitable for the current work.
Here, a new regression model was developed that is compatible with the annealer (Fig. 2). The algorithm consists of a decision tree and several quadratic regressors. The tree categorizes the input into different groups according to the magnitude of y. The depth of the tree was set to two, and the training data were categorized into three groups automatically. Then, individual quadratic models were generated for the different groups. Compared with ordinary quadratic regression, the hybrid model can handle data that is more complex because of the tree. A previously reported tree-linear hybrid model (linear tree)40 was not used due to the excessive training time for high-dimensional data. The third-generation Fujitsu Digital Annealer natively handles the constraint conditions generated by the decision tree, whereas normal quantum annealers only input pure QUBO data.
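The prediction side of such a tree-plus-quadratic model can be sketched as follows. Everything concrete here is assumed for illustration: the two-variable input, the split thresholds, the leaf names, and the coefficients are made up, and the training of the tree and of the per-group regressors is not shown.

```python
def quad(x, J, h, c):
    """Quadratic model: sum J_ij x_i x_j + sum h_i x_i + c."""
    return (sum(w * x[i] * x[j] for (i, j), w in J.items())
            + sum(hi * xi for hi, xi in zip(h, x)) + c)

def leaf_of(x):
    """Hypothetical depth-two tree on a 2-D input, giving three leaves."""
    if x[0] <= 0.0:
        return "low"
    return "mid" if x[1] <= 1.0 else "high"

# One quadratic regressor (J, h, c) per leaf; coefficients are made up.
LEAF_MODELS = {
    "low":  ({(0, 1): 0.5}, [1.0, 0.0], -1.0),
    "mid":  ({(0, 1): -0.2}, [0.5, 0.5], 0.0),
    "high": ({(0, 1): 1.0}, [0.0, 2.0], 1.0),
}

def hybrid_predict(x):
    """Route x through the tree, then apply that leaf's quadratic model."""
    J, h, c = LEAF_MODELS[leaf_of(x)]
    return quad(x, J, h, c)

y_low = hybrid_predict([-1.0, 2.0])
y_high = hybrid_predict([1.0, 2.0])
```

Each leaf corresponds to one quadratic objective plus the threshold conditions that define the leaf, which is where the constraint conditions mentioned above come from.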
Quantum annealers are a new type of computer based on quantum physics and can solve QUBO problems efficiently.34,35,39 Their superparallel nature can outperform classical computers, although the acceptable dimensions of data and interactions are currently limited.39 Thus, quantum-inspired classical hardware and software have been developed as alternatives to real annealers. These quantum-annealer-mimicking systems, such as the Fujitsu Digital Annealer34,35 and simulated bifurcation machines (SBMs),42 provide practical solution times of typically 1 to 10 s, and they provide similar solutions to quantum annealers.
A third-generation digital annealer accepts one of the largest binary bit sizes of 100 000,34 whereas true (D-Wave: about 5000)39 and other (SBM: 10 000)42 quantum annealers accept smaller sizes. The higher dimension allows quasi-continuous variables to be used with annealers, even though they accept originally binary data (0 or 1). In the present work, the dimension of the electrolyte structure data X was about (200 descriptor dimension) × (maximum of 6 compound components) = 1200. If each continuous variable is expressed with 10 bits by the unary method (eqn (2)), the corresponding bit size would be as large as 12 000. This dimension size is acceptable for the digital annealer, but it can be challenging for D-Wave and SBM.
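The bit budget in this paragraph is simple arithmetic, reproduced below with the approximate capacities quoted in the text. The comparison looks only at raw bit counts; limits on which interactions a given machine can represent are not modeled, and the dictionary labels are informal.

```python
N_DESCRIPTORS = 200     # per compound (approximate, from the text)
N_COMPONENTS = 6        # maximum number of compounds per electrolyte
BITS_PER_VARIABLE = 10  # unary encoding of each continuous variable

# Approximate capacities quoted in the text.
CAPACITY_BITS = {
    "Digital Annealer (3rd gen)": 100_000,
    "SBM": 10_000,
    "D-Wave": 5_000,
}

n_continuous = N_DESCRIPTORS * N_COMPONENTS
n_bits = n_continuous * BITS_PER_VARIABLE
fits = {name: n_bits <= cap for name, cap in CAPACITY_BITS.items()}
```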
In the current work, the dimension of the continuous vector X was, fortunately, reducible from about 1200 to 200 by dropping the columns in which all the records have the same values. The hybrid model was still most compatible with the Fujitsu Digital Annealer because the decision tree and quadratic regression model generated QUBO data. The decision tree categorized the training data into three main groups. Then, three sets of QUBO data were generated with different constraints specified by the tree algorithm. With normal annealers, constraints must be carefully embedded in QUBO data by adding large constants, which requires parameter optimization. In contrast, the Digital Annealer natively accepts the constraints during solution finding, and we designed the hybrid regression model for this hardware.
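The need for parameter optimization when constraints are embedded as penalties can be seen in a small generic example. The sketch folds an "exactly one bit is set" constraint into a QUBO as P·(Σ x_i − 1)² and solves it by enumeration; the constraint type, the coefficients, and the two values of P are chosen only for illustration and are not the constraints generated by the decision tree in this work.

```python
from itertools import product

def energy(x, J, h):
    """QUBO energy: sum J_ij x_i x_j + sum h_i x_i."""
    return (sum(w * x[i] * x[j] for (i, j), w in J.items())
            + sum(hi * xi for hi, xi in zip(h, x)))

def add_one_hot_penalty(J, h, idx, P):
    """Fold P * (sum_{i in idx} x_i - 1)^2 into (J, h); the constant P is dropped."""
    J, h = dict(J), list(h)
    for a in idx:
        h[a] += -P  # x_a^2 = x_a for bits, so P*x_a - 2*P*x_a = -P*x_a
        for b in idx:
            if a < b:
                J[(a, b)] = J.get((a, b), 0.0) + 2 * P
    return J, h

def argmin(J, h):
    """Exhaustive minimizer; a stand-in for the annealer on a tiny problem."""
    return min(product((0, 1), repeat=len(h)), key=lambda x: energy(x, J, h))

h0 = [-1.0, -2.0, -1.5]  # made-up objective that prefers all bits set
weak = argmin(*add_one_hot_penalty({}, h0, idx=[0, 1, 2], P=0.5))
strong = argmin(*add_one_hot_penalty({}, h0, idx=[0, 1, 2], P=5.0))
```

With the small penalty the minimizer still violates the constraint, whereas the large one enforces it. Choosing P is therefore problem dependent, and very large values can swamp the objective, which is the tuning burden that native constraint handling avoids.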
A heatmap was created to compare Xideal with several datasets from the training data (Fig. 4). Although the first two solutions were infeasible in terms of y, they looked like existing data, indicating that the corresponding chemical structures could be reasonable.
Fig. 4 Heatmap for the continuous vector of X. The first three lines show data sampled by annealing. The rest are representative records in the training dataset.
According to the sampled solutions, essential chemical structures for higher conductivity were explored. We calculated the cosine similarity between Xideal and electrolyte data in the training dataset (Fig. S1†). As the similarity increased, the experimental conductivities tended to increase because Xideal contained essential features for larger y. The highest similarity was limited to about 70% due to the limited types of chemical structures recorded in the experimental database. Imperfect constraints during quantum sampling were another reason for the gap. We found no significant trend differences in the chemical structures of the three sampled solutions; thus, the essential structural trends should not change significantly, regardless of ionic conductivities (Fig. S2 to S4†).
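The similarity ranking described above reduces to a cosine similarity between the sampled vector and each training record. The sketch below uses made-up four-dimensional vectors and hypothetical record names; the real vectors have about 200 dimensions.

```python
from math import sqrt

def cosine_similarity(u, v):
    """cos(theta) = (u . v) / (|u| |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

x_ideal = [1.0, 0.5, -0.5, 2.0]  # made-up sampled solution
training = {
    "record_A": [0.9, 0.4, -0.6, 1.8],
    "record_B": [-1.0, 0.2, 0.7, -0.3],
}

# Rank training records from most to least similar to the sampled solution.
ranked = sorted(training,
                key=lambda k: cosine_similarity(x_ideal, training[k]),
                reverse=True)
```

Records at the top of `ranked` are the known electrolytes whose descriptor vectors point in nearly the same direction as the sampled solution.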
The chemical structures that appeared frequently are summarized in Fig. 5. The most common species were esters, such as cyclic and linear carbonates.43,44 The esters are popular electrolyte components for lithium-ion batteries because their polar moieties solvate ions. Sulfonylimide anions were also extracted many times because their delocalized negative charge allows greater dissociation of ions and higher ionic conductivity. Flexible siloxane structures also appeared as candidates. They are often used with solid-state polymer electrolytes because they exhibit lower glass transition temperatures, and thus allow more active ion movement.43,44 Furthermore, zwitterionic salts have been proposed as additives to electrolytes and their highly polarized molecular structure has attracted attention for superionic conduction.45 These results demonstrated that our system can extract important and reasonable molecular structure trends for higher performance. Such comprehensive trend analysis has been done mainly by humans but is also now accessible by computers.
Suggesting de novo molecular structures is a challenging topic in informatics. Just comparing solutions with conventional molecular data does not yield new molecular designs. However, new molecular designs can be obtained from the sampled solutions by using deep reinforcement learning, although it requires additional computation and parameter tuning.21 This work used a heuristic approach, involving manually modifying molecules from the extracted candidates.
A cyclic carbonate was suggested by quantum-inspired annealing as a reasonable electrolyte design. We decided to replace the oxygen atoms in the ring with sulfurs, yielding trithiocarbonate (Fig. 6). The decision to replace the oxygen atoms in the ester with sulfur atoms was made purely from the perspective of a human chemist. Although our system was helpful in extracting essential information from known data, it was not necessarily beneficial for providing an entirely novel perspective. Therefore, instead of oxygen, we decided to focus on sulfur, which also belongs to group 16 in the periodic table and possesses similar properties but with some differences.
Density functional theory (DFT) calculations by Gaussian16 were performed for the two cyclic molecules at the B3LYP/6-31G(d′,p′) level (Fig. 6a and b). The structures were initially optimized by a semiempirical PM6 method, and then by B3LYP/6-31G(d′,p′). The cyclic carbonate and trithiocarbonate molecules yielded large dipole moments of about 5.2, indicating the potential ability to solvate cations.
The Mulliken charge distribution on the molecules was slightly different. Heteroatoms in the two five-membered rings showed a charge of −0.3, whereas the charges on the (thio)ketone differed: the values were −0.37 for oxygen and −0.096 for sulfur. The higher electronegativity of oxygen resulted in a larger negative value on the carbonyl group. Despite the interesting similarities and differences between carbonate and thiocarbonate, thiocarbonate has been little studied as an electrolyte, except as an additive for solid electrolyte interphase formation.46
Fig. 7 Characterization and conductivity data of the polymer electrolytes. (a) Raman spectra for the electrolyte and pristine components. (b) Nyquist plot for the 11 mol% LiTFSI in PTCMA electrolyte measured from 10⁶ to 10⁰ Hz. See Fig. S5 to S8† for other plot data. (c) Room-temperature conductivity as a function of x. (d) Arrhenius plot for the 11 mol% electrolyte and a typical PEO electrolyte. Conductivity data for PEO was taken from a previous report.50
The temperature dependence of the conductivity generally followed the Arrhenius equation (Fig. 7d). The activation energy for the 11 mol% electrolyte was 35 kJ mol⁻¹, giving a much smaller temperature dependence of conductivity than the PEO composite where ion motion is strongly coupled to the segmental motion of the polymers.50
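To give a feel for what an activation energy of 35 kJ mol⁻¹ implies, the sketch below evaluates the conductivity ratio between two temperatures. It assumes the simple Arrhenius form σ = σ₀ exp(−Ea/RT) without a temperature-dependent prefactor; the two temperatures (25 and 60 °C) are arbitrary illustrative choices, not measurement conditions from the paper.

```python
from math import exp

R = 8.314462618  # gas constant, J mol^-1 K^-1

def arrhenius_ratio(ea_kj_per_mol, t1_kelvin, t2_kelvin):
    """sigma(T2) / sigma(T1) for sigma = sigma0 * exp(-Ea / (R * T))."""
    ea = ea_kj_per_mol * 1000.0  # convert kJ mol^-1 to J mol^-1
    return exp(-ea / R * (1.0 / t2_kelvin - 1.0 / t1_kelvin))

# Conductivity gain on heating from 25 C to 60 C with Ea = 35 kJ mol^-1.
ratio = arrhenius_ratio(35.0, 298.15, 333.15)
```

Under these assumptions the conductivity rises by a factor of between four and five over that interval.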
The composite film was mechanically hard even after complexation with lithium salts. A high glass transition temperature above 100 °C was obtained with the 11 mol% salt composite (Fig. S9†). The result was different from a carbonate polymer in which the sulfur atoms in PTCMA were replaced by oxygens, and the plasticizing effect lowered the glass transition temperature to −19 °C.51 The weaker interaction of lithium with sulfur than with oxygen, as expected from the HSAB rule, may have resulted in the retention of the original mechanical properties. The carbonate polymer exhibited a room-temperature conductivity of 10⁻⁵ S cm⁻¹ with LiTFSI, about 10 times greater than the current electrolyte.51 However, the loss in conductivity was slight compared with the drastic difference in glass transition temperatures of >150 °C for trithiocarbonate and −19 °C for carbonate.
Although the conductivity of 10⁻⁶ S cm⁻¹ did not reach that of state-of-the-art solid polymer electrolytes (>10⁻⁴ S cm⁻¹),43,44 the characteristics of the PTCMA electrolyte could be promising due to the improved thermal properties. Amorphous electrolytes without crosslinking show high conductivities, but are viscous above the glass transition temperature.44 Semicrystalline polymers such as PEO are mechanically tough but the crystalline domains are not highly conductive. PEO also begins to melt at a relatively low temperature of about 70 °C.44 Crosslinking is essential to improve thermal and mechanical properties, but interface engineering becomes more challenging due to the decreased miscibility with other materials.43,44 Conversely, linear and glassy polymers, including the electrolytes presented here, can have technological advantages because they are thermally stable and easily molded or solution cast. Their decoupled ion movement through hard media is scientifically attractive and even has the potential for superionic conduction.52,53 We intend to continue to optimize molecular structures using the molecule exploration system and plan to evaluate better candidates in real battery environments.
Inspired by the structures suggested by digital annealing, new solid polymer electrolytes containing cyclic trithiocarbonate moieties were investigated experimentally. Lithium salt composites showed potentially promising ionic conductivities above 10⁻⁶ S cm⁻¹ at room temperature and thermal stabilities above 80 °C. Data-driven molecular screening can objectively extract essential features for higher performance and help design newer structures. More extensive applications with different materials will improve the exploration system and accelerate research to pursue better functional materials for energy-related devices.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3ra01982a
This journal is © The Royal Society of Chemistry 2023