Vincent Chung*a, Aron Walsh a and David J. Payne abc
aDepartment of Materials, Imperial College London, South Kensington, London SW7 2AZ, UK. E-mail: vincent.chung15@imperial.ac.uk
bResearch Complex at Harwell, Harwell Science and Innovation Campus, Didcot, Oxfordshire OX11 0FA, UK
cNEOM Education, Research, and Innovation Foundation, Al Khuraybah, Tabuk 49643-9136, Saudi Arabia
First published on 19th July 2025
The rate of materials discovery is limited by the experimental validation of promising candidate materials generated from high-throughput calculations. Although data-driven approaches, utilizing text-mined datasets, have shown some success in aiding synthesis planning and synthesizability prediction, they are limited by the quality of the underlying datasets. In this study, synthesis information of 4103 ternary oxides was extracted from the literature, including whether the oxide has been synthesized via solid-state reaction and the associated reaction conditions. This dataset provides an opportunity to supplement existing solid-state reaction models via reliable data and information from articles whose content and formats are challenging to extract automatically. A simple screening using this dataset identified 156 outliers in a 4800-entry subset of a text-mined dataset, of which only 15% were extracted correctly. Finally, this dataset was used to train a positive-unlabeled learning model to predict the solid-state synthesizability of new ternary oxides, where we predict that 134 out of 4312 hypothetical compositions are likely to be synthesizable.
A promising and scalable method is to use data-driven approaches to learn from synthesis records. Raccuglia et al. used hydrothermal reaction data from their laboratory notebooks to train a model to predict the reaction outcome of templated vanadium selenite.13 Bartel et al. applied the independence screening and sparsifying operator (SISSO) method to analyze synthesized perovskite oxides and halides to create a new tolerance factor, with overall improved performance compared to the traditional perovskite Goldschmidt tolerance factor.14,15
The first major obstacle for data-driven approaches to predicting materials' synthesizability is the low quantity and quality of relevant data. Synthesis information is not easily accessible on a large scale because it is commonly stored in text format in the literature or private lab books.13 To overcome this challenge, natural language processing (NLP) techniques have been used to build material synthesis datasets. Kim et al. developed an autonomous framework to extract the synthesis parameters from over 640000 journal articles on 30 oxide systems.16 This was later expanded and improved, and the resulting text-mined dataset was used to predict hydrothermal synthesis conditions17 and to plan the solid-state synthesis of metal oxides.18 Later on, Kononova et al. developed a text mining pipeline for solid-state reactions and sol–gel synthesis data from the literature,19 which was used to train a reaction graph model for predicting the major product20 and synthesis conditions21,22 of solid-state reactions. More recently, models trained with text-mined datasets were used to generate synthesis recipes for an autonomous laboratory that accelerated the discovery of novel materials.23
As data availability increases, the bottleneck of data-driven approaches shifts from quantity to quality of text-mined datasets. As demonstrated in the chemistry community, the difference in performance between a well-filtered and noisy dataset should not be ignored.24–26 The overall accuracy of the Kononova et al. dataset (where all of the extracted synthesis conditions and actions of the entry are correct) is only 51%,19 which was cited by Malik et al. as a reason to use coarse descriptions of synthesis action (e.g. mix/heat/cool) instead of detailed descriptions (e.g. heating temperature/time) in their study.20 While it is widely acknowledged that the quality of text-mined datasets is lower than their manual counterparts,27 no quantitative analysis between the two has been performed in the materials domain, which could have served as a milestone or a goal for the material text-mined dataset.
Another issue with current material synthesis data, as highlighted by Raccuglia et al. and Jensen et al., is that papers rarely include failed material synthesis attempts,13,28 which is challenging to resolve without a change in the scientific community. One approach to overcome the lack of failed reaction data is positive-unlabeled (PU) learning, a type of semi-supervised learning used when only positive and unlabeled data are available.29 Frey et al. adopted a transductive bagging PU learning approach developed by Mordelet et al. to predict the synthesizability of 2D MXenes and their precursors.30,31 Jang et al. later used a similar approach to predict the synthesizability of hypothetical compounds in the Materials Project.32 Recently, Gu et al. used inductive PU learning and domain-specific transfer learning to predict the synthesizability of general perovskites, which showed better performance than Jang et al.'s and tolerance factor-based approaches.33 However, all three studies can only evaluate the positive data, so it is difficult to estimate the number of false positives (compounds that cannot be synthesized but are classified as synthesizable).
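The bagging idea behind PU learning can be illustrated with a minimal sketch. It is not the implementation used in any of the cited studies: the nearest-centroid scorer, the function names, and the sampling without replacement are all assumptions made to keep the example self-contained. In each round, a random subset of the unlabeled data is treated as provisional negatives, a classifier is fitted against the positives, and only the unlabeled points left out of that round are scored; the scores are then averaged.

```python
import random
from statistics import mean


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(column) for column in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def score(x, pos_centroid, neg_centroid):
    """Toy base classifier (an assumption): near 1.0 if x is closer to the positives."""
    d_pos, d_neg = sq_dist(x, pos_centroid), sq_dist(x, neg_centroid)
    if d_pos + d_neg == 0:
        return 0.5
    return d_neg / (d_pos + d_neg)


def pu_bagging_scores(positives, unlabeled, n_rounds=100, seed=0):
    """Average out-of-bag score for every unlabeled point."""
    rng = random.Random(seed)
    # Size of each provisional-negative bag; keep at least one point out of bag.
    k = min(len(positives), len(unlabeled) - 1)
    pos_c = centroid(positives)
    totals = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    for _ in range(n_rounds):
        bag = set(rng.sample(range(len(unlabeled)), k))
        neg_c = centroid([unlabeled[i] for i in bag])
        for i, x in enumerate(unlabeled):
            if i not in bag:  # score only points not used as negatives this round
                totals[i] += score(x, pos_c, neg_c)
                counts[i] += 1
    return [t / c if c else float("nan") for t, c in zip(totals, counts)]


positives = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]
unlabeled = [[1.1, 1.0], [5.0, 5.0], [5.2, 4.8], [4.9, 5.1], [1.0, 0.95]]
scores = pu_bagging_scores(positives, unlabeled)
```

In this toy example, the unlabeled points that lie close to the positives receive higher averaged scores than the distant ones, which is the behavior a synthesizability score relies on.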
In this paper, a dataset recording whether ternary oxides in the Materials Project database with ICSD IDs have been synthesized via solid-state reaction was curated manually (referred to as the human curated dataset). The curation included articles whose formats are difficult to extract automatically. Potential applications of this human curated dataset are illustrated in the following sections: (1) analysis of Ehull with solid-state synthesizability, defined as whether the material can be synthesized via solid-state reaction, as opposed to general synthesizability; (2) outlier detection of a text-mined dataset on solid-state reaction; (3) prediction of solid-state synthesizability using a PU learning framework.
Each ternary oxide has been checked in the literature for whether it has been synthesized via a solid-state reaction. If there is at least one record that the compound has been synthesized via solid-state reaction, the highest heating temperature, pressure, atmosphere, mixing/grinding condition, number of heating steps, cooling process, precursors, and whether the synthesized product is single-crystalline, were collected when available. Otherwise, the material would be labeled as non-solid-state synthesized (the material has been synthesized but not via solid-state reactions). For entries in which there was insufficient evidence that the ternary oxides have been synthesized via solid-state reactions, they were labeled as undetermined. The reasons for these undetermined entries are provided in the comment section of the dataset. In total, the human curated dataset contains 3017 solid-state synthesized entries, 595 non-solid-state synthesized entries, and 491 undetermined entries. The data description of the human curated dataset can be found in ESI S2.†
For comparison, Kononova et al.'s text-mined solid-state reaction dataset (ver. 2020-07-13) was downloaded through their repository.19 It contains 31782 solid-state reaction entries from the literature after the year 2000. The definition of whether a synthesis is considered a solid-state reaction in this study differs from those in previous studies. Huo et al. defined a solid-state reaction as follows: (1) the input materials are subjected to a process of grinding/milling; (2) the powders are mixed and heated.35 During data curation, we observed that a non-negligible number of papers did not explicitly state the grinding/milling steps, so the first criterion was dropped. We also added two criteria for a synthesis to be considered a solid-state reaction: (3) the reaction does not involve flux or cooling from melt (except for high-pressure solid-state synthesis where oxidizers were used with a secondary function as flux or mineralizer for higher crystallinity when explicitly stated); (4) the heating temperature must not be above the melting point of all the starting materials. Details on the processing of the text-mined dataset can be found in ESI S3.1.†
For the analysis of the dataset, binary oxide melting points were taken from the CRC Handbook of Chemistry and Physics online36 and other papers. Some of the melting points are the decomposition temperature or the transition temperature.
The metrics used for the validation were recall and precision. The recall and precision used in the paper were based on the definitions used by Raghavan and Jung for information retrieval.37 The recall is the ratio of the number of correctly extracted values to the number of relevant values. The precision is the ratio of correctly extracted values to the number of extracted values. For consistency, data validation was performed by the same (first) author who manually extracted the dataset. The formula and an example of the recall and precision calculation are shown in ESI S4.†
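The two definitions above can be sketched in a few lines. This is a minimal illustration rather than the exact procedure of ESI S4: the function name, the dictionary layout, and the use of `None` for a missing value are assumptions made for the example.

```python
def extraction_precision_recall(extracted, reference):
    """Precision and recall of extracted values against a reference.

    Both arguments map an entry id to a value; None means no value was
    extracted (or no relevant value exists in the paper).
    """
    n_extracted = sum(v is not None for v in extracted.values())
    n_relevant = sum(v is not None for v in reference.values())
    n_correct = sum(
        1
        for key, value in extracted.items()
        if value is not None and reference.get(key) == value
    )
    precision = n_correct / n_extracted if n_extracted else float("nan")
    recall = n_correct / n_relevant if n_relevant else float("nan")
    return precision, recall


# Hypothetical highest heating temperatures (in °C) for four entries.
reference = {"A": 900, "B": 1200, "C": 1000, "E": 700}
extracted = {"A": 900, "B": 1100, "C": 1000, "E": None}
precision, recall = extraction_precision_recall(extracted, reference)
```

Here three values were extracted and two of them are correct, while four relevant values exist, so the precision is 2/3 and the recall is 2/4.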
The solid-state synthesized and non-solid-state synthesized compositions were gathered from the human curated dataset, while the hypothetical compositions were collected from the Materials Project ternary oxide entries without ICSD IDs or with ICSD IDs that reference a computational (i.e. non-experimental) paper. All compositions were then featurized using matminer38 and basic mathematical operations based on their binary oxide melting points (ESI S5.1†). Compositions that contain the element Pa or whose oxidation states are difficult to assign were removed, resulting in 7033 compositions in the dataset for PU learning. This dataset for PU learning contains 2213 solid-state synthesized, 508 non-solid-state synthesized, and 4312 hypothetical compositions.
In total, three PU learning models were trained based on different labeling schemes but on the same dataset, as shown in Fig. 1. The task of model 1 and model 2 is the prediction of solid-state synthesizability, while the task of model 3 is the prediction of general synthesizability. The data labeling schemes for the three models are as follows:
• For model 1, the solid-state synthesized compositions are positively labeled, while the hypothetical compositions and non-solid-state synthesized compositions are unlabeled.
• For model 2, the solid-state synthesized compositions are positively labeled, the non-solid-state synthesized compositions are negatively labeled, and the hypothetical compositions are unlabeled.
• For model 3, the solid-state synthesized and non-solid-state synthesized compositions are positively labeled, while the hypothetical compositions are unlabeled.
The data labeling of models 1 and 2 differs in the treatment of non-solid-state synthesized compositions, which are unlabeled in model 1 and negatively labeled in model 2. Both schemes were tested and compared because the non-solid-state synthesized compositions are relatively noisy and are not a random sample of solid-state unsynthesizable compositions (e.g. compositions that cannot be synthesized by any method are also solid-state unsynthesizable, but they are not among the non-solid-state synthesized compositions).
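The three labeling schemes can be summarized as a small lookup table. The sketch below is illustrative only: the category strings, constant names, and `label_compositions` helper are assumptions, while the composition counts are those given for the PU learning dataset.

```python
from collections import Counter

POSITIVE, NEGATIVE, UNLABELED = "positive", "negative", "unlabeled"

# One row per model: how each category of composition is labeled.
LABELING_SCHEMES = {
    "model_1": {"solid_state": POSITIVE, "non_solid_state": UNLABELED, "hypothetical": UNLABELED},
    "model_2": {"solid_state": POSITIVE, "non_solid_state": NEGATIVE, "hypothetical": UNLABELED},
    "model_3": {"solid_state": POSITIVE, "non_solid_state": POSITIVE, "hypothetical": UNLABELED},
}


def label_compositions(categories, model):
    """Map each composition's category to its label under the given scheme."""
    scheme = LABELING_SCHEMES[model]
    return [scheme[category] for category in categories]


# Category counts taken from the PU learning dataset described above.
categories = (
    ["solid_state"] * 2213 + ["non_solid_state"] * 508 + ["hypothetical"] * 4312
)
label_counts = {
    model: Counter(label_compositions(categories, model))
    for model in LABELING_SCHEMES
}
```

Under these counts, model 1 sees 2213 positive and 4820 unlabeled compositions, model 2 sees 2213 positive, 508 negative, and 4312 unlabeled compositions, and model 3 sees 2721 positive and 4312 unlabeled compositions.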
In addition to the three PU learning models, a supervised learning LightGBM39 classification model was trained without using the hypothetical compositions to show why PU learning is required for synthesizability prediction. For this supervised learning model, the solid-state synthesized compositions are positively labeled and the non-solid-state synthesized compositions are negatively labeled. The data was separated into train-test sets with an 8:2 ratio in a stratified manner. 10-Fold cross-validation was then performed on the training set. In the end, model evaluation was performed on the test set. The LightGBM model was trained on an AMD Radeon GPU with the 64-bit floating-point setting specified, to prevent the reproducibility issues experienced when using 32-bit floating point (the default) for number summations.39 Details of feature selection and hyperparameter tuning of all models are in ESI S5.2 and S5.3.†
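The stratified 8:2 split followed by 10-fold cross-validation can be sketched with the standard library alone. This is an illustrative sketch, not the code used in the study: the function names and the fixed seed are assumptions, and the class counts are borrowed from the PU learning dataset purely as an example.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test sets while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def stratified_folds(indices, labels, n_folds=10, seed=0):
    """Deal the shuffled indices of each class round-robin into n_folds folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(n_folds)]
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        for position, idx in enumerate(class_indices):
            folds[position % n_folds].append(idx)
    return folds


# 1 = solid-state synthesized, 0 = non-solid-state synthesized (example counts).
labels = [1] * 2213 + [0] * 508
train_idx, test_idx = stratified_split(labels)
folds = stratified_folds(train_idx, labels)
```

Each fold serves once as the validation set while the other nine are used for training. In practice, scikit-learn's `train_test_split` (with its `stratify` argument) and `StratifiedKFold` provide the same functionality.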
For model 3, due to the absence of negative data (unsynthesizable compositions), the biased ROC AUC was computed instead, which treats all unlabeled data (hypothetical compositions) as unsynthesizable compositions with a negative label. For a clean positive data set (no mislabeled positive entries), the relationship between the true AUC and the biased AUC (AUCPU) is described by the following equation:40
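A standard way to write this relationship is sketched below. It is an illustrative derivation and not a reproduction of the cited equation. It assumes that the unlabeled set contains an unknown fraction α of truly positive entries, that scores are continuous (no ties), and that s(·) denotes the classifier score; the symbols α, s, x⁺, x⁻ and xᵘ are introduced here for illustration only.

```latex
\begin{aligned}
\mathrm{AUC}_{\mathrm{PU}}
  &= P\big(s(x^{+}) > s(x^{u})\big) \\
  &= \alpha\,P\big(s(x^{+}) > s(x'^{+})\big)
     + (1-\alpha)\,P\big(s(x^{+}) > s(x^{-})\big) \\
  &= \frac{\alpha}{2} + (1-\alpha)\,\mathrm{AUC}
  \quad\Longrightarrow\quad
  \mathrm{AUC} = \frac{\mathrm{AUC}_{\mathrm{PU}} - \alpha/2}{1-\alpha}
\end{aligned}
```

Under these assumptions, AUC − AUCPU = α(AUC − 1/2), so the biased AUC underestimates the true AUC whenever the classifier performs better than random and the unlabeled set contains some positive entries.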
The precision and recall of the solid-state reaction conditions extracted in the solid-state synthesized entries are shown in Table 1. Overall, the precision of the extracted conditions is high (0.96–1), while the recall is slightly lower (0.89–1). Lower extraction performance was observed for columns that contain more information, namely the cooling and mixing/grinding conditions.
Metric | Solid-state synthesized | Precursor | Heating temperature | Heating pressure & atmosphere | Cooling | Mixing/grinding |
---|---|---|---|---|---|---|
Precision | 0.99 | 0.96 | 0.98 | 1 | 1 | 0.96 |
Recall | N/a | 0.99 | 1 | 0.98 | 0.9 | 0.89 |
Metric | Solid-state synthesized | Precursor | Heating temperature |
---|---|---|---|
Precision | 0.88 | 0.93 | 0.85 |
Recall | N/a | N/a | 0.87 |
• Target material synthesized via non-solid-state methods:
– Three of them were synthesized via methods that do not fit Huo et al.'s definition of a solid-state reaction.35 These alternative synthesis methods are sol–gel precursor synthesis, solid–gas reaction, and mechanical activation of precursors without heating.
– Two of them described the solid-state synthesis attempts of Nd6Mo10O39 and LiBiO3, but according to their respective result and discussion sections, their synthesis attempts were unsuccessful.41,42
– Seven of them had erroneous target strings. In some of these cases, the target composition described the nominal composition of the mixture instead of the actual composition of the target.
• Incorrect precursor:
– Three of the seven erroneously extracted precursors were due to the extraction of precursors from another target or synthesis route in the same paragraph.
• Incorrect highest heating temperature:
– Eleven of them did not contain the heating step with the highest heating temperature.
– Ten of them did not extract the correct highest heating temperature from the heating step.
The recall of the precursor category for the text-mined dataset is not applicable because Kononova et al. filtered out entries where balanced chemical reactions could not be formed using the target and precursors of the entries.19 Therefore, all of the text-mined entries have precursor information. Out of the three categories, the highest heating temperature performed the worst, with a precision and recall of 0.85 and 0.87, respectively, which are notably worse than the human curated dataset's precision and recall.
The overall lower extraction performance suggests that a model trained using the text-mined dataset should perform worse than a model trained using the human curated dataset. However, the severity of the impact would require further investigation. For example, solid-state reactions can occur over a relatively wide temperature range and the range can be several times larger than the mean absolute error (MAE) of the highest heating temperature in the text-mined dataset, which was calculated to be 48 °C. On the other hand, models for synthesis condition prediction in the literature used features generated from the target and precursor information.21 Therefore, the extraction errors of the precursor and target should be considered. Other factors, such as the number of chemical systems, are also important and can affect the generalization of models. For the 100 entries in the text-mined dataset examined, only 63% of the entries have the correct solid-state synthesized target, precursors, and highest heating temperature.
Another limitation of current automatic data extraction pipelines is that many of them focus only on the paragraphs describing the synthesis, which might ignore relevant information located in other sections of the paper. The focus on extraction from a single paragraph in the methods section also misses opportunities to curate information on the target material such as crystal structure and physical properties, which would provide useful information for future investigations on wider quantitative structure–property relationships. However, training a model to collect information from non-method sections of the paper increases training complexity and time, so this trade-off has to be considered. Human curation also offers increased flexibility, where additional but crucial information or details outside of the initial extraction plan can be collected.
The largest disadvantage of human data extraction is the time required to curate the data. Data collection of the 4103 ternary oxide entries took around a year for a single person, with synthesis condition extractions taking up to 10 minutes per paper. This is significantly longer than automatic pipelines for text mining synthesis conditions because, once the models are trained, the time taken per paper is on the scale of seconds. A possible workaround is a combination of crowd-sourcing and semi-automated text mining methods,43 but some difficulties remain, mainly the software for collaborative annotation and the lack of community-agreed standards.27 In addition, once the automatic pipeline is trained, the information extracted from the same paper will be consistent and the decision-making process can be reviewed and analyzed. The same cannot readily be done for manual data extraction, even if strict criteria are applied.
Fig. 2a agrees with the previous analysis of computational material databases where the frequency of synthesized materials decreases at higher Ehull,10,44 making it a simple criterion to account for the likelihood of finding synthesizable materials. In addition, we found that around 6% of the examined ternary oxide entries are hydrates in the ICSD but not in the Materials Project, where only hydrogen atoms are missing. Most of them tend to have higher energy (81.3% of them have an Ehull above 100 meV per atom) so setting a heuristic upper limit to Ehull during screening may be effective at filtering them out.
However, the ratio of solid-state synthesized materials to synthesized materials does not follow the same trend: the ratio is 0.69–0.87 across all Ehull values (narrower Ehull intervals show the same behavior, as shown in ESI S6†). This suggests that it may not be possible to set a simple energy threshold to determine the solid-state synthesizability. There are a few reasons:
• Solid-state reactions have a wide range of reaction conditions that could change throughout the synthesis process. This means that calculation of the free energy of reactions at one particular set of reaction conditions and reactants cannot reflect the stability of phases throughout the reaction, for example, because of the formation of intermediate phases.
• At the temperatures typically used in solid-state reactions, the entropic contribution to the free energy of competing phases and reactions cannot be ignored.11 In addition, configurational entropy also plays a major role in the stability of multi-component materials.45,46
• The kinetic barriers of reactions affect the formation of intermediate phases during solid-state reactions, which could affect the final synthesis target. Todd et al. showed that for the metathesis reaction LiMnO2 + YOCl → LiCl + YMnO3, using different polymorphs of LiMnO2 as the reactant alters the rate of the initial reaction, resulting in the formation of different polymorphs of YMnO3.47 Non-equilibrium cooling from high temperature can also stabilize metastable phases and prevent their decomposition. For example, the crystal structure of solid-state synthesized Bi0.4Pb0.6Mn0.4Ti0.6O3 depends on whether the reaction was quenched or slowly cooled.48
Another observation is that ∼96% of the entries have at least one binary oxide as one of the precursors, and ∼58% of them used binary oxides exclusively. Similar observations were made by He et al. for the text-mined dataset, where simple oxides are the most commonly used precursors, which was attributed to their stability under ambient conditions.49 The dominant usage of a few types of precursors implies that the trends in synthesis conditions obtained from an analysis of literature data are biased. Researchers should take this into consideration when using this information for the prediction of synthesis conditions.
Compared to heating conditions, which are mostly reported numerically, the descriptions of cooling conditions can be either numerical (either the cooling rate or time), descriptive (e.g. furnace cooling, slow cooling), or both, with descriptive conditions being more common. The frequency of the three most common descriptions of cooling conditions (quench, slow cooling, and furnace cooling) and the cooling condition information (cooling rate, cooling time, and cooling medium) are shown in Fig. 4.
Fig. 4 Frequency of cooling condition descriptions and information out of the 607 entries that have cooling conditions in the human curated dataset.
The choice of cooling conditions, similar to the choice of precursors, is also influenced by researchers' bias. For example, out of the 607 examined papers with cooling conditions in the human curated dataset, 544 (90%) have one cooling step while 63 (10%) have two or more cooling steps. In particular, out of 47 Mo ternary oxide entries, 29 used two or more cooling steps for the solid-state synthesis, which is a much higher ratio than for other ternary oxides. An examination of the entries suggests that this is likely due to researchers' bias, since 26 of the 29 entries came from 18 reference articles with at least one but often two common authors.
On the other hand, descriptions of mixing and grinding conditions are mostly descriptive. As shown in Fig. 5, manual mixing and grinding are more popular than mechanical alternatives (e.g. ball milling and vibration milling). The top three liquid mediums used for wet milling are acetone, alcohol (mainly ethanol), and water. The high usage rate of these liquids is probably due to the fact that these chemicals are widely available in all laboratories, and common precursors used in solid-state synthesis are insoluble in these liquid mediums. Although the choice of liquid mediums may not affect the chemical composition of the synthesis target, different liquid mediums would influence the particle size and geometry of the precursors, which in turn would affect the properties of the synthesis product.50,51
Despite the lower reporting frequency of cooling and mixing conditions, they are sometimes important to solid-state synthesis, for example:
• The solid-state reaction between Er2O3 and Na2CO3 would yield NaErO2 with either α-NaFeO2 type or β-LiFeO2 type structure depending on the cooling condition.52
• Wet milling of Bi2O3 precursors with acetone under ambient atmosphere would introduce carbon contaminants, which could lead to the formation of Bi2O2CO3 as an impurity in solid-state reactions.53
• Chen et al. compared solid-state synthesized LiNi0.5Mn1.5O4 samples prepared with ball-milled and manually ground precursors and noticed a difference in electrochemical performance.54
Unfortunately, there is usually no clear indication of the importance or necessity of these synthesis conditions on the synthesis outcome.
Interestingly, the publication date of the article affected the degree of data omission. Out of 1149 entries that referenced pre-2000 sources (1929–1999), only 223 (19%) and 127 (11%) have any cooling and mixing conditions, respectively. These increased to 384 (26%) and 434 (30%), respectively, out of 1469 entries that referenced post-2000 sources. There are also minor differences in the reported synthesis conditions. For example, more post-2000 papers reported using furnace cooling or defined cooling rates.
In summary, the reporting habit and frequency of the synthesis conditions from the literature should be taken into account when preparing an annotated synthesis text to train a text-mining algorithm to extract synthesis conditions. As discussed above, certain conditions like cooling and mixing conditions are reported less frequently and sometimes in greater varieties (e.g. cooling conditions could be a cooling rate, cooling time, or a description), which means a greater amount of annotated synthesis text is required for the model to learn to extract these conditions compared to more common conditions like the precursors and heating temperature.
As a demonstration, the general trend of ternary oxide calcination temperature observed from the human curated dataset was applied to find outliers in a subset of the text-mined dataset. The text-mined subset consisted of 4800 entries where each entry contains one ternary metal oxide as the target material and the constituent elements of the target material are all present in the human curated dataset. The ratios of the highest heating temperature to the minimum/maximum/mean of the binary oxide melting points were chosen as filters. This choice was made because the melting points of the precursors are a common consideration when selecting the heating temperature for solid-state synthesis, e.g. Tammann's rule.55 Binary oxide melting points were used in place of precursor melting points as the latter are not easily available, and binary oxides are used in most solid-state syntheses of ternary oxides.
The cumulative distribution of the ratio of the highest heating temperature to the maximum binary oxide melting point for the human curated dataset and text-mined subset is shown in Fig. 6. This ratio was chosen because it has the highest Pearson's correlation of the three ratios (ESI S8†). 90% of the text-mined subset has a ratio between 0.44 and 0.80, which is narrower than the range of the human curated dataset (0.40 to 0.87). The wider distribution is due to a larger number of chemical systems in the human curated dataset (1072 vs. 665) despite having fewer entries (2234 vs. 4800). The inclusion of more chemical systems could improve the generalizability of machine learning models, as shown by D. Jha et al., where the MAE of the predicted formation enthalpy of compounds increased when the compounds that share the same chemical systems were removed from the training data.56
To identify the outliers, the minimum/maximum and 1st/99th percentile pairs of the 3 ratios calculated from the human curated dataset were chosen as thresholds to filter the text-mined subset. The reference papers of the entries considered to be outliers were examined to verify whether the synthesis target, method, and highest heating temperature were correctly extracted. Discrepancies in the definition of solid-state synthesis were accounted for during examination by using Huo et al.'s definition.35
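The screening step can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names are hypothetical, temperatures are assumed to be in kelvin, only the ratio to the maximum binary oxide melting point is shown, and the percentile uses simple linear interpolation.

```python
def percentile(values, q):
    """Linear-interpolation percentile of values for q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * q / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def temperature_ratio(heating_temp_k, binary_oxide_melting_points_k):
    """Highest heating temperature divided by the maximum binary oxide melting point."""
    return heating_temp_k / max(binary_oxide_melting_points_k)


def flag_outliers(reference_ratios, candidate_ratios, low_q=1, high_q=99):
    """Indices of candidate entries whose ratio falls outside the reference range.

    Use low_q=0 and high_q=100 for the minimum/maximum filter.
    """
    low = percentile(reference_ratios, low_q)
    high = percentile(reference_ratios, high_q)
    return [i for i, r in enumerate(candidate_ratios) if r < low or r > high]


# Hypothetical reference ratios spanning 0.40 to 0.87 in steps of 0.01.
reference = [0.40 + 0.01 * i for i in range(48)]
candidates = [0.30, 0.60, 0.95]
outliers_minmax = flag_outliers(reference, candidates, low_q=0, high_q=100)
```

With these made-up numbers, the first and third candidates fall outside the reference range and are flagged, after which the corresponding papers would be inspected manually.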
Out of the 52 entries identified as outliers using the minimum/maximum ratios as filters, only 5 (∼10%) were correctly and completely extracted. When the 1st and 99th percentiles were used as filters, the proportion of correct entries increased to ∼15% (23/156). Most of these outliers were detected because their ratios are below the lower thresholds, whereas only 6 entries out of 156 have ratios above the upper thresholds.
Out of the 133 erroneous entries, around 65 (49%) have erroneous highest heating temperatures, 45 (34%) have the wrong synthesis method, and 23 (18%) have the wrong target. Most of these errors are described in the earlier sections on the evaluation of the text-mined dataset. In particular, it was observed that sentences that described heating operations but used a general verb that can apply to all synthesis methods (e.g., “synthesis” and “prepare”) were not extracted properly because the action was defined by the researchers as “starting operations” as opposed to “heating operations”. This illustrates how a possible error in the data annotation could affect automatic text extraction. An example of the above can be found in ESI S3.2.†
The higher quality of the human curated dataset allows the derivation of simple criteria for filtering text-mined datasets, which can reduce error propagation when the dataset is used for model training and also help researchers identify the limitations and flaws in the automatic text extraction process.
Model | Decision threshold | ROC AUC | G-mean | TPR | FPR | Predicted SSS (%) |
---|---|---|---|---|---|---|
Model 1 | 0.562 | 0.921 | 0.863 | 0.780 | 0.047 | 3.9 |
Model 2 | 0.584 | 0.771 | 0.713 | 0.738 | 0.311 | 7.9 |
Supervised model | 0.827 | 0.775 | 0.767 | 0.813 | 0.314 | 66.3 |
(1) The non-solid-state synthesized compositions cannot fully represent the solid-state unsynthesizable compositions, because they do not include unsynthesizable compositions, which are present only among the hypothetical compositions.
(2) As shown in previous sections, the probability of non-solid-state synthesized compositions mislabeled during manual data collection is relatively high (precision of 0.86), which means that around 14% of them should be positively labeled instead.
Therefore, in all iterations during training, model 2 classifiers learned from a higher proportion of non-solid-state synthesized compositions that are negatively labeled (step 3 in Fig. 1), which reduces the effectiveness of the model in learning from the unlabeled data and in distinguishing between solid-state synthesizable and unsynthesizable compositions (Fig. 8).
The similar performances of model 2 and the supervised learning model highlight the importance of understanding the dataset before model training. Although the supervised learning model appeared to perform better when judged on the evaluation metrics alone, the model cannot predict the solid-state synthesizability of hypothetical compositions reliably. This is because the supervised learning model learns only from solid-state and non-solid-state synthesized compositions (both are synthesizable compositions), but does not learn from unsynthesizable compositions. Therefore, the supervised learning model can only make solid-state synthesizability predictions on synthesizable compositions, which is not the case for hypothetical compositions, since they contain both synthesizable and unsynthesizable compositions. As a result, the supervised learning model predicted a high number of hypothetical compositions to be solid-state synthesizable, as shown in Table 3.
To further verify the model's predictive ability, the predictions made by model 1 on non-solid-state synthesized compositions were examined. An examination of 24 compositions predicted to be solid-state synthesizable but labeled as non-solid-state synthesized in the human curated dataset found that 45.8% (11 out of 24) of them have been synthesized via solid-state synthesis, which indicates they were erroneous entries in the human curated dataset. For comparison, the 40 compositions with the highest scores below the decision threshold and the 30 compositions with the lowest scores were also examined, of which only 7.5% and 10%, respectively, were found to have been synthesized via solid-state reaction. The much higher percentage of solid-state synthesized compositions above the decision threshold than below it indicates that model 1 is capable of predicting the solid-state synthesizability of compositions.
Materials Project ID | Composition | Solid-state synthesizability score |
---|---|---|
mp-778430 | K8Al2O7 | 0.926 |
mp-1200359 | Na12(CuO2)7 | 0.895 |
mvc-3343 | Zn3Sn2O7 | 0.869 |
mp-774805 | Na6Mn7O10 | 0.861 |
mp-758139 | Sr21Co14O43 | 0.844 |
mp-1197010 | Sr5Ti9O23 | 0.841 |
mp-1223708 | K2Mo8O13 | 0.838 |
mp-1096838 | Ba(AgO)2 | 0.827 |
mp-774428 | K3V14O28 | 0.826 |
mp-1197629 | Sr3Ti5O13 | 0.819 |
mp-1202462 | Sr5Ti8O21 | 0.812 |
mp-674312 | Eu2Nb4O13 | 0.810 |
mp-1147775 | Ba2OsO4 | 0.807 |
mp-1018032 | SrCdO2 | 0.805 |
mp-1198567 | Sr25Ti39O103 | 0.791 |
mp-757454 | Mn6PbO12 | 0.790 |
mp-1202132 | Sr5Ti7O19 | 0.789 |
mp-1201432 | Sr15Ti23O61 | 0.787 |
mp-755102 | K6Cr2O9 | 0.781 |
mp-1208410 | TbMoO5 | 0.746 |
mp-773070 | KNb2O5 | 0.737 |
mp-753320 | RbV4O10 | 0.718 |
mp-981103 | Sr3CdO4 | 0.705 |
mp-674350 | TiPb9O11 | 0.703 |
mp-1095551 | K8AsO3 | 0.691 |
During inspection of the candidate compositions, three were removed because they are highly likely to be already-synthesized compositions with non-stoichiometric oxygen (Bi16Ru16O55 is probably oxygen-deficient Bi2Ru2O7) or with fractional occupancies (Na11(Ru4O9)4 and K6Nb11O30 are probably Na2.7Ru4O9 (ref. 57) and K6Nb10.88O30 (ref. 58)). This highlights a limitation of the Materials Project, where disordered materials are not represented distinctly from ordered materials.59
Fig. 9 Comparison of the general synthesizability scores of the 6924 ternary oxide compositions present in both the current study and the study of Jang et al.32 (a) 2652 compositions with ICSD IDs. (b) 4272 compositions without ICSD IDs. The horizontal and vertical lines are the decision thresholds for model 3 and the Jang et al. model, respectively. The upper right and lower left quadrants of each plot contain the compositions for which the classifications of both models agree. For compositions with polymorphs in the Jang et al. study, the highest score of each composition was chosen for this comparison.
One possible reason for the difference in prediction is that, although both models used PU learning, Jang et al.'s model used structural features,32 while model 3 used compositional features. This could mean that a hypothetical material with a dissimilar composition but a similar crystal structure to synthesized materials might be predicted as synthesizable by Jang et al.'s model but unsynthesizable by model 3. Another reason is the difference in training data. While model 3 was trained only on ternary metal oxides, Jang et al.'s model was also trained on materials other than ternary oxides. In addition, an examination of the Materials Project entries assumed to be synthesized ternary oxides by Jang et al. showed that around 10% either have a different stoichiometry from the actual material (e.g. missing hydrogen for hydrates, or oxygen-rich/deficient phases) or are duplicate entries of the same material based on a different structure refinement.
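The quadrant comparison between the two models can be expressed compactly. The sketch below is illustrative only: the function names, the toy scores, and the default thresholds of 0.5 are assumptions, not values taken from either study. It keeps the highest score per composition when polymorphs are present, then counts how many compositions fall in each quadrant.

```python
from collections import Counter


def max_score_per_composition(entries):
    """Keep the highest score among polymorphs of each composition."""
    best = {}
    for composition, score in entries:
        best[composition] = max(score, best.get(composition, score))
    return best


def quadrant(score_a, score_b, thr_a, thr_b):
    """Classify a pair of scores relative to each model's decision threshold."""
    a_pos = score_a >= thr_a
    b_pos = score_b >= thr_b
    if a_pos and b_pos:
        return "both synthesizable"
    if not a_pos and not b_pos:
        return "both unsynthesizable"
    return "only model A synthesizable" if a_pos else "only model B synthesizable"


def agreement_summary(pairs, thr_a=0.5, thr_b=0.5):
    """Return quadrant counts and the fraction of pairs where the models agree.

    The default thresholds are placeholders (assumption), not the thresholds
    used by model 3 or by the Jang et al. model.
    """
    counts = Counter(quadrant(a, b, thr_a, thr_b) for a, b in pairs)
    agree = counts["both synthesizable"] + counts["both unsynthesizable"]
    return counts, agree / len(pairs)


# Toy scores for illustration only.
toy_pairs = [(0.9, 0.8), (0.2, 0.1), (0.7, 0.3), (0.4, 0.6)]
toy_counts, toy_agreement = agreement_summary(toy_pairs)
print(toy_counts, toy_agreement)
```

Compositions in the two disagreement quadrants are the natural starting point for the kind of case-by-case inspection discussed above.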
The labeling scheme for synthesizable materials is complicated because it depends on the discovery of the materials and the origin of the data. A solid-state synthesizable composition may be unlabeled in this study for three reasons: (1) synthesis has not been attempted or the attempts failed; (2) the information has not been added to the ICSD and the Materials Project; (3) the material has been solid-state synthesized but the report was not found during data collection. All three reasons are affected by the bias in research toward certain chemical systems/crystal structures, either because the materials have desirable properties or because they are relatively easy to synthesize. If there is any bias in the discovery of materials and/or their inclusion in relevant databases, the model will underestimate the synthesizability of hypothetical materials that are dissimilar to the training data but otherwise synthesizable. Nevertheless, such limitations are not unique to PU learning and apply to other data-driven approaches for materials research. Although biases in the data are not always negative, e.g. the exclusion of elements due to toxicity or limited availability, we believe they should be considered when drawing conclusions from the results.
The bias in the synthesizability predictions of model 1 was examined by comparing the chemical systems of the predicted synthesizable compositions with those of the synthesized compositions. 141 out of 168 (83.9%) of the predicted solid-state synthesizable compositions are from chemical systems that have at least one solid-state synthesized composition in the human curated dataset. This indicates that a hypothetical composition has a higher likelihood of being predicted as solid-state synthesizable if the training dataset contains other solid-state synthesized compositions in the same chemical system.
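A chemical-system overlap check of this kind can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' code: the function names are hypothetical, the element parser is a simple regular expression that ignores stoichiometry, and the two input lists are toy examples rather than the curated dataset.

```python
import re

# Matches element symbols such as "K", "Al", "O"; digits and brackets are skipped.
ELEMENT_RE = re.compile(r"[A-Z][a-z]?")


def chemical_system(formula: str) -> str:
    """Return the alphabetically sorted chemical system, e.g. 'Cu-Na-O'."""
    return "-".join(sorted(set(ELEMENT_RE.findall(formula))))


def fraction_in_known_systems(predicted, synthesized) -> float:
    """Fraction of predicted compositions whose chemical system already
    contains at least one synthesized composition."""
    known = {chemical_system(f) for f in synthesized}
    hits = sum(chemical_system(f) in known for f in predicted)
    return hits / len(predicted)


# Toy inputs for illustration only (assumption, not the curated dataset).
toy_predicted = ["Sr5Ti9O23", "Sr3Ti5O13", "K8Al2O7"]
toy_synthesized = ["SrTiO3", "BaTiO3"]
print(fraction_in_known_systems(toy_predicted, toy_synthesized))

# The reported overlap corresponds to 141/168.
print(f"{141 / 168:.1%}")
```

Because the parser only collects element symbols, formulas with brackets such as Na12(CuO2)7 map to the same chemical system as their expanded form.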
Another limitation of the PU method in this study is the assumption that the absence of an observation constitutes negative data, so that synthesized compositions that have not been solid-state synthesized are treated as solid-state unsynthesizable. This assumption is supported by the fact that more than 80% of the synthesized compositions in this dataset have been solid-state synthesized in at least one paper, demonstrating that solid-state synthesis has been one of the most popular synthesis approaches. Therefore, it is likely that solid-state synthesis has been attempted for most of the compositions. However, this assumption may not apply to relatively new or uncommon synthesis methods. More importantly, solid-state synthesis is rarely chosen for materials whose intended application requires dimensions on the nanometer or micrometer scale. Therefore, exploratory investigations to discover new thin films or nanomaterials might be biased toward synthesis methods such as sol–gel and thin-film deposition. In these cases, the evaluation of the false positives would be difficult and might lead to an overestimation or underestimation of the number of synthesizable materials.
This human curated dataset was then used to train a PU learning model to predict the solid-state synthesizability of hypothetical ternary oxide compositions using only compositional information. By cross-checking against another model that predicts general synthesizability, 134 compositions were identified as solid-state synthesizable, of which at least 56 have been synthesized and 43 of those 56 have been solid-state synthesized. Future work on synthesizing the hypothetical compositions could use the collected data to predict synthesis conditions.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5dd00065c
This journal is © The Royal Society of Chemistry 2025