Assessment of magnetic properties of A2B′B″O6 double perovskites by multivariate data analysis techniques

Perovskite oxides form a structurally simple but compositionally and functionally diverse family of inorganic materials. The ABO3 single perovskite itself allows a wide range of A and B cation combinations, but through co-occupation of one of the two cation sites with two different cation species the material category is expanded to ordered perovskites, such as the B-site ordered double perovskites A2B′B″O6 [1,2]. These compounds were first studied in the 1960s [3-5], and since then they have been synthesized with hundreds of different cation combinations of A, B′ and B″ [1], and a wide variety of attractive material properties have been realized for them. In 1998, half-metallicity was discovered for the Sr2FeMoO6 compound with a mixed-valent Fe(II/III)-Mo(V/VI) state [6-8], while the Mg-based analogue Sr2MgMoO6 turned out to be a promising solid oxide fuel cell (SOFC) anode material [9-11]. Also importantly, the A2B′B″O6 perovskites exhibit exciting spin-state configurations and magnetic properties [12]. For example, the Sr2CuB″O6 system with Jahn-Teller active Cu(II) is a uniquely suited host lattice for novel low-dimensional and frustrated magnetic behaviours [13-16]. Particularly intriguing is the spin-liquid-like state discovered for the Sr2Cu(Te,W)O6 system [16]. The A2B′B″O6 structure thus presents a widely adjustable host lattice that can be tailored to a range of new material functions. The compositional variety is already vast, but it could be further expanded by utilizing, e.g., high-pressure techniques for the synthesis [17-20]. To guide the synthesis of new materials, novel approaches that allow us to systematically predict the materials properties would be highly beneficial.
In the search for new functional compounds/compositions, materials screening is commonly used for organic molecules in medical applications. While such an approach would also be attractive for oxide materials, its utilisation is not as straightforward, because the sample synthesis often takes several days. However, among the oxides, perovskites form one of the most widely studied material families, and the structure-property data collected over several decades could be utilized for data mining. In our earlier works we used statistical multivariate data analysis (MVDA) techniques to assess the structure-property relationships among superconductive copper oxides and magnetic single-perovskite manganese oxides [21,22]; however, in these cases the data sets were limited to fewer than 100 different samples. Now, with the wider composition variation among the A2B′B″O6 compounds, along with the multitude of functional properties discovered, an exciting opportunity arises to employ MVDA techniques in finding systematic structure-property relationships within the A2B′B″O6 double-perovskite family.
Here we aim to demonstrate this opportunity by focusing on the magnetic properties of these materials. We gathered literature data for 2606 A2B′B″O6 samples comprising 1010 unique stoichiometries. For each sample entry, the given chemical composition together with the reported synthesis details, crystal structure data and, when available, physical properties were collected (Supplementary data S1, ESI†). Of these 2606 samples, usable magnetic property data were found for 671 entries: 181 were reported to be ferromagnetic (FM), 54 ferrimagnetic (FiM), 289 antiferromagnetic (AFM) and 147 paramagnetic (PM). In terms of the chemical composition, the data set covered 24
For our multivariate data analysis we employ the SIMCA 15 software (Sartorius Stedim Data Analytics AB). Each sample entry (observation) is described by a range of quantitative and qualitative X variables, i.e. chemical, structural and physical properties (inputs). The magnetic properties are assigned as Y variables (outputs) and expressed first with the type of magnetism (FM, FiM, AFM or PM) as a class identification and then with the specific values for the magnetic transition temperature, magnetic moment and saturation magnetisation. In Table 1, we list the variables used together with short explanations (detailed explanations are found in Supplementary data S1, ESI†).
The SIMCA software utilises a few different methods to find correlations among the observations; note that it can also handle missing data. The first and simplest method is PCA (principal component analysis), which is typically used in conjunction with observations having only one type of variable (only X). Each observation, i.e. sample, is given a summary index (score), calculated as a weighted sum of its variable values. SIMCA calculates each weight factor according to the variable's importance in the model. When the observations are plotted according to their summary indexes (score plot), similar observations are located near each other, preferably forming groups. In addition, a statistical confidence limit is calculated (Hotelling's T²; shown as an ellipse in the plot), and observations lying outside this ellipse are potential outliers because of (i) bad/wrong data, or (ii) very different properties compared to the other observations. In Fig. 1, we present the PCA result for the present data based on the X variables; it reveals clear group formation, for example due to the choice of the A-site element. However, PCA is not able to tell in detail why the groups are different. In Fig. 1, very few (if any) true outliers can be distinguished, indicating the generally high quality of the input data.
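The score and Hotelling's T² calculation described above can be sketched in a few lines. The following is a minimal, dependency-free illustration and not the SIMCA implementation: it assumes unit-variance scaling and a NIPALS-style iteration, the function names are hypothetical, and the three-column data set consists of made-up numbers rather than entries from the present data set.

```python
import math

def autoscale(rows):
    """Centre every column and scale it to unit variance."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    return [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]

def leading_component(E, iters=500):
    """NIPALS-style iteration for one component: returns scores t and loadings p."""
    n, k = len(E), len(E[0])
    # Start from the column that carries the most remaining variation.
    start = max(range(k), key=lambda j: sum(E[i][j] ** 2 for i in range(n)))
    t = [E[i][start] for i in range(n)]
    for _ in range(iters):
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(k)]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]                      # unit-length weights
        t = [sum(E[i][j] * p[j] for j in range(k)) for i in range(n)]
    return t, p

def pca(X, n_components=2):
    """Extract components one by one, deflating the data after each."""
    E = [row[:] for row in X]
    scores, loadings = [], []
    for _ in range(n_components):
        t, p = leading_component(E)
        scores.append(t)
        loadings.append(p)
        E = [[E[i][j] - t[i] * p[j] for j in range(len(p))]
             for i in range(len(t))]
    return scores, loadings

def hotelling_t2(scores):
    """T² per observation: squared scores divided by the score variances, summed."""
    n = len(scores[0])
    variances = [sum(v * v for v in t) / (n - 1) for t in scores]
    return [sum(t[i] ** 2 / s2 for t, s2 in zip(scores, variances))
            for i in range(n)]

# Made-up descriptor values (three hypothetical X variables, six observations).
raw = [
    [1.44, 0.98, 59.0],
    [1.43, 0.99, 60.5],
    [1.45, 0.97, 58.7],
    [1.18, 0.91, 64.2],
    [1.16, 0.92, 65.0],
    [1.17, 0.90, 63.1],
]
X = autoscale(raw)
scores, loadings = pca(X, n_components=2)
t2 = hotelling_t2(scores)
for i, value in enumerate(t2):
    print(f"observation {i}: t1 = {scores[0][i]:+.2f}, "
          f"t2 = {scores[1][i]:+.2f}, T2 = {value:.2f}")
```

Observations with similar scores fall close together in a plot of t1 against t2, and an unusually large T² value marks an observation worth inspecting. The confidence ellipse itself additionally requires an F-distribution quantile, which is omitted here.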
Next, we employ the so-called O2PLS (orthogonal partial least squares) model with the DA (discriminant analysis) extension, which accepts multiple Y variables. This method is helpful in finding groups/subgroups within the sample set. In O2PLS score plots, horizontal variability indicates variance between the groups and vertical variability indicates variance within the groups. The model also works with multiple observation classes, but for simplicity we carry out the analysis separately for the following property pairs: AFM-FM, AFM-PM and FM-PM.
Here we present the results of O2PLS-DA for the AFM-FM analysis in more detail. From the score plot shown in Fig. 2a it can first of all be seen that the AFM and FM compounds form clearly separate groups. This demonstrates that the type of magnetic ordering can indeed be predicted for the A2B′B″O6 compounds based on simple chemical and structural parameters (such as those listed in Table 1). Moreover, from the plot in Fig. 2b it is seen that both the AFM and the FM compounds form two subgroups based on the A-site metal constituent, depending on whether the A metal is from the s-block or from the p-, d- or f-blocks. An interesting observation (even though not specifically indicated in Fig. 2) is that the FM compounds with the highest T_C values are mostly located in the upper subgroup. In the AFM-PM and FM-PM models, too, the AFM or FM and PM compounds are divided into subgroups based on the A-site constituent (not shown here); however, it should be noted that for the PM compounds the A metal is in practice always from the s-block.
For all the models made, a so-called loadings plot can be constructed. This plot indicates the importance of the different variables in explaining the variability among the observations: the higher the weighting factor, the higher the importance of the variable. The O2PLS loadings plots also include auxiliary Y variables (located leftmost and rightmost in the loadings plot) indicating which variables affect each group/class the most. The same sign in both the X and Y weighting factors indicates correlation, whereas an opposite sign or a value close to zero indicates lack of correlation.
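How class separation and variable weights arise together can be illustrated with a deliberately simplified stand-in. The sketch below is a one-component PLS-DA (the closed-form first NIPALS step for a single dummy Y variable), not the O2PLS-DA model of SIMCA, which additionally separates out variation orthogonal to Y. The descriptors x1 to x3, their values and the function names are hypothetical and do not reproduce any result of this study.

```python
import math

def autoscale(rows):
    """Centre every column and scale it to unit variance."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    return [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]

def pls_da_one_component(X, labels, positive):
    """Return X weights w, scores t and Y weight c for a two-class problem."""
    n, k = len(X), len(X[0])
    # Dummy-code the class membership (1 = positive class) and centre it.
    y = [1.0 if lab == positive else 0.0 for lab in labels]
    y_mean = sum(y) / n
    y = [v - y_mean for v in y]
    # X weights: covariance of each variable with the class dummy, normalised.
    w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(k)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each observation onto the weight vector.
    t = [sum(X[i][j] * w[j] for j in range(k)) for i in range(n)]
    # Y weight: regression of the class dummy on the scores.
    c = sum(y[i] * t[i] for i in range(n)) / sum(v * v for v in t)
    return w, t, c

# Made-up values for three hypothetical descriptors x1, x2, x3.
raw = [
    [2.1, 0.95, 10.0], [2.3, 0.97, 11.0], [2.2, 0.96, 9.5],   # labelled AFM
    [1.6, 0.99, 10.5], [1.5, 1.00, 9.8], [1.7, 0.98, 10.9],   # labelled FM
]
labels = ["AFM", "AFM", "AFM", "FM", "FM", "FM"]

w, t, c = pls_da_one_component(autoscale(raw), labels, positive="FM")
for name, weight in zip(["x1", "x2", "x3"], w):
    print(f"{name}: X weight = {weight:+.2f} (Y weight for FM = {c:+.2f})")
for lab, score in zip(labels, t):
    print(f"{lab}: score = {score:+.2f}")
```

In this toy setting an X weight with the same sign as the Y weight of the FM class marks a descriptor that tends to be larger for the FM observations, a weight of opposite sign marks one that tends to be smaller, and a weight near zero marks a descriptor that carries little class information.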
The loadings plot shown in Fig. 3 for our O2PLS-DA/AFM-FM model reveals that the most important variables defining an AFM compound are the B″-site electronegativity and charge, and the unit cell volume, while for the FM compounds the most influential variables are the ionization potential and charge of the A-site metal, and the electronegativity and electron configuration of the B′-site metal.
From a similar O2PLS-DA model for the FM-PM pair (not shown here) it may be concluded that the FM and PM compound groups start to show subgrouping when the B-site d-electron configuration is considered. In PM compounds, the B-site metals are of d^0, d^10 or f^n configuration. In FM compounds, on the other hand, the B-site cations are (in line with the Goodenough-Kanamori-Anderson rules) of either d^3-d^7 or d^3-d^8 type when the T_C value is below room temperature. Compounds with T_C above ambient temperature usually have d^(1/2/3/4/5)-d^(3/4/5) combinations. It should also be noted that a tolerance parameter close to or above unity is preferred for compounds having a high T_C. The most influential parameters for FM compounds are the B-site electronegativity, cations coming from the d-block and the d-electron configuration.
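As a worked example of the tolerance parameter, the sketch below assumes that it corresponds to the Goldschmidt tolerance factor, generalised to double perovskites by averaging the two B-site radii, t = (r_A + r_O) / (sqrt(2) * ((r_B′ + r_B″)/2 + r_O)). The exact definition used for the present data set is given in Supplementary data S1 and may differ. The ionic radii are approximate Shannon values quoted here for illustration only, and the Fe(III)/Mo(V) charge assignment is just one of the possible valence combinations.

```python
import math

def tolerance_factor(r_a, r_b1, r_b2, r_o=1.40):
    """Goldschmidt-type tolerance factor for A2B'B''O6 with averaged B-site radius.

    All radii are in angstrom; r_o defaults to an approximate O(2-) radius.
    """
    r_b = (r_b1 + r_b2) / 2.0
    return (r_a + r_o) / (math.sqrt(2.0) * (r_b + r_o))

# Approximate ionic radii (angstrom), assumed for illustration:
# Sr(2+) in 12-fold coordination, high-spin Fe(3+) and Mo(5+) in 6-fold coordination.
r_sr, r_fe3, r_mo5 = 1.44, 0.645, 0.61

t_sfmo = tolerance_factor(r_sr, r_fe3, r_mo5)
print(f"Sr2FeMoO6 (Fe3+/Mo5+ assumed): t = {t_sfmo:.3f}")
```

With these assumed radii the value comes out slightly below unity, i.e. close to the ideal cubic packing condition t = 1.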
Finally, we test the predictive power of our FM model. For building the FM model we use a so-called training set, i.e. a data set of observations for which ferromagnetism is confirmed and the T_C value is known. In Fig. 4, we plot the experimentally observed T_C values against the values predicted by our model for the same known FM samples. It can be seen that our model predicts the magnetic transition temperature with relatively good certainty for most of the samples. We then use the model to obtain T_C values for a series of selected samples whose chemical and structural properties have been described in the literature, but whose magnetic properties are not known or not reliably determined.
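The agreement between observed and predicted T_C values in a plot of this kind is commonly summarised by the root-mean-square error (RMSE) and the coefficient of determination (R²). The sketch below shows how such figures could be computed; the function names are hypothetical, the temperature values are placeholders rather than data from this study, and no such metrics are reported in the text above.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Placeholder T_C values in kelvin (not taken from the data set).
tc_observed = [100.0, 200.0, 300.0]
tc_predicted = [110.0, 190.0, 310.0]

print(f"RMSE = {rmse(tc_observed, tc_predicted):.1f} K")
print(f"R2   = {r_squared(tc_observed, tc_predicted):.3f}")
```

Evaluating these metrics on samples that were not part of the training set (for example by cross-validation) gives a more honest picture of the predictive power than evaluating them on the training samples themselves.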
Among the selected "magnetically unknown" samples there are naturally a large number of entries with stoichiometries and structural parameters very similar to those of the already known FM compounds, but whose magnetic properties were not characterized in their respective reference papers. These predictions serve rather as a model validation; the predicted T_C values for these entries are indeed found to be nearly perfectly in line with the expectations. Most interestingly, we can also use the model to predict entirely new promising FM material candidates. In particular, we can identify several unique candidate compositions for new FM double-perovskite compounds. For example, for La2CoFeO6 [23], Pb2FeMoO6 [24], Sr2CrRuO6 [25] and La2CrFeO6 [26] the magnetic properties are not known, but our model predicts T_C values between 100 and 200 K. In Table 2 we summarize the predicted T_C values for these promising A2B′B″O6 FM candidate compounds. Finally, we may utilize our FM model for qualitative prediction by looking at the most interesting FM compound families (with the highest T_C values), like the A2CrReO6 compounds with A = Ca and Sr, [27]
perovskite compounds based on their type of magnetism; their magnetic properties correlate with their chemical and structural properties, enabling us to predict the type of magnetic ordering and the ordering temperature for new compounds. Such predictive power can greatly help in guiding further experiments for finding novel functional materials. While this method can handle missing or poor data, it still depends on having good data for the majority of the compounds in the training set. This suggests that our predictions could be greatly improved with additional measurements for already known compounds. For example, magnetic properties can be very sensitive to bond distances and angles. Thus, better structural data could improve the predictions of magnetic properties.
Similarly, additional data on other physical properties, such as electrical, ionic or thermal conductivity, Seebeck coefficient, electrical polarization or redox properties, could help in discovering new thermoelectrics, dielectrics, and battery and fuel-cell materials.
All of the compounds considered in this study were in bulk form. However, for technologically important applications, thin films are often required. Different substrates can cause strain/stress in the film, which can greatly affect the materials properties, as seen for example in the magnetization of La2NiMnO6 thin films [32]. Lattice mismatch, strain/stress and other such parameters could be added as further variables to our model, adapting it to new purposes as needed.

Conflicts of interest
There are no conflicts to declare.