Siyu Isaac Parker Tian‡ab, Zekun Ren§ab, Selvaraj Venkatarajb, Yuanhang Cheng¶b, Daniil Bashc, Felipe Oviedo||d, J. Senthilnathe, Vijila Chellappanc, Yee-Fun Limcf, Armin G. Aberleb, Benjamin P. MacLeodg, Fraser G. L. Parlaneg, Curtis P. Berlinguetteg, Qianxiao Lih, Tonio Buonassisi*ad and Zhe Liu**ad
aLow Energy Electronic Systems (LEES), Singapore-MIT Alliance for Research and Technology (SMART), 1 Create Way, Singapore 138602, Singapore. E-mail: zhe.liu@nwpu.edu.cn; buonassisi@mit.edu
bSolar Energy Research Institute of Singapore (SERIS), National University of Singapore, 7 Engineering Drive, Singapore 117574, Singapore
cInstitute of Materials Research and Engineering (IMRE), Agency for Science, Technology and Research (A*STAR), 2 Fusionopolis Way, Singapore 138634, Singapore
dDepartment of Mechanical Engineering, Massachusetts Institute of Technology (MIT), 77 Massachusetts Ave., Cambridge, MA 02139, USA
eInstitute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), 1 Fusionopolis Way, Singapore 138632, Singapore
fInstitute of Sustainability for Chemicals, Energy and Environment, Agency for Science, Technology and Research (A*STAR), 1 Pesek Rd, Singapore 627833, Singapore
gDepartment of Chemistry, The University of British Columbia (UBC), 2036 Main Mall, Vancouver, BC V6T 1Z1, Canada
hDepartment of Mathematics, National University of Singapore (NUS), 21 Lower Kent Ridge Rd, Singapore 119077, Singapore
First published on 13th July 2023
Transfer learning (TL) is increasingly becoming an important tool for handling data scarcity, especially when applying machine learning (ML) to novel materials science problems. In autonomous workflows for optimizing optoelectronic thin films, high-throughput thickness characterization is often required as a downstream process. To overcome data scarcity and enable high-throughput thickness characterization, we propose a transfer learning workflow centered on an ML model, thicknessML, that predicts thickness from UV-Vis spectrophotometry. We demonstrate the transfer learning workflow from a generic source domain (of materials with various bandgaps) to a specific target domain (of perovskite materials), where the target-domain data comprise just 18 refractive indices from the literature. While this study features perovskite materials, the target domain readily extends to other material classes given a few corresponding literature refractive indices. With accuracy defined as a prediction within 10% of the true thickness, the accuracy rate of perovskite thickness prediction reaches 92.2 ± 3.6% (mean ± standard deviation) with TL, compared with 81.8 ± 11.7% without. As an experimental validation, thicknessML with TL yields a 10.5% mean absolute percentage error (MAPE) for six deposited perovskite films.
The state-of-the-art thickness characterization method is optical spectroscopy. However, despite its rapid measurement, optical spectroscopy requires manual fitting of optical models (parametric descriptions of the material's wavelength-resolved refractive indices) to obtain thickness. This manual fitting for a new material is slow, ranging from tens of minutes to hours per sample, and it usually requires considerable experience on top of trial and error. The refractive indices of different material classes fall into different distributions (domains), reflected in the different numbers and types of optical models typically used. For each specific domain (material class), especially newly developed materials such as lead-halide perovskites, readily available refractive-index data are scarce. This data scarcity makes it difficult to replace manual model fitting with a high-throughput ML model across domains (material classes).
To counter the data scarcity prevalent in many materials science applications, the use of transfer learning is steadily rising. Notable examples lie largely in materials property prediction, where the learning transfers across properties,27–32 across modes of observation, e.g., from calculated properties to experimental ones,27,30 and across different materials systems,31 e.g., from inorganic materials to organic polymers,30 or from alloys to high-entropy alloys.32 Following the same rationale, thin-film thickness characterization also presents itself as a suitable field for transfer learning to overcome data scarcity across material classes.
To demonstrate high-throughput thickness characterization with ML across material classes, we propose in this work the following high-throughput transfer learning workflow (Fig. 1) to automatically characterize thickness, i.e., predict film thickness from optical spectra. Without loss of generality, we select lead-halide perovskites as our target material class (target domain) for prediction. Lead-halide perovskites are a family of ABX3 semiconductors with excellent optoelectronic properties, e.g., for photovoltaics, light-emitting diodes, and photodetectors. To ultimately predict thickness for perovskite films, the workflow relies on transfer learning from the source domain (once-off pre-training) to the target domain (retraining for every individual target domain), as shown in Fig. 1a. The source domain contains generic semiconductor refractive indices; we parametrically simulate 702 refractive indices from a single optical model (Tauc–Lorentz) commonly used for optical materials with an absorption bandgap. We then simulate the optical reflectance/transmittance at 10 thicknesses for every refractive index, constructing a training dataset of 702 × 10 spectra; the source domain is thus "big data". The target domain contains 13 experimentally fitted perovskite refractive indices found in the literature. We repeat the optical reflectance/transmittance simulation with 10 thicknesses per refractive index, obtaining a training dataset of 13 × 10 spectra; the target domain is thus "small data". Note that the source domain has over 50 times more training data than the target domain. In practice, this chosen perovskite target domain represents the typical data-scarcity bottleneck of newly developed materials. This transfer learning workflow enables the target domain to easily extend to other data-scarce material classes given a few literature refractive indices of that material class.
Transfer learning entails two stages of training: (I) once-off pre-training on the source domain and (II) retraining for every target domain. Both training stages feature the same model, named thicknessML. thicknessML takes optical reflection (R) and transmission (T) spectra as input and outputs thickness (d) and, optionally, wavelength-resolved refractive indices, as shown in Fig. 1b. We denote the real and imaginary parts of the refractive indices as n and k, respectively.
With the two-stage transfer learning, thicknessML predicts the perovskite film thickness with a mean absolute percentage error (MAPE) of 4.6 ± 0.5%, compared to a MAPE of 7.4 ± 4.2% from direct learning (no transfer learning). When validated on six experimentally deposited methylammonium lead iodide (MAPbI3) perovskite films, thicknessML achieves a 10.5% MAPE after retraining on only the eight most dissimilar literature refractive indices, whose perovskite compositions contain no methylammonium.
To facilitate transfer learning, we build the source dataset to be generic; in practice, we simulate refractive indices with the Tauc–Lorentz (TL) optical model, which is universal for materials with a bandgap. Simulating a refractive index is analogous to simulating a material (one possessing that refractive index). Paired with different thicknesses, a set of simulated n, k spectra (a simulated material) yields the respective optical R, T spectra; this is analogous to measuring the optical response of a batch of thin films (of different thicknesses) made of the same material (the same simulated refractive-index spectra). The optical response is simulated with the physical transfer-matrix method (TMM). Without loss of generality, we adopt a 0° incident angle and a 1 mm glass substrate in the TMM simulations.
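For illustration, the sketch below shows how one such R, T simulation might be set up, assuming the open-source Python `tmm` package (transfer-matrix method). The placeholder n, k values, the wavelength grid, and the coherent-film/incoherent-glass layer treatment are assumptions of this sketch, not the authors' exact implementation.

```python
import numpy as np
import tmm  # open-source transfer-matrix-method package (assumed here)

# Illustrative inputs: wavelength grid and one simulated material's n, k spectra
wavelengths = np.linspace(350, 1000, 131)        # nm
n_film = 2.4 * np.ones_like(wavelengths)         # placeholder n(lambda)
k_film = 0.1 * np.ones_like(wavelengths)         # placeholder k(lambda)
n_glass = 1.52                                   # non-dispersive glass, for illustration
d_film, d_glass = 500.0, 1.0e6                   # film thickness and 1 mm substrate (nm)

R, T = [], []
for lam, n, k in zip(wavelengths, n_film, k_film):
    # Stack: air | film (coherent) | 1 mm glass (incoherent) | air, at 0 deg incidence
    n_list = [1.0, n + 1j * k, n_glass, 1.0]
    d_list = [np.inf, d_film, d_glass, np.inf]
    c_list = ['i', 'c', 'i', 'i']
    result = tmm.inc_tmm('s', n_list, d_list, c_list, 0, lam)
    R.append(result['R'])
    T.append(result['T'])
R, T = np.array(R), np.array(T)                  # inputs to thicknessML
```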
In the source dataset, we simulate 1116 n(λ), k(λ) spectra by sampling a grid of parameter values (for A, C, E0, and Eg, with a fixed ε∞ = 1) of a single TL optical model, with λ ranging from 350 to 1000 nm. This λ range was chosen as a common subset of the ranges frequently used in UV-Vis measurements and reported in the literature. The 1116 n, k spectra of the source dataset are divided into 702, 302, and 112 for the training, validation, and test sets, respectively. We then randomly choose 10 thicknesses per pair of n, k spectra (per simulated material) in the training and validation sets, and 50 thicknesses per pair in the test set, to obtain the corresponding R, T spectra. The larger number of d per pair of n, k spectra in the test set gives a more stringent, and thus more reliable, evaluation of how well thicknessML performs. The range of d is 10–2010 nm. Three different training-validation-test splits are performed for three ensemble runs, and the randomly selected thicknesses for the same n, k spectra differ among the three splits.
In the target dataset, we obtain 18 perovskite n(λ), k(λ) spectra from the literature.33–37 The 18 n, k spectra of the target dataset are divided into 13 and 5 for the training and test sets. No validation set is used in the target dataset owing to data scarcity. We follow the same convention and simulate the corresponding R, T spectra with 10 and 50 d per n, k spectra in the training and test sets, respectively. To compare with direct learning (no transfer), we also build a dataset for direct learning by assigning 500 d per n, k spectra for training while maintaining 50 d per n, k spectra for testing. This much larger number of assigned d per n, k spectra compensates for the small number of available n, k spectra, ensuring enough training data for learning from scratch. Five training-test splits are performed for ensemble runs. To study transfer learning with varying data quantities, we also build training-test splits with increasing numbers of training n, k spectra from 0 to 17, with the corresponding remainder (18 minus the number of training n, k spectra) in the test set. In these datasets, we preserve the number of d per n, k spectra in the training (10) and test (50) sets as well as the five training-test splits. To prepare for the experimental validation on six deposited MAPbI3 films, we build the target training dataset by selecting the eight more dissimilar perovskite materials (those not containing methylammonium) out of the 18 literature materials. We follow the same number of d per n, k spectra in training as well as the five training-test splits. We capture the details of the datasets in Table 1.
| Source dataset | Training set | Validation set | Test set |
|---|---|---|---|
| Number of n, k spectra | 702 | 302 | 112 |
| Number of d per n, k spectra | 10 | 10 | 50 |
| Resulting number of R, T spectra | 702 × 10 | 302 × 10 | 112 × 50 |

| Target dataset: transfer learning vs. direct learning | Training set | Test set |
|---|---|---|
| Number of n, k spectra | 13 | 5 |
| Number of d per n, k spectra | 10 (500 for direct learning) | 50 |
| Resulting number of R, T spectra | 13 × 10 (13 × 500 for direct learning) | 5 × 50 |

| Target dataset: transfer learning with varying training data quantities | Training set | Test set |
|---|---|---|
| Number of n, k spectra | X (0 ≤ X ≤ 17) | 18 − X |
| Number of d per n, k spectra | 10 | 50 |
| Resulting number of R, T spectra | 10X | (18 − X) × 50 |

| Target dataset: experimental validation | Training set |
|---|---|
| Number of n, k spectra | 8 (not containing methylammonium) |
| Number of d per n, k spectra | 10 |
| Resulting number of R, T spectra | 8 × 10 |
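To make the dataset assembly described above concrete, the following schematic sketch pairs each n, k spectrum with randomly drawn thicknesses and simulates the corresponding R, T spectra. The uniform thickness sampling and the `simulate_rt` helper (standing in for the TMM routine) are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
D_MIN, D_MAX = 10.0, 2010.0            # thickness range (nm) used in this work

def build_rt_set(nk_spectra, n_thicknesses, simulate_rt):
    """Pair each (n, k) spectrum with random thicknesses and simulate R, T spectra."""
    samples = []
    for n_spec, k_spec in nk_spectra:
        for d in rng.uniform(D_MIN, D_MAX, size=n_thicknesses):
            R, T = simulate_rt(n_spec, k_spec, d)       # e.g. a TMM call as sketched above
            samples.append((R, T, d, n_spec, k_spec))   # inputs plus targets (d; n, k for MTL)
    return samples

# e.g. 10 thicknesses per training spectrum and 50 per test spectrum:
# train = build_rt_set(train_nk, 10, simulate_rt)
# test  = build_rt_set(test_nk, 50, simulate_rt)
```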
Fig. 2 thicknessML framework: thicknessML receives the R(λ) and T(λ) spectra and outputs d (and n(λ), k(λ)) for single task learning (multitask learning). Input R and T spectra first go through four convolutional and max pooling layers for feature extraction and are then flattened and passed to three fully connected (FC, also "Dense" in Keras43 terminology) and dropout layers, where mappings from extracted features to task targets are drawn. The three dedicated FC blocks for d, n(λ), and k(λ) correspond to the MTL implementation. The STL implementation has the same architecture without the two FC blocks for n(λ) and k(λ). (The adopted incident angle in UV-Vis is 0°; the inclined beams are drawn for visual clarity.) The detailed hyperparameters are recorded in Section S1 thicknessML hyperparameters in the ESI.†
Aside from the straightforward RT-to-d architecture, we also explore a multitask learning (MTL) architecture, where n(λ) and k(λ) are additional outputs alongside d. This is inspired by physics, where the determination of d (from R and T) is closely related to the concurrent determination of n and k. We therefore reflect this concurrent determination with MTL, the concurrent learning of multiple tasks. In MTL, if the tasks are related, the model benefits from concurrent learning to become more accurate and is less likely to overfit to a specific task (in other words, the learning is more generalized).41,42 Accordingly, we concurrently learn to predict d as our main task and n, k as auxiliary tasks in our MTL implementation. The straightforward RT-to-d architecture, framed in the same language, becomes single task learning (STL).
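A minimal Keras sketch of the STL/MTL layout described above is given below. The filter counts, kernel sizes, layer widths, and dropout rate are placeholders (the actual hyperparameters are in Section S1 of the ESI†), so this is an illustration of the architecture rather than the exact thicknessML model.

```python
from tensorflow import keras
from tensorflow.keras import layers

N_WL = 131  # number of wavelength points per spectrum (placeholder)

def build_thicknessml(multitask=False):
    """Sketch: 4 conv + max-pooling blocks, then FC + dropout heads for d (and n, k)."""
    inputs = keras.Input(shape=(N_WL, 2))                  # R(lambda), T(lambda) as 2 channels
    x = inputs
    for filters in (32, 64, 128, 256):                     # placeholder filter counts
        x = layers.Conv1D(filters, kernel_size=3, padding='same', activation='relu')(x)
        x = layers.MaxPooling1D(pool_size=2)(x)
    x = layers.Flatten()(x)

    def fc_head(h, out_dim, name):
        for units in (256, 128, 64):                       # three FC + dropout layers per head
            h = layers.Dense(units, activation='relu')(h)
            h = layers.Dropout(0.2)(h)
        return layers.Dense(out_dim, name=name)(h)

    outputs = [fc_head(x, 1, 'd')]                         # STL: thickness only
    if multitask:                                          # MTL: add n(lambda), k(lambda) heads
        outputs += [fc_head(x, N_WL, 'n'), fc_head(x, N_WL, 'k')]
    return keras.Model(inputs, outputs)
```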
In this study, we use the term “thicknessML” to refer to both the ML model per se and the encompassing framework (including the UV-Vis operation) depending on context.
| | d (<10% deviation) | n (<10% deviation) | k (<10% deviation) |
|---|---|---|---|
| STL | 89.2% | — | — |
| MTL | 83.3% | 94.2% | 26.8% |
The markedly lower k accuracy can be attributed to several factors:
• Many k values at longer wavelengths are near or at zero, e.g., on the order of 10−2. This makes the percentage-based within-10%-deviation accuracy definition unduly stringent; an absolute-error-based accuracy definition may reflect the k prediction performance more appropriately.
• The many near-zero and at-zero k values bias the output data distribution unfavorably.
• The prediction of wavelength-resolved values is naturally harder than the prediction of a scalar value.
We acknowledge this k-prediction limitation of thicknessML-MTL and caution potential users to place more confidence in the d prediction than in the n, k predictions when using thicknessML in the MTL setting. Overall, we recommend that potential users adopt the STL setting for better performance.
Fig. 3 illustrates the performance of thicknessML-MTL: (1) a d prediction is considered accurate if it falls between the two side diagonal lines denoting 10% deviation from perfect prediction (predicted value = actual value), as shown in Fig. 3b; (2) an example of thicknessML-MTL outputting predicted d, n, and k is shown in Fig. 3c, and the optical responses R and T can be reconstructed from the predicted d, n, and k to compare with the actual values, as shown in Fig. 3d.
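For reference, the two evaluation metrics used throughout (within-10% d accuracy and MAPE) can be computed as in the short sketch below; the function name is ours and not part of any released code.

```python
import numpy as np

def d_metrics(d_true, d_pred):
    """Within-10% accuracy (%) and mean absolute percentage error (%) for thickness."""
    d_true = np.asarray(d_true, dtype=float)
    d_pred = np.asarray(d_pred, dtype=float)
    rel_err = np.abs(d_pred - d_true) / d_true
    accuracy = 100.0 * np.mean(rel_err < 0.10)   # fraction of predictions within 10% deviation
    mape = 100.0 * np.mean(rel_err)              # mean absolute percentage error
    return accuracy, mape
```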
We propose and run two types of transfer learning: (1) full-weight retraining, to continue updating the weights of both convolutional and FC blocks, and (2) partial-weight retraining, to freeze the weights of the convolutional block (unchanged feature extraction), while updating the FC block. To provide a baseline, we also implement a case of direct learning/training from scratch (from random initialized weights as in pre-training). We first validate the use of transfer learning by comparing against direct learning as shown in Table 3 and Fig. 4b. In this transfer learning vs. direct learning comparison, we use different datasets—for transfer learning, we split the 18 literature n, k spectra into 13 and 5 for training and test (paired with 10 and 50 d per n, k spectra respectively); for direct learning, we preserve the same n, k spectra split (13–5) but pair with 500 and 50 d per n, k spectra respectively for training and test. The details of the datasets are described in Table 1 and the section preparation of source and target datasets. Transfer learning achieves better accuracy (higher mean) and precision (smaller spread) than direct learning regardless of MTL or STL. Within transfer learning, full-weight retraining of the STL setting has the highest performance. We observe that although certain individual runs of direct learning can surpass the transfer learning performance, direct learning is largely affected by specific training-test splits (certain runs having extremely low performance). We point out that the current comparison is based on a 50 times difference in the training data size between transfer learning and direct learning. To conclude, we justify the use of transfer learning (better performance achieved with less data).
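The difference between the two retraining modes can be sketched as follows for an STL model, assuming a Keras implementation; the optimizer, learning rate, epoch count, and batch size are placeholders (see Section S1 of the ESI† for the actual hyperparameters), and a plain log-cosh loss stands in for the heteroskedastic loss described later.

```python
from tensorflow import keras

def retrain(pretrained_model, x_target, d_target, full_weight=True, epochs=200):
    """Retrain a pre-trained (STL) thicknessML model on the small target-domain data.

    full_weight=True  -> full-weight retraining: update convolutional and FC blocks.
    full_weight=False -> partial-weight retraining: freeze the convolutional block.
    """
    for layer in pretrained_model.layers:
        if isinstance(layer, (keras.layers.Conv1D, keras.layers.MaxPooling1D)):
            layer.trainable = full_weight        # freeze feature extraction if partial
    pretrained_model.compile(optimizer=keras.optimizers.Adam(1e-4),
                             loss=keras.losses.LogCosh())
    pretrained_model.fit(x_target, d_target, epochs=epochs, batch_size=16, verbose=0)
    return pretrained_model
```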
| | Transfer learning (full-weight retraining) | Transfer learning (partial-weight retraining) | Direct learning |
|---|---|---|---|
| d accuracy^a | 92.2 ± 3.6% (STL) | 90.0 ± 2.9% (STL) | 76.9 ± 23.7% (STL) |
| d MAPE | 4.6 ± 0.5% (STL) | 4.9 ± 0.6% (STL) | 10.0 ± 9.6% (STL) |

^a Proportion of accurate d predictions, where accuracy is defined as within 10% deviation.
Fig. 4 Transfer learning performance of thicknessML. (a) The transfer learning dataset, showing the size of the training data. (b) Transfer learning vs. direct learning on predicted d accuracy (%, evaluated on the test set) shown in a box plot: the box plot records the d accuracies of each model in the ensemble runs (3 pre-trained models × 5 data splits). The transfer learning shown here uses full-weight retraining owing to its better performance. The retraining data are as described in (a). (Table 1 records the dataset in more detail.) To study transfer learning with varying data quantities, we record d accuracy vs. the number of training n, k spectra (out of a total of 18) for (c) full-weight retraining and (d) partial-weight retraining. Solid lines and shaded spreads denote the mean and standard deviation of the performance (d accuracy) from the 3 × 5 ensemble runs.
To assess whether the 13–5 training-test split of the n, k spectra yields reasonable results, and to study the effect of the retraining data size on transfer learning, we conduct transfer learning with an increasing number of retraining n, k spectra. The results are shown in Fig. 4c and d. Here we take an increasing number of retraining n, k spectra from 0 to 17 and leave the rest (of the 18 literature n, k spectra) for testing. We preserve 10 and 50 d per n, k spectra for the training and test sets, respectively, throughout. The details of the dataset are described in Table 1 and in the section on preparation of the source and target datasets. We observe that the initial transfer with 0 training n, k spectra (no retraining) yields only 50+% and 70+% d accuracy for MTL and STL, respectively. Full-weight retraining shows an initial drop for MTL (or a minimal increase for STL) in performance before its performance rises with increasing training data size, whereas partial-weight retraining shows a steady performance rise from the start. However, at larger training data sizes (>11 training n, k spectra), full-weight retraining eventually overtakes partial-weight retraining and yields better d accuracies. To explain this behavior, we consider the difference in the weights being updated: compared to partial-weight retraining, full-weight retraining updates more weights and is therefore more flexible. This flexibility has both pros and cons. When the number of retraining n, k spectra is small, the flexibility more easily steers thicknessML away from the optimal weights (an initial drop or a minimal increase in accuracy); when the number of retraining n, k spectra becomes large enough, the flexibility offers a higher learning capacity and thus better accuracy. Overall, we recommend the STL implementation in the transfer learning workflow, paired with either partial-weight retraining (when the number of retraining n, k spectra is smaller) or full-weight retraining (when it is larger). We follow this recommendation in the ensuing experimental validation.
We reiterate the goal of thicknessML, to characterize film thickness across material classes in high throughput, and we evaluate the transfer learning performance in this section accordingly. Note that, as shown in Fig. 4c and d, thicknessML-STL after transfer learning approaches high d accuracy rapidly, with as little as one retraining n, k spectrum (taking the better of full-weight and partial-weight retraining); around 90% d accuracy is achieved with ≥9 retraining n, k spectra. Functionally, this means that for any material class, thicknessML can successfully predict film thickness with high accuracy (around 90% in this perovskite case) given a few (9 in this case) literature n, k spectra via this generic-to-specific transfer learning framework. The impact of thicknessML in the transfer learning workflow is significant for achieving high-throughput film thickness characterization across material classes; the only requirement is a few literature n, k spectra of the target material class.
| Film no. | Concentration of precursor solution (M) | Spin coating speed (rpm) | Measured thickness (nm) | Predicted thickness (nm) |
|---|---|---|---|---|
| 1 | 0.5 | 3000 | 154.17 | 122.8 |
| 2 | 0.5 | 6000 | 99.29 | 101.3 |
| 3 | 1.25 | 3000 | 389.89 | 418.5 |
| 4 | 1.25 | 6000 | 265.07 | 256.0 |
| 5 | 1.5 | 3000 | 460.35 | 489.9 |
| 6 | 1.5 | 6000 | 311.15 | 373.8 |
To evaluate thicknessML as a high-throughput characterization framework, we also record its throughput. During prediction (when deployed), the thickness prediction of one film takes milliseconds, and the bulk of the per-sample time is spent on UV-Vis. In this study, UV-Vis is performed with a stand-alone tool with an integrating sphere and takes about 2 minutes per sample to measure R(λ) and T(λ) at 0° incident angle. During training (pre-training and retraining), the once-off pre-training takes about 1.5 hours (STL) and 3.3 hours (MTL) per model, while the retraining completes within minutes, on a desktop equipped with an Intel(R) Core(TM) i7-4790 CPU and an NVIDIA GeForce GTX 1650 GPU.
On the generic source dataset simulated from the generic Tauc–Lorentz optical model, 89.2% of the d predictions from pre-trained thicknessML fall within 10% deviation (89.2% d accuracy). After transferring to the specific target dataset built from 18 literature perovskite refractive indices, retrained thicknessML reaches 92.2 ± 3.6% d accuracy, compared to 81.8 ± 11.7% d accuracy from direct learning. Moreover, we demonstrate that just a few (9 in the perovskite case) literature n, k spectra in the target domain are sufficient for this generic-to-specific transfer learning framework to predict target-domain film thickness with high accuracy. The transfer learning workflow yields a 10.5% MAPE when validated on six experimentally deposited MAPbI3 films.
Overall, we demonstrate that our proposed generic-to-specific transfer learning workflow can effectively characterize film thickness in high throughput; it only needs a few literature n, k spectra (tackling data scarcity) to perform high-accuracy thickness prediction across various material classes. We believe that this study opens a new direction of high-throughput thickness characterization and serves as an inspiration for future research encountering data scarcity.
The Python implementation of the single Tauc–Lorentz oscillator implements the following equations:45
$$\varepsilon_2(E) = \begin{cases} \dfrac{1}{E}\,\dfrac{A E_0 C\,(E - E_g)^2}{(E^2 - E_0^2)^2 + C^2 E^2}, & E > E_g \\ 0, & E \le E_g \end{cases} \qquad (1)$$

$$\varepsilon_1(E) = \varepsilon_\infty + \frac{2}{\pi}\, P\!\int_{E_g}^{\infty} \frac{\xi\,\varepsilon_2(\xi)}{\xi^2 - E^2}\, \mathrm{d}\xi \qquad (2)$$

$$n(E) = \sqrt{\tfrac{1}{2}\Big(\sqrt{\varepsilon_1^2 + \varepsilon_2^2} + \varepsilon_1\Big)} \qquad (3)$$

$$k(E) = \sqrt{\tfrac{1}{2}\Big(\sqrt{\varepsilon_1^2 + \varepsilon_2^2} - \varepsilon_1\Big)} \qquad (4)$$

$$E\ (\mathrm{eV}) = \frac{hc}{\lambda} \approx \frac{1239.84}{\lambda\ (\mathrm{nm})} \qquad (5)$$

where the closed-form evaluation of the principal-value integral in eqn (2) involves the terms

$$a_{\ln} = (E_g^2 - E_0^2)E^2 + E_g^2 C^2 - E_0^2(E_0^2 + 3E_g^2),$$

$$a_{\mathrm{atan}} = (E^2 - E_0^2)(E_0^2 + E_g^2) + E_g^2 C^2.$$
After combining all the above equations, n(λ) and k(λ) are parameterized with five fitting parameters, A, C, E0, Eg, and ε∞. We fix ε∞ = 0 and sample grids for each parameter as follows—A, 10 to 200 with 11 grid nodes; C, 0.5 to 10 with 10 grid nodes; E0, 1 to 10 with 10 grid nodes; Eg, 1 to 5 with 10 grid nodes. After sampling, we randomly select 1116 n, k spectra to be included in our dataset. We describe the selection of these 1116 n, k spectra in more detail in Section S3† selection of 1116 n, k spectra in the generic source dataset.
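A sketch of this grid sampling is shown below. The random down-selection to 1116 spectra is simplified here; the actual selection is detailed in Section S3 of the ESI.†

```python
import itertools
import numpy as np

# Parameter grids quoted in the text (number of grid nodes in parentheses)
A_vals  = np.linspace(10, 200, 11)    # amplitude A (11 nodes)
C_vals  = np.linspace(0.5, 10, 10)    # broadening C (10 nodes)
E0_vals = np.linspace(1, 10, 10)      # peak transition energy E0 (10 nodes)
Eg_vals = np.linspace(1, 5, 10)       # bandgap Eg (10 nodes)

grid = list(itertools.product(A_vals, C_vals, E0_vals, Eg_vals))  # 11*10*10*10 = 11000 combos

rng = np.random.default_rng(42)
idx = rng.choice(len(grid), size=1116, replace=False)   # down-select to 1116 parameter sets
selected_params = [grid[i] for i in idx]
# Each (A, C, E0, Eg) tuple is then fed to the Tauc-Lorentz model (eqn (1)-(5))
# to generate one pair of n(lambda), k(lambda) spectra over 350-1000 nm.
```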
Section S2† (Visualization of thicknessML activation maps of example R, T spectra) peeks into the black box of thicknessML and visualizes activation maps of example R, T spectra (from the source dataset). The four rows of activation maps correspond to the outputs of the four convolutional layers (after ReLU activation), respectively (ten filters are randomly chosen for each convolutional layer to produce the activation maps). Certain filters are maximally activated at peaks or valleys of the R, T spectra, features that are closely related to film thickness.
The individual thickness loss for sample i is a heteroskedastic log-cosh loss:

$$\text{hloss}_i^{d} = \left(1 + 1.5\,\frac{2010 - d_i}{2010}\right)\text{loss}_i^{d} \qquad (6)$$

$$\text{loss}_i^{d} = \log\!\big(\cosh(\hat{d}_i - d_i)\big) \qquad (7)$$

$\hat{d}_i$ denotes the predicted thickness for sample i. Given our d range of 10 nm to 2010 nm, we amplify the individual sample loss by how much thinner the sample is than 2010 nm. If $d_i$ is 2010 nm, its loss is simply $\text{loss}_i^{d}$, the log-cosh loss for thickness shown in eqn (7); if $d_i$ is close to 0 nm, its loss is close to $2.5 \times \text{loss}_i^{d}$. Here, the factor 1.5 is tunable. This heteroskedastic loss penalizes d prediction deviations for smaller thicknesses more than for larger thicknesses, yielding a more consistent relative error overall. The overall loss is the average of the individual sample heteroskedastic losses $\text{hloss}_i^{d}$, as shown in eqn (8):

$$\text{loss}^{d} = \frac{1}{N}\sum_{i=1}^{N} \text{hloss}_i^{d} \qquad (8)$$

For the n loss function, we adopt a similar heteroskedastic form, $\text{hloss}_i^{n} = \left(1 + \frac{10 - n_i}{10}\right)\log\!\big(\cosh(\hat{n}_i - n_i)\big)$. The prediction deviation of n is proportionally amplified by how far the actual n lies below 10; if $n_i$ is close to 0, its loss is close to $2 \times \text{loss}_i^{n}$. For the k loss function, we simply use the log-cosh loss without heteroskedasticity because of its many near-zero and zero values. The corresponding loss scalers $w_n$ and $w_k$ are recorded in Section S1† thicknessML hyperparameters.
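A minimal TensorFlow sketch of the thickness loss, as reconstructed from eqn (6)-(8), is given below; the numerically stable log-cosh form and the tensor shapes are implementation choices of this sketch, not the authors' released code.

```python
import tensorflow as tf

D_MAX = 2010.0  # upper end of the thickness range (nm)

def logcosh(x):
    # numerically stable log(cosh(x)) = x + softplus(-2x) - log(2)
    return x + tf.math.softplus(-2.0 * x) - tf.math.log(2.0)

def heteroskedastic_d_loss(d_true, d_pred, alpha=1.5):
    """Thickness loss of eqn (6)-(8): log-cosh error, up-weighted for thinner films.

    d_true, d_pred: 1-D tensors of actual and predicted thicknesses (nm).
    """
    per_sample = logcosh(d_pred - d_true)                # eqn (7)
    scale = 1.0 + alpha * (D_MAX - d_true) / D_MAX       # eqn (6); alpha (= 1.5) is tunable
    return tf.reduce_mean(scale * per_sample)            # eqn (8)
```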
Fig. 6 A simplified neuron representation of fully connected layers n and n + 1, where Wn denotes the weights associated with layer n, and fn the activation function.
Different concentrations of the precursor solution (with a 1 : 1 molar ratio of the precursors) and different spin coating speeds are used and recorded in Table 4. The deposited films are then measured by UV-Vis with an Agilent Cary 7000 UV-Vis-NIR spectrophotometer, and by profilometry with a KLA Tencor P-16+ Plus Stylus Profiler.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2dd00149g
‡ Now at: Department of Mathematics, National University of Singapore, 10 Lower Kent Ridge Road, Singapore 119076
§ Now at: Xinterra, 77 Robinson Road, Singapore 068896, Singapore
¶ Now at: Department of Materials Science and Engineering, City University of Hong Kong, Kowloon, Hong Kong, 999077, P. R. China
|| Now at: Microsoft AI for Good, Redmond, WA 98052, USA
** Now at: School of Materials Science and Engineering, Northwestern Polytechnical University, Xi'an, Shaanxi, 710072, P. R. China
This journal is © The Royal Society of Chemistry 2023