Open Access Article
Tanzim Rahman‡ a, Ahnaf Tahmid‡ a, Shifat E. Arman b, Tanvir Ahmed c, Zarin Tasnim Rakhy c, Harinarayan Das d, Mahmudur Rahman e, Abul Kalam Azad a, Md. Wahadoszamen c and Ahsan Habib *a
a Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka-1000, Bangladesh. E-mail: mahabib@du.ac.bd
b Department of Robotics and Mechatronics Engineering, University of Dhaka, Dhaka-1000, Bangladesh
c Department of Physics, University of Dhaka, Dhaka-1000, Bangladesh
d Bangladesh Atomic Energy Commission, Dhaka-1000, Bangladesh
e Department of Electrical and Electronic Engineering, Dhaka University of Engineering & Technology, Gazipur-1707, Bangladesh
First published on 9th December 2024
Tandem neural networks for inverse design can only make a single prediction per target, which limits the diversity of predicted structures. Here, we use a conditional variational autoencoder (cVAE) for the inverse design of core–shell nanoparticles. A cVAE is a type of generative neural network that generates multiple valid solutions for the same input condition. We generate a dataset from Mie theory simulations, including ten commonly used materials in plasmonic core–shell nanoparticle synthesis. We compare the performance of the cVAE with that of the tandem model. Our cVAE model shows higher accuracy, with a lower mean absolute error (MAE) of 0.013 compared to 0.046 for the tandem model. Robustness analysis with 100 test spectra confirms the improved reliability and diversity of the cVAE. To validate the effectiveness of the cVAE model, we synthesize Au@Ag core–shell nanoparticles. The cVAE model offers high accuracy in predicting material composition and spectral features. Our study shows the potential of cVAEs as generative neural networks in producing accurate, diverse, and robust nanoparticle designs.
Conventionally, inverse problems are solved using gradient-based techniques such as topology optimization10 and gradient-free methods such as genetic algorithms and particle swarm optimization. Although useful, these methods have drawbacks in nanophotonic inverse design. For example, gradient-based algorithms often converge to local minima, and gradient-free methods can suffer from slow convergence and high computational costs.11 To address these shortcomings, deep learning approaches have recently been incorporated into nanophotonic inverse design.12 In an inverse design model, the electromagnetic (EM) response serves as the input, and the model directly generates the corresponding structure as the output. Two types of deep neural architectures, deterministic and generative networks, can be used for nanophotonic inverse design. In nanophotonics, the task of a deterministic network (e.g., a basic feed-forward neural network) of deducing geometric parameters from optical properties is “ill-posed”, because multiple structures can produce the same optical response (a one-to-many mapping). Consequently, during training, such networks often converge to non-physical averages of different solutions, learning invalid solutions that degrade the network's performance. To address this challenge, researchers use tandem architectures that integrate both an inverse network (mapping optical response to structure) and a forward solver network (mapping structure to optical response).13 Tandem networks are effective in the inverse design of numerous nanophotonic structures.13–15 He et al. utilized a tandem model to predict the structures of multilayered nanoparticles.16 Although tandem networks offer good accuracy, they are limited to providing a single prediction for a specific inverse design task, even if multiple designs can achieve the target.
This limitation reduces the diversity of the predicted structures. Diverse designs aid the fabrication process by offering more candidate options, which is especially beneficial for shapes that are challenging to manufacture at the nanoscale. Generative networks, such as generative adversarial networks (GANs) and variational autoencoders (VAEs), can stochastically output multiple different predictions to address this limitation.17,18 These networks can generate numerous optimized solutions by varying their latent space, which contains an abstraction of the learned parameters. VAEs encode input data into a lower-dimensional latent space and then decode it back, creating a compact, continuous, and normally distributed latent space. A conditional variational autoencoder (cVAE) allows further control over the generated outputs by imposing a condition, which is used in inverse design to achieve desired attributes.19,20 This lets the cVAE network explore the latent space and generate multiple valid designs for the same required condition. However, a comprehensive literature search revealed a lack of research on applying generative neural networks to the inverse design of ENPs.
Here, we use generative neural network architectures to predict core–shell nanoparticle structures based on specific optical responses. Recent reports indicate that cVAEs are the most efficient generative neural networks for the nanophotonic inverse design task.18 Therefore, we focus on cVAEs for nanoparticle inverse design. We also evaluate the effectiveness of cVAE by comparing results with those obtained from the tandem neural network. For such comparisons, we use evaluation metrics of accuracy, robustness, and diversity. We present a dataset obtained from Mie theory simulations, utilizing the Python module Scattnlay.21 Our results from comparing the input condition spectra with the predicted particle spectra show that the tandem network has a mean absolute error (MAE) of 0.046, while the cVAE has an MAE of 0.013. The lower MAE signifies better accuracy for the cVAE model. Our robustness analysis with 100 test spectra shows that the cVAE has higher robustness compared to the tandem network. Furthermore, the cVAE model exhibits significant diversity, generating multiple valid solutions for the same input condition. To validate the effectiveness of the cVAE model, we synthesize Au@Ag core–shell nanoparticles. Our cVAE model demonstrates high accuracy in ENP inverse design and can correctly predict key plasmonic modes. Our existing dataset, which will be publicly accessible on GitHub for further research, can be extended to accommodate multilayer nanoparticle inverse design.
750 samples. The structural parameters (t1, t2) were both sampled uniformly from the range [1, 99] nm, subject to the constraint that t1 + t2 does not exceed 100 nm. The absorption cross-section spectrum was computed over the [300, 800] nm wavelength range with a 2.505 nm step size. For both models (i.e., the cVAE and the tandem neural network), we used 87 210 samples for training, 21 802 samples for validation, and the remaining 5738 samples for testing. The spectra were normalized to values between 0 and 1 for improved training of the neural network models. The dataset consisted of 205 columns, where the first two columns represented the material IDs m1 and m2 for the core and shell sections, respectively. The next two columns contained their corresponding thicknesses t1 and t2. The dielectric host medium, either air (0) or water (1), was indicated in the fifth column; only the water medium was considered for this work. The remaining 200 columns contained the absorption cross-section data for each nanosphere.
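The row layout and the thickness constraint described above can be sketched as follows. This is a minimal illustration, not the authors' generation script: `toy_spectrum` is a hypothetical stand-in for the Mie-theory solver, and the uniform sampling of material IDs, the rejection-sampling strategy, and the per-spectrum min-max normalization are all assumptions made for the example.

```python
import numpy as np

N_SPECTRUM = 200      # absorption cross-section points per particle
N_MATERIALS = 10      # material IDs 1..10
MEDIUM_WATER = 1      # host-medium flag: air = 0, water = 1


def toy_spectrum(m1, m2, t1, t2):
    """Hypothetical placeholder for a Mie-theory solver: one Gaussian peak."""
    idx = np.arange(N_SPECTRUM)
    return np.exp(-(((idx - (t1 + t2)) / 20.0) ** 2))


def sample_thicknesses(rng, t_min=1.0, t_max=99.0, t_total_max=100.0):
    """Rejection-sample (t1, t2) from [t_min, t_max] with t1 + t2 <= t_total_max."""
    while True:
        t1, t2 = rng.uniform(t_min, t_max, size=2)
        if t1 + t2 <= t_total_max:
            return t1, t2


def make_row(rng, spectrum_fn):
    """Build one 205-column row: m1, m2, t1, t2, medium, then 200 spectrum values."""
    # Assumption: material IDs are drawn uniformly and independently.
    m1, m2 = rng.integers(1, N_MATERIALS + 1, size=2)
    t1, t2 = sample_thicknesses(rng)
    spectrum = np.asarray(spectrum_fn(m1, m2, t1, t2), dtype=float)
    # Assumption: each spectrum is min-max normalized to [0, 1] on its own.
    span = spectrum.max() - spectrum.min()
    if span > 0:
        spectrum = (spectrum - spectrum.min()) / span
    else:
        spectrum = np.zeros_like(spectrum)
    return np.concatenate(([m1, m2, t1, t2, MEDIUM_WATER], spectrum))


row = make_row(np.random.default_rng(0), toy_spectrum)
```

Swapping `toy_spectrum` for a real solver call would leave the row layout unchanged, since only the last 200 columns depend on it.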
In the cVAE model, both the particle design and the optical response (condition) were applied as inputs to the encoder network, which converted them into a latent vector. The decoder network, in turn, received the latent vector and the optical response (condition) as inputs and generated the predicted inverse design. In this way, the model can be trained on multiple designs for the same input condition, distinguished by their latent variables. Hence, exploring different values in the latent space with the same input condition can yield multiple valid designs. The model uses ResNet-style CNN branches along with dense layers and four additional reconstruction losses for the design parameters of the particle. Categorical cross-entropy was used as the loss function for the core and shell material predictions, and mean squared error (MSE) was used as the loss function for the predicted layer thicknesses. Furthermore, a Kullback–Leibler (KL) divergence loss was used for training the encoder to ensure the proper construction of the latent z space.43 The encoder and decoder networks were constructed and trained jointly as a single model rather than separately. After training, the decoder was separated and saved for use in inverse predictions. All the networks were trained using the AdamW optimizer. Additionally, a callback function that reduces the learning rate once the training loss stops improving was used. Furthermore, the batch size was doubled after every 16 epochs. This adaptive batch size technique significantly reduces the training time while minimally affecting the accuracy.44
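To make the loss terms concrete, the following NumPy sketch combines the two categorical cross-entropy terms, the two MSE terms, and the KL divergence for a diagonal Gaussian posterior. It is an illustration under stated assumptions rather than the authors' implementation: the equal weighting of the reconstruction terms, the `beta` weight on the KL term, and the `base_batch_size` of 32 are hypothetical choices not given in the text.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), averaged over the batch."""
    per_sample = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)
    return per_sample.mean()


def categorical_cross_entropy(probs, labels, eps=1e-12):
    """Mean cross-entropy; `labels` are 0-based integer class indices."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked + eps))


def cvae_loss(m1_probs, m2_probs, t1_pred, t2_pred,
              m1, m2, t1, t2, mu, log_var, beta=1.0):
    """Four reconstruction terms plus a KL term (weights are assumptions)."""
    recon = (categorical_cross_entropy(m1_probs, m1)
             + categorical_cross_entropy(m2_probs, m2)
             + np.mean((t1_pred - t1) ** 2)
             + np.mean((t2_pred - t2) ** 2))
    return recon + beta * kl_divergence(mu, log_var)


def batch_size_at(epoch, base_batch_size=32, period=16):
    """Batch size that doubles after every `period` epochs (0-based epoch)."""
    return base_batch_size * 2 ** (epoch // period)
```

With a 0-based epoch counter, `batch_size_at` returns the base size for epochs 0 to 15 and twice that for epochs 16 to 31, which is one reading of "doubled after every 16 epochs".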
Second, the Au@Ag nanoparticles were prepared by employing a seed-mediated growth chemical reduction protocol.46 To achieve this, 15 mL of AuNPs (seed solution) were placed in a beaker and continuously stirred at room temperature (25 °C) at a speed of approximately 1200 rpm using a magnetic hot plate. After successive additions of 300 μL of TSC (approximately 1%) and 9.0 μL of 100 mM ascorbic acid, the mixture was vigorously stirred for 5 minutes. Then, 90 μL of AgNO3 (approximately 10 mM) solution was added to the mixture at a rate of one drop per 40 seconds. The addition of the specified amount of AgNO3 resulted in a gradual color change of the solution from wine-red to orange-yellow. This color change indicated the successful fabrication of Au@Ag core–shell NPs.
210 samples. The input parameters include the core material ID (m1), shell material ID (m2), core thickness (t1), and shell thickness (t2). Material IDs range from 1 to 10 (ESI Fig. S1†), corresponding to the following materials: silver (Ag), aluminum (Al), gold (Au), copper (Cu), gallium arsenide (GaAs), indium arsenide (InAs), indium phosphide (InP), molybdenum (Mo), silicon (Si), and silicon dioxide (SiO2). Once the forward network is adequately trained, an inverse network is introduced, and training takes place in tandem (Fig. 1a-bottom panel).11 In the inverse model, the desired optical spectra are provided as conditions to generate the design parameters (m1, m2, t1, t2), which are then fed into the forward network to predict the outcome. As shown in Fig. 1a, the error is calculated by comparing the predicted outcome with the desired outcome. This error is then used to adjust the inverse network (ESI Text 1†). Fig. 1b and S2† show the training loss and validation loss curves during training. In Fig. 1c, we present four random samples where the model-predicted particle designs closely matched the particle designs that produced the input spectra. The optical response is calculated for the predicted inverse design (dashed black line) using Mie theory (Scattnlay) and then compared with the input condition spectra (solid red line). The results in Fig. 1c reveal that the corresponding predicted and target spectra are nearly identical, demonstrating the competence of our model. However, the four solutions predicted by the tandem model for four input conditions may not be the only possible solutions. Therefore, one major disadvantage of the tandem model is that it can only predict one design at a time. As a result, other potential solutions remain unexplored. The lack of diversity represents a significant limitation, particularly during fabrication when multiple designs are required. The use of cVAE addresses this limitation of the tandem network.18
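The tandem training loop described above (a frozen forward model, with the spectral error used to update only the inverse model) can be illustrated with a deliberately simplified toy. Both networks are replaced here by linear maps, which is an assumption made purely to keep the example short and verifiable; the actual models are nonlinear neural networks, and the function name and step-size rule are hypothetical.

```python
import numpy as np


def train_tandem_inverse(forward_w, spectra, n_steps=200):
    """Train a linear inverse map through a frozen linear forward map.

    forward_w : (n_spec, n_design) frozen forward model, spectrum = forward_w @ design
    spectra   : (n_samples, n_spec) target spectra
    Returns the inverse weights (n_design, n_spec) and the loss history.
    """
    n_samples, n_spec = spectra.shape
    scale = 2.0 / (n_samples * n_spec)
    inverse_w = np.zeros((forward_w.shape[1], n_spec))
    # Step size 1/L, where L bounds the curvature of this quadratic loss.
    lipschitz = scale * np.linalg.norm(forward_w, 2) ** 2 * np.linalg.norm(spectra, 2) ** 2
    lr = 1.0 / lipschitz
    history = []
    for _ in range(n_steps):
        designs = spectra @ inverse_w.T      # inverse model: spectrum -> design
        recon = designs @ forward_w.T        # frozen forward model: design -> spectrum
        residual = recon - spectra           # error between predicted and desired spectra
        history.append(float(np.mean(residual ** 2)))
        grad = scale * forward_w.T @ residual.T @ spectra
        inverse_w -= lr * grad               # only the inverse weights are updated
    return inverse_w, history


rng = np.random.default_rng(0)
forward_w = rng.normal(size=(8, 3))          # toy "pretrained" forward model
true_designs = rng.normal(size=(50, 3))
spectra = true_designs @ forward_w.T
inverse_w, history = train_tandem_inverse(forward_w, spectra)
```

Because the loss is measured in spectrum space, the inverse model is never penalized for choosing a design that differs from the one that generated the target, provided the forward model reproduces the spectrum. The same property also means a deterministic inverse model returns only one design per target.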
The capability to identify multiple possible solutions with a cVAE is demonstrated in Fig. 3, S3 and S4.† Different regions of the latent space map to different material and thickness predictions for the particle. ESI Fig. S5† illustrates how we choose z values. First, we consider a search area centered on the origin (0, 0). Then, we examine a 5 × 5 grid of latent-space points within the search area, input each point into the cVAE, and compare the spectra of the predicted particles with the input spectrum. The z-value with the lowest MAE is considered the best z-value. The MAE is calculated using the equation:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
where y_i represents the true values, ŷ_i represents the predicted values, and n is the total number of data points taken on the spectrum. For the next iteration, we regard the best z-value as the new origin (center) and reduce the search area to half of its original size. We repeat the process five times and take the best result at the end as the final model prediction. Fig. 3 represents the z-values found after five iterations. This figure is designed not only to show the prediction results based on the optimal z-values but also to illustrate the impact of slightly different z-values. From left to right, the z1 value changes by 1 in each graph, and from bottom to top, the z2 value changes by 1. Thus, the entire Fig. 3 functions as a larger graph with z1 on the x-axis and z2 on the y-axis. Each sub-figure represents a point on this larger graph, arranged in a 3 × 3 grid for better visibility. This visualization demonstrates how the cVAE model facilitates the exploration of multiple solutions for this inverse design task.
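The iterative latent-space search described above can be sketched as a coarse-to-fine grid refinement. This is a minimal sketch, not the authors' code: `objective` is a hypothetical callable that would decode a latent point, simulate the spectrum, and return its MAE against the target; the initial `half_width` of 4.0 is an assumed value; and "half of its original size" is interpreted here as halving the side length of the search square.

```python
import itertools
import numpy as np


def refine_latent(objective, center=(0.0, 0.0), half_width=4.0, grid=5, iterations=5):
    """Coarse-to-fine search over a 2D latent space.

    objective : callable mapping a latent point z (shape (2,)) to an error value
    Returns the best latent point found and its error.
    """
    center = np.asarray(center, dtype=float)
    best_z, best_err = center, float("inf")
    for _ in range(iterations):
        offsets = np.linspace(-half_width, half_width, grid)
        # Evaluate every point of the grid x grid lattice around the current center.
        for dz1, dz2 in itertools.product(offsets, offsets):
            z = center + np.array([dz1, dz2])
            err = objective(z)
            if err < best_err:
                best_z, best_err = z, err
        center = best_z        # recenter on the best point found so far
        half_width /= 2.0      # shrink the search square
    return best_z, best_err
```

With the default settings, the search evaluates 5 × 5 × 5 = 125 latent points in total, so the cost is dominated by 125 decoder and simulation calls per target.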
We use MAE and root mean squared error (RMSE) to evaluate accuracy, as MAE reflects overall performance, while RMSE highlights the influence of larger errors.18 The RMSE is expressed as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
where y_i and ŷ_i refer to the target spectrum values and the spectrum values for the inverse designed structure, respectively, and n is the total number of data points taken across the spectrum. The tandem model gives average MAE and RMSE values of 0.046 and 0.068, respectively, and the cVAE shows average MAE and RMSE values of 0.013 and 0.019, respectively (Fig. 4a and S6a†). Both MAE and RMSE measure the differences between the spectral responses of the predicted designs and the input condition spectra. Therefore, lower MAE and RMSE signify better accuracy.
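As a small worked example of the two metrics, the sketch below computes the mean absolute error and the root mean squared error for a three-point toy spectrum. The arrays are made-up values chosen only to show that a single larger deviation raises RMSE more than MAE.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error between two spectra of equal length."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error between two spectra of equal length."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


target = [0.0, 0.5, 1.0]       # toy target spectrum
predicted = [0.1, 0.5, 0.7]    # toy predicted spectrum

# Absolute errors are 0.1, 0.0, 0.3, so MAE = 0.4 / 3.
# Squared errors are 0.01, 0.0, 0.09, so RMSE = sqrt(0.1 / 3).
print(mae(target, predicted), rmse(target, predicted))
```

RMSE is never smaller than MAE for the same pair of spectra, which is why the reported RMSE values sit above the corresponding MAE values.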
Next, we compare the performance of the two models in terms of their robustness. From the new test set containing 500 samples, we randomly select 100 test targets and obtain the inverse predicted structures. We then vary t1 and t2 randomly in a range of −5% to +5% to create a perturbed set of the predicted designs and simulate the optical response. We again calculate the MAE and RMSE of the perturbed spectra with respect to the target spectra. Our results show that the cVAE achieves better robustness, with MAE and RMSE values of 0.054 and 0.076, respectively, compared to the tandem model, which has higher MAE and RMSE values of 0.074 and 0.104, respectively (Fig. 4b and S6b†). This indicates that the optical responses of fabricated structures predicted by the cVAE deviate less from the target responses, demonstrating better fabrication tolerance. Therefore, our results demonstrate that the generative cVAE produces more accurate and robust results in the ENP spectrum inverse design compared to the deterministic tandem model.
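The perturbation test described above can be sketched as follows. This is an illustrative outline under assumptions: the perturbation factors are drawn uniformly and independently for t1 and t2 (the text only states a ±5% range), and `simulate` is a hypothetical callable standing in for the Mie-theory solver.

```python
import numpy as np


def perturb_design(t1, t2, rng, rel=0.05):
    """Scale t1 and t2 by independent random factors in [1 - rel, 1 + rel]."""
    f1, f2 = rng.uniform(1.0 - rel, 1.0 + rel, size=2)
    return t1 * f1, t2 * f2


def robustness_mae(designs, targets, simulate, rng, rel=0.05):
    """Average MAE between target spectra and spectra of perturbed designs.

    designs  : iterable of (m1, m2, t1, t2) predicted by an inverse model
    targets  : iterable of target spectra, one per design
    simulate : callable (m1, m2, t1, t2) -> spectrum (hypothetical solver)
    """
    errors = []
    for (m1, m2, t1, t2), target in zip(designs, targets):
        p1, p2 = perturb_design(t1, t2, rng, rel)
        spectrum = np.asarray(simulate(m1, m2, p1, p2), dtype=float)
        errors.append(np.mean(np.abs(spectrum - np.asarray(target, dtype=float))))
    return float(np.mean(errors))
```

Running the same function on the tandem and cVAE predictions for the same targets and the same random seed would give a like-for-like comparison of fabrication tolerance.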
Finally, we compare the diversity of predicted structures between the two models. The cVAE generates diverse outputs for the same input condition by sampling from the latent space, introducing variability in the outputs. This feature is beneficial for nanofabrication tasks, as diverse structures aid fabrication by offering more candidate designs for challenging shapes. Conversely, the tandem model provides a single, deterministic output for a given input, limiting its diversity. In accuracy calculations, we select the best solution (the closest match to the desired spectrum) by minimizing MAE. This approach impacts the cVAE's diversity. If researchers prioritize diversity, they can select the top n structures whose spectra best match the desired spectrum instead of just the best one. These top n structures provide a diverse set of solutions close to the target.
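Selecting the top n candidates instead of the single best one, as suggested above, amounts to ranking sampled designs by their spectral error. The sketch below assumes the candidate designs and their MAE values have already been computed; the function name and the example values are hypothetical.

```python
import numpy as np


def top_n_designs(designs, errors, n=5):
    """Return the n candidate designs with the lowest error, best first."""
    order = np.argsort(errors)[:n]
    return [(designs[i], float(errors[i])) for i in order]


# Made-up candidates as (m1, m2, t1, t2) with made-up MAE values.
candidates = [(3, 1, 20.0, 10.0), (3, 1, 25.0, 8.0), (1, 3, 15.0, 30.0), (3, 4, 22.0, 12.0)]
candidate_mae = [0.030, 0.012, 0.045, 0.018]
shortlist = top_n_designs(candidates, candidate_mae, n=2)
```

A shortlist of this kind could then be filtered by additional criteria such as manufacturability or cost, which the error ranking alone does not capture.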
To show the diversity of cVAE, we select a test spectrum (target), sample the latent space for potential solutions, and calculate the MAE between each potential solution and the target input. Fig. 5 (ESI Fig. S7†) shows the MAE (RMSE) distributions for a particular test target. The use of cVAE provides multiple diverse nanoparticle structures that match the desired spectrum to varying degrees (Fig. 5). This diversity in predictions offers flexibility in choosing the best structure based on additional criteria like manufacturability or cost. The tandem model, on the other hand, provides a single, highly optimized structure that closely matches the desired spectrum, focusing on precision and suitability when a specific, optimal solution is required.
To further assess the performance of the cVAE in ENP spectrum inverse design, we synthesize Au (core)–Ag (shell) nanoparticles. As evident in our TEM image (Fig. 6a), the synthesized nanoparticles are not completely spherical and isotropic in composition; the core and shell thicknesses are not symmetrical throughout the particle. Furthermore, these particles exhibit a wide range of sizes, which affects their absorbance spectra and, in turn, the model predictions. These factors combined hinder the ability of an inverse design network to completely match the synthesized particles. Even so, using the cVAE model, by probing the z-space (ESI Fig. S5†), we find that the most optimized design closely matches the experimental measurements for z = [2.32, −1.26].
Fig. 6b compares the experimental absorption cross-section given to the models as the input condition (red curve) with the simulated responses of the inverse design particles predicted by the tandem (dashed blue curve) and cVAE (dashed black curve) models. The tandem model essentially predicts a single-layered gold nanoparticle with a radius of 53.5 nm. On the other hand, the cVAE model, due to its advantageous diversity and robustness, shows a close match to the target spectrum, resulting in a low MAE despite the thicknesses being significantly different from the true values (Fig. 6b). ESI Fig. S8† further illustrates that multiple solutions exist where the predicted spectrum closely matches the target spectrum with varied geometrical parameters. The cVAE model predicts the spectral features more accurately. For example, it predicts the ordinary modes47 of plasmon resonance corresponding to field localization at the outer surface of the Ag shell at around 500 nm. It also predicts the extraordinary modes47 of plasmon resonance at around 375 nm, which correspond to field enhancements at the core–shell interface.
Footnotes
† Electronic supplementary information (ESI) available: Text 1: tandem neural network; Text 2: conditional variational autoencoder neural network; Fig. S1: schematic of tandem and cVAE models for inverse design of core–shell nanoparticle; Fig. S2: training and validation loss curves for the forward networks (for tandem) used in the study; Fig. S3: additional examples of the cVAE's capability in predicting varied designs while maintaining accurate optical properties are included; Fig. S4: additional examples of the cVAE's capability in predicting varied designs while maintaining accurate optical properties are included; Fig. S5: illustration of the method for choosing z values; Fig. S6: comparison of root mean squared error (RMSE) between the spectral responses of the predicted designs and the input condition spectra for both the tandem model and the cVAE model; Fig. S7: RMSE distributions in the latent space for a particular test target; Fig. S8: comparison of the experimental absorption cross-section given to the models as the input condition and the simulated response from the predicted inverse design particles from the cVAE models; Table S1: material IDs and sources for refractive indices for the materials used during this work. See DOI: https://doi.org/10.1039/d4na00859f
‡ These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2025