Open Access Article
Jun Hyeong Kim†[a], Kyunghoon Lee†[a], Hyeonsu Kim[a,b], MinSoo Kang[c], Suk-Ku Chang[d], Yinglan Jin[d], Dongwook Kim[e] and Woo Youn Kim*[a]
[a] Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. E-mail: wooyoun@kaist.ac.kr
[b] Simulation Group, Samsung SDI, Samsung-ro, Yeongtong-gu, Suwon-si, 16678, Gyeonggi-do, Republic of Korea
[c] Material Development Department, SK Materials JNC, 5609 Dongtangiheung-ro, Hwa-sung City, 18469, Gyeonggi-do, Republic of Korea
[d] UNIPlus, 40 Omokcheon-ro 152beon-gil, Suwon-si, 16642, Gyeonggi-do, Republic of Korea
[e] Department of Chemistry, Kyonggi University, 154-42 Gwanggyosan-ro, Yeongtong-gu, Suwon-si, 16227, Gyeonggi-do, Republic of Korea. E-mail: dongwook-kim@kgu.ac.kr
First published on 2nd March 2026
Generative AI has emerged as a powerful tool for the discovery of organic light-emitting diode (OLED) materials. However, its practical application remains underexplored because of small datasets and the difficulty of ensuring molecular synthesizability. To overcome these challenges, we introduce a building block-based autoregressive generative model. Trained on a dataset of approximately 1000 OLED molecules, the model demonstrated refined control over key thermally activated delayed fluorescence (TADF) properties, including S1 energy and the singlet–triplet energy gap ΔEST, while generating structurally novel candidates through strategic repurposing of building blocks not previously associated with TADF activity. In addition, we experimentally validated its potential by synthesizing four AI-designed green emitters and integrating them into OLED devices, achieving external quantum efficiencies of up to 11.22% at 1000 cd m−2. The workflow also achieved more than a 100-fold reduction in the computational cost of quantum chemical calculations compared to conventional heuristic methods. This work bridges the gap between generative molecular design and experimental realization, showcasing a pathway to overcome data scarcity and unlock innovative discovery of optoelectronic materials.
One of the critical factors that influence the high-throughput virtual screening (HTVS) process is the quality of the initial chemical library. When the library has a low density of desirable candidates, it must be enlarged to identify valuable hits, which can eventually lead to a substantial increase in computational cost. Unfortunately, such low densities are often inevitable when using simple strategies like random sampling, due to the complex and specific requirements of functional materials. Traditionally, these issues have been tackled by deriving expert-driven rules to capture key characteristics of materials tailored for specific applications.3 For example, photovoltaic and organic light-emitting diode (OLED) materials are typically designed through the combination of donor and acceptor building blocks to facilitate charge-transfer excitations.7,12–14 While these rule-based strategies can be effective, they are unlikely to capture all the key characteristics. Consequently, HTVS guided solely by expert heuristics still requires the screening of millions to billions of compounds to identify high-potential candidates.
In this context, generative AI has emerged as a promising alternative to overcome the limitations of expert rule-based design strategies. These models can learn the underlying distribution of the training data and capture its key features, enabling the design of libraries enriched with promising candidates.15 In recent years, generative models have achieved remarkable advances in diverse fields, including image synthesis,16 natural language processing,17 and drug discovery.18 In materials discovery, generative AI has been used to design novel functional materials.19–31 However, applying generative AI to materials discovery poses several unique challenges. Unlike in those other domains, where models are typically exposed to millions or even billions of data points, datasets relevant to materials discovery often contain just a few thousand, or even just a few dozen, data points.32,33 Furthermore, synthesizability must be ensured, which is less pertinent to language or image generation tasks.34,35 These requirements are further complicated by the multi-objective nature of materials discovery, including factors such as stability, functionality, and performance.36 As a result, the experimental validation of molecules designed by generative models is uncommon in the materials domain.
In this work, we present a successful application of generative AI to accelerate materials discovery, supported by experimental validation. Among various types of materials, we focused on thermally activated delayed fluorescence (TADF) materials,37–39 a domain characterized by limited data availability (approximately 1000 samples) and significant challenges when relying solely on expert-driven design rules, such as the combination of donor and acceptor building blocks.
To address these challenges, we used the Building Block–based AutoRegressive (BBAR) molecular generative model, which was originally developed for drug discovery.40 BBAR generates molecules by iteratively assembling molecular building blocks and has shown strong performance in generating candidates with desired properties given as conditions, including those that rarely appear in the training data. Furthermore, BBAR can implicitly consider synthesizability; by preparing the building blocks through retrosynthetic decomposition of known compounds and restricting fragment attachments to compatible connection sites defined by the same retrosynthetic templates, the generated molecules are more likely to be synthesizable.
Leveraging the BBAR model, we synthesized four novel green TADF molecules and fabricated OLED devices to evaluate their performance. Fig. 1 illustrates the overall workflow of our study, which proceeded as follows. We first compiled a training set of approximately 1000 OLED molecules from various literature sources, focusing on molecular structures and their optoelectronic properties. Using this dataset, we trained the BBAR model to generate candidate molecules for red, green, and blue TADF emitters. The model produced a virtual chemical library of candidates projected to achieve significantly higher hit rates than traditional heuristic-based approaches, which was verified by extensive quantum chemical calculations. Targeting green emitters, we identified dozens of highly promising candidates for experimental validation. Since the BBAR model inherently considers synthesizability during generation, the selected candidates were readily synthesizable. This streamlined process culminated in four final molecules for device testing. Their device-level evaluation showed EQEs of up to 11.22% at 1000 cd m−2, highlighting the potential of generative AI to bridge computational design and experimental realization, particularly in data-scarce scenarios.
Here, the S1 energy is the vertical excitation energy from the ground state to the first excited singlet state (S1), which closely correlates with the display color range. ΔEST is the energy difference between S1 and the first excited triplet state (T1), known as one of the critical factors for enhanced reverse intersystem crossing (RISC).42–46 These two electronic properties were used as the ground-truth generation conditions for training the BBAR model, with further details provided in Section 2.2.
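As a rough illustration of how the two properties relate to color and to the gap, the sketch below computes ΔEST = E(S1) − E(T1) and converts an energy in eV to a photon wavelength via λ = hc/E. The function names and the example energies are illustrative assumptions, and a vertical excitation energy only approximates the measured emission wavelength.

```python
# Minimal sketch (not the authors' code): relate E(S1), E(T1), and wavelength.

# h*c in eV*nm (rounded CODATA value).
HC_EV_NM = 1239.841984


def delta_e_st(e_s1_ev: float, e_t1_ev: float) -> float:
    """Singlet-triplet gap, Delta E_ST = E(S1) - E(T1), in eV."""
    return e_s1_ev - e_t1_ev


def ev_to_nm(energy_ev: float) -> float:
    """Photon wavelength in nm for a given energy in eV (lambda = hc / E)."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_NM / energy_ev


if __name__ == "__main__":
    # Hypothetical excitation energies for a single molecule.
    e_s1, e_t1 = 2.40, 2.31
    print(f"Delta E_ST = {delta_e_st(e_s1, e_t1):.2f} eV")
    print(f"E(S1) corresponds to about {ev_to_nm(e_s1):.0f} nm")
```

Under this conversion, 2.5 eV and 2.17 eV map to roughly 496 nm and 571 nm, respectively.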
To obtain these two properties, we first generated initial molecular geometries using the ETKDG algorithm,47 followed by universal force field (UFF) optimization,48 both as implemented in the RDKit library (RDKit: Open-source cheminformatics, https://www.rdkit.org). To reduce computational cost, these geometries were pre-optimized with the GFN2-xTB method49 and then optimized using DFT with the B3LYP functional50 and the 6-31G(d) basis set, as implemented in Gaussian 16.51 From the optimized ground-state geometries, the vertical excitation energies to the S1 and T1 states were calculated using time-dependent DFT (TD-DFT) at the same level of theory. Out of 1182 molecules, 64 failed during the calculations, mainly due to optimization convergence failures, resulting in a final dataset of 1118 molecules. The distributions of their calculated electronic properties are provided in Section 1 of the SI.
The BBAR model is trained on sequences representing the stepwise assembly of molecular building blocks. These sequences are prepared by first decomposing OLED molecules in the training set using the BRICS (Breaking of Retrosynthetically Interesting Chemical Substructures) algorithm.52 Specifically, we utilized the BRICS implementation in RDKit, which applies 16 retrosynthetically inspired chemical decomposition patterns to partition molecules into synthetically meaningful units. Applying this procedure to the 1118 molecules in the training set yielded a total of 5905 initial fragments. After removing duplicates via canonicalization, we established a finalized library of 589 unique building blocks, which serves as the fundamental component set for molecular generation. Next, random assembly orders of these building blocks are enumerated. For example, if a molecule is composed of three connected fragments (A–B–C), possible training sequences might include A → A–B → A–B–C or C → B–C → A–B–C. These sequences are randomly sampled during training. The BBAR model is trained to reconstruct the original molecules by following the selected assembly sequence, conditioned on the calculated target properties (S1 energy and ΔEST). After several training epochs, the model is then used to generate new molecules tailored to specific property requirements.
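The A–B–C example can be made concrete with a small sketch that enumerates every order in which fragments can be added so that each new fragment bonds to the partial molecule built so far. This is one plausible reading of the sequence preparation, not the BBAR implementation; the adjacency-dictionary representation and the function name are assumptions.

```python
import random


def assembly_orders(adjacency):
    """Yield every fragment order where each added fragment touches the partial molecule."""

    def extend(order, placed):
        if len(order) == len(adjacency):
            yield tuple(order)
            return
        # Fragments bonded to something already placed, but not yet placed themselves.
        frontier = sorted({n for f in placed for n in adjacency[f]} - placed)
        for nxt in frontier:
            yield from extend(order + [nxt], placed | {nxt})

    for start in sorted(adjacency):
        yield from extend([start], {start})


# Linear molecule A-B-C expressed as a fragment connectivity graph.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
orders = list(assembly_orders(chain))

if __name__ == "__main__":
    for order in orders:
        print(" -> ".join(order))
    # During training, one order would be drawn at random for each molecule.
    print("sampled:", random.Random(0).choice(orders))
```

For the three-fragment chain this yields four orders, including the two mentioned above (A, B, C and C, B, A).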
We randomly split the dataset into training, validation, and test sets with a ratio of 75:15:10, and trained the model for 20 epochs. Details on the model architecture and training hyperparameters are provided in Section 2 of the SI. Additional technical information, including the training objective function, can be found in the original work by Seo et al.40
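A random 75:15:10 split of the 1118-molecule dataset could look like the sketch below. The seed, the rounding of the subset sizes, and the function name are assumptions; the paper does not state the exact subset sizes.

```python
import random


def split_dataset(items, ratios=(0.75, 0.15, 0.10), seed=0):
    """Shuffle items and cut them into train/validation/test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for reproducibility (assumed)
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Molecule indices stand in for the 1118 dataset entries.
train, val, test = split_dataset(range(1118))

if __name__ == "__main__":
    print(len(train), len(val), len(test))
```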
For comparison, we prepared a baseline set of molecules using the “random enumeration” method as described by Kim et al.5 This protocol follows the combinatorial assembly of predefined donors, linkers, and acceptors originally curated by Gómez-Bombarelli et al.,7 specifically forming donor–(linker)n–acceptor structures with n = 0, 1, 2. Considering all possible combinations results in a pool of approximately 5.49 million candidates, from which we randomly sampled 1000 molecules for testing. These selected molecules were then validated using the same procedure described above. This baseline set is referred to as the “Random” set.
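To show how quickly donor–(linker)n–acceptor enumeration grows, the sketch below enumerates label tuples for a toy fragment set and checks the count against the closed form D × A × (1 + L + L²). The fragment labels are hypothetical placeholders, and the sketch ignores attachment-site isomers and duplicate structures, so it does not reproduce the 5.49 million figure.

```python
from itertools import product


def enumerate_dla(donors, linkers, acceptors, max_linkers=2):
    """Yield donor-(linker)n-acceptor label tuples for n = 0..max_linkers."""
    for n in range(max_linkers + 1):
        for donor, acceptor in product(donors, acceptors):
            for bridge in product(linkers, repeat=n):
                yield (donor, *bridge, acceptor)


def count_dla(n_donors, n_linkers, n_acceptors, max_linkers=2):
    """Closed-form pool size: D * A * sum over n of L**n."""
    return n_donors * n_acceptors * sum(n_linkers ** n for n in range(max_linkers + 1))


# Hypothetical fragment labels, for illustration only.
donors = ["D1", "D2"]
linkers = ["L1", "L2", "L3"]
acceptors = ["A1", "A2"]
pool = list(enumerate_dla(donors, linkers, acceptors))

if __name__ == "__main__":
    print(len(pool), count_dla(len(donors), len(linkers), len(acceptors)))
```

A baseline such as the “Random” set would then correspond to drawing a fixed-size sample from this pool, for example with `random.sample`.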
Fig. 3 shows the distributions of S1 energy and ΔEST for the generated molecules targeting blue, green, and red colors, respectively. The red dashed lines indicate the target values for each property in each task. As shown in the figure, a large portion of the randomly enumerated molecules lie far from the target values, with substantial variance. In particular, the distributions of both S1 and ΔEST are approximately Gaussian, centered away from the desired regions, reflecting the inherent randomness of the enumeration process. This not only suggests that heuristic-based design strategies alone are insufficient to guide efficient TADF discovery, but also implies that such strategies may bias exploration toward irrelevant regions of the chemical space.
In contrast, the BBAR-generated molecules exhibit property distributions that are more closely aligned with the target values. For ΔEST, the distributions consistently peak near zero across all color targets, which is a favorable condition for TADF. For S1 energy, although the distributions remain relatively broad, the peak positions shift downward from blue to red, closely tracking the respective target values. These results indicate that the trained generative model can effectively guide exploration of TADF-relevant chemical space, even when trained on a relatively small dataset of approximately 1000 molecules.
Table 1 summarizes the detailed hit rates for each generation task. We considered a molecule to be a hit if its S1 energy falls within 1.6–2.0 eV for red, 2.17–2.5 eV for green, or 2.5–2.75 eV for blue emission, and its ΔEST value is below 0.2 eV. On average, 61.4% of the BBAR-generated molecules satisfy the ΔEST threshold, whereas only 6.3% of the randomly enumerated molecules do. This again confirms the earlier observation that heuristic donor–acceptor combinations alone are insufficient to consistently generate molecules with low ΔEST values.
| Target | Random: ΔEST (a) | Random: E(S1) (b) | Random: Both (c) | BBAR: ΔEST (a) | BBAR: E(S1) (b) | BBAR: Both (c) |
|---|---|---|---|---|---|---|
| Blue | 57 (6.3%) | 15 (1.7%) | 8 (0.9%) | 363 (53.0%) | 134 (19.6%) | 83 (12.1%) |
| Green | 57 (6.3%) | 16 (1.8%) | 11 (1.2%) | 379 (60.4%) | 147 (23.4%) | 104 (16.6%) |
| Red | 57 (6.3%) | 1 (0.1%) | 1 (0.1%) | 339 (70.8%) | 144 (30.1%) | 127 (26.5%) |

(a) Number of molecules satisfying ΔEST < 0.2 eV. (b) Number of molecules satisfying S1 energy criteria: 2.5–2.75 eV (blue), 2.17–2.5 eV (green), and 1.6–2.0 eV (red). (c) Number of molecules satisfying both ΔEST and S1 energy criteria. Percentages indicate the proportion relative to all successfully validated candidates.
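The hit criteria used in Table 1 can be written as a small predicate. This is a minimal sketch: the function and constant names are illustrative, and treating the S1 window bounds as inclusive is an assumption, since the text does not say how a value exactly at 2.5 eV is assigned.

```python
# S1 energy windows (eV) per target color, taken from the hit definition.
S1_WINDOWS_EV = {
    "red": (1.6, 2.0),
    "green": (2.17, 2.5),
    "blue": (2.5, 2.75),
}
DELTA_EST_MAX_EV = 0.2  # a hit requires Delta E_ST strictly below this value


def is_hit(color, e_s1_ev, delta_est_ev):
    """Return True if both the S1 window and the Delta E_ST criterion are met."""
    low, high = S1_WINDOWS_EV[color]
    in_window = low <= e_s1_ev <= high  # inclusive bounds are an assumption
    return in_window and delta_est_ev < DELTA_EST_MAX_EV


def hit_rate(color, molecules):
    """Fraction of (E(S1), Delta E_ST) pairs that count as hits for a color."""
    if not molecules:
        return 0.0
    return sum(is_hit(color, s1, gap) for s1, gap in molecules) / len(molecules)


if __name__ == "__main__":
    # Hypothetical (E(S1), Delta E_ST) pairs in eV.
    candidates = [(2.30, 0.05), (2.30, 0.25), (2.90, 0.10), (2.45, 0.19)]
    print(hit_rate("green", candidates))
```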
From the perspective of emission color, the difference is even more pronounced. On average, 24.4% of BBAR-generated molecules fall within the desired S1 energy ranges, whereas only around 1% of the randomly enumerated molecules meet this condition. This larger discrepancy arises because donor–acceptor heuristics are primarily effective in promoting low ΔEST (i.e., enhancing RISC), but are less suited for fine-tuning emission color. The hit rate for red-emitting molecules in the random enumeration set is particularly low, a mere 0.1%. This is likely because the donor and acceptor fragments used in the original work were primarily prepared for developing blue TADF materials. However, even for the blue emission task, the hit rate remains low at 1.7%, underscoring the limitations of heuristic-based design. On the other hand, the BBAR model achieved substantially higher hit rates for all emission colors without access to any prior domain knowledge. When both criteria were considered simultaneously, BBAR, as expected, achieved a much higher hit rate than the random enumeration method: on average, 18.4% of BBAR-generated molecules were hits, compared to just 0.73% for random enumeration.
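The averages quoted in the text follow from the per-color percentages in Table 1, as the short check below shows. It is a simple unweighted mean over the three color tasks; the dictionary layout and names are illustrative.

```python
from statistics import mean

# Percentages copied from Table 1, ordered (blue, green, red).
TABLE_1_PERCENT = {
    "Random": {
        "delta_est": (6.3, 6.3, 6.3),
        "s1": (1.7, 1.8, 0.1),
        "both": (0.9, 1.2, 0.1),
    },
    "BBAR": {
        "delta_est": (53.0, 60.4, 70.8),
        "s1": (19.6, 23.4, 30.1),
        "both": (12.1, 16.6, 26.5),
    },
}


def average_hit_rates(table):
    """Unweighted mean hit rate (in %) over the three color tasks."""
    return {
        method: {criterion: mean(values) for criterion, values in criteria.items()}
        for method, criteria in table.items()
    }


averages = average_hit_rates(TABLE_1_PERCENT)

if __name__ == "__main__":
    for method, row in averages.items():
        print(method, {name: round(value, 2) for name, value in row.items()})
```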
In addition to the hit rate analysis, we evaluated the chemical diversity of the generated molecules, as summarized in SI Tables 2 and 3 (see Section 3 of the SI). Across several metrics, the results indicate high chemical diversity. The novelty scores were close to 1, and the generated molecules exhibited a wide range of scaffolds and building blocks not prevalent in the training set. These findings confirm that BBAR effectively avoids structural redundancy while exploring a broad and diverse chemical space to generate molecules with the desired properties. We refer the reader to the SI for a detailed analysis.
Fig. 4 Visualization of the chemical space using t-SNE. Four sets of molecules are displayed: molecules generated by BBAR (BBAR), molecules satisfying ΔEST < 0.2 eV from the training set (TADF (train)), randomly enumerated molecules (Random), and TADF materials in Gómez-Bombarelli et al.7 (TADF (ref. 7)). The BBAR-generated candidates near the desired TADF materials demonstrate the model's ability to focus on exploring the desired chemical space. Annotated structures highlight representative molecules with their corresponding S1 energy (E(S1)) and ΔEST values. The pink labels represent randomly enumerated molecules with high ΔEST and S1 energies outside the visible range. The dark blue labels correspond to TADF materials from ref. 7, while the red labels denote BBAR-generated molecules with very low ΔEST and S1 energies within the visible range.
Fig. 4 reveals two key findings. First, the “BBAR” set shows substantial overlap with the “TADF (train)” set, significantly more than the “Random” set, indicating that the model successfully learned the property distributions of the training data. Second, the four experimentally reported TADF molecules (“TADF (ref. 7)”) are located within the same clusters as the training set, confirming that the training data is well suited for guiding the model in TADF material design. In contrast, the “Random” set (generated via donor–acceptor enumeration) shows clear separation from both the “BBAR” and “TADF (train)” clusters. This separation suggests that random heuristic methods often produce molecules with properties incompatible with viable TADF materials. For example, many “Random” molecules exhibit high ΔEST values (unfavorable for high RISC rates) and S1 energies outside the visible range (pink labels). The “BBAR” set, however, contains numerous candidates with low ΔEST and visible-range S1 energies (red labels), aligning with TADF requirements. This analysis underscores that, when well-suited building blocks are provided, BBAR can systematically explore a meaningful subset of TADF-relevant chemical space, representing a significant advancement over conventional heuristic strategies.
We believe this behavior stems from the fact that many of the OLED molecules in the training set were originally designed using conventional donor–acceptor combination strategies, which the BBAR model appears to have learned effectively. Remarkably, chemical concepts such as HOMO, LUMO, donor, or acceptor were never explicitly introduced to the model during training or generation. This suggests that BBAR was able to infer these design principles solely through its data-driven learning process. Beyond these simple heuristics, the model appears to have developed more sophisticated design strategies, as previously demonstrated by its focused exploration of TADF candidates. This distinction is likely to originate from fundamental differences in the generation process. Traditional heuristic approaches are essentially qualitative: they categorize fragments as either donors or acceptors and randomly combine one from each category, regardless of nuanced differences in electronic or structural properties. Defining more advanced design rules that account for such complexities is challenging to do manually. Therefore, these approaches often overlook the current molecular context and generate many unsuitable fragment combinations, leading to a low hit rate in TADF design.
In contrast, BBAR behaves more like a quantitative decision-making system. Given the molecular state and target properties, it flexibly evaluates both fragments and attachment positions as continuous probability values. This probabilistic reasoning allows BBAR to surpass heuristic strategies in TADF design, despite following a superficially similar donor–acceptor combination strategy. A more systematic interpretability analysis, such as statistically characterizing recurring decision patterns across generation processes, could potentially uncover new design principles beyond conventional heuristics. However, such quantitative analysis is challenging, particularly for the BBAR model, which was not designed with interpretability as a primary goal. Therefore, we leave this aspect for future work.
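The contrast between uniform heuristic picking and probability-weighted selection can be illustrated with a toy sketch. The scores below are made-up numbers; in the actual model they would come from a neural network conditioned on the partial molecule and the target properties, which is not reproduced here. The fragment labels and function names are hypothetical.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {key: math.exp(value - peak) for key, value in scores.items()}
    total = sum(exps.values())
    return {key: value / total for key, value in exps.items()}


def sample(probabilities, rng):
    """Draw one key according to its probability."""
    keys = list(probabilities)
    weights = [probabilities[key] for key in keys]
    return rng.choices(keys, weights=weights, k=1)[0]


# Hypothetical scores for the next building block, given some partial molecule.
fragment_scores = {"frag_1": 2.0, "frag_2": 0.5, "frag_3": -1.0}
fragment_probs = softmax(fragment_scores)

if __name__ == "__main__":
    rng = random.Random(0)
    print({name: round(p, 3) for name, p in fragment_probs.items()})
    print("weighted pick:", sample(fragment_probs, rng))
    # A donor/acceptor heuristic would instead pick uniformly within a category.
    print("uniform pick:", rng.choice(list(fragment_scores)))
```

The same weighted draw could be repeated for the attachment position once a fragment has been chosen.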
Another key observation is that BBAR can generate structurally novel candidates by incorporating unseen building blocks. Here, we define “unseen” as building blocks that do not appear in any TADF-active molecules within the training set, where TADF-active molecules are defined as those with ΔEST < 0.2 eV. Fig. 5c presents representative examples, where the green-highlighted components indicate such unseen blocks. Notably, these building blocks were initially present in non-TADF molecules, exhibiting large ΔEST values and S1 energies outside the visible range. However, BBAR successfully repurposes these blocks by combining them with other suitable units, shifting S1 into the visible region and significantly lowering ΔEST. This behavior highlights BBAR's ability to enhance chemical diversity by leveraging building blocks that were not previously associated with TADF-relevant properties in the training set. Conventional approaches such as random enumeration would entail substantial computational cost to achieve this, as expanding the donor/acceptor fragment sets rapidly increases the library size. Overall, these results demonstrate the dual advantages of generative AI for materials discovery: enabling efficient exploration of chemical space and facilitating the generation of structurally diverse candidates, even in low-data regimes (here, ∼1000 molecules).
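The definition of “unseen” building blocks amounts to a set difference, sketched below. The record layout (`blocks`, `delta_est`) and the block labels are hypothetical placeholders rather than the authors' data format.

```python
def unseen_blocks(training_set, threshold_ev=0.2):
    """Blocks that occur in the training set but in no TADF-active molecule."""
    all_blocks = set()
    active_blocks = set()
    for molecule in training_set:
        blocks = set(molecule["blocks"])
        all_blocks |= blocks
        if molecule["delta_est"] < threshold_ev:  # TADF-active by the text's definition
            active_blocks |= blocks
    return all_blocks - active_blocks


# Toy training records with made-up block labels and gaps (eV).
toy_training_set = [
    {"blocks": ["carbazole_like", "triazine_like"], "delta_est": 0.08},
    {"blocks": ["carbazole_like", "block_x"], "delta_est": 0.65},
    {"blocks": ["block_y", "block_x"], "delta_est": 0.90},
]

if __name__ == "__main__":
    print(sorted(unseen_blocks(toy_training_set)))
```

In this toy set, `carbazole_like` is not counted as unseen even though it also occurs in a non-TADF molecule, because it appears in at least one TADF-active molecule.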
Although BBAR biases generation toward synthesizable structures by assembling building blocks, the generated molecules are not guaranteed to be readily synthesizable. This is mainly because (i) we did not fully examine the commercial availability of all building blocks in the library, and (ii) even when the structures can be expressed as fragment assemblies, the corresponding synthetic routes may still be impractical under realistic experimental constraints. For these reasons, we performed retrosynthetic analysis on the green candidates using SciFinder-n55 to assess synthetic accessibility. As a result, we identified 17 viable candidates. Considering building-block availability and synthetic cost, we finally chose four candidates (G-1, G-2, G-3, and G-4) for device-level testing, as shown in Fig. 6a. Detailed synthetic procedures for each candidate are provided in the SI.
A common device architecture was used to characterize the four candidates under consistent conditions (Fig. 6b). Their photophysical and device-level properties are summarized in SI Table 4. Photoluminescence (PL) measurements in toluene confirmed that the emission wavelengths of the four candidates ranged from 515 to 542 nm, falling within the typical green emission range of 495–570 nm. The PL wavelengths were slightly red-shifted compared to the theoretical calculations, but this deviation was not significant. BBAR was therefore able to effectively guide the design of green-emitting TADF molecules. Transient PL decay measurements (Section 6 of the SI) further confirmed that all four candidates exhibited distinct TADF characteristics. Following device fabrication, the emission wavelengths exhibited a slight blue shift relative to their solution-state values. The corresponding electroluminescence spectra of the OLED devices are shown in Fig. 6c. Fig. 6d presents EQE as a function of current density, while Fig. 6e and f show current density and luminance as a function of voltage, respectively. The EQE values at 1000 cd m−2 were around 10% for all candidates, with G-4 achieving the highest performance at 11.22%. Remarkably, throughout the entire process, we considered only around 1000 candidates for quantum chemical evaluation, in contrast to a previous HTVS-based study7 that screened over one million molecules and performed quantum calculations on more than 400 000 to identify just four novel TADF emitters. Model inference typically takes less than a second, which is negligible compared with DFT calculations that often require tens to hundreds of minutes. This reduction therefore represents more than a 100-fold improvement in discovery efficiency and underscores the potency of generative AI to accelerate materials discovery while successfully yielding experimentally validated materials with device-level performance.
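The efficiency claim can be checked with back-of-the-envelope arithmetic, sketched below using the rounded figures from the text (about 1000 versus more than 400 000 quantum chemical calculations) and the average hit rates for both criteria (18.4% versus 0.73%). Because the inputs are approximate, the outputs are order-of-magnitude estimates only; the variable and function names are illustrative.

```python
def fold_reduction(baseline_count, new_count):
    """How many times fewer calculations the new workflow needs."""
    return baseline_count / new_count


# Rounded figures quoted in the text.
HTVS_QC_CALCULATIONS = 400_000  # "more than 400 000" in the earlier HTVS study
BBAR_QC_CALCULATIONS = 1_000    # "around 1000" candidates in this work

# Average hit rates (%) when both criteria are applied.
BBAR_HIT_RATE = 18.4
RANDOM_HIT_RATE = 0.73

reduction = fold_reduction(HTVS_QC_CALCULATIONS, BBAR_QC_CALCULATIONS)
enrichment = BBAR_HIT_RATE / RANDOM_HIT_RATE

if __name__ == "__main__":
    print(f"about {reduction:.0f}-fold fewer quantum chemical calculations")
    print(f"about {enrichment:.0f}-fold higher average hit rate")
```

With these inputs the calculation count drops by a factor of about 400, consistent with the stated "more than 100-fold" reduction.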
However, we note that our approach may still struggle in extremely low-data regimes (e.g., training sets with fewer than a couple of hundred examples) and should therefore be applied with caution in such scenarios. In these cases, we have observed that model performance can degrade substantially, even for relatively simple tasks such as log P-conditioned molecule generation. To address such challenges, more advanced methodologies such as pre-training on large-scale datasets and employing improved model architectures will likely be necessary. We leave this as future work and consider it a key direction in materials discovery. Nonetheless, our work demonstrates a successful integration of computational design and experimental validation, highlighting its potential to accelerate the development of functional materials for next-generation optoelectronic applications.
Supplementary information (SI): additional dataset and model details, extended evaluation results, synthesis procedures for selected TADF candidates, and photophysical and device characterization data. See DOI: https://doi.org/10.1039/d5dd00463b.
Footnote
† These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2026