Inverse design of thermally active composite via policy-transferred reinforcement learning

Songho Lee; Sukheon Kang; Jisoo Nam; Jecheon Yu; Miso Kim; Seunghwa Ryu

doi:10.1039/D6MH00239K

Songho Lee,†^a Sukheon Kang,†^a Jisoo Nam,^a Jecheon Yu,^a Miso Kim^a and Seunghwa Ryu*^ab

Author affiliations

* Corresponding authors

^a Department of Mechanical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
E-mail: ryush@kaist.ac.kr

^b KAIST InnoCORE PRISM-AI Center, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea

Abstract

Active composites (ACs) capable of autonomous shape transformation under external stimuli enable new opportunities for soft robotics, biomedical devices, and intelligent structures. However, the combinatorial design space of multi-material 3D printing makes inverse design computationally intractable. Here, a reinforcement learning (RL)-based framework is proposed that reformulates the inverse design of thermally active composites (TACs) as a sequential decision-making process. A 4 × 24 grid is decomposed into 24 column-wise decisions to minimize deformation error with respect to target trajectories. A single target design was first demonstrated for an individual trajectory. A target-conditioned policy was then learned using multiple targets to enable rapid design across diverse shapes. The multiple target policy was further transferred to accelerate single target optimization. Performance was evaluated against genetic algorithm (GA) and sequential subdomain optimization (SSO) using the number of samples and function evaluations (FEs) under identical termination criteria. Experimental validation was conducted using 4D-printed TAC specimens via grayscale digital light processing (g-DLP), and demonstrations with complex trajectories, including free-form KAIST logo patterns, confirm that the proposed framework achieves target accuracy (root mean square error ≤ 0.1) with low samples and FEs. This study demonstrates that an RL agent can rapidly perform sequential material design through long-term reward optimization, indicating its potential for extension to intelligent design and manufacturing pipelines.
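The column-wise formulation described in the abstract can be illustrated with a minimal sketch. Only the 4 × 24 grid, the 24 sequential column decisions, and the root-mean-square-error objective come from the abstract. Everything else is an assumption made for illustration: two candidate materials per cell (giving 16 possible columns per step), the hypothetical `toy_forward` function standing in for the thermo-mechanical simulation, a reward given only at the end of the episode, and the random policy used in place of a trained RL agent. It is not the authors' implementation.

```python
import numpy as np

ROWS, COLS = 4, 24        # grid size stated in the abstract
N_ACTIONS = 2 ** ROWS     # assumption: two candidate materials per cell


def decode_action(action):
    """Map an integer in [0, 16) to a binary 4-cell column (top row first)."""
    return np.array([(action >> r) & 1 for r in range(ROWS)], dtype=float)


def toy_forward(design):
    """Hypothetical stand-in for the simulation: design -> deflection curve."""
    # Assumed toy physics: a column bends in proportion to the difference
    # between its top-half and bottom-half material fractions.
    bend = design[:2].mean(axis=0) - design[2:].mean(axis=0)
    slope = np.cumsum(bend)            # integrate bending into slope
    return np.cumsum(slope) / COLS     # integrate slope into deflection


def rmse(a, b):
    """Root mean square error between two trajectories."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


def rollout(policy, target):
    """Fill the grid one column per step; the reward arrives at the end."""
    design = np.zeros((ROWS, COLS))
    for col in range(COLS):
        action = policy(design, col, target)   # one decision per column
        design[:, col] = decode_action(action)
    return design, -rmse(toy_forward(design), target)


# Target generated from a known design so that a perfect solution exists.
target = toy_forward(np.tile(decode_action(3)[:, None], (1, COLS)))

rng = np.random.default_rng(0)


def random_policy(design, col, target):
    """Placeholder policy; a trained agent would condition on its inputs."""
    return int(rng.integers(N_ACTIONS))


design, reward = rollout(random_policy, target)
print(design.shape, reward)
```

Passing the target into the policy mirrors the target-conditioned setting mentioned in the abstract: the same policy can in principle be queried for different target trajectories, whereas a single-target policy would ignore that argument.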

Article information

DOI: https://doi.org/10.1039/D6MH00239K
Article type: Communication
Submitted: 08 Feb 2026
Accepted: 27 Apr 2026
First published: 06 May 2026

Mater. Horiz., 2026, Advance Article

S. Lee, S. Kang, J. Nam, J. Yu, M. Kim and S. Ryu, Mater. Horiz., 2026, Advance Article, DOI: 10.1039/D6MH00239K

Materials Horizons
