Machine learning prediction of self-assembly and analysis of molecular structure dependence on the critical packing parameter
Abstract
Amphiphilic molecules spontaneously form self-assembled structures that depend on conditions such as molecular structure, concentration, and temperature. These structures exhibit various functionalities according to their morphology. The critical packing parameter (CPP) is used to correlate self-assembled structures with chemical composition. However, calculating it accurately requires information about both the molecular shape and the molecular aggregates, which makes it difficult to apply directly in molecular design. We aimed to use machine learning to predict the self-assembled structure of a molecule directly from its chemical structure and to analyze the factors that influence it. Dissipative particle dynamics simulations were used to reproduce many self-assembled structures formed by molecules with various chemical structures, and their CPPs were calculated. Machine learning models were then built with the chemical structures as input data and the CPPs as output data. Both the random forest and the gated recurrent unit models showed high prediction accuracy. Feature importance analysis and sample size dependence revealed that the amphiphilic nature of the molecules significantly influences the self-assembled structures. The results also show that selecting an appropriate molecular structure representation for each algorithm is crucial. These results should contribute to product development in the fields of materials science, materials chemistry, and medical materials.
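For readers unfamiliar with the CPP, the following minimal sketch illustrates the commonly used geometric definition, CPP = v / (a0 · lc), where v is the hydrophobic tail volume, a0 the effective head-group area, and lc the tail length. It is not the procedure used in this study, where the CPPs were obtained from dissipative particle dynamics simulations. The function names, the input values, and the morphology thresholds (the textbook ranges usually quoted for this parameter) are assumptions chosen for illustration.

```python
def critical_packing_parameter(tail_volume, head_area, tail_length):
    """Return CPP = v / (a0 * lc) from the textbook geometric definition.

    All three inputs must use consistent units (e.g. nm^3, nm^2, nm).
    """
    if tail_volume <= 0 or head_area <= 0 or tail_length <= 0:
        raise ValueError("all inputs must be positive")
    return tail_volume / (head_area * tail_length)


def expected_morphology(cpp):
    """Map a CPP value to the morphology range usually quoted for it.

    The thresholds are the commonly cited approximate boundaries; real
    systems also depend on concentration, temperature, and other factors.
    """
    if cpp <= 1 / 3:
        return "spherical micelle"
    if cpp <= 1 / 2:
        return "cylindrical micelle"
    if cpp <= 1:
        return "vesicle or bilayer"
    return "inverted structure"


# Hypothetical input values, not taken from the article.
cpp_small = critical_packing_parameter(0.25, 0.5, 2.0)
cpp_large = critical_packing_parameter(0.5, 0.5, 1.25)
print(cpp_small, expected_morphology(cpp_small))
print(cpp_large, expected_morphology(cpp_large))
```

In this picture, a molecule with a large head group relative to its tail has a small CPP and tends toward highly curved aggregates, while a bulkier tail raises the CPP and favors flatter or inverted structures.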
- This article is part of the themed collections: MSDE Recent HOT Articles and Machine Learning and Artificial Intelligence: A cross-journal collection