Leveraging ensemble machine learning models (XGBoost and random forest) and genetic algorithms to predict factors contributing to the liposomal entrapment of therapeutics
Abstract
Liposomes are regarded as safe and biodegradable drug delivery systems that carry therapeutic agents to desired body sites for numerous applications. Before any application, the developed liposomes should be properly characterized. Entrapment efficiency (EE%), the percentage of the drug that is entrapped in liposomes, is an important characterization parameter, particularly for expensive therapeutic agents and feeding materials. Digital technologies such as artificial intelligence (AI) and machine learning facilitate the analysis of large datasets and the prediction of outcomes needed to achieve the desired characteristics of drug delivery systems. Herein, we applied an ensemble machine learning approach, combining random forest and XGBoost models with genetic algorithms, to assess the cargo- and carrier-related factors that control the EE% of therapeutics, using a dataset of 500 data points. The results were interpreted using explainable AI. Across all drugs, the most influential variables affecting EE% in liposomes were, in order: water solubility, size, cholesterol to phospholipid molar ratio, and drug to lipid molar ratio. For drugs with log P > 1, the order was: water solubility, log P, phase transition temperature (Tm), and size. For drugs with log P < 1, the most influential factors were: water solubility, cholesterol to phospholipid molar ratio, log P, and size. Moreover, statistical analysis supported the use of the genetic algorithm to optimize the tree-based machine learning models. This optimization could facilitate the selection of appropriate parameters for analyzing the liposomization of various cargoes.
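To make the idea of optimizing tree-based models with a genetic algorithm more concrete, the following is a minimal sketch of how such a search over hyperparameters might look. It is not the implementation used in this work. The hyperparameter names and ranges, the operators (truncation selection, uniform crossover, random-reset mutation, elitism), and the function names are all illustrative assumptions. The fitness function is a toy stand-in; in a setting like the one described, it would plausibly be replaced by a cross-validated score of a random forest or XGBoost model trained to predict EE%.

```python
import random

# Hypothetical search space: names and ranges are illustrative assumptions,
# not the hyperparameters reported by the study.
SEARCH_SPACE = {
    "n_estimators": (50, 500),
    "max_depth": (2, 20),
    "min_samples_leaf": (1, 10),
}


def random_individual(rng):
    """Draw one candidate hyperparameter set uniformly from the search space."""
    return {name: rng.randint(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return {name: rng.choice((parent_a[name], parent_b[name])) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """Resample each gene within its bounds with probability `rate`."""
    child = dict(individual)
    for name, (lo, hi) in SEARCH_SPACE.items():
        if rng.random() < rate:
            child[name] = rng.randint(lo, hi)
    return child


def genetic_search(fitness, pop_size=20, generations=15, elite=2, seed=0):
    """Return the best individual found; higher fitness is better."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:elite]  # elitism: carry the best forward unchanged
        parents = ranked[: pop_size // 2]  # truncation selection
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            next_gen.append(mutate(crossover(a, b, rng), rng))
        population = next_gen
    return max(population, key=fitness)


def toy_fitness(params):
    """Stand-in objective peaking at (300, 8, 3); replace with a CV score."""
    return -(
        ((params["n_estimators"] - 300) / 450) ** 2
        + ((params["max_depth"] - 8) / 18) ** 2
        + ((params["min_samples_leaf"] - 3) / 9) ** 2
    )


best = genetic_search(toy_fitness)
print(best, toy_fitness(best))
```

Because the two best candidates are carried over unchanged in every generation, the best fitness in the population can never decrease from one generation to the next. Fixing the seed makes the search reproducible, which is convenient when comparing optimized and non-optimized models.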