A methodology to correctly assess the applicability domain of cell membrane permeability predictors for cyclic peptides

Abstract

Being able to predict the cell permeability of cyclic peptides is essential for unlocking their potential as a drug modality for intracellular targets. Because cell permeability has been studied widely but the number of available data points is limited, the reliability of machine learning (ML) models when predicting in previously unexplored regions of chemical space becomes a challenge. In this work, we systematically investigate the predictive capability of ML models from the perspective of their extrapolation to never-before-seen applicability domains, with a particular focus on the permeability task. Four predictive algorithms, namely Support-Vector Machine, Random Forest, LightGBM and XGBoost, were combined with a conformal prediction framework to characterize and evaluate applicability through uncertainty quantification. In a set of experiments, the efficiency and validity of the models' predictions under multiple calibration strategies were assessed on several external datasets from different parts of the chemical space. The experiments showed that predictors that generalize well to the applicability domain defined by the training data can fail to achieve similar performance on other parts of the chemical space. Our study proposes an approach to overcome this limitation by improving the efficiency of the models without sacrificing their validity. The trade-off between reliability and informativeness was balanced when the models were calibrated with a subset of the data from the new targeted domain. This study outlines an approach to enable the extrapolation of predictive power and restore the models' reliability via a recalibration strategy, without the need to retrain the underlying model.
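The recalibration idea in the abstract can be illustrated with a minimal sketch of class-conditional (Mondrian) inductive conformal prediction for a binary permeable/non-permeable task. This is not the authors' implementation: the function names, the nonconformity score (one minus the predicted probability of the candidate label), the definition of efficiency as the fraction of single-label prediction sets, and all toy numbers are assumptions made for illustration. The underlying model is represented only by its predicted probabilities, so switching the calibration set changes the prediction sets without any retraining.

```python
from typing import Dict, List, Sequence, Set


def calibrate(probs: Sequence[float], labels: Sequence[int]) -> Dict[int, List[float]]:
    """Collect per-class nonconformity scores from a calibration set.

    probs[i] is the model's predicted probability of class 1 for sample i.
    The score is 1 - P(true label), an assumed (commonly used) choice.
    """
    scores: Dict[int, List[float]] = {0: [], 1: []}
    for p1, y in zip(probs, labels):
        p_true = p1 if y == 1 else 1.0 - p1
        scores[y].append(1.0 - p_true)
    return scores


def p_value(score: float, cal_scores: Sequence[float]) -> float:
    """Conformal p-value: share of calibration scores at least as nonconforming."""
    at_least = sum(1 for s in cal_scores if s >= score)
    return (at_least + 1) / (len(cal_scores) + 1)


def predict_set(p1: float, cal: Dict[int, List[float]], significance: float) -> Set[int]:
    """Return every label whose p-value exceeds the significance level."""
    labels: Set[int] = set()
    for label in (0, 1):
        p_label = p1 if label == 1 else 1.0 - p1
        if p_value(1.0 - p_label, cal[label]) > significance:
            labels.add(label)
    return labels


def validity(sets: Sequence[Set[int]], labels: Sequence[int]) -> float:
    """Fraction of prediction sets that contain the true label."""
    return sum(1 for s, y in zip(sets, labels) if y in s) / len(labels)


def efficiency(sets: Sequence[Set[int]]) -> float:
    """Fraction of single-label prediction sets (assumed definition)."""
    return sum(1 for s in sets if len(s) == 1) / len(sets)


# Hypothetical predicted probabilities from one already-trained model.
# Calibration on data from the new target domain, not the training domain:
new_domain_cal = calibrate([0.9, 0.8, 0.7, 0.2, 0.1, 0.3], [1, 1, 1, 0, 0, 0])

test_probs = [0.95, 0.5, 0.15]
test_labels = [1, 0, 0]
sets = [predict_set(p, new_domain_cal, significance=0.3) for p in test_probs]
print("validity:", validity(sets, test_labels))
print("efficiency:", efficiency(sets))
```

Calling `calibrate` again with a subset of data from another domain yields a new set of calibration scores, which is the only step that changes between domains in this sketch; the model that produced the probabilities stays fixed.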

Article information

Article type: Paper
Submitted: 26 Feb 2024
Accepted: 24 Jul 2024
First published: 30 Jul 2024
This article is Open Access (Creative Commons BY license)

Digital Discovery, 2024, Advance Article

G. Geylan, L. De Maria, O. Engkvist, F. David and U. Norinder, Digital Discovery, 2024, Advance Article, DOI: 10.1039/D4DD00056K

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
