A Multi-Task Learning Approach for Prediction of Missing Bioactivity Values of Compounds for the SLC Transporter Superfamily

Abstract

Solute carrier (SLC) transporters constitute the largest family of membrane transport proteins in humans. They facilitate the movement of ions, neurotransmitters, nutrients, and drugs. Given their critical role in regulating cellular physiology, they are important therapeutic targets for neurological and psychological disorders, metabolic diseases, and cancer. Inhibition of SLC transporters can modulate substrate gradients, restrict the cellular uptake of nutrients and drugs, and thereby produce specific pharmacological effects. Despite their pharmaceutical relevance, many SLC transporters remain understudied. A complete bioactivity matrix of associated compounds can expand the knowledge base of SLC ligands, enlarge the information pool that guides downstream processes, and support informed decision-making in the discovery of new drug candidates for SLC transporters. To address the sparsity of available bioactivity values for compounds that inhibit SLC transporters, we employed a multi-task learning approach with a data imputation objective. By leveraging relationships between related tasks, deep learning has previously shown promise in imputing compound bioactivities across multiple assays. We developed a multi-task deep neural network (MT-DNN) to predict and impute missing pChEMBL (-log10(IC50)) values across the SLC transporter superfamily. At a data matrix density of 2.53%, our model achieved an R² of 0.74, demonstrating robust predictive performance. Specifically, we predicted missing values for 9,122 unique compounds across 54 SLC targets spanning various folds and subfamilies, generating 480,133 predictions from 12,455 known interactions. The advantages of the multi-task learning (MTL) approach were evident in the ability of certain targets to leverage the shared representation of knowledge and achieve higher predictive accuracy than their single-task learning (STL) counterparts.
Despite the limitations set by low data density, activity cliffs, and inter-protein heterogeneity, the MT-DNN showed promising potential as a tool to address data sparsity within the SLC superfamily.
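
The figures quoted in the abstract can be cross-checked with a few lines of arithmetic, and the imputation setting can be illustrated with a loss that ignores unmeasured cells. The following is a minimal sketch, not the authors' implementation: the function names (`pchembl_from_ic50_nm`, `masked_mse`) are hypothetical, the use of `None` to mark a missing measurement is an assumption, and a masked mean squared error is only one common way to train a multi-task regressor on a sparse compound-by-target matrix. The abstract does not state which loss or architecture was actually used.

```python
import math


def pchembl_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to -log10(IC50 in molar)."""
    return -math.log10(ic50_nm * 1e-9)


def masked_mse(y_true, y_pred):
    """Mean squared error over observed cells only.

    y_true and y_pred are compound-by-target matrices (lists of rows).
    A cell with no measurement is marked None in y_true (an assumption
    of this sketch) and contributes nothing to the loss.
    """
    squared_errors = [
        (t - p) ** 2
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
        if t is not None
    ]
    if not squared_errors:
        raise ValueError("no observed values to score")
    return sum(squared_errors) / len(squared_errors)


# Matrix dimensions and known interactions as reported in the abstract.
n_compounds, n_targets, n_known = 9122, 54, 12455
n_cells = n_compounds * n_targets      # total cells in the matrix
n_missing = n_cells - n_known          # cells left to impute
density_percent = round(100 * n_known / n_cells, 2)

# Toy 2 x 2 example: only two of the four cells are measured.
toy_true = [[6.0, None], [None, 7.5]]
toy_pred = [[5.5, 4.0], [6.0, 7.0]]
toy_loss = masked_mse(toy_true, toy_pred)

if __name__ == "__main__":
    print(n_cells, n_missing, density_percent)
    print(toy_loss)
    print(pchembl_from_ic50_nm(100.0))
```

Under these assumptions, the 9,122 by 54 matrix has 492,588 cells, of which 12,455 are known, which reproduces both the 2.53% density and the 480,133 imputed values reported above.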

Article information

Article type: Paper
Submitted: 03 Dec 2025
Accepted: 03 Jan 2026
First published: 08 Jan 2026
This article is Open Access
Creative Commons BY license

Digital Discovery, 2025, Accepted Manuscript

T. Cerimagic, S. Sosnin and G. F. Ecker, Digital Discovery, 2025, Accepted Manuscript, DOI: 10.1039/D5DD00536A

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
