Leveraging active site information for deep learning prediction of enzyme-substrate Michaelis constants

Abstract

The Michaelis constant (KM) is a key parameter in enzymology. Its experimental measurement is often low-throughput and costly, but machine learning (ML) can identify patterns to make predictions in a high-throughput way. In this work, we introduce a novel approach that explicitly incorporates enzyme-substrate interface information by encoding the enzyme’s active site as a feature. Using a simple multilayer perceptron (MLP) with a gated layer, we demonstrate that this explicit active site information enables our model, Active Site for KM (AS4Km), to achieve competitive performance on the independent GMKM test set, despite its relatively simple architecture. Ablation studies confirm that active site features significantly enhance generalization to unseen data and distant enzyme sequences. Furthermore, our analysis highlights a critical limitation in current enzymology databases: predictive performance is heavily reliant on substrate identity due to low substrate diversity and a bias towards active enzyme-substrate complexes. Our results show that AS4Km, a data-driven approach combined with explicit interaction interface features, displays competitive performance in the prediction of KM values for enzyme-substrate complexes, and may be able to assist in the identification of novel substrates for known enzymes.
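
As an illustration of the kind of architecture the abstract describes, the following is a minimal NumPy sketch of an MLP with one gated hidden layer that takes enzyme, substrate, and active site feature vectors as input. It is not the AS4Km implementation. The function names (`init_params`, `predict_log_km`), the feature dimensions, the sigmoid form of the gate, the concatenation of the three inputs, and the choice of log10 KM as the regression target are all assumptions made for this example.

```python
import numpy as np


def init_params(rng, d_enzyme, d_substrate, d_site, d_hidden):
    """Randomly initialise weights for a one-hidden-layer gated MLP (hypothetical layout)."""
    d_in = d_enzyme + d_substrate + d_site
    scale_in = 1.0 / np.sqrt(d_in)
    return {
        "W_h": rng.normal(0.0, scale_in, (d_in, d_hidden)),  # candidate hidden features
        "b_h": np.zeros(d_hidden),
        "W_g": rng.normal(0.0, scale_in, (d_in, d_hidden)),  # gate weights
        "b_g": np.zeros(d_hidden),
        "w_out": rng.normal(0.0, 1.0 / np.sqrt(d_hidden), d_hidden),
        "b_out": 0.0,
    }


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def predict_log_km(params, enzyme, substrate, site):
    """Forward pass: concatenate the inputs, gate the hidden layer, regress one value.

    Assumption: the output is interpreted as log10(KM); the abstract does not
    state the target transformation.
    """
    x = np.concatenate([enzyme, substrate, site], axis=-1)
    hidden = np.maximum(0.0, x @ params["W_h"] + params["b_h"])  # ReLU features
    gate = sigmoid(x @ params["W_g"] + params["b_g"])            # values in (0, 1)
    return (gate * hidden) @ params["w_out"] + params["b_out"]


# Usage with random placeholder features (not real enzyme or substrate data).
rng = np.random.default_rng(0)
params = init_params(rng, d_enzyme=8, d_substrate=6, d_site=5, d_hidden=16)
enzyme = rng.normal(size=(4, 8))     # e.g. a sequence embedding per enzyme
substrate = rng.normal(size=(4, 6))  # e.g. a molecular fingerprint per substrate
site = rng.normal(size=(4, 5))       # e.g. an encoding of active site residues
pred = predict_log_km(params, enzyme, substrate, site)
```

In this sketch the gate lets the network scale each hidden unit per input, which is one common way to let a model weight some feature interactions more than others. An ablation of the kind mentioned in the abstract could be mimicked here by zeroing the `site` array and comparing predictions.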

Article information

Article type: Paper
Submitted: 24 Feb 2026
Accepted: 30 May 2026
First published: 01 Jun 2026
This article is Open Access
Creative Commons BY license

Digital Discovery, 2026, Accepted Manuscript

D. Lepikhov, L. Sandner and A. Nunes-Alves, Digital Discovery, 2026, Accepted Manuscript, DOI: 10.1039/D6DD00094K

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
