Enhancing Predictive Modeling with Molecular Fingerprint Fusion Strategies

Viktoriia Turkina; Melanie R.W. Messih; Etienne Kant; Jelle T. Gringhuis; Annemieke Petrignani; Gary Corthals; Jake W. O'Brien; Saer Samanipour

doi:10.1039/D5DD00302D

Abstract

A large number of chemicals remain poorly characterized in terms of their physicochemical properties, biological activity, and environmental fate. Quantitative structure-activity relationship (QSAR) models have become indispensable tools for predicting these properties, especially for compounds that lack comprehensive experimental data. The choice of structural representation used as model input plays a critical role in ensuring high predictive performance and in identifying the molecular features that contribute most strongly to activity prediction. Both hashed and non-hashed molecular fingerprints are widely employed as inputs in QSAR modeling across various domains. While some studies have explored combining multiple fingerprints to improve molecular representation, comprehensive investigations into different fingerprint fusion strategies and into the generalizability of a fused fingerprint across diverse prediction tasks remain limited. In this study, we applied low-, mid-, and high-level fusion strategies to combine six non-hashed fingerprints and evaluated model performance across six publicly available datasets, comprising three regression and three classification tasks. Our results demonstrate that mid-level fusion, where fingerprint bits are selectively combined based on their importance within individual models, consistently improves predictive accuracy, as assessed by RMSE and R² for regression, and F1-score and ROC-AUC for classification. The algorithm developed for molecular fingerprint fusion is universal and can be applied to a wide range of predictive modeling problems and to other non-hashed molecular fingerprints.
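To make the three fusion levels concrete, the following is a minimal Python sketch on toy data. It is not the authors' algorithm. The function names, the toy fingerprints `fp_a` and `fp_b`, and the `top_k` parameter are hypothetical. The abstract defines only mid-level fusion; the low-level (bit concatenation) and high-level (averaging model outputs) variants follow the common data-fusion usage of those terms and are assumptions here. Bit importance is approximated by the absolute Pearson correlation with the target, a simple stand-in for the model-derived importance described in the abstract.

```python
from statistics import mean


def abs_correlation(column, target):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(column), mean(target)
    cov = sum((x - mx) * (y - my) for x, y in zip(column, target))
    var_x = sum((x - mx) ** 2 for x in column)
    var_y = sum((y - my) ** 2 for y in target)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5


def low_level_fusion(fingerprints):
    """Assumed low-level fusion: concatenate every bit of every fingerprint."""
    names = sorted(fingerprints)
    n_molecules = len(fingerprints[names[0]])
    return [
        [bit for name in names for bit in fingerprints[name][i]]
        for i in range(n_molecules)
    ]


def mid_level_fusion(fingerprints, target, top_k):
    """Keep the top_k highest-scoring bits per fingerprint, then concatenate.

    The score is a stand-in (absolute correlation), not model importance.
    """
    names = sorted(fingerprints)
    selected = {}
    for name in names:
        rows = fingerprints[name]
        n_bits = len(rows[0])
        scores = [
            abs_correlation([row[j] for row in rows], target)
            for j in range(n_bits)
        ]
        ranked = sorted(range(n_bits), key=lambda j: scores[j], reverse=True)
        selected[name] = sorted(ranked[:top_k])
    fused = [
        [fingerprints[name][i][j] for name in names for j in selected[name]]
        for i in range(len(target))
    ]
    return fused, selected


def high_level_fusion(predictions_per_model):
    """Assumed high-level fusion: average the predictions of separate models."""
    return [mean(values) for values in zip(*predictions_per_model)]


# Toy data: four molecules, two hypothetical non-hashed fingerprints.
fingerprints = {
    "fp_a": [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 1, 0]],
    "fp_b": [[0, 1], [0, 0], [1, 1], [1, 0]],
}
target = [2.0, 0.5, 2.2, 0.4]

fused, selected = mid_level_fusion(fingerprints, target, top_k=1)
print(selected)  # {'fp_a': [0], 'fp_b': [1]}
print(fused)     # [[1, 1], [0, 0], [1, 1], [0, 0]]
```

In practice, the score inside `mid_level_fusion` would be replaced by an importance measure taken from a model trained on each fingerprint separately, and the fused matrix would then be used to train the final model.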

Article information

DOI: https://doi.org/10.1039/D5DD00302D
Article type: Paper
Submitted: 10 Jul 2025
Accepted: 09 Apr 2026
First published: 16 Apr 2026
This article is Open Access

Digital Discovery, 2025, Accepted Manuscript

V. Turkina, M. R.W. Messih, E. Kant, J. T. Gringhuis, A. Petrignani, G. Corthals, J. W. O'Brien and S. Samanipour, Digital Discovery, 2025, Accepted Manuscript, DOI: 10.1039/D5DD00302D

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
