A Diverse and Chemically Relevant Solvation Model Benchmark Set with Flexible Molecules and Conformer Ensembles

Abstract

We introduce FlexiSol, a flexible solvation benchmark set with molecule ensembles. FlexiSol is the first benchmark of its kind to combine structurally and functionally complex, highly flexible solutes with exhaustive conformational sampling for the systematic testing of solvation models. The dataset contains 824 experimental solvation energy and partition ratio data points (1551 unique molecule-solvent pairs) at standard-state conditions, focusing on drug-like, medium-to-large flexible molecules (up to 141 atoms), with over 25000 theoretical conformer/tautomer geometries across all phases. The set is publicly available, and data points were selected to minimize overlap with existing sets. Using this benchmark, we evaluate a broad spectrum of popular implicit solvation approaches, including physics-based (quantum-chemical and semiempirical) and data-driven models. We find that partition ratios are generally computed more accurately than solvation energies, likely due to partial error cancellation, yet most models still systematically underestimate strongly stabilizing interactions while overestimating weaker ones in both solvation energies and partition ratios. Additionally, we investigate the impact of three key ingredients: conformational ensemble, geometry choice (phase-specific vs. single-phase), and underlying electronic energy method. Full Boltzmann-weighted ensembles and the lowest-energy conformers alone yield very similar accuracy, although both require conformational sampling, whereas random single-conformer selection degrades performance, especially for larger and flexible systems. Geometry relaxation and the level of electronic structure theory both influence the results; however, the magnitude and sometimes the direction of these effects vary by method, because fortuitous error cancellation sometimes masks underlying deficiencies in the models.
As a complement to existing data sets, FlexiSol will enable more systematic development and evaluation of solvation models.
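
The quantities compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow: the function names, the choice of kcal/mol, and the default temperature of 298.15 K are assumptions made here for illustration. The sketch uses the standard textbook relations, namely the ensemble free energy G = -RT ln Σ exp(-G_i/RT) over conformer free energies G_i, the solvation energy as the difference between solution-phase and gas-phase values (either Boltzmann-weighted or from the lowest-energy conformer of each phase), and the logarithmic partition ratio as the difference of two solvation energies divided by -RT ln 10.

```python
import math

# Gas constant in kcal/(mol K); the unit choice is an assumption of this sketch.
R_KCAL = 1.987204e-3


def ensemble_free_energy(energies, temperature=298.15):
    """Boltzmann-weighted free energy of a conformer ensemble (kcal/mol).

    Computes -RT ln(sum_i exp(-G_i / RT)), shifted by the lowest energy
    so that the exponentials stay numerically well behaved.
    """
    rt = R_KCAL * temperature
    e_min = min(energies)
    partition_sum = sum(math.exp(-(e - e_min) / rt) for e in energies)
    return e_min - rt * math.log(partition_sum)


def solvation_energy(gas, solv, temperature=298.15, mode="boltzmann"):
    """Solvation free energy from per-phase conformer free energies.

    mode="boltzmann": difference of the two ensemble free energies.
    mode="lowest":    difference of the lowest-energy conformer of each phase.
    """
    if mode == "boltzmann":
        return (ensemble_free_energy(solv, temperature)
                - ensemble_free_energy(gas, temperature))
    if mode == "lowest":
        return min(solv) - min(gas)
    raise ValueError(f"unknown mode: {mode!r}")


def log_partition_ratio(dg_solv_a, dg_solv_b, temperature=298.15):
    """log10 of the partition ratio for transfer from solvent b into solvent a.

    A more negative solvation energy in solvent a gives a positive value.
    """
    rt = R_KCAL * temperature
    return -(dg_solv_a - dg_solv_b) / (rt * math.log(10.0))
```

For an octanol/water log P under these conventions, one would call `log_partition_ratio(dg_octanol, dg_water)`. Because the two solvation energies enter only as a difference, errors common to both phases drop out, which is consistent with the partial error cancellation noted in the abstract.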

Article information

Article type: Edge Article
Submitted: 21 Aug 2025
Accepted: 10 Oct 2025
First published: 13 Oct 2025
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry
Creative Commons BY-NC license

Chem. Sci., 2025, Accepted Manuscript

L. Wittmann, C. E. Selzer and S. Grimme, Chem. Sci., 2025, Accepted Manuscript, DOI: 10.1039/D5SC06406F

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
