NaviDiv: A Web App for Monitoring Chemical Diversity in Generative Molecular Design
Abstract
The rapid progress in generative models for molecular design has led to extensive libraries of candidate molecules for biological and chemical applications. However, ensuring that these molecules are diverse and representative of broader chemical space remains challenging: researchers often over-explore limited regions or miss promising candidates because of inadequate monitoring tools. This work presents NaviDiv (Navigating Diversity in Chemical Space), a comprehensive web-based framework for managing chemical diversity in string-based generative molecular design through three integrated capabilities: multi-metric diversity analysis capturing structural, syntactic, and molecular framework variations; interactive real-time visualization enabling immediate detection of model collapse; and adaptive constraint generation that dynamically guides optimization while preserving diversity. Through a singlet fission material discovery case study using REINVENT4, we demonstrate that different diversity metrics (i.e., structural similarity, fragment composition, and sequence patterns) respond differently during optimization, with constraint effectiveness depending critically on representational alignment with the generative model. N-gram-based constraints outperform fingerprint-based approaches owing to their direct correspondence with SMILES generation, and combined constraints maintain diversity across all metrics while achieving optimization performance within 15% of unconstrained baselines. The framework is freely available at https://github.com/LCMD-epfl/NaviDiv, providing accessible tools for data-driven decisions about diversity-property trade-offs in automated molecular discovery.
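To make the idea of an n-gram-based diversity constraint more concrete, the following is a minimal sketch, not NaviDiv's actual implementation. It assumes character-level n-grams over SMILES strings (a simplification: a real SMILES tokenizer would treat multi-character atoms such as Cl or Br and bracket atoms as single tokens). The function names, the frequency threshold, and the penalty formula are all hypothetical choices made for illustration.

```python
from collections import Counter


def char_ngrams(smiles, n):
    """Return all overlapping character n-grams of a SMILES string."""
    return [smiles[i:i + n] for i in range(len(smiles) - n + 1)]


def ngram_frequencies(batch, n):
    """Fraction of molecules in the batch that contain each n-gram."""
    counts = Counter()
    for smiles in batch:
        # Count each n-gram at most once per molecule.
        counts.update(set(char_ngrams(smiles, n)))
    return {gram: c / len(batch) for gram, c in counts.items()}


def overrepresented(batch, n, threshold):
    """N-grams present in more than `threshold` (a fraction) of the batch."""
    freqs = ngram_frequencies(batch, n)
    return {gram for gram, f in freqs.items() if f > threshold}


def penalty(smiles, flagged, n):
    """Hypothetical score multiplier in [0, 1].

    Returns 1.0 when the molecule contains no flagged n-gram and decreases
    with the share of its distinct n-grams that are flagged.
    """
    grams = set(char_ngrams(smiles, n))
    if not grams:
        return 1.0
    return 1.0 - len(grams & flagged) / len(grams)


# Toy batch in which three of four molecules share a benzene ring.
batch = ["c1ccccc1O", "c1ccccc1N", "c1ccccc1C", "CCO"]
flagged = overrepresented(batch, n=4, threshold=0.5)
scores = {s: penalty(s, flagged, n=4) for s in batch}
```

In this toy batch the ring n-grams are flagged because they occur in three of the four molecules, so the ring-containing SMILES receive a reduced multiplier while `CCO` is left untouched. Because the penalty acts on the same string representation the generator emits, it illustrates the kind of representational alignment the abstract credits for the effectiveness of n-gram constraints.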