Machine learning-driven molecular engineering of nucleic acids
Abstract
Molecular engineering has played a pivotal role in biomedical fields, driving significant advancements in gene therapy, disease diagnosis, and biosensing. However, nucleic acid molecular engineering faces various challenges including vast design spaces, complex structure–function relationships, lengthy application validation cycles, and inefficient optimization processes. Machine learning (ML), with its superior pattern recognition, multidimensional data integration, and automated optimization capabilities, offers a unique opportunity to construct predictive models of sequence–structure–function relationships, thereby enabling a paradigm shift from empirically driven to data-driven approaches. This review systematically surveys recent progress in ML applications across three major domains: nucleic acid structure construction, performance modulation, and application expansion. It also explores core challenges such as data quality, model interpretability, and experimental validation efficiency, along with potential resolution strategies. These insights are poised to propel nucleic acid molecular engineering from static structure prediction toward dynamic behavior simulation, and from single-molecule design to complex system engineering, guiding future directions in hybrid ML–quantum models and expanded applications to non-canonical nucleic acids for transformative innovation in biomedicine, environmental monitoring, and information technology.
