Yanyu Zhou,†a Lanyang Gao,†ab Chu Jiang,a Yuxi Li,a Qi Liu,a Chengguo Xu,a Yingying Liu,c Huajie Liu*a and Yinan Zhang*a
a School of Chemical Science and Engineering, Tongji University, Shanghai 200092, China. E-mail: yinan_zhang@tongji.edu.cn
b Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, Academician (Expert) Workstation of Sichuan Province, Department of General Surgery (Hepatopancreatobiliary Surgery), Fundamental and Clinical Research on Mental Disorders Key Laboratory of Luzhou, The Affiliated Hospital Southwest Medical University, Luzhou 646000, China
c School of Chemistry and Chemical Engineering, Center for Transformative Molecules, Zhangjiang Institute for Advanced Study and National Center for Translational Medicine (Shanghai), Shanghai Jiao Tong University, Shanghai 200240, China
First published on 6th October 2025
Structure-based DNA memory represents a paradigm shift from nucleotide-encoded storage, circumventing the costly repetitive synthesis and sequencing dependencies while harnessing the programmable architecture of DNA nanostructures. However, the absence of random-access capability has constrained practical implementation. Here, we implement a Boolean search-enabled random-access scheme for structural DNA memory, wherein data is encoded on DNA origami tiles with orthogonally dimensioned index strands (1D/2D/3D addressing). Boolean operations are executed by hybridizing biotinylated probes to target index combinations, enabling the magnetic extraction of specific files. Atomic force microscopy validation confirms the precise retrieval of data across single-, dual-, and triple-indexed libraries. This approach establishes a robust framework for enabling random access in complex structural DNA databases.
New concepts
DNA offers a promising solution for information storage due to its high density, fidelity, and durability. While traditional sequence-encoded strategies have been thoroughly researched, they continue to face limitations with sequence synthesis and sequencing. Structure-based DNA memory has emerged to address these issues, but research on random access methods remains limited. In this study, we have developed a DNA origami data memory system that utilizes orthogonal dimension index strands, facilitating random access to specific indexed DNA origami files through Boolean logic operations. Our findings provide an innovative approach to random access in structural DNA memory and establish a foundational framework for logic-gated archive queries within this context.
Structural DNA memory exploits programmable Watson–Crick base pairing to construct spatially organized matrices for data encoding.18–21 Through scaffold-stapling self-assembly, DNA origami enables the deterministic arrangement of data-encoding moieties, with an approximate positional variance of 5 nm.22–25 This yields fully addressable molecular canvases on which elements can be positioned precisely in user-defined patterns to encode data.26–28
Random access refers to the selective retrieval of specific files or subsets from a larger database. By enhancing read efficiency and accuracy, this approach significantly reduces the time and costs of data retrieval and analysis, making it essential for large-scale data storage and dynamic operations.29,30 For sequence-encoded DNA storage, a variety of random-access strategies have been developed, including PCR amplification,31 molecular interactions,32 physical separation,33,34 and fluorescence-activated sorting (FAS).35 However, because structural DNA memory has emerged only recently as an alternative to sequence-encoded approaches, dedicated random-access methodologies have yet to be systematically established for DNA-nanostructure archives.36,37 With the rapid progress of the field, developing tailored random-access strategies has become an urgent need.
Here, we present a random-access architecture for DNA structural data archives, facilitating the on-demand retrieval of arbitrary file subsets. Validated through a 16-file origami database encoding culturally significant Chinese knot motifs, our approach achieves a retrieval specificity exceeding 95%. We further enhanced the architecture by embedding orthogonal index sequences for each file, resulting in multi-tiered data addressing. By manipulating the index length, we demonstrate Boolean OR and AND operations, enabling flexible, logic-gated queries for arbitrary files. This work establishes a molecular foundation for logic-gated archival queries in structural DNA memory.
Fig. 1 illustrates the workflow for the coordinate-indexed random-access architecture. A traditional Chinese knot motif was partitioned into 16 discrete files, each converted into a DNA pattern inscribed onto a DNA origami structure using DNA dumbbells (Fig. S1). To enable precise retrieval, we assigned a unique orthogonal 20-nucleotide (nt) index to each DNA origami, corresponding to the coordinates of its respective file.38 This yielded a DNA origami database containing 16 index-tagged origami structures. For selective retrieval, we designed a 28-nt biotinylated probe consisting of a 20-nt recognition segment complementary to the target index and an 8-nt toehold domain. The probe hybridization facilitated the capture of the target DNA origami via streptavidin magnetic beads, after which toehold-mediated strand displacement triggered the release of the bound origami.39–41 Finally, the retrieved DNA patterns were read out using atomic force microscopy (AFM) imaging.
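The length bookkeeping of this probe design can be sketched in a few lines of Python. This is a minimal illustration, not the design procedure used in this work: the index and toehold sequences are placeholders, the placement of the toehold at the 5′ end of the probe is an assumption, and the release strand is assumed to be fully complementary to the 28-nt probe, as is common for toehold-mediated strand displacement.

```python
# Placeholder sequences for illustration only; not the published indexes.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def design_probe(index: str, toehold: str) -> str:
    """Build a probe: 8-nt toehold followed by the 20-nt segment complementary to the index.

    Assumption: the toehold sits at the 5' end of the probe. The text only
    states the domain lengths, not their order.
    """
    if len(index) != 20 or len(toehold) != 8:
        raise ValueError("expected a 20-nt index and an 8-nt toehold")
    return toehold + reverse_complement(index)

index = "ACGTTGCAAGGCTTACGATC"   # hypothetical 20-nt index
toehold = "GATTACAG"             # hypothetical 8-nt toehold
probe = design_probe(index, toehold)

# Assumed release strand: pairs with all 28 nt of the probe, so it can
# nucleate on the toehold and displace the 20-nt index.
release = reverse_complement(probe)

print(len(probe), len(release))
```

Because the release strand pairs with 28 nt while the index pairs with only 20 nt, displacement of the index is favored once the release strand binds the toehold.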
We initially validated our approach using a two-file sub-database that contained DNA origami files x = 1 and x = 2 (Fig. S2a). AFM imaging confirmed the coexistence of both DNA origami at approximately equimolar ratios before selection (Fig. S2b and c). Upon introducing the x = 2 probe, the retrieved DNA origami predominantly exhibited the x = 2 pattern, achieving a specificity of 99.1% (Fig. S2b and d). To assess the scalability, we expanded the database to include four files. AFM verification revealed that the single-file query maintained a specificity of 98.7% (Fig. S3). These findings demonstrate that our approach selects target files precisely and can be adapted to more complex DNA origami databases.
Subsequently, we conducted a comprehensive random-access evaluation of the complete single-index Chinese knot database comprising all 16 DNA origami files. The inherent symmetry of the Chinese knot can lead to identical DNA patterns emerging from different coordinate assignments, compromising query validation. To address this, we implemented unique coordinate-specific signatures for each DNA origami by employing four asymmetric pixel markers (four DNA branches per pixel; solid/hollow circles denoting active/inactive states) (Fig. 2a and Fig. S4 and S5), enabling AFM discrimination of all 16 origami (Fig. S6).42 We targeted the file x = 7 using the corresponding probe, with AFM imaging confirming the correct DNA origami and pixel markers. Statistical analysis showed that 96.4% of the retrieved structures matched the intended pattern and marker array, indicating high-fidelity file recovery (Fig. 2b).
Furthermore, we executed multi-file queries, successfully retrieving subsets of two files (x = 2 and x = 15) and four files (x = 1, x = 5, x = 10, and x = 12). AFM imaging corroborated the presence of the anticipated DNA patterns, with pixel marker arrays validating correct coordinate assignments. Target DNA origami constituted 95.7% of the structures recovered in the two-file query (Fig. 2c) and 95.4% in the four-file query (Fig. 2d), indicating that specificity is retained during simultaneous multi-file retrieval. Probes were supplied in excess to ensure effective retrieval, so that the relative abundance of the retrieved files reflected the initial equimolar stoichiometry of the 16 origami. Additional file-subset queries consistently achieved accurate recovery of the desired DNA pattern with comparable specificity (Fig. S7–S9).
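The specificity values quoted above are the fraction of retrieved structures that match a target file. A minimal Python sketch of that calculation is given below; the tallies are hypothetical and chosen only to illustrate the arithmetic, and the function name and file labels are illustrative.

```python
from collections import Counter

def retrieval_specificity(observed, targets):
    """Fraction of retrieved structures whose file label is in the target set."""
    counts = Counter(observed)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no structures counted")
    on_target = sum(n for label, n in counts.items() if label in targets)
    return on_target / total

# Hypothetical AFM tallies for a two-file query (x = 2 and x = 15); not measured data.
observed = ["x=2"] * 48 + ["x=15"] * 47 + ["x=7"] * 3 + ["x=9"] * 2
spec = retrieval_specificity(observed, {"x=2", "x=15"})
print(f"{spec:.1%}")
```

Keeping the per-file counts, rather than only the pooled fraction, also makes it possible to check whether the targets in a multi-file query are recovered in similar amounts.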
The densely addressable surface of DNA origami enables incorporation of multiple orthogonal indexes per file.43–45 This combinatorial strategy increases the number of uniquely addressable files while reducing the dependency on entirely orthogonal sequences, broadening the database capacity and complexity. We constructed a dual-index Chinese knot database in which each DNA origami carries two orthogonal index strands encoding its x and y coordinates for precise 2D addressing (Fig. 3a). Boolean logic queries were implemented by tailoring index length: longer indexes enabled OR operations (capture if either index is matched by its probe), whereas shorter indexes enforced AND operations (capture only when both indexes are matched).
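The logical behavior of such queries, independent of the hybridization chemistry, can be sketched as a filter over coordinate-tagged files. The 4 × 4 layout of the 16 files and the `query` helper are assumptions made for illustration; the sketch models only which files an OR or AND query should select, not how index length produces that behavior.

```python
from itertools import product

# Assumption: the 16 files occupy a 4 x 4 grid with 1-based (x, y) coordinates.
AXIS = {"x": 0, "y": 1}
database = set(product(range(1, 5), repeat=2))

def query(files, op, conditions):
    """Return the files selected by (axis, value) conditions joined with OR or AND."""
    if op not in ("OR", "AND"):
        raise ValueError("op must be 'OR' or 'AND'")
    combine = any if op == "OR" else all
    return {f for f in files if combine(f[AXIS[a]] == v for a, v in conditions)}

conditions = [("x", 2), ("y", 1)]
or_hits = query(database, "OR", conditions)    # every file with x = 2 or y = 1
and_hits = query(database, "AND", conditions)  # only the file at (2, 1)

print(len(or_hits), sorted(and_hits))
```

Under the assumed grid, the OR query selects one column and one row that share a single file, whereas the AND query narrows the selection to that shared file.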
To validate this, we assembled a sub-database containing only the (1, 1) and (2, 1) DNA origami files. Each index strand was segmented into non-interacting OR/AND logic domains, with probes exclusively hybridizing to target coordinate-specific segments (Fig. 3b). We systematically evaluated 15-nt and 20-nt OR indexes, confirming both achieved high-specificity target capture with comparable extraction yields of 96.7% vs. 98.1% (Fig. S10). Given oligonucleotide length constraints, the 15-nt configuration was selected for subsequent validation. Employing this design, we observed that probe OR x = 1 selectively enriched the file (1, 1), while the probe OR x = 2 retrieved only the file (2, 1) (Fig. S11a and b). Simultaneous addition of both probes recovered both targets at roughly equimolar ratios, demonstrating effective parallel selection (Fig. 3c, e and Fig. S11c).
Next, we investigated AND logic by systematically testing 7-, 9-, and 11-nt index segments under three conditions: AND x = 2 only, AND y = 1 only, or both probes. AFM analysis indicated that 7-nt indexes exhibited insufficient binding, resulting in no file recovery even with both probes, while 11-nt indexes showed excessive binding, with a single probe sufficient for detection. Critically, the 9-nt index met the AND requirement, as file (2, 1) was detected only with both probes, achieving a specificity of 97.2% (Fig. 3d, e and Fig. S12).
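The outcome of this length screen can be summarized as a small truth table. The sketch below encodes the qualitative results described above and labels each segment length by its response to one versus two probes; the recovery of the 11-nt design with both probes is an assumption inferred from its single-probe behavior, and the labels are illustrative.

```python
# Outcomes as described in the text: length -> (recovered with one probe, recovered with both).
# Assumption: the 11-nt design is also recovered when both probes are present.
screen = {7: (False, False), 9: (False, True), 11: (True, True)}

def classify(single, both):
    """Label a segment length by how it responds to one versus two probes."""
    if single:
        return "OR-like"      # one probe already suffices
    if both:
        return "AND"          # recovery requires both probes
    return "no capture"       # binding too weak even with both probes

behaviour = {length: classify(*outcome) for length, outcome in screen.items()}
and_lengths = [length for length, label in behaviour.items() if label == "AND"]

print(behaviour, and_lengths)
```

Only a length that fails with a single probe and succeeds with both satisfies the AND requirement, which is why the screen brackets the usable length from both sides.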
Applying this optimized 9-nt design to the full dual-index Chinese knot database enabled unambiguous target retrieval, although the larger number of non-target structures reduced the average specificity to 78.4% (Fig. 3f and Fig. S13). We omitted pixel markers to simplify AFM recognition.
To enhance retrieval capacity, we developed a three-index database of 32 DNA origami featuring the Chinese knot (z = 1) and the Fangsheng pattern (z = 2) (Fig. 4a), the latter containing single-pixel identifiers (Fig. S14).
Implementing a three-condition AND gate required re-optimized index lengths. We compared 6-, 7-, and 8-nt binding segments in a two-file sub-database; only the 8-nt indexes yielded detectable signals, retrieving file (1, 2, 2) with 82.9% specificity (Fig. 4b, c and Fig. S15). Using these 8-nt indexes, we successfully isolated all Fangsheng files from the full database, demonstrating precise multi-index selection at scale (Fig. 4d).
Additionally, increasing the number of mutually orthogonal indexes within a DNA origami structure can facilitate the execution of more intricate Boolean logic operations and enable multi-condition retrieval. However, extraction yields of extremely short indexes may be compromised even at higher multiplicities, necessitating systematic evaluation of the length-quantity balance. Simultaneously, more complex random-access operations can be supported without altering index sequences through adjusted probe modifications and optimized magnetic bead selection, where parameters such as biotin affinity tags should be systematically adjusted to maximize hybridization efficiency.
Footnote
† These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2025