Yubo Gao†ab, Zhicheng Zhu†a, Zhen Chen a, Meng Guo b, Yiqing Zhang a, Lina Wang*b and Zhiling Zhu*a
aCollege of Materials Science and Engineering, Qingdao University of Science and Technology, Qingdao, Shandong 266042, China. E-mail: zlzhu@qust.edu.cn
bCollege of Environment and Safety Engineering, Qingdao University of Science and Technology, Qingdao, Shandong 266042, China. E-mail: linawang@qust.edu.cn
First published on 6th March 2024
Nanozymes, a distinctive class of nanomaterials endowed with enzyme-like activity and kinetics akin to enzyme-catalysed reactions, present several advantages over natural enzymes, including cost-effectiveness, heightened stability, and adjustable activity. However, the conventional trial-and-error methodology for developing novel nanozymes encounters growing challenges as research progresses. The advent of artificial intelligence (AI), particularly machine learning (ML), has ushered in innovative design approaches for researchers in this domain. This review delves into the burgeoning role of ML in nanozyme research, elucidating the advancements achieved through ML applications. The review explores successful instances of ML in nanozyme design and implementation, providing a comprehensive overview of the evolving landscape. A roadmap for ML-assisted nanozyme research is outlined, offering a universal guideline for research in this field. In the end, the review concludes with an analysis of challenges encountered and anticipates future directions for ML in nanozyme research. The synthesis of knowledge in this review aims to foster a cross-disciplinary study, propelling the revolutionary field forward.
The diversity in the materials used for nanozymes, ranging from metal oxide nanoparticles16 and metal nanoparticles17 to intricate single-atom X–N–C mimetic architectures18 and biomolecular assemblies,19,20 underscores the rapid expansion in both the quantity and variety of enzyme-like species. Nanozymes, mimicking oxidoreductases,21 hydrolases,22 lyases,23 and isomerases,24 exhibit a remarkable diversity that transcends traditional enzyme functionalities. This diversity has paved the way for the realization of artificial enzymes that surpass their natural counterparts.25–27 However, as nanozyme research advances, the conventional trial-and-error approach to discovering novel high-performance nanozymes is becoming increasingly challenging. The intricate design of high-performance nanozymes necessitates consideration not only of the physicochemical properties of the materials themselves, but also of the intricate interplay between these materials and their environments. Nanozymes operating in complex environments face the presence of various chemicals that may compete for binding to the active sites, potentially diminishing the catalytic efficiency. To grapple with the escalating complexity in the design and application of nanozymes, researchers are turning their attention to innovative research methodologies, moving beyond traditional approaches to explore new paradigms.
Machine learning (ML), a powerful tool for statistical data analysis, uses algorithms to analyse and learn from data in order to discover patterns and regularities.28 Through these processes, ML algorithms can continuously optimize and improve as new data are introduced, enabling computers to learn from experience and make decisions or predictions based on the learned content.29 In the realm of materials science, ML has proven highly effective, playing a pivotal role in guiding materials synthesis,30 facilitating materials characterization,31 predicting materials properties,32 and elucidating intricate structure–activity relationships.33 Notably, ML has recently found its stride in the field of nanozyme research, opening up new avenues for tackling challenges related to nanozyme design, performance analysis, and the promotion of applications.34,35
This review is dedicated to an in-depth examination of the multifaceted roles that ML plays in the design and application of nanozymes, offering insights into recent findings in related domains (Fig. 1). Initially, the review delves into the ways in which ML contributes to the rational design of nanozymes. Subsequently, it encapsulates diverse application areas where ML is applied to nanozyme design and delineates a structured roadmap for leveraging ML in the design of high-performance nanozymes. The concluding sections of the review address current challenges and illuminate future trends in the application of ML to nanozymes. This comprehensive review not only consolidates the current research landscape in the ML–nanozyme intersection, but also introduces statistical analysis methods utilizing ML. The aim is to catalyse interdisciplinary collaborations, fostering further advancements in this revolutionary field.
Fig. 1 Research achievements of the combination of ML and nanozymes in recent years. Reprinted with permission from ref. 40. Copyright 2023 Elsevier. Reprinted with permission from ref. 42. Copyright 2022 Wiley-VCH GmbH. Reprinted with permission from ref. 46. Copyright 2023 Wiley-VCH GmbH. Reprinted with permission from ref. 44. Copyright 2022 American Chemical Society. Reprinted with permission from ref. 51. Copyright 2021 American Chemical Society.
Fig. 2 Prediction of kinetic parameters in the Michaelis–Menten equations. (A) Michaelis–Menten equations. (B) The workflow for classification and quantitative prediction of enzyme-like activity of nanomaterials using ML. (C) Schematic diagram of fully connected DNN-based models. (D) Heatmap images of prediction accuracies for 13 DNN-based quantitative models based on R2. Each box represents one model built from a certain dataset, estimated by R2. Reprinted with permission from ref. 42. Copyright 2022 Wiley-VCH GmbH. (E) Stacking ensemble algorithm scheme for triple-catalytic (POD-, OXD-, and CAT-like) activity prediction. Reprinted with permission from ref. 43. Copyright 2023 Research Square.
Razlivina et al. collected data from over 100 published papers to establish a nanozyme database (Dizyme).41 This database incorporates component properties (e.g., electronegativity, electron affinity, oxidation state, and ionic radius), material characterization (e.g., surface charge, stability, surface adsorption, and surface area), and reaction conditions (e.g., pH, temperature, substrate type, nanozyme concentration, and substrate concentration). Employing a random forest regression (RFR) model, they predicted nanozyme kinetic parameters, with coefficients of determination (R2) of 0.52 for lgKcat and 0.80 for lgKm. Wei et al. gathered activity data from over 300 papers and identified eight endogenous factors and three exogenous factors influencing nanozyme activity.42 Utilizing these factors as variables, they constructed fully connected deep neural network (DNN) models to predict nanozyme types (POD-, OXD-, CAT-, and SOD-like activities) and nanozyme activities (Km, Vmax, Kcat, Kcat/Km, and IC50) (Fig. 2B–D). For predicting the activity levels of POD- and OXD-like nanozymes, the R2 values reached 0.66 and 0.80, respectively. Furthermore, Vinogradov et al. expanded the data in Dizyme to 1210 samples, incorporating additional descriptors such as molecular weight, topological and electronic coating descriptors, synthesis details, and assay conditions.43 Employing a stacking ensemble learning approach, which combines multiple ML models, they ultimately selected a linear regression model as the meta-model to predict POD-, OXD-, and CAT-like activities of nanozymes (Fig. 2E).
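The R2 values quoted above are computed on log-transformed kinetic parameters. As a minimal sketch of how such a score is obtained, the snippet below evaluates the coefficient of determination on lg-transformed values; the numbers and the names `km_measured` and `km_predicted` are hypothetical and are not taken from any of the cited studies.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and model-predicted Km values (illustration only).
km_measured = [0.05, 0.4, 1.2, 3.0, 10.0]
km_predicted = [0.08, 0.3, 1.0, 4.5, 7.0]

# Kinetic parameters span orders of magnitude, so the score is taken in log10 space.
lg_measured = [math.log10(v) for v in km_measured]
lg_predicted = [math.log10(v) for v in km_predicted]
r2_log = r_squared(lg_measured, lg_predicted)
print(round(r2_log, 3))
```

Scoring in log space weights relative errors rather than absolute ones, which is one plausible reason the cited works report R2 for lgKcat and lgKm rather than for the raw values.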
It is noteworthy that while predicting the kinetic parameters in the Michaelis–Menten equation can enhance the accuracy of determining nanozyme activity, the experimental determination of these kinetic parameters remains a prerequisite. Predicting the activity of numerous yet undiscovered materials with enzyme-like characteristics is challenging given this experimental dependency. Consequently, the method of characterizing nanozyme activity by predicting the kinetic parameters of the Michaelis–Menten equation, which subsequently informs nanozyme design, is significantly constrained by the size of the available nanozyme databases.
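For readers less familiar with the kinetic parameters discussed here, the Michaelis–Menten rate law is v = Vmax[S]/(Km + [S]). The sketch below estimates Vmax and Km from rate measurements using the double-reciprocal (Lineweaver–Burk) linearization, 1/v = (Km/Vmax)(1/[S]) + 1/Vmax. It is only an illustration on synthetic, noise-free data; the function names are hypothetical, and for noisy experimental data a nonlinear fit is generally preferred because the reciprocal transform amplifies errors at low substrate concentration.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate v at substrate concentration s."""
    return vmax * s / (km + s)

def fit_lineweaver_burk(substrate, rates):
    """Least-squares line through (1/[S], 1/v); returns (vmax, km)."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # equals Km / Vmax
    intercept = mean_y - slope * mean_x  # equals 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km

# Synthetic data generated from assumed "true" parameters (illustration only).
true_vmax, true_km = 2.0, 0.5
substrate = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
rates = [michaelis_menten(s, true_vmax, true_km) for s in substrate]

fitted_vmax, fitted_km = fit_lineweaver_burk(substrate, rates)
print(fitted_vmax, fitted_km)
```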
Zhang et al. identified transition metal thiophosphates (MxPySz; x = 1–7, y = 1–4, z = 1–29), characterized by mixed valence states formed by different bonding states, as potential candidates for SOD-like nanozymes.44 Employing DFT calculations, they obtained the Gibbs free energies of the first three steps of the catalytic reaction for these materials. Utilizing seven parameters (e.g., electronegativity, dopant atom radius, dopant atom position, dopant atom concentration, and band gap) as input data, and ΔG of the first three steps of the SOD-like catalytic reaction as output data, they predicted the changes in the Gibbs free energy for the SOD-like catalytic reaction using an RF model. This approach led to the identification of a new highly active SOD-like nanozyme, MnPS3 (Fig. 3A). Yu et al. explored 14 different nonmetallic atoms doped into graphdiyne (GDY) as potential materials, calculating the Gibbs free energy of the catalytic reaction for different materials using DFT.45 With input data including electronegativity, dopant atom radius, dopant atom position, dopant atom concentration, and band gap, and output data representing the maximum energy-consuming step and maximum energy barrier of the POD-like catalysed reaction, they employed the extreme gradient boosting (XGB) algorithm to screen highly active POD-like nanozymes. This led to the identification of boron-doped GDY (B-GDY) and nitrogen-doped GDY (N-GDY) as highly active POD-like nanozymes (Fig. 3B). Gao et al. utilized the extreme gradient boosting regression (XGBR) algorithm with 11 basic atomic features, including electronegativity and electron affinity energy, as input data for predicting Eads,OH and Eads,H.46 They screened 60 nanozymes with high POD- and CAT-like activities from the computational 2D materials database (C2DB). The predictions were found to be consistent with DFT calculations (Fig. 3C). 
These studies highlight that using ML algorithms to predict and screen nanozymes can significantly reduce the time required compared with relying on DFT calculations alone.
Fig. 3 Screening high activity nanozymes by predicting energy change. (A) The workflow of ML-assisted screening and prediction of nanozymes. Reprinted with permission from ref. 44. Copyright 2022 American Chemical Society. (B) Prediction of the reaction maximum energy barrier and the energy consuming step using ML models. Reprinted with permission from ref. 45. Copyright 2022 American Chemical Society. (C) Computational screening of POD- and CAT-like nanozymes with the corresponding adsorption energy criteria and the more stringent criteria. Reprinted with permission from ref. 46. Copyright 2023 Wiley-VCH GmbH.
The prediction of energy changes during nanozyme catalysed reactions relies on first-principles calculations. By calculating the energy changes during nanozyme catalysed reactions for a large number of nanomaterials with similar or identical structures and integrating the results obtained from these calculations with ML models, it becomes feasible to predict the enzyme-like activity of a specific material. Experimental determination of the kinetic parameters of the Michaelis–Menten equation is susceptible to errors due to various factors, such as the lack of standardized procedures. In contrast, computational predictions of energy changes during nanozyme catalysis typically utilize uniform theoretical models and computational programs, reducing the variability associated with experimental procedures and enhancing the reproducibility and comparability of the calculated data. However, the accuracy of computational models strongly relies on the realism of their assumptions and parameter settings, which may not fully capture all the complex interactions and environmental conditions under different experimental settings. Experimental data thus provide measurements of the actual reaction process, while computational data offer theoretical predictions and support for analysis, and more and more scientists combine both approaches to gain a more comprehensive understanding of the kinetic behaviour of nanozymes. Overall, predicting energy changes during nanozyme catalysed reactions holds greater promise for accurately designing new highly active nanozymes.
Wei et al. employed SHapley Additive exPlanations (SHAP) to assess the importance of features in the collected data (Fig. 4A) during the development of their ML model for predicting the type and activity of nanozymes.42 By examining the magnitude of SHAP values, the influence of each feature on the type of nanozyme activity was gauged. Notably, metal type, metal proportion, and metal valence emerged as the top three internal factors influencing the type of nanozyme activity (Fig. 4B). These findings underscore the pivotal role of adjusting the metal composition in nanozymes for their rational design. Furthermore, through SHAP analysis, the study revealed that altering the type of metallic elements holds greater significance than changing the type of nonmetallic elements in modulating the type of nanozyme activity.
Fig. 4 Understanding the structure–activity relationship of nanozymes through model interpretability. (A) SHAP analysis of the sensitivity of different factors in the classification model. (B) SHAP sensitivity analysis of the independent variables in the POD- and OXD-like quantitative models. Reprinted with permission from ref. 42. Copyright 2022 Wiley-VCH GmbH. (C) Feature engineering to select the most important features of adsorption energies. (D) SHAP sensitivity analysis of the independent features in the ML models. (E and F) Selected features for hydroxyl (Eads,OH) and hydrogen (Eads,H) adsorption energies with corresponding importance scores. Reprinted with permission from ref. 46. Copyright 2023 Wiley-VCH GmbH.
In the development of ML models for predicting 2D materials’ Eads,OH and Eads,H, Gao et al. gathered 66 features for 1019 materials, resulting in two 1019 × 66 feature matrices (Fig. 4C).46 They utilized the XGBoost regression (XGBR) algorithm for feature selection, optimizing the number of features. Subsequently, SHAP was employed for feature importance analysis, revealing the underlying physical laws within the ML model (Fig. 4D). Notably, average electronegativity (χM,avg) proved to be the most significant for Eads,OH, while the average electron affinity of non-metallic elements (Ea,NM,avg) had the greatest impact on Eads,H (Fig. 4E and F). The study also unveiled the substantial influence of the electronegativity or first ionization energy of metallic elements on electron transfer capacity, influencing the OH adsorption energy. The non-concentrated distribution of SHAP values indicates a lack of a simple linear relationship between characteristics and adsorption energy. This comprehensive analysis not only enhances the understanding of the relationship between composition and nanozyme activity, but also holds crucial guiding significance for the rational design of nanozymes.
Examining the interplay between atomic and electronic structures and the catalytic properties of nanozymes at the microscopic level is a crucial step in designing high-performance nanozymes. However, elucidating the structure–activity relationships of nanozymes through empirical means poses significant challenges. Interpretable ML modeling not only aids in comprehending the structure–activity relationships of nanozymes but is also anticipated to provide more effective guidance for the design of nanozymes.
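To make the idea behind SHAP-style attribution concrete, the sketch below computes exact Shapley values for a toy model by enumerating all feature coalitions, with absent features replaced by a single baseline value. This is a simplified illustration and not the procedure of the cited studies: practical SHAP implementations use approximations and a background dataset, and `toy_model` with its three features is entirely hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley value of each feature of x relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in the coalition take their real value, the rest the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis

def toy_model(f):
    """Hypothetical activity model with an interaction between features 0 and 2."""
    return 2.0 * f[0] + 0.5 * f[1] + f[0] * f[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phis = shapley_values(toy_model, x, baseline)
print(phis)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is shared between the two features involved. A nonlinear term of this kind is what produces the spread of SHAP values noted above for the adsorption-energy models.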
Compared to nanozymes with a single class of enzyme-like activity, the catalytic mechanism of multienzyme-like nanozymes is more intricate, involving different reaction paths and cascade, promotion, and antagonism interactions between the various enzyme-like activities. This complexity poses challenges for theoretically guided design. However, by combining ML with molecular simulation to explore optimal nanozyme catalytic reaction pathways, the rational design of multienzyme-like nanozymes becomes feasible. Recently, Jiang et al. established a comprehensive nanozyme dataset by collating information from 4159 papers, encompassing element types, element ratios, chemical compounds, shapes, and pH values.48 Based on this dataset, they reorganized the material features of different nanozymes using clustering correlation coefficients of nanozyme features to derive the constituent factors of multienzyme-like nanozymes. Subsequently, they developed a methodology that integrates quantum mechanics/molecular mechanics (QM/MM) and ML to analyse surface adsorption and desorption energies, as well as binding energies of substrates, transition states, and products in the multienzyme-like reaction pathway. This approach enabled the determination of optimal reaction pathways, leading to the design of a genetically evolved class of multienzyme-like nanozymes (Fig. 5). The outcome was a multienzyme-like nanozyme, CuMnCo7O12, exhibiting high POD-, CAT-, OXD-, and SOD-like activities. Notably, the authors combined MM-based molecular dynamics simulations with QM calculations, an approach that incorporates environmental factors and allows a more comprehensive simulation of their effects on the catalytic reactions of nanozymes.
Fig. 5 Computational and experimental results of data-driven evolutionary design research on multienzyme-like nanozymes. (A) Illustration of the catalytic reaction paths catalyzed by multienzyme-like nanozymes. (B) Reaction path and adsorption and desorption changes of each generation of nanozyme compared to the previous generations. (C) Comparative assessment of the multienzyme-like activities of the second- and third-generation nanozymes in contrast to the first-generation nanozyme. (D) Comparison of the CAT-, OXD-, POD-, and SOD-like activities of the evolutionarily designed multienzyme-like nanozymes. Reprinted with permission from ref. 48. Copyright 2024 American Chemical Society.
In sum, ML is a reliable method for optimizing the catalytic reaction pathways of nanozymes, offering the potential for the rational design of more high-performance nanozymes in the future.
(1) Small datasets. Nanozyme research is relatively new, resulting in datasets that are often small with insufficient sample sizes to train complex ML models. This limitation hinders the generalization ability of the model, impacting its capacity to predict unknown data.
(2) Data imbalance. Certain types of nanozymes may be studied more frequently than others in practice, leading to a dataset with a much larger number of samples for one type compared to others. This imbalance can bias ML model predictions in favour of the more numerous types, affecting performance on sparser types.
(3) Data quality and consistency issues. Data from published papers often depend on specific experimental conditions and methods. The environments in which the data are acquired may be inconsistent, resulting in uneven data quality that can affect the training of ML models.
(4) Missing data. Published papers tend to focus on the successful applications of nanozymes, with unsuccessful data often deliberately omitted. In ML, understanding both successful and unsuccessful outcomes is crucial for effective learning. The omission of failed data can impact the training effectiveness of ML models, resulting in overfitting to positive (successful) data points and lacking generalization ability. When training ML models, it is important to fully utilize these failed data points and appropriately handle them during the model construction process. Whether to weight them or exclude them will depend on the characteristics of the dataset and the specific goals of the model. Addressing these challenges is essential for enhancing the reliability and applicability of ML models in nanozyme research.
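One common mitigation for the data imbalance described in item (2) is to weight each sample inversely to the frequency of its class during training. The sketch below shows this heuristic, weight = n_samples / (n_classes × class count), which is the same formula scikit-learn uses for its "balanced" class weights; the label counts are hypothetical and do not reflect any real nanozyme dataset.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count of the class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}

# Hypothetical, deliberately imbalanced activity labels (illustration only).
labels = ["POD"] * 6 + ["OXD"] * 3 + ["SOD"] * 1
weights = balanced_class_weights(labels)

# Per-sample weights that could be passed to a weighted loss function.
sample_weights = [weights[label] for label in labels]
print(weights)
```

With these weights, every class contributes equally to the total loss, so sparsely reported activity types are not swamped by the most frequently studied one.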
(1) Energy descriptors. Typically, the catalytic activity of nanozymes is associated with the adsorption energy of reaction intermediates. For instance, Gao et al. demonstrated that descriptors like Eads,OH and Eads,H can be utilized for POD- and CAT-like activities of 2D nanozymes.46 Zheng et al. recently extended the use of Eads,OH to describe the POD-like activity of metal-based nanozymes.49
(2) Electronic structure descriptors. Wang et al. proposed and demonstrated that eg occupancy can serve as a descriptor for the POD-like activity of oxide-based chalcogenides and spinel oxides.50,51
(3) Geometrical structure descriptors. Wang et al. constructed a series of heterogeneous molybdenum nanozymes (MoSA–Nx–C), establishing a correlation between conformation and POD-like activity.52 They found that the coordination number of Mo atoms can be a factor in the POD-like activity, serving as a descriptor for POD-like activity. However, the study of activity descriptors for nanozymes faces two key challenges: (1) The reported descriptors are predominantly focused on POD-like activity, with limited exploration of other activities; (2) The reported descriptors are often specific to particular material systems, with weak generalization ability. These limitations underscore the need for broader and more diverse investigations into activity descriptors for various nanozyme activities and material systems.
Zhu et al. utilized benzenedicarboxylic acid-modified graphene quantum dots (TPA@GQDs) in combination with three transition metal ions (Fe2+, Cu2+, and Zn2+) as three sensing units to build a nanozyme sensing array.53 Using 3,3′,5,5′-tetramethylbenzidine (TMB) as the chromogenic substrate, each sensing unit generated different colorimetric signals for six thiol analytes in the presence of H2O2. Employing the linear discriminant analysis (LDA) model, the absorbance of the samples could be analysed. This classification model utilized the absorbance of TMB at 650 nm as the characteristic peak and thiol class or concentration as the classification label. Through LDA, thiols of the same type or concentration were clustered together, while those of different types or concentrations could be completely separated. This allowed for accurate differentiation between various types of thiol analytes as well as their concentrations (Fig. 6A). Liu et al. selected three oligonucleotides specific for tumor exosomal proteins, each modified with C3N4 nanosheets, to form a sensing array.54 Aptamer adsorption enhanced the selectivity of POD-like activity of o-phenylenediamine (oPD) oxidation by C3N4 nanosheets. In the presence of tumor exosomes, binding occurred between tumor exosomes and oligonucleotides, leading to the separation of oligonucleotides from the C3N4 nanosheets and a reduction in the catalytic activity of the nanozymes. This reduction in catalytic products resulted in a weakened fluorescence signal. Calculating the fluorescence intensity ratios as feature values and using exosomal proteins of the five tumors as classification labels, the LDA classification model categorized the five tumor exosomes into five clusters based on the two most significant differentiation factors. This facilitated the detection of tumor exosomes and the identification of different types of cancers (Fig. 6B).
Fig. 6 Analysing nanozyme optical sensors through ML. (A) Schematics of colorimetric sensor array based on metal ion integrated TPA@GQD nanozyme for thiol discrimination. Reprinted with permission from ref. 54. Copyright 2022 Elsevier. (B) Nanozyme sensor array plus solvent-mediated signal amplification strategy for ultrasensitive detection of exosomal proteins and cancer identification. Reprinted with permission from ref. 55. Copyright 2021 American Chemical Society. (C) Schematic diagram of material catalysis, colorimetric sensing principle, and portable smart sensor. Reprinted with permission from ref. 56. Copyright 2023 American Chemical Society.
All the aforementioned efforts leverage sensor arrays to enhance sensor reliability. A sensor array can generate diverse cross-reactive signals (i.e., fingerprint signals) for each analyte, facilitating multiplexed detection and identification of multiple analytes. The nanozyme sensor array incorporates multiple sensing units, each comprising different nanozymes undergoing similar catalytic reactions and producing multiple signals. ML algorithms can rapidly process these high-dimensional data through techniques such as feature selection and dimensionality reduction, enabling quick prediction and classification and outputting the concentration or class of the target in a shorter timeframe.
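The way LDA separates the fingerprint signals of a sensor array can be sketched with a two-class Fisher discriminant. The example below is a simplification under stated assumptions: it uses only two sensing units (so the within-class scatter matrix is 2 × 2 and can be inverted by hand) and two analytes, whereas the arrays described above use three units and more classes. The absorbance values and the labels "thiol A" and "thiol B" are hypothetical.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

def scatter_matrix(rows, mean):
    sxx = sum((r[0] - mean[0]) ** 2 for r in rows)
    syy = sum((r[1] - mean[1]) ** 2 for r in rows)
    sxy = sum((r[0] - mean[0]) * (r[1] - mean[1]) for r in rows)
    return [[sxx, sxy], [sxy, syy]]

def fit_fisher_lda(class_a, class_b):
    """Return the projection vector w = Sw^-1 (mean_a - mean_b) and a midpoint threshold."""
    mean_a, mean_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter_matrix(class_a, mean_a), scatter_matrix(class_b, mean_b)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    # Explicit 2x2 inverse applied to the difference of class means.
    w = [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
         (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]
    midpoint = [(mean_a[0] + mean_b[0]) / 2, (mean_a[1] + mean_b[1]) / 2]
    threshold = w[0] * midpoint[0] + w[1] * midpoint[1]
    return w, threshold

def classify(sample, w, threshold):
    score = w[0] * sample[0] + w[1] * sample[1]
    return "thiol A" if score > threshold else "thiol B"

# Hypothetical absorbance readings (unit 1, unit 2) for two analytes.
thiol_a = [(0.82, 0.31), (0.79, 0.35), (0.85, 0.29), (0.80, 0.33)]
thiol_b = [(0.45, 0.62), (0.48, 0.58), (0.43, 0.65), (0.47, 0.60)]

w, threshold = fit_fisher_lda(thiol_a, thiol_b)
print(classify((0.81, 0.32), w, threshold))
```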
Sun et al. utilized the POD-like activity of CuO/Fe2O3 for detecting glufosinate and chlortetracycline hydrochloride (CTC).55 Upon addition, glufosinate adsorbed to the surface of CuO/Fe2O3, inhibiting the active center. The subsequent addition of CTC restored the nanozyme activity due to the interaction between glufosinate and CTC. These changes in nanozyme activity resulted in varying colour development of the TMB chromogenic substrate. They established a multifunctional intelligent nanozyme sensor platform capable of automatically recognizing input images and building statistical models through deep learning (Fig. 6C). The fundamental principle involved segmenting the image, extracting mean RGB or HSV values, and fitting the relationship between RGB or HSV and the concentration of the target molecule using a linear support vector machine (SVM), enabling intelligent online detection of glufosinate ammonium and CTC concentration.
Dang et al. designed a Ni/CoMoO4 nanozyme sensor with bienzyme-like activity for detecting glufosinate ammonium and CTC concentrations.56 The nanozyme exhibited bienzyme-like activity sensitive to organophosphorus (OP) and zirconium. In terms of OXD-like activity, the sulfhydryl molecules produced by acetylcholinesterase (AChE) are easily coordinated with metal atoms, blocking the catalytic site of the nanozyme. OP enhances the catalytic activity by deactivating AChE. For POD-like activity, zirconium complexes with Co, blocking the active site and reducing the POD-like activity. Using deep learning, the nanozyme sensor detected concentrations of AChE, OP, and zirconium. Deep learning models recognized input images, and colour pattern analysis applications analysed the colours of “useful” output photos. Employing the YOLOv3 algorithm, images were segmented, and average RGB or HSV values were extracted. SVM was then used to fit the relationship between RGB or HSV and the concentration of the target molecule, allowing the concentrations of multiple targets to be detected separately based on the multienzyme-like activities.
Compared to analysing absorbance data, the analysis of input images demands more sophisticated analytical capabilities from ML, making it more suitable for online intelligent detection. This enables real-time detection and intelligent analysis of the data.
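The colour-feature step of these image-based workflows can be sketched in a few lines. The example below averages the RGB values of an already segmented region, converts the mean colour to HSV with the standard-library `colorsys` module, and calibrates concentration against saturation. Note the substitutions: segmentation (YOLOv3 in the cited work) is assumed to have been done already, and an ordinary least-squares line stands in for the SVM regression; all pixel values and concentrations are hypothetical.

```python
import colorsys

def mean_rgb(pixels):
    """Average (R, G, B) over a list of 0-255 pixel tuples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def saturation(pixels):
    """HSV saturation of the mean colour of a segmented region."""
    r, g, b = mean_rgb(pixels)
    _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return s

def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical calibration: target concentration -> pixels of the reaction zone.
calibration = {
    0.0: [(240, 245, 250)],
    5.0: [(180, 210, 245)],
    10.0: [(120, 175, 240)],
}
features = [saturation(px) for px in calibration.values()]
concentrations = list(calibration.keys())
slope, intercept = fit_line(features, concentrations)

# Predict the concentration for a new (hypothetical) image region.
unknown_region = [(150, 190, 242)]
predicted = slope * saturation(unknown_region) + intercept
print(round(predicted, 2))
```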
Zhu et al. integrated a nanozyme-based electrochemical sensor with ML for measuring electrical signals produced during the OXD-like catalytic reaction of carbazene (CBZ) residues by MoS2 nanohybrid nanozymes.57 Using the artificial neural network (ANN) algorithm, they constructed a neural network model correlating current and various concentrations of CBZ. The electrical signals served as outputs, facilitating the detection of CBZ residue concentrations. This hybrid approach, combining the nanozyme activity with ML analytics, resulted in a sensor with a lower detection limit and heightened sensitivity. The incorporation of ML allowed the electrochemical sensor to handle multiple signal inputs and outputs while maintaining low error, enabling simultaneous detection of concentrations for multiple targets with the nanozyme-based electrochemical sensor. Likewise, Zhu et al. analysed and processed electrical signals generated during the OXD-like catalytic reaction of two substrates, xanthine (XT) and hypoxanthine (HX), by 3D porous graphene nanozymes.58 Using an ANN algorithm, they determined concentrations for both substrates, demonstrating the capability of the sensor to detect multiple target concentrations. Additionally, Zhu et al. employed the OXD-like activity of single-walled carbon nanohorns (SWCNHs) to catalyse 5-hydroxytryptamine (5-HT), producing an electrical signal.59 Leveraging the derivative technique for signal preprocessing, they improved the signal resolution and sensitivity. The ANN algorithm then effectively modeled the current and various concentrations of 5-HT, enhancing the accuracy of substrate concentration predictions.
Integrating ML with nanozyme-based sensors opens up new possibilities. Given that the intensity of the electrical signal correlates positively with nanozyme activity, ML can enhance the preparation methods for nanozyme-based sensors by analysing electrical signals to identify conditions yielding the highest activity. Xu et al. employed an orthogonal experimental design in combination with the BP artificial neural network-genetic algorithm (BP–ANN–GA) to assess the impact of four factors (volume ratio of graphene oxide (GO) and multi-walled carbon nanotubes (MWCNTs), silver nitrate concentration, CV deposition cycle, and pH of phosphate buffer) on the peak current value (I) of benzyl (BN).60 This approach aimed to optimize the preparation technique for the nanozyme-based sensor and determine the optimal experimental conditions. Under these conditions, the nanozymes catalysed a BN reaction on the working electrode, generating an electrical signal. Subsequently, the support vector machine (SVM) algorithm and least squares support vector machine (LS-SVM) algorithm were employed to achieve intelligent sensing of BN.
Through ML-assisted electroanalysis of chemical reaction signals, nanozyme-based sensors have demonstrated enhanced sensitivity and improved detection capabilities. Furthermore, ML plays a crucial role in refining the preparation and optimizing the performance of nanozyme-based sensors.
Yu et al. developed a multimodal nanozyme-based sensor in which liposomes containing hollow Prussian blue nanoparticles (h-PB) were confined in test cells.61 Following a classical sandwich immunochromatographic assay, the released h-PB nanoparticles were transferred to a TMB-H2O2 system to generate a colorimetric signal. Simultaneously, the temperature signal was detected via the photothermal effect under 808 nm near-infrared laser excitation. Both signals were subjected to analysis using an ANN model, resulting in multimodal biosensing for precise targeted protein detection (Fig. 7). This multimodal nanozyme-based sensor demonstrated an ultra-wide dynamic range of 0.02 to 20 ng mL−1 and a detection limit of 10.8 pg mL−1, showcasing improved sensitivity. This study serves as a noteworthy example of a multimodal nanozyme-based sensor, highlighting the considerable potential of multimodal sensors in biosensing applications.
Fig. 7 Nanozyme-based multimodal sensors. (A) Schematic diagram of the constructed portable photothermal colorimetric dual-modality biosensor. (B) Schematic diagram of artificial neural network for multimodal data processing. (C) Schematic representation of the immune strategy for target cTnI-triggered bimodal biosensing, where the obtained colorimetric and photothermal data were passed into the ANN model for further processing. (mAb1: monoclonal anti-cTnI capture antibody; mAb2: monoclonal anti-cTnI detection antibody; h-PB NPs: hollow Prussian blue nanoparticles; ox-TMB: oxidation-TMB). Reprinted with permission from ref. 62. Copyright 2022 Elsevier.
Fig. 8 The universal flowchart for ML in the field of nanozymes. Reprinted with permission from ref. 46. Copyright 2023 Wiley-VCH GmbH. Reprinted with permission from ref. 41. Copyright 2022 Wiley-VCH GmbH. Reprinted with permission from ref. 6. Copyright 2023 American Chemical Society. |
For ML applications in nanozyme detection, researchers must design experiments to gather a substantial amount of data aligned with the specific detection task. The experimental design encompasses selecting the nanozyme type, determining the sensor type, and optimizing detection conditions (e.g., reaction time, temperature, and pH). Under these optimal conditions, the collected detection signals are used to construct a dataset, enabling the establishment of a quantitative relationship between analyte concentration and the generated signals.
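In the simplest case, the quantitative relationship between concentration and signal is a linear calibration curve. The following is a minimal sketch under that assumption; the data are made-up illustrative numbers, and `fit_line` and `predict_concentration` are hypothetical helper names rather than part of any cited workflow.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def predict_concentration(signal, slope, intercept):
    """Invert the calibration line to estimate a concentration from a signal."""
    return (signal - intercept) / slope

# Made-up calibration data: concentration (arbitrary units) vs. sensor signal
conc = [0.0, 1.0, 2.0, 4.0, 8.0]
signal = [0.05, 0.25, 0.45, 0.85, 1.65]

slope, intercept = fit_line(conc, signal)
unknown = predict_concentration(1.05, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, unknown={unknown:.2f}")
```

When the response is nonlinear or several signals are recorded at once, the same dataset can instead be passed to an ML regressor, but the linear fit remains a useful baseline.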
In summary, the quality of raw data significantly influences the performance and reliability of ML models. The approaches to data collection are not limited to the aforementioned methods, and the overarching objective for researchers is to construct a comprehensive database encompassing all nanozymes and their properties.
In nanozyme design, researchers often collect data from the published literature. However, the quantity and quality of data reported in different works vary, leading to datasets with missing values. Effective data preprocessing is therefore crucial, and three strategies are commonly used: deletion, interpolation, and omission. For instance, Razlivina et al. encountered missing values in a dataset compiled from over 100 papers on nanozymes.41 They utilized the K-Nearest Neighbor (K-NN) algorithm to impute missing values, setting a maximum data sparsity threshold of 80%. Features with a high proportion of missing values can instead be deleted directly. Additionally, Wei et al. excluded data with a significant number of missing values, particularly for the ζ-potential and the catalytic interface of nanoparticles, as these were not chosen as inputs for the ML model.62
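The deletion and imputation strategies can be sketched as follows. This is an illustrative, dependency-free version, not the code used in the cited studies: the 80% threshold is interpreted here (as an assumption) as a per-feature limit on the fraction of missing entries, the feature values are made up, and all function names are hypothetical.

```python
import math

def drop_sparse_columns(rows, max_missing=0.8):
    """Remove features whose fraction of missing (None) entries exceeds max_missing."""
    n_rows, n_cols = len(rows), len(rows[0])
    keep = [j for j in range(n_cols)
            if sum(r[j] is None for r in rows) / n_rows <= max_missing]
    return [[r[j] for j in keep] for r in rows], keep

def _distance(a, b):
    """Root-mean-square difference over features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k=2):
    """Fill each missing entry with the mean of that feature over the k nearest rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted((_distance(row, other), other[j])
                            for m, other in enumerate(rows)
                            if m != i and other[j] is not None)
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Made-up records: particle size, pH, temperature, zeta-potential (mostly unreported)
raw = [
    [10.0, 4.0, 25.0, None],
    [12.0, 4.0, None, None],
    [50.0, 7.0, 37.0, None],
    [11.0, None, 25.0, None],
    [52.0, 7.0, 37.0, -20.0],
    [49.0, 7.0, 37.0, None],
]
reduced, kept = drop_sparse_columns(raw)   # the last feature is 5/6 missing
imputed = knn_impute(reduced, k=2)
print(kept, imputed[1], imputed[3])
```

In practice the features should be scaled before distances are computed, since otherwise the feature with the largest numerical range dominates the neighbour search. scikit-learn's `KNNImputer` provides a maintained implementation of the same idea.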
In the analytical sensing of nanozymes, experimental data may be affected by noise, outliers, or corrupted data points due to various factors, such as environmental conditions, operational errors, or equipment failures. Data cleaning is a crucial step that involves identifying and removing these inconsistencies. Thresholds can be established to filter out readings that fall outside reasonable ranges, or statistical methods can be applied to identify and correct problematic points. The goal of data cleaning is to enhance the quality of the dataset, ensuring the accuracy and reliability of subsequent analyses. This process enables researchers to analyse data effectively and make accurate predictions.
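Both cleaning strategies can be illustrated in a few lines. The sketch below assumes replicate readings of a single signal; the readings and the plausible range are made up, and the statistical filter uses the modified z-score based on the median absolute deviation with the conventional cutoff of 3.5, which is one common choice among several.

```python
from statistics import median

def range_filter(readings, low, high):
    """Keep only readings inside a physically plausible [low, high] window."""
    return [r for r in readings if low <= r <= high]

def mad_filter(readings, cutoff=3.5):
    """Drop readings whose modified z-score exceeds the cutoff."""
    med = median(readings)
    mad = median(abs(r - med) for r in readings)
    if mad == 0:
        return list(readings)  # no spread to judge against; keep everything
    return [r for r in readings if 0.6745 * abs(r - med) / mad <= cutoff]

# Made-up replicate absorbance readings with one impossible and one extreme value
readings = [0.51, 0.49, 0.50, 0.52, 0.48, 1.90, -0.30, 0.50]

plausible = range_filter(readings, low=0.0, high=3.0)   # removes -0.30
cleaned = mad_filter(plausible)                         # removes 1.90
print(cleaned)
```

The median-based statistic is used here instead of the mean and standard deviation because the latter are themselves distorted by the outliers they are meant to detect.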
To summarize, researchers can initially screen models based on their study purpose (classification or regression). For selecting the specific model, researchers should consider their experience and refer to the dataset's size and characteristics. Using multiple models for learning and subsequently selecting the best-performing model through evaluation is a common practice.
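The practice of training several candidate models and keeping the best performer can be sketched with k-fold cross-validation. The example below is a minimal, dependency-free illustration for a one-feature regression task; the three candidate models and the synthetic data are assumptions chosen for brevity, not models used in the studies discussed.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Baseline: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Least-squares straight line."""
    xb, yb = mean(xs), mean(ys)
    slope = (sum((x - xb) * (y - yb) for x, y in zip(xs, ys))
             / sum((x - xb) ** 2 for x in xs))
    return lambda x: yb + slope * (x - xb)

def fit_1nn(xs, ys):
    """1-nearest-neighbour regressor: copy the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def cv_rmse(fit, xs, ys, k=5):
    """Root-mean-square error over k interleaved cross-validation folds."""
    n = len(xs)
    sq_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        sq_errors += [(model(xs[i]) - ys[i]) ** 2 for i in test_idx]
    return (sum(sq_errors) / n) ** 0.5

# Synthetic data: a linear trend with a small alternating perturbation
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.3 if i % 2 else -0.3) for i, x in enumerate(xs)]

candidates = {"mean baseline": fit_mean, "linear": fit_linear, "1-NN": fit_1nn}
scores = {name: cv_rmse(fit, xs, ys) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Scoring every candidate on held-out folds, rather than on the data it was fitted to, is what makes the comparison meaningful; a flexible model such as 1-NN would otherwise appear perfect on its own training set.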
Currently, there are two publicly accessible nanozyme databases at https://dizyme.net/ and https://nanozymes.net/. The former database is divided into three tiers: a basic version (chemical formula), a progressive version (crystal system, particle size, shape, surfactant), and an advanced version (pH, temperature, hydrogen peroxide concentration, substrate concentration, catalyst concentration, etc.). These tiers are designed to predict kinetic parameters such as Vmax, Km, and kcat in the Michaelis–Menten equation. Users have the option to contribute to the content of the database, which currently contains information on over 300 nanozymes. The latter database is an aggregation of more than 1000 publications, encompassing thousands of materials with details about their kinetic parameters, applications, and references. It is important to note that the former database provides a prediction function for nanozyme activity, whereas the latter database focuses on collecting and organizing information about materials and the applications of nanozymes.
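For reference, the kinetic parameters named above enter the Michaelis–Menten rate law as v = Vmax[S]/(Km + [S]), with kcat = Vmax/[E]. The sketch below shows how Vmax and Km could be estimated from rate measurements using the Lineweaver–Burk linearization; the data are synthetic and noise-free, the units and catalyst concentration are hypothetical, and this is not the prediction method used by either database.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten initial rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)

def lineweaver_burk(substrate, rates):
    """Estimate (Vmax, Km) from the linear form 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    vmax = 1.0 / intercept
    return vmax, slope * vmax

# Synthetic, noise-free data generated with Vmax = 10 and Km = 0.5 (arbitrary units)
substrate = [0.1, 0.25, 0.5, 1.0, 2.0]
rates = [mm_rate(s, vmax=10.0, km=0.5) for s in substrate]

vmax_fit, km_fit = lineweaver_burk(substrate, rates)
catalyst_conc = 2.0                 # hypothetical nanozyme concentration [E]
kcat = vmax_fit / catalyst_conc     # turnover number, kcat = Vmax / [E]
print(round(vmax_fit, 3), round(km_fit, 3), round(kcat, 3))
```

With real, noisy measurements, a direct nonlinear fit of the rate law is generally preferred, because taking reciprocals amplifies the error of the slowest rates.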
However, the lack of uniform research and testing conditions and methods means that many nanozyme performance data are not comparable, which hinders the effective expansion of nanozyme databases. Therefore, developing a set of standardized operating procedures (SOPs) covering reaction conditions (such as temperature, pH, and substrate concentration), dosages, and detection methods is an effective approach to ensure the comparability of data generated in different laboratories. Additionally, establishing nanozyme characterization standards, including catalytic efficiency, stability, toxicity, and biocompatibility, is crucial. These evaluation standards help define which performance data are considered important and necessary. Standardizing data and research through the implementation of protocols contributes to the further expansion of databases, enabling the design of more efficient and specific nanozymes.
The current understanding of the catalytic mechanism of nanozymes relies heavily on molecular simulations and empirical judgments by researchers. However, the catalytic mechanisms may vary across different material systems. The development of interpretable ML models holds the potential to unveil unified descriptors of nanozyme activity, leading to the discovery of activity laws governing nanozymes. While the Michaelis–Menten equation has been utilized in nanozyme research, the differences between nanozymes and natural enzymes may necessitate the development of more tailored kinetic equations. Gao et al. proposed a microkinetic equation for POD-like activity on the material surface, which outperformed the Michaelis–Menten equation.46 Interpretable ML models could aid in deriving microkinetic equations specific to different nanozyme activities, potentially replacing or updating the traditional Michaelis–Menten equation and offering a more nuanced and accurate representation of the catalytic mechanisms of nanozymes.
Fortunately, the integration of ML potentials and ML force fields stands out as a promising avenue for overcoming the challenges posed by the intricate material systems and complex application environments of nanozymes. ML potentials enable the swift calculation of intricate material systems and enhance the simulation of complex electronic structures and multi-catalytic active sites, offering a more efficient approach to understanding nanozyme behaviour. ML force fields maintain calculation accuracy while significantly expediting large-scale molecular dynamics (MD) studies, reducing computational costs and enabling a more comprehensive exploration of nanozyme reaction mechanisms.
Footnote
† These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2024