Rodrigo P. Ferreira‡ ab, Rui Ding‡ ab, Fengxue Zhang c, Haihui Pu ab, Claire Donnat *d, Yuxin Chen *c and Junhong Chen *ab
a Pritzker School of Molecular Engineering, University of Chicago, 5640 S Ellis Ave., Chicago, IL 60637, USA. E-mail: junhongchen@uchicago.edu
b Chemical Sciences and Engineering Division, Physical Sciences and Engineering Directorate, Argonne National Laboratory, 9700 S Cass Ave., Lemont, IL 60439, USA
c Department of Computer Science, University of Chicago, 5730 S Ellis Ave., Chicago, IL 60637, USA. E-mail: chenyuxin@uchicago.edu
d Department of Statistics, University of Chicago, 5747 S Ellis Ave., Chicago, IL 60637, USA. E-mail: cdonnat@uchicago.edu
First published on 4th March 2025
Improving the sensitive and selective detection of analytes in a variety of applications requires accelerating the rational design of field-effect transistor (FET) chemical sensors. Achieving high-performance detection relies on identifying optimal probe materials that can effectively interact with target analytes, a process traditionally driven by chemical intuition and time-consuming trial-and-error methods. To address the difficulties in probe screening for FET sensor development, this work presents a methodology that combines neuromorphic machine learning (ML) architectures, specifically a hybrid spiking graph neural network (SGNN), with an enriched dataset of physicochemical properties through semi-automated data extraction using large language models. Achieving a classification accuracy of 0.89 in predicting sensor sensitivity categories, the SGNN model outperformed traditional ML techniques by leveraging its ability to capture both global physicochemical properties and sparse topological features through a hybrid modeling framework. Next-generation sensor design was informed by the actionable insights into the connections between material properties and sensing performance offered by the SGNN framework. Through virtual screening for the detection of per- and polyfluoroalkyl substances (PFAS) as a use case, the effectiveness of the SGNN model was further validated. Density functional theory simulations confirmed graphene as a promising active material for PFAS detection as suggested by the SGNN framework. By bridging gaps in predictive modeling and data availability, this integrated approach provides a strong foundation for accelerating advancements in FET sensor design and innovation.
Design, System, Application
For sensitive and specific analyte detection in a variety of applications, such as environmental monitoring and biomedical diagnostics, field-effect transistor (FET) chemical sensors are promising technologies. However, finding the best probe materials is necessary to achieve high performance in these sensors; this process has historically been limited by trial-and-error methods and small datasets. A neuromorphic spiking graph neural network (SGNN) framework that connects material-level insights with device-level functionality is presented in this work. The framework predicts sensor sensitivity with a high accuracy and offers practical design guidelines for maximizing FET sensor performance by combining enriched datasets with hybrid machine learning models. By using virtual screening for the detection of per- and polyfluoroalkyl substances (PFAS), we validate this methodology and show that graphene is a promising option for high-sensitivity probes. This study shows how a data-driven design approach can accelerate the development of FET sensors, allowing for scalable fabrication and optimization for next-generation sensing applications in the industrial, medical, and environmental fields.
Researchers have used modeling tools to solve these issues and offer more trustworthy advice for sensor design. Conventional computational techniques, such as molecular dynamics (MD) simulations and density functional theory (DFT) calculations, have been crucial in expanding our knowledge of material properties at the atomic and molecular levels.25,26 However, the application of these approaches to device-level FET sensor modeling remains limited to qualitative analysis. They frequently overlook real-world variables, including interactions with the environment, fabrication-induced defects, and changing operating conditions. The qualitative nature of these models constrains their predictive potential and their suitability for designing sensors with the best possible performance.27–29 The discrepancy between theoretical predictions and real-world applications shows that alternative approaches are required to bridge material-level insights and device-level functionality.
The absence of high-throughput experimental synthesis and evaluation techniques for FET sensors further complicates the shift from material-level insights to device-level predictions. The intricate assembly procedures of FET sensors prohibit such high-throughput methods, in contrast to other domains where automation and robotics have made it possible for quick screening and device and material optimization.30–33 Implementing automated, large-scale experimentation is difficult due to the complex fabrication processes and the devices' susceptibility to slight production differences. The investigation of novel material combinations and device architectures is restricted by this bottleneck, which also slows down the rate of invention.34 As a result, the development cycle for FET sensors continues to be drawn out, and the creation of new, high-performance sensors is severely hampered.
In recent years, researchers in chemistry, materials science, and engineering have gradually embraced machine learning (ML) approaches.35,36 ML models can rapidly uncover underlying patterns and speed up discovery using data from either a large number of theoretical simulations37 or experimental trials.38 However, when it comes to FET sensors, ML is mostly used to analyze real-time operating data, calibrate devices, and improve performance by correcting signal drift.39,40 Although this use of ML increases the accuracy and reliability of sensors, it ignores the urgent need to find new materials or probe–target combinations that could substantially improve sensor performance. The potential of ML to direct the rational design and development of next-generation FET chemical sensors remains underdeveloped due to the current focus on operational data analysis.41 The lack of extensive, consistent datasets is the reason why researchers in the field steer clear of such investigations. The scarcity of large-scale, high-quality data hinders the development of robust data-driven ML models capable of both making accurate predictions and providing meaningful insights for bottom-up sensor design. Therefore, there is a huge opportunity to broaden the use of ML beyond its current function as a data interpretation tool to that of a catalyst for meta-level innovation.
To overcome the challenges discussed above, we propose a novel solution, shown in Fig. 2. It proceeds from data preparation (phases I–III) to ML modeling (phase IV) and finally to application, both in obtaining domain insights for FET chemical sensors and in virtual screening of probe candidates (phases V–VI). To address data scarcity, we employed large language models42 (LLMs) in a semi-automated manner, parsing the vast corpus of publications into structured data with a carefully designed schema that incorporates the information needed for downstream ML. Manual inspection and correction were then conducted to ensure the reliability of the extracted data and mitigate potential inaccuracies from the LLMs. To further enhance the quality of the dataset, we enriched the extracted data with substantial physicochemical properties from established databases and computational tools, including experimentally measured values from sources like PubChem as well as theoretically calculated descriptors from cheminformatics tools like RDKit, grounding the information in experimentally verifiable parameters.
For predictive modeling, we used several ML algorithms. In particular, we developed a novel hybrid model combining a graph neural network43 (GNN) and a neuromorphic spiking neural network44–46 (SNN) for learning physicochemical properties and structural characteristics of the substances,47 respectively. The spiking graph neural network (SGNN) model, as a combination of both SNN and GNN, has shown promising performance for classifying FET chemical sensors into categories ranging from very high to very low sensitivity. When tested on its ability to predict sensor sensitivity categories, the model identified the correct category in 89% of the cases, substantially outperforming the other ML architectures compared (a random guess baseline would only achieve 22% accuracy). From the trained, best-performing SGNN model, we could derive further physicochemical insights (e.g., the most impactful features for determining sensing performance) that can be used to guide FET sensor design. Finally, we were able to broaden its impact by using the domain knowledge-informed SGNN model to find promising probe candidates for certain challenging contaminants that have not been fully studied in the field. Perfluorooctane sulfonic acid (PFOS)48 is a typical per- and polyfluoroalkyl substance (PFAS), also known as “forever chemicals”,49,50 that widely exists in many bodies of water. Through comprehensive dry lab DFT simulations, we validated a new candidate, based on the SGNN prediction, that might offer high sensitivity and selectivity toward PFOS.
Building upon the initial data extraction, we systematically transformed and enriched the dataset to prepare it for ML analysis. We began by compiling a comprehensive list of publications related to FET sensors from scientific databases using customized query strings. Utilizing LLMs, we semi-automated the extraction of key experimental parameters from these publications, including sensor types, detection targets, detection limits, probe materials (the functional materials that directly interact with and recognize target analytes), operational conditions, and mediums. To ensure data quality and consistency, we performed manual validation to correct inaccuracies, standardize units, and unify terminologies. This included converting detection limits reported in various units to a common scale (e.g., parts per million) and resolving ambiguities in chemical nomenclature. We also excluded entries with complex substances that lacked retrievable properties or those that did not align with our focus. This rigorous process resulted in a curated dataset of 1433 data entries extracted from 1192 publications, providing a solid foundation for subsequent modeling efforts.
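The unit standardization step described above can be illustrated with a minimal sketch. The function name `to_ppm`, the supported unit labels, and the dilute-aqueous approximation (1 L of solution weighs about 1 kg, so 1 mg/L is roughly 1 ppm) are assumptions made for illustration and are not taken from the original pipeline. Gas-phase limits, where ppm usually denotes a volume fraction, would need a different rule.

```python
# Hypothetical helper: convert a reported detection limit to parts per million (ppm).
# Assumption: dilute aqueous medium, so 1 mg of analyte per litre is about 1 ppm.

MASS_FRACTION_TO_PPM = {"ppm": 1.0, "ppb": 1e-3, "ppt": 1e-6}
MOLAR_TO_MOL_PER_L = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def to_ppm(value, unit, molar_mass_g_per_mol=None):
    """Return the detection limit in ppm (mg of analyte per kg of solution)."""
    if unit in MASS_FRACTION_TO_PPM:
        return value * MASS_FRACTION_TO_PPM[unit]
    if unit in MOLAR_TO_MOL_PER_L:
        if molar_mass_g_per_mol is None:
            raise ValueError("molar mass is required to convert molar units")
        mol_per_l = value * MOLAR_TO_MOL_PER_L[unit]
        # mol/L times g/mol gives g/L; multiply by 1000 for mg/L (about ppm in water)
        return mol_per_l * molar_mass_g_per_mol * 1000.0
    raise ValueError(f"unsupported unit: {unit}")


# Usage with placeholder numbers (the molar mass here is a round illustrative value).
limit_from_ppb = to_ppm(500, "ppb")
limit_from_molar = to_ppm(1, "uM", molar_mass_g_per_mol=500.0)
```

Routing every entry through one function of this kind makes unsupported or ambiguous units fail loudly, which is where manual validation would step in.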
Each data sample (experimental trial) was encoded as a JavaScript Object Notation (JSON) object, with blocks for target (T), probe (P), medium (M), and conditions (C), as shown in Fig. 3a. Additionally, as illustrated in Fig. 3b, the target, probe, and medium blocks are each composed of a set of one or more substances, described by their names, substance types (e.g., small molecule, inorganic solid, polymer), and other relevant descriptors (e.g., molecular mass, volume, topological polar surface area, TPSA, and sparse fingerprints such as the Morgan binary fingerprint, which is typically used for molecular virtual screening51). The conditions block, on the other hand, consists of key values describing the experimental measurement environment: operating temperature and maximum and minimum pH, all of them referring to the electrolyte medium.
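To make the four-block layout concrete, the following sketch builds one record and checks its structure. All key names (`substance_type`, `morgan_on_bits`, `temperature_c`, and so on) and all numeric values are placeholders chosen for illustration; the actual schema is the one shown in Fig. 3a and b.

```python
import json

REQUIRED_BLOCKS = ("target", "probe", "medium", "conditions")
SUBSTANCE_BLOCKS = ("target", "probe", "medium")

# Illustrative record only: key names and numbers are placeholders.
sample = {
    "target": [{
        "name": "example analyte",
        "substance_type": "small molecule",
        "descriptors": {"mass": 100.0, "volume": 80.0, "tpsa": 20.0},
        "morgan_on_bits": [3, 17, 42],  # sparse fingerprint stored as set-bit indices
    }],
    "probe": [{
        "name": "example probe",
        "substance_type": "inorganic solid",
        "descriptors": {"mass": 12.0, "space_group_number": 1.0},
        "morgan_on_bits": [],
    }],
    "medium": [{
        "name": "example electrolyte",
        "substance_type": "small molecule",
        "descriptors": {"mass": 18.0, "tpsa": 1.0},
        "morgan_on_bits": [5],
    }],
    "conditions": {"temperature_c": 25.0, "ph_min": 7.0, "ph_max": 7.4},
}


def validate(record):
    """Check that all four blocks exist and each substance block is a non-empty list."""
    if any(block not in record for block in REQUIRED_BLOCKS):
        return False
    return all(
        isinstance(record[block], list) and len(record[block]) >= 1
        for block in SUBSTANCE_BLOCKS
    )


encoded = json.dumps(sample, indent=2)  # serialized form of one experimental trial
decoded = json.loads(encoded)           # round trip back to Python objects
```

Storing the fingerprint as a list of set-bit indices is one compact option for sparse binary vectors; a full bit string would work equally well.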
Given this data structure, we then encoded the data samples in the way that best fits each state-of-the-art baseline classification algorithm: (vanilla) gradient boosting, CatBoost, XGBoost, multi-layer perceptron (MLP), GNN, SNN, and SGNN. In all cases, the output is an integer from 1 to 5 that corresponds to the LDL category as defined. For traditional ML algorithms (gradient boosting, CatBoost, XGBoost, and MLP), which require vectorized input, numerical features from each JSON object were concatenated into fixed-length input vectors. As shown in Fig. 3c, this straightforward encoding approach included padding to ensure uniform vector dimensions across all inputs. The graph encoding scheme for the vanilla GNN (and part of the SGNN) is represented by Fig. 3d. Each node (T, P, M, C) holds its respective features and, analogously to the sequential encoding, we padded the data so that all nodes had the same dimension. The proposed architecture (T, P, and M fully connected, and C connected only to M) was derived from the observation that target, probe, and medium interact strongly with each other in determining the LDL value, while the conditions node (pH, temperature) mainly influences the sensing performance through its direct effects on the medium's properties. While minor thermodynamic effects may act directly on the target and probe, the testing conditions largely determine the aqueous or gaseous sensing environment through ionic strength, charge distribution, and molecular diffusion.
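The two encodings can be sketched in a few lines. The edge list below follows the connectivity stated in the text (T, P, and M fully connected, C linked only to M); the feature values, the zero padding value, and the helper names are assumptions for illustration, and the flattening is only one plausible reading of the sequential encoding in Fig. 3c.

```python
NODES = ["T", "P", "M", "C"]
# Connectivity stated in the text: T, P, M fully connected; C attached only to M.
EDGES = [("T", "P"), ("T", "M"), ("P", "M"), ("M", "C")]


def pad_features(features, pad_value=0.0):
    """Pad every node's feature list to the length of the longest one."""
    width = max(len(values) for values in features.values())
    return {
        node: list(values) + [pad_value] * (width - len(values))
        for node, values in features.items()
    }


def adjacency(nodes, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return matrix


def flatten(features, nodes):
    """Sequential encoding: concatenate the padded node features in a fixed order."""
    padded = pad_features(features)
    return [x for node in nodes for x in padded[node]]


# Placeholder feature values, not data from the study.
raw = {
    "T": [100.0, 80.0, 20.0],
    "P": [12.0, 1.0, 30.0, 40.0],
    "M": [18.0, 1.0],
    "C": [25.0, 7.0, 7.4],
}
node_features = pad_features(raw)   # graph encoding: one padded vector per node
adj = adjacency(NODES, EDGES)       # graph encoding: who exchanges messages with whom
vector = flatten(raw, NODES)        # sequential encoding for tree models and the MLP
```

In the graph form, the conditions node can influence the target and probe only through the medium node, which mirrors the physical argument given above.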
Given their proficiency with complex, graph-structured data, GNNs effectively modeled the interconnected chemical and material properties of FET sensors. Compared with the sequential encoding method (Fig. 4a), the graph-based encoding for GNN (Fig. 4b) better captured interactions between T, P, and M descriptors, aligning with the physicochemical process of a sensor probe detecting a target in the medium.
To further enhance the model, we integrated an SNN (based on standard “leaky-integrate-and-fire” mechanism as shown in Fig. S2†) with the GNN model, resulting in the SGNN model. As illustrated in Fig. 4c, our proposed SGNN further naturally handles sparse node features such as the Morgan fingerprint, a molecular descriptor representing the structure and connectivity of a molecule by encoding information about atomic neighborhoods. Specifically, the Morgan fingerprints use a binary string based on the presence or absence of different substructures and atomic environments at various topological distances from each atom in the molecule. Hence, these binary fingerprints in the T, P, and M nodes are treated as spike trains, as we show in the bottom pipeline of Fig. 4c. In the meantime, the upper pipeline maintains the graph representation of global physicochemical properties (e.g., TPSA, volume, mass), similar to the vanilla GNN encoding method (Fig. 4b). The key difference is that the vanilla GNN encodes all descriptors together, while the SGNN separates global descriptors and sparse fingerprints to leverage the strengths of both GNN and SNN.
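The idea of feeding a binary fingerprint to a leaky integrate-and-fire (LIF) unit can be sketched as follows. This is a toy single-neuron example, not the SGNN layer itself: the mapping of one fingerprint bit to one time step, and the `weight`, `decay`, and `threshold` values, are assumptions chosen for illustration.

```python
def bits_to_spike_train(on_bits, n_bits):
    """Expand a list of set-bit indices into a 0/1 sequence (one bit per time step)."""
    train = [0] * n_bits
    for bit in on_bits:
        train[bit] = 1
    return train


def lif_neuron(spike_train, weight=0.6, decay=0.8, threshold=1.0):
    """Leaky integrate-and-fire: return the output spikes for a binary input train."""
    membrane = 0.0
    output = []
    for spike in spike_train:
        # Leak the stored potential, then integrate the weighted input spike.
        membrane = decay * membrane + weight * spike
        if membrane >= threshold:
            output.append(1)
            membrane = 0.0  # hard reset after firing
        else:
            output.append(0)
    return output


# Hypothetical 8-bit fingerprint with bits 1, 2 and 5 set.
train = bits_to_spike_train([1, 2, 5], n_bits=8)
spikes = lif_neuron(train)
```

With these settings a single isolated bit does not reach the threshold, while two adjacent set bits do, so the neuron responds to local clusters of active bits rather than to every bit individually.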
As shown in Fig. 5, SGNN (Fig. 4c) significantly outperformed the vanilla GNN (Fig. 4b) and other algorithms relying on sequential encoding, including (vanilla) gradient boosting, XGBoost,53 CatBoost,54 and MLP. Even when using only the bottom pipeline of the SNN (i.e., the vanilla SNN with sparse, geometric fingerprints as spike train input), the classification performance exceeded that of the vanilla GNN, which utilized both data components. This result confirms our expectation that using GNNs for dense physicochemical features and SNNs for sparse geometric descriptors is superior to employing a single encoding method for all features. Given the baseline accuracy of 0.22 for random guessing, SGNN's 0.899 accuracy demonstrates its reliability and suitability for this classification task. On top of that, the combination of GNNs and SNNs offers additional advantages such as enhanced temporal information processing55 and energy-efficient models for learning large-graph data.56 For those reasons, the proposed SGNN, along with potential future variations, represents a powerful tool for capturing the complex interactions within FET sensors and offering predictive insights for the rational design of novel materials and device architectures.
There are multiple ways to perform feature selection. In our case, we used integrated gradients57 and Shapley values58 for determining each feature's contribution to the prediction for each sample. For both approaches, we obtained the relative frequency (a number between 0 and 1) of each feature to be in the top 10 most relevant features. As described in the Methods section in ESI,† we also employed random matrix theory (RMT) techniques—such as the Marchenko–Pastur law and sparse principal component analysis (PCA)—to further verify the feature selection results obtained via integrated gradients and SHAP values.
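The relative-frequency statistic described above can be computed from any per-sample attribution matrix, whether it comes from integrated gradients or SHAP. The sketch below ranks features by absolute attribution, which is an assumption (the text does not state how ties or signs were handled), and uses made-up scores with k = 2 for brevity instead of the top 10.

```python
def top_k_frequency(attributions, feature_names, k=10):
    """Fraction of samples in which each feature ranks among the k largest attributions.

    attributions: list of per-sample lists, one score per feature.
    Ranking uses the absolute value of each score (an illustrative assumption).
    """
    counts = {name: 0 for name in feature_names}
    for row in attributions:
        ranked = sorted(range(len(row)), key=lambda i: abs(row[i]), reverse=True)
        for i in ranked[:k]:
            counts[feature_names[i]] += 1
    n_samples = len(attributions)
    return {name: counts[name] / n_samples for name in feature_names}


# Made-up attribution scores for three samples and three features.
names = ["mass", "volume", "tpsa"]
scores = [
    [0.9, -0.5, 0.1],
    [0.2, 0.7, -0.8],
    [0.6, 0.4, 0.3],
]
frequency = top_k_frequency(scores, names, k=2)
```

Each returned value lies between 0 and 1, so results from different attribution methods can be compared on the same scale.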
Fig. 6 contains the breakdown of the final results—both globally and within each block (for target, probe, medium, conditions). As suggested by these results, the target block contains the most important features overall: mass being the most important, followed by volume, TPSA and complexity. Probe is also an important block, contributing to the LDL prediction via its space group number (SGN), bulk modulus, shear modulus, and mass. Temperature and pH value in the condition block are highly significant, consistent with the chemical intuition. Temperature influences thermodynamics, particularly in gas sensing, where it affects sensor response.59 In liquid environments, pH is crucial as it impacts the charges of probe molecules, influencing their isoelectric points.60 Finally, medium properties (XLogP namely octanol–water partition coefficient with atom-additive approach, TPSA, mass) are far less impactful in the LDL category prediction. Such results are also consistent with the chemical intuition and domain consensus of those sensing phenomena.
Fig. 6 Feature importance analysis results: (a) global and (b) by specific node blocks based on SHAP values and integrated gradients analysis.
In FET chemical sensors, the detection limit is influenced by various factors related to the target analyte, the probe material, and the surrounding medium. The mass, volume, and TPSA of the target analyte are critical, as they affect how the analyte interacts with the sensor's probe, thereby impacting sensitivity. Larger mass and volume can enhance van der Waals interactions, while a higher TPSA may increase hydrogen bonding potential, both of which can strengthen binding affinity.61,62 The probe's properties, such as its bulk modulus and space group number, determine its mechanical stability and crystalline structure, influencing its ability to transduce chemical interactions into electrical signals.63,64 The medium's characteristics, including pH and temperature, can alter the analyte's ionization state and the probe's surface chemistry, affecting the overall sensor response.65 However, the medium often serves as a passive environment, making its features less impactful compared to those of the target and probe. Understanding these relationships aligns with established principles in sensor design, where the interplay between analyte properties and probe characteristics is pivotal for achieving optimal performance. Moreover, by such black-box interpretation analysis, we validated that the SGNN effectively captures domain knowledge, making it a reliable predictor for designing novel materials.
As a result, we found highly consistent predictions. The four most recurrent probe materials associated with this behavior were graphene, zinc oxide, aluminum oxide, and carbon nanotube. This result can also be analyzed from the recall perspective, since recall measures how well the SGNN has identified all correct instances by computing the fraction of true positives relative to the total of true positives and false negatives. In this case, a true positive corresponded to a category 1 probe material that was classified as such, while a false negative represented a category 1 (high sensitivity) probe missed by our model. Since recall values ranged between 0.877 and 0.899 across our modeling results (Fig. 5), we concluded that, with high probability, we identified most of the best category 1 probe materials. Had we used algorithms with lower recall, we would have been significantly more likely to miss a promising candidate.
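The recall definition used above (true positives divided by the sum of true positives and false negatives, with category 1 as the positive class) can be written out directly. The labels below are invented to show the arithmetic and are not results from the study.

```python
def recall_for_class(y_true, y_pred, positive=1):
    """Recall for one class: TP / (TP + FN)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no instances of the positive class in y_true")
    return true_pos / (true_pos + false_neg)


# Invented labels: five category 1 probes, one of which the classifier misses.
y_true = [1, 1, 1, 1, 2, 3, 5, 1]
y_pred = [1, 1, 2, 1, 2, 3, 4, 1]
category_1_recall = recall_for_class(y_true, y_pred, positive=1)
```

In this toy case four of the five category 1 probes are recovered, so one promising candidate in five would be overlooked during screening.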
To validate the practical value of these predictions, which originate from our ML modeling of domain knowledge, we conducted dry lab simulations of theoretical sensing performance. We chose PFOS as the prototypical PFAS analyte and sodium dodecyl sulfate (SDS) as the interferent, based on previous work.69 While the binding energies with PFOS (ΔE_PFOS) and SDS (ΔE_SDS) are considered qualitative representations of sensitivity, their difference (ΔΔE_PFOS–SDS = ΔE_PFOS − ΔE_SDS) is used to evaluate selectivity. For initial screening, we conducted ab initio simulations in the standard vacuum condition with periodic boundary conditions, considering that zinc oxide (ZnO) and aluminum oxide (Al2O3) are inorganic materials. Beyond the 4 SGNN-predicted substances mentioned above, 6 more substances that have been reported as probes for PFOS sensing were also included for comparison:70 β-cyclodextrin (β-CD),71 ferrocenecarboxylic acid (FcCOOH),72,73 1H,1H,2H,2H-perfluorodecanethiol (FDT-SAM),74 o-phenylenediamine (o-PD),73 polyaniline75 (represented by aniline), and polypyrrole76 (represented by pyrrole). As shown in Fig. 7, in this simulation environment most substances show a higher propensity to bind SDS rather than PFOS, except for FcCOOH, which shows an exceptional 0.58 eV advantage of PFOS over SDS. Among the rest, although the two inorganic substances predicted by SGNN, Al2O3 and ZnO, show exceptional binding energies (between 2.42–3.79 eV) toward the two analytes, they prefer SDS over PFOS. In comparison, the other two, graphene and single-walled carbon nanotube (SWNT), show the smallest such disadvantage.
To examine the binding behavior more comprehensively, quantum chemistry DFT simulations with higher precision were first conducted in two different simulation environments: default vacuum and implicit water solvent field. As summarized in Fig. S3a and b,† the results in the different scenarios are generally consistent. Although FcCOOH's selectivity advantage is again confirmed, graphene, as predicted by our SGNN model, follows right behind in terms of ΔΔE_PFOS–SDS. To explore higher fidelity, an explicit solvent cluster of water molecules was added to the system. As shown in Fig. S3c,† though FcCOOH is still promising, graphene now shows a higher ΔΔE_PFOS–SDS of 7.94 eV, overtaking that of FcCOOH. To obtain deeper insights into the difference in binding mechanism, we examined the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) in Fig. S4.† It was concluded that FcCOOH likely has stronger electronic coupling and orbital interactions with PFOS, as indicated by drastically changed gap values, mostly due to its ability to engage in hydrogen bonding and electrostatic interactions with PFOS's sulfonic acid headgroup. In contrast, graphene's mechanism might proceed through weaker interactions like π–π stacking, hydrophobic interactions, and physisorption.77,78 This results in minimal HOMO–LUMO gap changes despite favorable binding energies. By combining these two probes (e.g., grafting FcCOOH onto a graphene channel), we can leverage FcCOOH's molecular specificity as well as graphene's high surface area, conductivity, and broad adsorption capacity. Such integration is expected to increase sensitivity and selectivity, yielding a more stable and reliable sensing platform for PFOS detection in real-world scenarios.
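The selectivity metric used in this comparison (the PFOS binding energy minus the SDS binding energy for the same probe) lends itself to a short ranking sketch. The probe names and energies below are placeholders, not values from the simulations, and the sign convention (larger positive energy means stronger binding) is an assumption inferred from how the results are discussed.

```python
def selectivity(delta_e_pfos, delta_e_sds):
    """Selectivity toward PFOS: binding energy with PFOS minus binding energy with SDS."""
    return delta_e_pfos - delta_e_sds


def rank_by_selectivity(binding_energies):
    """Sort probes from most to least PFOS-selective.

    binding_energies: dict mapping probe name to (dE_PFOS, dE_SDS) in eV.
    """
    return sorted(
        binding_energies,
        key=lambda probe: selectivity(*binding_energies[probe]),
        reverse=True,
    )


# Placeholder energies in eV, assuming larger values mean stronger binding.
binding = {
    "probe_A": (1.50, 1.20),  # binds PFOS more strongly than SDS
    "probe_B": (2.00, 2.60),  # strong binder overall, but prefers SDS
}
ranking = rank_by_selectivity(binding)
```

The toy numbers reproduce the qualitative point made above: a probe with large absolute binding energies can still rank low if it binds the interferent more strongly than the target.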
Fig. 7 (a) ΔE_PFOS and ΔE_SDS, (b) ΔΔE_PFOS–SDS of different probe substances in ab initio simulation with periodic boundary condition.
To further validate our hypothesis, we investigated PFOS binding to our probe systems under explicit solvent AIMD conditions. The results—illustrated in Fig. S5†—clearly show that while FcCOOH alone exhibits a significant binding advantage, the graphene–FcCOOH hybrid displays an even stronger relative binding affinity for PFOS. These findings support our prediction of a synergistic effect between graphene and FcCOOH for enhanced PFOS sensing.
In general, our dry lab simulation results qualitatively support graphene as a promising candidate for PFOS sensing against SDS, showing selectivity and sensitivity potentially comparable to those of FcCOOH, which was reported previously72,73 and validated in this study. The robustness, scalability, and environmental compatibility of graphene further highlight its practical advantages, aligning with the overarching goal of real-world sensor development.79 Moreover, owing to its superior electrical properties, graphene is already frequently used as the two-dimensional channel material of FET sensors. Our study suggests that, with potentially straightforward modifications of well-established graphene-based FET sensors, it may be possible to develop effective PFOS sensors, though here we have only demonstrated this using SDS as one representative interferent.
Our integrated approach advances FET sensor development through two key contributions: addressing the data scarcity challenge and providing a predictive framework for screening novel materials and architectures. In summary, we present a comprehensive methodology that bridges the gap between the need for comprehensive, high-quality data and the development of advanced ML models for FET chemical sensors' design and optimization. By integrating neuromorphic SGNNs with enriched datasets, we offer a powerful tool for researchers to navigate the complexities of FET sensor development and to unlock new possibilities in sensor technology.
Despite these advancements, there is still room for improvement. The current SGNN architecture and dataset oversimplify the graph representation of sensing systems, potentially overlooking complex interactions between targets, probes, and media. For instance, capturing how these interactions change under varying pH, ionic strength, and temperature will be essential for refining model predictions and ensuring real-world relevance. Enhanced graph architectures that capture multi-level relationships and complex interactions may provide a more comprehensive understanding of the underlying mechanisms.
Furthermore, the dataset's lack of temporal dynamics restricts the SNN's ability to fully utilize its event-driven capabilities. Although the SNN we implemented relies on intrinsic spiking features, our dataset contained no explicit time-dependent or event-driven features, which limits the SNN's impact. More thorough modeling of dynamic sensor behaviors would be possible with the inclusion of time-series data, such as response kinetics, recovery times, or signal drift patterns over operational lifetimes. The combination of more flexible, hierarchical graph representations and time-series data that account for dynamic behaviors is a research direction that we will pursue in future work, allowing us to generalize the SGNN framework across different types of sensing design.
Additionally, real-world experiments—as planned for future work—will enhance the robustness and generalizability of our proposed SGNN framework. While our current simulation results (under realistic solvation conditions) successfully validate our model's predictions, wet lab experiments will serve as the ultimate benchmark, being the final step to close the loop and confirm the SGNN results.
Regarding alternative applications, one promising direction is biosensing.80,81 The proposed SGNN framework could indeed be adapted to optimize the sensing design to detect biomarkers—e.g., proteins, nucleic acids, metabolites—or even pathogens (via viral or bacterial genetic material).80,81 This application would require biological datasets with interaction patterns between proteins, DNA/RNA sequences, and sensor surfaces. In this context, we could encode, for instance, enzymatic activity features as a graph representation, while using the SNN to represent the biomolecular temporal dynamics. On top of that, bio-sensing applications can also include wearable devices82—especially if we use 2D materials due to their flexibility82,83—for real-time monitoring of glucose, lactate, and cortisol levels.
In the biosensing context, it is worth understanding the relationship between our SGNN framework and some state-of-the-art models, such as AlphaFold 3.84,85 Even though it is quite challenging to perform a fair comparison due to their substantial differences in data encoding and main objectives—classifying sensor performance based on material properties versus protein structure prediction from amino-acid sequences—we can still use our model for combining different data formats. For example, the GNN would be the “global encoder” with nodes containing global descriptors (e.g., molecular weight, polarity, charge) and edges with chemical bonding features (e.g., hydrogen bonding potentials, electrostatic interactions). The SNN, on the other hand, would serve as the “local and temporal encoder”, capturing stepwise folding kinetics and transient structural states.
Expanding the dataset to include real-world metrics, such as stability, interference effects, and automated synthesis with high-throughput validation, could transform this framework into a closed-loop system for rapid sensor development. These advancements would further enhance the design and optimization of sensors for diverse applications, extending the methodology to tackle pressing challenges in environmental monitoring and healthcare diagnostics.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4me00203b
‡ These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2025