Open Access Article
Monika Poonia*a, Kathryn Terceiro b and Geoffrey D. Bothun*b
aDepartment of Civil and Environmental Engineering, Wayne State University, Detroit, MI 48202, USA. E-mail: monika.poonia@wayne.edu
bDepartment of Chemical, Biomolecular, and Materials Engineering, University of Rhode Island, Kingston, RI 02881, USA. E-mail: gbothun@uri.edu
First published on 20th February 2026
Per- and polyfluoroalkyl substances (PFASs), such as the legacy C8 compound perfluorooctanoic acid (PFOA), pose significant environmental and health risks due to their persistence and widespread use. While surface-enhanced Raman spectroscopy (SERS) has shown promise for PFAS detection, challenges remain in achieving high sensitivity and understanding the underlying molecular interactions. This study combines fluorinated thiol-modified SERS substrates with machine learning techniques to enhance PFOA detection and elucidate fluorophilic interactions. Three different fluorinated thiols were used to modify SERS substrates, and their performance in PFOA detection was compared to bare substrates. Surface modification improved SERS signal enhancement and lowered detection limits compared to unmodified surfaces. Machine learning algorithms, including partial least squares-discriminant analysis (PLS-DA), partial least squares regression (PLSR), and support vector machine (SVM) regression, were employed to classify and quantify PFOA based on Raman spectral features. The PLS-DA model successfully distinguished ligand-specific interaction patterns, demonstrating strong robustness and predictive performance, while SVM regression achieved a limit of detection of 3.61 ppb. We further validated the proposed machine learning-guided SERS approach for PFOA detection directly in water, demonstrating ligand-specific Raman responses arising from fluorophilic ligand–PFOA interactions. A linear and sensitive response was observed at environmentally relevant concentrations, confirming the practical applicability of the approach for aqueous PFAS monitoring. This approach offers a novel SERS-based strategy for detecting PFOA, with results anticipated to inform future developments in applying this strategy to a broader range of PFAS compounds and more complex environmental matrices.
Environmental significance
Per- and polyfluoroalkyl substances (PFASs) are a group of persistent, toxic chemicals that pose significant risks to human health and the environment due to their widespread presence in water sources and ecosystems. The detection and quantification of PFAS in environmental samples remain challenging due to their unique physicochemical properties, including high chemical stability and low polarizability. Surface-enhanced Raman spectroscopy (SERS) is a promising technique for the detection of PFAS, but weak interactions between PFAS molecules and traditional SERS substrates lead to poor signal enhancement and limited sensitivity. Aided by machine learning and using perfluorooctanoic acid (PFOA) as a model PFAS, we have identified fluorophilic interactions between PFAS and functionalized gold nanostructures at nanomolar concentrations that may provide a new approach to selective PFAS detection.
SERS provides remarkable sensitivity for identifying and analyzing chemical compounds and molecular structures and is widely applicable in the field of environmental monitoring.9–11 The Raman signal intensity of analyte molecules can be amplified by a factor of 10⁶ to 10¹⁰ when they are adsorbed onto specifically engineered plasmonic nanostructures.12 However, the detection of PFAS using SERS faces several challenges, including weak interactions between PFAS molecules and traditional SERS substrates, leading to poor signal enhancement and limited sensitivity. Several strategies have been reported to improve sensitivity. Fang et al. employed cationic dyes (ethyl violet and methyl blue) to facilitate the adsorption of PFOA from firefighting foams onto graphene oxide mixed with colloidal silver nanoparticles. This method achieved a detection limit of 50 ppb for PFOA.13 An improvement in sensitivity was demonstrated using jet-printed silver nanoparticles and graphene on Kapton as SERS substrates with reported PFOA detection at 1 nM or approximately 0.4 ppb.4 Detection was attributed to the ability of graphene to adsorb analytes. Park et al. developed silver nanograss substrates coated with self-assembled p-phenylenediamine nanoparticles, achieving a detection limit of 1.28 pM (0.53 ppt) for PFOA in distilled water.14 Feng et al. synthesized Ag NP/Au@Ag core–shell nanorod SERS substrates capable of detecting PFOA, PFHxA, and PFBS with a detection limit of 0.1 ppm.15 Lastly, Lada et al. employed SERS with Ag nanocolloid suspensions to detect methylene blue as an indicator for both short- and long-chain PFAS, achieving a detection limit of 5 ppt.16
Recent studies have explored the use of functionalized SERS substrates to improve the detection of PFAS. Rothstein et al. enhanced silver nanorod (AgNR) substrates through alkanethiol functionalization for SERS applications, achieving limits of detection (LODs) of 1 ppt for PFOA and 4.28 ppt for PFOS.7 The incorporation of fluorinated compounds into SERS substrates has shown potential in enhancing the capture of PFAS through fluorous interactions.17 Fluorophilic interactions can potentially enhance the affinity between PFAS molecules and SERS substrates, leading to improved detection sensitivity. Fluorinated thiols have shown promise in leveraging both fluorophilic and electrostatic interactions for PFAS capture.18–20 However, the underlying mechanisms of these interactions and their impact on improving SERS detection remain poorly understood. The complexity of PFAS–substrate interactions and the vast number of possible fluorinated thiol structures make it challenging to identify optimal combinations using traditional experimental approaches alone. Machine learning (ML) algorithms have demonstrated their ability to handle complex datasets, consider variable interactions, and build predictive models in various scientific domains.6,21,22 The integration of ML with SERS offers the potential to extract meaningful information from complex spectral data and provide insights into the underlying molecular interactions.6,7,23 For example, Rothstein et al. showed that combining Raman and SERS spectroscopies with ML, specifically support vector machine and support vector regression models, enabled classification of PFOA, PFOS, and reference samples with up to 95% accuracy and achieved low detection limits for both PFOA and PFOS.7 However, there is a gap in applying supervised ML methods specifically to optimize fluorophilic interactions for SERS-based PFAS detection.
In this study, we present an innovative strategy that integrates fluorinated thiol-functionalized SERS substrates with ML techniques to improve the detection and analysis of fluorophilic interactions. The fluorinated species exhibit overlapping vibrational bands, and small interaction-driven spectral shifts cannot be reliably isolated using conventional SERS spectral inspection. ML-based classification and regression therefore provide a quantitative means to distinguish subtle yet chemically meaningful variations arising from fluorophilic interactions.24 The goal is to develop a foundational SERS-based method for detecting PFAS in environmental samples, initially focusing on PFOA as a representative target compound. We systematically explore various fluorinated thiols as surface modifiers and evaluate their effectiveness in detecting PFOA compared to unmodified SERS substrates. A commercially available gold-coated nanostructured SERS platform, selected for its reproducibility and robustness, serves as the sensing surface.25,26 By leveraging ML algorithms such as partial least squares-discriminant analysis (PLS-DA), partial least squares regression (PLSR), and support vector machine (SVM) regression, we aim to elucidate the interaction mechanisms between the fluorinated thiols and PFOA. We assess the analytical capability, sensitivity, and detection limits of the proposed method, which is essential before applying the technique to more complex sample types. Ultimately, this work lays the groundwork for expanding the methodology to a broader spectrum of PFAS compounds and transitioning toward the analysis of complex, real-world environmental samples in future investigations.
Surface modification of the substrates was carried out by forming self-assembled monolayers (SAMs) on plasma-cleaned and pre-cleaned SERS substrates. Three distinct thiol solutions were prepared in ethanol, each at a concentration of 1 mM. The cleaned substrates were immersed in 10 mL of the respective thiol solutions for 24 h. A close-packed thiol monolayer is expected to have 4–6 molecules per nm².28,29 Using 5 molecules per nm² as a conservative estimate gives a required number of 4.5 × 10¹³ molecules to form a monolayer on the substrate. The 1 mM solution in 10 mL contains approximately 6 × 10¹⁸ molecules, or 1.3 × 10⁵ times the number needed for monolayer coverage. Thus, the chosen 1 mM concentration provides a large molar excess and was selected to favor rapid assembly and high surface coverage. After the incubation period, the substrates were thoroughly rinsed with ethanol to remove any unbound thiol molecules from the surface and dried under ambient conditions for 30 min. We used bare gold-coated substrates as controls for comparison with the functionalized substrates.
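The excess estimate above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the substrate area it prints is back-calculated from the stated monolayer requirement and packing density rather than reported in the text.

```python
AVOGADRO = 6.022e23  # molecules per mole

def molecules_in_solution(conc_mol_per_l, volume_l):
    """Number of solute molecules in a solution of given molarity and volume."""
    return conc_mol_per_l * volume_l * AVOGADRO

# Values stated in the text
thiol_available = molecules_in_solution(1e-3, 10e-3)  # 1 mM in 10 mL
monolayer_required = 4.5e13                            # molecules for full coverage
packing_density = 5.0                                  # molecules per nm^2

# How many times more thiol is supplied than a monolayer needs
excess_ratio = thiol_available / monolayer_required

# Assumption: area implied by the two stated numbers, not reported directly
implied_area_mm2 = monolayer_required / packing_density * 1e-12  # nm^2 -> mm^2

print(f"molecules available:   {thiol_available:.2e}")
print(f"excess over monolayer: {excess_ratio:.2e}")
print(f"implied active area:   {implied_area_mm2:.1f} mm^2")
```

The ratio comes out on the order of 10⁵, in line with the stated 1.3 × 10⁵-fold excess.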
000 and further normalized to the maximum intensity. Background-subtracted spectra were analyzed with OriginPro (Version 2019b, OriginLab Corp., Northampton, MA) and MATLAB (MATLAB R2022a, The MathWorks, Inc.). A partial least squares-discriminant analysis (PLS-DA) model was constructed using the 10-fold Venetian blind (VB) cross-validation method in MATLAB with PLS_Toolbox 8.7.1 (Eigenvector Research, Inc., Manson, WA, USA). PLS2-DA was applied to construct a single multi-class model.32 The number of latent variables (LVs) for each model was chosen as suggested by the software, based on the root mean square error (RMSE) of cross-validation. To quantitatively assess PFOA concentrations across different substrate treatment conditions and measure LODs, linear regression models were developed using the VB cross-validation method. For comparative analysis, support vector machine (SVM) regression models were also implemented, using the same VB cross-validation approach to ensure methodological consistency.
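For readers unfamiliar with Venetian blind cross-validation, the following minimal sketch shows how the splits are formed: every k-th sample is held out in turn, so each fold spans the whole ordered data set. It assumes one sample per blind and uses a hypothetical helper name; it is not the PLS_Toolbox implementation.

```python
def venetian_blind_folds(n_samples, n_splits=10):
    """Return (train_idx, test_idx) pairs; sample i is held out in fold i % n_splits."""
    folds = []
    for k in range(n_splits):
        test = [i for i in range(n_samples) if i % n_splits == k]
        train = [i for i in range(n_samples) if i % n_splits != k]
        folds.append((train, test))
    return folds

# Hypothetical example: 23 spectra split into 10 interleaved folds
folds = venetian_blind_folds(23, n_splits=10)
for k, (train, test) in enumerate(folds[:3]):
    print(f"fold {k}: held-out spectra {test}")
```

Because the held-out samples are interleaved, the ordering of spectra matters: if replicates of the same concentration are stored consecutively, each fold still contains samples from across the concentration range.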
| Element (atomic%) | Bare | TFMB | TFMtFTP | TDFOT |
|---|---|---|---|---|
| Au 4f | 43.48 | 39.29 | 34.47 | 22.44 |
| O 1s | 15.68 | 9.28 | 7.44 | 5.62 |
| C 1s | 36.68 | 35.14 | 25.25 | 24.75 |
| F 1s | — | 14.76 | 29.55 | 45.42 |
| S 2p | — | 1.32 | 3.28 | 1.77 |
Fig. 1b–d show the high-resolution XPS spectra of C 1s, S 2p, and F 1s, respectively. The C 1s spectra (Fig. 1b) exhibit a dominant peak at ∼285.4 eV for both the unmodified and TFMB-modified substrates, corresponding to carbon. In contrast, the C 1s peaks for the TFMtFTP and TDFOT modified substrates show a slight shift toward higher binding energies, indicating changes in the chemical environment of carbon upon thiol functionalization. Additional features in the 290–292 eV range are attributed to oxidized carbon species, such as carbonyl (C=O) or carboxyl (O–C=O) functional groups, consistent with reported C 1s assignments in this energy region.35–37
Sulfur was detected in all thiol-modified substrates through the presence of an S 2p peak at ∼162.7 eV (Fig. 1c), a characteristic signature of thiolate bonding to gold.38 This sulfur signal is absent in the unmodified substrate, further confirming successful surface modification. Atomic percentage analysis also detected S 2p exclusively in the functionalized substrates, providing strong evidence for the formation of self-assembled monolayers on the gold surface. The F 1s spectra (Fig. 1d) reveal a fluorine peak in the 687.7–688.6 eV range for all thiol-modified substrates,39–41 while no fluorine signal is observed for the unmodified SERS substrate. This peak arises from the fluorinated moieties present in the thiol ligands, confirming their attachment to the substrate surface. A slight shift in F 1s binding energy toward lower values is observed for the TFMB- and TFMtFTP-modified substrates. In addition, the TDFOT-modified substrate exhibits a notably higher fluorine atomic percentage (Table 1), consistent with its fluorine-rich molecular structure.
The SERS spectra of a bare gold substrate and substrates modified with thiol SAMs are shown in Fig. 2c. The SERS spectrum of the gold-coated silicon substrate shows a sharp silicon peak at 520 cm−1 along with a broad, low-intensity background due to the gold surface.25 The three thiol compounds exhibited robust Raman signals, indicating successful chemisorption on the gold-coated SERS substrate due to the strong affinity between the thiol groups and the gold surface, resulting in the formation of stable gold–sulfur (Au–S) bonds. Each thiol compound has a unique spectral signature in the fingerprint region (200–1800 cm−1), reflecting its molecular structure. The detailed peak assignments, corresponding vibrational modes, and relevant literature references are summarized in Table S1 in the SI. In the lower wavenumber region, prominent bands observed at 311 and 337 cm−1 correspond to Au–S bending modes, indicating strong chemisorption of the sulfur groups on the gold surface.42 The bands near 412 and 477 cm−1 arise from C–S stretching vibrations, further supporting the formation of a stable thiolate–gold interface on the functionalized substrates.42,43 The spectral peaks observed at 290 and 382 cm−1 are attributed to C–C and CF2 vibrational modes, respectively.24,44
Key features in the TFMB SERS spectrum include strong peaks around 1000–1600 cm−1 due to aromatic ring vibrations.45 The peak near 1000 cm−1 is indicative of the benzene ring breathing mode and the peak at 1616 cm−1 is attributed to C=C stretching vibrations within the benzene ring. A range of signals is observed between 1100–1350 cm−1, corresponding to C–F stretching vibrations, and a notable strong peak at 1222 cm−1 is associated with the C–F stretching vibration specific to the trifluoromethyl group.45,46 Potential C–S stretching vibrations appeared in the 600–800 cm−1 region.43,45,46
The SERS spectrum of TFMtFTP exhibited notable similarities to that of TFMB, with the dominant spectral characteristics primarily originating from the aromatic ring and trifluoromethyl moieties. Subtle variations in peak positions were observed, likely due to the specific location of the trifluoromethyl group on the molecule. TFMtFTP exhibited pronounced features at 828 cm−1 and 1390 cm−1 corresponding to the symmetric C–F stretching modes of its tetrafluorinated benzene ring, and at 1643 cm−1 attributed to C–C stretching in benzene ring. The increased fluorination in TFMtFTP resulted in a more pronounced SERS signal intensity compared to TFMB, highlighting the influence of the electronegative fluorine atoms on the overall SERS response.47–49
The SERS spectrum of TDFOT modified surfaces exhibited characteristic peaks at 1360 cm−1 and 1420 cm−1, attributed to C–F stretching vibrations in the fluorinated alkyl chain. The thiol group C–S stretching mode is observed in the 640–700 cm−1 range.43 The spectral features corresponding to the C–C and CF3 stretching vibrations are observed in the 720–767 cm−1 region.7 Despite its extended perfluorinated chain, TDFOT likely contributes minimally to the overall SERS signal intensity due to its weaker Raman scattering compared to the aromatic structures found in TFMB and TFMtFTP. Aromatic thiols on gold surfaces exhibit sharp and intense spectral peaks, particularly associated with aromatic ring vibrations and C–S stretching, which reflect strong π-system interactions and significant charge transfer at the metal–ligand interface. In contrast, long-chain fluorinated thiols display broader and less intense peaks dominated by C–F stretching and backbone modes, as a result of their distinct packing arrangements and comparatively weaker interactions with the gold substrate.47–51 Additionally, substrate restructuring upon thiol adsorption, especially for long-chain thiols, has been reported, leading to changes in the gold surface morphology and electronic properties.50,52
The SERS analysis of PFOA on gold-coated substrates (Fig. 2a) showed spectral changes as the concentration of PFOA increased. At 0 nM PFOA concentration, several peaks were observed in the SERS spectrum. These peaks can be attributed to gold background and methanol signals. The presence of these background peaks is consistent with previous studies that have reported Raman signals from gold nanostructures and methanol on SERS substrates.25,53,54 The intensity of the peaks attributed to gold and methanol initially increased and then gradually decreased as the PFOA concentration increased (Fig. S1a in SI). PFOA molecules may adsorb onto the gold surface, altering the plasmonic properties and consequently affecting the background signal intensity.14 The presence of PFOA might also influence the orientation or interaction of methanol molecules with the gold surface, leading to changes in the methanol SERS signal.4
PFOA itself has relatively weak Raman scattering, making direct detection challenging.13 The fluorinated thiols create a functional layer on the SERS substrate that increases the affinity for PFOA through hydrophobic and fluorine–fluorine interactions.16 As discussed in Fig. 2c, the thiol molecules themselves act as SERS reporters, providing strong and distinct spectral features. While these signals are stronger than those of PFOA, they serve as a sensitive backdrop against which small spectral changes induced by PFOA can be detected.4 By using the thiols as intermediary sensors, we leverage their strong SERS activity to indirectly probe PFOA. The presence of PFOA alters the local environment of the thiol molecules, leading to subtle changes in their SERS spectra. These changes, such as peak shifts, intensity variations, or the appearance of new features, indirectly indicate the presence and concentration of PFOA.7
The SERS spectra revealed notable changes upon the addition of PFOA to the functionalized substrates. For the TFMB-modified surface, as illustrated in Fig. 3b and S1b, a gradual increase in intensity was observed for peaks associated with C–F stretching (1074 and 1222 cm−1) vibrations as PFOA concentration increased. Conversely, the intensity of the C=C stretching band at 1616 cm−1 decreased with increasing PFOA concentration. Interestingly, another C–F stretching peak exhibited a decrease in intensity and a slight shift from 1340 cm−1 to 1329 cm−1. This shift suggests a potential interaction between the fluorinated groups of TFMB and PFOA. Similarly, the TFMtFTP-functionalized substrate displayed comparable sensitivity to PFOA. As shown in Fig. 3c and S1c, the intensities of peaks corresponding to C=C stretching and C–F stretching vibrational modes also increased gradually with rising PFOA concentration. Interestingly, another C–F stretching peak at 1074 cm−1 exhibited a decrease in intensity.
The TDFOT-modified substrate exhibited a distinct response to PFOA compared to the other functionalized surfaces. Notably, a significant decrease in peak intensities was observed in the lower wavenumber region (400–1100 cm−1). In the 650–800 cm−1 region, the Raman features of TDFOT do not decrease uniformly with increasing PFOA concentration. Instead, mode-specific behavior is observed. The peak at ∼767 cm−1, assigned to the CF3 stretching vibration, shows the largest and most consistent decrease, indicating strong perturbation by PFOA adsorption. In contrast, the neighboring bands at 720–748 cm−1, associated with C–C and CF3 stretching modes, exhibit smaller or non-monotonic changes: the 720 cm−1 band decreases only slightly, while the 748 cm−1 mode initially increases before decreasing at higher concentrations. These inconsistent variations reflect differences in how each vibrational mode interacts with the local chemical environment.55 In the higher wavenumber range (1150–1616 cm−1), the peaks became broader and showed either red or blue shifts of approximately 10 wavenumbers. These spectral changes suggest the occurrence of hydrophobic interactions between the fluorinated chains of TDFOT and PFOA. The TDFOT thiol, which shares structural similarities with PFOA, also demonstrated new peaks emerging around 1120 and 1538 cm−1 in the presence of PFOA (Fig. 3d). The intensity of these peaks increased with the PFOA concentration, indicating a direct correlation between the spectral response and the amount of PFOA present. These observations highlight the sensitivity of the TDFOT-functionalized substrate to PFOA and suggest that the structural similarity between TDFOT and PFOA may contribute to enhanced detection capabilities.56 Small error bars for the main Raman peak intensities, shown in Fig. S1a–d, demonstrate the high reproducibility and consistency among SERS spectra collected at the same PFOA concentration.
The high electronegativity and low polarizability of fluorine atoms create an electron-deficient environment along PFOA's fluorocarbon (C–F) chain.57 This environment drives selective interactions with electron-rich thiol groups, through a synergistic interplay of dipole–dipole forces and lone pair–π interactions, collectively termed as C–F⋯F–C intermolecular dispersion interactions, or fluorophilic interactions.20,57,58 These fluorophilic interactions enable PFOA to preferentially adsorb onto thiol-functionalized surfaces, thereby concentrating the analyte near the SERS substrate's plasmonic hotspots.58
The classification models incorporated three latent variables: LV1, LV2, and LV3, which accounted for 50.78%, 34.36%, and 12.72% of the variance, respectively. As illustrated in Fig. 4a, these three LVs effectively differentiated all four types of surface treatments and captured the majority of the variance. The PLS-DA model's performance was evaluated using specificity, sensitivity, and root mean square error (RMSE) metrics, as presented in Table 2. The results demonstrated the model's high effectiveness and robustness, achieving clear classification of sample groups (with sensitivity and specificity values of 1).
| Surface treatment | Specificity | Sensitivity | RMSE |
|---|---|---|---|
| Bare (n = 69) | 1.00 | 1.00 | 0.058 |
| TFMB (n = 71) | 1.00 | 1.00 | 0.023 |
| TFMtFTP (n = 72) | 1.00 | 1.00 | 0.049 |
| TDFOT (n = 69) | 1.00 | 1.00 | 0.056 |
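As a reminder of how the sensitivity and specificity values in Table 2 are defined, the sketch below computes both from a multi-class confusion matrix in a one-vs-rest fashion. The diagonal matrix is constructed here from the class sizes in Table 2 and the reported values of 1.00; the second matrix, with one misclassified spectrum, is purely hypothetical and included only to show how the metrics would change.

```python
def per_class_metrics(confusion, labels):
    """confusion[i][j] = number of spectra of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, name in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                  # class k spectra assigned elsewhere
        fp = sum(row[k] for row in confusion) - tp   # other spectra assigned to class k
        tn = total - tp - fn - fp
        metrics[name] = {
            "sensitivity": tp / (tp + fn),  # true positive rate
            "specificity": tn / (tn + fp),  # true negative rate
        }
    return metrics

labels = ["Bare", "TFMB", "TFMtFTP", "TDFOT"]
sizes = [69, 71, 72, 69]  # class sizes from Table 2

# Perfect classification, as implied by sensitivity = specificity = 1.00
perfect = [[sizes[i] if i == j else 0 for j in range(4)] for i in range(4)]
perfect_metrics = per_class_metrics(perfect, labels)

# Hypothetical case: one TFMB spectrum assigned to TFMtFTP
one_error = [row[:] for row in perfect]
one_error[1][1] -= 1
one_error[1][2] += 1
error_metrics = per_class_metrics(one_error, labels)

print(perfect_metrics["TFMB"])
print(error_metrics["TFMB"], error_metrics["TFMtFTP"])
```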
The variable importance in projection (VIP) scores were used to measure the influence of the initial X variables (i.e., Raman shift) on the classification model. A VIP score estimates how important a variable is to a PLS-DA model; a spectral variable with a VIP score close to or greater than 1 is considered discriminative between two or more groups. Fig. 4b shows the VIP scores evaluated from the PLS-DA classification model. As evident from the spectral differences between all four surface treatments and PFOA interactions, the VIP scores also show that the aromatic and aliphatic types of thiols and PFOA conformations are the most discriminative in the model. The spectral regions at 415 cm−1, 477 cm−1, 633–671 cm−1, 745 cm−1, 826 cm−1, 1222 cm−1, 1360–1390 cm−1, 1560 cm−1 and 1616–1644 cm−1 exhibit VIP scores greater than one. These bands are discussed in detail in the previous section and are consistent with the vibrational assignments reported by Cho et al.24 and Kumar et al.,61 whose experimental and DFT-supported analyses provide reference positions for C–F stretching, CF2 deformation, and CF3 modes. This demonstrates that these wavenumbers serve as discriminative spectral features for distinguishing the thiol groups in the PLS-DA classification model.
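To make the VIP threshold of 1 concrete, the sketch below implements the commonly used VIP definition, in which each variable's squared, normalized PLS weight is averaged over latent variables with weights proportional to the Y-variance each latent variable explains. This is a minimal illustration with a hypothetical function name and toy matrices; the exact computation inside PLS_Toolbox may differ in detail.

```python
import math

def vip_scores(weights, scores, y_loadings):
    """VIP for each X variable.

    weights:    p x A PLS weight matrix (list of rows, one per variable)
    scores:     n x A X-score matrix (one row per sample)
    y_loadings: m x A Y-loading matrix (one row per response/class column)
    """
    p, n_lv = len(weights), len(weights[0])
    # Y-variance explained by latent variable a: (t_a . t_a) * (q_a . q_a)
    ss = [
        sum(row[a] ** 2 for row in scores) * sum(row[a] ** 2 for row in y_loadings)
        for a in range(n_lv)
    ]
    w_norm2 = [sum(row[a] ** 2 for row in weights) for a in range(n_lv)]
    total = sum(ss)
    return [
        math.sqrt(p * sum(ss[a] * weights[j][a] ** 2 / w_norm2[a] for a in range(n_lv)) / total)
        for j in range(p)
    ]

# Toy (hypothetical) model: 4 spectral variables, 3 samples, 2 latent variables
W = [[0.8, 0.1], [0.5, -0.2], [0.2, 0.9], [0.1, 0.3]]
T = [[1.0, 0.2], [-0.5, 0.4], [-0.5, -0.6]]
Q = [[0.9, 0.3]]

vips = vip_scores(W, T, Q)
mean_sq = sum(v * v for v in vips) / len(vips)
print([round(v, 3) for v in vips], round(mean_sq, 6))
```

With this definition the mean of the squared VIP scores is exactly 1, which is why a score above 1 marks a variable as contributing more than average to the model.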
The receiver operating characteristic (ROC) curve was also plotted to examine the efficiency and performance of the PLS-DA classification model. The ROC curve plots sensitivity (true positive rate) against 1-specificity (false positive rate) for each surface modification, and the efficiency of the PLS-DA model is summarized by the area under the curve (AUC). Fig. S2 in SI shows the ROC with an AUC value of 1.00, indicating a highly accurate model, whereas an AUC near 0.5 would indicate a model that performs no better than chance. PLS-DA showed that the SERS data sets for the different surface treatments and PFOA interactions are well differentiated and provided a best-fit model to discriminate the interaction process of PFOA. While the presented SERS-based detection method demonstrates promising performance under controlled conditions, it is important to acknowledge that environmental samples often contain potential interferents such as chloride, sulfate, and surfactants, which may affect signal intensity and reproducibility. Although the functionalized thiol layer was designed to enhance analyte-specific interactions and minimize nonspecific binding, the presence of structurally similar fluorinated compounds (e.g., PFOS) may still lead to spectral overlap and reduced classification specificity. This limitation highlights the need for future studies to incorporate a broader range of co-existing substances to assess robustness and selectivity in more complex matrices.
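The AUC used in the ROC analysis above has a simple interpretation: it is the probability that a randomly chosen in-class spectrum receives a higher model score than a randomly chosen out-of-class spectrum. The sketch below computes it directly from that definition; the function name and the score lists are hypothetical and are not taken from the study's data.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical class-membership scores
auc_separated = roc_auc([0.9, 0.8, 0.95], [0.1, 0.2, 0.05])   # classes fully separated
auc_chance = roc_auc([0.5, 0.5], [0.5, 0.5])                  # scores carry no information
auc_inverted = roc_auc([0.1, 0.2], [0.8, 0.9])                # ranking systematically reversed

print(auc_separated, auc_chance, auc_inverted)
```

An AUC of 1.00 therefore means every in-class spectrum scored above every out-of-class spectrum, 0.5 corresponds to chance-level ranking, and values near 0 indicate a systematically inverted ranking.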
To assess whether chemometric classification can reliably differentiate PFOA-exposed substrates from unexposed substrates, independent of surface ligand chemistry, we performed PLS-DA separately on all four substrate conditions using spectra collected without and with PFOA in methanol (Fig. S4). Across all substrates, the PLS-DA score plots show a clear two-class separation with well-defined clustering of the methanol and PFOA samples. The corresponding cross-validated Y-prediction plots further confirm the robustness of this discrimination, showing very little to no overlap between classes for each substrate type (Fig. S5). These results demonstrate that the model identifies PFOA-specific spectral features that are consistent across all ligand chemistries and are not driven by inherent differences between the substrates themselves.
The validity of the PLSR model was evaluated in terms of method linearity and accuracy. The linearity of the method was represented by the coefficient of determination (R²) between the actual concentration of PFOA and the values predicted by the PLSR model for the test samples. The PLS regression models demonstrate varying degrees of effectiveness in PFOA detection across the different surface modifications (Fig. 5). The bare gold-coated substrate showed moderate performance with an R² of 0.83 and a root mean square error of cross-validation (RMSECV) of 11.09 (Fig. 5a). This suggests that while the unmodified surface can detect PFOA, there is room for improvement in terms of prediction accuracy and precision. Among the thiol-modified surfaces, TDFOT demonstrated the best performance with the highest R² (0.95) and lowest RMSECV (6.09) (Fig. 5d). The RMSECV is a measure of the model's predictive performance, and a lower RMSECV indicates a more accurate and precise predictive model. This indicates that TDFOT modification significantly enhances the substrate's ability to detect and quantify PFOA concentrations with high accuracy and precision. TFMB modification also showed excellent results (Fig. 5b), with an R² of 0.93 and RMSECV of 7.44, suggesting that it too provides a substantial improvement over the bare substrate for PFOA detection. As illustrated in Fig. 5c, TFMtFTP modification showed the weakest detection capability among the modified surfaces, with an R² of 0.84 and RMSECV of 11.45.
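The two figures of merit used above can be computed from the actual concentrations and the cross-validated predictions as sketched below. The data are hypothetical and in arbitrary concentration units, and R² is taken here as the squared Pearson correlation between actual and predicted values, which is an assumption about how the reported values were obtained.

```python
import math

def rmsecv(actual, predicted):
    """Root mean square error between actual and cross-validated predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_a * var_p)

# Hypothetical parity data (actual vs. cross-validated prediction)
actual = [0.0, 10.0, 20.0, 30.0, 40.0]
predicted = [2.0, 9.0, 21.0, 28.0, 41.0]

err = rmsecv(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"RMSECV = {err:.2f}, R2 = {r2:.3f}")
```

A lower RMSECV and an R² closer to 1 both indicate that the points on the parity plot lie closer to the diagonal.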
The support vector machine regression (SVR) models depicted in Fig. S3 exhibit distinct performance levels in PFOA detection across the surface modifications. Analysis showed RMSECV values of 8.06 (bare substrate), 7.77 (TFMB), 5.67 (TFMtFTP), and 2.88 (TDFOT-modified substrate), indicating progressively improved accuracy. Table 3 compares the detection limits (parts per billion or μg L−1; 1 ppb = 2.42 nM PFOA) derived from PLS and SVM regression models for each modification. The detection limit for both models was estimated using the validation dataset and defined as the lowest concentration for which the model's prediction error is smaller than the analyte concentration and statistically distinguishable from the blank within the 95% confidence interval of the calibration residuals. This approach is consistent with commonly used chemometric criteria for multivariate calibration models.62,63 PLS regression yielded detection limits of 13.77 (bare), 9.248 (TFMB), 14.22 (TFMtFTP), and 7.56 ppb (TDFOT), whereas SVM regression achieved enhanced sensitivity with limits of 10.02, 6.44, 7.08, and 3.60 ppb, respectively. These results highlight SVR's superior performance in lowering detection thresholds across all substrate modifications. At low PFOA concentrations, a small number of PLSR-generated predictions fell below the blank, which is attributed to model-dependent sensitivity to spectral noise and baseline fluctuations. PLSR, particularly when multiple latent variables are retained, can partially overfit low-signal regions, producing slight negative deviations.64 In contrast, SVM regression applied to the same dataset did not exhibit this behavior (Fig. S3) and yielded lower detection limits, reflecting its stronger regularization and reduced susceptibility to noise.
A detailed comparison table (Table S2 in the SI) presents our LOD results alongside those reported for similar SERS-based techniques and liquid chromatography-mass spectrometry (LC-MS) methods in recent studies.
| Limit of detection (ppb) | Bare | TFMB | TFMtFTP | TDFOT |
|---|---|---|---|---|
| PLSR | 13.77 | 9.24 | 14.22 | 7.56 |
| SVR | 10.02 | 6.44 | 7.08 | 3.60 |
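Because the spectra are discussed in nM while the detection limits are tabulated in ppb, a short conversion sketch may be helpful. It assumes that ppb means μg L−1, as stated in the text, and uses a PFOA molar mass of 414.07 g mol−1 (C8HF15O2); the function and dictionary names are hypothetical.

```python
PFOA_MOLAR_MASS = 414.07  # g/mol for C8HF15O2

def ppb_to_nM(ppb, molar_mass=PFOA_MOLAR_MASS):
    """Convert ug/L (ppb in dilute solution) to nmol/L."""
    return ppb * 1000.0 / molar_mass

def nM_to_ppb(nM, molar_mass=PFOA_MOLAR_MASS):
    """Convert nmol/L to ug/L (ppb in dilute solution)."""
    return nM * molar_mass / 1000.0

# Detection limits from Table 3 (ppb)
lod_ppb = {
    "PLSR": {"Bare": 13.77, "TFMB": 9.24, "TFMtFTP": 14.22, "TDFOT": 7.56},
    "SVR": {"Bare": 10.02, "TFMB": 6.44, "TFMtFTP": 7.08, "TDFOT": 3.60},
}

lod_nM = {
    model: {surface: ppb_to_nM(value) for surface, value in row.items()}
    for model, row in lod_ppb.items()
}

for model, row in lod_nM.items():
    print(model, {surface: round(value, 1) for surface, value in row.items()})
```

The conversion reproduces the stated equivalence of 1 ppb to about 2.42 nM, so the lowest tabulated limit (SVR on TDFOT) corresponds to roughly 9 nM.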
The VIP score analysis reveals that different surface modifications lead to distinct spectral signatures in PFOA detection (Fig. 6). The bare gold substrate shows its highest VIP scores near 1015, 1232, 1360, and 1560 cm−1 (Fig. 6a). As discussed for Fig. 3a, these spectral regions are likely associated with vibrational modes of gold or methanol.25,65 High VIP scores were also observed in the lower region at 290 cm−1 and 520 cm−1, corresponding to Au–S bonding and the Si substrate, as well as at 382 cm−1, which is attributed to CF2 twisting.61 These peaks are also present at 0 nM PFOA, indicating that they are not associated with PFOA vibrational modes. However, the observed alterations in these regions can be attributed to the interaction between PFOA and the substrate or solvent molecules. This interaction leads to subtle changes in the local chemical environment, which are reflected in the SERS spectrum and captured by the VIP scores.
In contrast, the thiol-modified surfaces demonstrate distinct spectral patterns, indicating interactions between the different thiol moieties and PFOA molecules. TDFOT shows significant VIP scores in the 290–477 cm−1, 700–770 cm−1, 1350–1450 cm−1 and 1538 cm−1 regions (Fig. 6d). These regions likely represent various C–F vibrations66 and C=C stretching,46 suggesting strong fluorine–fluorine interactions between TDFOT and PFOA, which aligns with its molecular structure. The TFMB modification shows prominent VIP scores in the 660–826 cm−1, 1070 cm−1, 1340 cm−1 and 1600 cm−1 regions (Fig. 6b). These regions may indicate C–S, C–F, C=C, and potential π–π interactions between the TFMB aromatic ring and PFOA, explaining its lower error value. The moderate performance of TFMtFTP is reflected in its VIP score profile, which shows a mix of thiol and fluorine interactions (Fig. 6c). High VIP scores in the 470 cm−1, 700–800 cm−1, 1070–1190 cm−1, 1384 cm−1 and 1640 cm−1 regions might represent C–S stretching, C–F stretching and C=C stretching, respectively, indicating thiol–PFOA interactions.
The superior performance of the TDFOT modification can be attributed to its molecular structure and interactions with PFOA. TDFOT, being a long-chain fluorinated thiol, likely provides a more favorable environment for PFOA molecules, enhancing the SERS effect and improving detection sensitivity. TFMB and TFMtFTP, despite their aromatic structures, also demonstrate strong affinity for PFOA, possibly due to PFOA altering π–π interactions between thiols. The moderate improvement seen with TFMtFTP suggests that while it does enhance PFOA detection compared to the bare substrate, its molecular structure may not be as optimal for PFOA interactions as that of TDFOT.
Analysis of the model interpretability outputs revealed that the spectral regions most heavily weighted by the ML models correspond to fluorinated functional groups in PFOA. The VIP scores showed dominant contributions in the 700–780 cm−1 region (CF2 deformation), 1070–1190 cm−1 region (CF2 stretching), and 1340–1390 cm−1 region (CF3 symmetric stretching). These vibrational bands are well-established PFOA markers based on prior Raman/DFT studies.24,61 Their prominence in the ML feature importance analysis indicates that the models rely on C–F and CF2/CF3 vibrational signatures to distinguish PFAS species and concentrations. This is consistent with the proposed fluorophilic interaction mechanism, whereby selective adsorption and orientation of PFOA on the functionalized SERS surface enhances these characteristic fluorinated-group modes.
Overall, the ligand–PFOA interactions produce distinct and concentration-dependent Raman signatures that enable the detection of PFOA in deionized water. The strong agreement between experimentally responsive Raman bands and those identified by the PLSR–VIP analysis further validates the machine learning-guided ligand selection strategy. The observed linear response at environmentally relevant concentrations, followed by saturation behavior at higher levels, underscores the robustness of the SERS platform while highlighting its potential applicability for quantitative monitoring of PFOA in water.
The absence of experimental anti-interference validation and real sample testing is a notable limitation. These aspects are critical for establishing the robustness and selectivity of the proposed method in realistic, complex sample matrices where co-contaminants may affect sensor performance. Future work should focus on expanding this approach to a broader range of PFAS compounds and exploring its applicability in complex environmental matrices, further validating its suitability for real-world detection scenarios. Through continued refinement and validation, this SERS-based platform has the potential to become a versatile and powerful tool for environmental PFAS monitoring.
Supplementary information (SI): the SI contains additional figures and tables that complement the main text. Fig. S1–S5 include SERS intensity plots, ROC curves, regression parity plots, and PLS-DA score and cross-validation plots for bare and thiol-functionalized substrates. Tables S1–S2 provide Raman peak assignments for fluorinated thiols and PFOA, and a summary of reported PFOA/PFOS detection limits from the literature. See DOI: https://doi.org/10.1039/d5en00721f.
This journal is © The Royal Society of Chemistry 2026