Molecular Connectivity Indices and Soil Properties to Predict Sorption of Per- and Polyfluoroalkyl Substances
Abstract
This study presents a modeling approach to predict the soil-water partitioning coefficient (log Kd, L/kg) of per- and polyfluoroalkyl substances (PFAS) as a function of molecular connectivity indices (MCIs) and soil properties: soil organic carbon (SOC, %) and cation exchange capacity (CEC, cmol/kg). The modeling framework involved compiling data, developing models, and evaluating model performance via interpretation, external validation, and scenario analyses. Two datasets, each including simple and valence MCIs for every PFAS, were used: (i) the Carboxylic-PFCAs dataset (N = 327), containing only carboxylic compounds (C4-C12), and (ii) the PFAS-Full dataset (N = 699), comprising carboxylic acids (C4-C12), sulfonic acids (C4-C10), and fluorotelomers (C4-C8). Our multi-criteria approach identified the seventh-order valence path (VP-7), related to polarizability and molecular size, and the third-order simple path (SP-3), related to molecular size and chain structure, as the key predictors for the Carboxylic-PFCAs and PFAS-Full datasets, respectively. Elastic net-regularized linear regression (MLREN) and artificial neural networks (ANN) demonstrated that MCIs improved predictive accuracy. For the PFAS-Full dataset, six-predictor models (MCIs + soil properties) yielded high predictive accuracy (R²pred = 83.7–84.9%); however, a three-predictor MLREN model (SP-3, SOC, CEC; R²pred = 77.9%) achieved the highest external generalization (R²ext = 52.4%). SP-3 accounted for the largest share of predictive power (68–95%), dominating model performance (94–97%). Scenario analyses revealed that while deterministic predictions remained stable, probabilistic modeling is crucial for capturing rare but impactful extremes. Overall, our study highlights the practical advantage of MCIs as versatile and scalable tools for predicting the adsorption of diverse PFAS, including short-chain, partially fluorinated, and less commonly studied PFAS.
In the long term, this tool can provide data for preliminary, rapid, site-specific risk assessment of PFAS-impacted sites.
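For readers unfamiliar with simple path indices such as SP-3, the following minimal Python sketch illustrates the standard Kier-Hall definition: the index of order m sums, over every path of m bonds in the hydrogen-suppressed molecular graph, the inverse square root of the product of the vertex degrees along that path. The function name, the adjacency-list input, and the alkane test molecules are illustrative assumptions, not the software or structures used in this study; valence indices such as VP-7 replace the simple vertex degree with a valence-corrected one and are not covered here.

```python
def simple_path_index(adjacency, order):
    """Kier-Hall simple path connectivity index of the given order.

    adjacency: dict mapping each non-hydrogen atom to a list of its
    bonded non-hydrogen neighbours (hypothetical input format).
    """
    degree = {atom: len(nbrs) for atom, nbrs in adjacency.items()}
    paths = set()

    def extend(path):
        if len(path) == order + 1:
            # Store each path once, regardless of traversal direction.
            paths.add(min(tuple(path), tuple(reversed(path))))
            return
        for nbr in adjacency[path[-1]]:
            if nbr not in path:  # simple paths only, no revisits
                extend(path + [nbr])

    for atom in adjacency:
        extend([atom])

    total = 0.0
    for path in paths:
        prod = 1
        for atom in path:
            prod *= degree[atom]
        total += prod ** -0.5
    return total


# Illustrative carbon skeletons (not PFAS): n-butane and isobutane.
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

sp3_butane = simple_path_index(butane, 3)        # one 3-bond path: 0.5
sp3_isobutane = simple_path_index(isobutane, 3)  # no 3-bond path: 0.0
```

The contrast between the two isomers shows why a third-order path index carries information on both chain length and branching: the linear skeleton contributes a path of three bonds, while the branched one contributes none.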