Open Access Article
Andrea
Gredelj
*ab,
Jayne
Roberts
*a,
Eoin M.
Kearney
c,
Elin L.
Barrett
a,
Nicola
Haywood
a,
David
Sheffield
a,
Geoff
Hodges
a and
Mark A.
Miller
*c
aSafety, Environmental and Regulatory Science (SERS), Unilever, Colworth Park, Sharnbrook MK44 1LQ, UK. E-mail: Jayne.Roberts@unilever.com
bDepartment of Environmental Engineering, Norwegian Geotechnical Institute (NGI), P.O. Box. 3930 Ullevål Stadion, N-0806 Oslo, Norway. E-mail: andrea.gredelj@ngi.no
cDepartment of Chemistry, Durham University, South Road, Durham DH1 3LE, UK. E-mail: m.a.miller@durham.ac.uk
First published on 18th March 2025
Anionic surfactants are widely used in commercial and industrial applications. For assessment of their environmental fate and effects, it is highly desirable to quantify the membrane-water partition/distribution coefficient (Kmw/Dmw). Here, we further develop a computational route to Dmw for anionic surfactants based on coarse-grained molecular dynamics simulations, validating it against new and existing experimental measurements. Having parameterised molecular fragments for the coarse-grained models, the simulations are used to predict Dmw for molecules where no experimental values are available. This expanded set of simulated Dmw values is then used to derive QSARs for acute toxicity of mono-constituent anionic surfactants in daphnids and fish, allowing for extrapolation to similar compounds without experimental Dmw values. For this study, we have selected hydrocarbon-based (HC) surfactants because of their widespread use, and perfluorinated (FC) surfactants as a challenging case study. Separate daphnid and fish QSARs demonstrate good fits, robustness and predictivity, and highlight differing toxicity relationships for HC and FC surfactants in daphnids. Overall, the combined use of simulated Dmw and derived QSARs is a promising approach for ecotoxicity screening of surfactants.
Environmental significanceRegulation and screening of chemicals for environmental risk assessment often rely on the octanol–water partitioning coefficient (Kow). However, for ionogenic and charged molecules, Kow is hard to measure or predict and has limited biological relevance. The membrane-water partitioning/distribution coefficient (Kmw/Dmw) is acknowledged as a more appropriate descriptor of bioaccumulation and toxicity, especially for surfactants. Here, we address the need to reliably predict Dmw by developing, validating and applying coarse-grained methods for molecular dynamics simulations. The results are used to build QSAR models to facilitate ecotoxicity screening. We focus on families of anionic surfactants with hydrocarbon and perfluorocarbon backbones. Due to their environmental persistence and unique physicochemical properties, the latter group is both important and challenging. |
With more than 100
000 chemicals estimated to be in regular industrial use across Europe and North America alone,4 many of which have insufficient hazard data for regulatory submission, the development and acceptance of computational (in silico) approaches provide opportunities to address endpoint data requirements either directly or as part of a weight-of-evidence approach. Historically, in silico approaches established in regulations to determine aquatic toxicity endpoints have been focussed on the application of structure-based predictions using quantitative structure–activity relationship (QSAR) models coupled with well-established Modes or Mechanisms of Action (MoA/MechoA) such as that described by non-specific toxicity (narcosis).5–7 The majority of such published QSARs derived for narcosis rely on the octanol–water partition coefficient (Kow) as a proxy for the lipophilicity/hydrophobicity of chemicals.5,8–12 Since narcosis is driven by critical accumulation in tissues and phospholipid membranes,13,14 resulting in disturbance in their integrity and function, the toxicity of chemicals demonstrating this MoA/MechoA is highly correlated with lipophilicity. However, biological membranes are ordered, anisotropic three-dimensional structures. Therefore, as octanol is a homogenous isotropic medium, it cannot adequately describe the membrane interactions of chemicals such as ionisable organic compounds and surfactants.15,16
Hydrocarbon (HC) based surfactants are widely used around the globe in both industrial and consumer products.17 They are amphiphilic structures containing both hydrophobic and hydrophilic components17,18 and are commonly classified according to the charge of their hydrophilic “head” as anionic, non-ionic, cationic or zwitterionic. Anionic surfactants are by far the most widely used class of surfactants17 with 45 individual anionic surfactants (with hydrocarbon backbones) currently registered under the European Union's REACH legislation in volumes greater than 100 tonnes per year.19 In addition to the hydrocarbon-based surfactants, surfactants with perfluorinated carbon (FC) chains, which belong to a large group of chemicals known as poly- and perfluoroalkyl substances (PFAS), are currently used in a variety of specialist applications from electronics and textiles to fire-fighting foams.20 Perfluorinated surfactants are widely reported as an environmental and health problem, particularly due to their high environmental persistence.20–25 Their surfactant structure with fully fluorinated carbon backbone gives them unique properties (e.g., high stability), but also causes challenges when modelling the physical–chemical properties and environmental fate/partitioning parameters of PFAS.26,27 Deriving animal-based toxicity endpoints for such an abundance of compounds (currently estimated at more than 14
000,24,28 many being surfactants20) is also not possible or ethical and emphasises the need for their grouping.29 Perfluorinated surfactants have a high affinity for binding to proteins27,30 and membrane phospholipids, making the membrane-water partitioning coefficient a highly relevant parameter for these compounds26,27,31,32 and a potential proxy for (eco)toxicity predictions. Hence, considering their unique properties and methodological challenges, the FC group of surfactants provide an interesting case study.
The amphiphilic nature of both hydro- and perfluoro-carbon surfactants makes it difficult to accurately determine Kowvia empirical methods due to these species' tendency to accumulate at the octanol–water interface.26,33 For ionised compounds, Kow can lead to a significant underestimation of lipophilicity when used as a proxy for membrane partitioning, represented by the membrane lipid–water partitioning coefficient, Kmw.16,34 The speciation-corrected membrane lipid–water distribution coefficient (Dmw) provides a more biologically relevant and methodologically defensible alternative to Kow. It has been previously shown to perform as well as Kow for predicting the toxicity of neutral narcotics and is superior as a predictor of toxicity for ionisable chemicals,15,26,35–37 which is influenced by the degree of ionisation at environmentally relevant pH.37–39 Hence, in these cases, Dmw is considered a more appropriate descriptor than Kow26,40 and is also methodologically more robust to derive both experimentally and computationally. Previously, QSARs based on Dmw as a single predictor have shown a good correlation with baseline toxicity in different aquatic species (e.g., algae, Vibrio fischeri, daphnids and fish), for both neutral and ionisable narcotics.15,37,41,42 Despite this, QSARs based on Kmw/Dmw are much less prevalent in the literature and regulatory applications due to the limited availability of experimental membrane liposome-water partitioning coefficients, and the need for method standardisation. However, in recent years, there has been more effort to generate experimental Kmw/Dmw values for different classes of surfactants.19,26,43–45
Several in silico methods are capable of calculating Kmw/Dmw. While linear regressions based on log
Kow,36 and poly-parameter Linear Free Energy Relationships (ppLFERs)46 work well for neutral organics, for ionisable chemicals commercial software such as COSMOtherm's COSMOmic module was shown to perform better.34,46 Still, COSMOmic has limitations for some chemical groups,47 including those containing short hydrocarbon26 or perfluorinated alkyl chains.26,27 Additionally, for long or flexible molecules, running the associated conformer generation software COSMOconf can be prohibitively slow, limiting COSMOmic's utility as a screening tool.
Another computational approach for calculating Kmw/Dmw is molecular dynamics (MD) simulation, where the chemical species of interest, along with the lipid membrane and water, are all represented with atomistic detail. In order to reduce the computational expense of MD simulations, coarse-graining is sometimes used, grouping sets of atoms into “beads” and simulating the movement and interaction at this slightly reduced resolution. This approach simplifies the representation, allowing rapid simulation of large systems.48 In particular, the Martini coarse-grained force field is an appealing candidate for the calculation of Dmw49,50 because the interactions between beads are derived from solvent partitioning data. Previous work by Potter et al.47 integrated the newest version of the force field, Martini 3, with an automatic procedure47 for generating coarse-grained representations of organic solutes.47 Using these tools, Dmw was calculated for a series of charged and zwitterionic molecules. That work exploited some, but not all the new features of Martini 3, leaving room for further expansion of the method to a wider chemical space. In the context of this work, the new halogenated beads included in Martini 3 are particularly relevant for PFAS. In addition, the recent increase in the availability of Kmw/Dmw surfactant data19,26,43–45 provides a timely opportunity to validate the refined Martini force field for membrane-water partitioning of a broader range of molecules.
The application of Kmw/Dmw measurements for surfactants thus far has been almost entirely focused on its potential use as a predictive proxy for assessing bioaccumulation in fish.19,44,51,52 To our knowledge, the approach has not yet been applied to developing ecotoxicity QSARs for charged surfactants.
The present study, therefore, aims to: (1) develop a general hydrocarbon-based anionic surfactant toxicity QSAR model for fish species and QSAR models for hydro- and perfluorocarbon -based anionic surfactants for daphnids, using simulated Dmw; (2) build on our previous work to further develop coarse-grained MD simulations for anionic surfactants, increasing confidence in the approach as an efficient computational method for deriving the Dmw for both hydro- and perfluorocarbon-based anionic surfactants.
(1) Liposome-water partitioning, based on the approach described by Ebert et al.27 using the extrusion method of preparation of large unilamellar liposome vesicles. Samples of the commercial mixture at 200 μM and single-component references at 50 μM and 100 μM were dosed in triplicate into wells of the rapid equilibrium dialysis (RED) plate containing phosphate buffered saline at natural pH 7.4. Incubation was carried out for 48 h for the mixture and 24 and 48 h for the single components at 37 °C. Triplicate data are reported for the C12 isethionate and duplicate for the C14 isethionates commercial mixture and the C14 single component reference standards. For the C12SO4 reference standard, triplicate data are reported for 50 μM and 100 μM after 48 h incubation (n = 6) and 50 μM and 100 μM after 24 and 48 h incubation for the C12EO3S (n = 12), respectively.
(2) Solid-supported lipid membranes (SSLM), using a commercially available pre-dosed assay as described by Timmer et al.43 Here, the analysis was carried out using acetate buffer with pH 7.4, following the manufacturer's guidance. The incubation time was optimised for each component and ranged from 0.5 to 144 h.
All other experimental Dmw values were obtained from existing literature (full list given in Table S11†). Full details of the experimental methods and a list of all newly generated values can be found in ESI Section 2.† In cases where multiple values for the same surfactant were available, an arithmetic mean was taken to give a single value.
The liposome Dmw values naturally emerge in units of L kg−1 (as do the simulation results). However, the protocol for SSLM measurements returns a dimensionless Dmw. The differing factor is the lipid density, which is close to 1 in units of kg L−1. Therefore, in practice, the two conventions are interchangeable. The equations associated with each of the methods are given in the ESI 2.†
The present work expands the automatic procedure from Potter et al.,47 to exploit the new Martini 3 halogenated bead types, re-optimise the bead assignment of some charged groups, and modify the mapping algorithm to comply with the Martini developers' latest guidance.54 In this section, we outline these developments, which are implemented in the most recent version of the cg_param script (available on GitHub at https://github.com/cgkmw-durham/cg_param_m3/tree/martini3_v3).
The automatic script assigns neutral fragments using their log
Kow values, allowing a broad range of groups to be represented without arbitrary fitting. However, charged “Q” groups partition so strongly into water that they cannot be reliably assigned on Kow values. Instead, assignments can be chosen to reproduce other properties, such as experimental Dmw values for simple species containing the charged groups. These assignments can then be transferred without further adjustment to other molecules, where experimental Dmw values might not be available.
In line with previous work, parameterisations for charged groups have been hard-coded for carboxylate, sulfate and sulfonate functional groups before applying the general graph-based algorithm for mapping the rest of a given molecule. The beads for these groups were SQ5n, Q2, and Q3 respectively. The carboxylate SQ5n and sulfate Q2 assignments are the starting points recommended by the Martini developers.47 The sulfonate Q3 assignment has changed from the SQ4p recommendation in a previous publication47 which contained only one example of a sulfonate;47 the new assignment is derived from the larger number of experimental log
Dmw sulfonate values made available by the present work, as detailed in Section S3.1 of the ESI.†
The new Martini 3 halogenated “X” beads have been added to the cg_param algorithm. Any aliphatic group containing two or more fluorine atoms is now assigned to a bead selected from the X category according to the Kow of the molecular fragment in the same way that other bead types are chosen. For PFAS molecules, where experimental log
Dmw values are available, the use of X beads improves the agreement between experiment and simulation, providing some confidence also for molecules where experimental log
Dmw is not known. We have not extended the use of X beads to other halogens, aromatic groups, or cases where a bead contains only one fluorine, all of which are outside the scope of this study.
To describe PFAS accurately, it is also necessary to recognise the influence of fluorine atoms on adjacent charged groups, even if they are not part of the same coarse-grained bead. Martini normally adopts a building-block approach where all fragments are parametrised independently. However, fluorine has such strong and complex electronic effects on adjacent moieties51 that they cannot satisfactorily be neglected. We have shifted the assignment of charged beads for the sulfonate and carboxylate groups from Q3 to Q1p, and SQ5n to SQ1p, respectively, whenever these groups are directly bonded to a bead containing at least two fluorine atoms. Detailed information on the parameterisation of PFAS is given in ESI Section S3.2.†
Another functional group requiring attention is the ester fragment in AI and FAES surfactants. In general, it is preferable for esters to be kept intact in a single bead, rather than split over adjacent beads. Furthermore, a dedicated study of esters (in preparation) shows that the optimal bead assignment is slightly different from that returned by routine matching to Kow. We therefore protect the –COO– fragment from being split by topological analysis in the spectral mapping algorithm and assign it to an SP2 bead, wherever possible. However, there are some cases where splitting the –COO– group is inevitable. Details of how these cases are handled are given in ESI Section S3.3.†
The present work tackles larger molecules than previously applied to this algorithm. Long chains with highly branched heavy-atom sites, such as sulfur in sulfate, can lead to numerical difficulties in the spectral mapping calculations56 as implemented in our coarse-graining procedure.50 In brief, the centrality scores of atoms in such molecules (which are used to determine the order in which they are grouped into beads) may span so many orders of magnitude that they cannot be accurately calculated with normal machine precision. We have introduced an adaptive weighting procedure to mitigate the problem while retaining the influence of mean mass and bonding path length as in our original algorithm. This modification is described in more detail in ESI Section S3.4.† It can alter the precise mapping of molecules in cases where more than one option is plausible.
In line with new recommendations of the Martini 3 developers,57 bead coordinates are now defined by the centre of geometry (COG) of the atoms that they represent, rather than by the centre of mass used in previous generations of the force field. The change to COG coordinates is focused on capturing molecular packing densities54 and only has a minor impact on the resulting Dmw values, as shown in ESI Section S3.5.†
Once the coarse-grained models have been derived for each surfactant, umbrella sampling molecular dynamics simulations are used to obtain the free energy profile in a 1-palmitoyl-2-oleoylphosphatidylcholine (POPC) membrane, from which Dmw is then derived. These methods match those used in the previous work.47,50 The simulation parameters are given in ESI Section S3.6† and the connection between the free energy profiles and the calculation of Dmw (in L kg−1) is detailed in ESI Section S3.7.†
In addition to the bulk database search described above, an extensive review of scientific literature was conducted using both chemical and common surfactant names as keywords. Included in this were also a variety of externally available review documents originally prepared for the environmental risk assessment of surfactants.60–63 These documents contained industry-shared data which have been collated and evaluated by independent bodies. Hence, some toxicity values from those documents are from proprietary studies and full access to the study details was not possible, but the values were still considered scientifically relevant. A few additional data points were obtained from Unilever's historic proprietary (unpublished) internal studies. As a result, a small proportion of the values used in deriving the fish HC and daphnids HC QSARs (equivalent to 16% and 14%, respectively) were from proprietary studies. For the daphnid FC QSAR, proprietary study data were used only as part of averaged values for perfluorooctanoic acid (PFOA).
For all data points, quality criteria were established and applied as follows:
(1) Only values reported as 96 h LC50 s (for fish) and 48 h LC/EC/IC50 (for daphnids) were included in compliance with the current OECD Guidelines.64,65
(2) No restriction was made regarding fish species.59 Interspecies sensitivity of fish is a complex topic and is often chemical-specific with insufficient evidence to state whether one species or another would be more sensitive.66 Similarly, data for both Daphnia sp. and Ceriodaphnia have been included since it was previously demonstrated that there is no significant difference in their sensitivity to different chemicals.67 This approach, therefore, provides the broadest applicability to predicting fish and daphnids acute toxicity on the trophic level basis rather than by individual species.
(3) The solubility of surfactants is difficult to define due to the formation of micelles, which are soluble aggregates of surfactants. The critical micelle concentration (CMC) at which they form defines the maximum solubility of the single-molecule form.33 Hence, when evaluating the reported toxicity threshold concentrations against solubility considerations, we also relied on descriptive evaluation of test systems; in cases of described solubility issues in original publications or reported visible residuals of surfactants, these data points were omitted regardless of them being reported as below the solubility threshold.
(4) Data reported for the “active ingredient” were preferred to “formulation” data from the USEPA ECOTOX database.
(5) Data reported as measured concentrations were given preference to nominal concentrations. However, nominal concentrations were included in the absence of measured data.
(6) When data from ECHA's registration dossiers were used (initially accessed via eChemPortal and individually assessed afterwards), only data marked with high reliability (Klimisch scores of 1 or 2) were used.
(7) Only data for mono-constituent surfactants were used to develop the QSARs.
(8) When more than one equivalent endpoint for the same surfactant was available — either for the same or different fish/(Cerio)daphnia species — the geometric mean of these values was calculated and used.
Compilation of ecotoxicity data for PFAS chemicals classified as anionic surfactants was performed using the USEPA ECOTOX database.58 Data for all PFAS chemicals were exported based on the “organic compounds” ECOTOX classification by selecting the group of “Per- and Polyfluoroalkyl substances (PFAS)” and subsequently filtering to obtain the ecotoxicity data for PFAS chemicals for both fish and daphnids. The extracted chemicals were further evaluated based on the provided chemical identifiers to keep only those that can be classified as “anionic surfactants”. Toxicity data quality criteria were applied as described above. As the fish studies reporting 96 h LC50s were mostly limited to perfluorooctane sulfonic acid (PFOS) and perfluorooctanoic acid (PFOA), data scarcity for other perfluorinated (anionic) surfactants prevented us from developing the fish QSAR alongside the daphnids QSAR.
Dmw
Dmw for surfactants, the simulated values were compared with measured values for selected chemical species from the existing literature and new experiments in this study (ESI 4†). This process has resulted in an expansion of the molecular fragments that can be accurately represented and consequently in an increase in the chemical space covered by the models. Simulation-generated log
Dmw values and the corresponding experimental data are compared in Fig. 1, demonstrating a good fit (root mean squared error = 0.36).
![]() | ||
Fig. 1 Correlation of simulation log Dmw values with the corresponding experimental log Dmw values (SSLM and liposome data). When multiple experimental data points were available for the same surfactant they were averaged, with SSLM data and liposome data treated as equal quality. Solid symbols indicate molecules used to select the Martini charged bead assignments (linear sulfate, sulfonate and PFAS species given in Table S12†). Open symbols indicate predictions for chemical species that were not used in the model development. The black full line is a regression line for the entire data set; its equation is shown on the graph with standard errors of the slope and intercept (in brackets) and coefficient of determination. The dotted grey line is a line of 1 : 1 correlation, log Dmw (simulation) = log Dmw (experimental). | ||
Of the 43 molecules in Fig. 1, 23 have been used to inform the selection of Martini bead types for key molecular fragments, as detailed in ESI Section S3.† These chemical species can be regarded as a training set for parametrisation of the coarse-grained models. The log
Dmw values for the remaining 20 species were then produced by simulation of models constructed using the mapping and parametrisation algorithm without further adjustment. These values constitute a partial test of the method.
The QSARs in this paper are based on a total of 71 unique chemicals for which relevant ecotoxicological data are available (some of which apply to both fish and daphnids). Experimental log
Dmw values are available for only 29 of these molecules. Hence, the coarse-grained models are supplying log
Dmw for 42 species that could not otherwise be included in construction of the QSARs.
While other Kmw/Dmw predictive methods exist, they are either limited to neutral organics36 or, in the case of COSMOmic, struggle to predict Dmw for both shorter anionic hydrocarbons (4–8 C atoms of alkyl sulf(on)ates)26 and molecules containing perfluorocarbon chains.26,27 Droge used COSMOmic to calculate log
Dmw of neutral and ionisable alkyl sulf(on)ates and perfluoroalkyl acids (PFAAs), but only COSMO simulations for neutral species were able to replicate experimentally determined chain length increments, with experimental Dmw values still based on ionised species.26 Ebert et al. used the same tool and observed a similar effect for PFAAs and their industrial alternatives; for shorter chain molecules, i.e., for PFCAs with 3–6 CFx and PFBS, log
Dmw was overpredicted by COSMOmic by 0.5–2
log units. Other Dmw results were varied, with perfluoroalkyl carboxylates HFPO-DA (GenX) and NaDONA overpredicted while perfluoroalkyl sulfonates PFHxS and PFOS and their (cyclic) alternative PFECHS were underpredicted.27Dmw was well captured for these compounds using coarse-grained simulations (ESI Section 4.1†). A comparison of those COSMOmic values and coarse-grained simulation log
Dmw for anionic surfactants against experimental data is provided in more detail and visualised in ESI Section 4.2 (Table S13 and Fig. S9).†
Regardless of the good fit between the simulated and experimental Dmw values, in the specific case of alkylethoxy ethers the simulations predict an increase in log
Dmw with respect to the number of ethoxylate units (EO), in contrast to the decrease seen in experimental results35 (a trend also predicted by the ALOGPS70 neural network in log
Kow of these molecules). The disagreement is particularly pronounced for compounds containing more than four ethoxylate units. The difficulty with alkylethoxy ethers in simulations was recognised by Rossi et al.,71 who published a custom extension to Martini 2 for polyethylene glycol. We find that, even with the extended chemical space covered by the new Martini 3 beads, we cannot capture the behaviour of both short- and long-chain polyethers with a single ether mapping and parameterisation. This class of molecule, therefore, needs dedicated attention, which is beyond the scope of this work.
Due to this known issue, to include C12EO8S in our training set for predicting (eco)toxicity, log
Dmw was calculated using the multiple regression equation from Droge et al. derived for anionic surfactants.51 This equation uses a fragment-based approach to predict Dmw and has demonstrated a good fit for anionic surfactants, providing they are within the applicability domain and contain fragments which were initially included in the regression. Hence, it is limited to the surfactant sub-classes it was derived with and, therefore, not applicable for groups such as AI, FAES and PFAS other than perfluoroalkyl acids.
Dmw could be predicted for all anionic surfactants where ecotoxicity data were available, avoiding the need to eliminate data points based on a lack of experimental liposome-water partitioning values. The least-squares linear regression QSAR models were developed using logarithmic transformation of ecotoxicity endpoints for fish and daphnids correlated with log
Dmw. All QSAR models are presented in Fig. 2a–c, with the regression equations and coefficients of determination presented on the corresponding graphs. Due to limitations in the number of suitable ecotoxicity data points, they were all used as a training set. All regressions for fish and daphnid toxicity demonstrated a good fit, with R2 greater than 0.7, meeting the robustness and predictivity criteria indicative of a good QSAR, with the full details of their statistical evaluation reported in ESI Section 7.†
![]() | ||
Fig. 2 Fish and daphnid log Dmw-based QSAR models. Respective regression equations and coefficients of determination are shown for each set of chemicals (HC indicating the hydrocarbon and FC perfluorocarbon backbones) and fish/daphnids toxicity from (a) to (c). Standard errors for slopes and intercepts are reported in brackets. Inner dotted lines represent 95% confidence intervals of regression lines, while outer dotted lines represent prediction intervals of the regressions. Log Dmw are simulated values, except for C12EO8S where Droge et al., 2021 method51 was used for calculations. | ||
The resulting QSAR equations (with number of data points n, regression coefficient R2, standard error SE and statistical significance F) are:
![]() | (1) |
![]() | (2) |
The applicability domains for the QSAR models have been determined based on the structural and physicochemical descriptors within the training set (Table 1). To give each QSAR model the broadest level of applicability, we have included as many different headgroups as data availability allowed.
| Fish | (Cerio)daphnia | |
|---|---|---|
| Structural domain: functional groups contained (anionic surfactant sub-classes) | Sulfate | Sulfate |
| Sulfonate | Sulfonate | |
| Sulfosuccinate | Sulfosuccinate | |
| Carboxylic acid | Carboxylic acid | |
| Ester sulfonate | Isethionate | |
| Ether sulfate | Ester sulfonate | |
| Taurate | Ether sulfate | |
| Dithiophosphate | Sulfoacetate | |
Physicochemical domain: log Dmw range |
0.03–6.91 | 1.94–6.91 |
| Molecular weight | 158–566 | 216–619 |
The general consensus in scientific literature through multiple lines of evidence including toxic unit approaches, structure–activity analysis and both phenotypic and genotypic observations, is that anionic surfactants such as AS, AES, FAES and LAS have a narcotic mode of ecotoxic action14,62,75,76 such that toxicity is predominantly driven by adsorption, penetration and disruption of cell membranes.72,77,78 In fish, the primary target organs are the gills, where disruption of the epithelium of primary and secondary lamellae can result in hypertrophy and oedema, causing hypoxia and respiratory failure.79 Such adverse effects are well described by hydrophobicity-based descriptors and even at a gene level a cross-species transcription switch correlation with Kow is observed when an organism is exposed to chemicals with a narcotic mode of action.14,80
The additional charge that is integral to anionic surfactants, and the resulting association with headgroups of membrane lipids, have been previously proposed as a reason for observed enhanced toxicity, particularly for surfactants of lower chain length/hydrophobicity, compared to neutral compounds.9,81 As chain lengths increase, hydrophobic tendency becomes the predominant factor in toxicity causation and other factors such as biotransformation and variability in uptake kinetics as hypothesised by Droge et al. (2019)26 may influence observed regression slopes. This is reflected in eqn (1) and (2) which deviate from perfect positive correlation, the deviation being consistent with similar observations for widely used and accepted QSARs derived using log
Kow for both baseline and polar narcosis (for both surfactants and non-surfactants) such as those of Hodges et al. (2006),75 Könemann (1981),8 Saarikoski and Viluksela (1982)9 and Verhaar et al. (1995).11
Previously, Müller et al.35 applied experimental log
Kmw to derive toxicity QSARs based on a small group of alcohol ethoxylate (nonionic) surfactants. Here, we present the first membrane-water-based fish and daphnid QSAR models for anionic surfactants which are applicable to a broad range of charged moieties associated with a number of surfactant subclasses. To date, the only published QSARs that use this descriptor (and the experimentally derived ecotoxicity endpoints) while not relying on Kow–Kmw regression equations, relate to toxicity in Aliivibrio fischeri37 and the zebrafish embryo test (FET).15,41 Both studies demonstrated the applicability of log
Kmw/Dmw for predicting baseline (narcotic) toxicity of neutral and ionisable chemicals. However, neither included surfactants. Existing published surfactant subclass-specific QSARs have so far focussed predominantly on log
Kow or chain length-based descriptors.73–75,82
There are advantages and drawbacks in using both descriptors when predicting surfactant toxicity. Regulatory and general acceptability of Kow as a proxy for hydrophobicity and hence toxicity of neutral, organic chemicals,83–85 coupled with readily available computational tools,86 are an important advantage of the descriptor and use of Kow-based QSARs for neutral organic chemicals. Whilst QSARs based on log
Kow are widely used,85,87 as already mentioned, the determination of accurate and reliable log
Kow values is highly challenging for surfactants due to their amphiphilic properties.33,87 Their tendency to aggregate at solvent interfaces and emulsify solvent phases, combined with a lack of a defined solubility limit, results in significant difficulties in determining accurate and reliable empirical values using currently available methods (e.g. OECD 123 (ref. 88)). In addition, Quantitative Structure–Property Relationships (QSPR)-predicted Kow/Dow values, with the exception of non-ionic surfactants, do not correlate well with experimental values for surfactants and also exhibit large inter- as well as intra-method-dependent variability.33 This variability has been exemplified by a study of two similar sub-classes of anionic surfactant which identified systematic differences in the log
Kow calculation method when applied to the two classes.75 Despite these limitations, some Kow-based QSARs with good predictivity have been successfully derived for individual surfactant sub-classes using a range of modifications to existing QSPR methods to address some of the limitations of the existing Kow calculation methods.81,89–92 However, these QSARs are still restricted to a small range of surfactants, rather than to the general group of anionic surfactants.
Surfactant chain length has also been demonstrated to be positively correlated with aquatic toxicity and simple rules relating chain length to toxicity are straightforward to apply when assessing homologues.72–74,77 However, this approach does not account for the differing polar headgroups which may exert differing levels of toxicity.93 For this reason, QSARs based on chain length are restricted to small sub-classes with the same headgroup moiety. This limitation is evident, for example, in the US EPA ECOlogical Structure–Activity Relationship Model (ECOSAR),82 developed initially to support data gaps filling under TSCA.
A total of 17 perfluorinated anionic surfactants with acute daphnid toxicity data fulfilling the quality criteria defined in Section 2.4.1 were used to generate a QSAR model (Fig. 2c and Table S16†). Initial plots indicated that fluorotelomer acids (6 individual chemicals) formed a different relationship with log
Dmw than the remaining surfactants and were consequently eliminated from the regression.
Fluorotelomer acids are known precursors of perfluoroalkyl acids and have already been associated with eco(toxicity) several orders of magnitude higher than corresponding PFAAs of the same chain lengths in different organisms.99,100 This excess toxicity was postulated to come from unsaturated and saturated fluorotelomer carboxylic acids (FTUCAs and FTCAs) releasing HF (hydrogen fluoride) as they biotransform into PFAAs (their terminal degradation product), and generally additive effects of FT(U)CAs with their metabolites.100 While more research is needed to understand the mode of action of fluorotelomer acids in aquatic organisms, vertebrates and invertebrates, this is beyond the scope of this paper.100
The resulting QSAR model equation for FC surfactants in daphnids is:
![]() | (3) |
The structural applicability domain includes PFCAs and PFASs of varying CFx chain lengths and their two substitutes belonging to the subclass of perfluoroalkyl ether carboxylates and cyclic PFSA, with molecular weights ranging from 213 to 613 and log
Dmw 0.86 to 6.02 (Fig. 2c).
The lower magnitude of the slope for the FC QSAR compared to that of the hydrocarbon equivalent indicates a smaller increase in toxicity to daphnids with a log
Dmw increase, with higher toxicity of FC surfactants with low log
Dmw shown in the higher intercept. Statistical comparison of the slopes and intercepts of FC and HC QSARs indicated that both parameters are statistically significantly different between the two QSARs (correlation p-values of 0.00013 and 0.003, respectively). In addition to their affinity for phospholipids,26,27,101 for some anionic PFAS (e.g. PFOS, PFOA and some other long-chain PFAAs), protein binding has been highlighted as an important mechanism in their bioaccumulation and toxicity in vertebrates30,98,102,103 due to their strong affinity for proteins. This additional consideration of binding behaviour makes understanding their mode of action and the link between their structure and “hydrophobicity” highly complex and uncertain.26,103 Despite these difficulties, and the limited available data, here we have been able to demonstrate that log
Dmw can not only be successfully captured by the simulation method for different perfluorinated structures but that it also provides a useful proxy for the toxicity of anionic FC surfactants to daphnids.
While the application of Dmw in bioaccumulation assessments of surfactants, either directly or as an input parameter for bioaccumulation models, is now becoming established,19,51,52 the use of Dmw in aquatic toxicity assessments of surfactants has so far been limited.26,35 The use of coarse-grained Dmw values (instead of only the available experimental Dmw values) enabled the inclusion of all available ecotoxicity data in our QSARs. If restricted to experimental Dmw, the fish HC QSAR would contain only 9 data points of its current 21, and the daphnid HC QSAR only 17 of 51. We were, therefore, able to develop QSAR models for predicting aquatic toxicity of HC and FC anionic surfactants that span multiple anionic surfactant sub-classes and incorporate them into a single anionic surfactant QSAR for fish and daphnids. Ultimately, enhanced predictive capability helps to eliminate unnecessary animal testing. Derived QSARs facilitate the application of in silico New Approach Methodologies (NAMs) for aquatic toxicity screening, and hazard and risk assessment of anionic HC surfactants, which are high-tonnage, high-use substances commonly registered under REACH.
For FC surfactants, the derived daphnid QSAR model advances the ability to predict their aquatic toxicity as a group. Further work is needed to develop QSARs for other trophic levels and individual chemicals, including diverse PFAS structures and functional groups. However, even the potential of grouping some of these molecules as “anionic surfactants” is significant, given the sheer variety of PFAS chemicals in use, and current and ongoing regulatory initiatives for targeting them on a class basis.104–107
As shown in this study, coarse-grained models for new moieties can be parametrised from high-quality experimental data. The model parameters can then be transferred to related chemical species for which no experimental partitioning data are available. The strength of this modelling approach is that the full response of a molecule's conformations to the aqueous and membrane environments is then fully explored by the molecular dynamics. This approach can significantly increase the breadth of chemical coverage for future (eco)toxicity predictions over QSARs based on Kow/Dow.
Developments in coarse-grained simulation methods for deriving Dmw of surfactants are a significant step in modelling the environmental fate, behaviour and toxicity of surfactants and of ionisable chemicals in general. Further research will be directed towards expanding this approach to other ionisable chemicals including cationic and zwitterionic surfactants.
Footnote |
| † Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4em00649f |
| This journal is © The Royal Society of Chemistry 2025 |