DOI: 10.1039/D5GC00904A (Paper)
Green Chem., 2025, 27, 9679–9695
Metal bioaccumulation prediction via QSPR-q-RASPR synergy and cross-species risk analysis†
Received 19th February 2025, Accepted 14th May 2025
First published on 28th May 2025
Abstract
The bioconcentration factor (BCF) is a critical parameter for evaluating the ecological impact of chemical pollutants, reflecting their potential to accumulate in living organisms, particularly through respiratory pathways in aquatic ecosystems. Pollutants such as nanomaterials, organic compounds, metals, metal halides, and metal oxides can bioaccumulate within ecosystems, posing significant threats to biodiversity and ecosystem stability. For instance, elevated BCFs of metal oxides in aquatic environments have been linked to oxidative stress in marine invertebrates, highlighting the urgency of accurate bioaccumulation assessments. This study employed advanced Quantitative Structure–Property Relationship (QSPR) and Quantitative Read-Across Structure–Property Relationship (q-RASPR) modeling techniques to evaluate the bioconcentration potential of metals, metal halides, and metal oxides across diverse species. Additionally, Species bioaccumulation Sensitivity Distribution (SbSD) models were applied to analyze sensitivity patterns across 10 species groups, including algae, amphibians, fish, crustaceans, and molluscs, using BCF data. Key chemical descriptors, including total electronegativity, crystal ionic radius, and molecular bulk, significantly influence bioaccumulation patterns. Total electronegativity and crystal ionic radius negatively impact bioconcentration in algae. Molecular bulk positively correlates with accumulation in crustaceans. In fish, bioaccumulation is positively associated with electron count but negatively correlated with crystal ionic radius. The q-RASPR models consistently outperformed traditional QSPR approaches, offering robust predictive frameworks and deeper mechanistic insights into bioaccumulation processes. 
To support the application of QSPR and q-RASPR models, the authors offer access to the NanoSens CalTox platform at https://nanosens.onrender.com/pages/apps/, serving as a valuable resource for researchers, industry professionals, and regulatory authorities to conduct informed environmental and ecological risk assessments.
Green foundation
1. This work pioneers computational models (QSPR/q-RASPR) to predict bioaccumulation of metals and metal compounds, enabling early identification of ecotoxic hazards. By linking molecular descriptors like electronegativity and ionic radius to species-specific bioaccumulation, it supports the design of safer, less persistent chemicals, aligning with green chemistry's goal of pollution prevention.
2. The q-RASPR models outperformed traditional QSPR (test R2: 81–86%), reducing reliance on animal testing while offering mechanistic insights. The open-access NanoSens CalTox platform democratizes risk assessment, empowering regulators and industries to prioritize low-bioaccumulation chemicals, quantified via HC5 thresholds (protecting 95% of species).
3. Expanding datasets for metal oxides and underrepresented species (e.g., amphibians) would enhance model robustness. Integrating lifecycle assessment (LCA) metrics and green solvent compatibility could further guide sustainable material design. Collaborative efforts to validate models with in vitro assays would bridge computational-empirical gaps.
1. Introduction
The bioconcentration factor (BCF) is pivotal in assessing the ecological risks posed by chemical contaminants. The BCF estimate indicates how likely a chemical is to accumulate within living organisms, particularly through the lungs or gills in aquatic environments.1 Nanomaterials (NMs), such as metals, metal halides, and metal oxides, alongside organometallic compounds and a variety of organic chemicals like per- and polyfluoroalkyl substances (PFASs), polybrominated diphenyl ethers (PBDEs), polychlorinated biphenyls (PCBs), pharmaceutical and personal care products (PPCPs), and agrochemicals (pesticides, herbicides, insecticides, and biocides), gradually accumulate as contaminants in the environment.2,3 Over time, these substances can reach hazardous concentrations, leading to chronic exposure and long-term consequences for affected species.
These repercussions may include behavioral changes and reproductive disruptions, posing an existential threat to exposed populations. For instance, elevated BCFs of cadmium (Cd) have been linked to adverse effects in aquatic plants, such as hindering photosynthesis, highlighting the importance of robust ecological monitoring to prevent environmental degradation.4 Lead (Pb) bioaccumulation in aquatic fauna, especially fish, correlates with severe neurological impairments and reproductive failures, necessitating strict regulatory frameworks to manage these toxicants in marine environments.5 Triclosan, phthalates, and bisphenol A are among the well-documented hormone disruptors known to impact reproductive health, as highlighted in several studies.2,3,6 The persistence of metal oxides in aquatic environments due to their high BCFs induces oxidative stress in marine invertebrates, endangering biodiversity and ecosystem stability. These findings emphasize the immediate need for detailed studies on metal oxide bioaccumulation mechanisms and their broader ecological impacts.7
BCF is a key metric in ecotoxicological assessments, quantifying a chemical's concentration within organisms relative to its concentration in the surrounding environment under steady-state conditions, excluding dietary contributions. The BCF is defined by the equation:
$$\mathrm{BCF} = \frac{C_{\mathrm{organism}}}{C_{\mathrm{environment}}}$$
where $C_{\mathrm{organism}}$ represents the concentration of the chemical within the organism and $C_{\mathrm{environment}}$ is its concentration in the surrounding medium.8 The concept of lipophilicity, typically measured by the n-octanol/water partition coefficient (log P), is central to BCF dynamics. Compounds with higher log P values readily integrate into cellular membranes, enhancing their accumulation within organisms. This correlation is fundamental to understanding bioaccumulative properties.9 Furthermore, the physicochemical properties of substances, such as molecular size and polarity, play a critical role in their bioconcentration. Compounds with larger molecular sizes or higher polarity generally show reduced membrane permeability, resulting in lower BCF values. Understanding this relationship is essential for identifying the bioaccumulation limitations of specific chemicals.10
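As a worked illustration of the definition above, the ratio and its log10 transform can be computed as in the following minimal sketch. The function names and the concentrations are hypothetical and are not taken from the study data; units are assumed to be mg kg−1 for tissue and mg L−1 for water, giving a BCF in L kg−1.

```python
import math


def bcf(c_organism: float, c_environment: float) -> float:
    """Steady-state BCF (L/kg): tissue concentration over water concentration."""
    if c_environment <= 0:
        raise ValueError("environment concentration must be positive")
    return c_organism / c_environment


def log10_bcf(c_organism: float, c_environment: float) -> float:
    """log10 of the BCF, the scale commonly used for modeling."""
    return math.log10(bcf(c_organism, c_environment))


# Hypothetical example: 50 mg/kg in tissue, 0.05 mg/L in water.
example_bcf = bcf(50.0, 0.05)
example_log = log10_bcf(50.0, 0.05)
```

A tissue concentration 1000 times the water concentration corresponds to a log10 BCF of about 3.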
The influence of metabolic transformation on organisms significantly affects BCF. Metabolic processes convert lipophilic compounds into more hydrophilic metabolites, facilitating their excretion and thereby reducing their bioconcentration. Differences in metabolic rates across species contribute to variability in BCF values, emphasizing the significance of species-specific factors in ecotoxicological assessments.11 Together, these mechanisms elucidate why certain substances are more likely to bioaccumulate than others, underscoring the need for comprehensive risk assessments that account for a wide range of ecological and physiological variables.
Nanomaterials pose significant toxicological risks due to their potential for bioaccumulation in biological organisms, which can disrupt physiological processes and compromise ecosystem integrity. Despite their significance, these inorganic compounds have been underrepresented in bioaccumulation research, particularly in studies concerning their BCFs. This research gap is critical, as inorganic compounds exhibit distinct chemical behaviors compared to organic substances, underscoring the necessity for dedicated investigations into their environmental behavior and impact. Fig. 1 below provides an overview of the consumption, exposure, and bioaccumulative potentials of the studied chemicals.
Fig. 1 The overview of consumption, exposure, and bioaccumulative potential of studied metals, metal halides, and metal oxides.
Assessing BCFs provides essential insights into the environmental persistence and potential toxicity of chemical substances. For organic compounds, BCFs are primarily influenced by their lipophilicity, which is typically measured by the n-octanol/water partition coefficient (log P/log Kow). Higher log P values indicate a greater propensity for bioaccumulation in lipid-rich tissues, a relationship often corroborated by quantitative structure–property relationship (QSPR) models.9 However, for inorganic chemicals such as metals and metalloids, BCF assessment involves more intricate interactions. The accumulation of these substances depends not only on lipophilicity but also on factors such as ionic charge, water solubility, and biological processes including uptake, regulation, and sequestration by organisms. Metals, in particular, can bind to proteins and accumulate in specific organs, leading to significant toxicity regardless of their lipophilic properties.8
Computational models for predicting bioconcentration have made significant progress in the context of organic chemicals.12–14 QSPR frameworks using hydrophobicity (log P), molecular descriptors (e.g., ETA indices, topological parameters), and advanced machine learning15,16 have achieved reliable predictive accuracy. However, equivalent models for inorganic compounds, including metals, metal oxides, and nanomaterials, remain comparatively underdeveloped. These substances exhibit distinct physicochemical behaviors, such as ionic speciation, redox activity, and unique bio-nano interfacial dynamics, that cannot be adequately captured by traditional organic-centric modeling approaches. Recent efforts, such as the CD-MUSIC model for arsenic phytoavailability,17 and the QSPR analysis of PFAS protein binding,18 have begun to incorporate inorganic-specific interactions. However, these studies remain fragmented and often lack broader mechanistic generalizability. Existing models often rely on oversimplified assumptions (such as static hydrophobicity parameters for metals) or omit critical descriptors related to metal–ligand coordination, surface reactivity, and species-specific bioaccumulation mechanisms. This discrepancy underscores the need for advanced modeling approaches that are explicitly tailored to account for the unique behavior of inorganic substances, thereby improving the predictive accuracy and reliability of environmental risk assessments.7
To address these concerns, this study pioneers the development of QSPR models to predict the bioconcentration factors of metals, metal halides, and metal oxides, an area that remains largely unexplored in ecotoxicological research. Focusing on key environmental species, including algae, crustaceans, fish, molluscs, and plants, this research aims to fill a critical gap in understanding how these inorganic substances bioaccumulate across diverse taxa. Using simple multiple linear regression models (MLR), we identified chemical attributes that directly or indirectly influence systemic bioaccumulation in these organisms. To enhance the predictive performance of the QSPR models, we integrated them with Species bioaccumulation-Sensitivity Distribution (SbSD) models. SbSD models account for interspecies variability in chemical sensitivity, offering a broader ecological perspective on selective bioaccumulation and its potential environmental impacts. This novel integration represents a significant advancement in toxicological modeling, addressing a critical gap in the existing literature. The QSPR and SbSD models were further optimized using the quantitative Read-Across Structure–Property Relationship (q-RASPR) technique, which is particularly effective for enhancing predictions from small datasets.19–21 This comprehensive framework, which combines QSPR, SbSD, and q-RASPR, offers a robust tool for accurately predicting the BCFs of metals and their compounds. In addition to improving predictive accuracy, these models have the potential to revolutionize environmental risk assessments and support regulatory agencies in managing the ecological risks posed by these pollutants. This research is expected to drive significant advancements in environmental science and toxicology, laying a strong foundation for future studies and regulatory frameworks.
2. Materials and methods
2.1. Data collection
This manuscript presents a comprehensive evaluation of the bioaccumulative potential of 19 metals, 28 metal halides, and 15 metal oxides, using highly reliable data sources, including the ECOTOX database (https://cfpub.epa.gov/ecotox/), the Japan Chemicals Collaborative Knowledge (J-check) database (https://www.nite.go.jp/chem/jcheck/top.action?request_locale=en), and Environment Canada (https://www.canada.ca/en/environment-climate-change/services/canadian-environmental-protection-act-registry/substances-list/persistence-bioaccumulation-inherent-toxicity.html). The BCF serves as the primary metric for this analysis, providing critical insights into the bioaccumulation potential of these substances. The assessment covers 10 species groups, including amphibians, algae, crustaceans, fish, insects, invertebrates, molluscs, mosses, worms and plants. However, robust datasets were primarily available for metals and metal halides in algae, crustaceans, fish, mollusks, and plants, each comprising more than 10 data points. In contrast, for metal oxides, significant bioconcentration data were available only for fish, consisting of 14 entries. These limitations emphasize the uneven distribution of bioconcentration data across species and underscore the need for future research to address these gaps.
To ensure the reliability and broader acceptance of the models developed in this study, significant emphasis was placed on collecting consistent, high-quality data. This meticulous approach minimizes variability and enhances the predictive robustness of the resulting models, providing a solid foundation for understanding the bioaccumulative behaviors of these substances. The findings and methodologies discussed herein aim to support both scientific research and regulatory decision-making concerning the ecological risks associated with metals and their compounds.
2.2. BCF data curation
This study represents the first systematic exploration and modeling of the bioaccumulative potential of metals, metal halides, and metal oxides. To ensure the reliability of the proposed QSPR models, chemical and biological data were curated in accordance with established best practices.22 Proper data curation is essential, as poorly curated datasets can result in inflated statistical performance and overestimated predictive accuracy in QSPR modeling. The collected BCF data underwent a thorough review to correct errors, such as unit inconsistencies (e.g., misplaced commas and decimal separators), ensuring a consistent and reliable dataset.
For substances with multiple entries, a single representative value was selected for each Chemical Abstracts Service Registry Number (CASRN). To adopt a conservative approach and reflect a “worst-case scenario”, the highest available BCF value was chosen, representing the most bioaccumulative concentration.23 Although efforts were made to collect uniform data under consistent test conditions, variability in exposure duration persisted due to limited data availability. Since exposure time can influence BCF estimates, this study prioritizes identifying the chemical properties that drive bioaccumulation rather than focusing on the in vivo accumulation dynamics.24–26 Despite the variability in exposure times, data with differing durations were included to broaden the scope of the analysis, with all BCF values standardized to a common unit (L kg−1). Prior to analysis, all values marked as “NA” were replaced with NaN (standard representation for missing or undefined numerical values). Rows containing any missing values were then removed to ensure a complete dataset for statistical analysis. A subsequent log10 transformation was applied, as is customary in QSPR studies. Full details of the collected BCF data are provided in ESI-1 (Sheets 1–3†).
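The curation steps described above (treating “NA” as missing, dropping incomplete rows, keeping the highest BCF per CASRN, and applying a log10 transform) could be sketched as follows. This is an illustrative sketch only, not the authors' script: the record layout, the placeholder identifiers, and the rule of discarding non-positive values before the logarithm are assumptions, and missing rows are dropped per record rather than in a separate pass.

```python
import math


def curate_bcf(records):
    """Return {casrn: log10(max BCF)} from (casrn, raw_value) pairs.

    raw_value may be a number, a numeric string, or the marker "NA".
    """
    worst_case = {}
    for casrn, raw in records:
        value = float("nan") if raw == "NA" else float(raw)
        # Assumption: missing and non-positive values cannot be log-transformed
        # and are therefore discarded.
        if math.isnan(value) or value <= 0:
            continue
        # Conservative choice: keep the highest BCF reported for each CASRN.
        if casrn not in worst_case or value > worst_case[casrn]:
            worst_case[casrn] = value
    return {casrn: math.log10(v) for casrn, v in worst_case.items()}


# Hypothetical records; identifiers are placeholders, not real CASRNs.
raw_records = [
    ("CHEM-A", "100"),
    ("CHEM-A", "1000"),
    ("CHEM-B", "NA"),
    ("CHEM-C", 10.0),
]
curated = curate_bcf(raw_records)
```

Here the two entries for the first placeholder chemical collapse to the larger value, and the entry marked “NA” is removed entirely.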
2.3. Descriptor calculation
The chemical structure information was retrieved from the PubChem database using CASRNs as identifiers. The structures were recorded in SMILES notation in Excel sheets to facilitate descriptor calculations. First-generation elemental descriptors were computed using the Elemental-Descriptor 1.0 software tool (freely available at https://www.qsar.eu.org/software) following the guidelines provided in the Elemental Descriptors Manual (https://nanobridges.eu/wp-content/uploads/2015/05/Element_Descriptors_Manual.pdf). This initial set of 31 descriptors was then expanded by 16 second-generation descriptors derived from the first-generation descriptors, according to the methodology described by De et al. (2018).27 Additionally, five third-generation descriptors, capturing the physicochemical properties of specific compound groups, were incorporated based on the work of Kar and Yang (2024)28 (compiled from https://pubchem.ncbi.nlm.nih.gov, https://www.lenntech.com/periodic-chart-elements/ionization-energy.htm/, and https://environmentalchemistry.com/yogi/periodic/ionicradius.html).
The inherent lipophilicity of the molecules, expressed as log P (or log Kow), was calculated using the U.S. EPA EPI Suite version 4.11 (https://www.epa.gov/tsca-screening-tools/epi-suitetm-estimation-program-interface). These descriptors collectively formed a comprehensive set of 53 variables. The initial data processing step involved removing descriptors with intercorrelations greater than 0.9 to reduce redundancy. Following this refinement, no variables were excluded, resulting in a robust and non-redundant set of descriptors for modeling. The complete list of descriptors used in this study is provided in Sheets 4–6 of ESI-1.†
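The intercorrelation screen described above can be illustrated with a short sketch. It assumes that the 0.9 cut-off applies to the absolute Pearson correlation and that, within a correlated pair, the descriptor encountered first is retained; neither detail is specified in the text, and the descriptor names and values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def drop_intercorrelated(columns, threshold=0.9):
    """Keep a descriptor only if |r| <= threshold against all kept descriptors."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical descriptor table: "d2" is almost a multiple of "d1".
descriptors = {
    "d1": [1.0, 2.0, 3.0, 4.0],
    "d2": [2.0, 4.0, 6.0, 8.1],
    "d3": [4.0, 1.0, 3.0, 2.0],
}
selected = drop_intercorrelated(descriptors)
```

In this toy table the second descriptor is removed because it is nearly collinear with the first, while the third is retained.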
2.4. Dataset division and feature selection
In QSPR model development, selecting appropriate training and test sets is crucial for building balanced and reliable models. Developing a robust model using only a single descriptor is inherently challenging. Following the guidelines proposed by Topliss & Costello,29 we aimed to use at least two descriptors, requiring a minimum of 10 compounds in the training set to ensure statistical reliability. Although a dataset of this size may seem limited, the reliability of a QSPR model depends more on the complexity and relevance of the selected descriptors than solely on dataset size. The dataset was partitioned into training and test sets using a random division approach, following the methodology recommended by Martin et al. (2012).30 For datasets with sufficient samples, a 75:25 training-to-test ratio was applied. In cases of limited data (e.g., datasets with fewer than 15 compounds), a minimum of 10 compounds were reserved for training, with the remaining allocated to testing based on availability. The details of the assignment to the modeling set (i.e., training set or test set) for all models are shown in the ESI (Sheets 7–9 of ESI-1†). The training set was used for feature selection and model development, while the test set was exclusively reserved for validating the predictive performance of the constructed QSPR models.31–33 This methodology enhances the model's generalizability and ensures accurate predictions of toxicological properties for new compounds. Given the modest size of the dataset and the complexity of the employed descriptors, we utilized the best subset selection (BSS) analysis tool (available at https://teqip.jdvu.ac.in/QSAR_Tools/) to identify the most optimal combination of molecular descriptors. This approach adhered to the Topliss & Costello ratio of 5:1 (ref. 29), ensuring a balanced proportion between the number of training compounds and selected descriptors. The final MLR-based QSPR models were evaluated using a suite of quantitative metrics to assess their robustness and predictive power, confirming that they meet established QSPR validation standards. Further details on the data set size and the number of compounds in the training and test sets used for the development and validation of QSPR, q-RASPR, and SbSD models for metals, metal halides, and metal oxides across different species can be found in Sheet 10 of ESI-1.†
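The division rules above can be sketched as a small helper. This is a hypothetical illustration rather than the procedure actually used: the seed, the rounding of the 75% share, and the function names are assumptions. The second helper expresses the 5:1 compound-to-descriptor ratio as an upper bound on model size.

```python
import random


def split_dataset(ids, train_fraction=0.75, min_train=10, small_cutoff=15, seed=0):
    """Random training/test split with a floor of min_train training compounds."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    if len(shuffled) < small_cutoff:
        # Small dataset: reserve min_train compounds, test on whatever remains.
        n_train = min(min_train, len(shuffled))
    else:
        n_train = max(min_train, round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


def max_descriptors(n_train, ratio=5):
    """Largest descriptor count allowed by an n_train:descriptor ratio of 5:1."""
    return n_train // ratio


train_20, test_20 = split_dataset(range(20))
train_12, test_12 = split_dataset(range(12))
```

With 10 training compounds, the 5:1 ratio permits at most two descriptors, which matches the two-descriptor models targeted in this section.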
2.5. Model development strategies
The model development process remained consistent throughout all stages, adhering strictly to OECD guidelines for QSAR validation.34–36 Both the QSPR and SbSD endpoints (mean (yμ) and standard deviation (yσ)) were modeled independently to ensure methodological rigor and alignment with international standards. Given the relatively small dataset, this study employed a straightforward yet effective MLR technique. This method was selected to establish linear relationships between inherent molecular features derived solely from periodic table properties, and the bioaccumulative potential of inorganic chemicals across key species groups, including algae, crustaceans, fish, molluscs, and plants. The simplicity and interpretability of MLR make it particularly well-suited for identifying meaningful correlations within smaller datasets while providing a transparent mathematical representation of the relationship between descriptors and observed bioaccumulative properties. The significance of descriptors was statistically verified through p-value analysis (<0.05), confirming that even descriptors with smaller coefficients contribute meaningfully to the model's predictive performance. Furthermore, this study emphasizes identifying the sensitivity hierarchy of these species based on their bioaccumulative potential across taxa. In this context, MLR models offer a dual advantage: they enable both the prediction of bioaccumulation potential and the assessment of species-specific sensitivity, thereby providing a comprehensive understanding of chemical behavior in ecological systems. To achieve these objectives, two distinct modeling approaches were employed.
2.5.1. Simple QSPR models.
In the first approach, periodic table-based descriptors were utilized to explore correlations between elemental properties and both the bioaccumulation potential and species-specific sensitivity to metals, metal oxides, and metal halides, using MLR as the modeling method.
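A two-descriptor MLR fit of the kind described here, together with a leave-one-out cross-validated Q2, can be sketched in plain Python as follows. The solver, function names, and the synthetic data are illustrative assumptions; the study itself used dedicated QSAR tools, and the numbers below are not study data.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mlr(X, y):
    """Ordinary least squares via the normal equations; returns [intercept, b1, ...]."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def q2_loo(X, y):
    """Leave-one-out Q2: 1 - PRESS / total sum of squares about the mean."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    return 1.0 - press / sum((t - mean_y) ** 2 for t in y)


# Synthetic data generated from y = 1 + 2*x1 - 0.5*x2 (noise-free).
X_demo = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0], [6.0, 4.0]]
y_demo = [1.0 + 2.0 * a - 0.5 * b for a, b in X_demo]
coef_demo = fit_mlr(X_demo, y_demo)
q2_demo = q2_loo(X_demo, y_demo)
```

On noise-free data the fit recovers the generating coefficients and Q2 approaches 1; real BCF data would give lower values.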
2.5.2. q-RASPR models.
The second approach involved calculating statistically derived read-across descriptors (RASPR descriptors) based on the original periodic table-based descriptors.20,21 These refined descriptors were then used to model the relationships between read-across features and both bioaccumulation potential and species sensitivity across the studied taxa, employing the MLR technique. The q-RASPR methodology, which integrates the principles of read-across and QSPR, enables quantitative predictions by leveraging both error-based and similarity-based measures. This approach generates innovative descriptors such as Neg.Avg.Sim, SE, SD Activity, CVact, and others (details provided in Sheet 11 of ESI-1†). The descriptors were calculated using the Read-Across-v4.2.1 tool (available at https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home). Key hyperparameters for q-RASPR descriptor computation included: sigma-value = 1, gamma-value = 1, number of similar training compounds = 3, distance-threshold = 1, and similarity-threshold = 0 (ref. 19 and 21). These settings were optimized to ensure robust predictions.
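To make the idea of similarity-based read-across descriptors concrete, the sketch below derives a few such features for one query compound from its three most similar training compounds, using a Gaussian kernel with sigma = 1. It is not the Read-Across-v4.2.1 implementation: the kernel form, the similarity-weighted mean and standard deviation, and the definitions used here for SE and the coefficient of variation are assumptions made for illustration, and the data are hypothetical.

```python
import math


def gaussian_similarity(a, b, sigma=1.0):
    """Gaussian-kernel similarity exp(-d^2 / (2 sigma^2)) with Euclidean d."""
    d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def read_across_features(query, train_X, train_y, n_close=3, sigma=1.0):
    """Similarity-weighted summary of the n_close most similar training compounds."""
    scored = sorted(
        ((gaussian_similarity(query, x, sigma), y) for x, y in zip(train_X, train_y)),
        reverse=True,
    )[:n_close]
    total = sum(s for s, _ in scored)
    if total == 0:
        raise ValueError("query has no measurable similarity to the training set")
    w_mean = sum(s * y for s, y in scored) / total
    w_sd = math.sqrt(sum(s * (y - w_mean) ** 2 for s, y in scored) / total)
    return {
        "ra_prediction": w_mean,                  # weighted-average response
        "sd_activity": w_sd,                      # weighted SD of close responses
        "se": w_sd / math.sqrt(len(scored)),      # assumed: SD / sqrt(n_close)
        "cv_act": w_sd / abs(w_mean) if w_mean else float("nan"),
        "avg_sim": total / len(scored),           # mean similarity to neighbours
    }


# Hypothetical two-descriptor training set and query.
features = read_across_features(
    query=[0.0, 0.0],
    train_X=[[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [10.0, 10.0]],
    train_y=[2.0, 2.0, 2.0, 9.0],
)
```

Features of this kind would then enter an MLR model in place of, or alongside, the original descriptors.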
2.6. Model validation and applicability domain analysis
The development of the QSPR models followed internationally recognized OECD QSAR validation guidelines34–36 to ensure both internal robustness and external predictive accuracy. Internal validation metrics included the determination coefficient (R2) to assess goodness-of-fit and the leave-one-out cross-validation coefficient (QLOO2) to evaluate the model's stability. External validation was conducted using predictive performance metrics (Rpred2 or QF12, QF22, QF32) and the root mean square error (RMSE) for both training (RMSEc) and test sets (RMSEp). This approach provided a robust and unbiased measure of model accuracy. Additionally, the developed models were evaluated using the widely accepted Golbraikh and Tropsha criteria,35 along with MAE-based evaluation metrics,36 to further reinforce their reliability and acceptability.
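The external validation metrics named above can be computed from observed and predicted test-set values as in the following sketch. The formulas follow the commonly used definitions of QF12, QF22, and QF32 (differing in the reference mean and normalization); the function name and the numbers are hypothetical and do not reproduce any model from this study.

```python
import math


def external_metrics(y_train, y_test, y_pred_test):
    """External validation metrics from observed/predicted test responses."""
    n_train, n_test = len(y_train), len(y_test)
    mean_train = sum(y_train) / n_train
    mean_test = sum(y_test) / n_test
    press = sum((o - p) ** 2 for o, p in zip(y_test, y_pred_test))
    ss_train = sum((o - mean_train) ** 2 for o in y_train)
    return {
        # Q2F1: test deviations measured from the training-set mean.
        "q2_f1": 1.0 - press / sum((o - mean_train) ** 2 for o in y_test),
        # Q2F2: test deviations measured from the test-set mean.
        "q2_f2": 1.0 - press / sum((o - mean_test) ** 2 for o in y_test),
        # Q2F3: per-compound test error against per-compound training variance.
        "q2_f3": 1.0 - (press / n_test) / (ss_train / n_train),
        "rmse_p": math.sqrt(press / n_test),
        "mae": sum(abs(o - p) for o, p in zip(y_test, y_pred_test)) / n_test,
    }


# Hypothetical log-scale responses.
metrics = external_metrics(
    y_train=[1.0, 2.0, 3.0, 4.0, 5.0],
    y_test=[2.0, 4.0],
    y_pred_test=[2.5, 3.5],
)
```

For this toy input, QF12 and QF22 coincide because the training and test means happen to be equal.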
Model robustness was evaluated using the Y-randomization test, a validation method designed to identify chance correlations. In this approach, the response variable (y) is randomly shuffled while the predictor matrix (X) remains unchanged, thereby eliminating any true structure-response relationship.37 Permuted models are expected to demonstrate significantly lower predictive performance (i.e., reduced R2 and Q2 values) compared to the original model, thus confirming that the observed correlations are not due to random chance. To quantify robustness, the cRp2 metric was calculated as follows:
$${}^{c}R_{p}^{2} = R_{\mathrm{cal}} \times \sqrt{R_{\mathrm{cal}}^{2} - R_{\mathrm{rand}}^{2}} \qquad (1)$$
where $R_{\mathrm{cal}}^{2}$ and $R_{\mathrm{rand}}^{2}$ represent the determination coefficients of the original and randomized models, respectively. A threshold of ${}^{c}R_{p}^{2}$ > 0.5 was applied, ensuring that the model's predictive power arises from meaningful chemical relationships rather than statistical artifacts (Tables S1 and S2 of ESI-2†).
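The Y-randomization procedure and the cRp2 metric can be sketched for a single-descriptor model as follows. The sketch assumes the commonly used form cRp2 = R × sqrt(R2 − Rrand2), with Rrand2 taken as the mean R2 over the permuted models; the number of permutations, the seed, and the data are hypothetical.

```python
import math
import random


def r_squared(x, y):
    """R2 of a one-descriptor least-squares line (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy ** 2 / (sxx * syy)


def c_rp2(r2, r2_rand):
    """cRp2 = R * sqrt(R2 - R2_rand); clipped at zero if permuted models fit better."""
    return math.sqrt(r2) * math.sqrt(max(r2 - r2_rand, 0.0))


def y_randomization(x, y, n_perm=100, seed=0):
    """Shuffle y while keeping x fixed, then compare permuted and original R2."""
    rng = random.Random(seed)
    r2 = r_squared(x, y)
    permuted = []
    for _ in range(n_perm):
        shuffled = list(y)
        rng.shuffle(shuffled)
        permuted.append(r_squared(x, shuffled))
    r2_rand = sum(permuted) / n_perm
    return r2, r2_rand, c_rp2(r2, r2_rand)


# Hypothetical data with a genuine linear structure-response relationship.
x_demo = [float(i) for i in range(12)]
y_demo = [2.0 * v + 1.0 for v in x_demo]
r2_demo, r2_rand_demo, crp2_demo = y_randomization(x_demo, y_demo)
```

A model built on a real relationship should retain a cRp2 above 0.5, whereas a chance correlation would collapse toward zero once the responses are shuffled.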
The applicability domain (AD) was evaluated using a standardized approach for MLR models,38 with a focus on the structural similarity between the training and test compounds. Compounds that were markedly different from the training set (X-outliers) or those with descriptor values (S_i) exceeding the mean ± 3SD threshold were considered outside the AD. Conversely, predictions for compounds with S_i values within the mean ± 1.28 SD were classified as reliable. This systematic validation and AD assessment framework, illustrated in Fig. 2, ensures the robustness, reliability, and practical utility of the models in predicting bioaccumulative potential and species-specific sensitivity across key taxa.38
Fig. 2 The overall methodology adopted for modeling BCF in the current research.
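The applicability domain check described in this section can be illustrated with the sketch below. It follows one common reading of the standardization approach: each descriptor of a query compound is standardized against the training-set mean and standard deviation, a compound is inside the AD if all standardized values are at most 3, outside if all exceed 3, and otherwise judged by whether mean + 1.28 × SD of its standardized values stays at or below 3. This decision rule, the use of the sample standard deviation, and the data are assumptions, not a transcription of the authors' tool.

```python
import statistics


def in_applicability_domain(x, train_X, limit=3.0, z=1.28):
    """Standardization-based AD check for one query compound."""
    standardized = []
    for value, column in zip(x, zip(*train_X)):
        mean = statistics.fmean(column)
        sd = statistics.stdev(column)
        # A constant training descriptor carries no distance information.
        standardized.append(abs(value - mean) / sd if sd > 0 else 0.0)
    if max(standardized) <= limit:
        return True       # every descriptor lies within the training range
    if min(standardized) > limit:
        return False      # every descriptor is far from the training set
    # Mixed case: summarize the standardized values with mean + z * SD.
    s_new = statistics.fmean(standardized) + z * statistics.stdev(standardized)
    return s_new <= limit


# Hypothetical two-descriptor training set.
train_demo = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
inside = in_applicability_domain([3.0, 30.0], train_demo)
far_outside = in_applicability_domain([100.0, 1000.0], train_demo)
mixed_case = in_applicability_domain([3.0, 100.0], train_demo)
```

In the mixed case one descriptor sits at the training mean while the other is more than four standard deviations away, so the summary statistic exceeds the limit and the prediction would be flagged as unreliable.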
3. Results and discussion
The QSPR and q-RASPR models for bioaccumulation prediction were developed using a chemically diverse dataset, including metals, metal/semimetal halides, and oxides, capturing key structural and toxicological features across species. For metals, the dataset comprised alkaline earth elements (Ca, Ba), transition metals (Fe, Mn, Hg), and post-transition metals (Al, Pb, Sn). Correlation analysis (Fig. 3A) revealed strong interspecies relationships, such as algae-crustaceans (53%) and molluscs-plants (64%). Notably, molluscs showed a high correlation with mean toxicity (82%), while algae were closely aligned with mean BCF (60%), indicating consistent bioaccumulation patterns (Fig. 3A). For metal halides, the models included representatives from the alkali (LiBr, NaF, CsCl), alkaline earth (BeCl2, BaCl2, CaCl2), transition (CuCl2, HgCl2, FeCl3), post-transition (AlCl3, PbCl2, InCl3), and semimetal halides (SbCl3). Strong correlations were observed between algae and fish (65%) and between crustaceans and fish (38%). Crustaceans exhibited a particularly high correlation with mean toxicity (87%), while fish aligned with mean BCF (64%), reflecting species-specific bioaccumulation patterns (Fig. 3B). In the case of metal oxides, the dataset included transition metal oxides (CdO, ZnO, CuO, Fe2O3), post-transition oxides (Pb3O4, Al2O3, In2O3), semimetal oxides (As2O3, Sb2O3), and reactive non-metal oxides (SeO2). However, cross-species comparisons were restricted by inconsistent BCF data, limiting the generalizability of bioaccumulation trends for oxides (Fig. 3C).
Fig. 3 Correlation plot of (A) metals, (B) metal halides, and (C) metal oxides explaining intercorrelations of bioaccumulation potential of various chemicals among different species groups.
3.1. Bioconcentration factor modeling of metals towards various species
QSPR models using periodic table-derived descriptors showed weak correlations in predicting BCF across algae, crustaceans, fish, molluscs, and plants (eqn (2), (4), (6), (8), and (10)). However, integrating similarity-based read-across (RASPR) variables significantly improved performance across all species. The enhanced q-RASPR models explained 71%–91% of the variance (R2) in the training set, with 55%–84% captured in leave-one-out (LOO) validation. Test set predictive accuracy ranged from 66% to 83% (QF12), meeting validation thresholds. Prediction errors were quantified using RMSE for both training and test sets. Eqn (2)–(11) provide model compositions, including training/test set sizes, descriptor combinations, and validation metrics. Additional details are available in Sheets 7–9 of ESI-1.† This analysis highlights the role of periodic table-derived and similarity-based descriptors in modulating metal bioaccumulation across targeted species.
Metal BCF model towards algae:
pBCF (Algae) = 7.5 − 0.739 × ∑ε − 0.081 × ionic radius   (2)
pBCF (Algae) = 3.94 + 5.51 × SE (GK) − 10.7 × CVact (LK)   (3)
The MLR-based QSPR model for metal bioaccumulation in algae identifies the sum of electronegativities (∑ε) and crystal ionic radius as key BCF predictors, both negatively influencing bioaccumulation. Higher ∑ε values correlate with lower BCF; ∑ε reflects metal reactivity and stability. Ionic radius affects ion uptake, with smaller radii enhancing bioaccumulation, as seen in cadmium (pBCF = 3.26, ionic radius = 97) compared with chromium (pBCF = 4.47, ionic radius = 52). However, exceptions such as silver and mercury exhibit high BCF despite ionic radii >100, owing to unique properties. q-RASPR models incorporating SE(GK) and CVact(LK) further improved predictive accuracy, refining bioaccumulation estimations.
Metal BCF model towards crustaceans:
|
| (4) |
|
| (5) |
The descriptor (∑α)2 captures molecular bulk and core environment, reflecting metal count and electron affinity, both influencing bioaccumulation. Higher (∑α)2 values enhance pBCF in crustaceans, as seen in silver (132.25, pBCF = 4), mercury (60.84, pBCF = 3.10), and cadmium (33.06, pBCF = 3), while iron (0.56) shows lower bioaccumulation. Crystal ionic radius, representing atomic size, inversely affects bioaccumulation: smaller radii promote uptake, as is evident when comparing barium (135, pBCF = 2.00) with manganese (46, pBCF = 3.88). These descriptors underscore the critical role of molecular bulk and atomic size in metal bioaccumulation.
Metal BCF model towards fish:
|
| (6) |
|
| (7) |
Electron_ActiveM, representing the number of active electrons in a metal, drives reactivity, cation formation, and interactions with biological membranes. In pBCF(Fish) modeling, it positively correlates with bioaccumulation, while ionic radius shows a negative correlation. Metals with Electron_ActiveM = 0 (e.g., antimony, arsenic, selenium) exhibit low bioaccumulation (−0.40 to 0.86 pBCF), whereas mercury (80) has a high pBCF of 4.43. Larger crystal ionic radius values, as seen in calcium (99), lead (119), silver (126), and barium (135), correspond to lower bioaccumulation (∼0.50 pBCF). These findings highlight the roles of electron activity and atomic size, alongside solubility, ionic charge, and complex formation, in fish bioaccumulation. Aluminum and manganese were excluded from q-RASPR models due to anomalous behavior.
Metal BCF model towards molluscs:
|
| (8) |
|
| (9) |
Neutrons_Activ_SM, representing neutron count in active metals and semi-metals, negatively correlates with pBCF(Molluscs), suggesting higher neutron counts reduce bioaccumulation, as seen in arsenic (42, pBCF = 2.04) and antimony (71, pBCF = 0.63). Conversely, Zv_metal (valence electrons) positively influences bioaccumulation, with vanadium (5, pBCF = 3.75), tin (6, pBCF = 4.51), and iron (pBCF = 5.06) showing strong accumulation. Silver (pBCF = 3.00) defies this trend, exceeding antimony despite its single valence electron. Lead's bioaccumulation in bones and mollusks underscores the interplay of neutron count, valence electrons, and bioaccumulation.39 Silver (Compound 7), a response outlier, was excluded from the QSPR model.
Metal BCF model towards plants:
|
n_total = 10, R² < 0.6, Q²(LOO) < 0.5, RMSEc = 0.64
| (10) |
|
n_total = 10, R² = 0.86, Q²(LOO) = 0.64, RMSEc = 0.34
| (11) |
The electronegativity descriptor (χ) inversely correlates with bioaccumulation in plants, as lower χ values facilitate electron loss and cation formation. Although exceptions exist, decreasing χ generally increases bioaccumulation; e.g., cadmium (χ = 1.69, pBCF = 5.26) accumulates more than nickel (χ = 1.91, pBCF = 2.06), and this accumulation disrupts photosynthesis and nutrient uptake.17 Ionic radius, in contrast, shows a positive correlation, with higher radii linked to increased bioaccumulation from aluminum (53.5, pBCF = 2.34) to lead (119, pBCF = 4.02). q-RASPR models incorporating MaxNeg(GK) and MaxNeg(Euc) further improved predictive accuracy.
Fig. 4 below highlights key molecular attributes that influence the bioaccumulation potential of metals across targeted species.
Fig. 4 Chemical features controlling bioaccumulation potential of various metals in targeted species.
3.2 Bioconcentration Modeling of metal halides towards various species
The enhanced q-RASPR models for metal halides explained 71–93% of the training-set variance (R²) and 58–87% under leave-one-out cross-validation (Q²(LOO)), while test-set predictivity, measured by the Q²F1 metric, ranged from 62% to 82% and met the minimum validation threshold. Prediction errors were quantified separately for training and test sets using RMSE. Further details on the quantitative metrics are provided in eqn (13), (15), (17), (19), and (21). Eqn (12)–(21) present the MLR equations for QSPR and q-RASPR models for metal halides. This section explores the key chemical features influencing the bioaccumulation potential of metal halides, highlighting factors that either enhance or reduce their bioaccumulating behavior across various targeted species.
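The validation statistics quoted above (R2, leave-one-out Q2, QF12, and RMSE) can be computed from observed and predicted pBCF values. The following Python sketch is illustrative only: it assumes a single-descriptor least-squares model, and the function names and toy numbers are hypothetical rather than taken from the study's dataset.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def r2(y_obs, y_pred):
    """1 - SS_res / SS_tot, with SS_tot taken about the mean of y_obs."""
    my = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - my) ** 2 for o in y_obs)
    return 1 - ss_res / ss_tot

def rmse(y_obs, y_pred):
    """Root-mean-square error (RMSEc on training data, RMSEp on test data)."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / len(y_obs))

def q2_loo(x, y):
    """Leave-one-out Q2: refit without each compound, then predict it."""
    preds = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a + b * x[i])
    return r2(y, preds)

def q2_f1(y_test, y_test_pred, y_train):
    """External Q2_F1: test errors scaled by deviations from the TRAINING mean."""
    m_train = sum(y_train) / len(y_train)
    press = sum((o - p) ** 2 for o, p in zip(y_test, y_test_pred))
    ss_ext = sum((o - m_train) ** 2 for o in y_test)
    return 1 - press / ss_ext

# Hypothetical descriptor values and pBCF responses (not from the paper).
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_train = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
x_test, y_test = [2.5, 4.5], [2.4, 4.6]

a, b = fit_line(x_train, y_train)
train_pred = [a + b * xi for xi in x_train]
test_pred = [a + b * xi for xi in x_test]

r2_train = r2(y_train, train_pred)
q2_train = q2_loo(x_train, y_train)
q2f1_test = q2_f1(y_test, test_pred, y_train)
print(f"R2={r2_train:.3f} Q2(LOO)={q2_train:.3f} Q2F1={q2f1_test:.3f} "
      f"RMSEc={rmse(y_train, train_pred):.3f} RMSEp={rmse(y_test, test_pred):.3f}")
```

For a least-squares fit, the leave-one-out Q2 can never exceed the training R2, so a large gap between the two is a quick signal that a model leans on individual compounds.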
Metal halide BCF model towards algae:
pBCF(Algae) = −1.94 + 0.0173 × MW + 9.93 × μ
| (12) |
|
| (13) |
The algal MLR-based QSPR model identifies molecular weight (MW) and μ (a periodic descriptor) as key predictors of metal halide bioaccumulation. Higher MW correlates with increased solubility and lower lattice energy, enhancing bioconcentration, e.g., in mercurous chloride (Hg2Cl2, MW = 472.1, pBCF = 5.10).40 The descriptor μ (1/PNmetal) reflects atomic structure and cationic charge propensity, influencing bioaccumulation. Together, these factors explain size, solubility, and reactivity trends in algae. q-RASPR models incorporating similarity-based features further improved predictive accuracy.
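As a worked illustration of how the algal QSPR equation is applied, the sketch below evaluates pBCF(Algae) = −1.94 + 0.0173 × MW + 9.93 × μ for a query compound. Reading PNmetal as the period number of the metal is an assumption made here, and the function name and input values are hypothetical; they do not correspond to any compound in the dataset.

```python
def pbcf_algae_halide(mw, period_number):
    """Evaluate the algal QSPR equation for a metal halide.

    mw            -- molecular weight of the halide (g/mol)
    period_number -- PN of the metal (assumed here to be its period number)
    """
    mu = 1.0 / period_number          # mu = 1/PN_metal, as stated in the text
    return -1.94 + 0.0173 * mw + 9.93 * mu

# Hypothetical query: MW = 200 g/mol, metal in period 4.
base = pbcf_algae_halide(200.0, 4)
heavier = pbcf_algae_halide(300.0, 4)

print(f"pBCF at MW=200: {base:.4f}")
print(f"gain for +100 g/mol: {heavier - base:.2f}")
```

Because the MW coefficient is positive, each additional 100 g/mol raises the predicted pBCF by 1.73 units at fixed μ, which is the quantitative form of the MW trend described above.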
Metal halide BCF model towards crustaceans:
|
| (14) |
|
| (15) |
The crustacean bioconcentration model identifies nHalogen and ∑χ/nHalogen as key predictors, both positively correlated with bioaccumulation (pBCF). These descriptors capture halogen electronegativity and metal-halogen interactions, explaining high pBCF in compounds like mercury chloride, lead chloride, and copper chloride. However, q-RASPR models using SE(EUC) and gm(GK) (Banerjee–Roy Coefficient) demonstrated superior predictive accuracy.
Metal halide BCF model towards fish:
|
| (16) |
|
| (17) |
The negative coefficient of the Metals_SumIP descriptor suggests that the sum of ionization potential energies (kJ mol−1) of metals reduces pBCF in fish. Metals_SumIP reflects the energy required to remove an electron, with higher values indicating greater difficulty in forming cations, increased electronegativity, and lower bioavailability in aquatic systems. For example, increasing Metals_SumIP values from barium chloride (502.7) to cadmium chloride (866) results in decreased pBCF values (1.78 to 0.30). Additionally, the molecular weight (MW) of metal halides, previously shown to enhance pBCF in algae, also plays a significant role in fish bioaccumulation, emphasizing its predictive reliability across species.
Metal halide BCF model towards molluscs:
|
| (18) |
|
| (19) |
The predictive model for molluscs reveals that two elemental descriptors, Neutons_ActiveM (number of neutrons of active metal) and SuMElectrons_Active_M_SM (summation of electrons of active metals and semimetals), effectively estimate pBCF. Neutons_ActiveM shows a positive correlation with mollusc pBCF, as observed in antimony trichloride (Neutons_ActiveM = 0, pBCF = −0.82), aluminum chloride (Neutons_ActiveM = 14, pBCF = 2.40), and copper chloride (Neutons_ActiveM = 35, pBCF = 2.71). Conversely, SuMElectrons_Active_M_SM exhibits a negative correlation, indicating that metals with higher electron counts, such as mercury chloride (SuMElectrons_Active_M_SM = 80, pBCF = 0.08), show reduced bioaccumulation potential. Similar trends are seen in lead chloride, antimony trichloride, and cadmium chloride, highlighting the role of electron density in bioaccumulation. Additionally, q-RASPR studies using SE(GK) and CVact(Euc) descriptors demonstrated higher predictive accuracy than these two elemental descriptors.
Metal halide BCF model towards plants:
|
| (20) |
|
| (21) |
The descriptor D3_HeteroNonMetals, representing the total number of non-metallic heteroatoms (e.g., N, O, F, P, S, Cl, Se, Br, I, At), shows a negative correlation with plant bioaccumulation potential. In contrast, VWR_Activ_M (van der Waals radius of active metal) correlates positively with pBCF, as seen in selenium chloride (VWR_Activ_M = 0, pBCF = −0.34), nickel chloride (VWR_Activ_M = 0.124, pBCF = 2.00), copper chloride (VWR_Activ_M = 0.128, pBCF = 2.95), and cadmium chloride (VWR_Activ_M = 0.154, pBCF = 4.08). This suggests that larger van der Waals radii enhance bioaccumulation in plants. Moreover, q-RASPR studies with g(Euc) and gm(Euc) descriptors (Banerjee–Roy Coefficient) provide superior predictive accuracy over D3_HeteroNonMetals and VWR_Activ_M.
Fig. 5 below highlights key molecular attributes that influence the bioaccumulation potential of metal halides across targeted species.
Fig. 5 Chemical features controlling bioaccumulation potential of various metal halides in targeted species.
3.3 Bioconcentration Modeling of metal oxide towards various species
Data on metal oxides across species were limited: only fish had enough data (14 points) to develop predictive models, while other species (e.g., crustaceans, algae, molluscs) did not reach the required number of data points (>10). QSPR models for fish BCF, using periodic table descriptors, showed weak correlations, as indicated by their performance metrics (eqn (22)). Incorporating RASPR variables did not significantly improve model quality (eqn (23)). The fish q-RASPR models explained 55% of the variance (R²) and had a predictive accuracy of 28% for the test set (Q²F1), falling below the optimal threshold. Despite these limitations, the analysis identifies key chemical features influencing metal oxide bioaccumulation in fish.
Metal oxide BCF model towards fish:
pBCF(Fish) = 20.0 − 43.0 × μ − 0.0777 × ionic radius
| (22) |
|
| (23) |
The fish QSPR model for metal oxides highlights that the second-generation periodic table descriptor μ and the ionic radius can effectively predict BCF. μ, which negatively correlates with BCF, reflects the metal's atomic number, period, valence electrons, and core environment. For instance, metal oxides like CdO, HgO, PbO, and Pb3O4, with lower μ values, show higher BCFs. While the ionic radius also negatively correlates with BCF, larger metal oxides, such as Pb3O4 and HgO, may enhance BCF due to their bulkiness. Despite lower solubility, the particle size and surface reactivity of metal oxides, like ZnO nanoparticles, can release toxic ions that impact organisms, highlighting the need for further research and careful ecosystem management.41
3.4 Species bioaccumulation-sensitivity distribution modeling
Gaussian distributions, widely applied in SSD models, characterize bioaccumulation variability using three key endpoints: the mean (μ), standard deviation (σ), and hazardous concentration for 5% of species (HC5). HC5, representing the concentration at which 5% of species are adversely affected, is calculated as:
HC5 = exp(yμ + z0.05 × yσ)
| (24) |
where yμ and yσ denote the mean and standard deviation of log-transformed BCF values, and z0.05 is the 5th-percentile quantile of the standard normal distribution (approximately −1.645). A narrow standard deviation (yσ) in SbSD models indicates uniform bioaccumulation sensitivity across species, whereas a broader yσ reflects heterogeneous responses, driven by divergent uptake mechanisms among phylogenetically distinct organisms (e.g., algae vs. fish) (Fig. 6).42 In this study, SbSD models were applied to metals, metal halides, and metal oxides, utilizing log-transformed BCF data from ecologically critical species, including algae, crustaceans, and fish. Strong correlations in BCF values among taxonomically related species reduced yσ, elevating the protective HC5 threshold. Conversely, weak correlations broadened yσ, lowering HC5 and signaling heightened ecological risk. While robust SbSDs were developed for metals and halides across ten species groups (Sheet 8, ESI-1†), metal oxide models were limited by sparse data (only compounds with data for n ≥ 3 species could be included). Despite this constraint, preliminary SbSDs provided actionable insights into oxide bioaccumulation trends, underscoring the need for expanded datasets to refine risk thresholds for underrepresented contaminants.42
Fig. 6 Examples of SbSD plots showing bioaccumulation sensitivity across targeted species among metals (A and B), metal halides (C and D), and metal oxides (E and F).
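The HC5 relation in eqn (24) can be illustrated with a short Python sketch. It assumes natural-log transformation (so that the exponential inverts it) and the sample standard deviation; both choices, the function name, and the BCF values are assumptions for illustration, not details confirmed by the study.

```python
from math import exp, log
from statistics import NormalDist, mean, stdev

# 5th-percentile quantile of the standard normal distribution (about -1.645).
Z05 = NormalDist().inv_cdf(0.05)

def hc5(bcf_values):
    """HC5 = exp(y_mu + z_0.05 * y_sigma) from per-species BCF values (L/kg)."""
    logs = [log(v) for v in bcf_values]   # natural log: an assumption
    y_mu = mean(logs)
    y_sigma = stdev(logs)                 # sample SD: an assumption
    return exp(y_mu + Z05 * y_sigma)

# Hypothetical per-species BCFs with the same geometric mean (100 L/kg).
uniform_species = [80.0, 100.0, 125.0]       # narrow spread across species
divergent_species = [10.0, 100.0, 1000.0]    # wide spread across species

hc5_narrow = hc5(uniform_species)
hc5_wide = hc5(divergent_species)
print(f"z0.05 = {Z05:.3f}")
print(f"HC5 (narrow spread) = {hc5_narrow:.2f} L/kg")
print(f"HC5 (wide spread)   = {hc5_wide:.2f} L/kg")
```

Because z0.05 is negative, a larger yσ always pulls HC5 further below the geometric mean, which is why heterogeneous species responses translate into a lower threshold.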
3.4.1 Modeling bioaccumulation sensitivities (SbSD models) for various chemicals.
The SbSD models prioritized HC5 (Hazardous concentration for 5% of species) as the primary regulatory endpoint, calculated as HC5 = exp(yμ + z0.05 × yσ), where yμ (mean) and yσ (standard deviation) of log-transformed BCF values quantify central bioaccumulation trends and interspecies variability, respectively. While yμ and yσ models (Table S3 in ESI-2†) support mechanistic interpretation, HC5 integrates these parameters to define thresholds protecting 95% of species from adverse bioaccumulation effects. The q-RASPR-based SbSD models achieved robust predictive performance, explaining 80–98% of training variance and up to 92% of test variance (Rpred2/QF12) for metals and halides. Optimal models, selected via RMSE minimization,36 demonstrated strong generalizability, except for metal halides, where suboptimal μ predictions highlighted data limitations (Table S3 in ESI-2†). Key descriptors driving HC5 sensitivity (eqn (25) and (26)) are interpreted below, linking chemical properties to taxon-specific bioaccumulation mechanisms.
3.4.2 Modeling SbSD towards metals.
|
| (25) |
The metal HC5 model identifies atomic radius and electronegativity (MaxNeg(Euc)) as key determinants of HC5 (L kg−1), with larger atoms and lower electronegativity reducing bioaccumulation. For instance, HC5 decreases from 1.21 (Pb, 194 pm) to 0.51 (Sb, 203 pm), while Cd (MaxNeg(Euc) = 0.870, HC5 = 21.41) shows higher bioaccumulation than Ag (MaxNeg(Euc) = 1, HC5 = 2.48). Metals with fewer outer-shell electrons (Electrons_ActivM, PN_metal) form cations more readily, increasing bioaccumulation via negatively charged membranes. SD_p(BCF) models highlight Neutrons_ActiveM (positive) and PN_metal (negative) as influential, with q-RASPR refinements (MaxNeg(GK), gm(LK)) enhancing predictions. Cobalt, an outlier (absolute residual >1.5), was excluded (Table S4 in ESI-2†).
3.4.3 Modeling SbSD towards metal halides.
|
| (26) |
The Metal Halide HC5 model links lower X_ActivM (electronegativity) to higher HC5 (L kg−1), as seen in MnCl2 (X_ActivM = 1.5, HC5 = 8.436) versus HgCl2 (X_ActivM = 1.9, HC5 = 1.143). MaxPos(LK), measuring toxicity similarity, correlates positively, exemplified by copper chloride (CuCl2) (MaxPos(LK) = 1.0, HC5 = 16.036) compared to another compound (MaxPos(LK) = 0.89, HC5 = 2.89). VWR_ActivM (van der Waals radius) and Nhalogen (halogen count) also contribute to bioaccumulation, but q-RASPR models favor SD Activity(Euc) and CVact(LK) for accuracy. Log Kow and atomic radius (pm) refine SD_p(BCF) predictions, with MaxPos(LK) further improving q-RASPR performance. Iron chloride was excluded for poor predictive reliability (Table S4 in ESI-2†).
3.4.4 Modeling SbSD towards metal oxide.
Bioconcentration data for metal oxides across species were notably limited. Only three compounds, arsenic pentoxide (As2O5), arsenic trioxide (As2O3), and selenite (SeO32−), met the minimum criterion of having bioconcentration data available for three or more species, which is essential for evaluating SSDs. These compounds exhibited comparable bioaccumulation potential across species, as evidenced by their low standard deviation values in the normal distribution plots (Fig. 6E and F). This consistency suggests a uniform bioaccumulation response pattern for these metal oxides across the species studied.
3.5 Mechanistic relevance of similarity-based read-across (RASPR) variables
The RASPR descriptors were generated using advanced data analysis techniques applied to the QSPR model, incorporating both dependent (y) and independent (X) variables. Statistical methods such as weighted standard deviation, coefficient of variation, and similarity metrics (Euclidean distance, Gaussian kernel, and Laplacian kernel functions, denoted Euc, GK, and LK in the descriptor names) were employed to capture key patterns, improving the robustness and predictive accuracy of BCF models. After generating the descriptors (detailed in Sheet 11 of ESI-1†), they were integrated with the optimal model descriptors for read-across calculations. The dataset was split into training and test sets to ensure unbiased model evaluation. Initial QSPR and SbSD models produced suboptimal results due to small dataset size and inconsistencies from integrating multiple data sources.20
To address these issues, the q-RASPR method was applied. This technique, effective with small and heterogeneous datasets, combines read-across and QSPR principles for better predictive accuracy. q-RASPR improved performance across all five target species (algae, crustaceans, fish, molluscs, and plants), with the fish BCF model achieving an R² of 0.87 and RMSE of 0.08, significantly outperforming the QSPR model (R²: 0.62, RMSE: 2.68). Mechanistically, q-RASPR incorporates descriptors such as electronegativity, ionic radius, molecular bulk, and charge, which align with bioaccumulation mechanisms like membrane transport and ion channel transfer. By combining these with similarity-based variables, q-RASPR enhances predictive accuracy for diverse metals and compounds, ensuring reliable predictions, even for underrepresented chemicals.12,19
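To make the idea of similarity-based read-across variables concrete, the sketch below derives a similarity-weighted prediction, its weighted standard deviation, and a coefficient of variation from the closest training compounds. It is a simplified illustration under stated assumptions: a Gaussian-kernel similarity on pre-scaled descriptors, a fixed kernel width, and three neighbours. The function names, the dictionary keys, and the data are hypothetical, and the exact definitions used by the published descriptor tool may differ.

```python
from math import exp, sqrt

def gaussian_similarity(a, b, sigma=1.0):
    """Gaussian-kernel similarity in (0, 1]; equals 1 for identical vectors."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return exp(-d2 / (2 * sigma ** 2))

def read_across_features(query, train_X, train_y, k=3, sigma=1.0):
    """Similarity-weighted summary of the k closest training compounds."""
    ranked = sorted(
        ((gaussian_similarity(query, x, sigma), y) for x, y in zip(train_X, train_y)),
        reverse=True,
    )[:k]
    w_sum = sum(s for s, _ in ranked)
    ra = sum(s * y for s, y in ranked) / w_sum               # weighted mean response
    var = sum(s * (y - ra) ** 2 for s, y in ranked) / w_sum  # weighted variance
    sd = sqrt(var)
    cv = sd / abs(ra) if ra != 0 else float("inf")
    return {"RA": ra, "SD": sd, "CV": cv, "MaxSim": ranked[0][0]}

# Hypothetical scaled descriptors and pBCF values (not from the paper).
train_X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
train_y = [2.0, 3.0, 3.0, 0.5]

features = read_across_features([0.0, 0.0], train_X, train_y)
print(features)
```

In a q-RASPR workflow, variables of this kind are appended to the selected structural descriptors before the final regression is fitted, so the model can draw on both chemistry and neighbourhood information.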
4. Applicability domain analysis
The applicability domain (AD) of the QSPR, q-RASPR, and SbSD models, all based on multiple linear regression, was evaluated using the standardization approach.38 Except for the fish metal halide and metal HC5 models, they all achieved 100% AD coverage. The fish metal halide model identified Hg2Cl2 (mercurous chloride) as an outlier due to its large molecular size, while the metal HC5 model flagged cadmium (HC5 = 21.42) and barium (atomic radius = 268 pm) as outliers due to high variance and atomic size, respectively. Table S4† provides a detailed summary of outliers and model coverage, reinforcing the models’ robustness in predicting bioaccumulation potential across diverse chemicals.
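The standardization approach to the applicability domain can be sketched as follows. The sketch follows the commonly cited description of the method: descriptors are standardized with training-set statistics, a compound is inside the domain if all standardized values are at most 3, outside if all exceed 3, and otherwise judged by the mean plus 1.28 standard deviations of its standardized values. The use of the sample standard deviation, the function name, and the data are assumptions made for illustration.

```python
from statistics import mean, stdev

def inside_domain(train_X, query, threshold=3.0):
    """Return True if `query` falls inside the applicability domain."""
    columns = list(zip(*train_X))
    # Absolute standardized descriptor values, using training mean and SD.
    s = [abs(q - mean(col)) / stdev(col) for q, col in zip(query, columns)]
    if max(s) <= threshold:
        return True            # every descriptor is within 3 SD
    if min(s) > threshold:
        return False           # every descriptor is beyond 3 SD
    # Mixed case: compare mean + 1.28 * SD of the standardized values.
    s_new = mean(s) + 1.28 * stdev(s)
    return s_new <= threshold

# Hypothetical two-descriptor training set (not from the paper).
train_X = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [4.0, 13.0], [5.0, 14.0]]

typical = inside_domain(train_X, [3.0, 12.0])    # at the training means
extreme = inside_domain(train_X, [20.0, 40.0])   # far from all training data
print(typical, extreme)
```

A compound such as Hg2Cl2, whose molecular weight sits far from the rest of the training set, would be flagged by the same logic because at least one standardized descriptor value becomes very large.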
5. Y-randomization test
We acknowledge the challenges posed by limited sample sizes in metal bioaccumulation studies, particularly the increased risk of partitioning bias in small test sets (Ntest = 1–6). To rigorously evaluate model robustness, we performed Y-randomization (50 permutations), which showed significantly reduced predictive performance (R2, Q2) in the randomized data sets compared to original models. The resulting cRp2 values exceeded 0.5, confirming resistance to overfitting (Tables S1 and S2 in ESI-2†). In addition, repeated random subsampling (three iterations per model) was performed, keeping the descriptor sets fixed while varying the selection of test compounds. Inter-iteration variability in RMSEc (training) and RMSEp (test) remained below 20%, and stable RMSEp/RMSEc ratios (Tables S5 and S6 in ESI-2†) indicated consistent generalizability. Statistical analyses revealed that over 90% of the models achieved comparable descriptor distributions between the training and test sets, supporting their representativeness despite the limited data.
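The Y-randomization check can be sketched in a few lines of Python. The example assumes a single-descriptor least-squares model, for which R2 equals the squared Pearson correlation, and uses the usual definition cRp2 = R × sqrt(R2 − Rr2), where Rr2 is the mean R2 of the randomized models. The function names, the seed, and the data are hypothetical.

```python
import random
from math import sqrt

def r_squared(x, y):
    """Squared Pearson correlation, i.e. R2 of a one-descriptor OLS fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)

def y_randomization(x, y, n_perm=50, seed=0):
    """Return (R2, mean randomized R2, cRp2) after shuffling the response."""
    rng = random.Random(seed)
    r2 = r_squared(x, y)
    randomized = []
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)        # break the descriptor-response link
        randomized.append(r_squared(x, y_perm))
    rr2 = sum(randomized) / n_perm
    c_rp2 = sqrt(r2) * sqrt(max(r2 - rr2, 0.0))
    return r2, rr2, c_rp2

# Hypothetical descriptor and response with a strong linear relationship.
x = [float(i) for i in range(1, 13)]
y = [0.5 * xi + (0.1 if i % 2 else -0.1) for i, xi in enumerate(x)]

r2_model, r2_random, c_rp2 = y_randomization(x, y)
print(f"R2={r2_model:.3f} mean randomized R2={r2_random:.3f} cRp2={c_rp2:.3f}")
```

A cRp2 above 0.5 indicates that the original fit is much better than what chance correlation with shuffled responses produces.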
6. In-depth evaluation of the predictive performance of QSPR, q-RASPR and SbSD using synthetic data
To further strengthen validation, particularly for our QSPR, q-RASPR, and SbSD models, we employed SMOGN (Synthetic Minority Over-sampling Technique for Regression with Gaussian Noise).43 This technique was used to synthetically generate new data points in underrepresented regions of the target distribution, thereby increasing the diversity and balance of the validation sets. The quantitative results obtained from the SMOGN-enhanced validation sets are reported in Tables S7 and S8 (ESI-2†). This approach increased the number of “unseen” data points used in the external validation process, allowing for a more reliable and rigorous evaluation of the models’ predictive performance in accordance with OECD validation principles.
7. SbSD derivation of extended list of chemicals
After validating the predictive accuracy and robustness of our MLR-based QSPR, q-RASPR, and SbSD models, we selected the high-performing q-RASPR and SbSD models to rank an expanded dataset of 29 metals, 50 metal halides, and 49 metal oxides. This extended list, sourced from existing literature and enriched with theoretically generated halides and oxides based on metal valence and octet properties, comprehensively covers both current and potential industrial chemicals. The detailed compound list is available in the ESI-1 (Sheet 12).† Metal halides, widely utilized in catalysis, lighting,44 chemical synthesis,45 and semiconductors,46 and metal oxides, essential in pigments,47 ceramics,48 coatings,49 electronics,50 and energy storage,51 are known for their high toxicity and extensive environmental presence. Therefore, strict regulatory measures are crucial to prevent environmental contamination and reduce associated health risks for both humans and ecosystems.
To address existing data gaps, we generated chemical structures of the extended compounds in SMILES format, calculated descriptors using validated computational tools, and adhered to the methodology used for the original dataset. Predictions of pBCF were performed across key species endpoints, including algae, crustaceans, fish, molluscs, and plants. Furthermore, MLR models derived from q-RASPR and SbSD endpoints (mean and standard deviation) were employed to assess sensitivity profiles across 10 species groups, such as algae, amphibians, insects, crustaceans, fish, molluscs, moss, and plants. The resulting data matrix was then visualized through normal distribution plots for metals, metal halides, and metal oxides, offering a comprehensive bioaccumulation sensitivity profile for each chemical category.
For enhanced accessibility, the QSPR, q-RASPR, and SbSD models for predicting pBCF across algae, crustaceans, fish, molluscs, and plants are now publicly available at https://nanosens.onrender.com/pages/apps/. Users can easily select specific chemicals from a dropdown menu for a quick and efficient BCF analysis. Additionally, the platform NanoSens CalTox allows users to input descriptor values of query chemicals to predict BCF values for targeted species accurately. This user-friendly, data-driven tool empowers industrial and regulatory stakeholders by offering reliable insights into the bioaccumulation potential of metallic compounds, enabling informed decision-making to mitigate environmental and biological risks. Such accessible computational tools play a pivotal role in promoting sustainable material development and ensuring environmental safety in emerging technologies.
8. Model/method comparison
The performance of machine learning models is typically assessed by comparing them against established benchmarks from the literature and publicly available tools. In this study, a comprehensive review of BCF-related models was conducted using research repositories such as Scopus, PubMed, and Google Scholar, alongside public search engines, to identify available BCF prediction tools.52–55 This extensive search ensured a fair and thorough comparison with existing models, particularly those designed to predict BCF properties and sensitivities of inorganic chemicals, including metals, metal halides, and metal oxides. This approach enabled a robust and meaningful evaluation of the newly developed models, highlighting their reliability and predictive accuracy.
The predictions generated by these expert models,52–55 summarized in Table S9 in ESI-2,† reveal a critical limitation: they frequently fall outside the applicability domain when applied to the targeted set of inorganic chemicals, including metals, metal halides, and metal oxides. While these tools perform well in predicting BCF values for organic compounds, their predictive accuracy is considerably limited for emerging materials, such as nanomaterials and advanced chemical frameworks (Fig. S1–S4 in ESI-2†). Finally, paired Student's t-tests were performed to evaluate whether statistically significant differences exist between experimental results and predictions from publicly available in silico tools (i.e., EPI Suite,56 TEST,57 OECD QSAR Toolbox,58 and OCHEM59 model) and the models presented in this study (i.e., QSPR and q-RASPR models). For all three classes of metal-based compounds (i.e., metals, metal halides, and metal oxides), the calculated t-test values were less than the critical t-values, and the corresponding p-values were greater than the 5% significance level (p ≥ 0.05). The summarized results presented in Table S10 (in ESI-2†) confirm that there are no statistically significant differences between the experimentally observed pBCF(Fish) values and those predicted by the QSPR and q-RASPR models. However, as shown in Table S10 (in ESI-2†), significant differences (p = 0.032 < 0.05) were observed when comparing experimental pBCF(Fish) values for metal halides with those estimated using in silico tools such as EPI Suite, TEST and OECD QSAR Toolbox. Statistically significant differences (p = 0.013 < 0.05) were also observed in the pBCF(Fish) values predicted by the OCHEM model for metal oxides. This underscores the pressing need for tailored predictive models capable of addressing the unique physicochemical properties of these materials.
Our study bridges this gap by presenting reliable and validated predictive models, offering a significant advancement in the environmental risk assessment of these often-overlooked chemical categories.
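The paired t-test comparison described in this section can be illustrated with a minimal Python sketch. It computes the paired t statistic from observed and predicted values and compares its magnitude with the two-sided 5% critical value for four degrees of freedom (2.776). The function name and all numbers are hypothetical and are not the values reported in Table S10.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT_DF4 = 2.776   # two-sided 5% critical value of Student's t for df = 4

def paired_t(observed, predicted):
    """Paired t statistic: mean difference divided by its standard error."""
    d = [o - p for o, p in zip(observed, predicted)]
    return mean(d) / (stdev(d) / sqrt(len(d)))

# Hypothetical experimental pBCF values and two sets of predictions.
observed = [2.0, 3.1, 1.5, 4.0, 2.6]
close_model = [2.1, 3.0, 1.6, 3.9, 2.7]    # errors scatter around zero
biased_tool = [3.0, 4.2, 2.4, 5.1, 3.5]    # systematic over-prediction

t_close = paired_t(observed, close_model)
t_biased = paired_t(observed, biased_tool)

print(f"close model: |t| = {abs(t_close):.2f}, significant: {abs(t_close) > T_CRIT_DF4}")
print(f"biased tool: |t| = {abs(t_biased):.2f}, significant: {abs(t_biased) > T_CRIT_DF4}")
```

A statistic below the critical value means the test finds no systematic difference between predictions and experiment, which is the outcome reported for the QSPR and q-RASPR models.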
9. BCF modeling significance for nanomaterials and advanced materials
BCF modeling is essential for assessing the environmental and biological risks associated with nanomaterials and advanced materials, including nanoparticles, metal oxides, metal halides, and advanced composites. These materials possess unique physicochemical properties, such as high surface area-to-volume ratios, variable solubility, and surface reactivity, all of which significantly influence their bioavailability and bioaccumulation potential.1,60 BCF modeling helps quantify their potential to accumulate in aquatic organisms. Assessing the bioconcentration of nanoparticles is challenging due to limited bioaccumulation data, the high cost of animal studies, and significant knowledge gaps in determining the equilibrium state. BCF models can also predict the ecological effects of nanomaterials once they are released into the environment. Furthermore, predicted BCF values are important for anticipating ecotoxicity, which is necessary for regulatory compliance. By combining predicted BCF values with other relevant factors, we can evaluate the potential impact of nanomaterials on the ecosystem.
10. Conclusions
With rapid progress in the fields of nanomaterials, metal–organic frameworks, and perovskites, the use of substances such as metals, metal oxides, and metal halides has expanded in applications ranging from catalysis, electronics, and renewable energy to sensors and medical technologies. Given their high toxicity and bioaccumulation potential, comprehensive risk assessments are imperative. Assessing the BCF is essential for understanding the ecological risks associated with chemical pollutants, particularly in aquatic environments. Our study introduces the first comprehensive computational tool capable of predicting pBCF across species using QSPR, q-RASPR, and SbSD models. Traditional BCF models often fail to capture nanoscale interactions, highlighting the need for advanced QSPR and q-RASPR models.61,62 The findings highlight the significant role of BCF in characterizing the accumulation of various substances, including metals and metal compounds, in living organisms, which can induce oxidative stress and threaten biodiversity. Advanced modelling techniques such as QSPR and q-RASPR have proven effective in predicting bioaccumulation potential, underscoring the importance of physicochemical properties and metabolic processes. This study emphasizes the urgent need for comprehensive data collection to address existing knowledge gaps related to the bioaccumulation of inorganic compounds. It also highlights the necessity of a more in-depth investigation into the behaviour of nanomaterials (metals, metal halides, and metal oxides), which remain underrepresented in existing studies. The primary aim of this investigation is to enhance predictive capabilities by integrating diverse modelling approaches to better assess ecological risks associated with chemical pollutants.
Accurate predictions play a vital role in regulatory compliance, safer material design, and risk mitigation, thereby promoting the sustainable development of nanotechnology.63 By improving our understanding of metal bioaccumulation processes, this study lays a strong foundation for future research and regulatory frameworks that bridge data gaps, support environmentally responsible innovation, and empower stakeholders to design safer materials while minimizing environmental risks. Such tools are indispensable for guiding the sustainable development of emerging advanced materials. The methodologies developed here have the potential to enhance risk assessments and support regulatory decision-making, ultimately contributing to more effective management of ecological health.
Author contributions
R. Abdullayev: data curation, methodology, investigation, writing – original draft. K. Khan: conceptualization, data curation, investigation, supervision, writing – original draft, writing – review & editing, funding acquisition. G. K. Jillella: methodology, investigation, writing – original draft. V. G. Nair: visualization, software. S. A. Amin: methodology, investigation, writing – original draft. J. Roy: writing – original draft. M. Bousily: writing – original draft. A. Gajewicz-Skretna: conceptualization, supervision, methodology, investigation, writing – review & editing.
Data availability
The data supporting this article have been included as part of the ESI.†
Conflicts of interest
There are no conflicts to declare.
Acknowledgements
This research is part of the project no. 2022/45/P/NZ7/03391 co-funded by the National Science Centre and the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 945339. For the purpose of Open Access, the author has applied a CC-BY public copyright licence to any Author Accepted Manuscript (AAM) version arising from this submission. We also acknowledge the use of Canva (www.canva.com) for designing elements incorporated into Figures 1, 2, 4, 5, and 6 of this manuscript.
References
- K. Khan, V. Kumar, E. Colombo, A. Lombardo, E. Benfenati and K. Roy, Environ. Int., 2022, 170, 107625 CrossRef CAS PubMed
.
- H. Wang, S. Tang, X. Zhou, R. Gao, Z. Liu, X. Song and F. Zeng, Environ. Res., 2022, 204, 112398 CrossRef CAS PubMed
.
- S. Benjamin, E. Masai, N. Kamimura, K. Takahashi, R. C. Anderson and P. A. Faisal, J. Hazard. Mater., 2017, 340, 360–383 CrossRef CAS PubMed
.
-
S. E. Manahan, Fundamentals of environmental chemistry, CRC Press, Florida, 2011 Search PubMed
.
-
D. B. Peakall, Principles of Ecotoxicology, CRC Press, 2016 Search PubMed
.
- J. R. Rochester, Reprod. Toxicol., 2013, 42, 132–155 CrossRef CAS PubMed
.
- F. Lunghini, G. Marcou, P. Azam, R. Patoux, M. H. Enrici, F. Bonachera, D. Horvath and A. Varnek, SAR QSAR Environ. Res., 2019, 30, 507–524 CrossRef CAS PubMed
.
-
S. E. Manahan, Industrial Ecology, Routledge, 2017 Search PubMed
.
-
J. D. Walker and C. Michael, Newman and Monica Enache, Fundamental QSARs for metal ions, CRC Press, 2012 Search PubMed
.
-
F. Gagné, Biochemical ecotoxicology: principles and methods, Elsevier, Waltham, 2014 Search PubMed
.
-
M. C. Newman and W. H. Clements, Ecotoxicology, CRC Press, 2007 Search PubMed
.
- S. Pore, A. Pelloux, M. Chatterjee, A. Banerjee and K. Roy, J. Hazard. Mater., 2024, 479, 135725 CrossRef CAS PubMed
.
- Z. Chen, N. Li, L. Li, Z. Liu, W. Zhao, Y. Li, X. Huang and X. Li, Environ. Res., 2025, 264, 120356 CrossRef CAS PubMed
.
- D. Kowalska, A. Sosnowska, S. Zdybel, M. Stepnik and T. Puzyn, Chemosphere, 2024, 364, 143146 CrossRef CAS PubMed
.
- H. Ai, X. Wu, L. Zhang, M. Qi, Y. Zhao, Q. Zhao, J. Zhao and H. Liu, Ecotoxicol. Environ. Saf., 2019, 179, 71–78 CrossRef CAS PubMed
.
- S. Kar and A. Gallagher, J. Hazard. Mater., 2024, 480, 136060 CrossRef CAS PubMed
.
- Q. Tian, J. He, S. He, Q. Zhang, H. Li, L. Peng, D. Huang, H. Zhu, X. Liu and Q. Zhu, J. Hazard. Mater., 2025, 493, 138092 CrossRef CAS PubMed
.
- Y. Zhou, Y. Liu, X. Yuan, Y. Ruan and H. Chen, Ecotoxicol. Environ. Saf., 2025, 291, 117902 CrossRef CAS PubMed
.
- A. Banerjee and K. Roy, Mol. Diversity, 2022, 26, 2847–2862 CrossRef CAS PubMed
.
- A. Banerjee and K. Roy, Expert Opin. Drug Discovery, 2024, 19, 1017–1022 CrossRef CAS PubMed
.
- K. Khan, G. K. Jillella and A. Gajewicz-Skretna, Aquat. Toxicol., 2024, 277, 107136 CrossRef CAS PubMed
.
- D. Fourches, E. Muratov and A. Tropsha, J. Chem. Inf. Model., 2010, 50, 1189–1204 CrossRef CAS PubMed
.
- P. Gramatica, S. Cassani and A. Sangion, Green Chem., 2016, 18, 4393–4406 RSC
.
- X. Wu, F. Ernst, J. L. Conkle and J. Gan, Environ. Int., 2013, 60, 15–22 CrossRef CAS PubMed
.
- C. Pan, M. Yang, H. Xu, B. Xu, L. Jiang and M. Wu, Chemosphere, 2018, 205, 8–14 CrossRef CAS PubMed
.
- J. Liu, G. Lu, Y. Wang, Z. Yan, X. Yang, J. Ding and Z. Jiang, Chemosphere, 2014, 99, 102–108 CrossRef CAS PubMed
.
- P. De, S. Kar, K. Roy and J. Leszczynski, Environ. Sci.: Nano, 2018, 5, 2742–2760 RSC
.
- S. Kar and S. Yang, Beilstein J. Nanotechnol., 2024, 15, 1142–1152 CrossRef CAS PubMed
.
- J. G. Topliss and R. J. Costello, J. Med. Chem., 1972, 1066–1068 CrossRef CAS PubMed
.
- T. M. Martin, P. Harten, D. M. Young, E. N. Muratov, A. Golbraikh, H. Zhu and A. Tropsha, J. Chem. Inf. Model., 2012, 52, 2570–2578 CrossRef CAS PubMed
.
- G. K. Jillella, K. Khan and K. Roy, Toxicol. in Vitro, 2020, 65, 104768 CrossRef CAS PubMed
.
- J. G. Krishna, P. K. Ojha, S. Kar, K. Roy and J. Leszczynski, Nano Energy, 2020, 70, 104537 CrossRef CAS
.
- G. K. Jillella, P. K. Ojha and K. Roy, Aquat. Toxicol., 2021, 238, 105925 CrossRef CAS PubMed
.
- A. Cherkasov, E. N. Muratov, D. Fourches, A. Varnek, I. I. Baskin, M. Cronin, J. Dearden, P. Gramatica, Y. C. Martin, R. Todeschini, V. Consonni, V. E. Kuz'min, R. Cramer, R. Benigni, C. Yang, J. Rathman, L. Terfloth, J. Gasteiger, A. Richard and A. Tropsha, J. Med. Chem., 2014, 57, 4977–5010.
- A. Golbraikh and A. Tropsha, J. Mol. Graphics Modell., 2002, 20, 269–276.
- K. Roy, R. N. Das, P. Ambure and R. B. Aher, Chemom. Intell. Lab. Syst., 2016, 152, 18–33.
- K. Khan and K. Roy, SAR QSAR Environ. Res., 2019, 30, 665–681.
- K. Roy, S. Kar and P. Ambure, Chemom. Intell. Lab. Syst., 2015, 145, 22–29.
- A. Botté, C. Seguin, J. Nahrgang, M. Zaidi, J. Guery and V. Leignel, Ecotoxicology, 2022, 31, 194–207.
- Y. Ge, X. Liu, F. Nan, Q. Liu, J. Lv, J. Feng and S. Xie, Water, 2022, 14, 3228.
- T. A. Jarvis, R. J. Miller, H. S. Lenihan and G. K. Bielmyer, Environ. Toxicol. Chem., 2013, 32, 1264–1269.
- K. Khan and K. Roy, Green Chem., 2022, 24, 2160–2178.
- P. Branco, L. Torgo and R. P. Ribeiro, in First International Workshop on Learning with Imbalanced Domains: Theory and Applications, 2017, pp. 36–50.
- W. Zhang, G. E. Eperon and H. J. Snaith, Nat. Energy, 2016, 1, 16048.
- T. R. Bartlett, C. Batchelor-McAuley, K. Tschulik, K. Jurkschat and R. G. Compton, ChemElectroChem, 2015, 2, 522–528.
- S. D. Stranks and H. J. Snaith, Nat. Nanotechnol., 2015, 10, 391–402.
- V. Stengl, Dyes Pigm., 2003, 58, 239–244.
- K. Mukae, K. Tsuda and I. Nagasawa, Jpn. J. Appl. Phys., 1977, 16, 1361–1368.
- W. Gulbiński, T. Suszko, W. Sienicki and B. Warcholiński, Wear, 2003, 254, 129–135.
- X. Yu, T. J. Marks and A. Facchetti, Nat. Mater., 2016, 15, 383–396.
- J. Jiang, Y. Li, J. Liu, X. Huang, C. Yuan and X. W. (David) Lou, Adv. Mater., 2012, 24, 5166–5180.
- I. Sushko, S. Novotarskyi, R. Körner, A. K. Pandey, M. Rupp, W. Teetz, S. Brandmaier, A. Abdelaziz, V. V. Prokopenko, V. Y. Tanchuk, R. Todeschini, A. Varnek, G. Marcou, P. Ertl, V. Potemkin, M. Grishina, J. Gasteiger, C. Schwab, I. I. Baskin, V. A. Palyulin, E. V. Radchenko, W. J. Welsh, V. Kholodovych, D. Chekmarev, A. Cherkasov, J. Aires-de-Sousa, Q.-Y. Zhang, A. Bender, F. Nigsch, L. Patiny, A. Williams, V. Tkachenko and I. V. Tetko, J. Comput. Aided Mol. Des., 2011, 25, 533–554.
- M. I. Petoumenou, F. Pizzo, J. Cester, A. Fernández and E. Benfenati, Environ. Res., 2015, 142, 529–534.
- S. D. Dimitrov, R. Diderich, T. Sobanski, T. S. Pavlov, G. V. Chankov, A. S. Chapkanov, Y. H. Karakolev, S. G. Temelkov, R. A. Vasilev, K. D. Gerova, C. D. Kuseva, N. D. Todorova, A. M. Mehmed, M. Rasenberg and O. G. Mekenyan, SAR QSAR Environ. Res., 2016, 27, 203–219.
- A. A. Toropov, A. P. Toropova, M. Marzo, J. L. Dorne, N. Georgiadis and E. Benfenati, Environ. Toxicol. Pharmacol., 2017, 53, 158–163.
- M. L. Card, V. Gomez-Alvarez, W.-H. Lee, D. G. Lynch, N. S. Orentas, M. T. Lee, E. M. Wong and R. S. Boethling, Environ. Sci.: Processes Impacts, 2017, 19, 203–212.
- T. Martin, P. Harten and D. Young, TEST (Toxicity Estimation Software Tool) Ver 4.1, US Environmental Protection Agency, Washington, DC, 2012.
- T. W. Schultz, R. Diderich, C. D. Kuseva and O. G. Mekenyan, in Computational Toxicology: Methods and Protocols, 2018, pp. 55–77.
- V. Kovalishyn, N. Abramenko, I. Kopernyk, L. Charochkina, L. Metelytsia, I. V. Tetko, W. Peijnenburg and L. Kustov, Food Chem. Toxicol., 2018, 112, 507–517.
- R. D. Handy, N. van den Brink, M. Chappell, M. Mühling, R. Behra, M. Dušinská, P. Simpson, J. Ahtiainen, A. N. Jha, J. Seiter, A. Bednar, A. Kennedy, T. F. Fernandes and M. Riediker, Ecotoxicology, 2012, 21, 933–972.
- F. Gottschalk, T. Sun and B. Nowack, Environ. Pollut., 2013, 181, 287–300.
- D. A. Walker, B. Kowalczyk, M. O. de la Cruz and B. A. Grzybowski, Nanoscale, 2011, 3, 1316–1344.
- S. Gottardo, A. Mech, J. Drbohlavová, A. Małyska, S. Bøwadt, J. R. Sintes and H. Rauscher, NanoImpact, 2021, 21, 100297.
This journal is © The Royal Society of Chemistry 2025