The multiclass ARKA framework for developing improved q-RASAR models for environmental toxicity endpoints†
Received 27th January 2025, Accepted 2nd April 2025
First published on 2nd April 2025
Abstract
Quick, accurate, and efficient methods for filling the gaps in the toxicity data of commercial chemicals are urgently needed. Thus, it has become essential to develop simple and improved modeling strategies that aim to generate more accurate predictions. Recently, quantitative Read-Across Structure–Activity Relationship (q-RASAR) modeling has been reported to enhance the external predictivity of QSAR models. However, the cross-validation metrics of some q-RASAR models show compromised values compared to those of the corresponding QSAR models. We report here an improved q-RASAR workflow coupled with the Arithmetic Residuals in K-groups Analysis (ARKA) framework. This improved workflow (ARKA-RASAR) considers two important aspects: the contribution of different QSAR descriptors to different experimental response ranges, and the identification of similarity among close congeners based on both the selected QSAR descriptors and the contribution of different QSAR descriptors to different experimental response ranges. A simple, free, and user-friendly Java-based tool, Multiclass ARKA-v1.0, has been developed to compute the multiclass ARKA descriptors. In this study, five different toxicity datasets previously used for the development of QSAR and q-RASAR models were considered. We developed hybrid ARKA models that consist of a combination of QSAR descriptors and ARKA descriptors. These hybrid feature spaces were used to compute RASAR descriptors and develop ARKA-RASAR models. For a fair comparison, we used the same modeling strategies that were used to develop the previously reported QSAR and q-RASAR models. Additionally, these modeling algorithms are straightforward, reproducible, and transferable. A multi-criteria decision-making statistical approach, the Sum of Ranking Differences (SRD), indicated that the ARKA-RASAR models are the best-performing models, considering training, test, and cross-validation statistics. The least significant difference procedure ensured that the SRD values were significantly different for most models, presenting an unbiased workflow. True external validation, in which relevant ARKA-RASAR models were used to predict the early-stage acute fish toxicity of a set of pesticide metabolites, was also carried out and yielded encouraging results. The promising results and the ease of computation of ARKA and RASAR descriptors using our tools suggest that the ARKA-RASAR modeling framework may be a potential choice for developing highly robust and predictive models for filling the gaps in environmental toxicity data.
Environmental significance
Due to limited availability of quantitative environmental toxicity data for existing and newer chemicals, computational-model-derived data provides an alternative approach for filling gaps in the data. However, developing meaningful statistical models using limited quantitative environmental toxicity data is quite challenging. The problem of small data set classification modeling of ecotoxicity endpoints was previously addressed by introducing the concept of Arithmetic Residuals in K-groups Analysis (ARKA) as a novel method of supervised dimensionality reduction. Here, a multiclass-ARKA framework is introduced for developing robust and predictive regression-based quantitative read-across-structure–activity relationship (q-RASAR) models to deal with limited quantitative environmental toxicity data.
Introduction
The environmental toxicity of chemicals is a topic of concern, as they are constantly being accumulated in the environment, which presents a significant threat to flora and fauna. The concern is heightened because a large number of chemicals present in the environment lack experimental toxicity data, which makes it challenging to assess their environmental risk. Given the number of chemicals that we use or are exposed to in our daily lives, it has become imperative to perform human and environmental health risk assessments. Experimental evaluation of the toxicity of such chemicals is time-consuming, laborious, and costly, and does not seem to be able to cope with the increasing demand for filling gaps in toxicity data.1 This has led researchers to utilize simple and inexpensive computational prediction tools to quickly and efficiently fill these data gaps. This approach is backed by the Organisation for Economic Co-operation and Development (OECD) guidelines and the European Union Registration, Evaluation, Authorization and Restriction of Chemicals (EU-REACH) legislation, which encourage the use of in silico approaches and accept their prediction data for filling gaps in toxicity data.2,3 One of the most accepted and widely used in silico approaches is the Quantitative Structure–Activity Relationship (QSAR) approach. Pioneered by Profs. Corwin Hansch and Toshio Fujita in the 1960s, this approach establishes the correlation of structural and physicochemical features with biological activity.4,5 Since then, this field has seen exponential progress and acceptability, both in terms of reliability and time and cost efficiency.6 With the growing popularity of various machine learning and deep learning algorithms, researchers can now identify non-linear relationships among structural and physicochemical features and target biological activity.7 Various internal validation metrics are also computed to assess the robustness and goodness-of-fit, while external validation metrics are used to adjudge the predictivity of the developed models,8,9 as per the established standard practice of developing and validating QSAR models. Although machine learning and deep learning QSAR approaches produce highly predictive models in many cases, the interpretability of the contributing features is sometimes compromised due to their black-box nature. Although explainable artificial intelligence (XAI) is rapidly developing, the classical approaches remain relevant. More importantly, regulatory bodies like the United States Environmental Protection Agency (US EPA) use simple, interpretable, and reproducible models for data gap filling, thereby reaffirming the importance of simple modeling frameworks for toxicity and ecotoxicity studies.10 This challenges researchers to develop better models utilizing simple and reproducible modeling frameworks.
Developing QSAR models from small datasets presents a further challenge in identifying the right amount of chemical information from a limited number of descriptors used to maintain statistically acceptable degrees of freedom. Inevitably, this is not always possible, as encoding sufficient chemical space warrants using a proportionately higher number of descriptors, which in turn diminishes the statistical reliability of the QSAR model, reducing the degree of freedom and the variance ratio (F)-value.11,12 Either of two approaches can address this: first, to use non-statistical approaches such as Read-Across, and secondly, to use dimensionality reduction techniques to downsize the dimensions of the input feature matrix.13 In its original form, Read-Across is a non-statistical approach that does not require the development of a mathematical model. The basic algorithm of Read-Across involves the identification of the nearest neighbors (with known response values) of a target compound in the feature space, followed by the application of consensus methods to predict the response of the target compound.14,15 This algorithm is simple, as it avoids statistical considerations. However, estimating the quantitative contributions of the contributing features in a Read-Across study is difficult and requires a mathematical modeling framework. On the other hand, dimensionality reduction is an essential tool to reduce the size of the feature matrix with minimal loss of chemical information. Principal Component Analysis (PCA) is an unsupervised technique that encodes feature information into distinct principal components (PCs). 
The optimum number of PCs is selected when 95% of the feature information is successfully preserved, ensuring minimal loss of information.16 The recent development of the supervised dimensionality reduction technique, the ARKA framework, has opened an avenue for an improved modeling strategy using the same level of feature information, a reduced number of modeling descriptors, and enhanced model statistics.13 In the original paper introducing the framework,13 it was applied to a binary-classification-based modeling framework. The training set data points were divided into two groups – actives and inactives. The training set descriptor matrix was normalized from 0 to 1 to ensure that the range of descriptors was on a uniform scale. The mean values of a descriptor in the active and inactive classes were obtained, and their difference was calculated. If this mean difference value was positive for a particular descriptor, we considered that it had a higher value in the active class, and subsequently, it was selected to be encoded into ARKA_1. Similarly, a descriptor with a negative mean difference value was selected to be encoded into ARKA_2. After this step, a weightage was given to each descriptor in ARKA_X based on the mean difference value, and these weightage values were then used for the computation of the weighted sum on the standardized descriptor matrix of the training and test sets, thereby generating the descriptors ARKA_1 and ARKA_2, which could be used not only for modeling but also for the identification of activity cliffs and modelability analysis. However, this version of the framework may not be suitable for regression modeling exercises, as the response values are in the quantitative scale, thereby generating significant differences among data points, warranting a number of “K-groups” above 2, i.e., increasing the number of ARKA descriptors (“Multiclass ARKA”).
Another recent advancement in the field of predictive cheminformatics is the encapsulation of the concepts of Read-Across into a QSAR modeling framework. This methodology is termed as the quantitative/classification Read-Across Structure–Activity Relationship (q-RASAR/c-RASAR).17 While Luechtefeld et al.18 introduced this concept in a classification modeling framework, Banerjee and Roy applied19 and further developed20–22 this approach in a regression modeling framework, and also improved the classification RASAR methodology by introducing various additional similarity and error-based descriptors.23–25 The q-RASAR/c-RASAR models are developed using various similarity and error-based measures, identified from the nearest neighbor analysis of various query compounds as distinct descriptors. The advantage of this approach is that the user is able to use the information of the nearest neighbors, and therefore, in most cases, the predictivity of the different q-RASAR models is enhanced as compared to their counterpart QSAR models, with both models using the same amount of chemical space.17,26,27 This ensures the encapsulation of non-linear relationships, even if models are generated using simple and reproducible linear modeling frameworks.
Thus far, one of the key drawbacks of the conventional q-RASAR approach, even after achieving enhanced external predictivity, is the slightly lowered cross-validation metrics, in some cases, when compared to the corresponding QSAR model. Although this can be justified by the fact that most q-RASAR models are developed using a lower number of modeling descriptors than the corresponding QSAR model, and there is the application of the additional “leave-same-out” approach in q-RASAR modeling (which is not done in the case of QSAR modeling), there is always scope for addressing and mitigating this issue. Additionally, it remains to be seen how the q-RASAR algorithm performs on a hybrid feature matrix consisting of conventional QSAR descriptors and ARKA descriptors, especially now with the computation of “multiclass ARKA”, which this manuscript reports. Multi-class ARKA descriptors can now encode feature contributions towards different response ranges, and these descriptors can then be used for similarity analysis in the q-RASAR modeling framework, opening a significantly different avenue for predictive model development.
Materials and methods
Collection of datasets
For the purpose of developing and evaluating the potential of ARKA-RASAR models, it was essential to use datasets for which QSAR and q-RASAR models had been previously reported so that a direct comparison of the model quality could be made. In the present work, five different toxicity studies that reported both QSAR and q-RASAR models were selected. Study 1 reported QSAR and q-RASAR models to predict the androgen receptor binding affinity of endocrine disruptors.28 Study 2 reported QSAR and q-RASAR models to predict the hERG K+ channel inhibition cardiotoxic potential of molecules.29 Study 3 reported QSAR and q-RASAR models to predict the skin sensitization potential of industrial and environmental chemicals.22 Studies 4 and 5 reported QSAR and q-RASAR models predicting aquatic toxicity of organic pesticides against Lepomis and Rainbow trout, respectively.30 Notably, we have not altered the training and test set composition, and we have also used the same QSAR features or descriptors as reported in the previous studies, thereby not altering the original chemical and feature spaces for effective and judicious comparison. The training and test sets for all five different studies have been reported in ESI SI-1.†
Algorithm for the computation of “multiclass ARKA”
In a regression-based model, the range of experimental response values in the training set plays a crucial role. The mechanistic interpretation, associated with OECD Principle 5, assesses a rather “overall” contribution of a particular feature towards a mathematical model. However, from a cheminformatics and statistical point-of-view, the features may exhibit different levels of contributions in different ranges of response values. The initial work on ARKA13 splits the training data space into two different groups, which is perfectly acceptable for binary classification problems and may be acceptable for regression problems with a relatively low response range.31 However, this may not be suitable when a wide range in the quantitative response values of the training set is used. This warrants dividing the training dataset into multiple groups/classes (Multiclass) to capture the relative contribution of the descriptors in a particular response range. Moreover, this reduces noise in the computation of the ARKA descriptors, which is ultimately expected to develop models with better statistics.
The “K” in ARKA denotes K-groups; in other words, the training set descriptor matrix can be divided into multiple groups. First, the training set descriptors and the response values were normalized to the range 0 to 1 (ref. 32). This was an essential step, as we wanted to unify the range of descriptor columns to assess their relative contribution to a particular class correctly. The normalized training set was then divided into different groups based on the normalized response values. Each of these groups encompasses an equal range of normalized response values, although the groups may contain varying numbers of data points. The total number of groups into which the training set is divided is chosen by the user, provided that each group contains at least two data points. In each iteration, a particular group was considered the “positive” class, while the data points of all the other groups were collectively considered the “negative” class. The basic principle of the application of ARKA in the regression framework is that each descriptor may have different patterns of contributions in different ranges of quantitative response values. The mean values of each descriptor in the positive and negative classes were computed, and their difference was calculated. If a particular descriptor had a positive mean difference value (i.e., positive class mean > negative class mean), that descriptor was assigned to cluster 1. Otherwise, the descriptor was assigned to cluster 2.
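The grouping and cluster-assignment steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code of the Multiclass ARKA-v1.0 tool: the function names, the toy data, and the choice to place a normalized response of exactly 1 in the last group are all hypothetical.

```python
def min_max_scale(values):
    """Scale a sequence of numbers to the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def assign_groups(y_norm, k):
    """Map each normalized response to one of k equal-width groups (0..k-1)."""
    # Assumption: a response of exactly 1.0 is placed in the last group.
    return [min(int(y * k), k - 1) for y in y_norm]


def cluster_descriptors(X_norm, groups, positive_group):
    """One iteration: split descriptor columns into cluster 1 and cluster 2."""
    pos = [row for row, g in zip(X_norm, groups) if g == positive_group]
    neg = [row for row, g in zip(X_norm, groups) if g != positive_group]
    if len(pos) < 2:
        raise ValueError("each group needs at least two data points")
    cluster_1, cluster_2 = [], []
    for j in range(len(X_norm[0])):
        mean_pos = sum(r[j] for r in pos) / len(pos)
        mean_neg = sum(r[j] for r in neg) / len(neg)
        if mean_pos - mean_neg > 0:
            cluster_1.append(j)   # higher on average in the positive class
        else:
            cluster_2.append(j)   # all remaining descriptors
    return cluster_1, cluster_2


# Hypothetical toy training set: 8 compounds, 2 descriptors, k = 4 groups.
X = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6],
     [0.5, 0.5], [0.6, 0.4], [0.7, 0.3], [0.8, 0.2]]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
k = 4

# Normalize each descriptor column and the response to 0-1.
X_norm = [list(row) for row in zip(*[min_max_scale(col) for col in zip(*X)])]
groups = assign_groups(min_max_scale(y), k)

# One (cluster 1, cluster 2) pair per group, i.e. k iterations in total.
clusters = [cluster_descriptors(X_norm, groups, g) for g in range(k)]
```

In this toy set the first descriptor rises with the response and the second falls, so the two descriptors swap clusters between the lowest and the highest response group, which is the range-dependent behaviour the multiclass scheme is meant to capture.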
After assigning the descriptors to two distinct clusters, our main idea was to encode the information of the descriptors from cluster 1 to a distinct single descriptor (ARKA_1) and the information from cluster 2 to another distinct single descriptor (ARKA_2). However, each of the descriptors in a particular cluster has varying impacts and contributions to the response. Therefore, we have assigned a weightage to each descriptor in a cluster by using the formula (eqn (1)) stated in the previous work on ARKA:13
|  | (1) |
Once we had obtained the weights of each descriptor, we performed a simple arithmetic “weighted summation” of the descriptors in a particular cluster. For a hypothetical dataset having five descriptors (x1, x2, x3, x4, and x5) with corresponding weights of w1, w2, w3, w4, and w5, and considering a hypothetical situation in which descriptors x1, x2, and x3 are from cluster 1, and x4 and x5 are from cluster 2, the corresponding formulae for calculating ARKA_1 and ARKA_2 are presented in eqn (2) and (3). It is important to note that we have used the standardized training and test set descriptor matrix32 while computing ARKA_1 and ARKA_2, similar to the initial work on ARKA.13
$$\mathrm{ARKA\_1} = w_1 \times x_1 + w_2 \times x_2 + w_3 \times x_3 \tag{2}$$

$$\mathrm{ARKA\_2} = w_4 \times x_4 + w_5 \times x_5 \tag{3}$$
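As a worked illustration of eqn (2) and (3), the weighted summation for a single compound can be written as below. The descriptor values and the weights are hypothetical numbers chosen only to make the arithmetic easy to follow; in particular, the weights are not derived from eqn (1), and the function name is an assumption.

```python
def arka_pair(row, weights, cluster_1, cluster_2):
    """Weighted sums over the descriptors of cluster 1 and cluster 2."""
    arka_1 = sum(weights[j] * row[j] for j in cluster_1)
    arka_2 = sum(weights[j] * row[j] for j in cluster_2)
    return arka_1, arka_2


# Hypothetical standardized descriptor values x1..x5 for one compound.
x = [0.5, -1.0, 2.0, 1.5, -0.5]
# Hypothetical weights w1..w5 (illustrative only, not computed from eqn (1)).
w = [0.4, 0.3, 0.3, 0.6, 0.4]

# x1, x2, x3 belong to cluster 1; x4, x5 belong to cluster 2 (0-based indices).
arka_1, arka_2 = arka_pair(x, w, cluster_1=[0, 1, 2], cluster_2=[3, 4])
# arka_1 = 0.4*0.5 + 0.3*(-1.0) + 0.3*2.0 = 0.5
# arka_2 = 0.6*1.5 + 0.4*(-0.5)           = 0.7
```

Repeating this for every group gives one ARKA_1 and ARKA_2 pair per iteration, which is why k groups yield 2k ARKA descriptors.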
The above-mentioned procedure was performed in “k” different iterations to obtain “k” sets of ARKA_1 and ARKA_2. In the current study, we considered 4 and 6 as the number of groups (k), thus generating 4 and 6 sets of ARKA_1 and ARKA_2 descriptors, respectively. We selected 4 and 6 as the number of groups based on various factors such as the training set size, the training set response range, and the number of modeled descriptors reported in previous studies. Typically, for each k value, the corresponding number of ARKA descriptors generated is 2k. Therefore, unnecessarily increasing the value of k may result in an increased chance of encountering multi-collinearity issues with the conventional QSAR descriptors while developing the hybrid ARKA models. Conversely, reducing the value of k may result in an insufficient number of ARKA descriptors, which may not efficiently encode the contributions of the various QSAR descriptors over the training set response range. The efficient capture of descriptor contributions within a particular response range is the motivation for and novelty of the ARKA descriptors. Therefore, in light of the previously reported models and the number of modeling descriptors used in the previous models, we decided to consider an optimum k value in order to ultimately develop improved prediction models using a lower number of modeling descriptors. Fig. 1 illustrates how the grouping was performed when we used k = 4, while Fig. 2 presents how the ARKA descriptors are computed in each iteration. The manual computation of the ARKA descriptors for all five different datasets has been presented in ESI Materials SI-2 to SI-6.†
Fig. 1 A representation of the grouping technique adopted. This is for a sample hypothetical training set of 17 compounds, whose descriptors and the quantitative experimental response values are normalized from 0 to 1.
Fig. 2 Computation of the ARKA descriptors in each iteration.
Data fusion, feature selection, and development of hybrid ARKA models
After the computation of the ARKA descriptors, we merged the initially selected structural and physicochemical descriptors (used to develop models in the previous literature) and the ARKA descriptors, computed from the same, to generate a complete pool of features. This pool was then subjected to feature selection to identify the most essential features. A grid search algorithm was used to generate all possible MLR models using a given number of descriptor combinations, employing the Best Subset Selection tool (freely available from https://www.teqip.jdvu.ac.in/QSAR_Tools/). The optimum number of descriptors and the best-performing descriptor combination were selected based on cross-validation statistics. The selected descriptors were then used to develop a hybrid ARKA MLR model, and the corresponding internal and external validation metrics were computed. A hybrid ARKA Partial Least Squares model was also developed and validated in cases with significant inter-correlation among descriptors, which was reflected by lowering the number of latent variables based on cross-validation statistics. The above-mentioned process for developing hybrid ARKA models was performed for all five datasets.
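The idea of an exhaustive search over descriptor combinations ranked by a cross-validation statistic can be sketched as follows. This is not the Best Subset Selection tool itself: the helper names, the toy data, and the use of the full training-set mean in the leave-one-out Q² denominator are assumptions made for illustration.

```python
from itertools import combinations


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for col in range(c, n + 1):
                M[r][col] -= f * M[c][col]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][col] * x[col] for col in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (intercept first)."""
    Z = [[1.0] + list(r) for r in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(A, b)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def q2_loo(X, y):
    """Leave-one-out Q2 = 1 - PRESS / total sum of squares."""
    press = 0.0
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    y_mean = sum(y) / len(y)
    return 1.0 - press / sum((v - y_mean) ** 2 for v in y)


def best_subset(X, y, size):
    """Return (Q2_LOO, column indices) of the best subset of a given size."""
    best = None
    for cols in combinations(range(len(X[0])), size):
        sub = [[r[j] for j in cols] for r in X]
        score = q2_loo(sub, y)
        if best is None or score > best[0]:
            best = (score, cols)
    return best


# Synthetic toy pool: the response depends only on columns 0 and 2.
X = [[0.0, 0.3, 1.0], [1.0, 0.9, 0.0], [2.0, 0.1, 1.0], [3.0, 0.7, 0.0],
     [4.0, 0.2, 2.0], [5.0, 0.8, 1.0], [6.0, 0.4, 3.0], [7.0, 0.6, 0.5]]
y = [2.0 * r[0] - r[2] for r in X]

score, selected = best_subset(X, y, size=2)
```

The number of candidate models grows combinatorially with the pool size, so an exhaustive search of this kind is practical mainly for the modest descriptor pools considered here.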
Computation of the RASAR descriptors
Unlike in the established q-RASAR methodology, in which the structural and physicochemical features are used to define similarity and compute RASAR descriptors, we used the hybrid feature matrix (i.e., the structural, physicochemical, and ARKA descriptors) employed to develop the hybrid ARKA model to define similarity among compounds. This approach considers two different aspects: first, it considers the effect of the important structural and physicochemical descriptors, and secondly, it considers the relative contribution of the descriptors in different quantitative response ranges. The training set containing the hybrid feature matrix was divided into sub-training and validation sets for optimization of the Read-Across hyperparameters. The sub-training and validation set files were used as inputs for the tool Auto_RA_Optimizer-v1.0 (available from https://www.sites.google.com/jadavpuruniversity.in/dtc-lab-software/home), in which the optimized hyperparameter setting was selected based on the predictive performance on the validation set. This optimized hyperparameter setting and the similarity measure were used to compute the RASAR descriptors for the training and test sets using the tool RASAR-Desc-Calc-v3.0.3 (available from https://www.sites.google.com/jadavpuruniversity.in/dtc-lab-software/home). This tool calculates the RASAR descriptors for the query set, where the training set serves as the source set. For the calculation of RASAR descriptors for the test set, the test set is used as the query set, while for the calculation of RASAR descriptors for the training set, the training set is used as the query set. In the latter case, in which the training set is used as both the source and query set, a Leave-Same-Out (LSO) algorithm is applied so that a query compound is not used as its own source compound, which avoids bias. The concept and the approach have been explained in ref. 33.
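The role of the Leave-Same-Out step can be illustrated with a small similarity-weighted read-across sketch. It is not the algorithm of RASAR-Desc-Calc-v3.0.3: the Gaussian-kernel similarity, the hyperparameters `sigma` and `n_neighbors`, and the two outputs computed here (a weighted-average prediction and the mean similarity to the selected neighbors) are assumptions chosen for illustration.

```python
import math


def similarity(a, b, sigma=1.0):
    """Gaussian-kernel similarity from Euclidean distance (assumed measure)."""
    d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def read_across(query, source_X, source_y, n_neighbors=3, skip_index=None):
    """Return (similarity-weighted prediction, mean neighbor similarity).

    skip_index removes the query's own entry from the source set; this is
    the leave-same-out step used when the training set is its own source.
    """
    scored = [(similarity(query, x), y)
              for i, (x, y) in enumerate(zip(source_X, source_y))
              if i != skip_index]
    scored.sort(key=lambda t: t[0], reverse=True)
    top = scored[:n_neighbors]
    total = sum(s for s, _ in top)
    prediction = sum(s * y for s, y in top) / total
    return prediction, total / len(top)


# Hypothetical one-descriptor training (source) set.
train_X = [[0.0], [1.0], [2.0], [10.0]]
train_y = [1.0, 2.0, 3.0, 9.0]

# Training compound 1 as the query, with and without leave-same-out.
lso_pred, lso_sim = read_across(train_X[1], train_X, train_y,
                                n_neighbors=2, skip_index=1)
naive_pred, naive_sim = read_across(train_X[1], train_X, train_y,
                                    n_neighbors=2)
```

Without the `skip_index` argument the query finds itself as its nearest neighbor with similarity 1, which inflates the similarity-based values for training compounds relative to what a genuinely new compound would receive.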
Data fusion, feature selection, and development of ARKA-RASAR models
Once the 18 different RASAR descriptors were computed, it was essential to identify the important RASAR descriptors and the role of the hybrid feature matrix. The computed RASAR descriptors were merged with the selected hybrid feature matrix to generate a complete descriptor pool. This pool was subjected to feature selection using the grid search algorithm, employing the Best Subset Selection tool as mentioned above. The optimum number of descriptors and the best-performing descriptor combination were selected based on the cross-validation statistics. The selected descriptors were used to develop MLR ARKA-RASAR models. A Partial Least Squares model was also developed in cases in which the modeling descriptors possessed significant inter-correlation, as evidenced by the lowering of the latent variables based on cross-validation statistics.
Development of Java-based multiclass ARKA descriptor calculating software
As the manual computation of the multiclass ARKA descriptors takes a significant amount of time and may involve human error due to its relatively complex calculations, we have developed a simple Java-based Multiclass ARKA descriptor calculating tool, “Multiclass ARKA-v1.0”, which can be freely downloaded from the DTC Laboratory Website (https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/arithmetic-residuals-in-k-groups-analysis-arka?authuser=0#h.kbvqc8zijr2s). This tool takes the training and test set files containing the selected QSAR descriptors as inputs and quickly computes the corresponding ARKA descriptors. The number of ARKA descriptors computed is determined by the number of groups the user enters, which should be an integer greater than 2. Additionally, the tool stops and informs the user when, for the number of groups entered, at least one group would contain fewer than two data points, because the class means for such a group would be statistically meaningless. Moreover, this tool works even when the experimental response values of the test set/true external set are not available. A detailed user manual is provided inside the tool to help the user utilize it correctly.
A detailed workflow of the methodology adopted in this study is presented in Fig. 3.
Fig. 3 Detailed workflow of the methodology adopted in this study.
Results and discussion
In this study, we considered five different toxicity datasets to evaluate the ARKA-RASAR approach. The criteria for choosing the appropriate datasets were as follows:
i. Regression-based QSAR and q-RASAR models validated using various internal and external validation metrics have been previously reported for these data sets.
ii. The training and test set compositions are available.
iii. The studied datasets have varying sizes and response ranges.
Additionally, we considered 4 and 6 groups, which correspond to 8 and 12 ARKA descriptors. We felt that these groupings were sufficient, considering factors such as the dataset size, training set response range, and the number of QSAR descriptors used to develop the QSAR models in previous studies. However, the user is free to choose any number of groups (>2) and compute the ARKA descriptors using the Multiclass ARKA-v1.0 tool.
This work aims to compare the performance of the q-RASAR models developed from the standard QSAR descriptors and the q-RASAR models developed from a hybrid feature matrix of QSAR and ARKA descriptors (ARKA-RASAR).
Statistics of the developed regression-based hybrid ARKA models
Although the ARKA framework was presented as a supervised dimensionality reduction framework in the original publication,13 it was evident that it could have further applications. The multiclass ARKA framework clusters various data points in a particular response range to compute the ARKA descriptors. In conventional QSAR modeling, descriptors are calculated in an unsupervised manner without considering information from the experimental response values. Conversely, the ARKA framework divides the training set chemical space into different groups (K-groups) based on a particular range of normalized response values, thereby incorporating a “supervised” aspect while still utilizing the same amount of chemical information denoted by the selected QSAR descriptors. This approach has two advantages: first, it re-shapes the feature dimension in a supervised manner, and secondly, it considers the relative contribution of the features in different ranges of response values, which is not possible in the case of conventional QSAR modeling. This justifies the novelty of the ARKA framework, and therefore, it is significantly different from QSAR models. Moreover, after examining the results from all five datasets (Table 1), we found that in most cases, the hybrid ARKA models generated enhanced internal and external model statistics compared to the previously reported conventional QSAR models. Additionally, the hybrid ARKA models were developed using a lower number of modeling descriptors than the previously reported conventional QSAR models. Moreover, there is also a clear enhancement in the external validation statistics of the hybrid ARKA models compared to those of the corresponding QSAR models. 
This analysis was conducted utilizing simple and reproducible modeling algorithms such as Multiple Linear Regression (MLR) and Partial Least Squares (PLS) for efficient comparison with the previously reported studies, which used the same algorithms, thus eliminating the effects of other modeling algorithms on the results. Based on the leave-one-out cross-validated Q² (Q²LOO) values, the optimum number of latent variables was chosen to develop PLS models. In cases in which the highest Q²LOO values correspond to an MLR model (number of Latent Variables = number of modeled descriptors), we have adopted an MLR model due to the absence of inter-correlation among descriptors. The training and test sets used to develop and evaluate the hybrid ARKA models have been presented in ESI Materials SI-7.†
Table 1 Model statistics of the previous QSAR, previous q-RASAR, hybrid ARKA and ARKA-RASAR models [a]

| Dataset | Model | nDesc | Training set | | | | | Test set | | | | Model specifications |
| | | | nTrain | R² | Q²LOO | MAEtrain | MAELOO | nTest | Q²F1 | Q²F2 | MAEtest | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | PLS (LV = 3) | 8 | 103 | 0.737 | 0.68 | 0.456 | 0.497 | 44 | 0.582 | 0.582 | 0.539 | Previous QSAR |
| 1 | Univariate | 1 | 103 | 0.675 | 0.657 | 0.434 | 0.444 | 44 | 0.633 | 0.633 | 0.483 | Previous q-RASAR |
| 1 | PLS (LV = 4) | 5 | 103 | 0.728 | 0.685 | 0.472 | 0.507 | 44 | 0.609 | 0.609 | 0.527 | Hybrid ARKA |
| 1 | **MLR** | 4 | 103 | 0.724 | 0.684 | 0.461 | 0.488 | 44 | 0.675 | 0.675 | 0.472 | ARKA-RASAR |
| 2 | PLS (LV = 3) | 15 | 196 | 0.635 | 0.549 | 0.555 | 0.607 | 65 | 0.485 | 0.484 | 0.695 | Previous QSAR (without removal) |
| 2 | PLS (LV = 3) | 15 | 196 | 0.635 | 0.549 | 0.555 | 0.607 | 63 | 0.575 | 0.574 | 0.642 | Previous QSAR |
| 2 | PLS (LV = 4) | 12 | 196 | 0.608 | 0.546 | 0.581 | 0.623 | 63 | 0.66 | 0.66 | 0.548 | Previous q-RASAR |
| 2 | MLR | 7 | 196 | 0.624 | 0.591 | 0.559 | 0.583 | 65 | 0.487 | 0.486 | 0.687 | Hybrid ARKA (without removal) |
| 2 | MLR | 7 | 196 | 0.624 | 0.591 | 0.559 | 0.583 | 63 | 0.578 | 0.577 | 0.634 | Hybrid ARKA |
| 2 | **PLS (LV = 10)** | 12 | 196 | 0.663 | 0.623 | 0.534 | 0.568 | 63 | 0.67 | 0.669 | 0.557 | ARKA-RASAR |
| 2 | PLS (LV = 10) | 12 | 196 | 0.663 | 0.623 | 0.534 | 0.568 | 65 | 0.574 | 0.574 | 0.612 | ARKA-RASAR (without removal) |
| 3 | PLS (LV = 9) | 10 | 133 | 0.696 | 0.644 | 0.41 | 0.445 | 47 | 0.526 | 0.524 | 0.562 | Previous QSAR (without removal) |
| 3 | PLS (LV = 9) | 10 | 133 | 0.696 | 0.644 | 0.41 | 0.445 | 44 | 0.586 | 0.585 | 0.528 | Previous QSAR |
| 3 | PLS (LV = 8) | 9 | 133 | 0.695 | 0.649 | 0.406 | 0.436 | 44 | 0.607 | 0.606 | 0.523 | Previous q-RASAR |
| 3 | MLR | 7 | 133 | 0.703 | 0.672 | 0.399 | 0.421 | 47 | 0.561 | 0.559 | 0.55 | Hybrid ARKA (without removal) |
| 3 | MLR | 7 | 133 | 0.703 | 0.672 | 0.399 | 0.421 | 44 | 0.602 | 0.601 | 0.526 | Hybrid ARKA |
| 3 | **MLR** | 8 | 133 | 0.708 | 0.673 | 0.395 | 0.422 | 44 | 0.607 | 0.606 | 0.524 | ARKA-RASAR |
| 3 | MLR | 8 | 133 | 0.708 | 0.673 | 0.395 | 0.422 | 47 | 0.563 | 0.561 | 0.55 | ARKA-RASAR (without removal) |
| 4 | PLS (LV = 3) | 8 | 102 | 0.679 | 0.62 | 0.763 | 0.839 | 34 | 0.703 | 0.658 | 0.678 | Previous QSAR |
| 4 | PLS (LV = 3) | 8 | 102 | 0.67 | 0.598 | 0.767 | 0.845 | 34 | 0.74 | 0.701 | 0.644 | Previous q-RASAR |
| 4 | PLS (LV = 5) | 6 | 102 | 0.684 | 0.647 | 0.756 | 0.804 | 34 | 0.7 | 0.655 | 0.697 | Hybrid ARKA |
| 4 | **PLS (LV = 5)** | 8 | 102 | 0.706 | 0.649 | 0.742 | 0.805 | 34 | 0.708 | 0.664 | 0.682 | ARKA-RASAR |
| 5 | PLS (LV = 5) | 8 | 537 | 0.534 | 0.515 | 0.783 | 0.797 | 178 | 0.551 | 0.541 | 0.752 | Previous QSAR |
| 5 | PLS (LV = 4) | 8 | 537 | 0.52 | 0.504 | 0.79 | 0.804 | 178 | 0.588 | 0.579 | 0.715 | Previous q-RASAR |
| 5 | MLR | 4 | 537 | 0.52 | 0.51 | 0.795 | 0.802 | 178 | 0.561 | 0.552 | 0.749 | Hybrid ARKA |
| 5 | **PLS (LV = 4)** | 7 | 537 | 0.527 | 0.513 | 0.791 | 0.801 | 178 | 0.58 | 0.571 | 0.729 | ARKA-RASAR |

[a] Bold text indicates the overall best-performing model in a dataset considering internal and external validation statistics.
Statistics of the developed ARKA-RASAR models
Because the ARKA descriptors encode the relative contribution of the features in different ranges of response values, we investigated whether the hybrid feature matrix could be used to define similarity among compounds and identify the nearest neighbors. Unlike the conventional q-RASAR approach, in which only the selected QSAR descriptors are used to define similarity and identify close source congeners, we used the hybrid feature matrix to define similarity and develop the q-RASAR models. The results in Table 1 indicate that the problem of somewhat inferior cross-validation metrics of the conventional q-RASAR models, in comparison to the corresponding QSAR models, appears to have been resolved, as the ARKA-RASAR models had enhanced internal validation statistics compared to the other three modeling strategies (QSAR, q-RASAR, and hybrid ARKA) for all five datasets. Moreover, the external predictivity of the ARKA-RASAR models remained similar to that of the corresponding conventional q-RASAR models ($Q^2_{F1}$ values ranged from 0.588 to 0.74 for the q-RASAR models and from 0.58 to 0.708 for the ARKA-RASAR models), thus maintaining high predictivity. The training and test sets used to develop and evaluate the ARKA-RASAR models are presented in ESI Materials SI-8.† Note that for datasets 2 and 3, the previous works reported a QSAR model, identified potential prediction confidence outliers using the DTC plot,29 removed those compounds from the test sets, and then reported the external validation statistics of the q-RASAR models. Therefore, for a fair and unbiased comparison, we report the statistics of the hybrid ARKA and ARKA-RASAR models both with and without the identified prediction confidence outliers.
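The idea of defining similarity on the hybrid feature space can be illustrated with a minimal Python sketch. It is not the authors' tool: the Gaussian-kernel similarity, the parameters `sigma` and `k`, the function names, and the toy numbers are all assumptions introduced here for illustration, and the features are assumed to be scaled beforehand.

```python
import math


def hybrid_features(qsar_row, arka_row):
    """Concatenate QSAR and ARKA descriptors into one hybrid feature vector."""
    return list(qsar_row) + list(arka_row)


def gaussian_similarity(a, b, sigma=1.0):
    """Assumed similarity measure: Gaussian kernel on the Euclidean distance."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def read_across_prediction(query, train_features, train_responses, k=3, sigma=1.0):
    """Similarity-weighted mean response of the k closest source compounds."""
    sims = [gaussian_similarity(query, f, sigma) for f in train_features]
    nearest = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    weight_sum = sum(sims[i] for i in nearest)
    return sum(sims[i] * train_responses[i] for i in nearest) / weight_sum


# Hypothetical toy data: two QSAR descriptors followed by two ARKA descriptors.
train = [
    hybrid_features([0.0, 0.0], [0.0, 0.0]),
    hybrid_features([1.0, 0.0], [0.0, 0.0]),
    hybrid_features([5.0, 5.0], [5.0, 5.0]),
]
responses = [1.0, 2.0, 10.0]
query = hybrid_features([0.0, 0.0], [0.0, 0.0])
print(read_across_prediction(query, train, responses, k=2))
```

Because the ARKA columns enter the distance calculation, two compounds with similar QSAR descriptors but different response-range contributions are treated as less similar than they would be on the QSAR descriptors alone.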
Statistical tests to identify the best-performing modeling approach
As this study reports an extensive modeling exercise on five different datasets using four different modeling approaches, it is necessary to identify the best-performing approach across the five datasets. This identification should consider the performance on both the training and test sets, which requires the evaluation of an array of validation metrics. To achieve this, we used a statistical Multi-Criteria Decision-Making (MCDM) approach, the Sum of Ranking Differences (SRD).34 The basic principle of this approach is to determine how close the ranking produced by a particular method is to a reference ranking that is considered to be the gold standard.35 The input data matrix was first arranged with the models in rows and the corresponding metric values in columns. The metric values were scaled to unit length, and the data matrix was transposed so that the models were in the columns and the metrics were in the rows. Subsequently, the absolute differences between the ranks given by the reference (the maximum value in each row) and the ranks given by each individual model were obtained, and these absolute differences were summed to give the SRD value of that model. According to the theory, the lower the SRD value, the closer the model is to the reference and the better the model.
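The ranking step described above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the CRRN_DNA software: ties receive average ranks, the %SRD is normalized by the largest possible SRD for the given number of rows, the matrix is assumed to be already scaled, and the toy numbers are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def srd_percent(matrix):
    """matrix[i][j] is the value of metric i (row) for model j (column)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    reference = [max(row) for row in matrix]   # row-wise maximum as gold standard
    ref_ranks = average_ranks(reference)
    srd_max = (n_rows * n_rows) // 2           # largest attainable SRD for n_rows objects
    result = []
    for j in range(n_cols):
        col_ranks = average_ranks([matrix[i][j] for i in range(n_rows)])
        srd = sum(abs(r - c) for r, c in zip(ref_ranks, col_ranks))
        result.append(100.0 * srd / srd_max)
    return result


# Hypothetical toy matrix: 4 metrics (rows) x 3 models (columns).
toy = [
    [0.10, 0.05, 0.09],
    [0.20, 0.15, 0.06],
    [0.30, 0.10, 0.03],
    [0.40, 0.35, 0.01],
]
print(srd_percent(toy))
```

In this toy case the first model reproduces the reference ranking exactly, the second swaps two neighboring ranks, and the third reverses the ranking completely, so their %SRD values increase in that order.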
Selecting the appropriate input data matrix is crucial for this analysis. The aim was not only to establish the right balance between the internal, external, and cross-validation statistics, but also to evaluate the balanced performance on the seen and unseen data (training and test sets, respectively). For this reason, we selected the following metrics (Table 2) to identify and evaluate the best-performing model. For datasets 2 and 3, we considered only the models that omitted the prediction confidence outliers identified in the previous studies, so as to maintain the same training and test set composition.
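Table 2 List of metrics$^a$ that were considered for the SRD analysis
| Training set and cross-validation statistics | Test set statistics | Statistics for evaluating the balanced performance |
| --- | --- | --- |
| $R^2$ | $Q^2_{F1}$ | $1 - \lvert \mathrm{MAE}_{train} - \mathrm{MAE}_{test} \rvert$ |
| $Q^2_{LOO}$ | $Q^2_{F2}$ | $1 - \lvert \mathrm{MAE}_{test} - \mathrm{MAE}_{LOO} \rvert$ |
| $1 - (R^2 - Q^2_{LOO})$ | $1 - \mathrm{MAE}_{test}$ | $1 - \lvert R^2 - Q^2_{F1} \rvert$ |
| $1 - \mathrm{MAE}_{train}$ | Average ($Q^2_{F1}$, $Q^2_{F2}$) | $1 - \lvert R^2 - Q^2_{F2} \rvert$ |
| $1 - \mathrm{MAE}_{LOO}$ | | $1 - \lvert Q^2_{LOO} - Q^2_{F1} \rvert$ |
| $1 - \lvert \mathrm{MAE}_{LOO} - \mathrm{MAE}_{train} \rvert$ | | $1 - \lvert Q^2_{LOO} - Q^2_{F2} \rvert$ |

$^a$ For all these metrics, a higher value indicates better model quality.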
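As a worked illustration of Table 2, the sixteen SRD input metrics can be derived from the seven basic statistics of a model. The sketch below is only an assumed reading of the table: the function name, the dictionary keys, and the example numbers are hypothetical and are not taken from the models in this study.

```python
def srd_input_metrics(r2, q2_loo, mae_train, mae_loo, q2_f1, q2_f2, mae_test):
    """Return the Table 2 metrics as a dict; higher is better for every entry."""
    return {
        # Training set and cross-validation statistics
        "R2": r2,
        "Q2_LOO": q2_loo,
        "1-(R2-Q2_LOO)": 1 - (r2 - q2_loo),
        "1-MAE_train": 1 - mae_train,
        "1-MAE_LOO": 1 - mae_loo,
        "1-|MAE_LOO-MAE_train|": 1 - abs(mae_loo - mae_train),
        # Test set statistics
        "Q2_F1": q2_f1,
        "Q2_F2": q2_f2,
        "1-MAE_test": 1 - mae_test,
        "avg(Q2_F1,Q2_F2)": (q2_f1 + q2_f2) / 2,
        # Statistics for evaluating the balanced performance
        "1-|MAE_train-MAE_test|": 1 - abs(mae_train - mae_test),
        "1-|MAE_test-MAE_LOO|": 1 - abs(mae_test - mae_loo),
        "1-|R2-Q2_F1|": 1 - abs(r2 - q2_f1),
        "1-|R2-Q2_F2|": 1 - abs(r2 - q2_f2),
        "1-|Q2_LOO-Q2_F1|": 1 - abs(q2_loo - q2_f1),
        "1-|Q2_LOO-Q2_F2|": 1 - abs(q2_loo - q2_f2),
    }


# Hypothetical statistics for a single model.
example = srd_input_metrics(0.70, 0.65, 0.40, 0.45, 0.60, 0.58, 0.50)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Expressing every entry so that larger means better is what allows the row-wise maximum to serve as the reference in the SRD analysis.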
The scaled SRD values were computed with the software package CRRN_DNA, which can be freely downloaded from https://www.knight.kit.bme.hu/CRRN. The results are shown graphically in Fig. 4, in which the %SRD value of every model is plotted against the distribution of SRD values obtained for random rankings. The X-axis and the left Y-axis show the normalized %SRD values, while the right Y-axis shows the cumulative relative frequencies corresponding to the randomization test. XX1 marks the critical threshold below which the probability of randomness is less than 5% (p < 0.05), Med marks 50% randomness, and XX19 marks 95% randomness. Therefore, models with lower SRD values, i.e., models appearing towards the left of the plot, are the best, the most significant, and non-random. Fig. 4 shows that the ARKA-RASAR models are the best-performing models, followed by the conventional q-RASAR models and the hybrid ARKA models. Specifically, the ARKA-RASAR models of datasets 2, 3, and 5 are within the critical threshold (XX1 with p < 0.05). In contrast, the conventional QSAR models do not appear to perform as well, even though all four modeling strategies use the same amount of chemical and feature information, which supports the potential and novelty of our approach. This is also reflected in Fig. 5, which presents the leave-1/7th-out cross-validated SRD results,36 in which the ARKA-RASAR models appeared to be the best, followed by the conventional q-RASAR, hybrid ARKA, and conventional QSAR models.
Fig. 4 SRD analysis of all the developed models, suggesting that the ARKA-RASAR models are the best performing. AR = ARKA-RASAR models, A = hybrid ARKA models, R = conventional q-RASAR models, and Q = conventional QSAR models. Suffixes 1–5 indicate the corresponding dataset number.

Fig. 5 Leave-1/7th-out cross-validated SRD results show that the ARKA-RASAR models are the best. The Y-axis is the leave-1/7th-out cross-validated SRD, while the X-axis represents the corresponding models. AR = ARKA-RASAR models, A = hybrid ARKA models, R = conventional q-RASAR models, and Q = conventional QSAR models. Suffixes 1–5 indicate the corresponding dataset number.
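The randomness thresholds in Fig. 4 can be approximated by simulation. The following Python sketch assumes that the random reference distribution is built from random permutations of the ranks and that XX1, Med, and XX19 correspond to the 5th, 50th, and 95th percentiles of that distribution; the function name, the number of draws, and the seed are hypothetical choices, and the exact procedure used by CRRN_DNA may differ.

```python
import random


def random_srd_thresholds(n_rows, n_draws=10000, seed=0):
    """Simulate %SRD values of random rankings and return three percentiles."""
    rng = random.Random(seed)
    reference = list(range(1, n_rows + 1))
    srd_max = (n_rows * n_rows) // 2        # largest attainable SRD
    draws = []
    for _ in range(n_draws):
        perm = reference[:]
        rng.shuffle(perm)                   # one random ranking
        srd = sum(abs(r - p) for r, p in zip(reference, perm))
        draws.append(100.0 * srd / srd_max)
    draws.sort()

    def percentile(q):
        return draws[min(len(draws) - 1, int(q * len(draws)))]

    return {"XX1": percentile(0.05), "Med": percentile(0.50), "XX19": percentile(0.95)}


# Sixteen rows, matching the number of metrics listed in Table 2.
print(random_srd_thresholds(16, n_draws=2000, seed=1))
```

Under these assumptions, a model whose %SRD lies below the simulated XX1 value ranks the metrics closer to the reference than about 95% of random rankings would.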
An important aspect that must be considered is the absence of bias with respect to the datasets and the four modeling approaches. Similar datasets may yield similar results, and the same is true for similar modeling approaches. For a comprehensive statistical investigation, we therefore wanted to verify that the datasets under investigation are distinctly different from each other, and that the same holds for the four modeling approaches. To this end, we performed a two-way ANalysis Of VAriance (ANOVA) with the modeling approaches (i.e., conventional QSAR, conventional q-RASAR, hybrid ARKA, and ARKA-RASAR) as Factor 1 and the five datasets as Factor 2. The results presented in Table 3 show that not only are all the datasets significantly different from each other (p = 0.000), but the results from the four modeling approaches are also significantly different from each other (p = 0.000). The two-way ANOVA therefore indicated not only an absence of bias in our investigation, but also that the enhancement in the model statistics achieved by the ARKA-RASAR models was highly significant.
Table 3 Results of the ANOVA analysis (Factor 1 = between four different modeling approaches – QSAR, q-RASAR, hybrid ARKA and ARKA-RASAR, Factor 2 = between the five different datasets)$^a$
| Source | DF | SS | MS | F | p | Inference |
| --- | --- | --- | --- | --- | --- | --- |
| Factor 2 | 4 | 28 177 | 7044 | 351.52 | 0.000 | The datasets are significantly different from each other |
| Factor 1 | 3 | 321 411 | 107 137 | 5346.33 | 0.000 | The results from the modeling algorithms are significantly different from each other |
| Interaction | 12 | 163 826 | 13 652 | 681.27 | 0.000 | Both the factors are inter-dependent |

$^a$ DF = degrees of freedom, SS = sum of squares, MS = mean squares.
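The entries of Table 3 follow the usual ANOVA arithmetic, in which each mean square is the sum of squares divided by its degrees of freedom and each F value is the mean square divided by the error mean square. The short Python check below uses only the numbers reported in Table 3; the error mean square is not reported in the table and is inferred here as MS/F, which is an assumption of this sketch.

```python
# Values copied from Table 3.
ANOVA_ROWS = {
    "Factor 2 (datasets)": {"df": 4, "ss": 28177, "ms": 7044, "f": 351.52},
    "Factor 1 (approaches)": {"df": 3, "ss": 321411, "ms": 107137, "f": 5346.33},
    "Interaction": {"df": 12, "ss": 163826, "ms": 13652, "f": 681.27},
}


def check_anova_row(row):
    """Return (SS / DF, implied error mean square MS / F) for one table row."""
    ms_from_ss = row["ss"] / row["df"]
    implied_error_ms = row["ms"] / row["f"]
    return ms_from_ss, implied_error_ms


for name, row in ANOVA_ROWS.items():
    ms_from_ss, implied_error_ms = check_anova_row(row)
    print(f"{name}: SS/DF = {ms_from_ss:.2f}, implied error MS = {implied_error_ms:.2f}")
```

All three rows reproduce the tabulated mean squares after rounding and imply the same error mean square of roughly 20.04, so the reported F values are internally consistent.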
Additionally, to show that the results obtained from most of the models in the five different datasets were significantly different from each other, we used the least significant difference (LSD) procedure37 coupled with one-way ANOVA using Fisher's test (95% confidence interval) of the SRD values. The results are presented in the form of a heat map (Fig. 6), which indicates that most of the models are significantly different from each other.
Fig. 6 Heat map for the one-way ANOVA showing that the modeling results are significantly different from each other (except those with solid border lines). The color gradient represents the difference in the SRD values. AR = ARKA-RASAR models, A = hybrid ARKA models, R = conventional q-RASAR models, and Q = conventional QSAR models.
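For reference, the least significant difference used in such pairwise comparisons is conventionally written as shown below. This is the textbook form for groups of equal size rather than a formula quoted from this study; the group size $n$ (the number of SRD values per model), the error mean square $MS_E$ of the one-way ANOVA, and its degrees of freedom $\nu_E$ are symbols introduced here for illustration, and their values are not stated in the text.

```latex
\mathrm{LSD} = t_{\alpha/2,\;\nu_E}\,\sqrt{\frac{2\,MS_E}{n}}, \qquad \alpha = 0.05
```

Two models are declared significantly different when the absolute difference between their mean SRD values exceeds the LSD.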
From the statistical analyses and tests, it is evident that the ARKA-RASAR approach is the best-performing approach.
Comparison of the hybrid ARKA models with the corresponding QSAR models
Conventional QSAR and hybrid ARKA are the two non-RASAR approaches used to develop models in this study. Comparing the statistics of the hybrid ARKA models with those of the corresponding QSAR models, we find not only a consistent enhancement in the cross-validated $Q^2$ values, but also a reduction in the differences between the $R^2$ and $Q^2_{LOO}$ values, signifying the enhanced robustness of the hybrid ARKA models.38 Additionally, we observe a consistent enhancement in the external predictivity of the hybrid ARKA models compared to the corresponding QSAR models, as is evident from the increased $Q^2_{F1}$ and $Q^2_{F2}$ values and the lowered $\mathrm{MAE}_{test}$ values. Moreover, most of the hybrid ARKA models were developed using fewer descriptors than the corresponding QSAR models. These results show the usefulness and importance of the ARKA descriptors, which efficiently take into account the contributions of descriptors in different response ranges, thus resulting in enhanced internal and external validation statistics.
Comparison of the conventional q-RASAR models and the ARKA-RASAR models
q-RASAR is a novel technique that incorporates similarity-based information of compounds into a mathematical modeling framework, resulting in enhanced predictivity compared to the corresponding QSAR model in most cases. The ARKA-RASAR approach is an improved q-RASAR framework in which the RASAR descriptors are computed from the hybrid feature matrix, which includes the ARKA descriptors that encode the relative contributions of the original QSAR descriptors across different response ranges. Therefore, the similarity of compounds defined using this hybrid feature matrix is expected to take into account the range of response values and can be considered a “supervised similarity framework”, which is the novelty of this study. Comparing the validation metrics of the ARKA-RASAR models with those of the corresponding q-RASAR models, we find a marked enhancement in the cross-validation performance of the ARKA-RASAR models in all cases, which seemingly solved the problem of the slight lowering of the internal validation statistics associated with conventional q-RASAR models. Moreover, this enhancement in the internal validation statistics did not negatively affect the performance on the test set, as the test set statistics remained consistent with those of the conventional q-RASAR models in most cases. Therefore, the ARKA-RASAR approach proved to be a much better modeling strategy, especially in terms of providing balanced performance for the training and test sets.
Analysis of feature importance in the ARKA-RASAR models
In this investigation, we explored the importance of the individual features of the ARKA-RASAR models for the target endpoints. For this purpose, we generated Variable Importance Plots (VIPs) using the software SIMCA-P v10.0 (ref. 39). For the PLS models, the plots were generated using the same number of latent variables as used to develop the PLS ARKA-RASAR models. For the MLR ARKA-RASAR models, we set the number of latent variables equal to the number of modeling descriptors40 and then generated the plots. The VIP plots of the five ARKA-RASAR models (Fig. 7) show that the ARKA descriptors are among the most important descriptors in every case. Moreover, RASAR descriptors such as the RA function, which is computed from the hybrid feature matrix containing the ARKA descriptors, make an enhanced contribution to the response. The general trend also implies that the standard QSAR descriptors have a consistently low level of importance in all five datasets. These observations again show the importance of the ARKA and RASAR descriptors in enhancing the model quality and external predictivity while using the same amount of chemical information.
Fig. 7 VIP plots of the ARKA-RASAR models for the five different datasets.
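Variable importance scores of the kind plotted in Fig. 7 are commonly computed from the PLS weights and the response variance explained by each latent variable. The Python sketch below implements that widely used definition; it is an assumption that SIMCA-P follows the same formula, and the function name, the weight vectors, and the explained sums of squares are hypothetical toy values.

```python
import math


def vip_scores(weights, ssy):
    """Compute variable importance scores from PLS weights.

    weights[a][j]: weight of descriptor j on latent variable a.
    ssy[a]: response sum of squares explained by latent variable a.
    """
    n_desc = len(weights[0])
    total = sum(ssy)
    scores = []
    for j in range(n_desc):
        acc = 0.0
        for w_a, ss_a in zip(weights, ssy):
            norm_sq = sum(w * w for w in w_a)      # squared length of the weight vector
            acc += ss_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_desc * acc / total))
    return scores


# Hypothetical model with three descriptors and two latent variables.
toy_weights = [[0.8, 0.6, 0.0], [0.0, 0.6, 0.8]]
toy_ssy = [3.0, 1.0]
print(vip_scores(toy_weights, toy_ssy))
```

With this definition the squared scores sum to the number of descriptors, so a score above 1 marks a descriptor of above-average importance.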
Real-world effectiveness and potential of ARKA-RASAR models in filling gaps in the data and environmental risk assessment
To showcase practical applications of the ARKA-RASAR models to environmental toxicity prediction, we selected the final models of Datasets 4 and 5 (datasets for early fish toxicity of pesticides) to evaluate the aquatic toxicity of pesticide metabolites, which is crucial in regulatory settings, particularly in light of the European Plant Protection Products Regulation 1107/2009 (https://www.eur-lex.europa.eu/eli/reg/2009/1107/oj/eng). For the predictions, we used a curated set of 150 pesticide metabolites prepared by Burden et al. (2016)41 from the Pesticide Properties DataBase (PPDB),42 for which experimental acute fish LC$_{50}$ data and ECOSAR43 predictions were reported. The compounds with predicted fish toxicity values (ESI Materials SI-9†) higher than the mean response value of the corresponding training set (Dataset 4 or 5) were considered to pose toxicological concerns. Among these compounds, we identified seven (Table 4) that raised concern according to both models and were not present in the training set of either dataset. Interestingly, all seven compounds exhibit acute fish toxicity both in experimental studies and in ECOSAR (US EPA) predictions, as classified by the GHS acute toxicity scheme.44 This constitutes a true external validation of the ARKA-RASAR models corresponding to Datasets 4 and 5. As the number of registered pesticide compounds increases, experimental toxicity testing of the considerable number of resulting metabolites will demand substantial expense and resources. In light of the EFSA (2013) guidance on tiered risk assessment,45 which recommends non-testing methods, the ARKA-RASAR framework may provide reliable estimates of chemical toxicities in a regulatory context.
Table 4 True external validation of ARKA-RASAR models (Datasets 4 and 5): list of compounds predicted to be toxic to fish based on both models and not common to the modeling data sets, along with their GHS classification based on experimental and ECOSAR predicted LC$_{50}$ (mg L$^{-1}$) values
| Sl | Metabolite | SMILES code | Expt. fish LC$_{50}$ (mg L$^{-1}$) | Expt. GHS class | ECOSAR LC$_{50}$ (mg L$^{-1}$) | ECOSAR GHS class |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 2-Ethyl-4,5,6,7-tetrahydro-4-oxo-6-(2,4,6-trimethylphenyl)benzoxazole | CCc1nc2c(o1)CC(CC2=O)c3c(cc(cc3C)C)C | >0.61 | 1 | 0.36 | 1 |
| 2 | Acifluorfen | Clc2cc(ccc2Oc1cc(C(=O)O)c([N+]([O-])=O))cc1)C(F)(F)F | 54 | 3 | 33.16 | 3 |
| 3 | Chlordecone | ClC54C(=O)C1(Cl)C2(Cl)C5(Cl)C3(Cl)C4(Cl)C1(Cl)C2(Cl)C3(Cl)Cl | 0.02 | 1 | 0.99 | 1 |
| 4 | Ethion | S=P(SCSP(=S)(OCC)OCC)(OCC)OCC | 0.5 | 1 | 0.07 | 1 |
| 5 | Fipronil sulfide | c1c(cc(c(c1Cl)n2c(c(c(n2)C#N)SC(F)(F)F)N)Cl)C(F)(F)F | 0.03 | 1 | 0.031 | 1 |
| 6 | Ioxynil | Ic1cc(C#N)cc(I)c1O | 8.5 | 2 | 2.14 | 2 |
| 7 | Triadimefon | CC(C)(C)C(=O)C(N1C=NC=N1)OC2=CC=C(C=C2)Cl | 4.08 | 2 | 13.76 | 3 |
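The GHS classes in Table 4 are consistent with the standard GHS acute aquatic hazard cut-offs of 1, 10, and 100 mg L$^{-1}$. The Python sketch below applies these cut-offs to the LC$_{50}$ values listed in the table; the function names and the shortened label for the first metabolite are introduced here for illustration, and the censored experimental value reported as ">0.61" is left unclassified rather than guessed.

```python
def ghs_acute_aquatic_category(lc50_mg_per_l):
    """Map an acute fish LC50 (mg/L) to a GHS acute aquatic category."""
    if lc50_mg_per_l <= 1.0:
        return 1
    if lc50_mg_per_l <= 10.0:
        return 2
    if lc50_mg_per_l <= 100.0:
        return 3
    return None  # above 100 mg/L: outside the three acute categories


# (metabolite, experimental LC50, ECOSAR LC50) as listed in Table 4.
# None marks the censored experimental value reported as ">0.61".
TABLE_4 = [
    ("Benzoxazole metabolite (row 1)", None, 0.36),
    ("Acifluorfen", 54.0, 33.16),
    ("Chlordecone", 0.02, 0.99),
    ("Ethion", 0.5, 0.07),
    ("Fipronil sulfide", 0.03, 0.031),
    ("Ioxynil", 8.5, 2.14),
    ("Triadimefon", 4.08, 13.76),
]


def classify_table(rows):
    """Return (name, experimental class, ECOSAR class) for every row."""
    result = []
    for name, expt, ecosar in rows:
        expt_class = None if expt is None else ghs_acute_aquatic_category(expt)
        result.append((name, expt_class, ghs_acute_aquatic_category(ecosar)))
    return result


for name, expt_class, ecosar_class in classify_table(TABLE_4):
    print(name, expt_class, ecosar_class)
```

Only triadimefon receives different classes from the experimental and the ECOSAR values, which matches the last row of Table 4.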
Conclusion
To achieve quick and efficient estimation of the toxicity of chemicals via models with enhanced prediction accuracy, it is necessary to develop improved modeling frameworks. Paola Gramatica and colleagues stated in one of their publications that QSAR modeling is not “Push a button and find a correlation”.46 This serves as a driving force for researchers across the globe to continuously develop newer QSAR methodologies aimed at generating better models with enhanced predictivity. In this spirit, the q-RASAR17,19 and ARKA13 frameworks are two novel methods developed by Banerjee and Roy that have applications in developing improved models with enhanced statistics as well as in identifying activity cliffs. However, most previous studies observed that although q-RASAR models show enhanced predictivity compared to QSAR models, this is associated, in some cases, with a slight lowering of the internal validation statistics. Another point of exploration was to increase the number of groups in the ARKA framework and test its performance in a quantitative modeling framework. The present study not only reports a “Multiclass ARKA” framework, in which the increased number of ARKA descriptors leads to the development of improved models, but also uses the ARKA descriptors as part of the feature space to develop ARKA-RASAR models. In this study, the statistical multi-criteria decision-making approach showed that the ARKA-RASAR modeling strategy is the best. Two-way ANOVA showed that the datasets used in the current study are significantly different from each other and that the four modeling strategies (QSAR, q-RASAR, hybrid ARKA, and ARKA-RASAR) are also significantly different from each other.
The least significant difference procedure coupled with one-way ANOVA further showed that the results obtained from most of the models in the five datasets are significantly different from each other. The hybrid ARKA and ARKA-RASAR models were developed using the same modeling algorithm (PLS/MLR) used in the previous studies reporting the QSAR and q-RASAR models, thus eliminating any bias due to varying modeling strategies. The Variable Importance Plots showed that the ARKA descriptors are among the most important descriptors, while the standard QSAR descriptors are among the least important. The only drawback may be the absence of straightforward physicochemical interpretability of the ARKA descriptors; however, this may not be an impeding factor considering the enhancement in model quality and predictivity.47–50 Moreover, the purpose of computing ARKA descriptors is to capture the contribution of features across different response ranges, which obviates the necessity for an “overall interpretation”. In short, the q-RASAR workflow has been improved by capturing the response-range-specific feature contributions through the incorporation of the Multiclass ARKA framework, which increases the model robustness and external predictivity. Therefore, this study suggests that the novel ARKA-RASAR approach has great potential for developing robust and predictive models to fill the data gaps for environmental chemicals.
Declaration
A preprint version of this article has been deposited on the ChemRxiv preprint server (https://doi.org/10.26434/chemrxiv-2025-5qsh2).
Data availability
The source data used to develop the models reported in this paper are available in ESI Materials SI-1.† The step-by-step manual computation of the ARKA descriptors for all five datasets is presented in the ESI Materials (SI-2 to SI-6).† The training and test sets for the hybrid ARKA and ARKA-RASAR models for all five different datasets have been presented in ESI Materials SI-7 and SI-8,† respectively. The true external set predictions have been presented in ESI Materials SI-9.†
Author contributions
Arkaprava Banerjee: data curation, formal analysis, validation, software, writing – initial draft. Kunal Roy: conceptualization, funding acquisition, supervision, writing – editing.
Conflicts of interest
The authors declare no competing interests.
Acknowledgements
AB thanks the Life Sciences Research Board, DRDO, New Delhi for a senior research fellowship. This research has been funded by the Life Sciences Research Board, DRDO, New Delhi (LSRB/01/15001/M/LSRB-394/SH&DD/2022).
References
- K. Roy, S. Kar and R. N. Das, Understanding the Basics of QSAR for Applications in Pharmaceutical Sciences and Risk Assessment. Academic Press, NY, 2015, DOI:10.1016/C2014-0-00286-9.
- N. Gellatly and F. Sewell, Regulatory Acceptance Of In Silico Approaches For The Safety Assessment Of Cosmetic-Related Substances, Comput. Toxicol., 2019, 11, 82–89, DOI:10.1016/j.comtox.2019.03.003.
- E. Benfenati, R. G. Diaza, A. Cassano, S. Pardoe, G. Gini, C. Mays, R. Knauf and L. Benighaus, The Acceptance of In Silico Models For REACH: Requirements, Barriers, and Perspectives, Chem. Cent. J., 2011, 5, 58, DOI:10.1186/1752-153X-5-58.
- C. Hansch, P. P. Maloney, T. Fujita and R. M. Muir, Correlation of Biological Activity of Phenoxyacetic Acids with Hammett Substituent Constants and Partition Coefficients, Nature, 1962, 194, 178–180, DOI:10.1038/194178b0.
- C. Hansch and T. Fujita, p-σ-π analysis. A Method for the Correlation of Biological Activity and Chemical Structure, J. Am. Chem. Soc., 1964, 86, 1616–1626, DOI:10.1021/ja01062a035.
- A. Banerjee, K. Roy and P. Gramatica, A Bibliometric Analysis of The Cheminformatics/QSAR Literature (2000–2023) For Predictive Modeling in Data Science Using the SCOPUS Database, Mol. Diversity, 2024, DOI:10.1007/s11030-024-11056-8.
- G. Gini and F. Zanoli, Machine Learning and Deep Learning Methods in Ecotoxicological QSAR Modeling. In: Roy, K. (ed.) Ecotoxicological QSARs. Methods in Pharmacology and Toxicology. Humana, New York, NY, 2020, DOI:10.1007/978-1-0716-0150-1_6.
- P. Gramatica, Principles of QSAR modeling: Comments and Suggestions from Personal Experience, Int. J. Quant. Struct.-Prop. Relat., 2020, 5, 61–97, DOI:10.4018/IJQSPR.20200701.oa1.
- A. Tropsha, Best Practices for QSAR Model Development, Validation, and Exploitation, Mol. Inf., 2010, 29, 476–488, DOI:10.1002/minf.201000061.
- NAFTA Technical Working Group on Pesticides Quantitative Structure Activity Relationship Guidance Document: https://www.epa.gov/sites/default/files/2016-01/documents/qsar-guidance.pdf (accessed on Dec 13, 2024).
- M. T. D. Cronin and T. W. Schultz, Pitfalls in QSAR, J. Mol. Struct.:THEOCHEM, 2003, 622, 39–51, DOI:10.1016/S0166-1280(02)00616-4.
- B. Rasulev, A. Gajewicz, T. Puzyn, D. Leszczynska and J. Leszczynski, Nano-QSAR: Advances and Challenges. In: Leszczynski, J. and Puzyn, T. (eds) Towards Efficient Designing of Safe Nanomaterials: Innovative Merge of Computational Approaches and Experimental Techniques. Royal Society of Chemistry, 2012, DOI:10.1039/9781849734530-00220.
- A. Banerjee and K. Roy, ARKA: A Framework of Dimensionality Reduction for Machine-Learning Classification Modeling, Risk Assessment, and Data Gap-Filling of Sparse Environmental Toxicity Data, Environ. Sci.: Processes Impacts, 2024, 26, 991–1007, DOI:10.1039/D4EM00173G.
- S. Manganelli and E. Benfenati, Use of Read-Across Tools. In: Benfenati, E. (ed.) In Silico Methods for Predicting Drug Toxicity. Methods in Molecular Biology, vol 1425. Humana Press, New York, NY, 2016, DOI:10.1007/978-1-4939-3609-0_13.
- A. Banerjee, S. Kar, K. Roy, G. Patlewicz, N. Charest, E. Benfenati and M. T. D. Cronin, Molecular Similarity in Chemical Informatics and Predictive Toxicity Modeling: From Quantitative Read-Across (q-RA) to Quantitative Read-Across Structure–Activity Relationship (q-RASAR) with the Application of Machine Learning, Crit. Rev. Toxicol., 2024, 54, 659–684, DOI:10.1080/10408444.2024.2386260.
- S. Wold, K. Esbensen and P. Geladi, Principal Component Analysis, Chemom. Intell. Lab. Syst., 1987, 2, 37–52, DOI:10.1016/0169-7439(87)80084-9.
- K. Roy and A. Banerjee, Q-RASAR: A Path to Predictive Cheminformatics. Springer, NY, 2024, DOI:10.1007/978-3-031-52057-0.
- T. Luechtefeld, D. Marsh, C. Rowlands and T. Hartung, Machine Learning of Toxicological Big Data Enables Read-Across Structure Activity Relationships (RASAR) Outperforming Animal Test Reproducibility, Toxicol. Sci., 2018, 165, 198–212, DOI:10.1093/toxsci/kfy152.
- A. Banerjee and K. Roy, First Report of q-RASAR Modeling Toward an Approach of Easy Interpretability and Efficient Transferability, Mol. Diversity, 2022, 26, 2847–2862, DOI:10.1007/s11030-022-10478-6.
- A. Banerjee and K. Roy, On Some Novel Similarity-Based Functions Used in the ML-based q-RASAR Approach For Efficient Quantitative Predictions of Selected Toxicity End Points, Chem. Res. Toxicol., 2023, 36, 446–464, DOI:10.1021/acs.chemrestox.2c00374.
- V. Kumar, A. Banerjee and K. Roy, Machine Learning-based q-RASAR Approach for the In Silico Identification of Novel Multi-Target Inhibitors Against Alzheimer's Disease, Chemom. Intell. Lab. Syst., 2024, 245, 105049, DOI:10.1016/j.chemolab.2023.105049.
- A. Banerjee and K. Roy, Read-across-Based Intelligent Learning: Development of a Global q-RASAR Model for the Efficient Quantitative Predictions of Skin Sensitization Potential of Diverse Organic Chemicals, Environ. Sci.: Processes Impacts, 2023, 25, 1626–1644, DOI:10.1039/D3EM00322A.
- V. Kumar, A. Banerjee and K. Roy, Breaking the Barriers: Machine-learning-based c-RASAR Approach for Accurate Blood–Brain Barrier Permeability Prediction, J. Chem. Inf. Model., 2024, 64, 4298–4309, DOI:10.1021/acs.jcim.4c00433.
- A. Banerjee and K. Roy, Prediction-Inspired Intelligent Training for the Development of Classification Read-across Structure–Activity Relationship (c-RASAR) Models for Organic Skin Sensitizers: Assessment of Classification Error Rate from Novel Similarity Coefficients, Chem. Res. Toxicol., 2023, 36, 1518–1531, DOI:10.1021/acs.chemrestox.3c00155.
- A. Banerjee and K. Roy, The Application of Chemical Similarity Measures in an Unconventional Modeling Framework c-RASAR Along With Dimensionality Reduction Techniques to a Representative Hepatotoxicity Dataset, Sci. Rep., 2024, 14, 20812, DOI:10.1038/s41598-024-71892-4.
- H. Wang, P. Wang, T. Fan, T. Ren, N. Zhang, L. Zhao, R. Zhong and G. Sun, From Molecular Descriptors to the Developmental Toxicity Prediction of Pesticides/Veterinary Drugs/Bio-Pesticides Against Zebrafish Embryo: Dual Computational Toxicological Approaches For Prioritization, J. Hazard. Mater., 2024, 476, 134945, DOI:10.1016/j.jhazmat.2024.134945.
- X. Lu, X. Wang, S. Chen, T. Fan, L. Zhao, R. Zhong and G. Sun, The Rat Acute Oral Toxicity of Trifluoromethyl Compounds (TFMs): A Computational Toxicology Study Combining the 2D-QSTR, Read-Across and Consensus Modeling Methods, Arch. Toxicol., 2024, 98, 2213–2229, DOI:10.1007/s00204-024-03739-w.
- A. Banerjee, P. De, V. Kumar, S. Kar and K. Roy, Quick and Efficient Quantitative Predictions of Androgen Receptor Binding Affinity for Screening Endocrine Disruptor Chemicals Using 2D-QSAR and Chemical Read-Across, Chemosphere, 2022, 309, 136579, DOI:10.1016/j.chemosphere.2022.136579.
- A. Banerjee and K. Roy, Machine-learning-based Similarity Meets Traditional QSAR: “q-RASAR” for the Enhancement of the External Predictivity and Detection of Prediction Confidence Outliers in an hERG Toxicity Dataset, Chemom. Intell. Lab. Syst., 2023, 237, 104829, DOI:10.1016/j.chemolab.2023.104829.
- S. Ghosh, M. Chatterjee and K. Roy, Quantitative Read-Across Structure-Activity Relationship (q-RASAR): A New Approach Methodology to Model Aquatic Toxicity of Organic Pesticides Against Different Fish Species, Aquat. Toxicol., 2023, 265, 106776, DOI:10.1016/j.aquatox.2023.106776.
- A. Sobanska, A. Banerjee and K. Roy, Organic Sunscreens and Their Products of Degradation in Biotic and Abiotic Conditions – In Silico Studies of Drug-Likeness and Human Placental Transport, Int. J. Mol. Sci., 2024, 25, 12373, DOI:10.3390/ijms252212373.
- G. W. Snedecor and W. G. Cochran, Statistical Methods. 8th edn, Wiley-Blackwell, 1989.
- A. Banerjee and K. Roy, How to Correctly Develop q-RASAR Models For Predictive Cheminformatics, Expert Opin. Drug Discovery, 2024, 19, 1017–1022, DOI:10.1080/17460441.2024.2376651.
- K. Heberger, Sum of Ranking Differences Compares Methods or Models Fairly, TrAC, Trends Anal. Chem., 2010, 29, 101–109, DOI:10.1016/j.trac.2009.09.009.
- A. Gere, A. Racz, D. Bajusz and K. Heberger, Multicriteria decision making for evergreen problems in food science by sum of ranking differences, Food Chem., 2021, 344, 128617, DOI:10.1016/j.foodchem.2020.128617.
- B. R. Sziklai, M. Baranyi and K. Heberger, Does Cross-Validation Work in Telling Rankings Apart?, Cent. Eur. J. Oper. Res., 2024, DOI:10.1007/s10100-024-00932-1.
- S. Bolton, Statistics, in: Remington JP. Remington: the Science and Practice of Pharmacy (ed. D. Troy), Lippincott Williams & Wilkins, Baltimore, 2006.
- G. Toth, Z. Bodai and K. Heberger, Estimation of influential points in any data set from coefficient of determination and its leave-one-out cross-validated counterpart, J. Comput.-Aided Mol. Des., 2013, 27, 837–844, DOI:10.1007/s10822-013-9680-4.
- Z. Wu, D. Li, J. Meng and H. Wang, Introduction to SIMCA-P and Its Application. In: Esposito Vinzi, V., Chin, W., Henseler, J. and Wang, H. (eds) Handbook of Partial Least Squares. Springer Handbooks of Computational Statistics, Springer, Berlin, Heidelberg, 2010, DOI:10.1007/978-3-540-32827-8_33.
- S. Wold, PLS for multivariate linear modeling. In: van de Waterbeemd, H. (ed.) Chemometric Methods in Molecular Design. Weinheim, VCH, 1995.
- N. Burden, S. K. Maynard, L. Weltje and J. R. Wheeler, The Utility of QSARs in Predicting Acute Fish Toxicity of Pesticide Metabolites: A Retrospective Validation Approach, Regul. Toxicol. Pharmacol., 2016, 80, 241–246, DOI:10.1016/j.yrtph.2016.05.032.
- K. A. Lewis, J. Tzilivakis, D. Warner and A. Green, An International Database for Pesticide Risk Assessments and Management, Hum. Ecol. Risk Assess.: Int. J., 2016, 22, 1050–1064, DOI:10.1080/10807039.2015.1133242.
- ECOSAR: https://www.epa.gov/tsca-screening-tools/ecological-structure-activity-relationships-ecosar-predictive-model (accessed on 30.3.25).
- National Research Council. GHS Classification: Committee on the Design and Evaluation of Safer Chemical Substitutions: A Framework to Inform Government and Industry Decision; Board on Chemical Sciences and Technology; Board on Environmental Studies and Toxicology; Division on Earth and Life Studies; Washington (DC): National Academies Press (US); 2014 Oct 29.
- EFSA, Guidance on tiered risk assessment for plant protection products for aquatic organisms in edge-of-field surface waters, EFSA J., 2013, 11, 3290.
- P. Gramatica, S. Cassani, P. P. Roy, S. Kovarich, C. W. Yap and E. Papa, QSAR Modeling is Not “Push a Button and Find a Correlation”: A Case Study of Toxicity of (Benzo-)triazoles On Algae, Mol. Inf., 2012, 31, 817–835, DOI:10.1002/minf.201200075.
- B. Bhhatarai, R. Garg and P. Gramatica, Are Mechanistic and Statistical QSAR Approaches Really Different? MLR Studies on 158 cycloalkyl-pyranones, Mol. Inf., 2010, 29, 511–522, DOI:10.1002/minf.201000011.
- T. Fujita and D. A. Winkler, Understanding the Roles of the “Two QSARs”, J. Chem. Inf. Model., 2016, 56, 269–274, DOI:10.1021/acs.jcim.5b00229.
- R. Guha, On the Interpretation and Interpretability of Quantitative Structure-Activity Relationship Models, J. Comput.-Aided Mol. Des., 2008, 22, 857–871, DOI:10.1007/s10822-008-9240-5.
- P. Gramatica, Origin of the OECD Principles for QSAR Validation and Their Role in Changing the QSAR Paradigm Worldwide: An Historical Overview, J. Chemom., 2025, 39, e70014, DOI:10.1002/cem.70014.
This journal is © The Royal Society of Chemistry 2025