Long Jiao*ab,
Xiaofei Wanga,
Shan Binga,
Zhiwei Xuec and
Hua Li*b
aCollege of Chemistry and Chemical Engineering, Xi'an Shiyou University, Xi'an 710065, P. R. China. E-mail: mop@xsyu.edu.cn; Fax: +86-29-88382702; Tel: +86-29-88383172
bCollege of Chemistry and Materials Science, Northwest University, Xi'an 710069, P. R. China. E-mail: nwufxkx2012@126.com; Fax: +86-29-88303527; Tel: +86-29-88302635
cNo. 203 Research Institute of Nuclear Industry, Xian yang 712000, P. R. China
First published on 17th December 2014
The quantitative structure-property relationship (QSPR) for the photolysis half-life (t1/2) of polychlorinated dibenzo-p-dioxins and polychlorinated dibenzofurans (PCDD/Fs) on spruce (Picea abies (L.) Karst.) needle surfaces under sunlight irradiation was investigated. The molecular distance-edge vector (MDEV) index was used as the structural descriptor of PCDD/Fs. The quantitative relationship between the MDEV index and log t1/2 was modeled using multivariable linear regression (MLR) and an artificial neural network (ANN). Leave-one-out cross validation and external validation were carried out to assess the predictive ability of the developed models. For the MLR method, the prediction root mean square relative error (RMSRE) of leave-one-out cross validation and external validation is 3.47 and 4.25, respectively. For the ANN method, the corresponding RMSRE values are 2.68 and 3.52. The results demonstrate that there is a quantitative relationship between the MDEV index and log t1/2 of PCDD/Fs, and that both MLR and ANN are practicable for modeling it. The developed MLR and ANN models were therefore used to predict the log t1/2 of each PCDD/F congener.
Photo-degradation is the degradation of a molecule caused by the absorption of photons. It is an important abiotic transformation pathway for compounds in the environment.7,8 Photo-degradation has been regarded as an important route for eliminating PCDD/F contamination and has therefore been studied by many researchers.7–11 Niu et al.10,11 reported that PCDD/Fs can be efficiently degraded on spruce (Picea abies (L.) Karst.) needle surfaces under sunlight irradiation. The photolysis half-life (t1/2) is an important parameter for characterizing photo-degradation reactions and assessing the environmental risk of PCDD/Fs. However, determining the t1/2 of PCDD/Fs is difficult because of the complexity of the analytical methods, the high cost of experiments, the time required and the lack of standards.7,12–14 Measured photolysis t1/2 data for PCDD/Fs are therefore rather scarce.7,15 Consequently, the quantitative structure-property relationship (QSPR) method, which is fast, simple and cost-effective for predicting the properties of compounds, has attracted much attention as a way to obtain preliminary estimates of the photolysis t1/2 of PCDD/Fs. Several QSPR models for predicting the t1/2 of PCDD/Fs under various reaction conditions have been studied.7,16–18 Niu et al.7 investigated a QSPR model for the photolysis t1/2 of PCDD/Fs on spruce (Picea abies (L.) Karst.) needle surfaces under sunlight irradiation. They suggested that there is no quantitative relationship between the t1/2 and the quantum chemical descriptors of PCDDs, and built a QSPR model only for the t1/2 of PCDFs by using quantum chemical descriptors. Moreover, the generation and selection of quantum chemical descriptors are somewhat complicated and time-consuming. Therefore, it is still worthwhile to investigate the QSPR model for the t1/2 of PCDD/Fs.
The aim of this work is to develop a reliable and easy-to-use QSPR model for estimating the photolysis t1/2 of PCDD/Fs. A topological index is a kind of structural descriptor commonly used in QSPR research. It can effectively describe the structure of a molecule without the need for detailed molecular orbital calculations and, despite its mathematical simplicity, is able to differentiate molecules with different structures.19,20 Thus, the molecular distance-edge vector (MDEV) index,21–24 a kind of topological index, was used as the structural descriptor of PCDD/Fs in this work. Multivariable linear regression (MLR) and artificial neural network (ANN) were used to model the quantitative relationship between the t1/2 and the MDEV index of PCDD/Fs.
t1/2 value is known, is listed in Table 1. The experimental log t1/2 of these 70 PCDD/Fs was taken from ref. 7 and 10 and is listed in Table 2.
Note to Table 1: compounds marked with an asterisk are the PCDD/F congeners in Group II (see text).

| No. | Compound | M11 | M12 | M22 |
|---|---|---|---|---|
| 1 | 1,2,3,9-T4CDD | 0.3485 | 4.2050 | 0.2500 |
| 2 | 1,2,6,7-T4CDD | 0.2862 | 4.2050 | 0.2500 |
| 3 | 1,2,6,8-T4CDD | 0.2457 | 4.2050 | 0.2500 |
| 4 | 1,2,6,9-T4CDD | 0.2353 | 4.2275 | 0.2500 |
| 5* | 1,2,8,9-T4CDD | 0.3064 | 4.2050 | 0.2500 |
| 6 | 1,3,6,8-T4CDD | 0.1986 | 4.2050 | 0.2500 |
| 7 | 1,3,7,8-T4CDD | 0.2376 | 4.1825 | 0.2500 |
| 8 | 1,3,7,9-T4CDD | 0.1997 | 4.2050 | 0.2500 |
| 9 | 1,4,7,8-T4CDD | 0.2232 | 4.2050 | 0.2500 |
| 10* | 2,3,7,8-T4CDD | 0.2782 | 4.1600 | 0.2500 |
| 11 | 1,2,3,4,6-P5CDD | 0.5825 | 5.2675 | 0.2500 |
| 12 | 1,2,3,6,7-P5CDD | 0.4959 | 5.2450 | 0.2500 |
| 13 | 1,2,3,6,8-P5CDD | 0.4520 | 5.2450 | 0.2500 |
| 14 | 1,2,3,6,9-P5CDD | 0.4450 | 5.2675 | 0.2500 |
| 15* | 1,2,3,7,8-P5CDD | 0.4878 | 5.2225 | 0.2500 |
| 16 | 1,2,3,7,9-P5CDD | 0.4546 | 5.2450 | 0.2500 |
| 17 | 1,2,3,8,9-P5CDD | 0.5080 | 5.2450 | 0.2500 |
| 18 | 1,2,4,6,7-P5CDD | 0.4369 | 5.2675 | 0.2500 |
| 19 | 1,2,4,7,8-P5CDD | 0.4248 | 5.2450 | 0.2500 |
| 20* | 1,2,4,8,9-P5CDD | 0.4450 | 5.2675 | 0.2500 |
| 21 | 1,2,3,4,6,7-H6CDD | 0.7577 | 6.3075 | 0.2500 |
| 22 | 1,2,3,4,6,9-H6CDD | 0.7068 | 6.3300 | 0.2500 |
| 23 | 1,2,3,4,7,8-H6CDD | 0.7375 | 6.2850 | 0.2500 |
| 24 | 1,2,3,6,7,8-H6CDD | 0.7179 | 6.2850 | 0.2500 |
| 25* | 1,2,3,7,8,9-H6CDD | 0.7252 | 6.2850 | 0.2500 |
| 26 | 1,2,3,4,6,7,8-H7CDD | 0.9953 | 7.3475 | 0.2500 |
| 27 | 1,2,3,4,6,7,9-H7CDD | 0.9444 | 7.3700 | 0.2500 |
| 28 | O8CDD | 1.2931 | 8.4100 | 0.2500 |
| 29 | 1,2,6,7-T4CDF | 0.3064 | 4.2761 | 1.0000 |
| 30* | 1,2,6,9-T4CDF | 0.2671 | 4.3472 | 1.0000 |
| 31 | 1,2,7,8-T4CDF | 0.3064 | 4.2761 | 1.0000 |
| 32 | 1,3,4,8-T4CDF | 0.2774 | 4.2761 | 1.0000 |
| 33 | 1,3,6,8-T4CDF | 0.2166 | 4.2761 | 1.0000 |
| 34 | 1,4,6,8-T4CDF | 0.2062 | 4.2986 | 1.0000 |
| 35* | 2,3,4,6-T4CDF | 0.3533 | 4.2275 | 1.0000 |
| 36 | 2,3,6,7-T4CDF | 0.2943 | 4.2050 | 1.0000 |
| 37 | 2,3,7,8-T4CDF | 0.2895 | 4.2050 | 1.0000 |
| 38 | 2,4,6,7-T4CDF | 0.2578 | 4.2275 | 1.0000 |
| 39 | 3,4,6,7-T4CDF | 0.3064 | 4.2050 | 1.0000 |
| 40* | 1,2,3,4,6-P5CDF | 0.5947 | 5.3386 | 1.0000 |
| 41 | 1,2,3,4,9-P5CDF | 0.6143 | 5.3872 | 1.0000 |
| 42 | 1,2,3,6,7-P5CDF | 0.5161 | 5.3161 | 1.0000 |
| 43 | 1,2,3,6,9-P5CDF | 0.4815 | 5.3872 | 1.0000 |
| 44 | 1,2,3,7,9-P5CDF | 0.4871 | 5.3647 | 1.0000 |
| 45* | 1,2,3,8,9-P5CDF | 0.5478 | 5.3872 | 1.0000 |
| 46 | 1,2,4,6,7-P5CDF | 0.4571 | 5.3386 | 1.0000 |
| 47 | 1,2,4,7,8-P5CDF | 0.4498 | 5.3386 | 1.0000 |
| 48 | 1,2,4,8,9-P5CDF | 0.4889 | 5.4097 | 1.0000 |
| 49 | 1,2,6,7,9-P5CDF | 0.4767 | 5.3872 | 1.0000 |
| 50* | 1,3,4,6,9-P5CDF | 0.4178 | 5.3872 | 1.0000 |
| 51 | 1,3,4,7,8-P5CDF | 0.4450 | 5.3161 | 1.0000 |
| 52 | 1,3,6,7,8-P5CDF | 0.4748 | 5.3161 | 1.0000 |
| 53 | 2,3,4,6,7-P5CDF | 0.5161 | 5.2675 | 1.0000 |
| 54 | 2,3,4,6,8-P5CDF | 0.4723 | 5.2900 | 1.0000 |
| 55* | 2,3,4,7,8-P5CDF | 0.5039 | 5.2675 | 1.0000 |
| 56 | 1,2,3,4,6,7-H6CDF | 0.7779 | 6.3786 | 1.0000 |
| 57 | 1,2,3,4,6,8-H6CDF | 0.7414 | 6.4011 | 1.0000 |
| 58 | 1,2,3,4,8,9-H6CDF | 0.8096 | 6.4497 | 1.0000 |
| 59 | 1,2,3,6,7,8-H6CDF | 0.7535 | 6.3786 | 1.0000 |
| 60* | 1,2,3,6,7,9-H6CDF | 0.7068 | 6.4272 | 1.0000 |
| 61 | 1,2,3,7,8,9-H6CDF | 0.7731 | 6.4272 | 1.0000 |
| 62 | 1,2,4,6,7,8-H6CDF | 0.6993 | 6.4011 | 1.0000 |
| 63 | 1,2,4,6,7,9-H6CDF | 0.6552 | 6.4497 | 1.0000 |
| 64 | 1,2,4,6,8,9-H6CDF | 0.6673 | 6.4722 | 1.0000 |
| 65* | 2,3,4,6,7,8-H6CDF | 0.7461 | 6.3300 | 1.0000 |
| 66 | 1,2,3,4,6,7,8-H7CDF | 1.0357 | 7.4411 | 1.0000 |
| 67 | 1,2,3,4,6,7,9-H7CDF | 0.9964 | 7.4897 | 1.0000 |
| 68 | 1,2,3,4,6,8,9-H7CDF | 1.0085 | 7.5122 | 1.0000 |
| 69 | 1,2,3,4,7,8,9-H7CDF | 1.0553 | 7.4897 | 1.0000 |
| 70* | O8CDF | 1.3653 | 8.5522 | 1.0000 |
Note to Table 2: compounds marked with an asterisk are the PCDD/F congeners in Group II (see text).

| No. | Compound | Experimental log t1/2 | Predicted log t1/2 (MLR) | Predicted log t1/2 (ANN) | Relative error, MLR (%) | Relative error, ANN (%) |
|---|---|---|---|---|---|---|
| 1 | 1,2,3,9-T4CDD | 1.90 | 1.819 | 1.840 | −4.26 | −3.16 |
| 2 | 1,2,6,7-T4CDD | 1.89 | 1.820 | 1.861 | −3.70 | −1.53 |
| 3 | 1,2,6,8-T4CDD | 1.91 | 1.816 | 1.857 | −4.92 | −2.77 |
| 4 | 1,2,6,9-T4CDD | 1.79 | 1.824 | 1.861 | 1.90 | 3.97 |
| 5* | 1,2,8,9-T4CDD | 1.95 | 1.828 | 1.857 | −6.26 | −4.77 |
| 6 | 1,3,6,8-T4CDD | 1.90 | 1.811 | 1.866 | −4.68 | −1.79 |
| 7 | 1,3,7,8-T4CDD | 1.94 | 1.812 | 1.860 | −6.60 | −4.12 |
| 8 | 1,3,7,9-T4CDD | 1.90 | 1.811 | 1.853 | −4.68 | −2.47 |
| 9 | 1,4,7,8-T4CDD | 1.76 | 1.824 | 1.871 | 3.64 | 6.31 |
| 10* | 2,3,7,8-T4CDD | 1.74 | 1.824 | 1.858 | 4.83 | 6.78 |
| 11 | 1,2,3,4,6-P5CDD | 1.82 | 1.892 | 1.852 | 3.96 | 1.76 |
| 12 | 1,2,3,6,7-P5CDD | 1.82 | 1.878 | 1.848 | 3.19 | 1.54 |
| 13 | 1,2,3,6,8-P5CDD | 1.74 | 1.879 | 1.855 | 7.99 | 6.61 |
| 14 | 1,2,3,6,9-P5CDD | 1.89 | 1.871 | 1.875 | −1.01 | −0.79 |
| 15* | 1,2,3,7,8-P5CDD | 1.76 | 1.874 | 1.845 | 6.48 | 4.83 |
| 16 | 1,2,3,7,9-P5CDD | 1.82 | 1.875 | 1.856 | 3.02 | 1.98 |
| 17 | 1,2,3,8,9-P5CDD | 1.85 | 1.878 | 1.856 | 1.51 | 0.32 |
| 18 | 1,2,4,6,7-P5CDD | 1.85 | 1.873 | 1.855 | 1.24 | 0.27 |
| 19 | 1,2,4,7,8-P5CDD | 1.91 | 1.866 | 1.857 | −2.30 | −2.77 |
| 20* | 1,2,4,8,9-P5CDD | 1.84 | 1.872 | 1.840 | 1.74 | 0.00 |
| 21 | 1,2,3,4,6,7-H6CDD | 1.97 | 1.929 | 1.910 | −2.08 | −3.05 |
| 22 | 1,2,3,4,6,9-H6CDD | 1.87 | 1.931 | 1.915 | 3.26 | 2.41 |
| 23 | 1,2,3,4,7,8-H6CDD | 1.96 | 1.927 | 1.908 | −1.68 | −2.65 |
| 24 | 1,2,3,6,7,8-H6CDD | 1.88 | 1.930 | 1.906 | 2.66 | 1.38 |
| 25* | 1,2,3,7,8,9-H6CDD | 1.97 | 1.928 | 1.906 | −2.13 | −3.25 |
| 26 | 1,2,3,4,6,7,8-H7CDD | 1.92 | 1.991 | 1.960 | 3.70 | 2.08 |
| 27 | 1,2,3,4,6,7,9-H7CDD | 1.98 | 1.980 | 1.960 | 0.00 | −1.01 |
| 28 | O8CDD | 2.02 | 2.049 | 1.960 | 1.44 | −2.97 |
| 29 | 1,2,6,7-T4CDF | 1.73 | 1.698 | 1.710 | −1.85 | −1.16 |
| 30* | 1,2,6,9-T4CDF | 1.61 | 1.699 | 1.686 | 5.53 | 4.72 |
| 31 | 1,2,7,8-T4CDF | 1.65 | 1.704 | 1.688 | 3.27 | 2.30 |
| 32 | 1,3,4,8-T4CDF | 1.64 | 1.701 | 1.695 | 3.72 | 3.35 |
| 33 | 1,3,6,8-T4CDF | 1.76 | 1.687 | 1.692 | −4.15 | −3.86 |
| 34 | 1,4,6,8-T4CDF | 1.70 | 1.691 | 1.700 | −0.53 | 0.00 |
| 35* | 2,3,4,6-T4CDF | 1.64 | 1.703 | 1.687 | 3.84 | 2.87 |
| 36 | 2,3,6,7-T4CDF | 1.70 | 1.697 | 1.699 | −0.18 | −0.06 |
| 37 | 2,3,7,8-T4CDF | 1.66 | 1.700 | 1.693 | 2.41 | 1.99 |
| 38 | 2,4,6,7-T4CDF | 1.72 | 1.693 | 1.698 | −1.57 | −1.28 |
| 39 | 3,4,6,7-T4CDF | 1.72 | 1.696 | 1.696 | −1.40 | −1.40 |
| 40* | 1,2,3,4,6-P5CDF | 1.67 | 1.758 | 1.720 | 5.27 | 2.99 |
| 41 | 1,2,3,4,9-P5CDF | 1.71 | 1.768 | 1.724 | 3.39 | 0.82 |
| 42 | 1,2,3,6,7-P5CDF | 1.74 | 1.751 | 1.732 | 0.63 | −0.46 |
| 43 | 1,2,3,6,9-P5CDF | 1.82 | 1.746 | 1.721 | −4.07 | −5.44 |
| 44 | 1,2,3,7,9-P5CDF | 1.65 | 1.753 | 1.717 | 6.24 | 4.06 |
| 45* | 1,2,3,8,9-P5CDF | 1.71 | 1.755 | 1.716 | 2.63 | 0.35 |
| 46 | 1,2,4,6,7-P5CDF | 1.69 | 1.748 | 1.716 | 3.43 | 1.54 |
| 47 | 1,2,4,7,8-P5CDF | 1.69 | 1.748 | 1.717 | 3.43 | 1.60 |
| 48 | 1,2,4,8,9-P5CDF | 1.67 | 1.754 | 1.721 | 5.03 | 3.05 |
| 49 | 1,2,6,7,9-P5CDF | 1.72 | 1.750 | 1.719 | 1.74 | −0.06 |
| 50* | 1,3,4,6,9-P5CDF | 1.65 | 1.744 | 1.704 | 5.70 | 3.27 |
| 51 | 1,3,4,7,8-P5CDF | 1.70 | 1.746 | 1.711 | 2.71 | 0.65 |
| 52 | 1,3,6,7,8-P5CDF | 1.68 | 1.749 | 1.718 | 4.11 | 2.26 |
| 53 | 2,3,4,6,7-P5CDF | 1.78 | 1.748 | 1.722 | −1.80 | −3.26 |
| 54 | 2,3,4,6,8-P5CDF | 1.72 | 1.746 | 1.718 | 1.51 | −0.12 |
| 55* | 2,3,4,7,8-P5CDF | 1.74 | 1.748 | 1.708 | 0.46 | −1.84 |
| 56 | 1,2,3,4,6,7-H6CDF | 1.76 | 1.808 | 1.821 | 2.73 | 3.47 |
| 57 | 1,2,3,4,6,8-H6CDF | 1.78 | 1.804 | 1.806 | 1.35 | 1.46 |
| 58 | 1,2,3,4,8,9-H6CDF | 1.79 | 1.812 | 1.827 | 1.23 | 2.07 |
| 59 | 1,2,3,6,7,8-H6CDF | 1.87 | 1.801 | 1.821 | −3.69 | −2.62 |
| 60* | 1,2,3,6,7,9-H6CDF | 1.85 | 1.801 | 1.809 | −2.65 | −2.22 |
| 61 | 1,2,3,7,8,9-H6CDF | 1.90 | 1.802 | 1.824 | −5.16 | −4.00 |
| 62 | 1,2,4,6,7,8-H6CDF | 1.76 | 1.802 | 1.803 | 2.39 | 2.44 |
| 63 | 1,2,4,6,7,9-H6CDF | 1.81 | 1.795 | 1.810 | −0.83 | 0.00 |
| 64 | 1,2,4,6,8,9-H6CDF | 1.92 | 1.782 | 1.808 | −7.19 | −5.83 |
| 65* | 2,3,4,6,7,8-H6CDF | 1.85 | 1.801 | 1.808 | −2.65 | −2.27 |
| 66 | 1,2,3,4,6,7,8-H7CDF | 1.93 | 1.853 | 1.929 | −3.99 | −0.05 |
| 67 | 1,2,3,4,6,7,9-H7CDF | 1.89 | 1.856 | 1.898 | −1.80 | 0.42 |
| 68 | 1,2,3,4,6,8,9-H7CDF | 1.93 | 1.854 | 1.927 | −3.94 | −0.16 |
| 69 | 1,2,3,4,7,8,9-H7CDF | 1.93 | 1.856 | 1.931 | −3.83 | 0.05 |
| 70* | O8CDF | 2.00 | 1.924 | 1.946 | −3.80 | −2.70 |
Root mean square relative error (RMSRE) was used to indicate the prediction performance of the obtained QSPR models. The RMSRE is defined as eqn (1):
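$$\mathrm{RMSRE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{\hat{y}_i - y_i}{y_i}\times 100\right)^{2}} \tag{1}$$

where y_i and ŷ_i are the experimental and predicted log t1/2 of the ith compound and n is the number of compounds. The relative errors are expressed in percent, as in Table 2.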
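The RMSRE can be sketched in a few lines of Python. This is a minimal illustration, assuming the RMSRE is the root of the mean squared percent relative error; that reading is an assumption, but it reproduces the external-validation value reported for the MLR model when applied to the Group II entries of Table 2. The function name `rmsre` and the variable names are hypothetical.

```python
import math

def rmsre(experimental, predicted):
    """Root mean square of the percent relative errors (assumed definition)."""
    if len(experimental) != len(predicted) or not experimental:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [((p - e) / e * 100.0) ** 2 for e, p in zip(experimental, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Group II (asterisked) rows of Table 2: experimental and MLR-predicted log t1/2.
group2_exp = [1.95, 1.74, 1.76, 1.84, 1.97, 1.61, 1.64,
              1.67, 1.71, 1.65, 1.74, 1.85, 1.85, 2.00]
group2_mlr = [1.828, 1.824, 1.874, 1.872, 1.928, 1.699, 1.703,
              1.758, 1.755, 1.744, 1.748, 1.801, 1.801, 1.924]

group2_rmsre = rmsre(group2_exp, group2_mlr)
print(round(group2_rmsre, 2))
```

Rounded to two decimals, the result matches the 4.25 quoted for the external validation of the MLR model.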
$$M_{kl} = \sum_{i,j}\frac{1}{d_{ik,jl}^{2}} \qquad (k, l = 1, 2;\ l \ge k) \tag{2}$$
In eqn (2), k and l denote the type of atom or group (k = 1 or l = 1 denotes a chlorine atom, and k = 2 or l = 2 denotes a benzene ring), and i and j are the coding numbers of the chlorine atoms or benzene rings. The term dik,jl represents the nearest relative distance between the two. For example, di1,j1 is the nearest relative distance between the ith and jth chlorine atoms. The relative distance between two adjacent non-hydrogen atoms is defined as d = 1. According to eqn (2), the MDEV index of a PCDD/F molecule has three elements: M11, M12 and M22. For example, the MDEV index of 1,3,7,9-T4CDD (the structure is shown in Fig. 1a) was calculated as follows:
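$$M_{11} = \sum\frac{1}{d_{i1,j1}^{2}} = 0.1997,\qquad M_{12} = \sum\frac{1}{d_{i1,j2}^{2}} = 4.2050,\qquad M_{22} = \sum\frac{1}{d_{i2,j2}^{2}} = 0.2500 \tag{3}$$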
The MDEV index of 1,2,6,7-T4CDF (the structure is shown in Fig. 1b) was calculated as follows:
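$$M_{11} = \sum\frac{1}{d_{i1,j1}^{2}} = 0.3064,\qquad M_{12} = \sum\frac{1}{d_{i1,j2}^{2}} = 4.2761,\qquad M_{22} = \sum\frac{1}{d_{i2,j2}^{2}} = 1.0000 \tag{4}$$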
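The calculation for 1,3,7,9-T4CDD can be retraced with a short Python sketch. It assumes that each element of the MDEV index is a sum of inverse squared relative distances, and it uses distances inferred here from the tabulated values for mono- and dichlorinated dioxins (for example, M12 = 1.0625 = 1/1² + 1/4² for 1-CDD and M11 = 0.0278 ≈ 1/6² for 1,9-D2CDD in Table 3). These distances are therefore assumptions, not values quoted from the authors' own worked example, and the function name `mdev` is hypothetical.

```python
def mdev(cl_cl, cl_ring, ring_ring):
    """Return (M11, M12, M22) as sums of 1/d**2 over the listed relative distances."""
    def inverse_square_sum(distances):
        return sum(1.0 / d ** 2 for d in distances)
    return (inverse_square_sum(cl_cl),
            inverse_square_sum(cl_ring),
            inverse_square_sum(ring_ring))

# 1,3,7,9-T4CDD with inferred distances (assumption, see the lead-in).
cl_cl = [4, 4, 6, 8, 8, 8]          # Cl pairs 1-3, 7-9, 1-9, 1-7, 3-7, 3-9
cl_ring = [1, 4, 1, 5, 1, 5, 1, 4]  # each Cl to its own ring, then to the other ring
ring_ring = [2]                     # the two benzene rings of the dioxin skeleton

m11, m12, m22 = mdev(cl_cl, cl_ring, ring_ring)
print(round(m11, 4), round(m12, 4), round(m22, 4))
```

With these inputs the sketch reproduces the Table 1 entry for 1,3,7,9-T4CDD (0.1997, 4.2050, 0.2500), which supports the inferred distances but does not prove them.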
ANN is a multivariable calibration method capable of modeling complex functions. Its basic processing unit is the neuron (node), and an ANN is composed of a number of neurons organized in layers. The multilayer perceptron (MLP) feed-forward artificial neural network, trained with the back propagation (BP) algorithm, is one of the most popular network architectures in use today. This kind of ANN is also known as a back propagation artificial neural network. An MLP-ANN consists of a series of layers. The first layer has a connection from the network input, each subsequent layer has a connection from the previous layer, and the final layer produces the output of the network. Each neuron in a particular layer is connected with all neurons in the next layer, and each connection is characterized by a weight. In the BP algorithm, the input variables are multiplied by the weights of each neuron. These products are summed for each neuron, and the sums are transformed with a non-linear transfer function. The transformed sums are then passed to the output neurons, where they are summed and transformed to give the output variables. Next, the error between the target values and the outputs is calculated and propagated backwards through the network to adjust the weights. The procedure is repeated until the error is minimized.
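The forward pass and weight update described above can be sketched in plain Python. This is an illustrative one-hidden-layer perceptron with sigmoid units and a squared-error loss, not the software used in this work. The class name `TinyMLP`, the weight initialization range, the random seed and the toy training sample are all assumptions, and the momentum term is omitted for brevity.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyMLP:
    """One-hidden-layer perceptron with sigmoid units, trained by back propagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # One weight row per hidden neuron; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        # Weighted sums followed by the non-linear transfer function.
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
                  for row in self.w_hidden]
        output = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, output

    def train_step(self, x, target, lr):
        hidden, output = self.forward(x)
        # Error signal at the output neuron (squared error, sigmoid derivative).
        delta_out = (output - target) * output * (1.0 - output)
        # Propagate the error signal back to each hidden neuron.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, v in enumerate(hidden + [1.0]):
            self.w_out[j] -= lr * delta_out * v
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * delta_hidden[j] * v
        return 0.5 * (output - target) ** 2

# Toy usage: three normalized inputs, one normalized target (hypothetical values).
net = TinyMLP(n_in=3, n_hidden=6)
sample, target = [0.2, 0.5, 0.25], 0.8
losses = [net.train_step(sample, target, lr=0.6) for _ in range(200)]
```

Repeating `train_step` on the same sample drives the squared error down, which is the "repeat until the error is minimized" loop in miniature. A real model would cycle over all calibration samples and monitor a separate verification set.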
t1/2 of PCDD/Fs
| No. | Compound | M11 | M12 | M22 | Predicted log t1/2 (MLR) | Predicted log t1/2 (ANN) |
|---|---|---|---|---|---|---|
| 1 | 1-CDD | 0.0000 | 1.0625 | 0.2500 | 1.691 | 1.830 |
| 2 | 2-CDD | 0.0000 | 1.0400 | 0.2500 | 1.690 | 1.830 |
| 3 | 1,2-D2CDD | 0.1111 | 2.1025 | 0.2500 | 1.735 | 1.850 |
| 4 | 1,3-D2CDD | 0.0625 | 2.1025 | 0.2500 | 1.729 | 1.842 |
| 5 | 1,4-D2CDD | 0.0400 | 2.1250 | 0.2500 | 1.727 | 1.842 |
| 6 | 1,6-D2CDD | 0.0204 | 2.1250 | 0.2500 | 1.724 | 1.844 |
| 7 | 1,7-D2CDD | 0.0156 | 2.1025 | 0.2500 | 1.723 | 1.844 |
| 8 | 1,8-D2CDD | 0.0204 | 2.1025 | 0.2500 | 1.724 | 1.839 |
| 9 | 1,9-D2CDD | 0.0278 | 2.1250 | 0.2500 | 1.725 | 1.849 |
| 10 | 2,3-D2CDD | 0.1111 | 2.0800 | 0.2500 | 1.734 | 1.849 |
| 11 | 2,7-D2CDD | 0.0123 | 2.0800 | 0.2500 | 1.722 | 1.848 |
| 12 | 2,8-D2CDD | 0.0156 | 2.0800 | 0.2500 | 1.722 | 1.848 |
| 13 | 1,2,3-T3CDD | 0.2847 | 3.1425 | 0.2500 | 1.786 | 1.858 |
| 14 | 1,2,4-T3CDD | 0.2136 | 3.1650 | 0.2500 | 1.778 | 1.858 |
| 15 | 1,2,6-T3CDD | 0.1471 | 3.1650 | 0.2500 | 1.770 | 1.858 |
| 16 | 1,2,7-T3CDD | 0.1391 | 3.1425 | 0.2500 | 1.768 | 1.858 |
| 17 | 1,2,8-T3CDD | 0.1471 | 3.1425 | 0.2500 | 1.769 | 1.858 |
| 18 | 1,2,9-T3CDD | 0.1593 | 3.1650 | 0.2500 | 1.772 | 1.858 |
| 19 | 1,3,6-T3CDD | 0.1033 | 3.1650 | 0.2500 | 1.765 | 1.858 |
| 20 | 1,3,7-T3CDD | 0.0938 | 3.1425 | 0.2500 | 1.763 | 1.858 |
| 21 | 1,3,8-T3CDD | 0.0953 | 3.1425 | 0.2500 | 1.763 | 1.858 |
| 22 | 1,3,9-T3CDD | 0.1059 | 3.1650 | 0.2500 | 1.765 | 1.858 |
| 23 | 1,4,6-T3CDD | 0.0882 | 3.1875 | 0.2500 | 1.764 | 1.850 |
| 24 | 1,4,7-T3CDD | 0.0760 | 3.1650 | 0.2500 | 1.761 | 1.858 |
| 25 | 2,3,6-T3CDD | 0.1471 | 3.1425 | 0.2500 | 1.769 | 1.858 |
| 26 | 2,3,7-T3CDD | 0.1391 | 3.1200 | 0.2500 | 1.768 | 1.858 |
| 27 | 1,2,3,4-T4CDD | 0.4983 | 4.2050 | 0.2500 | 1.843 | 1.851 |
| 28 | 1,2,3,6-T4CDD | 0.3412 | 4.2050 | 0.2500 | 1.824 | 1.857 |
| 29 | 1,2,3,7-T4CDD | 0.3283 | 4.1825 | 0.2500 | 1.822 | 1.857 |
| 30 | 1,2,3,8-T4CDD | 0.3331 | 4.1825 | 0.2500 | 1.822 | 1.857 |
| 31 | 1,2,4,6-T4CDD | 0.2774 | 4.2275 | 0.2500 | 1.817 | 1.842 |
| 32 | 1,2,4,7-T4CDD | 0.2620 | 4.2050 | 0.2500 | 1.814 | 1.858 |
| 33 | 1,2,4,8-T4CDD | 0.2653 | 4.2050 | 0.2500 | 1.815 | 1.858 |
| 34 | 1,2,4,9-T4CDD | 0.2822 | 4.2275 | 0.2500 | 1.817 | 1.858 |
| 35 | 1,2,7,8-T4CDD | 0.2862 | 4.1825 | 0.2500 | 1.817 | 1.858 |
| 36 | 1,2,7,9-T4CDD | 0.2498 | 4.2050 | 0.2500 | 1.813 | 1.858 |
| 37 | 1,3,6,9-T4CDD | 0.1867 | 4.2275 | 0.2500 | 1.806 | 1.859 |
| 38 | 1,4,6,9-T4CDD | 0.1764 | 4.2500 | 0.2500 | 1.805 | 1.859 |
| 39 | 1,2,3,4,7-P5CDD | 0.5623 | 5.2450 | 0.2500 | 1.881 | 1.846 |
| 40 | 1,2,4,6,8-P5CDD | 0.3916 | 5.2675 | 0.2500 | 1.861 | 1.855 |
| 41 | 1,2,4,6,9-P5CDD | 0.3860 | 5.2900 | 0.2500 | 1.861 | 1.855 |
| 42 | 1,2,4,7,9-P5CDD | 0.3931 | 5.2675 | 0.2500 | 1.861 | 1.855 |
| 43 | 1,2,3,4,6,8-H6CDD | 0.7091 | 6.3075 | 0.2500 | 1.930 | 1.917 |
| 44 | 1,2,3,6,7,9-H6CDD | 0.6622 | 6.3075 | 0.2500 | 1.924 | 1.914 |
| 45 | 1,2,3,6,8,9-H6CDD | 0.6670 | 6.3075 | 0.2500 | 1.925 | 1.914 |
| 46 | 1,2,4,6,7,9-H6CDD | 0.6080 | 6.3300 | 0.2500 | 1.919 | 1.911 |
| 47 | 1,2,4,6,8,9-H6CDD | 0.6113 | 6.3300 | 0.2500 | 1.919 | 1.911 |
| 48 | 1-CDF | 0.0000 | 1.1111 | 1.0000 | 1.558 | 1.739 |
| 49 | 2-CDF | 0.0000 | 1.0625 | 1.0000 | 1.557 | 1.739 |
| 50 | 3-CDF | 0.0000 | 1.0400 | 1.0000 | 1.556 | 1.739 |
| 51 | 4-CDF | 0.0000 | 1.0625 | 1.0000 | 1.557 | 1.739 |
| 52 | 1,2-D2CDF | 0.1111 | 2.1736 | 1.0000 | 1.603 | 1.723 |
| 53 | 1,3-D2CDF | 0.0625 | 2.1511 | 1.0000 | 1.596 | 1.725 |
| 54 | 1,4-D2CDF | 0.0400 | 2.1736 | 1.0000 | 1.594 | 1.726 |
| 55 | 1,6-D2CDF | 0.0278 | 2.1736 | 1.0000 | 1.593 | 1.726 |
| 56 | 1,7-D2CDF | 0.0204 | 2.1511 | 1.0000 | 1.591 | 1.727 |
| 57 | 1,8-D2CDF | 0.0278 | 2.1736 | 1.0000 | 1.593 | 1.726 |
| 58 | 1,9-D2CDF | 0.0400 | 2.2222 | 1.0000 | 1.596 | 1.725 |
| 59 | 2,3-D2CDF | 0.1111 | 2.1025 | 1.0000 | 1.601 | 1.724 |
| 60 | 2,4-D2CDF | 0.0625 | 2.1250 | 1.0000 | 1.596 | 1.725 |
| 61 | 2,6-D2CDF | 0.0204 | 2.1250 | 1.0000 | 1.590 | 1.727 |
| 62 | 2,7-D2CDF | 0.0156 | 2.1025 | 1.0000 | 1.589 | 1.728 |
| 63 | 2,8-D2CDF | 0.0204 | 2.1250 | 1.0000 | 1.590 | 1.727 |
| 64 | 3,4-D2CDF | 0.1111 | 2.1025 | 1.0000 | 1.601 | 1.724 |
| 65 | 3,6-D2CDF | 0.0204 | 2.1025 | 1.0000 | 1.590 | 1.728 |
| 66 | 3,7-D2CDF | 0.0156 | 2.0800 | 1.0000 | 1.589 | 1.728 |
| 67 | 4,6-D2CDF | 0.0278 | 2.1250 | 1.0000 | 1.591 | 1.727 |
| 68 | 1,2,3-T3CDF | 0.2847 | 3.2136 | 1.0000 | 1.654 | 1.696 |
| 69 | 1,2,4-T3CDF | 0.2136 | 3.2361 | 1.0000 | 1.646 | 1.701 |
| 70 | 1,2,6-T3CDF | 0.1593 | 3.2361 | 1.0000 | 1.640 | 1.705 |
| 71 | 1,2,7-T3CDF | 0.1471 | 3.2136 | 1.0000 | 1.638 | 1.706 |
| 72 | 1,2,8-T3CDF | 0.1593 | 3.2361 | 1.0000 | 1.640 | 1.705 |
| 73 | 1,2,9-T3CDF | 0.1789 | 3.2847 | 1.0000 | 1.644 | 1.703 |
| 74 | 1,3,4-T3CDF | 0.2136 | 3.2136 | 1.0000 | 1.646 | 1.702 |
| 75 | 1,3,6-T3CDF | 0.1107 | 3.2136 | 1.0000 | 1.633 | 1.709 |
| 76 | 1,3,7-T3CDF | 0.0985 | 3.1911 | 1.0000 | 1.631 | 1.724 |
| 77 | 1,3,8-T3CDF | 0.1059 | 3.2136 | 1.0000 | 1.633 | 1.724 |
| 78 | 1,3,9-T3CDF | 0.1229 | 3.2622 | 1.0000 | 1.636 | 1.722 |
| 79 | 1,4,6-T3CDF | 0.0956 | 3.2361 | 1.0000 | 1.632 | 1.724 |
| 80 | 1,4,7-T3CDF | 0.0808 | 3.2136 | 1.0000 | 1.630 | 1.725 |
| 81 | 1,4,8-T3CDF | 0.0882 | 3.2361 | 1.0000 | 1.631 | 1.724 |
| 82 | 1,4,9-T3CDF | 0.1078 | 3.2847 | 1.0000 | 1.635 | 1.722 |
| 83 | 1,6,7-T3CDF | 0.1593 | 3.2136 | 1.0000 | 1.639 | 1.721 |
| 84 | 1,6,8-T3CDF | 0.1181 | 3.2361 | 1.0000 | 1.635 | 1.723 |
| 85 | 1,7,8-T3CDF | 0.1593 | 3.2136 | 1.0000 | 1.639 | 1.706 |
| 86 | 2,3,4-T3CDF | 0.2847 | 3.1650 | 1.0000 | 1.653 | 1.696 |
| 87 | 2,3,6-T3CDF | 0.1519 | 3.1650 | 1.0000 | 1.637 | 1.706 |
| 88 | 2,3,7-T3CDF | 0.1424 | 3.1425 | 1.0000 | 1.635 | 1.708 |
| 89 | 2,3,8-T3CDF | 0.1471 | 3.1650 | 1.0000 | 1.636 | 1.707 |
| 90 | 2,4,6-T3CDF | 0.1107 | 3.1875 | 1.0000 | 1.632 | 1.708 |
| 91 | 2,4,7-T3CDF | 0.0985 | 3.1650 | 1.0000 | 1.630 | 1.710 |
| 92 | 2,4,8-T3CDF | 0.1033 | 3.1875 | 1.0000 | 1.631 | 1.709 |
| 93 | 2,6,7-T3CDF | 0.1471 | 3.1650 | 1.0000 | 1.636 | 1.707 |
| 94 | 3,4,6-T3CDF | 0.1593 | 3.1650 | 1.0000 | 1.638 | 1.706 |
| 95 | 3,4,7-T3CDF | 0.1471 | 3.1425 | 1.0000 | 1.636 | 1.707 |
| 96 | 1,2,3,4-T4CDF | 0.4983 | 4.2761 | 1.0000 | 1.711 | 1.686 |
| 97 | 1,2,3,6-T4CDF | 0.3533 | 4.2761 | 1.0000 | 1.694 | 1.684 |
| 98 | 1,2,3,7-T4CDF | 0.3364 | 4.2536 | 1.0000 | 1.691 | 1.691 |
| 99 | 1,2,3,8-T4CDF | 0.3485 | 4.2761 | 1.0000 | 1.693 | 1.690 |
| 100 | 1,2,3,9-T4CDF | 0.3729 | 4.3247 | 1.0000 | 1.698 | 1.688 |
| 101 | 1,2,4,6-T4CDF | 0.2896 | 4.2986 | 1.0000 | 1.687 | 1.693 |
| 102 | 1,2,4,7-T4CDF | 0.2701 | 4.2761 | 1.0000 | 1.684 | 1.686 |
| 103 | 1,2,4,8-T4CDF | 0.2822 | 4.2986 | 1.0000 | 1.686 | 1.686 |
| 104 | 1,2,4,9-T4CDF | 0.3092 | 4.3472 | 1.0000 | 1.690 | 1.691 |
| 105 | 1,2,6,8-T4CDF | 0.2700 | 4.2986 | 1.0000 | 1.684 | 1.686 |
| 106 | 1,2,7,9-T4CDF | 0.2774 | 4.3247 | 1.0000 | 1.686 | 1.686 |
| 107 | 1,2,8,9-T4CDF | 0.3382 | 4.3472 | 1.0000 | 1.694 | 1.690 |
| 108 | 1,3,4,6-T4CDF | 0.2896 | 4.2761 | 1.0000 | 1.686 | 1.685 |
| 109 | 1,3,4,7-T4CDF | 0.2701 | 4.2536 | 1.0000 | 1.683 | 1.686 |
| 110 | 1,3,4,9-T4CDF | 0.3018 | 4.3247 | 1.0000 | 1.689 | 1.692 |
| 111 | 1,3,6,7-T4CDF | 0.2578 | 4.2536 | 1.0000 | 1.681 | 1.687 |
| 112 | 1,3,6,9-T4CDF | 0.2111 | 4.3247 | 1.0000 | 1.678 | 1.689 |
| 113 | 1,3,7,8-T4CDF | 0.2530 | 4.2536 | 1.0000 | 1.681 | 1.687 |
| 114 | 1,3,7,9-T4CDF | 0.2214 | 4.3022 | 1.0000 | 1.678 | 1.688 |
| 115 | 1,4,6,7-T4CDF | 0.2475 | 4.2761 | 1.0000 | 1.681 | 1.687 |
| 116 | 1,4,6,9-T4CDF | 0.2033 | 4.3472 | 1.0000 | 1.677 | 1.689 |
| 117 | 1,4,7,8-T4CDF | 0.2401 | 4.2761 | 1.0000 | 1.680 | 1.688 |
| 118 | 1,6,7,8-T4CDF | 0.3607 | 4.2761 | 1.0000 | 1.695 | 1.689 |
| 119 | 2,3,4,7-T4CDF | 0.3364 | 4.2050 | 1.0000 | 1.690 | 1.692 |
| 120 | 2,3,4,8-T4CDF | 0.3412 | 4.2275 | 1.0000 | 1.691 | 1.691 |
| 121 | 2,3,6,8-T4CDF | 0.2505 | 4.2275 | 1.0000 | 1.680 | 1.687 |
| 122 | 2,4,6,8-T4CDF | 0.2140 | 4.2500 | 1.0000 | 1.676 | 1.689 |
| 123 | 1,2,3,4,7-P5CDF | 0.5704 | 5.3161 | 1.0000 | 1.750 | 1.721 |
| 124 | 1,2,3,4,8-P5CDF | 0.5826 | 5.3386 | 1.0000 | 1.753 | 1.722 |
| 125 | 1,2,3,6,8-P5CDF | 0.4796 | 5.3386 | 1.0000 | 1.740 | 1.716 |
| 126 | 1,2,3,7,8-P5CDF | 0.5113 | 5.3161 | 1.0000 | 1.743 | 1.714 |
| 127 | 1,2,4,6,8-P5CDF | 0.4207 | 5.3611 | 1.0000 | 1.734 | 1.713 |
| 128 | 1,2,4,6,9-P5CDF | 0.4251 | 5.4097 | 1.0000 | 1.736 | 1.716 |
| 129 | 1,2,4,7,9-P5CDF | 0.4281 | 5.3872 | 1.0000 | 1.735 | 1.715 |
| 130 | 1,2,6,7,8-P5CDF | 0.5282 | 5.3386 | 1.0000 | 1.746 | 1.719 |
| 131 | 1,3,4,6,7-P5CDF | 0.4571 | 5.3161 | 1.0000 | 1.737 | 1.713 |
| 132 | 1,3,4,6,8-P5CDF | 0.4159 | 5.3386 | 1.0000 | 1.732 | 1.711 |
| 133 | 1,3,4,7,9-P5CDF | 0.4207 | 5.3647 | 1.0000 | 1.734 | 1.713 |
| 134 | 1,4,6,7,8-P5CDF | 0.4693 | 5.3386 | 1.0000 | 1.739 | 1.719 |
| 135 | 1,2,3,4,6,9-H6CDF | 0.7507 | 6.4497 | 1.0000 | 1.806 | 1.820 |
| 136 | 1,2,3,4,7,8-H6CDF | 0.7657 | 6.3786 | 1.0000 | 1.805 | 1.815 |
| 137 | 1,2,3,4,7,9-H6CDF | 0.7489 | 6.4272 | 1.0000 | 1.805 | 1.827 |
| 138 | 1,2,3,6,8,9-H6CDF | 0.7189 | 6.4497 | 1.0000 | 1.802 | 1.826 |
| 139 | 1,3,4,6,7,8-H6CDF | 0.6945 | 6.3786 | 1.0000 | 1.797 | 1.812 |
| 140 | 1,3,4,6,7,9-H6CDF | 0.6478 | 6.4272 | 1.0000 | 1.792 | 1.813 |
t1/2 of PCDD/Fs. The MDEV index was used as the independent variable and log t1/2 as the dependent variable to develop the regression model. To assess the predictive performance of the developed model, two validation methods, leave-one-out cross validation and external validation, were carried out. The 70 samples shown in Table 1 were randomly split into two groups: Group I, comprising 56 samples, and Group II, comprising 14 samples (marked by an asterisk in Tables 1 and 2). First, Group I was used for the leave-one-out cross validation. The result is presented in Table 2, and Fig. 2a plots the predicted log t1/2 against the experimental log t1/2. As shown in Table 2 and Fig. 2a, the predicted log t1/2 is in agreement with the experimental log t1/2: for the 56 compounds, the RMSRE of prediction is 3.47. Subsequently, external validation was carried out to further assess the predictive performance of the MLR model. In this procedure, the model was developed using all 56 compounds in Group I as the calibration set. The obtained regression equation is log t1/2 = 0.08967M11 + 0.03002M12 − 0.1726M22 + 1.7170. The log t1/2 of the samples in Group II was then predicted with this regression equation. The result is shown in Table 2, and the plot of predicted versus experimental log t1/2 is shown in Fig. 2a. For the 14 compounds, the prediction RMSRE is 4.25, so the predicted log t1/2 is again in agreement with the experimental log t1/2.
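The leave-one-out procedure for an MLR model can be sketched as follows. This is a minimal pure-Python illustration using ordinary least squares via the normal equations; the paper does not state which software or solver was used, so the helper names (`solve`, `fit_mlr`, `predict`, `leave_one_out`) are hypothetical. To keep the check self-verifying, the demonstration targets are generated noise-free from the Group I regression equation quoted above rather than taken from measurements, so every left-out prediction should reproduce its target.

```python
def solve(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_mlr(X, y):
    """Least-squares coefficients; the last coefficient is the intercept."""
    rows = [list(r) + [1.0] for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(coef, x):
    return sum(c * v for c, v in zip(coef, x)) + coef[-1]

def leave_one_out(X, y):
    """Refit without sample i, then predict sample i, for every i."""
    preds = []
    for i in range(len(X)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(coef, X[i]))
    return preds

# MDEV rows (M11, M12, M22) copied from Table 1 (compounds 1, 4, 11, 28, 29, 40, 66).
X_demo = [[0.3485, 4.2050, 0.25], [0.2353, 4.2275, 0.25], [0.5825, 5.2675, 0.25],
          [1.2931, 8.4100, 0.25], [0.3064, 4.2761, 1.0], [0.5947, 5.3386, 1.0],
          [1.0357, 7.4411, 1.0]]
# Synthetic, noise-free targets from the Group I equation (not experimental data).
y_demo = [0.08967 * a + 0.03002 * b - 0.1726 * c + 1.7170 for a, b, c in X_demo]
loo_predictions = leave_one_out(X_demo, y_demo)
```

With the experimental log t1/2 values of the 56 Group I compounds in place of `y_demo`, the same loop would yield the kind of cross-validated predictions listed in Table 2.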
The results of leave-one-out cross validation and external validation demonstrate that the MDEV index is quantitatively related to the log t1/2 of PCDD/Fs, and that MLR is practicable for modeling this relationship. In the previous research reported by Niu et al.,7 it was proposed that quantum chemical descriptors are not quantitatively related to the t1/2 of PCDDs. By contrast, a QSPR model for the t1/2 of PCDDs can be built by using the MDEV index as the structural descriptor. In addition, building a QSPR model based on the MDEV index is easier than building one based on quantum chemical descriptors. Thus, predicting the log t1/2 of PCDD/Fs with the QSPR model based on the MDEV index is convenient and practicable. An MLR model was then developed using all 70 PCDD/Fs listed in Table 1. The obtained regression equation is log t1/2 = 0.1220M11 + 0.02914M12 − 0.1785M22 + 1.7045. The log t1/2 of the other 140 PCDD/Fs was then predicted with this regression equation, and the result is shown in Table 3. The log t1/2 values of these PCDD/Fs have not yet been determined experimentally, so the predictions can serve as estimates of the log t1/2 of these compounds.
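Applying the 70-compound regression equation is a one-line calculation, shown here as a short Python sketch. The coefficients are those quoted in the text and the MDEV values are copied from Table 3; the function name `predict_log_half_life` is hypothetical.

```python
def predict_log_half_life(m11, m12, m22):
    """MLR equation fitted to all 70 congeners, as reported in the text."""
    return 0.1220 * m11 + 0.02914 * m12 - 0.1785 * m22 + 1.7045

# MDEV values (M11, M12, M22) copied from Table 3.
examples = {
    "1-CDD": (0.0000, 1.0625, 0.2500),
    "1,2,3,4-T4CDD": (0.4983, 4.2050, 0.2500),
    "1-CDF": (0.0000, 1.1111, 1.0000),
}
predictions = {name: round(predict_log_half_life(*mdev), 3)
               for name, mdev in examples.items()}
for name, value in predictions.items():
    print(name, value)
```

The three results (1.691, 1.843 and 1.558) agree with the MLR column of Table 3, which is a quick way to check that a descriptor row has been transcribed correctly.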
t1/2 of PCDD/Fs, the model is not perfect, because the prediction error for several compounds is somewhat large. For example, the prediction error for 1,2,3,6,8-P5CDD reaches 7.99%. Niu et al.7 showed that a combination of frontier molecular orbital energies, (ELUMO − EHOMO)2, is significant for the log t1/2 of PCDD/Fs. That is to say, there might be a nonlinear relationship between the structure and the log t1/2 of PCDD/Fs, and a linear model is probably not the best choice for describing it. A nonlinear QSPR model might predict the log t1/2 better than a linear one. ANN is a commonly used multivariable calibration method for establishing nonlinear calibration models. Thus, we investigated whether a better model can be obtained by using ANN. A 3-6-1 ANN (i.e. 3 nodes in the input layer, 6 nodes in the hidden layer and 1 node in the output layer) with a sigmoid transfer function was used. The learning rate and momentum were set to 0.6 and 0.3, respectively. In each run of the ANN, the verification set consisted of 14 randomly selected samples. The MDEV index and log t1/2 were used as the input and output variables, respectively. Prior to training, the input and output variables were normalized.
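Two of the settings mentioned above, normalization and the learning rate/momentum pair, can be illustrated with a short Python sketch. The text does not say which normalization was applied or which momentum variant the software used, so min-max scaling to [0, 1] and the classical momentum rule are assumptions here, and all function names are hypothetical.

```python
def minmax_fit(values):
    """Return the (min, max) pair used to scale a variable to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant column cannot be min-max scaled")
    return lo, hi

def minmax_scale(value, lo, hi):
    return (value - lo) / (hi - lo)

def minmax_unscale(scaled, lo, hi):
    """Map a network output back to the original log t1/2 scale."""
    return scaled * (hi - lo) + lo

def momentum_update(weight, gradient, previous_delta, learning_rate=0.6, momentum=0.3):
    """Classical momentum: the new step reuses a fraction of the previous step."""
    delta = -learning_rate * gradient + momentum * previous_delta
    return weight + delta, delta

# Experimental log t1/2 values taken from Table 2 (compounds 1, 10, 28 and 30).
log_t = [1.90, 1.74, 2.02, 1.61]
lo, hi = minmax_fit(log_t)
scaled = [minmax_scale(v, lo, hi) for v in log_t]
restored = [minmax_unscale(s, lo, hi) for s in scaled]

# One weight update with a made-up gradient and previous step.
new_weight, delta = momentum_update(weight=1.0, gradient=0.5, previous_delta=0.2)
```

Scaling the target to [0, 1] matters with a sigmoid output node, whose range cannot reach raw log t1/2 values near 2; predictions are mapped back with `minmax_unscale`.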
Leave-one-out cross validation and external validation were carried out to assess the prediction performance of the developed model. Group I was used for the leave-one-out cross validation. The result is listed in Table 2, and Fig. 2b plots the predicted log t1/2 against the experimental log t1/2. As shown in Table 2 and Fig. 2b, the predicted log t1/2 is in good agreement with the experimental log t1/2: for the 56 compounds, the RMSRE of prediction is 2.68. Then, all 70 samples were used for the external validation. A 3-6-1 ANN was developed using the 56 compounds in Group I as the calibration set, and the log t1/2 of the samples in Group II was then predicted with the obtained network. The result of external validation is shown in Table 2, and the plot of predicted versus experimental log t1/2 is shown in Fig. 2b. The predicted log t1/2 is again in good agreement with the experimental log t1/2: for the 14 samples, the prediction RMSRE is 3.52. The results of leave-one-out cross validation and external validation show that the prediction accuracy of the ANN model is slightly higher than that of the MLR model (RMSRE of 2.68 versus 3.47, and 3.52 versus 4.25). This improvement in prediction accuracy can be attributed to the use of ANN. As expected, ANN is a practicable method for developing the calibration model between the MDEV index and log t1/2 of PCDD/Fs, and it performs slightly better than MLR for modeling this relationship.
Since the ANN model is practicable for predicting the log t1/2 of PCDD/Fs, a 3-6-1 ANN was developed using all 70 PCDD/Fs listed in Table 1. The log t1/2 of the remaining 140 PCDD/Fs was then predicted with the obtained network. Table 3 lists the prediction result. The prediction result obtained from the ANN model can also be used as an estimate of the log t1/2 of these compounds and, given the lower validation RMSRE, is expected to be more accurate than that obtained from the MLR model.
t1/2 of PCDD/Fs.
It is demonstrated that the MDEV index is quantitatively related to the log t1/2 of PCDD/Fs, so it is reasonable to establish the QSPR model for the log t1/2 of PCDD/Fs using the MDEV index as the structural descriptor. Although a QSPR model for the t1/2 of PCDDs could not be built with quantum chemical descriptors, the QSPR model based on the MDEV index is able to describe the quantitative relationship between the log t1/2 and the structure of PCDDs. The MDEV index can also be generated more easily than quantum chemical descriptors. Accordingly, using the MDEV index as the structural descriptor is easier than using quantum chemical descriptors when developing the QSPR model for the log t1/2 of PCDFs.
Moreover, the validation results demonstrate that both MLR and ANN are practicable for modeling the quantitative relationship between the MDEV index and log t1/2 of PCDD/Fs. Compared with the MLR model, the ANN model shows higher prediction accuracy. Hence, ANN is slightly superior to MLR for developing the QSPR model of the log t1/2 of PCDD/Fs.
The proposed method is easy to use and practicable for predicting the log t1/2 of PCDD/Fs. Thus, the log t1/2 of each PCDD/F congener was predicted with the obtained models. The predicted values can be used as estimates of the log t1/2 of PCDD/Fs and in studies of their photolysis reactions.
This journal is © The Royal Society of Chemistry 2015