Urinary metabolomics study on an induced-stress rat model using UPLC-QTOF/MS

Yuan-yuan Xie a, Li Liab, Qun Shao*c, Yi-ming Wanga, Qiong-Lin Lianga, Hui-Yun Zhange, Peng Sune, Ming-qi Qiaoe and Guo-An Luo*ad
aDepartment of Chemistry, Tsinghua University, Beijing 100084, P. R. China. E-mail: luoga@mail.tsinghua.edu.cn; Fax: +86-10-62781688; Tel: +86-10-62781688
bThe Second College of Clinical Medicine, Guangzhou University of Chinese Medicine, Guangzhou 510120, P. R. China
cSchool of Life Science, University of Bradford, Bradford, West Yorkshire BD71DP, UK. E-mail: q.shao@bradford.ac.uk; Fax: +44 (0)1274 236155; Tel: +44 (0)1274 236041
dState Key Laboratory for Quality Research in Chinese Medicine, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau, P. R. China
eShandong University of Traditional Chinese Medicine, Jinan, P. R. China

Received 10th June 2015, Accepted 25th August 2015

First published on 25th August 2015


Abstract

A urinary metabolomics method based on ultra-performance liquid chromatography coupled with quadrupole/time-of-flight mass spectrometry (UPLC-QTOF/MS) was employed to investigate the pathogenesis and therapeutic effects of a Baixiangdan capsule on rats undergoing electric-induced stress for five days. Multivariate analysis techniques, such as principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA), were applied to observe the temporal changes in the metabolic state of the electric-stressed rats visually, as well as the recovering tendency of the rats treated with the Baixiangdan capsule. Artificial intelligence technology (artificial neural networks and neurofuzzy logic) was used to identify potential biomarkers, and the results showed a high overlap with the PLS-DA model. A total of 14 potential biomarkers representing the major cause–effect relationships between the variations in the endogenous metabolites and the dynamic pathological processes associated with the stress induced by the electric stimulation were identified, including amino acid metabolites, such as 2-aminoadipic acid, hippuric acid, spermine, 4-hydroxyglutamate and L-phenylalanine, in addition to prostaglandin F3a and melatonin. The results indicated that the pathways corresponding to L-phenylalanine, tyrosine, tryptophan, arginine, proline metabolism, pantothenic acid, and coenzyme A synthesis were disturbed in the electric-stressed rats, and that the application of the Baixiangdan capsule may regulate the aforementioned metabolic pathways back to their initial states. The application of artificial intelligence technologies provided powerful and promising tools to model complex metabolomic data and to discover hidden knowledge regarding the potential biomarkers associated with the development of disease, which are also suitable for other complex biological data sets.


1. Introduction

Metabolomics, one of the major platforms in systems biology, is used to study perturbations in response to physiological challenges, toxic insults or disease processes by monitoring low-molecular-weight metabolites (<1 kDa) and their dynamic changes in complex biological samples.1–3 Recently, an increasing number of publications have described the application of a metabolomics approach in traditional Chinese medicine (TCM) research, which demonstrates that metabolomics is a powerful tool for assessing the holistic efficacy of TCM formulae because the global metabolic state of an entire organism can be represented via a single metabolic profile analysis.4–6 Normally, information-rich metabolomics data are acquired from high-field nuclear magnetic resonance (NMR) spectroscopy,7 gas chromatography mass spectrometry (GC-MS),8 and/or UPLC-QTOF/MS.9 Furthermore, many multivariate analysis techniques, such as principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), orthogonal partial least squares discriminant analysis (OPLS-DA) and support vector machine recursive feature elimination (SVM-RFE), are commonly used to find informative biomarkers for subsequent studies.10,11

Neurofuzzy logic, which combines the adaptive learning capabilities of artificial neural networks (ANNs) with the generality of representation from fuzzy logic, is one of the artificial intelligence (AI) technologies that has proven to be an effective tool for analysing complex biological data sets.12 Fuzzy logic modelling implementing the adaptive B-spline modelling of observation data (ASMOD) algorithm can be applied to generate a number of training models and to perform training tests to determine which one best fits the data. The quality of the models is assessed using various statistical fitness criteria, e.g., Akaike’s Information Theoretic Criterion (AIC), final prediction error (FPE), cross validation (CV), generalised cross validation (GCV), minimum description length (MDL) and structure risk minimization (SRM). These fitness criteria aim to minimize a criterion containing two terms—one involving the prediction errors computed in the data set and the other involving the complexity of the structure of the trained models. These training parameters are then investigated to obtain the ideal model which has the best predictions of the validation data, and generates intelligible rules in an “if then” format that explicitly represent the cause–effect relationships contained in the experimental data. During the training process, the improvement of the model training is assessed via these fitness criteria, which is different from the cross validation approach commonly applied in neural networks, where the test data are used. In this method, neural networks are used to optimise certain parameters of the fuzzy systems and automatically extract fuzzy rules from the numerical data. Five back-propagation learning algorithms, including Standard Incremental, Standard Batch, RPROP, Quickprop, and Angle Driven Learning, are used to adjust the weights of the network connections during the training. 
A change in the weights will affect the contribution of each input variable and therefore largely influence the way that a trained network gives predictions.13 Neurofuzzy logic has been successfully applied in tablet film coatings, pharmaceutical formulations and processing.14 However, the application of neurofuzzy logic in metabolomics data analysis to discover hidden knowledge regarding the potential biomarkers associated with the development of a disease remains relatively new.
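The two-term trade-off behind the fitness criteria described above (prediction error plus a penalty on model complexity) can be illustrated with a minimal sketch. It uses the common least-squares form of AIC, n·ln(RSS/n) + 2k, which is an assumption here: the exact expressions implemented in the neurofuzzy software are not given in the text, and the function names and candidate values are hypothetical.

```python
from math import log

def aic_least_squares(rss, n_records, n_parameters):
    """Least-squares form of AIC: an error term plus a complexity penalty.

    rss          -- residual sum of squares on the training data
    n_records    -- number of data records used for training
    n_parameters -- number of free parameters in the candidate model
    """
    return n_records * log(rss / n_records) + 2 * n_parameters

def select_model(candidates, n_records):
    """Return the name of the candidate with the lowest criterion value.

    candidates maps a model name to a (rss, n_parameters) pair.
    """
    return min(
        candidates,
        key=lambda name: aic_least_squares(candidates[name][0], n_records, candidates[name][1]),
    )

# Hypothetical candidates: the larger model fits slightly better but is penalised more.
candidates = {"small": (10.0, 5), "large": (9.8, 20)}
best = select_model(candidates, n_records=100)
```

In this illustration the small model is preferred because its slightly larger error is outweighed by the penalty on the 15 extra parameters of the large model.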

Premenstrual syndrome (PMS), a typical stress-related emotional disease affecting 8% of women of child-bearing age, is a collection of emotional symptoms, with or without physical symptoms, related to a woman’s menstrual cycle. Emotional symptoms, referred to as premenstrual dysphoric disorder (PMDD), such as depression and anxiety, must be consistently present to diagnose PMS.15 Although the duration of PMS symptoms is shorter than that of depression due to other etiologies, such as severe depression, post-traumatic stress disorder and anxiety disorders, their influence on the quality of life of a patient in the luteal phase can be as great as or worse than that of other disorders. PMS has attracted much attention from the international medical community, though the exact etiology remains unclear even after more than 40 years of systematic research.16 TCM theory holds that the pathogenesis of PMS is closely related to liver dysfunction, in which liver-Qi invasion syndrome and liver-Qi depression syndrome are the two principal subtypes.17 Therefore, TCM formulae that function to smooth the liver and regulate vital energy are normally used to relieve the symptoms of PMS.18 The Baixiangdan capsule is a novel modernised composite medicine prepared using radix paeoniae alba and cortex moutan radicis extracts, together with Rhizoma cyperi volatile oil, which exhibits a favourable efficacy for the treatment of PMS due to liver-Qi invasion syndrome.19

The electrical stimulation of female Sprague-Dawley (SD) rats produces a series of abnormal behavioural and physiological responses similar to the symptoms of liver-Qi invasion syndrome PMS in humans, and it is often used as an animal model to study the pathogenesis of PMS.20 The feasibility of establishing a model of liver-Qi invasion syndrome PMS in SD rats using electrical stimulation has been proven.21 Previously, a serum metabolomics approach based on UPLC/QTOF-MS was developed to evaluate the therapeutic effects of the Baixiangdan capsule on liver-Qi invasion syndrome PMS in rats. The therapeutic mechanism of the Baixiangdan capsule is related to the regulation of metabolism by corticosteroids (e.g., tetrahydrodeoxycorticosterone, 5α-tetrahydrocortisol, epinephrine), oestrogen (e.g., pregnanediol, estrone) and excitatory/inhibitory amino acid neurotransmitters (e.g., lysine, 5-hydroxylysine, acetylcysteine).22 In the present study, we applied a urinary metabolomics method to investigate the time-related biochemical abnormalities in liver-Qi invasion syndrome PMS due to electrical stimulation for 5 days and assessed the therapeutic effects of the Baixiangdan capsule. Artificial intelligence technology (artificial neural networks and neurofuzzy logic) was used to identify the metabolic pathways and the potential biomarkers related to PMS to achieve the most comprehensive metabolome coverage and to provide a more in-depth understanding of the pathophysiological processes of PMS.

2. Materials and methods

Chemicals and reagents

HPLC-grade acetonitrile was purchased from J. T. Baker (Phillipsburg, NJ, USA). The following compounds were obtained from Sigma-Aldrich (St. Louis, MO, USA): 2-aminoadipic acid, hippuric acid, spermine, 4-hydroxyglutamate, L-phenylalanine, melatonin, L-methionine, proline, genistein and leucine-enkephalin. Ultrapure water (18.2 MΩ) was prepared using a Milli-Q water purification system (Millipore, France). All of the other chemicals that were used were of analytical grade.

The Baixiangdan capsule was a TCM prescription prepared using radix paeoniae alba extract, cortex moutan radicis extract and Rhizoma cyperi volatile oil, which was provided by Shandong Traditional Chinese Medicine University. The preparation process was strictly carried out according to the fixed processing parameters. The Baixiangdan capsules used in this study were placed under a careful quality control to ensure their identity throughout all of the experiments. Three representative components (paeoniflorin, paeonol and α-cyperone) were used as quality indicators during the HPLC evaluation.23

Animal handling and sample collection

Healthy and non-pregnant female Sprague-Dawley rats (190–200 g in weight) were supplied by the Experimental Animal Center of Shandong Traditional Chinese Medicine University (serial number SCXK (Lu) 20050015 on the certificate of conformity, Jinan, China). All of the animals were maintained in an environmentally controlled room under a controlled temperature (22–25 °C) and relative humidity (50 ± 5%) on a 12 h light/dark cycle (lights on from 08:00 to 20:00). The experiments were conducted in a specific pathogen free (SPF) grade laboratory according to the guidelines provided in the Guiding Principles for the Care and Use of Laboratory Animals approved by the Committee for Animal Experiments at Shandong Traditional Chinese Medicine University (Jinan, China). The animals were acclimated for 1 week before use. A standard diet and water were provided to the rats ad libitum.

A total of 30 animals in diestrus and metestrus were selected using vaginal smears together with the behavioural assessment described previously.20 The animals were randomly divided into the following 3 groups with 10 rats in each group: (1) control group (CG), (2) stress group (SG), (3) Baixiangdan capsule-dosed group (BCDG). Each rat was kept in an individual tailor-made cage. The PMS rat model was produced using electrically induced stimuli with a digital pulse stimulator.21 The SG and BCDG rats were treated with the electrically induced stimuli (0.5 mA pulses at a voltage of 2700–3300 V and a pulse width of 0.3 s) continuously for 5 days. The stimulus was applied twice during the day, for 5 minutes each time, and three times in the evening, for 10 minutes each time. The BCDG rats were administered a water solution of the Baixiangdan capsule at a dose of 10 mL kg−1 w−1 d−1 (1 mL of the solution is equivalent to 1 g of the crude herbs) via an intra-gastric gavage once a day, amounting to eight times the clinical dosage. Meanwhile, the CG and SG rats were administered the same volume of water via oral gavage. The 24 h urine samples were collected over the 5 day electric-stimuli period. The urine samples at the starting point (without electric stimuli, day 0) were collected 24 h prior to the start of the experiment. The collected urine samples were stored at −80 °C until the sample preparation was carried out. Because of the individual differences between the rats, not all of the rats urinated regularly every 24 h. As a result, only 140 urine samples had been collected for the UPLC-QTOF/MS analysis by the end of the experiment.

Sample preparation

Prior to the analysis, the samples were thawed at room temperature. The urine samples were centrifuged at 13 000 rpm for 20 min at 4 °C, then the supernatant was analysed via UPLC-QTOF/MS. Three parallel sample solutions were prepared and analysed for accuracy.

UPLC-QTOF/MS analysis

The chromatographic separation was performed on an ACQUITY UPLC BEH C18 column (2.1 × 100 mm, 1.7 μm, Waters Corp, Milford, MA, USA) using a Waters ACQUITY UPLC™ system equipped with a binary solvent delivery system, an auto-sampler, and a PDA detector. The column was maintained at 30 °C and eluted at a flow rate of 0.4 mL min−1, using a mobile phase of water with 0.2% (by volume) formic acid (A) and acetonitrile (B). The gradient program was optimised as follows: 0–18 min, 0% B to 35% B; 18–20 min, 35% B to 95% B; 20–22 min, 95% B; 22–25 min, 95% B to 0% B; and 25–28 min, equilibration with 0% B. The column eluent was directed to the mass spectrometer without a split.
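The gradient program above can be written as a small lookup that returns the mobile-phase composition at any time in the run. This is only a sketch: the breakpoints are taken from the text, but linear ramps between them are an assumption, and the names `GRADIENT` and `percent_b` are illustrative.

```python
# (time in min, % B) breakpoints taken from the gradient program in the text.
GRADIENT = [(0.0, 0.0), (18.0, 35.0), (20.0, 95.0), (22.0, 95.0), (25.0, 0.0), (28.0, 0.0)]

def percent_b(t_min):
    """Return % B (acetonitrile) at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-28 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

def percent_a(t_min):
    """Return % A (water with 0.2% formic acid), the remainder of the binary mixture."""
    return 100.0 - percent_b(t_min)
```

For example, `percent_b(9.0)` falls halfway along the first ramp, and any time between 25 and 28 min returns the 0% B re-equilibration condition.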

The mass spectrometry was performed on a Waters Q-TOF Premier mass spectrometer (Waters Corp., Manchester, UK) with an electrospray ionization (ESI) source operating in the positive ion mode (“V” mode of operation). The ESI-MS parameters for LC/TOF-MS were: capillary voltage of 3200 V; cone voltage of 35 V; nitrogen was used as the drying gas, and the desolvation gas flow rate was set as 700 L h−1 at a temperature of 350 °C; the cone gas rate was 50 L h−1; source temperature of 110 °C; the scan time was 0.1 s; and the inter-scan delay was 0.02 s. All of the analyses were acquired using an independent reference lock-mass ion via the LockSpray™ interface to ensure accuracy and reproducibility. Leucine-enkephalin was used as the reference compound (m/z 556.2771 for the positive-ion mode) at a concentration of 50 pg μL−1 and a flow rate of 10 μL min−1. The data were collected in the centroid mode from m/z 50 to m/z 1000 using a LockSpray frequency of 10 s, and the data were averaged over 10 scans for the correction.

Data processing

The data were combined into a single matrix by aligning the peaks with the exact mass/retention time pair (EMRT) from each data file along with their associated intensities using MarkerLynx Applications Manager version 4.1 (Waters Corp., Manchester, UK). The parameters included a retention time (tR) range from 0 to 28 min, a mass range from 50 to 1000 Da, and a mass tolerance of 0.02 Da. The minimum intensity was set at 15% of the base peak intensity, the maximum mass per tR was set at 6, and the tR tolerance was set at 0.02 min. An original data list was obtained using a database (peak matrix) containing 421 data records (144 data records were obtained from the stress group, 127 data records were obtained from the Baixiangdan capsule-dosed group and 150 data records were obtained from the control group) and 4000 independent variables (biochemical substances). Prior to the multivariate statistical analysis, the data from each chromatogram were normalised to a constant integrated intensity relative to the number of peaks to partially compensate for the concentration bias of each sample. The normalisation was intended to improve the clustering tightness in the PLS-DA model; a comparison of the area-normalised data model with the non-normalised data model showed that it had little effect on the conclusions of the trajectory analysis (data not shown). The between-subject data X was then Pareto-scaled to facilitate the analysis of the major effects in the data. Upon grouping the information, the processed original data list was then divided into three data sets and exported and processed via PCA and PLS-DA analyses using the software package SIMCA-P version 11.5 (Umetrics AB, Umeå, Sweden).
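The two pre-treatment steps described above can be sketched as follows. The normalisation shown is a plain total-intensity (sum) normalisation, which is an assumption: the exact formula applied by the peak-processing software is not stated in the text. Pareto scaling is the standard definition (mean-centre each variable, then divide by the square root of its standard deviation). Function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

def normalise_total_intensity(row, target=1.0):
    """Scale one sample (a list of peak intensities) so that its intensities sum to target."""
    total = sum(row)
    if total == 0:
        raise ValueError("sample has no signal")
    return [target * value / total for value in row]

def pareto_scale(matrix):
    """Pareto-scale a samples-by-variables matrix given as a list of rows.

    Each column is mean-centred and divided by the square root of its
    sample standard deviation; constant columns are only centred.
    """
    scaled_columns = []
    for column in zip(*matrix):
        centre = mean(column)
        sd = stdev(column)
        divisor = sqrt(sd) if sd > 0 else 1.0
        scaled_columns.append([(value - centre) / divisor for value in column])
    # Transpose back to one row per sample.
    return [list(row) for row in zip(*scaled_columns)]

# Hypothetical 3-sample, 3-variable peak table.
raw = [[1.0, 2.0, 3.0], [2.0, 2.0, 8.0], [3.0, 8.0, 1.0]]
normalised = [normalise_total_intensity(row) for row in raw]
scaled = pareto_scale(normalised)
```

Compared with unit-variance scaling, Pareto scaling shrinks the dominance of intense peaks while keeping more of the original data structure, which is why it is a frequent choice for LC-MS peak tables.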

Two commercial AI software tools representing the two technologies were used in this study: INForm 4.3 for the neural networks and FormRules 3.0 for the neurofuzzy logic. Both software packages were provided by Intelligensys Ltd, UK. The algorithm and data processing methods of these two software programs are as follows, and have also been described previously.13,14,23 For the data preprocessing of the AI analysis, PCA was initially applied to reduce the dimensions of the SG data set containing the 144 data records, from 4000 independent variables to 140, according to the significance of the contribution to the PCA model. FormRules 3.0 implements the ASMOD algorithm to generate the neurofuzzy logic model, which enables the discovery of differential features (potential biomarkers). The reduced variable data set containing only the discovered differential features as independent variables was established during the neurofuzzy logic modelling, and the hidden relationships among these differential features were also discovered. Structure Risk Minimization (SRM) was used to assess the quality of the models in this study. INForm 4.3, which is embedded with a multi-layer perceptron neural network, was applied to validate the robustness of the discovered differential features by comparing the quality of the models based upon the original data set and the reduced variable data set. One of the back-propagation learning algorithms (Standard Incremental, Standard Batch, RPROP, Quickprop or Angle Driven Learning) was selected to obtain the optimal prediction accuracy. The work flow of the data analyses using the AI techniques is shown in Fig. 1.


Fig. 1 The work flow of the data analyses using AI techniques, including reducing the variables using a neurofuzzy logic model and the predictive ability assessment using a neural network model.

3. Results

Establishment of the metabolic fingerprints

To optimize the experimental conditions, a pre-investigation was conducted before the full study. The fingerprints of a small batch of test urinary samples were acquired in both the positive and negative mode. Higher noise and matrix effects were observed in ESI negative mode. The higher baseline in ESI negative mode caused some low-abundance metabolites to be missed and was accompanied by multiple adduct ions. To maximise the number of detectable metabolites and the quality of the acquired data, the full-scan detection was eventually set in ESI positive mode. After a careful optimisation of the flow rate and the column temperature for the chromatography, and of the capillary voltage, flow, and the temperature of the desolvation gas for the mass spectrometry detector, the optimal parameters were fixed as listed in Section 2.4. As a result, a higher flow rate (0.4 mL min−1) was used to achieve a higher analysis efficiency on the UPLC column and to reduce the run time. Meanwhile, the tolerance in the backpressure elevation and the effect on the spray and desolvation were also considered. The flow and temperature of the desolvation gas were set at 700 L h−1 and 350 °C, respectively, to remove any redundant solvent resulting from the high flow rate and to improve the efficiency of the desolvation and ionization. Using the optimised conditions, the representative base peak intensity chromatograms of the rat urine obtained in ESI positive mode for the different groups are shown in Fig. 2. After completing the processing described in Section 2.5, a list of 4000 compounds was exported for each sample, and the standard quality control (QC) samples were pooled (small aliquots of each biological sample to be studied were pooled and thoroughly mixed). Between each analytical unit of 20 analytes, the QC sample was analysed to provide a robust quality assurance for each metabolic feature that was detected. 
The precision and repeatability of the UPLC-MS method were validated via the duplicate analysis of six injections of the same QC sample and six parallel samples prepared using the same preparation protocol, respectively. The relative standard deviations of the retention time and area were less than 5.0%. The resulting data showed that the precision and repeatability of the proposed method were satisfactory for metabolomics analysis.
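The acceptance check described above (relative standard deviation below 5.0% for retention time and peak area across six injections) can be expressed in a few lines. The numbers in the example are hypothetical placeholders, not values from this study.

```python
from statistics import mean, stdev

def relative_standard_deviation(values):
    """Return the RSD in percent: 100 * sample standard deviation / mean."""
    return 100.0 * stdev(values) / mean(values)

def passes_precision(values, limit_percent=5.0):
    """True when the RSD of the replicate measurements is below the limit."""
    return relative_standard_deviation(values) < limit_percent

# Hypothetical retention times (min) of one peak across six QC injections.
qc_retention_times = [5.74, 5.75, 5.75, 5.76, 5.75, 5.74]
ok = passes_precision(qc_retention_times)
```

The same function applies unchanged to peak areas; only the input list differs.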
Fig. 2 The representative base peak intensity chromatograms of the rat urine obtained using the ESI positive mode of the UPLC-QTOF/MS. (A) Normal group; (B) model group; (C) Baixiangdan capsule-dosed group.

Urinary metabolic profiling data processing using PCA and PLS-DA

The PCA and PLS-DA analyses of the data set containing 144 data records obtained from the SG rats on days 0 (prior to the electrically induced stress), 1, 2, 3, 4 and 5 were performed first. The PCA score plot (Fig. S1) shows clear differences between the urine samples collected on days 0, 1, 2, 3, 4, and 5, which visualises the general changes in the holistic metabolic profile of the endogenous metabolites during the electric stimulations.

The supervised pattern recognition (PLS-DA) was more focused on the actual class discriminating variations compared to the unsupervised approach (PCA). Fig. S2(A) shows the score plot of the PLS-DA model using the data set from the SG rat urine samples to discriminate between the different days of induced stress, and it is similar to the PCA result. The parameters of this PLS-DA model were R2X(cum) = 0.427, R2Y(cum) = 0.952 and Q2Y(cum) = 0.912, which means that the model explained 42.7% of the variation in the X matrix (the metabolite intensities) and 95.2% of the variation in the Y matrix (the class membership), with a cross-validated predictive ability of 0.912. After being processed via PLS-DA in SIMCA-P, the mean-centred PLS-DA score plots were generated to trace and compare the dynamic changes in the metabolic events in the rats undergoing electric stimulation for 5 days. In the PLS-DA graph, each spot represents a sample and each assembly of samples indicates a particular metabolic pattern at a different time point. The loci marked by arrows represent the trend of the mean metabolite pattern changes. As shown in Fig. S2(A), the metabolic state of each group on day 1 had deviated from the initial position (day 0, prior to the electrically induced stress), and the greatest difference was observed on day 2, which indicates that in response to the electric stimulation, the metabolism of the endogenous substances and the metabolic profiles of the urine compared to the initial state (day 0) were significantly altered. From day 3 to day 5, the trajectory direction gradually returned to that observed on day 1, indicating the recovery of the disturbed metabolic state. The VIP (variable importance in the projection) value of each variable in the model was ranked according to its contribution to the classification. The VIP list of the retention time–exact mass pairs was obtained from the PLS-DA using SIMCA-P. 
To select the potential biomarkers worthy of preferential study in the next step, these differential metabolites were validated using Student’s t-test. The critical p-value was set to 0.05 for the significantly different variables in this study. Following the criteria listed above, 14 significantly different endogenous metabolites present in the urine of the model rats on the 5th day were selected for further study. The identification of the potential biomarkers was then carried out as follows, and the results are listed in Table 1. The possible elemental compositions of the selected compounds were generated using the software program MassLynx according to the following procedure: the calculated mass, mass deviation (mDa and ppm), double-bond equivalent, formula, and i-fit value (the isotopic pattern of the selected ion) were calculated using the selected m/z ions. A lower i-fit value and smaller mass deviation indicate a more accurate elemental composition. The structural information was obtained by searching freely accessible databases (KEGG (http://www.genome.jp) and HMDB (http://www.hmdb.ca)) using the detected molecular weights and elemental compositions.
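The mass-deviation figures used to rank candidate elemental compositions can be reproduced with a short calculation. The sketch below computes the theoretical m/z of a singly charged cation from its ionic formula and the deviation of a measured value in mDa and ppm. It is an independent illustration, not the vendor software's routine; the function names are hypothetical, and only the elements needed for the compositions in this study are included.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
    "S": 31.97207117,
}
ELECTRON_MASS = 0.00054858

def cation_mz(ion_formula):
    """Theoretical m/z of a singly charged cation, e.g. 'C9H10NO3' for [M+H]+ of C9H9NO3."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", ion_formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    # One electron has been removed to give the positive charge.
    return mass - ELECTRON_MASS

def mass_deviation(measured_mz, ion_formula):
    """Return the deviation of the measured m/z as a (mDa, ppm) pair."""
    calculated = cation_mz(ion_formula)
    delta = measured_mz - calculated
    return delta * 1000.0, delta / calculated * 1e6

# Protonated hippuric acid, C9H10NO3+.
hippuric_mz = cation_mz("C9H10NO3")
```

A candidate formula with a smaller absolute ppm deviation (and a lower i-fit value) is the more plausible assignment, which is the ranking logic described in the paragraph above.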

Table 1 Identification of the significantly different endogenous metabolites in the model rats’ urine
| No. | tR (min) | m/z | Elemental composition | Identification results | Data mining | Model (a, b) |
|---|---|---|---|---|---|---|
| 1 | 5.75 | 162.0759 | C6H12NO4 | 2-Aminoadipic acid (c) | PLS-DA, ANN | +(↑) |
| 2 | 6.42 | 130.0869 | C5H8NO3 | 5-Oxoproline | PLS-DA | +(↑) |
| 3 | 19.19 | 353.2466 | C20H33O5 | 11-epi-Prostaglandin F2α | PLS-DA, ANN | +(↓) |
| 4 | 10.54 | 255.0259 | C7H12O8P | Shikimate-5-phosphate | PLS-DA, ANN | +(↑) |
| 5 | 5.30 | 164.0805 | C5H10NO5 | 4-Hydroxyglutamate (c) | PLS-DA, ANN | +(↑) |
| 6 | 19.33 | 373.2744 | C16H29N4O4S | Biocytin | PLS-DA | +(↓) |
| 7 | 12.47 | 271.0607 | C15H11O5 | Genistein (c) | PLS-DA, ANN | +(↓) |
| 8 | 8.27 | 252.1595 | C10H14N5O3 | Deoxyadenosine | PLS-DA | +(↓) |
| 9 | 19.19 | 371.2589 | C20H35O6 | 6-Keto-prostaglandin F1α | PLS-DA | +(↓) |
| 10 | 4.60 | 180.0619 | C9H10NO3 | Hippuric acid (c) | PLS-DA, ANN | +(↑) |
| 11 | 5.78 | 144.0601 | C6H10NOS | 5-(2-Hydroxyethyl)-4-methylthiazole | PLS-DA | +(↑) |
| 12 | 4.69 | 105.0683 | C3H8N2O2 | 2,3-Diaminopropionic acid | PLS-DA, ANN | +(↓) |
| 13 | 6.30 | 233.1219 | C13H17N2O2 | Melatonin (c) | PLS-DA, ANN | +(↑) |
| 14 | 4.60 | 118.0921 | C5H12NO2 | 5-Amino-valerate | PLS-DA | +(↓) |
| 15 | 16.31 | 203.1827 | C10H27N4 | Spermine (c) | ANN | +(↓) |
| 16 | 1.03 | 166.0808 | C9H12NO2 | L-Phenylalanine (c) | ANN | +(↓) |
| 17 | 5.78 | 116.0722 | C5H10NO2 | Proline (c) | ANN | +(↑) |
| 18 | 8.26 | 206.1565 | C9H20NO4 | Pantothenol | ANN | +(↓) |
| 19 | 1.06 | 150.0911 | C5H12NO2S | L-Methionine (c) | ANN | +(↑) |
| 20 | 11.34 | 285.0753 | C10H12N4O6 | Xanthosine | ANN | +(↓) |

(a) “↑” represents a higher level of metabolites, whereas “↓” represents a lower level of metabolites. All of the data represent the intensity values of the metabolites on day 5. “+” represents a statistically significant difference (p < 0.05).
(b) Compared to the initial state.
(c) Confirmed using authentic standards.


As a result, 14 potential biomarkers were identified based on the accurate elemental compositions and the retention times, and 9 were confirmed using the available reference standards by matching their retention time and accurate mass measurement. Among them, 2-aminoadipic acid (1), 5-oxoproline (2), shikimate-5-phosphate (4), 4-hydroxyglutamate (5), hippuric acid (10), 5-(2-hydroxyethyl)-4-methylthiazole (11) and melatonin (13) were found to have increased in the urine samples from the electric-stressed rats compared to their initial state. Conversely, prostaglandin F3α (3), biocytin (6), genistein (7), deoxyadenosine (8), 6-keto-prostaglandin F1α (9), 2,3-diaminopropionic acid (12) and 5-amino-valerate (14) had decreased.23

Meanwhile, the MS spectra data set of the CG rats during the five testing days was also analysed using PLS-DA. Compared to the pathological variations observed in the SG rats, the trajectory of the CG rats was irregular, as shown in ESI Fig. S3, which suggests that the electric stimuli on the female rats may lead to systemic metabolic variation. To determine the treatment-related metabolic pattern alterations, another PLS-DA model (R2X(cum) = 0.423, R2Y(cum) = 0.973 and Q2Y(cum) = 0.877) was constructed with a data set containing 127 data records obtained from the BCDG rats. As shown in Fig. S2(B), a classification of the different treatment days is clearly achieved, and the trajectory of the metabolic profiles illustrates the temporal metabolic variations in the urine metabolites and exhibits a recovering tendency back to the initial state (day 0) following treatment with the Baixiangdan capsule.

Feature selection and identification of the significant metabolites using AI technology

Due to the complexity and nonlinearity of metabolomics data, AI technologies provide a meaningful method for the discovery of feature information hidden in the data. Neural networks are computational systems capable of mimicking the mechanisms of human learning. They enable the detection of complex relationships between a set of inputs and outputs and estimate the magnitude of the relationships without requiring a mathematical description of how the output is functionally dependent on the input. They are useful for processing unstructured and nonlinear data for the recognition of patterns in high-dimensional data. Neurofuzzy logic is a hybrid AI technology that combines the learning capabilities of neural networks with the generality of fuzzy logic, and it is able to generate knowledge regarding the patterns hidden in data in an interpretable format.13

In this study, the top 140 independent variables were selected from the ranking order generated by the PCA according to the significance of the contribution to the PCA model. A reduced variable data set was then formed, which included the 140 independent variables from the original data set. Further data mining activities were then conducted using this new data set, and the ASMOD algorithm was applied to generate the neurofuzzy logic model. A total of 14 independent variables were discovered to be differential features. Therefore, two data sets that included the same data records but different dimensions (number of independent variables), in which one contained 140 variables and the other contained the 14 selected differential features as the independent variables, have been established. Next, a further investigation using a multi-layer perceptron neural network was carried out to validate the robustness of the discovered reduced variable data set by comparing the quality of the models based on the two established data sets. During the modelling process, the two data sets were both randomly divided into a validation set (28 data records, 20% of the 144 data records were selected using the “Smart Selection” function in INForm 4.3) and a training set (116 data records, the remaining 80% of the 144 data records). Two neural network models were generated using the two selected training set data sets. Then, the predictabilities of these two neural network models were tested against the validation data sets. The validation R2 that was computed using the validation data set was used to evaluate the predictability of the neural network model. As shown in Fig. 3, the validation R2 of the validation data set containing the 140 independent variables is 0.9505 (Fig. 3A), and it is 0.9539 (Fig. 3B) for the 14 differential features (independent variables) discovered after reducing the data set dimensions using neurofuzzy logic. 
The similarity between the validation R2 values indicates that the predictability of the neural network model did not deteriorate as a result of reducing the dimension. The major knowledge of the relationships between the independent variables and the dependent variables (grouping information) still remains in the reduced variable data set. The 14 discovered differential features are sufficient to explain the variability associated with the relationship between the independent and dependent variables (grouping information) and to represent the cause–effect relationships between the variations in the endogenous metabolites and the dynamic pathological processes associated with the stress induced by the electric stimulation. Therefore, the 14 differential feature metabolites discovered via AI analysis were considered to be potential biomarkers related to the development of induced stress. They were identified using the methods described in Section 3.3. As shown in Table 1, eight of the potential biomarkers, including 2-aminoadipic acid (1), prostaglandin F3α (3), shikimate-5-phosphate (4), 4-hydroxyglutamate (5), genistein (7), hippuric acid (10), 2,3-diaminopropionic acid (12) and melatonin (13), were discovered by both the PLS-DA and the AI analysis. The six remaining differential metabolites were only discovered by the AI analysis, of which, spermine (15), L-phenylalanine (16), pantothenol (18) and xanthosine (20) were significantly decreased in the electric-stressed rats, and proline (17) and L-methionine (19) were significantly increased.
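The validation procedure above (hold out 20% of the 144 records, train on the rest, then score the held-out predictions with R2) can be sketched as follows. A seeded random split stands in for the proprietary "Smart Selection" function, whose selection rule is not described in the text, and R2 is taken to be the usual coefficient of determination; both are assumptions, and the function names are illustrative.

```python
import random

def split_records(n_records, validation_fraction=0.2, seed=0):
    """Return (training_indices, validation_indices) from a seeded random shuffle."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    n_validation = int(n_records * validation_fraction)
    return indices[n_validation:], indices[:n_validation]

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total

# With 144 records this reproduces the 116 / 28 partition sizes given in the text.
training, validation = split_records(144)
```

Comparing `r_squared` on the same validation records for the 140-variable and the 14-variable models is the check that the reduction in dimension did not cost predictive ability.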


Fig. 3 The predictions given by the ANN models generated using data sets containing various numbers of independent variables. (A) 140 independent variables, (B) 14 independent variables.

4. Discussion

PMS is a typical stress-related emotional disease that affects 8% of women of child-bearing age. Emotional symptoms, such as anxiety, must be consistently present to diagnose PMS. The electrical stimulation of female SD rats can produce a series of abnormal behavioural and physiological responses that are similar to the emotional symptoms of PMS, including a reduction in exploratory behaviour and plasma hormone level alterations (prolactin, estradiol and progesterone).20 Previously, a serum metabolomics approach based on UPLC/QTOF-MS had been developed to evaluate the therapeutic effects of the Baixiangdan capsule on liver-Qi invasion syndrome PMS in rats.22 The therapeutic mechanism of the Baixiangdan capsule is related to the regulation of the metabolism of corticosteroids, oestrogen and excitatory/inhibitory amino acid neurotransmitters.

The present study developed a urinary metabolomics method based on UPLC-QTOF/MS to investigate the temporal variations in the metabolic profiles of rats that underwent electric stimulation over 5 days. AI techniques integrating neurofuzzy logic and neural networks were applied for the first time to find and understand the correlation of the selected potential biomarkers with the occurrence and development of liver-Qi syndrome PMS induced by electric stimulation. The minimal data set, containing 14 differential features (metabolites) that are sufficient to explain the variability of the endogenous metabolites associated with the dynamic pathological processes induced by electric stimulation, was obtained using neurofuzzy logic modelling. Therefore, the 14 differential feature metabolites were considered to be potential biomarkers for discriminating the different urine metabolic profiles on different days. During the neurofuzzy logic modelling, seven sub-models, implying hidden interactions between the 14 potential biomarkers, were constructed from intelligible rules in an “if then” format that explicitly represent the cause–effect relationships contained in the experimental data. However, these important correlations among variables are usually neglected in commonly used multivariate analysis, in which VIP values, computed as a weighted sum of the PLS loadings, are used to evaluate the contribution of each variable to distinguishing the different metabolic states.
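To make the “if then” rule format concrete, the following Python sketch evaluates a single fuzzy rule over two normalised metabolite intensities. It is only an illustration under stated assumptions: the rule, the thresholds, the piecewise-linear membership functions and the use of `min` for the fuzzy AND are all hypothetical choices, and they do not reproduce the ASMOD algorithm or any rule reported in this study.

```python
def membership_high(x, low=0.25, high=0.75):
    """Degree (0..1) to which x belongs to a hypothetical 'HIGH' fuzzy set."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def membership_low(x, low=0.25, high=0.75):
    """Degree (0..1) to which x belongs to the complementary 'LOW' set."""
    return 1.0 - membership_high(x, low, high)


def rule_strength(intensity_a, intensity_b):
    """Hypothetical rule: IF A is HIGH AND B is LOW THEN state is 'stressed'.

    The fuzzy AND is taken as the minimum of the two membership degrees,
    which is one common choice among several.
    """
    return min(membership_high(intensity_a), membership_low(intensity_b))


# Normalised intensities (0..1) invented for illustration only.
print(rule_strength(0.9, 0.1))  # rule fires fully
print(rule_strength(0.5, 0.5))  # rule fires partially
print(rule_strength(0.1, 0.1))  # rule does not fire
```

Because the rule strength depends jointly on both inputs, a rule of this kind can express an interaction between two metabolites that a per-variable ranking would not capture.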

As shown in Table 1, nine of the fourteen differential endogenous metabolites discovered using the neurofuzzy logic model were found to be monoamine neurotransmitter metabolites, including 2-aminoadipic acid (1), hippuric acid (10), 4-hydroxyglutamate (5), 2,3-diaminopropionic acid (12), spermine (15), L-phenylalanine (16), proline (17), pantothenol (18) and L-methionine (19). These results are consistent with previous reports regarding the pathogenesis of PMS, which is related to amino acid metabolism and neural signal transmission.

Glutamate is the most abundant fast excitatory neurotransmitter in the mammalian nervous system, which is responsible for mediating a broad range of nervous system functions via glutamate receptors. It may be involved in the metabolism of proteins and glucose in the brain, as well as promoting oxidation and improving the function of the central nervous system.24 A previous study has explored the relationship between the pathogenesis of PMS and glutamate by determining the concentration of glutamate in serum and in different regions of the brain (including the hypothalamus, limbic lobe, frontal cortex and hippocampus) using a pre-column derivatization HPLC method. The glutamate levels in serum, the hypothalamus and limbic lobe were decreased significantly in the PMS model rats when compared with normal ones, while those in the frontal cortex and hippocampus were found to have increased after model establishment.25

2-Aminoadipic acid (1) is a primary metabolite in the lysine metabolic pathway, which antagonizes neuro-excitatory activity modulated by the glutamate receptor, N-methyl-D-aspartate (NMDA). 2-Aminoadipic acid has also been shown to inhibit the production of kynurenic acid, a broad-spectrum excitatory amino acid receptor antagonist.26 Disorders of 2-aminoadipic acid metabolism have been associated with varying neurological symptoms.27 Meanwhile, the metabolism of lysine also relies on regulation by glutamate receptors, implying that the level of 2-aminoadipic acid in urine should be related to that of glutamate.

4-Hydroxyglutamate (5) is an intermediate in the metabolism of gamma-hydroxyglutamic acid. Specifically, 4-hydroxyglutamate combines with 2-oxoglutarate to produce 4-hydroxy-2-oxoglutarate and glutamate.28 Therefore, the level of 4-hydroxyglutamate should also be closely related to that of glutamate. Hippuric acid (10) is an acyl glycine formed by the conjugation of benzoic acid with glycine through the action of glycine N-acyltransferase. Glycine can also react with α-ketoglutaric acid to produce glyoxylic acid and glutamic acid. The up-regulation of these metabolites, including 2-aminoadipic acid (1), 4-hydroxyglutamate (5) and hippuric acid (10), in the urine represents an increase in the excitatory amino acid glutamate.29

In addition, shikimate-5-phosphate (4), which is the precursor of chorismic acid and tryptophan, was also discovered by the two data mining approaches. Shikimate-5-phosphate has been reported to participate in the metabolism of phenylalanine. Both L-phenylalanine (16) and tryptophan are required for the biosynthesis of monoamine neurotransmitters and play important roles in the pathogenesis of emotional disorders. Decreased phenylalanine levels were detected in the urine of the electric-stressed rats, which is in agreement with other reports.13 However, L-phenylalanine was only discovered via the AI analysis, in addition to spermine (15), proline (17), pantothenol (18), L-methionine (19) and xanthosine (20). Proline is also a derivative of glutamate, which generates hydroxyproline and then decomposes into 4-hydroxyglutamate, an excitatory amino acid neurotransmitter.

Melatonin (13), which is involved in the metabolic pathway of 5-HT, was also found using the neurofuzzy logic model. It has been suggested that PMS is related to a systemic imbalance of the neurotransmitter 5-hydroxytryptamine (5-HT).30–32 The emergence of symptoms such as emotional instability, irritability and anxiety is related to a decrease in 5-HT levels.33 The increased melatonin in the urine of the stressed rat model indicates the down-regulation of 5-HT. Melatonin exhibits extensive physiological activities, and its daily and seasonal rhythms are considered to be closely related to the functional regulation of immunity and the neuroendocrine and reproductive systems. The biosynthesis of melatonin is also rhythmic, as are the melatonin precursors and the related synthesis enzymes, e.g., N-acetyltransferase, HIOMT, 5-HT.34 The rhythms of N-acetyltransferase and HIOMT exhibit the same tendencies as that of melatonin, while that of 5-HT is the opposite.35 Therefore, a rise in melatonin may contribute to the premenstrual moods of dysphoria and irritability. Other biomarkers, such as 11-epi-prostaglandin F2α (3) and 6-keto-prostaglandin F1α (9), have been found to be involved in signal transduction and the regulation of physiological activities, such as the synthesis of lipoproteins and carbohydrates, which are related to the development of stress/emotion-related diseases.

Genistein (7) was also identified as a differential component, and it has been reported to be related to energy metabolism. Genistein is also a primary component of rat feed, which is made from soybeans. The stress experiment may inevitably cause a loss of appetite; therefore, the reduction of genistein in the urine between the experimental and initial states could be due to various factors.

The predictive abilities of these 14 potential biomarkers were then evaluated using a neural network model. When using neural network algorithms, intelligible rules are generated on the basis of “unseen” data to provide accurate predictions. This is different from the cross validation approach that is commonly applied in multivariate analyses and neurofuzzy logic, where the test data are used. Therefore, the evaluated results obtained using neural network modelling are more credible. The 14 potential biomarkers discovered using the AI analysis are closely related to the occurrence and development of liver-Qi syndrome PMS, indicating that the AI analysis appeared to be more effective than the PLS-DA analysis for the data mining.

Upon analysing the dynamic trajectories of the holistic metabolic profiles over the 5 days of electric stimulation in the PLS-DA score plot, the greatest difference was observed on day 2, which indicates that, in response to the electric stimulation, the metabolism of the endogenous substances and the metabolic profiles in the urine were significantly altered compared to the initial state. From day 3 to day 5, the trajectory gradually moved back towards that of day 1, meaning that the experimental animals adapted to the electric stimulation and the stress state was relieved; the same intensity of stimulation could no longer cause a similar response. Therefore, apart from the fact that urine metabolites are the final products of all physiological and pathological processes, the adaptation to the stress state and the potential biomarkers discovered in this study differed significantly from those discovered in previous serum metabolomics studies.

The Baixiangdan capsule is a new TCM prescription that has been used for the treatment of PMS.36,37 Different urine metabolite patterns were observed via PLS-DA, indicating the potential efficacy of the Baixiangdan capsule in the electric-stress rat model. As shown in Fig. 4, the average intensities of 2-aminoadipic acid, 4-hydroxyglutamate and melatonin in the urine of the different groups (CG, SG and BCDG) were compared. During the first four days, the levels of the three metabolites in the SG rats gradually rose compared with those observed in the CG rats. However, in the treatment group (BCDG), the levels decreased significantly compared to the SG rats. The results suggest that several relevant mechanisms, such as suppressing glutamine metabolism, inhibiting the activity of glutamate, and inhibiting the increase of neurotransmitters, may be involved in the action of the Baixiangdan capsule on PMS. This could help to regulate the nervous excitatory state of patients suffering from PMS, thereby relieving typical psychological symptoms such as emotional instability, irritability and anxiety.


Fig. 4 A comparison of the major differential metabolites in the urine of the model group (MG), the normal group (NG) and the Baixiangdan capsule-dosed group (BADG). (A) 2-Aminoadipic acid, (B) 4-hydroxyglutamate, (C) melatonin. #p < 0.05 vs. the NG rats, *p < 0.05 vs. the MG rats (Student’s t-test).

In the present study, AI analysis was applied for the first time to the discovery of potential biomarkers related to the dynamic pathological processes of liver-Qi invasion syndrome PMS in an induced-stress rat model. The potential biomarkers identified are supported by the biochemical interpretations reported in the corresponding literature. However, several questions remain regarding the roles of these biomarkers in specific metabolic pathways and the exact pathogenesis of PMS. These include the validation of the discovered biomarkers using biological experimental evidence; the correlation of the potential biomarkers between serum/plasma and urine; and the precise quantitative determination of the potential biomarkers to give rational thresholds for disease diagnosis and efficacy evaluation. These problems will be addressed in subsequent investigations.

5. Conclusions

A urinary metabolomics method based on ultra-performance liquid chromatography coupled with quadrupole/time-of-flight mass spectrometry (UPLC-QTOF/MS) was employed to investigate the pathogenesis and therapeutic effect of the Baixiangdan capsule on electric-induced stress in rats over five days. Artificial intelligence technology (artificial neural networks and neurofuzzy logic) was used for the first time for the discovery of differential metabolites in the data mining of this metabolomics study. The ANN model exhibited a desirable fit and predictive ability, and the metabolic signatures discovered using neurofuzzy logic were helpful for understanding the hidden cause–effect relationships in the experimental data. The potential mechanism of the electric stress was elucidated, and excitatory amino acid neurotransmitters related to the typical psychological symptoms of PMS, including anxiety and irritability, were found to be potential biomarkers for the diagnosis and therapeutic evaluation of PMS. This research demonstrates that artificial intelligence technologies are powerful and promising tools for modelling complex metabolomic data and discovering hidden knowledge regarding the potential biomarkers associated with the development of diseases, and that they are also suitable for other complex biological data sets.

Ethical conduct of research

The authors state that they have obtained appropriate institutional review board approval or have followed the principles outlined in the declaration of Helsinki for all human or animal experimental investigations.

Acknowledgements

This work was sponsored by the International Cooperation Projects of the Ministry of Science and Technology (MOST) in China (No. 2010DFA32420) and the National Natural Science Foundation of China (No. 81130066).

References

  1. E. J. Want, I. D. Wilson, H. Gika, G. Theodoridis, R. S. Plumb, J. Shockcor, E. Holmes and J. K. Nicholson, Nat. Protoc., 2010, 5, 1005–1018.
  2. J. K. Nicholson, Mol. Syst. Biol., 2006, 2, 52.
  3. J. C. Lindon, J. K. Nicholson, E. Holmes, H. Antti, M. E. Bollard, H. Keun, O. Beckonert, T. M. Ebbels, M. D. Reily and D. Robertson, Toxicol. Appl. Pharmacol., 2003, 187, 137–146.
  4. X. P. Liang, X. Chen, Q. L. Liang, H. Y. Zhang, P. Hu, Y. M. Wang and G. A. Luo, J. Proteome Res., 2011, 5, 790–799.
  5. P. Wang, H. Sun, H. Lv, W. Sun, Y. Yuan, Y. Han, D. Wang, A. Zhang and X. J. Wang, J. Pharm. Biomed. Anal., 2010, 53, 631–645.
  6. B. Sun, L. Li, S. M. Wu, Q. Zhang, H. J. Li, H. B. Chen, F. M. Li, F. T. Dong and X. Z. Yan, Anal. Biochem., 2009, 395, 125–133.
  7. H. M. Jia, Y. F. Feng, Y. T. Liu, X. Chang, L. Chen, H. W. Zhang, G. Ding and Z. M. Zou, PLoS One, 2013, 8, e63624.
  8. L. D. Han, J. F. Xia, Q. L. Liang, Y. Wang, Y. M. Wang, P. Hu, P. Li and G. A. Luo, Anal. Chim. Acta, 2011, 689, 85–91.
  9. Z. T. Jiang, J. B. Sun, Q. L. Liang, Y. F. Cai, S. S. Li, Y. Huang, Y. M. Wang and G. A. Luo, Talanta, 2011, 84, 298–304.
  10. X. Li, X. Lu, J. Tian and G. W. Xu, Anal. Chem., 2009, 81, 4468–4475.
  11. J. Chen, Y. Zhang, X. Y. Zhang, R. Cao, S. L. Chen, Q. Huang, X. Lu, X. P. Wang, X. H. Wu, C. J. Xu, G. W. Xu and X. H. Lin, Metabolomics, 2011, 7, 614–622.
  12. J. S. R. Jang, C. T. Sun and E. Mizutani, Neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence, Prentice-Hall International (UK) Limited, London, 1997.
  13. Q. Shao, R. C. Rowe and P. York, Eur. J. Pharm. Sci., 2006, 28, 394–404.
  14. Q. Shao, R. C. Rowe and P. York, Eur. J. Pharm. Sci., 2007, 31, 129–136.
  15. P. I. Sundstrom, S. Smith and M. Gulinello, Archives of Women’s Mental Health, 2003, 6, 23–41.
  16. M. Steiner, Lancet, 2000, 356, 1126–1127.
  17. H. Gao, Research progress on the correlation between anger pathogenesis and Premenstrual syndrome, J. Chin. Med. Assoc., 2009, 11, 283–285.
  18. Y. Zhao, H. Xu, L. M. Lei and H. L. Zhu, Internet J. Lab. Med., 2013, 34, 2627–2628.
  19. S. G. Sun, Y. Lu, A. H. Wang, X. L. Sui, M. Huang, J. Liu, J. Zhu, Z. F. Li, H. Y. Zhang and M. Q. Qiao, China Pharm., 2011, 22, 209–211.
  20. H. Y. Zhang, M. Q. Qiao, W. C. Zhu and J. Wang, Chin. Tradit. Pat. Med., 2002, 24, 118–119.
  21. L. Li, P. Sun, Q. L. Liang, H. Y. Zhang, Y. Wang, M. Q. Qiao and G. A. Luo, Chin. Tradit. Pat. Med., 2011, 33, 762–767.
  22. Y. M. Li, B. Zhang and X. H. Fan, Chin. Tradit. Pat. Med., 2009, 31, 1690–1694.
  23. G. A. Luo, Y. M. Wang, Q. L. Liang and Q. F. Liu, Systems Biology for Traditional Chinese Medicine, Science Press, Beijing, China, 2010.
  24. X. Zhou and X. Q. Wang, Chin. J. Neurosci., 2003, 19, 130–133.
  25. L. Sun, Analysis of progesterone and amino acid in serum and different brain regions of rats with premenstrual syndrome liver-qi invasion, Master's thesis, Shandong University of Traditional Chinese Medicine, 2008.
  26. H. Q. Wu, U. Ungerstedt and R. Schwarcz, Eur. J. Pharmacol., 1995, 25, 55–61.
  27. K. Danhauser, S. W. Sauer, T. B. Haack, T. Wieland, C. Staufner, E. Graf, J. Zschocke, T. M. Strom, T. Traub, J. G. Okun, T. Meitinger, G. F. Hoffmann, H. Prokisch and S. Kölker, Am. J. Hum. Genet., 2012, 91, 1082–1087.
  28. A. Goldstone and E. Adams, J. Biol. Chem., 1962, 237, 3476–3485.
  29. X. Y. Wang, C. Y. Zeng, J. C. Lin, T. L. Chen, T. Zhao, Z. Y. Jia, X. Xie, Y. P. Qiu, M. M. Su, T. Jiang, M. W. Zhou, A. H. Zhao and W. Jia, J. Proteome Res., 2012, 11, 6223–6230.
  30. X. Wang, T. Zhao, Y. P. Qiu, T. Jiang, M. W. Zhou, A. H. Zhao and W. Jia, J. Proteome Res., 2009, 8, 2511–2518.
  31. S. Ramcharan, E. J. Love, G. H. Fick and A. Goldfien, J. Clin. Epidemiol., 1992, 45, 377–392.
  32. J. E. Borenstein, B. B. Dean, J. Endicott, J. Wong, C. Brown, V. Dickerson and K. A. Yonkers, J. Reprod. Med., 2003, 48, 515–524.
  33. A. Rapkin, Psychoneuroendocrinology, 2003, 28, 39–53.
  34. J. Axelrod and H. Weissbach, Science, 1960, 131, 1312–1313.
  35. M. Abe and T. Masanori, Exp. Eye Res., 1999, 68, 255–262.
  36. X. H. Liu, H. Y. Zhang and Q. T. Zhao, J. Tradit. Chin. Med., 2007, 48, 834–836.
  37. Y. G. Hu and L. Xue, Pharmacol. Clin. Chin. Mater. Med., 2007, 23, 151–153.

Footnotes

Electronic supplementary information (ESI) available. See DOI: 10.1039/c5ra10992b
These authors contributed equally to this work.

This journal is © The Royal Society of Chemistry 2015