Rapid and non-invasive diagnosis of coronary artery disease via clinical laboratory parameters and ¹H-NMR spectra of human blood plasma

Mohammad Shahbazy; Ali Zahraei; Jamshid Vafaeimanesh; Mohsen Kompany-Zareh

doi:10.1039/C5RA17262D

DOI: 10.1039/C5RA17262D (Paper) RSC Adv., 2015, 5, 104054-104061

Mohammad Shahbazy^a, Ali Zahraei^b, Jamshid Vafaeimanesh*^b and Mohsen Kompany-Zareh*^a
^aDepartment of Chemistry, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan 45137-66731, Iran. E-mail: kompanym@iasbs.ac.ir; Tel: +98 24 3315 3123
^bClinical Research Development Center, Qom University of Medical Sciences, Qom, Iran. E-mail: j.vafaeemanesh@muq.ac.ir; Tel: +98 25 3612 2949

Received 26th August 2015, Accepted 13th November 2015

First published on 17th November 2015

Abstract

Coronary artery disease (CAD), one of the most common fatal diseases in the world, was examined in the present study via the investigation of the ¹H-NMR spectra of human blood plasma and clinical laboratory parameters, with the aim of early disease diagnosis. Partial least squares-discriminant analysis (PLS-DA), a common supervised pattern recognition method, assisted by a genetic algorithm (GA) based feature selection procedure, was used to classify CAD⁻ and CAD⁺ individuals based on spectral patterns and clinical parameters. Meanwhile, unsupervised pattern recognition methods (i.e., hierarchical cluster analysis (HCA) and principal component analysis (PCA)) were implemented to precisely visualize and examine the spectroscopic and clinical datasets. GA and ANOVA techniques were employed to select the discriminant and most effective clinical parameters for recognizing CAD⁻ and CAD⁺ samples. Finally, the calculated classification models were successfully able to distinguish between CAD⁻ and CAD⁺ individuals using ¹H-NMR spectra and clinical laboratory parameters as a safe, economic, simple and also non-invasive method in comparison with coronary angiography for CAD diagnosis.

1. Introduction

Coronary artery disease (CAD) is one of the leading causes of mortality and morbidity in the world.¹ Despite a slight decrease in prevalence over the past decade, it still contributes to nearly 15% of deaths.² The progression of atherosclerotic plaques in the coronary arteries leads to coronary artery disease.³ The disease progresses silently and slowly during childhood and youth, while clinical manifestations appear in middle age.⁴

Atherosclerosis is an inflammatory disease affecting all arteries that may lead to potentially fatal ischemia in the heart and brain.⁵ Risk factors associated with CAD, which are strongly related to poor lifestyle and influenced by stress, include hypertension, smoking, diabetes mellitus, obesity, physical inactivity and dyslipidemia.¹ These coronary risk factors lead to endothelial injury, plaque formation and the promotion of arterial thrombus deposition by different mechanisms.⁶

Among the mentioned risk factors for CAD, dyslipidemia is considered to be the most common. To initiate appropriate treatment, minimize morbidity and mortality, and optimize the cost-effectiveness of such treatments, it is essential to identify patients at risk of CAD and to treat atherosclerotic lesions early.¹ Atherosclerosis is a complex process that is considered to have an inflammatory background; consequently, the associations between various inflammatory markers and the occurrence, severity and clinical phenomena related to CAD have been studied.⁷ The interactions between genetic and environmental factors induce the arterial wall to respond to stimuli through actions in endothelial cells, smooth muscle, inflammatory cells and platelets, which leads to plaque formation.⁸ There is much evidence that inflammation plays a key role in the pathogenesis of stable CAD and acute coronary syndromes.⁹ The most frequently studied parameters have been leukocyte count, C-reactive protein (CRP), fibrinogen, and uric acid.^10,11 Furthermore, Danesh et al. found a significant association of fibrinogen, CRP, albumin and leukocyte count with CAD.¹²

Implementing chromatographic and spectroscopic based high-tech analytical methods (e.g., mass spectrometry (MS),^13–15 fluorescence spectroscopy,^16–19 gas/liquid chromatography coupled to MS (GC/LC-MS),^20–27 comprehensive two-dimensional GC (GC × GC),²⁸ combined GC × GC with time-of-flight MS (GC × GC-TOF-MS),^29,30 proton nuclear magnetic resonance (¹H-NMR),^14,31,32 and LC-NMR³³) can be fruitful for metabolomics and proteomics studies in clinical and biological systems.

Up to 1000 metabolites can be recognized and evaluated through metabolic profiling, and the affected pathways can be assessed from variations in metabolite concentrations. Consequently, metabolomics/proteomics studies using instrumental analysis techniques, as non-invasive and rapid tools, might be advantageous for discriminating and recognizing variations of metabolites/proteins in diverse biofluids such as blood plasma, urine and serum, with the aim of detecting an external malignant factor affecting a particular disease. Such studies can be useful for identifying discriminant metabolites for biomarker discovery and for the early diagnosis of various diseases.^26,31,32,34–36

Chemometrics, a well-established analytical approach, has been increasingly utilized to associate instrumental analysis techniques with metabolomics/proteomics research.^31,37,38 This paper is concerned with the prediction of CAD clinical status and the diagnosis of this potentially fatal disease through the analysis of selected clinical laboratory parameters and ¹H-NMR spectra acquired from human blood plasma samples, using pattern recognition based chemometric methods. Partial least squares-discriminant analysis (PLS-DA)^39,40 is a supervised pattern recognition technique that correlates variation in the dataset with class membership, which in turn can provide an additional confidence measure for any resultant clustering. Principal component analysis (PCA) and hierarchical cluster analysis (HCA), two common unsupervised pattern recognition methods, were used to visualize the relationship between CAD⁻ and CAD⁺ individuals through cluster analysis of the clinical parameters and ¹H-NMR spectra datasets.

In the present study, analysis of variance (ANOVA) and genetic algorithm (GA) based feature selection approaches were used to select the most informative clinical parameters for predicting the CAD⁺ (patient) or CAD⁻ (healthy) clinical status of the considered individuals.

2. Materials and methods

2.1. Experimental

2.1.1. Clinical parameters. All individuals signed informed consent prior to the study. The study was planned according to the ethical guidelines of the Declaration of Helsinki at the Clinical Research Development Centre of Qom University of Medical Sciences (Qom, Iran) and Shahid Beheshti Hospital (Qom, Iran). The institutional review committee approved the study protocol in accordance with local biomedical-research regulations. In this study, 64 patients with suspected CAD who were referred to the cardiovascular clinic of Shahid Beheshti Hospital of Qom (Iran) for coronary angiography were studied. None of the patients had acute coronary infarction, severe coronary artery disease or acute coronary syndrome. The patients were enrolled in the study from January 2013 to June 2013 and all of them underwent physical examinations.

During evaluation of the data it was clear that critical parameters were age, gender, information about the patient’s history including hypertension (indicated by a systolic blood pressure of ≥140 mmHg, a diastolic blood pressure of ≥90 mmHg and anti-hypertensive medication), smoking (patients who had stopped smoking for 10 years or less were classified as smokers) and biochemical parameters (i.e., hemoglobin, leucocytes, thrombocytes, C-reactive protein (CRP) and erythrocyte sedimentation rate (ESR)).

All the blood samples were taken after overnight fasting. All the measurements were performed between 8:00 and 11:00 AM in a temperature-controlled room with the subjects in a resting supine state. The subjects abstained from alcohol, caffeine, tobacco and food for 12 hours prior to the study. Long-acting vasoactive medications including calcium channel blockers, beta-adrenergic blocking agents, nitrates and converting enzyme inhibitors were discontinued for 12 hours prior to the study.

The erythrocyte sedimentation rate was measured over a period of 1 hour and the normal value was considered to be 10 mm in the first hour. The number of leukocytes was also determined in all the patients, with the normal range considered to be 4000–10 000 cells per μL. The CRP level was determined in all the patients with a normal reference value of 6.0 mg L⁻¹.

2.1.2. ¹H-NMR spectra. Patients in the CAD⁺ group had a major coronary artery problem, i.e. a reduction of more than 50% in the intraluminal diameter of each one of three major coronary arteries. Patients in the normal group had normal coronary arteries as assessed by angiography. Blood plasma was separated from the erythrocytes by centrifugation at 4 °C, and the samples were snap frozen and preserved at −20 °C. The samples were thawed immediately before use. Sample aliquots were mixed with a saline solution with 10% D₂O as the solvent so as to provide diluted samples including a deuterated lock solvent for NMR spectroscopy. 300 μL of saline solution was added to aliquots of 400 μL and then filled up to 1 mL with D₂O. The resulting samples were centrifuged at 12 000 g for 5 min at 4 °C. Following this, 550 μL of the centrifuged sample was transferred to the NMR tube. The ¹H-NMR spectra of the blood samples were measured at 400.22 MHz (using a Bruker-Avance 400 MHz spectrometer). Serum and blood plasma contain both low and high molecular weight components, which give rise to a wide range of signal line widths. Broad bands from protein and lipoprotein peaks contribute substantially to the ¹H-NMR spectra, with sharp signals from small molecules superimposed on them. Macromolecules produce broad resonances as a result of limited rotational diffusion and short T₂ relaxation times, which causes difficulties in spectral interpretation. In addition, a large proportion of the interfering NMR signals in all biofluids arises from water. Both can be readily suppressed using a conventional one-dimensional Carr–Purcell–Meiboom–Gill (CPMG) relaxation editing sequence with pre-saturation.⁴¹

This sequence is favoured mainly because it gives high-quality water suppression with little calibration and consistent spectra. In this study, since the broad protein peaks were not removed by filtration before the NMR measurements, the CPMG pulse sequence was used to suppress them.³¹ Fig. 1 shows a ¹H-NMR spectrum that was obtained from one of the control samples.


	Fig. 1 A ¹H-NMR spectrum for one of the control samples. The inset shows details of the fine peaks corresponding to some of the metabolites.

2.2. Coronary angiography

Coronary angiography was performed through the femoral artery. It was carried out by left-heart catheterization and arteriography using the Judkins method.⁴² Two experienced cardiologists who were blinded to the patients enrolled in the study reviewed all of the angiograms. If these two cardiologists disagreed, the angiographic film was reviewed by a third cardiologist. Based on the angiographic results, the patients were divided into two groups: with and without coronary artery disease.

2.3. Instrumentation and software

All the ¹H-NMR spectra were collected using a Bruker 400 MHz NMR spectrometer at 300 K via the CPMG-presat sequence. Moreover, all calculations were carried out using routines in MATLAB® R2009a ver. 7.8.0 software (MathWorks© Inc., Natick, MA, USA). PLS Toolbox ver. 4.1.1 (Eigenvector research, Inc., Manson, Washington, USA) was implemented as a standard chemometric toolbox for applying some statistical methods.

3. Results and discussion

3.1. ¹H NMR spectra

Some regions in the ¹H-NMR spectra (i.e., δ < 0.5, δ > 9.0 and 4.5 < δ < 5.1) were excluded to compensate for variations in the water signal and to remove non-informative signals. The NMR spectra dataset was arranged as a data matrix with rows representing the samples and columns corresponding to the chemical shifts (ppm) as the measured variables (64 × 13 747).
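The exclusion step described above can be sketched as a simple mask over the chemical-shift axis. This is a minimal illustration only: it assumes the water window lies between 4.5 and 5.1 ppm, and the function name `keep_shift` and the coarse toy axis are hypothetical, not taken from the original work.

```python
def keep_shift(delta_ppm):
    """Return True if a chemical shift (ppm) lies outside the excluded regions."""
    if delta_ppm < 0.5 or delta_ppm > 9.0:
        return False          # non-informative ends of the spectrum
    if 4.5 < delta_ppm < 5.1:
        return False          # residual water region (assumed window)
    return True

# Hypothetical axis from 10.0 down to -0.5 ppm in 0.5 ppm steps (illustration only)
axis = [10.0 - 0.5 * i for i in range(22)]
spectrum = [1.0] * len(axis)            # toy intensities

# Keep only the (shift, intensity) pairs that survive the mask
kept = [(d, y) for d, y in zip(axis, spectrum) if keep_shift(d)]
kept_axis = [d for d, _ in kept]
```

Applying the same mask to every spectrum keeps the columns aligned across samples, which is what allows the spectra to be stacked into a single samples × variables matrix.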

3.1.1. Preprocessing of data. To select the discriminant chemical shifts (features) of each class and reduce the dimensionality of the NMR spectra dataset, t-test statistics were used to identify the most informative subset of features by evaluating their p-values. Subsequently, through a binning procedure, the chemical shift range in the NMR spectra was divided into 458 bins (giving a 64 × 458 matrix). Fig. 2 illustrates the mean values of the NMR spectra in various binning regions for each class. Furthermore, to diminish the effect of noise, the NMR spectra were auto-scaled column-wise.
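The binning and column-wise auto-scaling steps can be illustrated as follows. This is a sketch under stated assumptions: the source does not say how the bins were formed, so equal-width bins of summed consecutive points are assumed here, and auto-scaling is taken in its usual sense (mean-centring each column and dividing by its standard deviation). The function names and toy numbers are hypothetical.

```python
from statistics import mean, stdev

def bin_spectrum(intensities, n_bins):
    """Sum consecutive points into n_bins roughly equal-width bins (assumed scheme)."""
    n = len(intensities)
    edges = [round(k * n / n_bins) for k in range(n_bins + 1)]
    return [sum(intensities[edges[k]:edges[k + 1]]) for k in range(n_bins)]

def autoscale_columns(matrix):
    """Mean-centre each column and divide by its sample standard deviation."""
    scaled_cols = []
    for col in zip(*matrix):
        m, s = mean(col), stdev(col)
        # A constant column carries no information; map it to zeros
        scaled_cols.append([(v - m) / s if s > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]

# Toy data: 3 samples x 6 points, binned to 3 bins (the study used 64 x 458)
raw = [[1, 2, 3, 4, 5, 6],
       [2, 2, 4, 4, 6, 9],
       [3, 2, 5, 4, 7, 9]]
binned = [bin_spectrum(row, 3) for row in raw]
scaled = autoscale_columns(binned)
```

After auto-scaling every bin has zero mean and unit variance, so intense lipoprotein regions and weak metabolite regions enter the subsequent factor analysis on the same footing.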


	Fig. 2 The mean of the NMR spectra for both classes (i.e., CAD⁻ (healthy) and CAD⁺) after excluding non-informative/water signals and the binning procedure.

For classification modelling, 75% of the samples (48 samples) were randomly selected as the training set and the remaining 16 samples were used as the external test set. Model performance was evaluated at each step of the modelling procedure (i.e., training, Venetian blind cross validation and prediction of the external test set samples).
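The split and cross-validation scheme can be sketched as below. This is an illustration rather than the procedure actually run in the PLS Toolbox: the random seed and the number of Venetian blind splits (six here) are assumptions, since the source does not report them, and both function names are hypothetical.

```python
import random

def split_train_test(n_samples, train_fraction, seed):
    """Randomly assign sample indices to a training set and an external test set."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = round(train_fraction * n_samples)
    return sorted(idx[:n_train]), sorted(idx[n_train:])

def venetian_blinds(n_train, n_splits):
    """Fold k holds every n_splits-th position of the training set, starting at k."""
    return [list(range(k, n_train, n_splits)) for k in range(n_splits)]

train_idx, test_idx = split_train_test(64, 0.75, seed=0)

# Folds contain positions within train_idx, not the original sample numbers
folds = venetian_blinds(len(train_idx), n_splits=6)
```

In each cross-validation round one fold is left out, the model is built on the remaining training samples, and the left-out fold is predicted; the external test set is never touched until the final prediction step.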

3.1.2. Factor selection and data transformation. Principal component analysis (PCA) and partial least squares (PLS), two common factor analysis methods, were used to reduce the dimensionality of the NMR dataset (64 × 458) and extract the factors describing the variance in the data space. A large number of factors (40 latent variables) was retained in the factor analysis because later factors might improve the discrimination between classes. Moreover, informative variations of metabolite levels caused by CAD⁺ status may be encoded in these factors.

In the present study, to improve the accuracy and prediction ability of the classification model, a GA, an evolutionary search method, was used for feature selection among the PLS factors to pick out a subset of discriminant factors for classifying the samples as CAD⁻ and CAD⁺ (the GA was applied to the PLS scores dataset; 64 × 40). Among the 40 PLS factors, the GA selected 28 latent variables as the most informative factors for distinguishing CAD⁻ and CAD⁺ individuals. These 28 factors had a higher frequency of inclusion in the best PLS-DA classification models, as judged by model accuracy parameters such as the non-error rate for cross-validated samples (NERcv), which measures how well the model predicts the class membership of validation samples that were not used to build it. The latent variables with a lower inclusion frequency in the PLS-DA models produced during the GA procedure were discarded and not used for the classification modelling.
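The idea of GA-based selection followed by an inclusion-frequency count can be sketched with a very small genetic algorithm. Everything specific here is an assumption made for illustration: the population size, mutation rate, truncation selection, the 0.5 frequency cut-off, and above all `toy_fitness`, which is a hypothetical stand-in for the NERcv of a PLS-DA model built on the selected factors.

```python
import random

def ga_select(n_features, fitness, pop_size=30, generations=40,
              p_mutation=0.05, seed=0):
    """Tiny GA over boolean masks; returns the final population, best first."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        elite = ranked[:pop_size // 2]            # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)    # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene with a small probability
            child = [(not g) if rng.random() < p_mutation else g for g in child]
            children.append(child)
        pop = elite + children
    return sorted(pop, key=fitness, reverse=True)

def inclusion_frequency(chromosomes):
    """Fraction of the given models in which each feature is switched on."""
    n = len(chromosomes)
    return [sum(col) / n for col in zip(*chromosomes)]

# Hypothetical fitness: rewards factors 0, 2 and 5 and penalises the others
INFORMATIVE = {0, 2, 5}

def toy_fitness(mask):
    hits = sum(1 for i, g in enumerate(mask) if g and i in INFORMATIVE)
    extras = sum(1 for i, g in enumerate(mask) if g and i not in INFORMATIVE)
    return hits - 0.1 * extras

final_pop = ga_select(8, toy_fitness)
freq = inclusion_frequency(final_pop[:10])        # ten best models
selected = [i for i, f in enumerate(freq) if f >= 0.5]
```

Counting how often a factor appears in the best models, rather than trusting a single best chromosome, makes the selection less sensitive to the randomness of any one GA run.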

To further reduce the data dimensionality (from 28 to only 3 factors), avoid over-fitting and enhance the accuracy of the classification model, the selected PLS factors were transformed into three rotated PLS factors by oblique rotation. The rotation was optimized by the GA using a simple fitness function that minimizes the ratio of the distance of an object from its own class centre to its distances from the other class centres.⁴³ The resulting optimal PLS-OR factors (a dataset of 64 × 3) significantly improved the model performance and the discrimination between CAD⁻ and CAD⁺ cases obtained from the medium-resolution 400 MHz NMR spectra.
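The distance-ratio criterion used to judge a rotation can be written down directly. The sketch below evaluates only the objective, not the GA search over oblique rotations, and it assumes Euclidean distances and a plain average of the per-object ratios; the source does not specify either choice, and the function names are hypothetical.

```python
from math import dist

def class_centres(scores, labels):
    """Mean score vector of each class."""
    centres = {}
    for c in set(labels):
        members = [s for s, l in zip(scores, labels) if l == c]
        centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return centres

def distance_ratio_fitness(scores, labels):
    """Average of (distance to own centre) / (distance to other centres); lower is better."""
    centres = class_centres(scores, labels)
    ratios = []
    for s, l in zip(scores, labels):
        own = dist(s, centres[l])
        others = sum(dist(s, c) for k, c in centres.items() if k != l)
        ratios.append(own / others)
    return sum(ratios) / len(ratios)

# Toy two-dimensional scores: compact, well-separated classes vs overlapping ones
labels = [0, 0, 1, 1]
tight = [(0, 0), (0, 1), (10, 0), (10, 1)]
loose = [(0, 0), (6, 1), (4, 0), (10, 1)]
fit_tight = distance_ratio_fitness(tight, labels)
fit_loose = distance_ratio_fitness(loose, labels)
```

A candidate rotation that pulls each class together and pushes the class centres apart lowers this value, which is why minimizing it favours factors that discriminate between the classes.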

3.1.3. Cluster analysis and classification modelling. PCA was used on the ¹H-NMR dataset (64 × 458) for cluster analysis to identify significant clusters and monitor the variance of objects in the data space via scores produced on the principal components (PCs). Fig. 3 illustrates the distribution of objects in the PCA scores space and their position related to other objects.


	Fig. 3 The PCA scores plot describing the distribution of the healthy and CAD cases.

Furthermore, hierarchical cluster analysis (HCA), as a common unsupervised pattern recognition method, was used (on the ¹H-NMR dataset; 64 × 458) to properly visualize the details of the data space and how the distributions of CAD⁻ and CAD⁺ individuals are related (within and between the classes) via a k-nearest neighbour (kNN) algorithm. The obtained dendrogram from the HCA is shown in Fig. 4. Clearly, there are two clusters corresponding to CAD⁻ (healthy) and CAD⁺ cases.


	Fig. 4 A dendrogram of kNN based HCA for clustering of the objects into two main clusters; healthy and CAD.

In the next step, PLS-DA was applied to build a classification model (using a training set with 48 samples from the optimal PLS-OR factors dataset; 64 × 3) to predict the presence of CAD in unknown samples (the external test set). The resulting model showed a strong ability to classify CAD⁻ and CAD⁺ individuals.

The mean values of the evaluation parameters of the PLS-DA model’s performance for the CAD and healthy cases by using the optimal PLS-OR factors are reported in Table 1.

Table 1 The evaluation parameters of the accuracy of the PLS-DA model for NMR spectra

Modelling step	Class label	Specificity	Sensitivity	Precision	NER
Training	Healthy	1	1	1	1
Training	CAD	1	1	1	1
Cross validation	Healthy	1	0.979	1	0.988
Cross validation	CAD	0.979	1	0.979	0.988
Prediction	Healthy	1	0.937	1	0.963
Prediction	CAD	0.937	1	0.937	0.963

These results were collected during the modelling steps which include training, cross validation and prediction for the external test set. According to Table 1, for the PLS-DA model via the optimal PLS-OR factors, the NERcv and NERtst values are acceptable and equal to 0.988 and 0.963, respectively.
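For readers unfamiliar with these figures of merit, the sketch below shows how they follow from a confusion matrix. It assumes NER is the usual non-error rate, i.e. the average of the class sensitivities. The confusion matrix is hypothetical and is not the authors' data; because the tabulated values are reported as means, they need not be reproducible from a single matrix.

```python
def class_metrics(confusion, labels):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    out = {}
    for k, name in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(row[k] for row in confusion) - tp
        tn = total - tp - fn - fp
        out[name] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "precision": tp / (tp + fp),
        }
    # Non-error rate: average of the class sensitivities
    ner = sum(m["sensitivity"] for m in out.values()) / len(labels)
    return out, ner

# Hypothetical 16-sample test set: 8 healthy and 8 CAD, one healthy sample misclassified
confusion = [[7, 1],
             [0, 8]]
metrics, ner = class_metrics(confusion, ["Healthy", "CAD"])
```

With only two classes, the specificity of one class equals the sensitivity of the other, which is the mirrored pattern visible between the Healthy and CAD rows of the table.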

The scores plot of the PLS-DA model on the first two latent variables (Fig. 5) confirms excellent discrimination between the healthy and CAD cases. The pink line denotes the linear discriminant boundary between the classes which was produced via the DA algorithm.


	Fig. 5 The PLS-DA scores of the first two latent variables distinguishing healthy and CAD individuals through the ¹H-NMR spectra. The pink solid line denotes the linear discriminant boundary classifying objects based on the calculated PLS-DA model using the optimal PLS-OR factors (64 × 3).

3.2. Clinical laboratory parameters

3.2.1. ANOVA based statistics. Sixty-four patients were included in this study. In the data organizing step, the investigated parameters included age, gender, body mass index (BMI), blood pressure, abdominal circumference, data from the patient’s history (i.e., history of hypertension, diabetes mellitus, known hyperlipoproteinemia, renal insufficiency, smoking and CCU admission) and biochemical parameters (i.e., hemoglobin, leukocytes, thrombocytes, total cholesterol, low density lipoprotein (LDL) cholesterol, high density lipoprotein (HDL), triglycerides, creatinine, glucose, erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), insulin level, creatine phosphokinase (CPK), lactate dehydrogenase (LDH), cTnI, Helicobacter pylori (H. pylori) and IgG titer). Consequently, thirty parameters were evaluated for all the cases (CAD⁻ or CAD⁺), resulting in a dataset of 64 × 30 that was used for multivariate data analysis on these parameters.

It should be noted that N. smoker is a unit for measuring the amount a person has smoked over a long period of time. It is calculated by multiplying the number of packs of cigarettes smoked per day by the number of years the person has smoked for.
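The calculation described above amounts to a single multiplication; the function name and the example values below are hypothetical.

```python
def n_smoker(packs_per_day, years_smoked):
    """Cumulative smoking exposure: packs per day multiplied by years smoked."""
    return packs_per_day * years_smoked

# Example: half a pack a day for ten years
exposure = n_smoker(0.5, 10)
```

A non-smoker has a value of zero, so a group in which most individuals never smoked gives a small mean with a large standard deviation.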

The patients were divided into two groups according to positive and negative coronary artery disease (35 and 29 patients, respectively). The characteristics of the two groups are shown in Tables 2 and 3. The differences in basic characteristics and laboratory tests are presented in Table 2.

Table 2 Description of patients in the study with comparison between CAD⁻ and CAD⁺ cases (basic characteristics and laboratory tests)

Parameter	N = 64 mean ± SD	CAD⁻ N = 29	CAD⁺ N = 35	p-value
Age	55.25 ± 12.09	51.34 ± 12.04	58.49 ± 11.31	0.018
Abdominal circumference	100.03 ± 13.42	102.86 ± 12.04	97.69 ± 14.21	0.126
Systolic BP	134.20 ± 22.37	128.90 ± 18.00	138.60 ± 24.83	0.084
Diastolic BP	79.34 ± 10.24	79.76 ± 10.88	79.00 ± 9.84	0.771
WBC	8061.09 ± 2266.98	8068.97 ± 2623.84	8054.57 ± 1962.88	0.980
Hemoglobin	13.87 ± 1.75	13.91 ± 1.54	13.83 ± 1.92	0.859
Hematocrit	41.78 ± 4.87	41.48 ± 5.20	42.02 ± 4.65	0.663
CPK	20.84 ± 13.72	16.41 ± 5.27	24.51 ± 17.185	0.017
LDH	356.14 ± 149.91	331.66 ± 101.04	376.43 ± 179.72	0.237
ESR	16.61 ± 17.43	10.66 ± 9.65	21.54 ± 20.76	0.012
IgG titer	70.28 ± 31.03	56.59 ± 31.12	81.63 ± 26.34	0.001
CRP	6.64 ± 10.64	4.94 ± 3.81	8.05 ± 13.91	0.248
Platelets	239984.37 ± 62078.16	249379.31 ± 70969.01	232200.00 ± 53434.29	0.274
N. smoker	3.30 ± 8.46	1.24 ± 5.64	5.00 ± 10.00	0.077
Troponin T	0.08 ± 0.27	0.01 ± 0.00	0.13 ± 0.36	0.078

Table 3 Description of patients in the study with comparison between CAD⁻ and CAD⁺ individuals (basic characteristics and risk factors)

Parameter	Status	N (%)	CAD⁻ N = 29	CAD⁺ N = 35	p-value
Gender	Male	34(53.1)	12(41.4)	22(62.9)	0.087
Gender	Female	30(46.9)	17(58.6)	13(37.1)	0.087
Smoker	Yes	12(18.8)	2(6.9)	10(28.6)	0.027
Smoker	No	52(81.2)	27(93.1)	25(71.4)	0.027
Hypertension	Yes	42(65.6)	16(55.2)	26(74.3)	0.109
Hypertension	No	22(34.4)	13(44.8)	9(25.7)	0.109
Cardiac disease	Yes	22(34.4)	4(13.8)	18(51.4)	0.002
Cardiac disease	No	42(65.6)	25(86.2)	17(48.6)	0.002
Cardiac failure	Yes	33(51.6)	14(48.3)	19(54.3)	0.632
Cardiac failure	No	31(48.4)	15(51.7)	16(45.7)	0.632
CCU admission	Yes	33(51.6)	7(24.1)	26(74.3)	<0.001
CCU admission	No	31(48.4)	22(75.9)	9(25.7)	<0.001
Troponin	Positive	11(17.2)	1(3.4)	10(28.6)	0.008
Troponin	Negative	53(82.8)	28(96.6)	25(71.4)	0.008
Bloating	Yes	21(32.8)	13(44.8)	8(22.9)	0.062
Bloating	No	43(67.2)	16(55.2)	27(77.1)	0.062
Gastroesophageal reflux	Yes	30(46.9)	17(58.6)	13(37.1)	0.087
Gastroesophageal reflux	No	34(53.1)	12(41.4)	22(62.9)	0.087
Bitterness of the mouth	Yes	30(46.9)	16(55.2)	14(40.0)	0.226
Bitterness of the mouth	No	34(53.1)	13(44.8)	21(60.0)	0.226
Dyspepsia	Yes	32(50.0)	18(62.1)	14(40.0)	0.079
Dyspepsia	No	32(50.0)	11(37.9)	21(60.0)	0.079
H. pylori	Positive	27(42.2)	16(55.2)	11(31.4)	0.056
H. pylori	Negative	37(57.8)	13(44.8)	24(68.6)	0.056

Significant differences were found between the CAD⁺ and CAD⁻ patients in terms of age, CPK, ESR, IgG titer and history of CCU admission. The mean age was 58.49 ± 11.31 years in CAD⁺ patients and 51.34 ± 12.04 years in CAD⁻ patients, a statistically significant difference (p-value = 0.018). Furthermore, the CPK level was higher in CAD⁺ patients (24.51 ± 17.185 vs. 16.41 ± 5.27, p-value = 0.017).
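The age comparison can be checked from the tabulated summary statistics alone. The sketch below assumes a pooled-variance two-sample t-test, which for two groups is equivalent to one-way ANOVA (F = t²); the source does not state which variant the authors ran, and the function name is hypothetical.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t statistic (equal variances assumed) and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Age from Table 2: CAD+ (n = 35) vs CAD- (n = 29)
t_age, df_age = pooled_t(58.49, 11.31, 35, 51.34, 12.04, 29)
f_age = t_age ** 2      # one-way ANOVA F statistic for two groups
```

With 62 degrees of freedom the two-sided 5% critical value of t is approximately 2.0, so a statistic above that threshold is in line with the reported significance of the age difference.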

Moreover, the ESR level was higher in CAD⁺ patients (21.54 ± 20.76 vs. 10.66 ± 9.65, p-value = 0.012). It has been reported that ESR can be used as a predictor of coronary artery disease.⁴⁴ In fact, some researchers such as Prakash and colleagues believe that this parameter is one of the few laboratory markers that differ between CAD patients and healthy individuals.⁴⁵

Another factor that differed between the CAD⁺ patients and healthy people was the IgG titer, which was higher in CAD⁺ patients (81.63 ± 26.34 vs. 56.59 ± 31.12, p-value = 0.001). Another study likewise found that H. pylori infection, and consequently the H. pylori antibody titer, was higher among CAD⁺ patients.⁴⁶

In this study, no significant difference was observed between the two groups in terms of CRP but the opposite finding has been mentioned in a study by Leite et al.⁴⁷

The differences in basic characteristics and risk factors are shown in Table 3. The smoker status is yes if the individual has a history of smoking. If the individual has a history of heart disease, the cardiac disease status is noted as yes. Heart failure, often referred to as congestive heart failure, occurs when the heart is unable to pump sufficiently to maintain blood flow to meet the body’s needs. When a person has been hospitalized in the intensive care unit for the heart, the status is yes for CCU admission. Troponin was determined using bioMerieux kits. Results of more than 0.01 are considered to be positive. The H. pylori parameter was detected based on serum titers of higher than 30 AU mL⁻¹.

Among patients with coronary artery disease, positive troponin results were significantly more frequent (p-value = 0.008). The smoking rate was also higher among CAD⁺ patients (p-value = 0.027). Additionally, a history of heart disease and previous hospitalization in a CCU ward were significantly more likely in CAD⁺ patients; high blood pressure was also more frequent in this group, although the difference did not reach statistical significance (p-value = 0.109). These are known risk factors for CAD.
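Group comparisons of categorical parameters such as these can be checked from the counts in Table 3. The source does not state which test was used; the sketch below assumes a Pearson chi-square test without continuity correction on the 2 × 2 table, and the function name is hypothetical.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [a, b, c, d]
    # Expected counts under independence: row total x column total / n
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2))   # upper tail of chi-square with 1 degree of freedom
    return chi2, p_value

# Smoker status from Table 3: CAD- (2 yes, 27 no) and CAD+ (10 yes, 25 no)
chi2_smoker, p_smoker = chi_square_2x2(2, 27, 10, 25)
```

The same function can be applied to any other yes/no row pair of Table 3 by substituting the four counts.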

Although our study showed that hematological parameters such as the number of white blood cells and platelets were not associated with CAD, Jia et al.’s study did find this association.⁴⁸

3.2.2. Feature selection. By applying GA-based feature selection to the clinical parameters dataset (64 × 30), thirteen of the thirty parameters, namely age, gender, abdominal circumference, systolic BP, diastolic BP, cardiac disease, white blood cell count (WBC), ESR, troponin, CRP, LDH, IgG titer and dyspepsia, were selected as the most important and effective variables to discriminate CAD⁻ and CAD⁺ individuals (leading to a dataset of 64 × 13). The GA used the cross-validated accuracy (NERcv) of PLS-DA models built on different subsets of the parameters as its fitness function to select the discriminant and most effective parameters. This procedure significantly reduced over-fitting and enhanced the model performance in classifying CAD⁻ and CAD⁺ cases.

3.2.3. Classification modelling. The parameters selected by GA (as a training set with 48 samples from the mentioned dataset; 64 × 13) were subjected to a PLS-DA classifier to build a classification model for healthy and CAD patients based on the measured clinical laboratory parameters. The model produced by PLS-DA showed an excellent ability to classify individuals during all the steps of the modelling (i.e., training, Venetian blind cross validation and prediction of the external test set). The model performance parameters, which describe the accuracy and precision of the model, are collected in Table 4.

Table 4 The calculated evaluation parameters of the classification model performance for healthy and CAD cases via the measured clinical laboratory parameters

Modelling step	Class label	Specificity	Sensitivity	Precision	NER
Training	Healthy	1	1	1	1
Training	CAD	1	1	1	1
Cross validation	Healthy	0.931	0.952	0.923	0.947
Cross validation	CAD	0.952	0.931	0.976	0.947
Prediction	Healthy	1	0.875	0.953	0.937
Prediction	CAD	0.875	1	0.917	0.937

The scores obtained for the first two identified latent variables by the PLS-DA model are shown in Fig. 6. It is clear that there is good discrimination between CAD⁻ and CAD⁺ individuals. The blue and red lines around objects show the individual space of each class in the score space for healthy and CAD cases, respectively. It can be concluded that the mentioned clinical laboratory parameters would be advantageous to distinguish CAD patients from healthy cases with high accuracy and precision.


	Fig. 6 The PLS-DA scores for the first two latent variables classifying healthy and CAD individuals via clinical laboratory parameters. The PLS-DA model was obtained by using the selected parameters dataset (64 × 13).

4. Conclusions

In the present study, the application of supervised pattern recognition and statistical methods to the analysis of biological/biomedical samples for medical purposes was evaluated. By acquiring ¹H-NMR spectra of human blood plasma samples and applying pattern recognition based chemometric methods, CAD⁻ cases can be properly distinguished from CAD⁺ individuals. Significantly, a classification model was built to discriminate between healthy and CAD cases as a rapid and safe disease diagnosis technique.

Furthermore, through clinical parameters measured in the laboratory, CAD⁻ and CAD⁺ individuals were identified by classification modelling. Among thirty parameters, thirteen parameters (i.e., age, gender, abdominal circumference, systolic BP, diastolic BP, cardiac disease, WBC, ESR, troponin, CRP, LDH, IgG titer and dyspepsia) were selected as the discriminant and most important to distinguish CAD⁻ from CAD⁺ using a genetic algorithm based feature selection approach.

Finally, it was demonstrated that the above-mentioned workflow, using the acquired ¹H-NMR spectra and measured clinical laboratory parameters, was able to accurately predict CAD in the suspected cases with lower risk than other clinical methods, being fast, simple and non-invasive.

Acknowledgements

The authors would like to thank the Post Angiography Ward staff of Shahid Beheshti Hospital of Qom for their cooperation.

Notes and references

N. Bogavac-Stanojević, G. Ivanova Petrova, Z. Jelić-Ivanossvić, L. Memon and S. Spasić, Clin. Biochem., 2007, 40, 1180–1187 CrossRef PubMed.
G. Lippi, Eur. J. Intern. Med., 2013, 24, 97–99 CrossRef CAS PubMed.
V. Kincl, R. Panovsky, J. Meluzin, J. Semenka, L. Groch, D. Tomcikova, J. Jarkovsky and L. Dusek, Acta Univ. Palacki. Olomuc., Fac. Med., 2010, 154, 227–233 CrossRef CAS.
A Preliminary Report From the Pathobiological Determinants of Atherosclerosis in Youth (PDAY) Research Group, JAMA, J. Am. Med. Assoc., 1990, 264, 3018–3024 Search PubMed.
A. B. A. Bampi, C. E. Rochitte, D. Favarato, P. A. Lemos and P. L. d. Luz, Clinics, 2009, 64, 675–682 CrossRef PubMed.
R. Gen, M. Demir and H. Ataseven, South. Med. J., 2010, 103, 190–196 CrossRef PubMed.
M. Rasouli, A. M. Kiasari and B. Bagheri, Clin. Chim. Acta, 2007, 377, 127–132 CrossRef CAS PubMed.
R. Ross, N. Engl. J. Med., 1999, 340, 115–126 CrossRef CAS PubMed.
S. G. Foussas, M. N. Zairis, A. G. Lyras, N. G. Patsourakos, V. G. Tsirimpis, K. Katsaros, D. J. Beldekos, S. M. Handanis, D. Z. Mytas, K. S. Karidis, P. G. Tselioti, A. A. Prekates and J. A. Ambrose, Am. J. Cardiol., 2005, 96, 533–537 CrossRef CAS PubMed.
E. Cavusoglu, V. Chopra, A. Gupta, C. Ruwende, S. Yanamadala, C. Eng, L. T. Clark, D. J. Pinsky and J. D. Marmur, Am. J. Cardiol., 2006, 98, 1189–1193 CrossRef PubMed.
H. Taniguchi, Y. Momiyama, R. Ohmori, A. Yonemura, T. Yamashita, S. Tamai, H. Nakamura and F. Ohsuzu, Atherosclerosis, 2005, 178, 173–177 CrossRef CAS PubMed.
J. Danesh, R. Collins, P. Appleby and R. Peto, JAMA, J. Am. Med. Assoc., 1998, 279, 1477–1482 CrossRef CAS PubMed.
A. Scalbert, L. Brennan, O. Fiehn, T. Hankemeier, B. Kristal, B. van Ommen, E. Pujos-Guillot, E. Verheij, D. Wishart and S. Wopereis, Metabolomics, 2009, 5, 435–458.
Z. Pan and D. Raftery, Anal. Bioanal. Chem., 2007, 387, 525–527.
M. Castro-Puyana and M. Herrero, TrAC, Trends Anal. Chem., 2013, 52, 74–87.
A. Lawaetz, R. Bro, M. Kamstrup-Nielsen, I. Christensen, L. Jørgensen and H. Nielsen, Metabolomics, 2011, 8, 111–121.
O. S. Wolfbeis and M. Leiner, Anal. Chim. Acta, 1985, 167, 203–215.
S. Madhuri, P. Aruna, M. I. Summiya Bibi, V. S. Gowri, D. Koteeswaran and S. Ganesan, Proc. Soc. Photo-Opt. Instrum. Eng., 1997, 2982, 41–45.
V. Masilamani, K. Al-Zhrani, M. Al-Salhi, A. Al-Diab and M. Al-Ageily, J. Lumin., 2004, 109, 143–154.
C. DeHaven, A. Evans, H. Dai and K. Lawton, J. Cheminf., 2010, 2, 9.
W. Lu, B. D. Bennett and J. D. Rabinowitz, J. Chromatogr. B: Anal. Technol. Biomed. Life Sci., 2008, 871, 236–242.
G. Theodoridis, H. G. Gika and I. D. Wilson, TrAC, Trends Anal. Chem., 2008, 27, 251–260.
A. Jiye, J. Trygg, J. Gullberg, A. I. Johansson, P. Jonsson, H. Antti, S. L. Marklund and T. Moritz, Anal. Chem., 2005, 77, 8086–8094.
K. Hiller, J. Hangebrauk, C. Jäger, J. Spura, K. Schreiber and D. Schomburg, Anal. Chem., 2009, 81, 3429–3439.
M. P. Styczynski, J. F. Moxley, L. V. Tong, J. L. Walther, K. L. Jensen and G. N. Stephanopoulos, Anal. Chem., 2006, 79, 966–973.
K. K. Pasikanti, P. C. Ho and E. C. Y. Chan, J. Chromatogr. B: Anal. Technol. Biomed. Life Sci., 2008, 871, 202–211.
P. Jonsson, A. I. Johansson, J. Gullberg, J. Trygg, A. Jiye, B. Grung, S. Marklund, M. Sjöström, H. Antti and T. Moritz, Anal. Chem., 2005, 77, 5635–5642.
K. A. Kouremenos, J. Pitt and P. J. Marriott, J. Chromatogr. A, 2010, 1217, 104–111.
X. Li, Z. Xu, X. Lu, X. Yang, P. Yin, H. Kong, Y. Yu and G. Xu, Anal. Chim. Acta, 2009, 633, 257–262.
A. C. Beckstrom, E. M. Humston, L. R. Snyder, R. E. Synovec and S. E. Juul, J. Chromatogr. A, 2011, 1218, 1899–1906.
A. Smolinska, L. Blanchet, L. M. C. Buydens and S. S. Wijmenga, Anal. Chim. Acta, 2012, 750, 82–97.
I. F. Duarte and A. M. Gil, Prog. Nucl. Magn. Reson. Spectrosc., 2012, 62, 51–74.
V. Exarchou, M. Krucker, T. A. van Beek, J. Vervoort, I. P. Gerothanassis and K. Albert, Magn. Reson. Chem., 2005, 43, 681–687.
M. H. Hamdan, Cancer Biomarkers: Analytical Techniques for Discovery, John Wiley & Sons, Inc., 2007.
M. Arjmand, M. Kompany-Zareh, M. Vasighi, N. Parvizzadeh, Z. Zamani and F. Nazgooei, Talanta, 2010, 81, 1229–1236.
M. Vasighi, A. Zahraei, S. Bagheri and J. Vafaeimanesh, J. Chemom., 2013, 27, 318–322.
R. Madsen, T. Lundstedt and J. Trygg, Anal. Chim. Acta, 2010, 659, 23–33.
J. Trygg, E. Holmes and T. Lundstedt, J. Proteome Res., 2007, 6, 469–479.
M. Barker and W. Rayens, J. Chemom., 2003, 17, 166–173.
R. G. Brereton, Chemometrics for Pattern Recognition, first edn., John Wiley & Sons, Chichester, 2009.
O. Beckonert, H. C. Keun, T. M. D. Ebbels, J. Bundy, E. Holmes, J. C. Lindon and J. K. Nicholson, Nat. Protoc., 2007, 2, 2692–2703.
C. Bush, D. VanFossen, A. J. Kolibash, R. Magorien, J. Bacon, G. Ansel, G. Eaton, M. Ramancik, A. Orsini and S. Palmer, Cathet. Cardiovasc. Diagn., 1993, 29, 267–272.
M. Shahbazy, Exploratory analysis of excitation-emission fluorescence data from colorectal cancer with oblique rotation of factors and self-organizing maps, MSc thesis, Institute for Advanced Studies in Basic Sciences (IASBS), 2013.
J. Yayan, Vasc. Health Risk Manage., 2012, 8, 219–223.
S. Prakash, K. Dhingra and S. Priya, Eur. J. Dent., 2012, 6, 287–294.
J. Vafaeimanesh, S. F. Hejazi, V. Damanpak, M. Vahedian, M. Sattari and M. Seyyedmajidi, Sci. World J., 2014, 2014, 6.
W. F. Leite, J. A. F. Ramires, L. F. P. Moreira, C. M. C. Strunz and J. A. Mangione, Arq. Bras. Cardiol., 2014, 104, 202–208.
E.-z. Jia, Z.-j. Yang, B. Yuan, X.-l. Zang, R.-h. Wang, T.-b. Zhu, L.-s. Wang, B. Chen and W.-z. Ma, Acta Pharmacol. Sin., 2005, 26, 1057–1062.
