Thiago Mazzu-Nascimento ab, Giorgio Gianini Morbioli abc, Luis Aparecido Milan d, Diego Furtado Silva e, Fabiana Cristina Donofrio f, Carlos Alberto Mestriner g and Emanuel Carrilho *ab
a Instituto de Química de São Carlos, Universidade de São Paulo, Av. Trabalhador São-carlense, 400, 13566-590 São Carlos, SP, Brazil. E-mail: emanuel@iqsc.usp.br
b Instituto Nacional de Ciência e Tecnologia de Bioanalítica-INCTBio, 13083-970 Campinas, SP, Brazil
c School of Chemistry and Biochemistry, Georgia Institute of Technology, 30332 Atlanta, GA, USA
d Departamento de Estatística, Universidade Federal de São Carlos, Rod. Washington Luís km 235, 13565-905 São Carlos, SP, Brazil
e Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, 13566-590 São Carlos, SP, Brazil
f Instituto de Ciências da Saúde, Universidade Federal de Mato Grosso, 78557-267 Sinop, MT, Brazil
g Wama Produtos para Laboratório Ltda, 13560-971 São Carlos, SP, Brazil
First published on 22nd March 2017
Paper-based devices are an excellent match for low-cost point-of-care testing (POCT) tools. Their user-friendliness, portability, and short analysis time, coupled with ease of local manufacture, make these devices the best option for inexpensive diagnostic testing tools. However, despite all their positive features, these low-cost diagnostic devices must present good performance indicators, such as sensitivity, specificity, and accuracy. We developed and validated a paper-based ELISA for toxoplasmosis diagnosis through the detection of Toxoplasma gondii immunoglobulin G (IgG) antibodies in 100 human serum samples. Among the different ways to define the cut-off value, we chose Youden's J index (cut-off = 21.73 A.U.), which gave the higher sensitivity. Our paper-based assay presented a sensitivity of 0.96, a specificity of 0.87, and a gray zone comprising 16 samples (±15% of the cut-off value, with 3 false positive outputs). The accuracy of the test was estimated by using ROC curves (AUC = 0.97). We also created a macro in Microsoft Excel® to estimate the accuracy of the test (m-Accuracy) based on a non-parametric method; it returned a value of 0.88, which classifies our test as moderately to highly accurate. We provide the m-Accuracy macro for download and the paper-based microplate designs for printing, in order to support the scientific community and facilitate further studies using this platform. Improving these diagnostic tools can bring this technology to those who need it, contributing to population health and well-being.
Paper-based ELISA (p-ELISA) has the same versatility as conventional immunoenzymatic assays,2 but it presents several advantages over the traditional analytical tools, such as low cost, portability, low sample and reagent consumption, user-friendliness and analytical flexibility, as well as reduced waste generation.2–4 Colorimetric paper-based outcomes can be digitized using a flatbed scanner or simply photographed using a cellphone camera, and the digital image can be analyzed later offsite by a specialist or processed in real time by a smartphone application.3–9 p-ELISA has the potential to be implemented in developing countries and needy regions to diagnose neglected diseases, such as malaria,10 dengue fever,11 syphilis,12 and HIV,13 which demonstrates the applicability and versatility of POCT devices such as ours.5,8,14–16
Much is said about the cost advantages of these devices,17 but alongside their low cost, these POCT tools carry an uncertainty in their output, so parameters must be established to evaluate their usefulness. The most common approaches involve the assessment of the traditional figures of merit of the method, which include the limits of detection (LODs),18 limits of quantification (LOQs), dynamic linear range and analytical sensitivity, or a direct graphical comparison between the reference method (gold standard) and the new method (the paper-based test, in this case), fitting a model to check whether the methods correlate.19 However, graphical analysis by itself is not capable of assessing the real performance of the new test. Moreover, a simple analytical curve and the determination of the conventional figures of merit of the new method (such as LOD and LOQ) deal with negative outcomes,20 which are not relevant for clinical applications.
Thus, to determine a safety margin for the outcomes of such low-cost testing devices, it is necessary to assess their performance indicators,20 such as sensitivity (proportion of diseased people with a positive outcome), specificity (proportion of healthy people with a negative outcome)21 and accuracy (proportion of correct classifications).22 Moreover, a critical step in establishing a new diagnostic test is the determination of the cut-off value, which defines the threshold between healthy and diseased patients.20,22
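The three performance indicators defined above reduce to simple ratios of confusion-matrix counts. The following Python sketch illustrates them; the function name and the example counts are hypothetical and are not data from this study.

```python
def performance_indicators(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: diseased samples with a positive outcome
    fn: diseased samples with a negative outcome
    tn: healthy samples with a negative outcome
    fp: healthy samples with a positive outcome
    """
    sensitivity = tp / (tp + fn)                 # proportion of diseased correctly flagged
    specificity = tn / (tn + fp)                 # proportion of healthy correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # proportion of correct classifications
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only
se, sp, acc = performance_indicators(tp=45, fn=5, tn=40, fp=10)
print(f"Se = {se:.2f}, Sp = {sp:.2f}, accuracy = {acc:.2f}")
```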
Here we discuss these parameters: first, we developed a paper-based ELISA for detection of immunoglobulin G anti-Toxoplasma gondii, comparing the results with those of a benchtop ELISA assay (the gold-standard test). Then, we performed a statistical assessment of the performance of p-ELISA using ROC curves (sensitivity, specificity, and accuracy), and compared different methods to define the cut-off value and the uncertainty zone (gray zone) of the test. Moreover, we present for the first time a different approach to estimate the accuracy of the test using a macro in Microsoft Excel® (m-Accuracy), based on a non-parametric method23 (available for downloading – ESI†).
The sensitivity, specificity, and accuracy of the test were calculated from receiver operating characteristic (ROC) curves and their area under the curve (AUC), generated with the R package “pROC”28 in R, the free software environment for statistical computing and graphics.
Additionally, we have created a macro in Microsoft Excel® to estimate the accuracy of the test using a non-parametric method, based on the work of Obuchowski et al.23 (eqn (1)).
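$$\text{Accuracy} = \frac{1}{N(N-1)} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} w \qquad (1)$$

where N is the number of samples, t and s are the gold-standard outcomes of samples i and j, p_it and p_js are the corresponding outcomes of the new test, and the weight w of each pair is:
- w = 1, if t > s and p_it > p_js, or t < s and p_it < p_js;
- w = 0.5, if t = s and p_it = p_js;
- w = 0, otherwise.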
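The pairwise weighting rule above can be sketched in a few lines of Python. This is an illustrative re-expression, not the authors' Excel macro: the function names are hypothetical, and averaging the weights over all ordered pairs of distinct samples is an assumption based on the description of summing the weights and dividing by the number of pairs compared.

```python
from itertools import permutations


def pair_weight(t, s, p_t, p_s):
    """Weight w for one pair: t, s are gold-standard values; p_t, p_s are new-test values."""
    if (t > s and p_t > p_s) or (t < s and p_t < p_s):
        return 1.0   # both methods rank the pair in the same order
    if t == s and p_t == p_s:
        return 0.5   # tie in both methods
    return 0.0       # any other case counts as a misleading outcome


def m_accuracy(gold, new):
    """Average pair weight over all ordered pairs of distinct samples (assumed normalization)."""
    if len(gold) != len(new):
        raise ValueError("gold and new must list the same samples in the same order")
    n = len(gold)
    if n < 2:
        raise ValueError("at least two samples are needed to form a pair")
    total = sum(
        pair_weight(gold[i], gold[j], new[i], new[j])
        for i, j in permutations(range(n), 2)
    )
    return total / (n * (n - 1))


# Hypothetical values, for illustration only
gold_standard = [0.2, 0.8, 1.5]   # e.g. absorbance from the benchtop ELISA
paper_based = [5.0, 30.0, 12.0]   # e.g. mean pixel intensity (A.U.)
print(m_accuracy(gold_standard, paper_based))
```

Because every sample contributes to the pairwise comparison, no gold-standard result has to be discarded before the estimate is computed.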
p-ELISA devices are being used for detection of auto-immune antibodies,19 neuropeptide Y,29 as well as HIV antigens,2 showing versatility in clinical applications. The advantages of p-ELISA arise from the characteristics of paper, such as capillarity, thinness, light weight and a large surface area in comparison with traditional plastic substrates, which facilitate sample absorption and speed up solvent drying. These enzyme immunoassays on a paper platform allow all steps to be performed at room temperature without the use of sophisticated equipment. Additionally, the p-ELISA devices can be manufactured locally in the laboratory, meeting the needs of developing countries and needy regions for monitoring neglected diseases.
p-ELISA can make use of conventional technologies to digitize the images, such as a flatbed scanner or a cellphone camera, and software to analyze images (such as the freeware ImageJ – available at: https://imagej.nih.gov/ij/), with a total approximate cost of $ 100.2 With continued studies in the area, paper-based analytical devices should provide results as good as those of the conventional tests, with the benefits of low cost, simplicity and user-friendliness.
The wax printing technology to fabricate paper-based devices is the simplest and most versatile way to create distinct patterns on paper, presenting great efficiency in the creation of hydrophobic barriers to contain reactions inside the testing zones (Fig. 2A and B), which avoids cross-contamination.
Considering the volume spent on carrying out the assays, the p-ELISA presents a great advantage over the commercial ELISA tests. While p-ELISA uses 5 μL per step and per spot, conventional ELISA requires a minimum of 100 μL per step and per spot (Fig. S1 – ESI†). The low reagent and sample consumption is an attractive feature for neglected disease diagnosis because of the overall low cost, and it also generates less biological waste.
Regarding time requirements for carrying out the assays, the paper-based method also presents advantages over conventional tests. p-ELISA requires 20 min for each step, and 20 s incubation with secondary antibody (anti-human IgG), thus requiring around 60 min to complete the whole process. ELISA testing, on the other hand, requires over 300 min to complete the assay (Fig. S2 – ESI†), which is five times longer than this low-cost alternative.
In terms of costs, a p-ELISA microplate costs only US $ 0.10, against US $ 5.00 for a conventional ELISA plate (Fig. 2C). Thus, considering individual assays, our p-ELISA for toxoplasmosis diagnosis presented a cost of ∼US $ 0.34 per assay, against ∼US $ 0.92 per assay for the conventional ELISA assay (Fig. 2C). When we analyze costs per step, p-ELISA devices presented smaller costs than the conventional ELISA mainly in sensitization and revealing steps, resulting in a total cost difference of ∼$ 55.00. Moreover, the high cost associated with the ELISA microplate reader results in a larger price discrepancy between the conventional immunological technique and the paper-based assays.
The cut-off value is used as the threshold to differentiate sick from healthy patients.27,30,31 The gold-standard ELISA bench kit used in this study for toxoplasmosis diagnosis presented a cut-off value at an absorbance of 1.0 and a confidence interval of ±10% (described by the manufacturer in the diagnostic kit label). This means that samples presenting an absorbance value below 0.9 were classified as negative, above 1.1 were classified as positive, and the rest situated between these values (0.9 < Abs < 1.1) were classified as uncertain (gray zone).
One hundred human serum samples were tested using the gold-standard ELISA bench kit, in which 62 samples were classified as positive, 33 as negative and 5 were classified as uncertain.
During the development of a new diagnostic test the determination of the cut-off value is necessary. A common approach is to plot a graph of sensitivity (Se) and specificity (Sp) as a function of cut-off, and the intersection point between both curves is chosen as the cut-off value.25 By using this method, we have obtained a cut-off of 24.74 A.U. (arbitrary units from the mean pixel intensity of the yellow channel), with a specificity = sensitivity = 0.87 (Fig. 3A).
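The intersection method described above can be sketched as a scan over candidate cut-offs that keeps the one where sensitivity and specificity are closest. This is a minimal Python sketch under stated assumptions: a sample is called positive when its signal is at or above the cut-off, candidate cut-offs are restricted to the observed signals, and the data and function names are hypothetical.

```python
def se_sp(signals, is_positive, cutoff):
    """Sensitivity and specificity when 'signal >= cutoff' is read as a positive outcome."""
    pos = [x for x, d in zip(signals, is_positive) if d]
    neg = [x for x, d in zip(signals, is_positive) if not d]
    se = sum(x >= cutoff for x in pos) / len(pos)
    sp = sum(x < cutoff for x in neg) / len(neg)
    return se, sp


def intersection_cutoff(signals, is_positive):
    """Candidate cut-off at which the Se and Sp curves cross (smallest |Se - Sp|)."""
    def gap(c):
        se, sp = se_sp(signals, is_positive, c)
        return abs(se - sp)
    return min(sorted(set(signals)), key=gap)


# Hypothetical signals (A.U.) and gold-standard status, for illustration only
signals = [10, 14, 18, 20, 23, 26, 30, 35]
status = [False, False, False, True, False, True, True, True]

cutoff = intersection_cutoff(signals, status)
print(cutoff, se_sp(signals, status, cutoff))
```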
A second method involves a plot of Youden's J-index (where J = specificity + sensitivity − 1, and ranges from 0 to 1) as a function of cut-off, with the highest point of the curve chosen as the cut-off value of the test.26,27 By using this method, we have obtained a cut-off of 21.73 A.U., with sensitivity = 0.96 and specificity = 0.87 (Fig. 3B).
It is relevant to underline that in Fig. 3A there is a region between 19 and 24 A.U. in which the specificity of the test remains unchanged, while the sensitivity presents a sharp drop. It would be possible to choose an arbitrary point in this region in which both sensitivity and specificity are as high as possible. However, it is not advisable to define an arbitrary cut-off value for a diagnostic test. When Youden's J-index is used instead, the cut-off value is defined as the highest value for the combination of the variables (exactly in the region between 19 and 24 A.U.), with the advantage of being a defined point (a non-subjective decision), so we have chosen this non-arbitrary method to define the cut-off value for this diagnostic test.
The cut-off value cannot be used directly as a threshold value to separate diseased from healthy patients: even the gold-standard ELISA bench kit presents an uncertainty zone (0.9 < Abs < 1.1, in the present case) in which it is not possible to define the patient's status.32,33 It is therefore necessary to create a three-zone partition to provide a confidence interval for the new diagnostic test as well: diseased, non-diseased, and an inconclusive outcome region in between, known as the gray zone.32 Analysis of the positive likelihood ratio (LR+) and negative likelihood ratio (LR−) can be used to create this zone,32,33 as can numerical factors on the standard deviation (SD), or a tolerance percentage on the cut-off value.34–36 Different tolerance percentages on the cut-off value (10%, 12%, 15% or 20%) have been shown to be suitable for establishing a gray zone in paper-based immunoassays for detection of tumor markers.20 We have opted for the latter approach, in an attempt to eliminate the largest number of samples classified as false positive or false negative (Fig. 4B–F).
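Selecting the cut-off by Youden's J-index amounts to evaluating J = sensitivity + specificity − 1 at each candidate cut-off and keeping the maximum. The Python sketch below illustrates this; it assumes that a signal at or above the cut-off is read as positive and that candidate cut-offs are the observed signals, and the data and function name are hypothetical.

```python
def youden_cutoff(signals, is_positive):
    """Return (cutoff, J, sensitivity, specificity) for the cut-off that maximizes J."""
    pos = [x for x, d in zip(signals, is_positive) if d]
    neg = [x for x, d in zip(signals, is_positive) if not d]
    best = None
    for c in sorted(set(signals)):
        se = sum(x >= c for x in pos) / len(pos)   # diseased samples called positive
        sp = sum(x < c for x in neg) / len(neg)    # healthy samples called negative
        j = se + sp - 1
        # Strict comparison keeps the lowest cut-off when several share the same J
        if best is None or j > best[1]:
            best = (c, j, se, sp)
    return best


# Hypothetical signals (A.U.) and gold-standard status, for illustration only
signals = [10, 14, 18, 20, 23, 26, 30, 35]
status = [False, False, False, True, False, True, True, True]
print(youden_cutoff(signals, status))
```

Keeping the lowest cut-off among ties is a design choice of this sketch; it favors sensitivity, which suits a screening test.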
From Fig. 4A it is observed that when no tolerance percentage is used there are 2 false negative and 4 false positive outcomes misclassified by the p-ELISA test. When the tolerance percentage is set to ±10% of the cut-off value (Fig. 4B), the gray zone encompasses 6 samples (but misses 2 false-negatives and 4 false-positives). When the tolerance percentage is set to ±12% of the cut-off value (Fig. 4C), the gray zone encompasses 8 samples (still missing 2 false-negatives and 4 false-positives). These percentages (±10% and ±12% of the cut-off value) were not able to eliminate the false negative outcomes or to decrease the amount of false positives.
However, when the tolerance percentage is set to ±15% of the cut-off value (Fig. 4D), the gray zone encompasses 16 samples and removes all false negative results, besides reducing the number of samples classified as false positives to 3. A further increase in the tolerance percentage (20% of the cut-off value, Fig. 4E) presents no advantage over 15%, since it encompasses 23 samples (23% of all tested samples) and still misses 3 false-positive outcomes. We have also evaluated the best tolerance percentage (±15%) on the cut-off value determined by the first method (24.74 A.U.), which resulted in a gray zone encompassing 20 samples, but presenting 1 false-positive and 2 false-negative outcomes (Fig. 4F). These results have demonstrated that the cut-off value determined by Youden's J-index combined with a tolerance percentage of ±15% of the cut-off value is the best choice for this assay.
Paper-based ELISA protocols still require optimization, because their intrinsic multistep requirements (washing, blocking, and sensing steps) depend heavily on operator skill.1,20 Even though lateral flow immunoassays14,37,38 and other bioassays carried out in low-cost platforms39,40 do not present such requirements, being faster and easier to use, p-ELISA testing tools outperform them because they offer a great multiplexing capacity and improved quantitative outputs,20 and their low cost also allows for broad-range screening tests in needy regions. These advantages are attainable only for tests with a high sensitivity (low number of false negative outputs),41 such as that demonstrated by our device after the definition of its tolerance percentage. Patients with a positive outcome in a screening test are then referred to a health center for further testing. In the case of a false positive output for toxoplasmosis, for example, the clinical center would perform more specific tests in order to confirm the absence of the disease, which does not affect the patient's health. However, when tests present false negative outcomes the individuals do not receive adequate treatment, which allows for disease progression, increased treatment costs and lower chances of success. These scenarios justify a tolerance percentage of ±15% of the cut-off value due to the complete elimination of false negative outputs.
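The effect of the tolerance percentage can be explored by counting, for each candidate tolerance, how many samples fall inside the gray zone and how many misclassified samples remain outside it. The Python sketch below does this under stated assumptions: the zone is cut-off × (1 ± tolerance) with inclusive bounds, signals above the zone are read as positive, and the sample data and function name are hypothetical rather than taken from Fig. 4.

```python
def gray_zone_summary(signals, is_positive, cutoff, tolerance):
    """Count gray-zone samples and the false outcomes left outside the zone."""
    lower = cutoff * (1 - tolerance)
    upper = cutoff * (1 + tolerance)
    gray = false_pos = false_neg = 0
    for x, diseased in zip(signals, is_positive):
        if lower <= x <= upper:
            gray += 1          # inconclusive outcome
        elif x > upper and not diseased:
            false_pos += 1     # healthy sample called positive
        elif x < lower and diseased:
            false_neg += 1     # diseased sample called negative
    return {"lower": lower, "upper": upper, "gray": gray,
            "false_positive": false_pos, "false_negative": false_neg}


# Hypothetical signals (A.U.) and gold-standard status, for illustration only
signals = [12, 17, 19, 20, 23, 24.5, 26, 27, 33, 40]
status = [False, False, True, False, True, False, False, True, True, True]

for tol in (0.0, 0.10, 0.15):
    print(tol, gray_zone_summary(signals, status, cutoff=21.73, tolerance=tol))
```

With the cut-off of 21.73 A.U., a tolerance of ±15% corresponds to a gray zone from about 18.47 to 24.99 A.U.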
The Receiver Operating Characteristic (ROC) curve is one of the most used tools for the measurement of accuracy of diagnostic tests.23,25,41 The ROC curve is generated by plotting the sensitivity of the test against (1 − specificity),41 and it requires the classification of the samples by the gold-standard in a dichotomous outcome (1 for positive and 0 for negative).23,25 There is a comparison between the classification made by the gold standard and the outputs of the new diagnostic test (p-ELISA).23 However, there are cases in which the gold standard provides a continuous outcome instead of dichotomous outcome, and the common practice is to force the dichotomization of the results, which requires the exclusion of the samples classified in the gray zone area of the gold standard (5 samples in the present case). We used the R-package pROC28 to obtain the ROC curve for the toxoplasmosis p-ELISA testing of the 95 remaining samples, as shown in Fig. 5.
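The construction of the ROC curve described above can be illustrated with a short Python sketch that sweeps the cut-off over the observed signals, records (1 − specificity, sensitivity) at each step, and integrates the curve with the trapezoidal rule. This is not the pROC implementation used in the study; the function names and data are hypothetical, and a signal at or above the cut-off is assumed to indicate a positive outcome.

```python
def roc_points(scores, labels):
    """ROC points (1 - specificity, sensitivity); labels are 1 (positive) or 0 (negative)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Lower the cut-off step by step so the curve runs from (0, 0) to (1, 1)
    for c in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= c)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical dichotomized data: gray-zone samples of the gold standard must be
# removed beforehand, because each label has to be either 0 or 1.
scores = [10, 14, 18, 20, 23, 26, 30, 35]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(auc(roc_points(scores, labels)))
```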
The accuracy was calculated using the area under the curve (AUC) of Fig. 5. The maximum accuracy value is one, which represents a perfect test, and our paper-based immunoassay for toxoplasmosis diagnosis presented an AUC = 0.97, which defines it as a highly accurate diagnostic test, close to the maximum accuracy.23 It is relevant to say that the forced dichotomization of the test, with the subsequent exclusion of the samples classified in the gray zone area of the gold standard, tends to overestimate the calculated accuracy of the diagnostic assay being evaluated. It is therefore necessary to include the samples in the gray zone of the gold standard in the computation in order to obtain a more realistic estimate of the accuracy of the new diagnostic test.23
The m-Accuracy method compares each sample output with all other sample outputs in both the gold standard and the new method, analyzing whether the variation that occurs in each sample pair in the gold-standard method also occurs in the paper-based ELISA. Different weights are assigned according to the event that occurs. In other words, a value of 1 is assigned to a specific pair when the sample with a higher outcome value in the gold standard also presents a higher value in the new method. If there is no difference in values between a specific pair of samples, then a value of 0.5 is assigned to that pair. If neither of these events occurs, then the model interprets that as a “misleading outcome” and a value of 0 is assigned to that pair. Thus, to estimate the accuracy of the test these values are summed up and then divided by the number of pairs compared. The analysis is performed by a single command (ctrl + shift + T) and the operation of the m-Accuracy tool is very straightforward. There are two columns in the macro: in the first column the values obtained by the gold-standard test are entered (in this case 100 results, each the mean of triplicates), while in the second column the values obtained by the new diagnostic test are entered (for the same samples, in the same order). The ROC curve provided an overestimated accuracy of 0.97, against a more reliable accuracy value of 0.88 provided by the m-Accuracy tool, which classifies the test as moderately to highly accurate.23 Fig. 6 illustrates the operation of m-Accuracy in a tutorial format.
It is relevant to notice that the use of the m-Accuracy tool becomes very attractive since all samples are used in the accuracy computation, including those within the gray zone of the gold standard, by comparing the results obtained by the gold standard method and the new diagnostic test. Due to the ease of use and more reliable results provided by this tool we provide m-Accuracy for download as part of our ESI.†
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c7ay00505a
This journal is © The Royal Society of Chemistry 2017