Olgac Özarslan,‡a Begum Kubra Tokyay,‡ai Cansu Soylemez,a Mehmet Tugrul Birtek,a Zihni Onur Uygun,d İpek Keles,e Begum Aydogan Mathyk,f Cihan Halicigilg and Savas Tasoglu*bchij
aGraduate School of Sciences & Engineering, Koc University, Istanbul, 34450, Turkiye
bKoç University Is Bank Artificial Intelligence Lab (KUIS AI Lab), Koç University, Sariyer, Istanbul 34450, Turkiye. E-mail: stasoglu@ku.edu.tr
cMechanical Engineering Department, School of Engineering, Koç University, Istanbul, 34450 Turkiye
dDepartment of Medical Biochemistry, Faculty of Medicine, Kafkas University, Kars 36100, Turkiye
eKoc University Hospital, Assisted Reproduction Unit, Istanbul, Turkiye
fDepartment of Obstetrics and Gynecology, Morsani College of Medicine, University of South Florida, Tampa, Florida 33606, USA
gYale University School of Medicine, Dept. Obstetrics, Gynecology & Reproductive Sciences, Yale University, Connecticut 06520, USA
hKoç University Arçelik Research Center for Creative Industries (KUAR), Koc University, Istanbul, 34450 Turkiye
iKoç University Translational Medicine Research Center (KUTTAM), Koc University, Istanbul, 34450 Turkiye
jBogazici Institute of Biomedical Engineering, Bogazici University, Istanbul, 34684 Turkiye
First published on 31st January 2025
The development of paper-based systems has revolutionized point-of-care (POC) applications by enabling rapid, robust, accurate, and sensitive biochemical analysis, infectious disease diagnosis, and fertility monitoring, particularly male fertility monitoring, offering portable, cost-effective alternatives to traditional methods. This innovation addresses the high cost and limited accessibility of male fertility testing in resource-poor settings. Male infertility, a significant global issue, is often stigmatized, which discourages men from seeking care. This study introduces a novel approach to male fertility testing using colorimetric analysis of paper-based assays, enhanced by synthetic imagery and the YOLOv8 (You Only Look Once) object detection algorithm. Synthetic imagery was employed to train and fine-tune YOLOv8, enhancing its capability to accurately detect color changes in paper-based tests. This colorimetric detection leverages smartphone imaging, making it both accessible and scalable. Initial experiments demonstrate that YOLOv8's precision and efficiency, when combined with synthetic data, significantly enhance the system's ability to recognize and analyze colorimetric signals, positioning it as a promising tool for male fertility POC diagnostics. In our study, we evaluated 39 semen samples for pH and sperm count using standard clinical tests and compared these results with a novel paper-based semen analysis kit. This kit utilizes reaction zones that exhibit color changes when exposed to semen samples, with images captured using a smartphone under varied lighting conditions. Despite a limited number of images, our synthetically trained YOLOv8 model achieved an accuracy of 0.86, highlighting its potential to improve the reliability of colorimetric analysis for both home and clinical use.
Paper-based sensor systems offer a solution to many existing challenges by enabling user-friendly sperm testing in the comfort and privacy of a patient's home.6 Additionally, paper-based microfluidics integrated into POC tests can provide low-cost, disposable, rapid, and sensitive sperm analysis7 to: (i) assist clinicians in initial infertility screening without imposing significant financial burdens on patients,8 (ii) facilitate male infertility diagnosis in resource-limited settings, and (iii) help alleviate concerns and stigma associated with male infertility testing.9–11 Colorimetric detection using a mobile phone camera is challenging due to variations in lighting, angle, and camera quality, which can lead to inconsistent color readings and reduced accuracy in quantitative analysis. Additionally, differences in color calibration across devices and ambient light interference can complicate the reliable detection and interpretation of color changes, especially for subtle variations.12 YOLO is a high-speed, accurate object detection model capable of recognizing and locating items in images, making it ideal for real-time applications. For colorimetric detection in home-based paper tests, YOLO can be fine-tuned to detect and analyze specific color changes or intensities on test strips, enabling accurate, automated interpretation of results from smartphone-captured images, even in variable lighting and environments.13 However, the number of images required to fine-tune a YOLO model effectively can vary depending on the complexity of the task, the variation within the dataset, and the accuracy desired. Generally, a minimum of 500–1000 labeled images per class is recommended for initial fine-tuning, but more images (often in the range of 5000–10000) can significantly improve performance, especially for challenging or nuanced detection tasks like colorimetric analysis.14
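As an illustration of this fine-tuning workflow, the sketch below uses the Ultralytics YOLOv8 API with a hypothetical dataset configuration file (semen_kit.yaml) and default hyperparameters; it is not the exact training recipe used in this work.

```python
# Minimal fine-tuning sketch with the Ultralytics YOLOv8 API.
# "semen_kit.yaml" is a hypothetical dataset description; the actual
# training settings used in this study may differ.
from ultralytics import YOLO

# Start from a small pretrained checkpoint so that fewer labeled images
# are needed for convergence.
model = YOLO("yolov8s.pt")

# Fine-tune on the custom colorimetric dataset; the built-in augmentations
# (HSV jitter, flips, scaling) partly compensate for lighting variation.
model.train(data="semen_kit.yaml", epochs=100, imgsz=640, batch=16)
```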
To create a YOLO model capable of accurately capturing custom sensing regions and assigning correct labels, the dataset should include high-resolution images of the paper-based sensors with bounding box annotations around each region of interest (ROI), labeled according to the analyte level or test result. Additionally, a variety of lighting conditions, color variations, augmented samples, and synthetic images (if necessary) will help the model adapt to real-world variations and detect subtle color changes effectively. Obtaining a dataset for a colorimetric semen test kit is challenging due to the need for precise color variations that represent different fertility levels, which may not be naturally abundant or easy to replicate. Additionally, capturing consistent, high-quality images under diverse real-world conditions, while also labeling and annotating each sensing region accurately, requires extensive time, specialized equipment, and rigorous quality control to ensure the model's effectiveness in various environments.
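For reference, the YOLO annotation convention stores one object per line as a class ID followed by box coordinates normalized to the image size; the helper below (with hypothetical names and values) illustrates the conversion from a pixel-space ROI box to such a label line.

```python
# Illustrative helper (hypothetical names): convert a pixel-space ROI box to
# a YOLO label line "class_id x_center y_center width height", with all
# coordinates normalized to the 0-1 range.
def roi_to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Example: a pH sensing region annotated on a 3000 x 4000 smartphone image.
print(roi_to_yolo_label(1, 1200, 1800, 1500, 2100, 3000, 4000))
```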
Unreal Engine and Unity are popular, powerful game engines widely used for creating 3D environments, simulations, and interactive experiences due to their sophisticated rendering and lighting capabilities, physics engines, and flexible development tools. They enable developers to design realistic scenes by combining advanced shaders, global illumination, and ray tracing techniques to simulate natural light behaviors. Through these tools, artists and developers can fine-tune lighting and textures, allowing for highly accurate visual renderings that are essential for realism in games, simulations, and even architectural visualization. We utilized Unity to generate synthetic images that closely mimic actual semen test kits, implementing custom shaders that procedurally vary the sensing regions, derived from images of real semen tests, under a range of lighting conditions.
Here, we present a paper-based colorimetric semen analysis sensor that accurately measures sperm count and pH, along with a mobile application that includes an ML-enabled colorimetric image analysis system, which together overcome several drawbacks associated with conventional semen testing and provide a low-cost, accurate, and user-friendly male fertility monitoring strategy. A laser cutter was used to fabricate multiple channels and reaction zones on Whatman filter paper. Reaction zones were chemically modified to produce color changes based on the sperm count and pH values of patient semen samples, and the resulting color changes were captured using a smartphone. The captured images are standardized by two pre-processing steps. YOLOv8 by Ultralytics was employed to detect and quantify color changes and map the corresponding labels, thereby minimizing inter-user variability in result interpretation; the pipeline of this study is shown in Fig. 1. Following the development of paper-based tests, samples with known pH and sperm count values (established using conventional clinical laboratory tests) were applied onto test strips. Images of the test strips were then captured using a smartphone at various orientations and lighting conditions. The sensing regions of these images were then used for procedural image generation to fine-tune the YOLOv8 model. Although the model achieved an accuracy of 0.86, the success of the classification approach is notable given the highly variable disturbances of smartphone imaging and the scarcity of the dataset. Synthetic images generated with computer graphics algorithms, combined with a fine-tuned YOLO-based model, prove to be a highly promising tool for paper-based colorimetry, since they eliminate laborious data-gathering processes. This system can revolutionize male fertility tracking, particularly in areas with limited access to healthcare resources, while simultaneously supporting clinicians in screening for male infertility.
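The two standardization steps are not reproduced here; as one plausible example of such a step (an illustrative assumption, not necessarily a step used in this study), a gray-world white-balance correction can reduce device- and lighting-dependent color casts before colorimetric analysis.

```python
# Illustrative only: gray-world white balance, a common way to standardize
# smartphone images before colorimetric analysis. This is an assumed example
# and not necessarily one of the two pre-processing steps used in this study.
import numpy as np
from PIL import Image

def gray_world_white_balance(path_in, path_out):
    img = np.asarray(Image.open(path_in).convert("RGB"), dtype=np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)   # mean of R, G, B channels
    gains = channel_means.mean() / channel_means      # scale each channel toward gray
    balanced = np.clip(img * gains, 0, 255).astype(np.uint8)
    Image.fromarray(balanced).save(path_out)

gray_world_white_balance("strip_raw.jpg", "strip_balanced.jpg")  # placeholder paths
```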
To further investigate the relationship between sperm concentration and MTT assay results, a sperm dilution series was prepared. Initial sperm concentrations were determined using a hemocytometer,16 and semen samples were subsequently diluted to a series of concentrations. Each diluted sample was applied to a paper-based semen assay developed for this study, as illustrated in Fig. 2C. A clear correlation was observed between formazan intensity and sperm concentration; higher sperm counts consistently produced more formazan, indicating greater metabolic activity.
Images captured with a smartphone were analyzed and validated using ImageJ software, and a calibration curve was constructed based on these results (Fig. 2D). From this calibration curve, the limit of detection (LOD) and limit of quantification (LOQ) for sperm concentration were determined to be 8.27 million sperm per mL and 25.3 million sperm per mL, respectively, underscoring the assay's sensitivity and quantitative capability.
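For context, LOD and LOQ are commonly estimated from a linear calibration fit as 3.3σ/S and 10σ/S, where σ is the residual standard deviation and S the slope; the sketch below illustrates that convention with placeholder data and is not a reproduction of the study's computation.

```python
# Calibration-curve fit with the common LOD/LOQ estimates
# (LOD = 3.3*sigma/slope, LOQ = 10*sigma/slope, sigma = residual standard
# deviation of the fit). The numbers below are placeholders, not the data
# reported in this study.
import numpy as np

conc = np.array([0.0, 10.0, 25.0, 50.0, 100.0, 200.0])     # million sperm per mL
intensity = np.array([2.0, 9.5, 21.0, 43.0, 84.0, 168.0])  # mean ROI intensity (ImageJ)

slope, intercept = np.polyfit(conc, intensity, 1)
residuals = intensity - (slope * conc + intercept)
sigma = residuals.std(ddof=2)                               # n - 2 degrees of freedom

lod = 3.3 * sigma / slope
loq = 10.0 * sigma / slope
print(f"LOD = {lod:.2f} million per mL, LOQ = {loq:.2f} million per mL")
```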
The assay results revealed distinct responses of the sensor to the range of tested chemicals. Only PBS, sodium citrate, and MgCl2 produced measurable signals, suggesting potential interaction with the sensor. Notably, these substances displayed only modest responses relative to the baseline, implying a level of specificity in the sperm sensor's detection of sperm count. Conversely, chymotrypsin, urea, KCl, glycine, glucose, CaCl2, BSA, Na2SO4, and NaCl exhibited minimal or no response, indicating limited or no interaction (Fig. 2E). This response highlights the sensor's selectivity towards metabolically active sperm, as expected.
Deionized water served as a blank to establish the baseline responses in Fig. 2E; the unadjusted results are shown in ESI† Fig. S5, providing a complete view of the responses to each analyte tested. This comprehensive dataset underscores the sensor's ability to selectively detect metabolically active sperm within complex biological matrices.
YOLOv8 is an object detection model that operates in a single-stage process, directly predicting bounding boxes and class probabilities for objects in an image. It efficiently extracts features from the input image using convolutional layers and then generates bounding box predictions and class probabilities using a prediction head where non-maximum suppression is applied to filter out redundant detections, resulting in a set of bounding boxes with associated class labels and confidence scores.17 YOLOv8 is known for its speed, accuracy, flexibility, and scalability, making it suitable for various real-time object detection applications.
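A minimal inference sketch with the Ultralytics API is shown below; the confidence and IoU thresholds govern which boxes survive non-maximum suppression, and the weights and image file names are placeholders for a fine-tuned checkpoint and a smartphone photo.

```python
# Minimal inference sketch: run a fine-tuned YOLOv8 model on a smartphone
# photo and print the surviving detections. "best.pt" and the image path
# are placeholders.
from ultralytics import YOLO

model = YOLO("best.pt")
results = model.predict("strip_photo.jpg", conf=0.25, iou=0.45)

for box in results[0].boxes:                   # one Results object per image
    class_name = model.names[int(box.cls)]     # mapped label (e.g. a pH bin)
    print(class_name, float(box.conf), box.xyxy.tolist())
```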
To enable the model to detect classes it has not seen before, a representative collection of images containing the target is required in a compatible format (each image paired with a .txt file containing the class ID, the bounding box coordinates, and the bounding box size, all normalized between 0 and 1), and the model must be trained on this dataset starting from a pre-trained model. The model's performance on the validation and testing sets is then evaluated to assess accuracy. Detecting custom classes therefore relies on a format-compliant dataset and data augmentation techniques to ensure that the dataset captures the desired features and that the model learns them effectively.18 In our case, obtaining a dataset with a balanced class distribution is challenging since normal semen pH clusters within the 7.2–8 range and sperm count clusters within 15–200 million,19 yet images with pH and sperm count values outside these ranges are needed in similar numbers. Beyond the challenge of obtaining well-distributed class instances, gold-standard semen analysis data require trained operators, a significant number of donors, time-consuming laboratory procedures, and expensive laboratory equipment. In addition, varying lighting conditions affect the bitwise spatial information of the captured image. Therefore, sample images need to be collected under varying lighting conditions to enable accurate prediction.
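The dataset description and evaluation can be expressed compactly with the Ultralytics API; the sketch below writes a hypothetical configuration with the five label bins used in this work (three pH ranges and two sperm-count ranges, with assumed class identifiers and paths) and evaluates a fine-tuned checkpoint on the held-out test split.

```python
# Sketch: write a hypothetical Ultralytics data configuration for the five
# label bins (three pH ranges, two sperm-count ranges) and evaluate a
# fine-tuned model on the test split. Paths and class names are assumptions.
from ultralytics import YOLO

data_yaml = """\
path: semen_kit_dataset
train: images/train
val: images/val
test: images/test
names:
  0: pH_5_7
  1: pH_7_8
  2: pH_8_10
  3: count_below_10M
  4: count_above_10M
"""
with open("semen_kit.yaml", "w") as f:
    f.write(data_yaml)

model = YOLO("best.pt")                                  # fine-tuned weights (placeholder)
metrics = model.val(data="semen_kit.yaml", split="test")
print(metrics.box.map50)                                 # mean average precision at IoU 0.5
```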
Procedural image generation can be a valuable tool for augmenting or creating training datasets and improving the generalization capabilities of YOLOv8 models. By generating synthetic images with diverse variations of the target objects, one can increase the model's exposure to different scenarios and reduce the risk of overfitting. Techniques like GANs (generative adversarial networks) can be used to create realistic-looking images that mimic the appearance and variations of real-world objects.20 This can be particularly useful when dealing with limited or biased training data, as it allows us to artificially expand the dataset and introduce new variations that might not be present in the original images. By incorporating procedurally generated images into the training process, YOLOv8's ability to detect and localize target objects in a wider range of conditions can be enhanced.21
To procedurally generate images that preserve the disturbances and dynamics of the real images and comply with the YOLOv8 convention, we organized our dataset with 3 labels for pH (5 < pH < 7, 7 < pH < 8, and 8 < pH < 10) and 2 labels for sperm count (sperm count <10 million and sperm count >10 million); we manually extracted the regions of interest for each label and combined the cropped sensing regions falling into the same label interval in the same group (Fig. 3A). A texture transformation function is employed to manipulate texture coordinates with different mapping algorithms (Fig. S9†), resulting in varying flow patterns and color distributions that mimic the spatial and bitwise information of the real sample (Fig. 3B). These transformed textures are then inserted into the corresponding locations on top of the 3D paper body mesh. A virtual camera is placed to obtain a top-down view of the generated 3D sample and saves the rendered result as an image file to create the desired dataset (Fig. 3C). Game engines like Unity and Unreal Engine employ a variety of techniques to simulate realistic lighting in virtual worlds, including ray tracing for highly accurate and realistic effects, hybrid rendering for a balance of quality and performance, rasterization for speed and efficiency, screen-space ambient occlusion for depth and realism, lightmaps for pre-calculated lighting, and deferred shading for efficient handling of multiple light sources. By carefully selecting and combining these methods, game engines can create visually convincing and immersive lighting effects. With a bare-minimum virtual scene and custom-designed algorithms in Unity, we were therefore able to control various parameters of the paper sensor, such as the texture array index (spread pattern type), shear strength, and shear rotation inclination (left-right, up-down), as well as lighting conditions such as the incident light angle, intensity, hue, and saturation. As a result, we obtained 2500 images capturing the target domain, approximately mimicking the lateral flow patterns and real-life disturbances, in a format applicable to YOLO training (Fig. 3D).
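For readers without access to a game engine, the sketch below shows a deliberately simplified 2D analogue of this pipeline (placeholder file names and class ID): a cropped sensing-region texture is perturbed in brightness, color, and shear, pasted onto a paper template, and written out with a matching YOLO label. It is not the Unity implementation used in this work.

```python
# Simplified, engine-free analogue of the procedural generation pipeline:
# perturb a cropped sensing-region texture, paste it onto a paper template,
# and emit a YOLO-format label. File names and the class ID are placeholders.
import random
from PIL import Image, ImageEnhance

def make_synthetic_sample(template_path, roi_path, class_id, out_stem,
                          paste_xy=(220, 150)):
    template = Image.open(template_path).convert("RGB")
    roi = Image.open(roi_path).convert("RGB")

    # Mimic lighting variation with brightness and color jitter.
    roi = ImageEnhance.Brightness(roi).enhance(random.uniform(0.7, 1.3))
    roi = ImageEnhance.Color(roi).enhance(random.uniform(0.8, 1.2))

    # Mimic flow-pattern variability with a small affine shear.
    shear = random.uniform(-0.15, 0.15)
    roi = roi.transform(roi.size, Image.AFFINE, (1, shear, 0, 0, 1, 0),
                        fillcolor=(255, 255, 255))

    template.paste(roi, paste_xy)
    template.save(f"{out_stem}.jpg")

    # YOLO label: class ID plus the pasted box, normalized to image size.
    W, H = template.size
    w, h = roi.size
    xc, yc = (paste_xy[0] + w / 2) / W, (paste_xy[1] + h / 2) / H
    with open(f"{out_stem}.txt", "w") as f:
        f.write(f"{class_id} {xc:.6f} {yc:.6f} {w / W:.6f} {h / H:.6f}\n")

for i in range(2500):  # match the synthetic dataset size reported above
    make_synthetic_sample("paper_template.jpg", "roi_ph_7_8.png", 1,
                          f"synthetic_{i:04d}")
```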
When test images were used as input, the model achieved accuracies of 0.71 for 5 < pH < 7, 0.875 for 7 < pH < 8, and 0.75 for 8 < pH < 10 on the protocol-compliant dataset and predicted the sperm count labels without any error; on the challenging dataset, however, it struggled to assign the correct labels in the 7 < pH < 8 range, achieving 0.75 for 5 < pH < 7 and only 0.24 for 7 < pH < 8 (Fig. 4A). In addition, the model accurately detected regions of interest (ROIs) and produced well-bounded boxes (Fig. 4B and C). The results highlight the significance of the testing conditions and the limitations of using synthetic data for training. The YOLOv8s model, trained solely on synthetic images, demonstrated promising accuracy when the waiting time and protocols were strictly followed, as observed in the protocol-compliant dataset test. The challenging dataset, however, introduced additional complexity. This dataset used artificial solutions with varying pH values, interfaced with the instrument without adhering to a standardized protocol. Moreover, the image quality varied significantly: images were captured less than 3 minutes after interfacing, often while the solutions were still wet (Fig. 4C). These factors, combined with inconsistent lighting and environmental conditions, contributed to the model's poor accuracy, particularly for the 7 < pH < 8 range, underscoring the importance of controlled conditions.
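The per-class accuracies above amount to a simple tally of correct predictions per true label; the snippet below shows that computation with placeholder label lists rather than the study's actual predictions.

```python
# Per-class accuracy tally; the label lists are placeholders, not the
# predictions produced in this study.
from collections import defaultdict

true_labels = ["pH_5_7", "pH_7_8", "pH_7_8", "pH_8_10", "count_above_10M"]
pred_labels = ["pH_5_7", "pH_7_8", "pH_5_7", "pH_8_10", "count_above_10M"]

correct, total = defaultdict(int), defaultdict(int)
for t, p in zip(true_labels, pred_labels):
    total[t] += 1
    correct[t] += int(t == p)

for label, n in total.items():
    print(f"{label}: accuracy = {correct[label] / n:.2f}")
```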
Our approach, utilizing a YOLOv8 model trained on synthetic images, effectively demonstrates the potential of this methodology for colorimetric sperm analysis. The results suggest that a more sophisticated virtual scene setup could yield higher classification accuracy. In our case, the difficulty of obtaining gold-standard sperm image data in large quantities is mitigated by algorithmically generating sample images that preserve the significant features of actual images. Procedural image generation emerged as a powerful tool for creating a representative dataset and became essential for tackling the various challenges of colorimetric sperm analysis using mobile phone cameras. We explored a single-step solution using paper-based colorimetric sensing; it holds significant promise to reduce the workload associated with time-consuming procedures for obtaining ground-truth samples while simplifying real-life applications, eventually leading to more widely adopted continuous health-monitoring platforms.22
Our study emphasizes the significance of integrating cutting-edge ML methodologies with breakthrough diagnostic technology to address urgent healthcare concerns. We further demonstrated an effective approach for enabling colorimetric detection that requires a minimal number of precious samples and minimal reliance on expensive laboratory hardware and trained personnel. By relying on a YOLO model fine-tuned with synthetic data, generated with bare-minimum rendering settings and a virtual scene setup, we achieved an accuracy of 0.86 when the test protocol was strictly followed. We aim to increase the early detection of male infertility and, by bridging the accessibility and accuracy gap, ultimately give hope to millions of couples trying to start families. These paper-based diagnostic kits can be enhanced and made more comprehensive through additional research and development, making them more powerful tools in the global fight against infertility.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4sd00348a
‡ These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2025 |