A web-based automated machine learning platform to analyze liquid biopsy data†
Abstract
Liquid biopsy (LB) technologies continue to improve in sensitivity, specificity, and multiplexing, and can measure an ever-growing library of disease biomarkers. However, clinical interpretation of the increasingly large datasets these technologies generate remains a challenge. Machine learning is a popular approach for discovering and detecting signatures of disease, but limited machine learning expertise in the LB field has kept the discipline from fully leveraging these tools and risks improper analyses and irreproducible results. In this paper, we develop a web-based automated machine learning tool tailored specifically for LB, in which machine learning models are built without requiring input from the user. We also incorporate a differential privacy algorithm designed to limit the overfitting that can arise when users iteratively develop a panel using feedback from our platform. We validate our approach by performing a meta-analysis on 11 published LB datasets and find that it achieves similar or better performance than that reported in the literature. Moreover, we show that our platform's performance improves when it incorporates information from prior LB datasets, suggesting that this approach can continue to improve with increased access to LB data. Finally, we show that our platform can match the results achieved in the literature using only 40% of the subjects in the training set, potentially reducing study cost and time. This self-improving and overfitting-resistant automated machine learning platform provides a new standard that can be used to validate machine learning work in the LB field.
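The abstract does not say which differential privacy algorithm the platform uses. As an illustration only, the sketch below shows a simplified Thresholdout-style reusable holdout, one well-known differential-privacy-based way to limit overfitting when a user repeatedly queries a held-out set. The function name, the default `threshold` and `sigma` values, and the use of a single shared noise scale are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def thresholdout(train_score, holdout_score, threshold=0.04, sigma=0.01, rng=None):
    """Return a feedback score that limits information leaked from the holdout set.

    Simplified illustration (hypothetical parameters): the full algorithm also
    tracks a query budget and refreshes the threshold noise, both omitted here.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Compare the train/holdout gap against a noisy threshold.
    gap = abs(train_score - holdout_score)
    if gap > threshold + rng.laplace(0.0, sigma):
        # The scores disagree, which suggests overfitting to the training set:
        # reveal the holdout score, but only with Laplace noise added.
        return holdout_score + rng.laplace(0.0, sigma)
    # The scores agree: report the training score, so nothing new about the
    # holdout set is revealed to the user.
    return train_score


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical accuracies for two candidate panels.
    print(thresholdout(0.80, 0.81, rng=rng))  # scores agree
    print(thresholdout(0.90, 0.60, rng=rng))  # scores disagree
```

In this sketch, a user who iterates on a panel sees the true holdout performance only when it departs noticeably from the training performance, and even then only a noisy version of it. This is what makes repeated feedback less likely to leak the holdout labels into the model-selection loop.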