SNRFCB: sub-network based random forest classifier for predicting chemotherapy benefit on survival for cancer treatment
Abstract
Adjuvant chemotherapy (CTX) should be individualized to provide potential survival benefit and avoid potential harm to cancer patients. Our goal was to establish a computational approach for making personalized estimates of the survival benefit from adjuvant CTX. We developed SNRFCB, a Sub-Network based Random Forest classifier for predicting Chemotherapy Benefit, using gene expression datasets of lung cancer. SNRFCB pre-selects gene sub-network signatures based on mutation data and protein–protein interaction data, and then applies the random forest algorithm to gene expression datasets. The approach was validated in independent test cohorts by identifying chemotherapy responder and chemotherapy non-responder cohorts. Adjuvant CTX was significantly associated with prolonged overall survival of lung cancer patients in the chemotherapy responder group (P = 0.008), but it was not beneficial to patients in the chemotherapy non-responder group (P = 0.657). Among patients with the squamous cell carcinoma (SQCC) subtype of lung cancer, adjuvant CTX was likewise significantly associated with prolonged overall survival in the chemotherapy responder cohorts (P = 0.024), but it was not beneficial to patients in the chemotherapy non-responder cohorts (P = 0.383). SNRFCB achieved better prediction performance than a support vector machine (SVM), another machine learning method. To test the general applicability of the predictive model, we further applied the SNRFCB approach to human breast cancer datasets and again observed superior performance. SNRFCB could provide a recurrence probability for individual patients and identify which patients may benefit from adjuvant CTX in clinical trials.
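The two-stage idea described above (network-guided gene pre-selection followed by a tree ensemble on expression values) can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the gene identifiers G1 to G5, the synthetic expression values, the rule of keeping mutated genes plus their direct interaction partners, the meaning of the binary label, and the function names. The ensemble of bootstrapped depth-one trees is a deliberately tiny stand-in for a random forest; in practice a library implementation would be used. How per-patient scores are turned into responder and non-responder groups is not specified in the abstract and is not shown here.

```python
import random

def select_subnetwork(mutated_genes, ppi_edges, measured_genes):
    """Hypothetical pre-selection: mutated genes plus their direct PPI partners,
    restricted to genes present in the expression data."""
    selected = set(mutated_genes)
    for a, b in ppi_edges:
        if a in mutated_genes or b in mutated_genes:
            selected.update((a, b))
    return sorted(selected & set(measured_genes))

def fit_stump(rows, labels, gene):
    """Depth-one tree: threshold one gene midway between the two class means."""
    pos = [r[gene] for r, y in zip(rows, labels) if y == 1]
    neg = [r[gene] for r, y in zip(rows, labels) if y == 0]
    if not pos or not neg:
        return None  # bootstrap sample contained a single class
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return gene, (mean_pos + mean_neg) / 2, mean_pos > mean_neg

def fit_forest(rows, labels, genes, n_trees=200, seed=0):
    """Simplified forest: each tree sees a bootstrap sample and one random gene."""
    rng = random.Random(seed)
    n, forest = len(rows), []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stump = fit_stump([rows[i] for i in idx],
                          [labels[i] for i in idx],
                          rng.choice(genes))
        if stump is not None:
            forest.append(stump)
    if not forest:
        raise ValueError("no usable trees; both classes must be present")
    return forest

def predict_proba(forest, row):
    """Fraction of trees voting for the positive class (label 1)."""
    votes = sum((row[g] > t) == high_is_pos for g, t, high_is_pos in forest)
    return votes / len(forest)

# Synthetic toy data (assumption): label 1 patients have higher G1 and G2.
rng = random.Random(42)
rows, labels = [], []
for i in range(40):
    y = i % 2
    rows.append({"G1": rng.gauss(2.0 * y, 0.5), "G2": rng.gauss(1.5 * y, 0.5),
                 "G3": rng.gauss(0, 1), "G4": rng.gauss(0, 1),
                 "G5": rng.gauss(0, 1)})
    labels.append(y)

mutated = {"G1"}                                      # hypothetical mutation data
ppi = [("G1", "G2"), ("G2", "G3"), ("G4", "G5")]      # hypothetical PPI edges
genes = select_subnetwork(mutated, ppi, rows[0].keys())
forest = fit_forest(rows, labels, genes)

p_high = predict_proba(forest, {"G1": 2.0, "G2": 1.5})
p_low = predict_proba(forest, {"G1": 0.0, "G2": 0.0})
print(genes, p_high, p_low)
```

Restricting the classifier to the pre-selected sub-network keeps the noise genes (G3 to G5 in this toy setup) out of the model entirely, which is the motivation for doing feature selection with prior biological knowledge before fitting the ensemble.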