ReCirc: prediction of circRNA expression and function through probe reannotation of non-circRNA microarrays†
Abstract
Growing evidence shows that circular RNAs (circRNAs) play important roles in physiological and pathological processes, but knowledge of circRNA function in disease remains limited. CircRNA functions are closely related to their expression levels. We developed a probe reannotation program, ReCirc, that identifies circRNAs from non-circRNA microarrays (any microarray not designed to profile circRNAs) by aligning microarray probe sequences to the body and back-spliced junction sequences of circRNAs. Using ReCirc, we obtained 39 818 reannotated probe set-circRNA pairs, involving 5388 circRNAs, from an Affymetrix human exon array. We evaluated the method by comparing ReCirc circRNAs with gold-standard RNase R-resistant (RNase R+) circRNAs predicted by the RNA-seq-based method find_circ in the HeLa cell line. ReCirc circRNAs, especially those with higher expression levels, were partially present in the RNase R+ data. Because circRNA microarrays, such as the Agilent-069978 Arraystar Human CircRNA microarray, are also used to predict and profile circRNAs, we compared the circRNA profile obtained from ReCirc with that from the circRNA microarray; circRNA expression was similar between the two in samples from the same tissue. We further compared ReCirc with find_circ in their ability to capture circRNA expression variation across multiple cell lines, and performed molecular verification in the HeLaS3 cell line for circRNAs that performed well in this comparison: 5 of the 9 randomly selected circRNAs were successfully verified. Functional analysis of identified circRNAs in 4 different cancers indicated that circRNAs may be crucial biomarkers for cancer diagnosis and prognosis.
Thus, ReCirc allows us to identify circRNAs from any non-circRNA microarray and to back-annotate old microarray data from public data sets, facilitating reuse of the wealth of existing microarray data to characterize circRNAs in tissues and cell lines. Note that the method is designed only for microarrays and cannot be applied to RNA-seq data.
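The probe reannotation idea summarized above (matching probes to the body and to the back-spliced junction of a circRNA) can be illustrated with a minimal Python sketch. This is not the ReCirc implementation: it substitutes exact substring matching for sequence alignment, ignores strand, and all names (`junction_sequence`, `classify_probe`, `reannotate`, `min_overhang`) and the toy sequences are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple


def junction_sequence(circ_seq: str, flank: int) -> str:
    """Join the last `flank` bases to the first `flank` bases of the
    linear circRNA sequence, mimicking the back-spliced junction."""
    return circ_seq[-flank:] + circ_seq[:flank]


def classify_probe(probe: str, circ_seq: str, min_overhang: int = 3) -> Optional[str]:
    """Return 'body', 'junction', or None for one probe/circRNA pair.

    Assumption: exact matching stands in for real alignment.
    """
    # Body probe: found within the linear (exon-derived) sequence.
    if probe in circ_seq:
        return "body"
    # Junction probe: found only across the joined end/start. With
    # flank = len(probe) - min_overhang, any hit in the junction string
    # necessarily covers at least `min_overhang` bases on each side.
    flank = len(probe) - min_overhang
    if min_overhang <= flank <= len(circ_seq):
        if probe in junction_sequence(circ_seq, flank):
            return "junction"
    return None


def reannotate(
    probes: Dict[str, str], circrnas: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Collect (probe_id, circRNA_id, match_type) pairs."""
    pairs = []
    for probe_id, probe_seq in probes.items():
        for circ_id, circ_seq in circrnas.items():
            kind = classify_probe(probe_seq, circ_seq)
            if kind is not None:
                pairs.append((probe_id, circ_id, kind))
    return pairs


if __name__ == "__main__":
    toy_circrnas = {"circA": "AAACCCGGGTTT"}  # toy sequence, not real data
    toy_probes = {"p1": "GTTTAAAC", "p2": "CCCGGG", "p3": "TTTTTTTT"}
    for pair in reannotate(toy_probes, toy_circrnas):
        print(pair)
```

In this sketch a body match alone cannot distinguish a circRNA from its linear host transcript, whereas a junction match is specific to the back-spliced form; a real pipeline would also need mismatch-tolerant alignment and handling of probes that map to multiple circRNAs.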