Issue 3, 2015

lncRNA-MFDL: identification of human long non-coding RNAs by fusing multiple features and using deep learning

Abstract

Long noncoding RNAs (lncRNAs) are emerging as a novel class of noncoding RNAs and potent gene regulators that play important and varied roles in cellular functions. lncRNAs are closely associated with the occurrence and development of some diseases. High-throughput RNA-sequencing techniques combined with de novo assembly have identified a large number of novel transcripts. The discovery of large and ‘hidden’ transcriptomes urgently requires effective computational methods that can rapidly distinguish coding RNAs from long noncoding RNAs. In this study, we developed a predictor, named lncRNA-MFDL, that identifies lncRNAs by fusing multiple features (the open reading frame, k-mer, the secondary structure and the most-like coding domain sequence) and classifying them with deep learning algorithms. Using the same human training dataset and a 10-fold cross-validation test, lncRNA-MFDL achieves 97.1% prediction accuracy, which is 5.7, 3.7 and 3.4% higher than that of the CPC, CNCI and lncRNA-FMFSVM predictors, respectively. On testing datasets from other species (e.g., anole lizard, zebrafish, chicken, gorilla, macaque, mouse, lamprey, orangutan, xenopus and C. elegans), lncRNA-MFDL is also more effective and robust than the CPC and CNCI predictors. These results show that lncRNA-MFDL is a powerful tool for identifying lncRNAs. The lncRNA-MFDL software package is freely available at http://compgenomics.utsa.edu/lncRNA_MDFL/ for academic users.
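To make the idea of feature fusion concrete, the following is a minimal sketch of how two of the feature families named in the abstract (k-mer composition and the open reading frame) might be computed and concatenated into one vector. It is an illustration under stated assumptions, not the authors' implementation: the function names, the choice of k = 3, the forward-strand ATG-to-stop ORF definition and the two ORF descriptors are all hypothetical, and the secondary-structure features, the most-like coding domain sequence features and the deep learning classifier are omitted.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def kmer_frequencies(seq, k=3):
    """Normalized k-mer frequencies over ACGT, in lexicographic k-mer order.

    Assumption: windows containing ambiguous bases (e.g. N) are skipped.
    """
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def longest_orf_length(seq):
    """Length (nt, stop codon included) of the longest ATG..stop ORF.

    Assumption: only the three forward-strand reading frames are scanned.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def fuse_features(seq, k=3):
    """Concatenate ORF descriptors and k-mer frequencies into one vector."""
    orf_len = longest_orf_length(seq)
    orf_coverage = orf_len / len(seq) if seq else 0.0
    return [float(orf_len), orf_coverage] + kmer_frequencies(seq, k)


if __name__ == "__main__":
    features = fuse_features("ATGAAATAG")
    print(len(features), features[:2])
```

With k = 3 the fused vector has 2 + 64 = 66 entries. In a full pipeline, a vector of this kind would be computed for every transcript and passed to a classifier that is evaluated by 10-fold cross-validation.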



Article information

Article type: Paper
Submitted: 04 Nov 2014
Accepted: 06 Jan 2015
First published: 06 Jan 2015

Mol. BioSyst., 2015, 11, 892-897



X. Fan and S. Zhang, Mol. BioSyst., 2015, 11, 892, DOI: 10.1039/C4MB00650J
