Issue 17, 2014

Classification of multi-family enzymes by multi-label machine learning and sequence-based descriptors

Abstract

Multi-family enzymes are of great importance in life, disease and other domains. However, when enzymes are classified, multi-family enzymes are typically removed from the dataset because traditional single-label prediction methods cannot handle them. To predict the multiple classes of multi-family enzymes, we combined two multi-label learning algorithms, RAkEL-RF and MLKNN, with two types of protein descriptors, CTD and PseAAC, to build four predictors: RAkEL-RF-CTD, RAkEL-RF-PseAAC, MLKNN-CTD and MLKNN-PseAAC. When the four predictors were evaluated on the training set with 10-fold cross-validation, the overall success rates reached 97.99%, 96.07%, 96.01% and 95.31%, respectively. On the independent test set, the corresponding rates were 97.57%, 95.03%, 95.9% and 93.9%, respectively. The very small difference between the two sets for each predictor, together with the high accuracy, demonstrates the prediction capability and robustness of the predictors. In addition, the RAkEL-RF-CTD predictor correctly classified three of seven pairs of homologous enzymes with different functions and eighteen of twenty-three distantly related enzymes belonging to a similar family. These results indicate that the predictors are broadly applicable.
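As an illustration of what a sequence-based descriptor such as CTD looks like, the following minimal sketch computes the composition (C) and transition (T) parts for a single physicochemical property. It assumes the commonly used three-group hydrophobicity partition of the amino acids; the abstract does not state which properties or settings were used, so the grouping and the function names (encode, composition, transition, ctd_ct) are hypothetical and are not the authors' implementation. The distribution (D) part is omitted for brevity.

```python
# Illustrative sketch only: CTD composition (C) and transition (T) features
# for one property. The grouping is an assumption (a commonly used
# hydrophobicity partition), not a setting taken from the article.
HYDROPHOBICITY_GROUPS = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}


def encode(sequence, groups=HYDROPHOBICITY_GROUPS):
    """Map each residue to its group symbol; unknown residues are skipped."""
    lookup = {aa: g for g, members in groups.items() for aa in members}
    return "".join(lookup[aa] for aa in sequence.upper() if aa in lookup)


def composition(encoded):
    """Fraction of residues that fall into each of the three groups."""
    n = len(encoded)
    return [encoded.count(g) / n if n else 0.0 for g in "123"]


def transition(encoded):
    """Fraction of adjacent residue pairs that switch between two groups."""
    pairs = list(zip(encoded, encoded[1:]))
    n = len(pairs)
    features = []
    for a, b in (("1", "2"), ("1", "3"), ("2", "3")):
        count = sum(1 for p in pairs if p in ((a, b), (b, a)))
        features.append(count / n if n else 0.0)
    return features


def ctd_ct(sequence):
    """Concatenate the composition and transition features (6 values)."""
    encoded = encode(sequence)
    return composition(encoded) + transition(encoded)
```

Calling ctd_ct on a protein sequence returns a fixed-length numeric vector regardless of sequence length, which is what allows descriptors of this kind to be fed to learners such as the multi-label algorithms named above.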

Article information

Article type: Paper
Submitted: 24 May 2014
Accepted: 16 Jun 2014
First published: 17 Jun 2014

Anal. Methods, 2014, 6, 6832-6840

Y. Wang, R. Jing, Y. Hua, Y. Fu, X. Dai, L. Huang and M. Li, Anal. Methods, 2014, 6, 6832. DOI: 10.1039/C4AY01240B
