Chiang Mai Journal of Science

Print ISSN: 0125-2526 | eISSN : 2465-3845


Hierarchical Multi-label Associative Classification for Protein Function Prediction Using Gene Ontology

Sawinee Sangsuriyun, Thanawin Rakthanmanon and Kitsana Waiyamai
* Corresponding author; e-mail address: savinee.sa@ku.th
Volume: Vol.46 No.1 (January 2019)
Research Article
DOI:
Received: 28 June 2016, Revised: -, Accepted: 8 October 2018, Published: -

Citation: Sangsuriyun S., Rakthanmanon T. and Waiyamai K., Hierarchical Multi-label Associative Classification for Protein Function Prediction Using Gene Ontology, Chiang Mai Journal of Science, 2019; 46(1): 165-179.

Abstract

In this paper, protein function prediction is treated as a complex hierarchical multi-label classification problem: each instance can belong to several classes, and the classes are organized in a hierarchy linked by parent-child relationships. eHMAC is an extended Hierarchical Multi-label Associative Classification method that has been proposed for automated protein function prediction. The main objective of this paper is to improve both the accuracy and the explanatory ability of Hierarchical Multi-label Associative Classification (HMAC) in predicting the functions of new protein sequences. The idea is to use the Gene Ontology (GO) as background knowledge and integrate it into different steps of HMAC. The three domains of the Gene Ontology (molecular function, biological process, and cellular component) are used as background knowledge to generate high-quality classification rules for predicting protein functions. The experimental results showed that eHMAC, using this background knowledge, provided significantly better results than the previously proposed HMAC. Not only was the prediction accuracy greatly improved, but so was the explanatory ability of the function prediction model in terms of the associations between motifs and GO terms.
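
To make the hierarchical multi-label setting concrete, the following minimal Python sketch shows one common consistency requirement in such problems: if a protein is assigned a class, it is also assigned every ancestor of that class. This is an illustration only, not the eHMAC or HMAC algorithm; the toy hierarchy, the class names, and the helper `with_ancestors` are hypothetical and are not taken from the paper or from actual Gene Ontology records.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy hierarchy (assumption for illustration only):
# each class maps to the set of its direct parents, so a class may
# have more than one parent, as in a directed acyclic graph.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "binding": {"root"},
    "catalytic_activity": {"root"},
    "protein_binding": {"binding"},
    "kinase_activity": {"catalytic_activity"},
    "kinase_binding": {"protein_binding"},
}


def with_ancestors(labels: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the given labels together with all of their ancestors."""
    closed: Set[str] = set()
    stack = list(labels)
    while stack:
        term = stack.pop()
        if term in closed:
            continue  # already visited via another path
        closed.add(term)
        # Walk upward to every direct parent of this class.
        stack.extend(parents.get(term, ()))
    return closed


# One instance with two predicted classes (multi-label), expanded so
# that the label set is consistent with the hierarchy.
predicted = {"kinase_binding", "kinase_activity"}
consistent = with_ancestors(predicted, PARENTS)
```

Here `consistent` contains the two predicted classes plus `protein_binding`, `binding`, `catalytic_activity`, and `root`, which is what distinguishes a hierarchical multi-label prediction from a flat one.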

Keywords: protein function prediction, associative classification, hierarchical classification, multi-label classification, negative rules