Improved Protein Function Prediction by Combining Clustering with Ensemble Classification

Haneen Altartouri and Tobias Glasmachers
Ruhr-University Bochum, Germany

Abstract—Predicting protein functions is a challenging task in bioinformatics, and various machine learning algorithms have been applied to it. In this paper, we investigate whether clustering and ensembles of classifiers can improve prediction performance. Two approaches are proposed: the first uses clustering to build an ensemble of classifiers, while the second uses clustering to break the complex dataset down into sub-datasets and then trains an ensemble of different classifiers within each sub-dataset. We observed that this combination of clustering and classification improved prediction performance in most cases.
Index Terms—protein function classification, clustering, stacking, diverse classifiers
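
The second approach described in the abstract (cluster the data, then train an ensemble of different classifiers inside each sub-dataset) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the base classifiers, or the feature representation, so everything concrete here is an assumption made for illustration: k-means seeded with the first k points, three toy base learners (majority class, nearest class centroid, 1-nearest-neighbour), and plain majority voting in place of the stacking mentioned in the index terms. All function names are hypothetical and not taken from the paper.

```python
import math
from collections import Counter

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def nearest(points, x):
    """Index of the entry in `points` closest to x (first one wins ties)."""
    return min(range(len(points)), key=lambda i: dist(points[i], x))

def mean(rows):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def kmeans(X, k, iters=20):
    """Lloyd's k-means. Seeding with the first k points is an assumption."""
    centroids = [list(x) for x in X[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in X:
            groups[nearest(centroids, x)].append(x)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids

# Three deliberately different toy base learners (assumed, not from the paper).
def fit_majority(X, y):
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label

def fit_nearest_centroid(X, y):
    labels = sorted(set(y))
    cents = [mean([xi for xi, yi in zip(X, y) if yi == lab]) for lab in labels]
    return lambda x: labels[nearest(cents, x)]

def fit_one_nn(X, y):
    return lambda x: y[nearest(X, x)]

BASE_LEARNERS = (fit_majority, fit_nearest_centroid, fit_one_nn)

def fit_clustered_ensemble(X, y, k):
    """Cluster X, then train every base learner inside each non-empty cluster."""
    centroids = kmeans(X, k)
    assign = [nearest(centroids, x) for x in X]
    experts = []  # (centroid, models trained on that cluster's sub-dataset)
    for c in range(k):
        Xc = [x for x, a in zip(X, assign) if a == c]
        yc = [lab for lab, a in zip(y, assign) if a == c]
        if Xc:
            experts.append((centroids[c], [fit(Xc, yc) for fit in BASE_LEARNERS]))

    def predict(x):
        # Route x to the nearest cluster, then take a majority vote there.
        _, models = experts[nearest([cen for cen, _ in experts], x)]
        return Counter(m(x) for m in models).most_common(1)[0][0]

    return predict

# Toy data: two well-separated groups, each with its own pair of labels.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
y = ["a", "a", "b", "b", "c", "c", "d", "d"]
predict = fit_clustered_ensemble(X, y, k=2)
```

Calling `predict((0.9, 0.2))` routes the query to the cluster around the origin and lets only the classifiers trained on that sub-dataset vote. Each local ensemble sees a simpler sub-problem than a single global model would, which is the motivation the abstract gives for splitting the dataset.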

Cite: Haneen Altartouri and Tobias Glasmachers, "Improved Protein Function Prediction by Combining Clustering with Ensemble Classification," Journal of Advances in Information Technology, Vol. 12, No. 3, pp. 197-205, August 2021. doi: 10.12720/jait.12.3.197-205

Copyright © 2021 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.