Prediction of Essential Protein Using Machine Learning Technique

Author:- Md. Inzmam-Ul-Hossain and Md. Rafiqul Islam
Category:- Journal; Year:- 2021
Discipline:- Computer Science & Engineering Discipline
School:- Science, Engineering & Technology School

Abstract

Essential proteins are crucial for the survival and reproduction of organisms. Identifying them is important for understanding cellular function and for drug design. Essential proteins are predicted from protein-protein interaction (PPI) networks constructed using high-throughput techniques. Many existing techniques use computational methods to identify essential proteins. Most of them consider only topological features, while some consider both topological and biological features. In this paper, we propose a method that uses machine learning techniques for this purpose. The Saccharomyces cerevisiae dataset is used for essential protein prediction. Three classifiers, XGBoost, Random Forest, and decision tree, are used to predict the essential proteins. We also apply an ensemble method that combines these three classifiers. The ensemble method achieves the highest accuracy in identifying essential proteins compared with existing methods, whereas XGBoost achieves the highest F1 score.
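
The abstract does not state how the ensemble combines the three classifiers or how the metrics are computed. As a minimal sketch, assuming hard majority voting over binary labels (1 = essential, 0 = non-essential), the following shows how three sets of predictions could be combined and then scored with accuracy and F1. The function names and the toy label lists are hypothetical illustrations, not the paper's data or results.

```python
from collections import Counter


def majority_vote(*prediction_lists):
    """Combine per-classifier label lists by hard majority voting (assumed scheme)."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*prediction_lists)]


def accuracy(y_true, y_pred):
    """Fraction of proteins whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (essential) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration only (not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
preds_a = [1, 1, 0, 0, 0, 0, 0, 1]  # hypothetical output of classifier A
preds_b = [1, 0, 0, 0, 0, 0, 0, 0]  # hypothetical output of classifier B
preds_c = [1, 1, 1, 1, 1, 0, 0, 0]  # hypothetical output of classifier C

ensemble = majority_vote(preds_a, preds_b, preds_c)

for name, preds in [("A", preds_a), ("B", preds_b), ("C", preds_c), ("ensemble", ensemble)]:
    print(name, round(accuracy(y_true, preds), 3), round(f1_score(y_true, preds), 3))
```

Because essential proteins are typically a minority class, accuracy and F1 can rank methods differently, which is consistent with the abstract reporting one method as best on accuracy and another as best on F1.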
