Comparison of Statistical Logistic Regression and RandomForest Machine Learning Techniques in Predicting Diabetes
Tahani Daghistani and Riyad Alshammari
Health Informatics Department, College of Public Health and Health Informatics, King Saud Bin Abdulaziz University for Health Sciences (KSAU-HS), King Abdullah International Medical Research Center (KAIMRC), Ministry of National Guard Health Affairs, Riyadh, KSA
Abstract
Diabetes is a global healthcare concern and one of the leading health challenges in Saudi Arabia. Its prevalence is anticipated to rise, and early prediction of individuals at high risk of diabetes remains a significant challenge. This study compares the RandomForest machine learning algorithm with the Logistic Regression algorithm for predicting diabetes. We analyzed 66,325 records extracted from the Ministry of National Guard Health Affairs (MNGHA) databases in Saudi Arabia between 2013 and 2015. Both algorithms were applied to predict diabetes based on 18 risk factors. The two algorithms were compared on precision, recall, true positive rate, false positive rate, F-measure, and area under the ROC curve (AUC). The overall prevalence of diabetes in the data set is 64.47%. Males represent 55.50% of the data set and females 44.50%. For the RandomForest (RF) model, the precision, recall, true positive rate, false positive rate, and F-measure for predicting diabetes were 0.883, 0.88, 0.88, 0.188, and 0.876, respectively, while for the Logistic Regression model they were 0.692, 0.703, 0.703, 0.454, and 0.675, respectively. The AUC was 0.944 for the RF model and 0.708 for the Logistic Regression model, which demonstrates higher predictive performance for RF. The RF algorithm showed superior performance over Logistic Regression in predicting diabetes across these metrics.
Index Terms
diabetes, predictive model, machine learning, RandomForest, logistic regression
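The evaluation criteria named in the abstract can be computed from a binary confusion matrix and a set of predicted scores. The sketch below is illustrative only: the `Confusion` class, the function names, and the example counts are assumptions introduced here, not the authors' code or data. The abstract also does not say how per-class values were averaged, so this shows the plain single-class definitions.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary confusion matrix (hypothetical container)."""
    tp: int  # diabetic, predicted diabetic
    fp: int  # non-diabetic, predicted diabetic
    fn: int  # diabetic, predicted non-diabetic
    tn: int  # non-diabetic, predicted non-diabetic


def binary_metrics(c: Confusion) -> dict:
    """Precision, recall (= true positive rate), false positive rate, F-measure."""
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    fpr = c.fp / (c.fp + c.tn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "fpr": fpr,
        "f_measure": f_measure,
    }


def auc(labels: list, scores: list) -> float:
    """Area under the ROC curve via its rank interpretation:
    the probability that a random positive is scored above a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up counts and scores, purely for illustration.
    print(binary_metrics(Confusion(tp=8, fp=2, fn=2, tn=8)))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

Recall and true positive rate are the same quantity, which is why the abstract reports identical values for both. The pairwise AUC computation is quadratic in the number of records, so for a data set of this size a sorted-rank implementation would be preferable.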
Cite: Tahani Daghistani and Riyad Alshammari, "Comparison of Statistical Logistic Regression and RandomForest Machine Learning Techniques in Predicting Diabetes," Journal of Advances in Information Technology, Vol. 11, No. 2, pp. 78-83, May 2020. doi: 10.12720/jait.11.2.78-83
Copyright © 2020 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.