JAIT 2024 Vol.15(1): 27-32
doi: 10.12720/jait.15.1.27-32

Coronary Heart Disease Prediction: A Comparative Study of Machine Learning Algorithms

Ahmad Hammoud *, Ayman Karaki *, Reza Tafreshi, Shameel Abdulla, and Md Wahid
Mechanical Engineering Department, Texas A&M University at Qatar, Doha, Qatar
Email: ahmad.hammoud@qatar.tamu.edu (A.H.); ayman.karaki@qatar.tamu.edu (A.K.); reza.tafreshi@qatar.tamu.edu (R.T.); Shameel.abdulla@qatar.tamu.edu (S.A.); md.wahid@qatar.tamu.edu (M.W.)
*Corresponding author

Manuscript received June 12, 2023; revised August 3, 2023; accepted September 29, 2023; published January 3, 2024.

Abstract—Improving the precision of heart disease detection methods is crucial for reducing the high healthcare costs associated with diagnosis. Extracting patterns from medical data can reveal associations that improve heart disease diagnosis techniques. This study aims to construct an efficient machine learning model to act as a reliable component of a medical decision support system. Seven machine learning models are explored for heart disease classification: Logistic Regression, Support Vector Classifier, K-Nearest Neighbor (KNN), Random Forest, Decision Tree, Naïve Bayes, and Gradient Boosting Classifier. Hyperparameters for these algorithms are optimized with three techniques: Grid Search, Random Search, and Bayes Search. Each model’s performance is assessed by specificity, sensitivity, and F1-score on a dataset with 12 attributes and 1189 observations from three medical clinics (Cleveland, Statlog, Hungary). Feature selection methods, including the wrapper method, embedded method Chi-Squared, and variance analysis, are deployed to identify highly correlated features, ultimately reducing the data’s dimensionality to 7 features. The evaluation employs 10-fold cross-validation, which shows that the Random Forest model achieves the highest average accuracy at 92.85%, surpassing the previously reported 86.9%. Additionally, 10-fold cross-validation ensures the models’ reliability and resilience to data imbalance. Ensemble-based methods reaffirm the Random Forest’s superior performance in diagnosing heart diseases, with an accuracy of 94.96%. In sum, the developed model exhibits reliability in heart disease classification and presents a promising solution for medical applications that can mitigate diagnostic costs and time constraints.
 
Keywords—applied machine learning, coronary heart disease, random forest
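
The abstract names sensitivity, specificity, and F1-score as the evaluation metrics and 10-fold cross-validation as the evaluation protocol. The sketch below illustrates how such quantities could be computed in plain Python. It is not the authors' implementation: the function names `binary_metrics` and `stratified_folds` are hypothetical, the label encoding (1 = disease, 0 = no disease) is assumed, and the stratification of folds by class is an added assumption that the abstract does not state.

```python
import random
from collections import defaultdict


def binary_metrics(y_true, y_pred):
    """Return sensitivity, specificity, and F1 for 0/1 labels (assumed: 1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on the positive class
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on the negative class
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


def stratified_folds(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Indices of each class are shuffled and dealt round-robin into k folds,
    so every fold keeps roughly the overall class ratio (an assumption made
    for this sketch, not a detail taken from the paper).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)

    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx


if __name__ == "__main__":
    # Toy labels only; these are not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
    print(binary_metrics(y_true, y_pred))
    for train_idx, test_idx in stratified_folds([0] * 50 + [1] * 50, k=10):
        print(len(train_idx), len(test_idx))
```

In a full pipeline, a classifier would be fitted on each `train_idx` split and scored with `binary_metrics` on the matching `test_idx` split, and the per-fold scores would then be averaged.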

Cite: Ahmad Hammoud, Ayman Karaki, Reza Tafreshi, Shameel Abdulla, and Md Wahid, "Coronary Heart Disease Prediction: A Comparative Study of Machine Learning Algorithms," Journal of Advances in Information Technology, Vol. 15, No. 1, pp. 27-32, 2024.

Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.