JAIT 2025 Vol.16(8): 1169-1177
doi: 10.12720/jait.16.8.1169-1177

Comparative Analysis of Thyroid Disease Prediction Models: A Study of Logistic Regression, Decision Tree, and Random Forest Approaches

Pravin Satyanarayan Metkewar 1, Diaa S. Metwally 2, Prakhar Kapoor 3, and Aafaq A. Rather 4,*
1. School of Computer Science and Engineering, Dr Vishwanath Karad MIT World Peace University, Pune, India
2. Department of Accounting, Faculty of Business, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia
3. Symbiosis Institute of Computer Studies and Research (SICSR), Symbiosis International (Deemed University) (SIU), Pune, India
4. Symbiosis Statistical Institute, Symbiosis International (Deemed University), Pune, India
Email: pravin.metkewar@gmail.com (P.S.M.); dmetwally@imamu.edu.sa (D.S.M.);
prakharkapoor5@gmail.com (P.K.); aafaq7741@gmail.com (A.A.R.)
*Corresponding author

Manuscript received February 12, 2025; revised April 9, 2025; accepted May 9, 2025; published August 18, 2025.

Abstract—Researchers examine a broad range of thyroid-related conditions with respect to patient outcomes and treatment effectiveness. Lifestyle modifications can significantly reduce the risk of developing certain thyroid disorders. Regrettably, the number of fatalities caused by various thyroid conditions has been rising. The thyroid is examined by endocrinologists or thyroid specialists, or by computed ultrasound. Most people, however, cannot afford these tests. Nowadays, the primary means of prolonging the life of a thyroid patient is medication. The goal of this study is to use machine learning to forecast a person’s risk of developing thyroid disease before they experience any symptoms or issues, so that the disease can be prevented from arising in the first place rather than treated afterwards. Basic patient data, including age, gender, blood pressure, hyperthyroidism, and hypothyroidism, are used in this prediction process. Endocrinologists use machine learning to help select the most appropriate treatment plan for each patient, and it has been demonstrated that machine learning algorithms can reliably generate accurate results based on input data. This study compares three methods, Logistic Regression, Decision Tree, and Random Forest, to ascertain which one yields the best results and efficacy levels. The authors found that Logistic Regression, with an accuracy of 94.04%, performed better than both of the other methods in this investigation. The results demonstrate the importance of algorithm selection and lay the groundwork for further studies to improve predictive models for better thyroid disease prevention and treatment.
 
Keywords—logistic regression, decision tree, random forest, supervised learning, thyroid disease

Cite: Pravin Satyanarayan Metkewar, Diaa S. Metwally, Prakhar Kapoor, and Aafaq A. Rather, "Comparative Analysis of Thyroid Disease Prediction Models: A Study of Logistic Regression, Decision Tree, and Random Forest Approaches," Journal of Advances in Information Technology, Vol. 16, No. 8, pp. 1169-1177, 2025. doi: 10.12720/jait.16.8.1169-1177

Copyright © 2025 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
