JAIT 2025 Vol.16(6): 854-868
doi: 10.12720/jait.16.6.854-868

Beyond Classical Approaches: Fine-Tuning Clinical BERT Model on Structured Data for Alzheimer’s Disease Diagnosis

Hager Saleh 1,2,3,*, Michael McCann 4, John G. Breslin 1,5, Shaker El-Sappagh 6,7, and for the Alzheimer’s Disease Neuroimaging Initiative **
1. Insight Research Ireland Centre for Data Analytics, University of Galway, Galway H91 TK33, Ireland
2. Atlantic Technological University, Letterkenny, Ireland
3. Faculty of Computers and Artificial Intelligence, Hurghada University, Hurghada, Egypt
4. Department of Computing, Atlantic Technological University, Letterkenny, Ireland
5. School of Engineering, University of Galway, Galway H91 TK33, Ireland
6. Faculty of Computer Science and Engineering, Galala University, Suez, Egypt
7. Faculty of Computers and Artificial Intelligence, Benha University, Banha, Egypt
Email: hager.saleh.fci@gmail.com (H.S.); Michael.McCann@atu.ie (M.M.); john.breslin@universityofgalway.ie (J.G.B.); sh.elsappagh@gmail.com (S.E.-S.)
*Corresponding author
** Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu).

Manuscript received February 11, 2025; revised March 11, 2025; accepted March 27, 2025; published June 19, 2025.

Abstract—Alzheimer’s Disease (AD) is a neurodegenerative disorder requiring early diagnosis for effective intervention. This study presents a comprehensive evaluation of classical machine learning models, fine-tuned BERT-based models, and Large Language Models (LLMs) for AD diagnosis using structured data. The comparative analysis demonstrates that fine-tuned BERT-based models, particularly BioBERT and ClinicalBERT, achieve superior performance, with BioBERT attaining an accuracy of 94.48%, precision of 94.64%, recall of 94.48%, and an F1-score of 94.52% based only on structured numerical data. These models outperform classical approaches such as random forest and AdaBoost, which achieved accuracies of 91.87% and 91.41%, respectively. In contrast, LLMs, including GPT-4 and Llama 3.2, exhibited suboptimal results, with GPT reaching a maximum accuracy of 53.988% in the zero-shot setting, highlighting their limitations in handling structured clinical data without extensive fine-tuning. The study highlights the importance of domain-specific fine-tuning, as general-purpose LLMs struggle with structured data due to their reliance on prompt engineering. While traditional ML models provide a solid baseline, fine-tuned BERT models offer enhanced diagnostic capabilities by capturing intricate data relationships.
 
Keywords—Alzheimer’s disease diagnosis, Large Language Model (LLM), language model, in-context learning

Cite: Hager Saleh, Michael McCann, John G. Breslin, Shaker El-Sappagh, and Alzheimer’s Disease Neuroimaging Initiative, "Beyond Classical Approaches: Fine-Tuning Clinical BERT Model on Structured Data for Alzheimer’s Disease Diagnosis," Journal of Advances in Information Technology, Vol. 16, No. 6, pp. 854-868, 2025. doi: 10.12720/jait.16.6.854-868

Copyright © 2025 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
