JAIT 2024 Vol.15(7): 798-811
doi: 10.12720/jait.15.7.798-811

Classification of Obsessive-Compulsive Disorder Symptoms in Arabic Tweets Using Machine Learning and Word Embedding Techniques

Malak Fahad Al-Haider 1,*, Ali Mustafa Qamar 2, Hasan Shojaa Alkahtani 1, and Hafiz Farooq Ahmad 1
1. Computer Science Department, College of Computer Sciences and Information Technology (CCSIT),
King Faisal University, Al-Ahsa, Saudi Arabia
2. Department of Computer Science, College of Computer, Qassim University, Buraydah, Saudi Arabia
Email: Malak.alhaider@outlook.com (M.F.A.); al.khan@qu.edu.sa (A.M.Q.); hsalkahtani@kfu.edu.sa (H.S.A.); hfahmad@kfu.edu.sa (H.F.A.)
*Corresponding author

Manuscript received November 25, 2023; revised January 5, 2024; accepted April 11, 2024; published July 8, 2024.

Abstract—Obsessive-Compulsive Disorder (OCD) is a mental health condition characterized by persistent, intrusive thoughts, images, or impulses (obsessions) and by repetitive behaviors or mental acts (compulsions) aimed at reducing anxiety. These behaviors are typically rigid and performed according to specific rules; they can also be time-consuming and cause significant distress or impairment in daily functioning. Detecting the symptoms of OCD could help individuals become aware of them and seek a medical diagnosis. People increasingly rely on social media such as Twitter to express and disclose their feelings and thoughts. Although OCD has been studied in English-language text, there is no substantial work in this domain on Arabic tweets. This research therefore investigates the detection of OCD symptoms in Arabic tweets using Machine Learning (ML) and word embedding techniques. First, we obtained tweets via web scraping and annotated the data manually with the help of medical professionals. Second, we conducted exploratory data analysis on the text and the emojis used, to examine the correlation between them. We then focused on the quality of word representation, since efficient word representation approaches (word embeddings) combined with recent ML models have shown reasonable progress on text classification tasks. We trained our classification model using the Arabic version of fastText and tested the proposed models on our dataset. The analysis indicated that fastText is a particularly promising word embedding technique for this task.
Keywords—obsessive-compulsive disorder, machine learning models, text classification, word embedding, fastText
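
As a rough illustration of the general idea summarized in the abstract (representing each tweet with word embeddings and passing that representation to a classifier), the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the three-dimensional `TOY_VECTORS` are made-up stand-ins for pretrained Arabic fastText vectors, the labels `"ocd"` and `"non-ocd"` are hypothetical, and a nearest-centroid rule stands in for the ML models, which the abstract does not name.

```python
from collections import defaultdict
from math import sqrt

# ASSUMPTION: made-up 3-dimensional vectors standing in for pretrained
# Arabic fastText word vectors (real ones have far more dimensions).
TOY_VECTORS = {
    "أغسل": [0.9, 0.1, 0.0],    # "I wash"
    "يدي": [0.8, 0.2, 0.1],     # "my hands"
    "مرارا": [0.7, 0.3, 0.0],   # "repeatedly"
    "أحب": [0.1, 0.9, 0.2],     # "I love"
    "القهوة": [0.0, 0.8, 0.3],  # "coffee"
    "الصباح": [0.1, 0.7, 0.4],  # "the morning"
}


def embed(tweet, vectors, dim=3):
    """Average the vectors of the known tokens in a whitespace-split tweet."""
    known = [vectors[token] for token in tweet.split() if token in vectors]
    if not known:
        return [0.0] * dim  # no known token: fall back to the zero vector
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def fit_centroids(tweets, labels, vectors):
    """Compute one mean tweet embedding (centroid) per label."""
    grouped = defaultdict(list)
    for tweet, label in zip(tweets, labels):
        grouped[label].append(embed(tweet, vectors))
    return {
        label: [sum(column) / len(embs) for column in zip(*embs)]
        for label, embs in grouped.items()
    }


def predict(tweet, centroids, vectors):
    """Return the label whose centroid is most similar to the tweet embedding."""
    embedding = embed(tweet, vectors)
    return max(centroids, key=lambda label: cosine(embedding, centroids[label]))


# Hypothetical two-example training set with hypothetical label names.
train_tweets = ["أغسل يدي", "أحب القهوة"]
train_labels = ["ocd", "non-ocd"]
centroids = fit_centroids(train_tweets, train_labels, TOY_VECTORS)

print(predict("يدي مرارا", centroids, TOY_VECTORS))
print(predict("القهوة الصباح", centroids, TOY_VECTORS))
```

In a realistic setting, the toy dictionary would be replaced by vectors loaded from a pretrained Arabic embedding model, and the nearest-centroid rule by whichever classifiers are being compared. Averaging token vectors is only one common way to turn word embeddings into a fixed-length tweet representation.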

Cite: Malak Fahad Al-Haider, Ali Mustafa Qamar, Hasan Shojaa Alkahtani, and Hafiz Farooq Ahmad, "Classification of Obsessive-Compulsive Disorder Symptoms in Arabic Tweets Using Machine Learning and Word Embedding Techniques," Journal of Advances in Information Technology, Vol. 15, No. 7, pp. 798-811, 2024.

Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.