JAIT 2022 Vol.13(5): 486-502
doi: 10.12720/jait.13.5.486-502
Effect of Features Extraction Techniques on Cyberstalking Detection Using Machine Learning Framework
Arvind Kumar Gautam and Abhishek Bansal
Department of Computer Science, Indira Gandhi National Tribal University, Amarkantak, M.P., India
Abstract—Cybercriminals pursue preplanned agendas to carry out cybercrimes on the Internet. Cyberstalking, cyberbullying, cyber terrorism, hacking, data leakage, identity theft, phishing, and other types of cyber harassment occur continually in the virtual world. Cyberstalking and cyberbullying are close in content and intent: both use the same internet-based technology to harass, bully, and undermine others online. This paper implements a cyberstalking detection model and analyzes the effect of various feature extraction techniques on different machine learning classifiers for cyberstalking detection. For feature extraction, the proposed model applies Bag of Words (BOW), TF-IDF, Word2Vec, FastText, GloVe, ELMo, and BERT. Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Random Forest (RF), Naive Bayes (NB), and Decision Tree (DT) are used for classification. The effect of each feature extraction method on detection performance is determined from the results of each classifier paired with that method. Experimental results show that BOW and TF-IDF outperformed the advanced word embedding-based feature extraction methods. BOW with LR achieved the highest accuracy (95.7%), the highest precision (97.9%), and the highest F-score (97.3%). TF-IDF with NB achieved the highest recall (99.8%). The SVM classifier achieved the second-highest accuracy, 95.2%, with TF-IDF. BERT reached maximum accuracies of 90.9% and 90.7% with LR and SVM, respectively. ELMo also performed well, with maximum accuracies of 90.5% and 90.2% with LR and SVM, respectively. The SkipGram model of Word2Vec gave an accuracy of 85% with the LR classifier. GloVe gave 81.2% accuracy with the RF classifier. The SkipGram and CBOW models of FastText gave 85.7% and 82.2% accuracy, respectively, with the RF classifier.
Index Terms—features extraction, word embedding, machine learning, cyberstalking detection, cyberbullying, bag of words, TF-IDF, Word2Vec, GloVe, FastText, ELMo, BERT
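For readers unfamiliar with the two feature extraction methods that performed best in the abstract, the following is a minimal sketch of Bag of Words and TF-IDF in plain Python. It is not the authors' implementation: the whitespace tokenizer, the unsmoothed weighting tf = count / document length and idf = log(N / df), the function names, and the toy messages are all assumptions made for illustration. Library implementations commonly use smoothed or normalized variants, and the page does not state which variant the paper used.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    # Sorted vocabulary gives every document the same column order.
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using unsmoothed idf = log(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weights = []
    for row in counts:
        length = sum(row) or 1  # guard against empty documents
        weights.append([(c / length) * w for c, w in zip(row, idf)])
    return vocab, weights


# Hypothetical toy messages, not taken from the paper's dataset.
messages = [
    "i will find you",
    "see you at lunch",
    "i will find you i will",
]
vocab, bow = bag_of_words(messages)
_, tfidf = tf_idf(messages)
```

In this toy corpus the word "you" appears in every message, so its TF-IDF weight is zero everywhere, while its BOW count is not. That difference in how common words are weighted is the main distinction between the two representations; either set of vectors could then be passed to a classifier such as LR or SVM.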
Cite: Arvind Kumar Gautam and Abhishek Bansal, "Effect of Features Extraction Techniques on Cyberstalking Detection Using Machine Learning Framework," Journal of Advances in Information Technology, Vol. 13, No. 5, pp. 486-502, October 2022.
Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.