TEXTWISE: Text Exploration through Interactive Natural Language Processing for Wide-Ranging Insights and Semantic Exploration

JAIT 2025 Vol.16(8): 1048-1060
doi: 10.12720/jait.16.8.1048-1060

Hema Pandey 1, Benjamin Kloepper 1, Ruben Huehnerbein 2, Muskaan Singh 3, Binh Vu 1, Sina Mehraeen 1, Mehrdad Jalali 1, and Swati Chandna 1,*

1. Applied Data Science and Analytics, SRH University Heidelberg, Heidelberg, Germany
2. Industrial Data Analytics, ABB Corporate Research, Ladenburg, Germany
3. School of Computing, Engineering and Intelligent Systems, Ulster University, Derry/Londonderry, UK
Email: hemapandey.srh@gmail.com (H.P.); kloepper@posteo.de (B.K.); ruben.huehnerbein@de.abb.com (R.H.); m.singh@ulster.ac.uk (M.S.); binh.vu@srh.de (B.V.); sina.mehraeen@srh.de (S.M.); mehrdad.jalali@srh.de (M.J.); swati.chandna@srh.de (S.C.)
*Corresponding author

Manuscript received August 11, 2024; revised September 6, 2024; accepted February 21, 2025; published August 8, 2025.

Abstract—The task of generating accurately labeled datasets in Natural Language Processing (NLP) is notably challenging due to the high cost and extensive time requirements, compounded by the reliance on large volumes of unstructured data scraped from the web. Addressing this, our research introduces a novel framework utilizing Explanatory Interactive Machine Learning (XIL) and Explainable Artificial Intelligence (XAI). This framework enables the dynamic labeling of text data without predefined categories, significantly reducing the dependence on human annotators. Our methodology employs a topic modeling approach that allows a single annotator to label data efficiently with minimal oversight. In testing, this method trained a classifier on as few as 600 documents, achieving a precision of approximately 0.70. This precision is comparable to that of a classifier trained on a fully labeled dataset of 13,000 documents, demonstrating our system’s effectiveness while using less than 5% of the labeled data typically required. These findings highlight how our approach not only enhances the transparency of the labeling process but also reduces its resource intensity, offering substantial improvements over traditional methods in both scalability and efficiency. This proof of concept paves the way for broader applications of explainable interactive NLP across various domains.

Keywords—Explainable Artificial Intelligence (XAI), Explanatory Interactive Machine Learning (XIL), text mining, topic modelling, text labeling, unsupervised learning

Cite: Hema Pandey, Benjamin Kloepper, Ruben Huehnerbein, Muskaan Singh, Binh Vu, Sina Mehraeen, Mehrdad Jalali, and Swati Chandna, "TEXTWISE: Text Exploration through Interactive Natural Language Processing for Wide-Ranging Insights and Semantic Exploration," Journal of Advances in Information Technology, Vol. 16, No. 8, pp. 1048-1060, 2025. doi: 10.12720/jait.16.8.1048-1060

Copyright © 2025 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
