JAIT 2022 Vol.13(4): 387-392
doi: 10.12720/jait.13.4.387-392

Novel Shared Input Based LSTM for Semantic Similarity Prediction

D. Meenakshi and A. R. Mohamed Shanavas
Department of Computer Science, Jamal Mohamed College (Autonomous), Affiliated to Bharathidasan University, Trichy, Tamil Nadu, India

Abstract—Automated similarity detection in text is a core component of several applications. This work is an independent component of the formative assessment architecture being developed by the authors. The proposed approach is built on a Shared Input based Long Short Term Memory (SI-LSTM) model and consists of a preprocessing phase, an embedding layer creation phase and the SI-LSTM model itself. The SI-LSTM model is composed of two sections. The first section is dedicated to building features. The model takes two distinct inputs whose level of similarity is to be identified, and feature matrices are created for them using the embedding layer and the LSTM layer. The resulting features are then integrated and passed through a deep learning model for duplicate identification. Experiments were performed using the Quora Question Pairs dataset. Comparisons with the existing state-of-the-art model indicate improvements of 1.7% in accuracy, 11.9% in recall and 5.7% in F-Score.
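The pipeline described in the abstract (preprocessing, embedding, LSTM feature building, feature integration and a final classifier) can be illustrated with a minimal, untrained sketch in plain Python. This is not the authors' implementation. The abstract does not specify the layer sizes, the integration operator or the classifier, so the following are assumptions made for illustration only: one embedding table and one LSTM are reused for both inputs, the two feature vectors are integrated by concatenation, biases are omitted, and a single sigmoid unit stands in for the deep learning model. All names (`preprocess`, `Embedding`, `TinyLSTM`, `PairScorer`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def preprocess(text):
    # Assumed preprocessing: lowercase, keep alphanumeric tokens only.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return cleaned.split()


class Embedding:
    """Lookup table giving each newly seen token a random (untrained) vector."""

    def __init__(self, dim, rng):
        self.dim, self.rng, self.table = dim, rng, {}

    def __call__(self, tokens):
        for tok in tokens:
            if tok not in self.table:
                self.table[tok] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)]
        return [self.table[tok] for tok in tokens]


class TinyLSTM:
    """Standard LSTM cell without biases, applied step by step to a sequence."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, acting on the concatenation [x ; h].
        self.w = {gate: random_matrix(hidden_dim, input_dim + hidden_dim, rng)
                  for gate in ("i", "f", "o", "g")}

    def encode(self, sequence):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            z = x + h  # list concatenation: [x ; h]
            i = [sigmoid(a) for a in matvec(self.w["i"], z)]    # input gate
            f = [sigmoid(a) for a in matvec(self.w["f"], z)]    # forget gate
            o = [sigmoid(a) for a in matvec(self.w["o"], z)]    # output gate
            g = [math.tanh(a) for a in matvec(self.w["g"], z)]  # candidate state
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h  # final hidden state used as the feature vector


class PairScorer:
    """Two inputs -> same embedding + LSTM (assumed) -> concatenate -> sigmoid."""

    def __init__(self, embed_dim=8, hidden_dim=4, seed=0):
        rng = random.Random(seed)
        self.embedding = Embedding(embed_dim, rng)
        self.lstm = TinyLSTM(embed_dim, hidden_dim, rng)
        # Stand-in for the deep learning classifier: a single untrained unit.
        self.out_w = [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden_dim)]

    def features(self, q1, q2):
        f1 = self.lstm.encode(self.embedding(preprocess(q1)))
        f2 = self.lstm.encode(self.embedding(preprocess(q2)))
        return f1 + f2  # integration by concatenation (assumption)

    def score(self, q1, q2):
        feats = self.features(q1, q2)
        return sigmoid(sum(w * x for w, x in zip(self.out_w, feats)))
```

Calling `PairScorer().score("How do I learn Python?", "What is the best way to learn Python?")` returns a value between 0 and 1. Because the weights are random and untrained, the value carries no meaning until the model is fitted on labelled pairs such as those in the Quora Question Pairs dataset.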
 
Index Terms—duplicate identification, semantic similarity identification, deep learning, LSTM, text processing, online education, automated test

Cite: D. Meenakshi and A. R. Mohamed Shanavas, "Novel Shared Input Based LSTM for Semantic Similarity Prediction," Journal of Advances in Information Technology, Vol. 13, No. 4, pp. 387-392, August 2022.

Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.