JAIT 2022 Vol.13(6): 578-589
doi: 10.12720/jait.13.6.578-589

A Noun-Centric Keyphrase Extraction Model: Graph-Based Approach

Rilwan O. Abimbola 1, Iyabo O. Awoyelu 2, Folasade O. Hunsu 2, Bodunde O. Akinyemi 2, and Ganiyu A. Aderounmu 2
1. First Technical University, Ibadan, Nigeria
2. Obafemi Awolowo University, Ile-Ife, Nigeria

Abstract—The graph-based approach has proven to be the most effective method of extracting keyphrases. Existing graph-based extraction methods do not include nouns as a component, so the keyphrases they produce are not noun-centric and are of low quality. In addition, the clustering approaches employed in most keyphrase extraction methods have not yielded good results. This study proposed an improved model for extracting keyphrases that combines a graph-based model with noun phrase identifiers and effective clustering techniques. Relevant data were collected from selected English-language documents. A graph-based model was formulated by integrating the TextRank algorithm for node ranking, a noun phrase identifier for noun phrase scoring, an affinity propagation algorithm for selecting cluster groups, and k-means for clustering. The formulated model was implemented and evaluated against an existing model using precision, recall, and f-measure as performance metrics. The results showed that the developed model outperformed the existing model by 5.5% in precision, 5.3% in recall, and 5.5% in f-measure. This implies that noun-centric keyphrase extraction yields high-quality keyphrases.
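The node-ranking step named in the abstract can be illustrated with a minimal, unweighted TextRank sketch in Python. This is not the authors' implementation: the function names, the co-occurrence window, the damping factor of 0.85, and the fixed iteration count are all assumptions made for illustration, and the input is assumed to be a token list already filtered to nouns. The noun phrase scoring, affinity propagation, and k-means stages are omitted.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Build an undirected graph linking words that occur within
    `window` positions of each other (hypothetical helper)."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:  # skip self-loops
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Score nodes with the unweighted TextRank update:
    S(v) = (1 - d) + d * sum(S(u) / degree(u)) over neighbours u of v.
    The damping value and iteration count are assumed defaults."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) + damping * rank
        scores = new_scores
    return scores


if __name__ == "__main__":
    # Assumed input: candidate tokens already restricted to nouns.
    nouns = ["graph", "model", "keyphrase", "graph",
             "clustering", "graph", "keyphrase"]
    ranked = textrank(build_cooccurrence_graph(nouns, window=1))
    for word, score in sorted(ranked.items(), key=lambda kv: -kv[1]):
        print(f"{word}: {score:.3f}")
```

In this sketch a word scores highly when it co-occurs with many other well-connected words, which is the intuition behind using graph ranking to surface keyphrase candidates.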
 
Index Terms—keyphrase, keyphrase extraction, noun-centric, graph-based model, clustering  
 
Cite: Rilwan O. Abimbola, Iyabo O. Awoyelu, Folasade O. Hunsu, Bodunde O. Akinyemi, and Ganiyu A. Aderounmu, "A Noun-Centric Keyphrase Extraction Model: Graph-Based Approach," Journal of Advances in Information Technology, Vol. 13, No. 6, pp. 578-589, December 2022.

Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.