Multilingual Context Ontology Rule Enhanced Focused Web Crawler

Mukesh Kumar and Renu Vig

University Institute of Engineering and Technology, Panjab University, Chandigarh, India

Abstract—The rapidly growing size of the World Wide Web and its increasing number of non-English resources pose unprecedented challenges for general-purpose crawlers and search engines. It is impossible for any search engine to index the complete Web. Focused crawlers cope with the growing size by selectively seeking out pages that are relevant to a predefined set of topics and avoiding irrelevant regions of the Web. Rather than collecting and indexing all accessible Web documents, a focused crawler analyses its crawl boundary to find the links likely to be most relevant to the crawl. This paper presents a focused crawler whose crawl strategy is based on scores calculated from context ontologies and adaptive classification rules, and which is capable of dealing with intermediate multilinguality situations: situations in which the query language is the same as the target language, but the intermediate path may pass through pages written in a mixture of the query language and some other language. This enhances the quality of the pages retrieved, because the English meaning of a word sequence in the other language may itself be relevant to the query, or may point to pages that are highly relevant to it; such pages should be included in the results, yet are left untouched by existing crawlers.

Index Terms—Focused Crawler, Search Engines, Information Retrieval, Ontology, Adaptive Rules
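
The crawl strategy summarized in the abstract, scoring links and following the most promising ones first, can be illustrated with a minimal best-first frontier. This is only a sketch under stated assumptions, not the authors' method: the paper derives its scores from context ontologies and adaptive classification rules, whereas the hypothetical `score_link` below uses plain term overlap. The names `TOPIC_TERMS`, `GLOSSARY`, `score_link`, and `Frontier` are all invented for illustration, and the tiny glossary merely stands in for mapping a non-English word sequence to its English meaning.

```python
import heapq
from itertools import count

# Hypothetical topic vocabulary (assumption). The paper scores links with
# context ontologies and adaptive rules, which this sketch does not model.
TOPIC_TERMS = {"crawler", "search", "ontology"}

# Hypothetical glossary mapping non-English words to English equivalents.
GLOSSARY = {"buscador": "search", "ontologia": "ontology"}


def score_link(anchor_text, topic_terms=TOPIC_TERMS, glossary=GLOSSARY):
    """Return the fraction of anchor words that match a topic term,
    after mapping known non-English words to English."""
    words = [glossary.get(w, w) for w in anchor_text.lower().split()]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in topic_terms)
    return hits / len(words)


class Frontier:
    """Best-first crawl frontier: the highest-scoring URL is popped first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._tie = count()  # preserves insertion order among equal scores

    def push(self, url, score):
        # Skip URLs that were already queued once.
        if url in self._seen:
            return
        self._seen.add(url)
        # heapq is a min-heap, so the score is negated.
        heapq.heappush(self._heap, (-score, next(self._tie), url))

    def pop(self):
        neg_score, _, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)
```

In this sketch, a link whose anchor text is entirely in another language can still outrank an English link once its words are mapped through the glossary, which is the kind of case the abstract argues that monolingual crawlers miss.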

Cite: Mukesh Kumar and Renu Vig, "Multilingual Context Ontology Rule Enhanced Focused Web Crawler," Journal of Advances in Information Technology, Vol. 1, No. 1, pp. 21-25, February 2010. doi:10.4304/jait.1.1.21-25
