A Study of Job Failure Prediction at Job Submit-State and Job Start-State in High-Performance Computing System: Using Decision Tree Algorithms
Anupong Banjongkan, Watthana Pongsena, Nittaya Kerdprasop, and Kittisak Kerdprasop
School of Computer Engineering, Suranaree University of Technology (SUT), Thailand
Abstract
In a High-Performance Computing (HPC) system, job failure is a major problem because it wastes computation time, resources, and power. Job failure also significantly degrades the overall efficiency of the HPC system. In this paper, we propose two sets of models to predict job failure at two points in the job life cycle: the job submit-state and the job start-state. The models can serve as guiding tools that help HPC users make efficient decisions about managing their job submissions on the HPC system. The tools thus aim to improve the efficiency of the HPC system at the job level. In the evaluation stage, we conduct a comparative study of the performance of job failure predictive models built with decision-tree induction techniques including C5.0, Classification and Regression Tree (CART), and Chi-square Automatic Interaction Detector (CHAID). The datasets used for training and testing the models are two workload logs collected from the HPC systems at the National Electronics and Computer Technology Center (NECTEC), Thailand, and the Los Alamos National Laboratory (LANL), USA. For predicting failure at both the job submit-state and the job start-state, the results show that the models built with the C5.0 algorithm provide the highest prediction accuracy (around 85% for the NECTEC dataset and 87% for the LANL dataset). The experimental results across job states reveal that failure forecasting at the job start-state is slightly more accurate than prediction at the job submit-state (the accuracy improvement is around 1.45% for the NECTEC dataset and 0.46% for the LANL dataset). However, when both model performance and the overhead of job waiting time are considered, job failure prediction at the job submit-state provides the best efficiency.
Index Terms
—decision tree, high-performance computing, job failure prediction, workload log
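The comparison described in the abstract (one model trained on information available at submission, another on information available once the job starts) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the job records are synthetic, the feature names (`nodes`, `walltime`, `wait`) are hypothetical, and the tree is a small CART-style learner using Gini impurity written in plain Python. It is not the authors' implementation, it does not cover C5.0 or CHAID, and its accuracy figures have no relation to the NECTEC or LANL results.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=4):
    """Grow a binary tree; a leaf is a class label, a node is a 4-tuple."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)

def make_jobs(n, rng):
    """Synthetic jobs (hypothetical features); label 1 means the job failed."""
    rows, labels = [], []
    for _ in range(n):
        nodes = rng.randint(1, 64)     # requested nodes, known at submit time
        walltime = rng.randint(1, 48)  # requested hours, known at submit time
        wait = rng.randint(0, 24)      # queue wait in hours, known only at start
        risk = 0.1 + 0.4 * (nodes > 32) + 0.3 * (wait > 12)
        labels.append(1 if rng.random() < risk else 0)
        rows.append((nodes, walltime, wait))
    return rows, labels

rng = random.Random(0)
rows, labels = make_jobs(600, rng)
train_x, test_x = rows[:400], rows[400:]
train_y, test_y = labels[:400], labels[400:]

# Submit-state model: only the first two features exist when the job is submitted.
submit_tree = build_tree([r[:2] for r in train_x], train_y)
submit_acc = accuracy(submit_tree, [r[:2] for r in test_x], test_y)

# Start-state model: the queue wait time is also known once the job starts.
start_tree = build_tree(train_x, train_y)
start_acc = accuracy(start_tree, test_x, test_y)

print(f"submit-state accuracy: {submit_acc:.3f}")
print(f"start-state accuracy:  {start_acc:.3f}")
```

The sketch only shows the shape of the experiment: the start-state model sees one extra feature that becomes available after queueing, so any accuracy gain has to be weighed against the waiting time spent before the prediction can be made.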
Cite: Anupong Banjongkan, Watthana Pongsena, Nittaya Kerdprasop, and Kittisak Kerdprasop, "A Study of Job Failure Prediction at Job Submit-State and Job Start-State in High-Performance Computing System: Using Decision Tree Algorithms," Journal of Advances in Information Technology, Vol. 12, No. 2, pp. 84-92, May 2021. doi: 10.12720/jait.12.2.84-92
Copyright © 2021 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.