JAIT 2022 Vol.13(5): 423-432
doi: 10.12720/jait.13.5.423-432

Self-Adaptable Infrastructure Management for Analyzing the Efficiency of Big Data Stores

Konstantinos Mavrogiorgos, Athanasios Kiourtis, Argyro Mavrogiorgou, and Dimosthenis Kyriazis
Department of Digital Systems, University of Piraeus, Piraeus, Greece

Abstract—A continuously increasing amount of data is generated and processed on a daily basis to improve decision-making and facilitate the gaining of insights. In this context, the current era is characterized as the “Era of Big Data”, with data characteristics including high volume, velocity, variety, and veracity, creating multiple opportunities and challenges. Several Information and Communication Technology (ICT) firms, enterprises, and research projects are working on the overall Big Data challenges, with increasing effort devoted to identifying the means of effectively and efficiently collecting, storing, retrieving, analyzing, and reusing Big Data in order to improve their services, increase their competitive advantage, and support competent decisions. Such approaches span several sectors, including healthcare, agriculture, environment, transportation, governance, and insurance. Towards this goal, in this paper we contribute to the selection of the most appropriate database for efficiently storing and retrieving Big Data, that is, the most efficient and least time-consuming database for using and reusing the stored data. More specifically, considering the challenges and the nature of Big Data, as well as the main categories of databases that currently exist, three (3) NoSQL document-based databases, namely ArangoDB, MongoDB, and CouchDB, are described and compared under different working environments and conditions. These working environments depend on the Diastema platform, which provides adaptive allocation and management of infrastructures based on the networking, computing, and storage requirements of each database. Consequently, the overall performance and efficiency of these databases is calculated in conjunction with this platform, based on specific metrics and criteria, which include the average execution time of CRUD operations and the corresponding resource requirements, thus leading to a conclusion on the most suitable databases for storing Big Data.
 
Index Terms—big data, storage, document-based, infrastructure management, ArangoDB, MongoDB, CouchDB
 
Cite: Konstantinos Mavrogiorgos, Athanasios Kiourtis, Argyro Mavrogiorgou, and Dimosthenis Kyriazis, "Self-Adaptable Infrastructure Management for Analyzing the Efficiency of Big Data Stores," Journal of Advances in Information Technology, Vol. 13, No. 5, pp. 423-432, October 2022.

Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.