Experimental Investigation of Container-based Virtualization Platforms For a Cassandra Cluster

Context. Cloud computing is growing fast and has established itself as the next generation software infrastructure. A major role in cloud computing is the virtualization of hardware to isolate systems from each other. This virtualization is often done with Virtual Machines that emulate both hardware an...

Bibliographic Details
Main Authors: Sulewski, Patryk, Jesper, Hallborg
Format: Others
Language: English
Published: 2017
Subjects:
LXC
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:bth-14544
id ndltd-UPSALLA1-oai-DiVA.org-bth-14544
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-bth-14544
2017-06-22T05:34:59Z
eng
Student thesis
info:eu-repo/semantics/bachelorThesis
text
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic Container Virtualization
Cassandra
Docker
LXC
Big data
Microservices
Linux distributions
Computer Systems
Datorsystem
description Context. Cloud computing is growing fast and has established itself as the next-generation software infrastructure. A major role in cloud computing is the virtualization of hardware to isolate systems from each other. This virtualization is often done with Virtual Machines that emulate both hardware and software, which in turn makes process isolation expensive. New techniques, known as Microservices or containers, have been developed to deal with this overhead. The infrastructure is tied to storing, processing, and serving vast, unstructured data sets. The overall cloud system needs to deliver high performance while providing scalability and easy deployment. Microservices can be introduced for all kinds of applications in a cloud computing network and can be a better fit for certain products.
Objectives. In this study we investigate how a small system consisting of a Cassandra cluster performs while encapsulated in LXC and Docker containers, compared to a non-virtualized structure. A specific loader is built to stress the cluster to find the limits of the containers.
Methods. We constructed an experiment on a three-node Cassandra cluster. Test data is sent from the Cassandra-loader on another server in the network. The Cassandra processes are then deployed in the different architectures and tested. During these tests, CPU, disk I/O, and network I/O metrics are monitored on the four servers. The metric data is used in statistical analysis to find significant deviations.
Results. Three experiments were conducted and monitored. The cluster test showed that isolated Docker containers exhibit major latency during disk reads. A local stress test further confirmed those results. The step-wise test, in turn, implied that the disk read latencies occurred because isolated Docker containers need to read more data to handle these requests. All Microservices introduce some overhead, but they fall behind the most for read requests.
Conclusions. The results of this study show that virtualizing Cassandra nodes in a cluster introduces latency for write operations compared to a non-virtualized solution. However, those latencies can be neglected if scalability is the main focus of the system. For read operations, all Microservices had reduced performance, and isolated Docker containers showed the highest overhead. This is due to the file system used in those containers, which makes disk I/O slower than in the other structures. If a Cassandra cluster is to be launched in a container environment, we recommend either a Docker container with mounted disks, to bypass Docker's file system, or an LXC solution.
author Sulewski, Patryk
Jesper, Hallborg
title Experimental Investigation of Container-based Virtualization Platforms For a Cassandra Cluster
publishDate 2017
url http://urn.kb.se/resolve?urn=urn:nbn:se:bth-14544