Influence of File Systems on Performance When Working with an Abundance of Small Files

High-performance computing is widely used within the scientific community to perform demanding computational work. Using the resources available at a high-performance center in an efficient manner is of great importance. One potential bottleneck for high-performance computing is file systems. In thi...


Bibliographic Details
Main Author: Andersson, Simon
Format: Others
Language: English
Published: Umeå universitet, Institutionen för datavetenskap 2017
Subjects: Engineering and Technology; Teknik och teknologier
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-142504
id ndltd-UPSALLA1-oai-DiVA.org-umu-142504
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-umu-142504
2017-12-02T05:24:54Z
Influence of File Systems on Performance When Working with an Abundance of Small Files
eng
Andersson, Simon
Umeå universitet, Institutionen för datavetenskap
2017
Engineering and Technology
Teknik och teknologier
High-performance computing is widely used within the scientific community to perform demanding computational work. Using the resources available at a high-performance center in an efficient manner is of great importance. One potential bottleneck for high-performance computing is file systems. In this study two different file systems, the Lustre file system and MATLAB Datastore, have been evaluated in terms of performance when performing computations on an abundance of small files. The performance test consisted of classification of large numbers of small (<2 megabytes) images in MATLAB using the high-performance computer system Kebnekaise at HPC2N in Umeå. Results indicate that MATLAB Datastore gives better performance than the Lustre file system for all image sets tested in the study. This makes it possible to recommend using MATLAB Datastore over the Lustre file system in situations where large numbers of small files are to be read from the file system.
Student thesis
info:eu-repo/semantics/bachelorThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-142504
UMNAD ; 1120
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic Engineering and Technology
Teknik och teknologier
spellingShingle Engineering and Technology
Teknik och teknologier
Andersson, Simon
Influence of File Systems on Performance When Working with an Abundance of Small Files
description High-performance computing is widely used within the scientific community to perform demanding computational work. Using the resources available at a high-performance center in an efficient manner is of great importance. One potential bottleneck for high-performance computing is file systems. In this study two different file systems, the Lustre file system and MATLAB Datastore, have been evaluated in terms of performance when performing computations on an abundance of small files. The performance test consisted of classification of large numbers of small (<2 megabytes) images in MATLAB using the high-performance computer system Kebnekaise at HPC2N in Umeå. Results indicate that MATLAB Datastore gives better performance than the Lustre file system for all image sets tested in the study. This makes it possible to recommend using MATLAB Datastore over the Lustre file system in situations where large numbers of small files are to be read from the file system.
author Andersson, Simon
author_facet Andersson, Simon
author_sort Andersson, Simon
title Influence of File Systems on Performance When Working with an Abundance of Small Files
title_short Influence of File Systems on Performance When Working with an Abundance of Small Files
title_full Influence of File Systems on Performance When Working with an Abundance of Small Files
title_fullStr Influence of File Systems on Performance When Working with an Abundance of Small Files
title_full_unstemmed Influence of File Systems on Performance When Working with an Abundance of Small Files
title_sort influence of file systems on performance when working with an abundance of small files
publisher Umeå universitet, Institutionen för datavetenskap
publishDate 2017
url http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-142504
work_keys_str_mv AT anderssonsimon influenceoffilesystemsonperformancewhenworkingwithanabundanceofsmallfiles
_version_ 1718563463190544384