Benchmarking Computational Performance of Virtual Machines Within the Secure Unified Research Environment Versus Physical Computers

Bibliographic Details
Main Authors: Max Moldovan, Chris Radbone, Maria C Inacio
Format: Article
Language: English
Published: Swansea University 2020-12-01
Series: International Journal of Population Data Science
Online Access: https://ijpds.org/article/view/1582
Description
Summary:
Introduction: The Secure Unified Research Environment (SURE) is a high-powered computing environment located within the Sax Institute (Sydney, Australia). SURE was established with financial support from the Australian Government National Collaborative Research Infrastructure Strategy (NCRIS) as part of the Population Health Research Network (PHRN). SURE is approved by the Australian Government as the only secure platform for analysing unit record level sensitive health and other Australian Government data. It provides computational resources and secure infrastructure in the form of virtual machines (VMs) accessible to approved researchers in Australia and overseas.
Objectives and Approach: We aim to compare the computational performance of SURE VMs of different configurations with that of physical computers by running a series of standardised computational tasks using different numbers of central processing unit (CPU) cores available on each computer. The approach used the benchmark test maintained by the H2O.ai group (https://h2oai.github.io/db-benchmark/). Results were measured on datasets of different sizes, ranging from 500 MB to 50 GB in random access memory (RAM).
Results: Our benchmarking revealed that physical computers uniformly outperform the current standard SURE VM configurations, in some cases delivering nearly double the performance. For the range of typical analytical tasks assessed, computational performance benefits greatly from increasing the number of CPU cores available on a machine.
Conclusion / Implications: SURE is a highly valuable tool enabling research and collaboration involving confidential population-based data. However, a shortage of RAM and CPU cores can be a major bottleneck even for moderately large datasets, and the VMs currently offered by SURE still fall short of the computational performance of physical desktop computers. The results are intended to guide the funders and providers of secure remote-access data laboratories responsible for providing Research Infrastructure as a Service (IaaS) tailored to the needs of participating research groups.
ISSN: 2399-4908
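
Illustrative sketch: The summary describes timing standardised computational tasks on datasets of increasing size and comparing machines by their relative performance. The Python sketch below shows the general shape of such a measurement under stated assumptions. It is not the H2O.ai db-benchmark and does not reproduce the authors' setup: the group-by aggregation is a stand-in task, the dataset sizes are toy values far below the 500 MB to 50 GB range, and all function names (make_rows, groupby_sum, time_task, relative_performance) are hypothetical.

```python
import random
import statistics
import time
from collections import defaultdict


def make_rows(n_rows, n_groups, seed=0):
    """Generate synthetic (group_id, value) rows; sizes are illustrative only."""
    rng = random.Random(seed)
    return [(rng.randrange(n_groups), rng.random()) for _ in range(n_rows)]


def groupby_sum(rows):
    """Stand-in analytical task (an assumption): sum of values per group."""
    totals = defaultdict(float)
    for group_id, value in rows:
        totals[group_id] += value
    return dict(totals)


def time_task(task, data, repeats=3):
    """Return the median wall-clock time in seconds over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        task(data)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def relative_performance(baseline_seconds, candidate_seconds):
    """Ratio above 1 means the candidate finished faster than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Time the same task on datasets of increasing size on the current machine.
    for n_rows in (10_000, 100_000, 1_000_000):
        rows = make_rows(n_rows, n_groups=100)
        print(f"{n_rows:>9} rows: {time_task(groupby_sum, rows):.4f} s")
```

Running the same script on a VM and on a physical computer, then passing the two median timings to relative_performance, would give a ratio comparable in spirit to the "nearly double" figure quoted in the summary, although the absolute numbers from this toy task would not be comparable to the published results.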