Towards Efficient, Work-Conserving, and Fair Bandwidth Guarantee in Cloud Datacenters


Bibliographic Details
Main Authors: Baraa Saeed Ali, Kang Chen, Imran Khan
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8772053/
Description
Summary: Bandwidth guarantees are critical to performance predictability in cloud datacenters. A bandwidth guarantee mechanism is expected to meet three requirements: work conservation, fairness, and simplicity. However, the distributed nature of datacenters makes it challenging to attain all three at the same time. In this paper, we propose an efficient approach that satisfies the three requirements simultaneously. Our scheme takes advantage of multipath TCP (MPTCP) to generate explicit bandwidth guarantee (BG) traffic and work conservation (WC) traffic, and prioritizes the BG traffic over the WC traffic in the network fabric. Because of this priority setting, WC traffic cannot harm bandwidth guarantees, so work conservation is effectively supported. We show that MPTCP fits this direction well but presents some new issues when the WC subflows have a low priority. We therefore adapt MPTCP to handle these issues with a customized scheduler (which strictly prioritizes the BG subflow during packet scheduling) and a large receive buffer. In addition, we enable tenants to share unused bandwidth fairly by managing the overall aggressiveness of the WC traffic. The proposed system can be easily implemented with commercial off-the-shelf servers and switches. We have implemented it with the Linux kernel MPTCP for experiments. Extensive experiments in a small cluster (including one MapReduce experiment) and trace-driven simulations show that our scheme achieves the design goals effectively.
ISSN: 2169-3536
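
Illustrative sketch: The summary describes a customized MPTCP scheduler that strictly prioritizes the BG subflow over WC subflows during packet scheduling. The Python sketch below is one possible reading of that idea, not the authors' kernel implementation. The names `Subflow`, `pick_subflow`, and `schedule`, and the use of a packet-counted congestion window, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    """Hypothetical model of one MPTCP subflow (not a kernel structure)."""
    name: str
    is_bg: bool         # True for the bandwidth-guarantee (BG) subflow
    cwnd: int           # congestion window, counted in packets (assumption)
    in_flight: int = 0  # packets sent but not yet acknowledged

    def has_room(self) -> bool:
        return self.in_flight < self.cwnd


def pick_subflow(subflows):
    """Return the subflow for the next packet, or None if all are window-limited."""
    # Strict priority: a BG subflow with window space always wins;
    # WC subflows are considered only when no BG subflow can send.
    for want_bg in (True, False):
        for sf in subflows:
            if sf.is_bg == want_bg and sf.has_room():
                return sf
    return None


def schedule(subflows, n_packets):
    """Assign up to n_packets to subflows and return the chosen names in order."""
    sent = []
    for _ in range(n_packets):
        sf = pick_subflow(subflows)
        if sf is None:
            break  # every subflow is window-limited
        sf.in_flight += 1
        sent.append(sf.name)
    return sent
```

In this sketch, WC subflows only carry packets that the BG subflow cannot take, which mirrors the summary's point that WC traffic should fill spare capacity without competing with guaranteed traffic. Acknowledgements, the receive buffer, and in-network priority queues are left out.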