Towards Elastic High-Performance Geo-Distributed Storage in the Cloud

University dissertation from Stockholm : KTH Royal Institute of Technology

Abstract: In this thesis, we have presented techniques and algorithms to reduce the request latency of geographically distributed storage services. In addition, we have proposed and designed elasticity controllers to maintain predictable performance of distributed storage systems under dynamic workloads and platform uncertainties. Firstly, we have proposed a lease-based data consistency algorithm that allows a distributed storage system to serve read-dominant workloads efficiently at a global scale. The leasing algorithm allows replicas with valid leases to serve read requests locally. As a result, most read requests are served with little latency. Then, we have investigated the efficiency of quorum-based data consistency algorithms when deployed globally. We have proposed the MeteorShower framework, which is based on replicated logs and loosely synchronized clocks, to augment quorum-based data consistency algorithms. As a result, the quorum-based data consistency algorithms no longer need to query remote replicas for updates, which significantly reduces request latency. Based on similar insights, we have built a transaction framework, Catenae, for geo-distributed data stores. It employs replicated logs to distribute transactions and aggregate the execution results. This allows Catenae to commit a serializable read-write transaction with only a single inter-DC RTT delay in most cases.

We have also examined and controlled the factors that cause performance degradation when scaling a distributed storage system. First, we have proposed BwMan, a model-based network bandwidth manager. It alleviates performance degradation caused by data migration activities. Then, we have systematically modeled the impact of data migrations. Using this model, we have built an elasticity controller, ProRenaTa, which combines proactive and reactive controls to achieve better control accuracy. ProRenaTa is able to calculate the best possible scaling plan to resize a distributed storage system subject to meeting scaling deadlines, reducing latency SLO violations, and minimizing VM provisioning cost. Consequently, ProRenaTa yields much higher resource utilization and fewer latency SLO violations compared to state-of-the-art approaches. Based on ProRenaTa, we have built an elasticity controller named Hubbub-scale, which adopts a control model that generalizes the data migration overhead to the impact of performance interference caused by multi-tenancy in the Cloud.
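
The lease-based idea mentioned in the abstract (replicas holding a valid lease answer reads locally) can be illustrated with a minimal sketch. This is not the algorithm from the dissertation: the class name LeasedReplica, the fetch_remote callback, the injectable clock, and the lease duration are all hypothetical, and the write path (which would have to wait for or revoke outstanding leases) is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple
import time


@dataclass
class LeasedReplica:
    """Hypothetical replica that serves reads locally while its lease is valid."""

    # Assumed slow path: fetches the current value from the other replicas.
    fetch_remote: Callable[[str], str]
    # Injectable clock so lease expiry can be exercised without sleeping.
    clock: Callable[[], float] = time.monotonic
    # The lease counts as invalid until grant_lease() is called.
    lease_expiry: float = 0.0
    store: Dict[str, str] = field(default_factory=dict)

    def grant_lease(self, duration_s: float) -> None:
        """Record a lease that expires duration_s seconds from now."""
        self.lease_expiry = self.clock() + duration_s

    def has_valid_lease(self) -> bool:
        return self.clock() < self.lease_expiry

    def read(self, key: str) -> Tuple[str, str]:
        """Return (value, path), where path is 'local' or 'remote'."""
        if self.has_valid_lease() and key in self.store:
            # Fast path: no wide-area round trip is needed.
            return self.store[key], "local"
        # Slow path: lease missing or expired, so consult the remote replicas
        # and cache the result for later leased reads.
        value = self.fetch_remote(key)
        self.store[key] = value
        return value, "remote"
```

In this sketch, a read issued while the lease is valid and the key is cached takes the local path, and a read issued after the lease has expired falls back to the remote path. A real system would also need to bound clock drift between replicas when deciding whether a lease is still valid.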
