Improving the reliability and performance of virtual cloud infrastructures
Abstract
The widespread adoption of cloud computing has reshaped the information technology (IT) landscape in the past few years. Leveraging cloud resources, in place of or in addition to local resources, has led to greater performance, elasticity, and ubiquity of online and offline services and has substantially reduced the cost of ownership, operation, and maintenance of the software and hardware used for computation. One prevalent feature of cloud services is the use of virtualization. Virtualization is primarily used to enable multi-tenancy, where a number of different users share the same computational infrastructure in isolation from one another. Virtualization is especially prominent in Infrastructure as a Service (IaaS) systems, where the computation infrastructure itself is offered as a service to cloud users. In an IaaS system (e.g., Amazon EC2), the cloud provider hosts a large number of servers in a data center, and each server runs one or more virtual machines (VMs). The IaaS system can then be "sliced up" for users as virtual cloud infrastructures in the form of individual VMs or networks of VMs. This dissertation addresses two important challenges for virtual cloud infrastructures: improving (1) the reliability and (2) the network performance of such systems. The first half of the dissertation presents VNsnap, a system that takes distributed snapshots (checkpoints) of virtual networked infrastructures with minimal downtime. Such snapshots facilitate fault tolerance and execution resumability for a large class of distributed applications. The second half presents vSnoop, a solution that addresses the adverse effects of server consolidation on TCP throughput. vSnoop offloads TCP acknowledgement from the VMs to the driver domain of a cloud host in order to hide the latency induced by VM scheduling.
The evaluation results show that deploying vSnoop leads to significant improvements in terms of network-level and application-level throughput within a virtual infrastructure. Both VNsnap and vSnoop demonstrate that it is possible to improve the reliability and performance of virtual cloud infrastructures without any modifications to applications and operating systems running in the VMs. As a result, both solutions can transparently support custom VM images that cloud users supply to IaaS systems.
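The reason acknowledging early can help may be easier to see with a toy model. The sketch below is not taken from the dissertation: it assumes a sender that is limited to one window of data per round trip, and every number in it (window size, network round-trip time, scheduling delay) is hypothetical and chosen only for illustration. It ignores congestion control, packet loss, and buffer limits in the driver domain.

```python
def window_limited_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput (bytes/s) when one window is sent per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


# Hypothetical values, for illustration only (not measurements from the dissertation).
WINDOW_BYTES = 64 * 1024      # sender window
NETWORK_RTT = 0.0002          # 0.2 ms round trip inside a data center
SCHEDULING_DELAY = 0.030      # 30 ms wait until the receiving VM is scheduled

# ACK sent by the VM itself: the sender waits for the network round trip
# plus the time the VM spends waiting for its turn on the CPU.
baseline = window_limited_throughput(WINDOW_BYTES, NETWORK_RTT + SCHEDULING_DELAY)

# ACK sent on the VM's behalf by the driver domain: the scheduling delay
# is no longer on the acknowledgement path seen by the sender.
offloaded = window_limited_throughput(WINDOW_BYTES, NETWORK_RTT)

print(f"ACK from VM:            {baseline / 1e6:.2f} MB/s")
print(f"ACK from driver domain: {offloaded / 1e6:.2f} MB/s")
```

Under these assumptions the scheduling delay dominates the round-trip time seen by the sender, so removing it from the acknowledgement path raises the throughput bound considerably. Real gains depend on the workload and on how many VMs share each host.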
Degree
Ph.D.
Advisors
Xu, Purdue University.
Subject Area
Computer science