Intelligent System Towards Maintaining Service Level Agreements Related to Application Performance in Cloud Infrastructure

Saurav Nanda, Purdue University

Abstract

Application owners are aggressively moving to cloud platforms for two major reasons: (1) a pay-as-you-go model for resource utilization, and (2) performance-based service level agreements (SLAs) for cloud applications. Application owners therefore need not worry about the overhead of hardware maintenance, and they do not pay for idle resources. They can dynamically adjust system resources, such as CPU, memory, disk I/O, and network, to handle highly dynamic workloads and the technical complexity of ensuring acceptable application performance. Kohavi and Longbotham (2007) described the significance of application performance: their experimental results for a well-known e-business portal and a prominent search portal demonstrate that financial loss is directly proportional to application response time. In the same report, Kohavi and Longbotham (2007) noted that the e-business portal experienced a 1 percent drop in sales due to a 100 ms delay in its response time, and the search portal experienced a 20 percent drop in sales due to a 500 ms delay in its search results. Moreover, it is challenging for cloud infrastructure providers to offer strict service level agreements (SLAs) related to application performance, as explained by Islam, Keung, Lee, and Liu (2012). The ultimate goal of our research is to ensure strict SLAs based on the response time of web/mobile applications hosted in a cloud environment, primarily by using control-theoretic approaches. To achieve this broad goal, we break the problem into sub-problems under the overarching question: How can performance-based service level agreements (SLAs) for web/mobile applications hosted in a cloud environment be maintained?

- How can we characterize and model the dynamic nature of application workload, and tune elastic system resources (CPU, memory, network) accordingly?
- What resource scaling techniques can minimize resource allocation while maximizing resource utilization?
- How can we filter noise from the resource utilization data collected from virtual and physical hosts?
- How can cloud providers maximize resource usage while ensuring that no SLA violations related to application response time occur?
- Can cloud providers enhance the performance of web/mobile applications using global live migration of VMs?
- Which prediction technique is most accurate for predicting resource utilization on virtualized infrastructure?
- How can an ensemble approach combine multiple prediction techniques in a single control system that chooses the best technique for different environmental scenarios?

We address these seven challenges in separate chapters of this dissertation. In Nanda, Hacker, and Lu (2016), we adopted a weighted cost model for SLA violations to define a convex optimization problem and determine a near-optimal solution for dynamic resource allocation. To maintain a strict response-time SLA, we implemented a feedback-based control system (FCS) that comprises a prediction function, which forecasts the resource usage of the target application using an ARIMA model, and tunes these resources in advance based on the forecast. We focused on CPU and memory resources and performed vertical scaling of resources dynamically. In Nanda and Hacker (2017), focusing on improving the user experience of mobile applications, we proposed Global Live Migration (GLM) of VMs, which allows VMs to be migrated between different geographical regions. We formulated a Mixed Integer Program (MIP) for strategic placement of VMs and proposed a Lagrangian-relaxation-based subgradient technique to solve it.
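The proactive step of the feedback-based control system can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: a least-squares AR(1) forecaster stands in for the ARIMA model, and the target utilization, the scale-up-only policy, and the function names (`forecast_next`, `plan_vertical_scaling`) are illustrative assumptions.

```python
import math

def forecast_next(samples):
    """Fit x[t+1] ~ a * x[t] + b by least squares and forecast one step ahead."""
    x, y = samples[:-1], samples[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:  # flat series: predict the last observed value
        return samples[-1]
    a = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / var_x
    b = mean_y - a * mean_x
    return a * samples[-1] + b

def plan_vertical_scaling(cpu_samples, allocated_cores, target_util=0.7):
    """Return the core allocation for the next control interval, scaling up
    in advance so forecast utilization stays below the SLA-driven target."""
    predicted = forecast_next(cpu_samples)   # predicted demand, in cores
    needed = predicted / target_util
    return max(allocated_cores, math.ceil(needed))  # this sketch only scales up

# Rising CPU demand (in cores) triggers a proactive scale-up decision.
print(plan_vertical_scaling([1.0, 1.2, 1.5, 1.9, 2.4], allocated_cores=3))  # → 5
```

The key point the sketch captures is that resources are tuned from the forecast rather than from the current sample, so the allocation leads the workload instead of lagging behind it.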
We use a control system with self-tuning regulators to refine the migration sequence continually, using feedback from each migration cycle. We extended our prior work (Nanda et al., 2016) to leverage a combination of vertical and horizontal scaling (i.e., diagonal scaling) with different techniques for predicting resource utilization. We aim to leverage an ensemble learning approach that chooses the best prediction technique depending on different environmental variables. We also plan to extend our VM placement work to solve the container/VM consolidation problem using a deep neural network. In the remainder of this dissertation, we describe the parts of our research that address the aforementioned sub-problems. Together, our research objectives address an important real-world problem: the performance of web/mobile applications hosted in cloud environments. For each part of our research, we describe the novelty of the work and present a solution to one of the sub-problems of our main research objective. Finally, we present a road map for our future research in the coming years.
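The ensemble idea above can be sketched as a selection loop: run several candidate predictors in parallel and, at each control step, trust the one with the lowest recent one-step error. The candidate predictors and the scoring window here are illustrative assumptions, not the dissertation's actual ensemble.

```python
# Candidate one-step forecasters; any could win depending on the workload shape.
def last_value(history):
    return history[-1]

def moving_average(history, window=3):
    w = history[-window:]
    return sum(w) / len(w)

def linear_trend(history):
    # Extrapolate the most recent observed step as the next change.
    return history[-1] + (history[-1] - history[-2])

CANDIDATES = {"last": last_value, "mavg": moving_average, "trend": linear_trend}

def best_predictor(history, window=4):
    """Score each candidate by mean absolute one-step error over the most
    recent `window` points and return (name, forecast) of the winner."""
    scores = {}
    for name, f in CANDIDATES.items():
        errs = [abs(f(history[:t]) - history[t])
                for t in range(len(history) - window, len(history))]
        scores[name] = sum(errs) / len(errs)
    winner = min(scores, key=scores.get)
    return winner, CANDIDATES[winner](history)

# On a steadily rising series, the trend extrapolator should win.
print(best_predictor([1, 2, 3, 4, 5, 6, 7, 8]))  # → ('trend', 9)
```

Because the winner is re-evaluated every step, the controller can switch predictors as the environment changes, which is the behavior the ensemble approach is meant to provide.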

Degree

Ph.D.

Advisors

Hacker, Purdue University.

Subject Area

Computer science

