Date of Award

Fall 2014

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Statistics

First Advisor

William S. Cleveland

Second Advisor

Bowei Xi

Committee Chair

William S. Cleveland

Committee Co-Chair

Bowei Xi

Committee Member 1

Hao Zhang

Committee Member 2

Chuanhai Liu

Abstract

In this thesis, multiple methods are proposed and applied to the Akamai CIDR time series data. The Akamai network is one of the world's largest distributed-computing platforms, with more than 250,000 servers in more than 80 countries. It is responsible for 15-20 percent of all web traffic. We obtained 110 GB of raw CIDR data covering an 18-month period, collected on the Akamai network from November 2011 to April 2013.

The Seasonal-Trend Decomposition procedure based on loess (STL+) is used to model the CIDR series. Motivated by the CIDR series analysis, we propose a general prediction-based model selection procedure, in which extensive visual diagnostics are part of selecting the best-performing model. Factorial experimental designs are used to explore the parameter space. We evaluate the performance of different models for the CIDR series using the proposed prediction-based model selection procedure. Furthermore, the analysis and modeling of the CIDR series are performed under the Divide and Recombine framework for large data. In addition, we conduct a theoretical study of Divide and Recombine time series estimation.

We also study the performance of Divide and Recombine estimates for Gaussian autoregressive time series, Gaussian long-range dependent series, and autoregressive series with tails heavier than Gaussian.
