Date of Award

Spring 2015

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Statistics

First Advisor

William S. Cleveland

Committee Chair

William S. Cleveland

Committee Member 1

Chuanhai Liu

Committee Member 2

Bowei Xi

Committee Member 3

Lingsong Zhang

Abstract

Divide and recombine (D&R) is a statistical framework for the analysis of large complex data. The data are divided into subsets. Numeric and visualization methods, collectively called analytic methods, are applied to each subset. For each analytic method, the outputs of applying the method to the subsets are recombined. Each analytic method therefore has an associated division method and recombination method. Here we study D&R methods for likelihood-based model fitting. We introduce a notion of likelihood analysis and modeling. We divide the data and fit a likelihood model on each subset. The fitted model is characterized by a set of parameters much smaller than the subset data size, but retains as much information as possible about the true subset likelihood. Analysis of subset likelihoods and their fitted models consists of visualizations on an appropriate scale and region. These visualizations allow the analyst to verify the choice and fit of the model. The fitted models are recombined across subsets to form a model of the all-data likelihood, which we maximize to obtain a likelihood modeling estimate (LME). We present simulation results demonstrating the performance of our method compared with the all-data maximum likelihood estimate (MLE) for the case of logistic regression.
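
The divide, fit, and recombine steps described in the abstract can be illustrated with a minimal Python sketch for logistic regression. The abstract does not specify the form of the subset likelihood model, so this sketch assumes a quadratic (normal) approximation to each subset log-likelihood, centered at the subset MLE with curvature given by the observed information. That choice, the function names `fit_logistic` and `dr_estimate`, the random division, and the simulated data are all illustrative assumptions and not the dissertation's method. Summing the quadratic models and maximizing the sum yields an information-weighted average of the subset estimates, which is then compared with the all-data MLE.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25, tol=1e-10):
    """Newton-Raphson MLE for logistic regression.

    Returns the estimate and the observed information at the estimate.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                       # score vector
        info = (X * (p * (1.0 - p))[:, None]).T @ X  # observed information
        step = np.linalg.solve(info, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    info = (X * (p * (1.0 - p))[:, None]).T @ X
    return beta, info


def dr_estimate(X, y, n_subsets, rng):
    """Divide at random, fit each subset, recombine quadratic models.

    Assumption: each subset log-likelihood is modeled as
    -(1/2) (b - b_s)' I_s (b - b_s) up to a constant, so the sum of the
    models is maximized at (sum I_s)^{-1} sum I_s b_s.
    """
    k = X.shape[1]
    info_sum = np.zeros((k, k))
    weighted_sum = np.zeros(k)
    idx = rng.permutation(len(y))
    for part in np.array_split(idx, n_subsets):
        beta_s, info_s = fit_logistic(X[part], y[part])
        info_sum += info_s
        weighted_sum += info_s @ beta_s
    return np.linalg.solve(info_sum, weighted_sum)


# Simulated data (hypothetical coefficients, for illustration only).
rng = np.random.default_rng(0)
n = 20000
true_beta = np.array([-0.5, 1.0, -2.0])
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_mle, _ = fit_logistic(X, y)                   # all-data MLE
beta_dr = dr_estimate(X, y, n_subsets=20, rng=rng)  # recombined estimate

print("all-data MLE:   ", beta_mle)
print("D&R recombined: ", beta_dr)
```

Under this quadratic assumption only the subset estimate and information matrix need to be kept per subset, which is far smaller than the subset data. A richer likelihood model, as the abstract suggests, would retain more information about the shape of each subset likelihood.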
