Joint modeling of highly skewed data with excess zeros using copulas
Abstract
Data with excess zeros arise in many real-world applications involving counts, such as the number of insects along a transect or the number of dental caries in adolescent children. In the healthcare arena, annual healthcare costs, such as Medicaid prescription drug costs, also have excess zeros. These zeros arise because a subject never incurred a cost during the year or because some other entity paid the bill. Zero-inflated models, such as the zero-inflated negative binomial, and two-part models, where one part models the probability of a zero and the other part models the distribution of a positive cost, are typically used to fit these data. Unlike most count data, the positive healthcare costs are highly skewed, which further complicates the fit. In this research we address another important aspect of healthcare costs: different types of annual costs are correlated even after adjusting for covariates. This has rarely been addressed in the literature, and when it has, either the costs are assumed to arise from the same distribution (e.g., a bivariate zero-inflated negative binomial) or random effects have been used to create the dependence. Here, we use copulas to correlate the costs. We first describe our model estimation and selection procedure, which includes choosing appropriate marginal distributions to describe the costs and choosing the type of copula. In choosing the type of copula, we augment the existing procedures for estimating the dependence between two continuous variables so that they handle two-part models, and we describe both likelihood-based and visual selection methods. We then demonstrate the robustness of this approach compared to the random effects model through a simulation study. Finally, we apply our approach to the 2004 Indiana Medicaid Data, focusing on the annual costs associated with dementia.
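As a rough illustration of how a copula can link two two-part cost variables, the following Python sketch simulates correlated annual costs. It is not the dissertation's method: the Gaussian copula, the lognormal distribution for positive costs, the parameter values, and the names `two_part_quantile` and `simulate_costs` are all assumptions chosen for illustration, since the abstract does not specify the marginals or the copula family that were selected.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula and the lognormal

def two_part_quantile(u, p_zero, mu, sigma):
    """Generalized inverse CDF of a two-part cost (illustrative assumption):
    a point mass p_zero at zero, and a lognormal(mu, sigma) for positive costs."""
    if u <= p_zero:
        return 0.0
    # Rescale the remaining probability onto (0, 1) for the positive part.
    v = (u - p_zero) / (1.0 - p_zero)
    v = min(max(v, 1e-12), 1.0 - 1e-12)  # keep inv_cdf inside its open domain
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(v))

def simulate_costs(n, rho, margin_a, margin_b, seed=0):
    """Draw n pairs of costs whose dependence comes from a Gaussian copula
    with correlation rho. Each margin is a (p_zero, mu, sigma) tuple."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        # Correlated standard normals via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map to uniforms (the copula scale), then to two-part costs.
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        pairs.append((two_part_quantile(u1, *margin_a),
                      two_part_quantile(u2, *margin_b)))
    return pairs

if __name__ == "__main__":
    # Hypothetical parameters: 40% and 30% zeros, skewed positive costs.
    sample = simulate_costs(5000, rho=0.6,
                            margin_a=(0.4, 5.0, 1.5),
                            margin_b=(0.3, 4.0, 1.2), seed=1)
    zero_share = sum(1 for a, _ in sample if a == 0.0) / len(sample)
    print(f"share of zeros in first cost: {zero_share:.3f}")
```

Because each margin is specified separately from the dependence, the two costs need not share a distributional family, which is the flexibility the copula approach offers over a single bivariate distribution.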
Degree
Ph.D.
Advisors
Craig, Purdue University.
Subject Area
Statistics