Methods for missing values in dichotomous response variables

Yun Wang, Purdue University


In this dissertation, we focus on methods for analyzing data with missing values in a dichotomous response variable. For ignorable missing values, we compare several methods: complete-case analysis, weighting, the EM algorithm, multiple imputation, and predictive mean matching. For non-ignorable missing values, we study the use of the two-stage Heckman model to adjust for selection bias. To compare the methods, we simulate missing values in a dichotomous response variable under three missing-data mechanisms (missing completely at random, missing at random, and non-ignorable missingness) and then analyze the resulting data with each method. Existing validation criteria, including sensitivity and specificity, Mahalanobis distance, the sum of differences in predicted probabilities, R², deviance, AIC, and SC, are used to compare performance under the different mechanisms. Finally, we apply the six methods for handling missing values in a dichotomous response variable to the PACE data and develop a valid predictive model in the presence of missing data.
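The three missing-data mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the simulation design used in the dissertation: the function name `simulate_missing`, the single covariate `x`, and every coefficient and probability below are assumptions chosen only to show how the mechanisms differ. Under missing completely at random the probability of missingness is constant, under missing at random it depends only on the observed covariate, and under non-ignorable missingness it depends on the response value itself.

```python
import math
import random


def logistic(t):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-t))


def simulate_missing(x, y, mechanism, rng):
    """Return a copy of y with some entries replaced by None.

    x         : list of covariate values (always observed)
    y         : list of 0/1 responses
    mechanism : "MCAR", "MAR", or "NI" (non-ignorable)
    rng       : random.Random instance

    All numeric constants are illustrative assumptions.
    """
    out = []
    for xi, yi in zip(x, y):
        if mechanism == "MCAR":
            # Constant probability, unrelated to x or y.
            p_miss = 0.3
        elif mechanism == "MAR":
            # Depends only on the fully observed covariate x.
            p_miss = logistic(-1.0 + 1.0 * xi)
        elif mechanism == "NI":
            # Depends on the value of y that may go missing.
            p_miss = logistic(-1.5 + 1.5 * yi)
        else:
            raise ValueError("unknown mechanism: %r" % (mechanism,))
        out.append(None if rng.random() < p_miss else yi)
    return out


# Hypothetical data: one covariate and a dichotomous response.
rng = random.Random(0)
n = 5000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if rng.random() < logistic(0.5 * xi) else 0 for xi in x]

y_mcar = simulate_missing(x, y, "MCAR", rng)
y_mar = simulate_missing(x, y, "MAR", rng)
y_ni = simulate_missing(x, y, "NI", rng)
```

In the non-ignorable case of this sketch, responses equal to 1 go missing more often than responses equal to 0, so a complete-case analysis of `y_ni` would see a distorted share of ones. That is the kind of selection bias the abstract's two-stage Heckman model is meant to adjust for.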




Sands, Purdue University.


