Some problems in hazard estimation with smoothing splines
The penalized likelihood method can be used for hazard estimation with lifetime data that are right-censored, left-truncated, and possibly accompanied by covariates. This thesis consists of three parts. The first two parts address issues in the penalized likelihood method for single-event lifetime data, and the third part extends the model to recurrent event data, where a subject can experience multiple failures during the study. Besides theoretical derivations, the techniques in each part are evaluated through empirical studies and/or analyses of real data examples. In Chapter 2, we first consider more scalable computation of the penalized likelihood method by restricting the estimation to certain q-dimensional spaces, with q increasing at a much slower rate than the sample size n. We then derive approximate Bayesian confidence intervals for the log hazard through a quadratic approximation of the log likelihood. In the presence of a continuous covariate, the penalized likelihood method can be time-consuming due to repeated numerical integrations. In Chapter 3, we propose an alternative nonparametric approach in which the computationally expensive log-likelihood part is replaced by a term called the pseudo-likelihood, which is much cheaper to compute while still reflecting goodness of fit. Accordingly, a new cross-validation score is designed to reduce the computational load of the smoothing parameter selection step. Asymptotic convergence rates for the new estimates are established. In Chapter 4, we propose a penalized likelihood model to estimate the hazard function for gap times in recurrent event data as a function of both gap time and covariates. A method for smoothing parameter selection is developed, and Bayesian confidence intervals for the log hazard are derived. Asymptotic convergence rates are also established under the assumption that no two gap times of a subject are the same.
When applying the proposed techniques to the well-known bladder tumor cancer data, we find a new feature of the data. Each of Chapters 2 through 4 ends with a discussion that concludes the chapter, so a separate summary chapter is not provided.
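For orientation, the penalized likelihood criterion referred to above is commonly written in the smoothing spline literature in the following form. This is a sketch in assumed notation, not taken from the thesis itself: $Z_i$ is the left-truncation time, $X_i$ the observed (possibly right-censored) lifetime, $\delta_i$ the event indicator, $U_i$ the covariate, $\eta$ the log hazard, $J$ a roughness penalty, and $\lambda$ the smoothing parameter.

```latex
% Assumed notation (not from the thesis): data (Z_i, X_i, \delta_i, U_i), i = 1, ..., n,
% with hazard e^{\eta(t,u)}; \lambda is the smoothing parameter, J(\eta) a roughness penalty.
\[
  -\frac{1}{n} \sum_{i=1}^{n}
    \left\{ \delta_i \, \eta(X_i, U_i)
            - \int_{Z_i}^{X_i} e^{\eta(t, U_i)} \, dt \right\}
  + \frac{\lambda}{2} \, J(\eta)
\]
```

In this form, each subject contributes an integral of $e^{\eta(t, U_i)}$ over its own interval $(Z_i, X_i)$. With a continuous covariate these integrals generally differ from subject to subject, which is one way to see why repeated numerical integration can dominate the computation.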
Gu, Purdue University.