Chi-square tests for randomly censored data

Joo Han Kim, Purdue University

Abstract

Under the random censorship model, we assume that the responses $X_1,\ldots,X_n$ are independent nonnegative random variables with continuous distribution function $F$. The censoring variables $Y_1,\ldots,Y_n$ are also nonnegative and are assumed to be a random sample, drawn independently of the $X_j$'s, from a population with continuous distribution function $G$. We say that the $X_j$'s are censored on the right by the $Y_j$'s, since we can observe only $Z_j = \min(X_j, Y_j)$ and $\delta_j = I[Z_j = X_j]$, which indicates whether $Z_j$ is an uncensored observation. The problem of goodness of fit for censored data is to test the null hypothesis that $F$ is a member of a family $\{F(\cdot \mid \theta)\}$ of distribution functions indexed by a parameter $\theta$ running over a parameter space $\Omega$. In Chapter 2, three different chi-square statistics are presented. For the simple null hypothesis, we obtain a chi-square statistic as a nonnegative definite quadratic form in the estimated cell frequencies, using the product-limit estimator introduced by Kaplan and Meier (1958). For the composite null hypothesis, two chi-square statistics are developed by employing the minimum chi-square estimator and the raw data maximum likelihood estimator. In Chapter 3, we consider general chi-square statistics, which are nonnegative definite quadratic forms in the estimated cell frequencies obtained from the product-limit estimator, allowing random cells and general estimators of nuisance parameters. The large sample behavior of these statistics under the null hypothesis and under local alternatives is presented. In Chapter 4, the chi-square statistics developed in Chapter 2 and the statistics proposed by Akritas (1988) are compared on the basis of asymptotic relative Pitman efficiency and approximate Bahadur slopes. For testing a simple hypothesis, it is shown that neither statistic dominates the other. The efficiencies are shown to depend on the degree of censoring. In Chapter 5, the specific forms of the chi-square statistics developed in Chapter 2 for testing fit to the exponential and Weibull families are presented, and the calculation of the chi-square statistics on sets of real data is illustrated.
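
The observation scheme and the product-limit estimator described above can be illustrated with a minimal Python sketch. It is not taken from the dissertation: the function names, the exponential choices for $F$ and $G$, and the fixed cell boundaries are assumptions made for illustration only. The sketch forms $(Z_j, \delta_j)$, computes the Kaplan-Meier estimate of $1 - F$, and derives estimated cell probabilities from it. It stops short of any chi-square statistic, since the quadratic forms themselves are not specified in this abstract.

```python
import random


def simulate_censored(n, rate_x=1.0, rate_y=0.5, seed=0):
    """Return n pairs (Z_j, delta_j) with Z_j = min(X_j, Y_j).

    Assumption for illustration: X_j and Y_j are exponential.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.expovariate(rate_x)  # response X_j
        y = rng.expovariate(rate_y)  # censoring variable Y_j
        data.append((min(x, y), 1 if x <= y else 0))  # delta_j = I[Z_j = X_j]
    return data


def product_limit(data):
    """Kaplan-Meier estimate of S(t) = 1 - F(t).

    Returns (time, S) steps at the uncensored observations.
    Assumes no tied observation times (F and G continuous).
    """
    ordered = sorted(data)
    n = len(ordered)
    surv = 1.0
    steps = []
    for i, (z, delta) in enumerate(ordered):
        at_risk = n - i  # observations with Z >= z
        if delta == 1:
            surv *= (at_risk - 1) / at_risk
            steps.append((z, surv))
    return steps


def survival_at(steps, t):
    """Evaluate the right-continuous step function S at time t."""
    s = 1.0
    for z, value in steps:
        if z > t:
            break
        s = value
    return s


def estimated_cell_probs(steps, boundaries):
    """Estimated probabilities of the cells (0, a_1], ..., (a_k, inf)."""
    edges = [0.0] + list(boundaries)
    surv = [survival_at(steps, a) for a in edges]
    probs = [surv[i] - surv[i + 1] for i in range(len(edges) - 1)]
    probs.append(surv[-1])  # mass the estimator places beyond a_k
    return probs
```

For example, `estimated_cell_probs(product_limit(simulate_censored(200)), [0.5, 1.0, 2.0])` returns four estimated cell probabilities that sum to one. Multiplying them by the sample size gives estimated cell frequencies of the kind the abstract refers to.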

Degree

Ph.D.

Advisors

Moore, Purdue University.

Subject Area

Statistics
