SOME RESULTS IN THE THEORY OF SUBSET SELECTION PROCEDURES

HWA-MING YANG, Purdue University

Abstract

Selection and ranking (ordering) problems in statistical inference arise mainly because the classical tests of homogeneity are often inadequate in certain situations where the experimenter is interested in comparing k (≥ 2) populations, treatments or processes with the goal of selecting one or more worthwhile (good) populations. A formulation in which the population of interest is selected with a fixed minimum probability P* over the entire parameter space is called the subset selection formulation; it was given by Gupta (1956, 1963, 1965). In this formulation, based on any given sample sizes, one chooses a subset of the populations whose size depends on the observed outcome of the experiment. This formulation differs from the so-called 'indifference zone' formulation, in which the emphasis is on designing the experiment (finding the common minimum sample size) so that the probability that the selected population is truly the one of interest is at least equal to a number P* whenever the parameters lie outside an indifference zone.

Chapter I of this thesis considers the problem of selecting a subset containing all populations that are better than a control, under the assumption of an ordering prior. Here, by an ordering prior we mean that there exists a known simple or partial order relationship among the unknown parameters of the treatments (excluding the control). Three new selection procedures are proposed and studied. These procedures meet the usual requirement that the probability of a correct selection is greater than or equal to a pre-determined number P*. Two of the three procedures use the isotonic regression over the sample means of the k treatments with respect to the given ordering prior. Tables necessary to carry out the selection procedures based on the isotonic approach for the unknown means of normal populations and gamma populations are given. Monte Carlo comparisons of the performance of several procedures for the normal and gamma means problems were carried out in several selected cases. The results of this study seem to indicate that the procedures based on isotonic estimators always have superior performance, especially when there is more than one bad population (in comparison with the control).

Chapter II deals with a new 'Bayes-P*' approach to the problem of selecting a subset which contains the 'best' of k populations. Here, by best we mean the (unknown) population with the largest unknown mean. The (non-randomized) Bayes-P* rule refers to a rule with minimum risk in the class of (non-randomized) rules which satisfy the condition that the posterior probability of selecting the best is at least equal to P*. Given the priors of the unknown parameters, two 'Bayes-P*' subset selection procedures ψ^B and ψ^B_NR (randomized and non-randomized, respectively) under certain loss functions are obtained and compared with the classical maximum-type means procedure ψ^M. The comparisons of the performance of ψ^B with ψ^B_NR and ψ^M, based on Monte Carlo studies, indicate that the procedure ψ^B always has higher 'efficiency' and a smaller expected size of the selected subset. The studies also indicate that ψ^B is robust when the true distributions are not normal but are other symmetric distributions such as the logistic, the double exponential (Laplace) and the gross error model (the contaminated distribution).
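For concreteness, the two probability requirements described above can be sketched formally as follows; the symbols Ω, θ, CS, R, S_ψ, r and π are assumed here for exposition and are not taken from the thesis itself. Gupta's subset selection formulation requires, for a selection rule R,

    \inf_{\theta \in \Omega} P_{\theta}(\mathrm{CS} \mid R) \;\ge\; P^{*},

that is, a correct selection (inclusion of the population of interest in the selected subset) must have probability at least P* over the entire parameter space Ω. The Bayes-P* rule of Chapter II can be sketched, under a prior π and a given loss, as

    \psi^{B} \;=\; \arg\min_{\psi}\, r(\psi \mid \pi)
    \quad \text{subject to} \quad
    P\big(\text{best} \in S_{\psi} \mid \text{data}\big) \;\ge\; P^{*},

i.e. the rule of minimum risk among those rules whose posterior probability of selecting the best population is at least P*.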
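The isotonic-regression step mentioned in Chapter I can also be illustrated with a short, self-contained sketch. The pool-adjacent-violators routine below computes a weighted isotonic (nondecreasing) fit of sample means under a simple order; the function name, the data and the final control-comparison step are illustrative assumptions for exposition only, not the thesis's actual procedures or constants.

    # Illustrative sketch: weighted isotonic regression of sample means under a
    # simple order via the pool-adjacent-violators algorithm (PAVA).
    def pava(means, weights):
        """Nondecreasing isotonic fit of `means` with the given `weights`."""
        blocks = []  # each block is [weighted mean, total weight, count]
        for m, w in zip(means, weights):
            blocks.append([m, w, 1])
            # Merge adjacent blocks while the nondecreasing constraint is violated.
            while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
                v2, w2, c2 = blocks.pop()
                v1, w1, c1 = blocks.pop()
                tw = w1 + w2
                blocks.append([(v1 * w1 + v2 * w2) / tw, tw, c1 + c2])
        # Expand block values back to one fitted value per treatment.
        fitted = []
        for v, _, c in blocks:
            fitted.extend([v] * c)
        return fitted

    if __name__ == "__main__":
        # Hypothetical sample means of k = 5 treatments believed to be
        # nondecreasingly ordered, with equal sample sizes as weights.
        sample_means = [1.8, 1.2, 2.5, 2.3, 3.1]
        weights = [10, 10, 10, 10, 10]
        iso_means = pava(sample_means, weights)
        print(iso_means)  # [1.5, 1.5, 2.4, 2.4, 3.1]

        # Illustrative control comparison: select treatments whose isotonic
        # estimate exceeds the control mean minus a constant d; in an actual
        # procedure d would be chosen to guarantee the P* requirement.
        control_mean, d = 2.0, 0.3
        selected = [i for i, v in enumerate(iso_means) if v >= control_mean - d]
        print(selected)  # indices of the selected treatments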

Degree

Ph.D.

Subject Area

Statistics
