MAXIMUM LIKELIHOOD DISCRIMINATION AND LOGISTIC REGRESSION

TZU-CHEG KAO, Purdue University

Abstract

Suppose that a random vector X comes from one of two populations, H₀ or H₁. An optimal classification rule is one that minimizes the probability of misclassification. The optimal rule can be estimated either from a single random sample of size n or from separate random samples of sizes n₀ and n₁ drawn from H₀ and H₁ respectively, where n₀ = n(1 − π*), n₁ = nπ*, 0 < π* < 1. Classification rules can be constructed using either the discriminant function or the logistic regression approach. The problems are (1) to compare the two sampling schemes and (2) to determine the optimal value of π*. Solving these problems requires theorems on the consistency and asymptotic normality of the estimator of the parameter vector β obtained by maximizing the product ∏_{i=1}^n gᵢ(Yᵢ, β) with respect to β. Here the gᵢ(yᵢ, β) are positive functionals, and the Yᵢ are independent but not necessarily identically distributed, with distribution functions Fᵢ(·, θ_s, v). The special case of two normal populations differing in mean but not in covariance is considered in detail. Criteria are given for determining which sampling scheme is more efficient. Under both sampling schemes the discriminant function estimation procedure is preferred. Under the stratified sampling scheme, optimal choices of π* are found using several criteria.
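The setting in the abstract can be illustrated with a small simulation sketch. Under the stated assumptions (two normal populations differing in mean but sharing a covariance matrix, stratified sampling with fraction π* from H₁), the discriminant function approach plugs sample means and a pooled covariance into the optimal linear rule. The means, covariance, sample sizes, and equal priors below are illustrative choices, not values from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stratified sampling: n0 = n(1 - pi*) draws from H0, n1 = n*pi* from H1.
n, pi_star = 1000, 0.5
n0, n1 = int(n * (1 - pi_star)), int(n * pi_star)

# Illustrative populations: same covariance, different means.
mu0, mu1 = np.array([0.0, 0.0]), np.array([1.0, 1.0])
cov = np.eye(2)

x0 = rng.multivariate_normal(mu0, cov, n0)
x1 = rng.multivariate_normal(mu1, cov, n1)

# Plug-in (discriminant function) estimate of the linear rule:
# classify to H1 when (m1 - m0)' S^{-1} (x - (m0 + m1)/2) > 0
# (threshold 0 corresponds to equal prior probabilities).
m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
S = ((n0 - 1) * np.cov(x0.T) + (n1 - 1) * np.cov(x1.T)) / (n0 + n1 - 2)
w = np.linalg.solve(S, m1 - m0)
c = w @ (m0 + m1) / 2

def classify(x):
    """Return 1 for points assigned to H1, 0 for H0."""
    return (x @ w > c).astype(int)

# Monte Carlo estimate of the misclassification probability
# on fresh samples, weighting the two populations equally.
t0 = rng.multivariate_normal(mu0, cov, 5000)
t1 = rng.multivariate_normal(mu1, cov, 5000)
err = 0.5 * classify(t0).mean() + 0.5 * (1 - classify(t1)).mean()
print(err)
```

With these illustrative parameters the estimated error should be close to the Bayes error Φ(−‖μ₁ − μ₀‖/2) ≈ 0.24; the logistic regression alternative discussed in the abstract would instead estimate β by maximizing the conditional likelihood of the labels.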

Degree

Ph.D.

Subject Area

Statistics
