A decision theoretic approach to the dimensionality problem in the context of multivariate analysis of variance

Anindya K De, Purdue University

Abstract

In the context of multivariate analysis of variance (MANOVA), it is important to know both the dimension of the space generated by the mean vectors and the space itself, since this helps identify a structural relation, if one exists. In this work the following multiple decision theoretic setup is considered to address the problem. Let $X_{ij}$ $(d \times 1)$ be the $j$th observation from the $i$th population, having a continuous distribution with mean $\mu_i$ and covariance matrix $\Omega_i$, $i = 1, \ldots, p$, $p > d$; $j = 1, \ldots, n$. Let $M = (\mu_1 - \bar\mu, \ldots, \mu_p - \bar\mu)$, where $\bar\mu = \frac{1}{p} \sum_{i=1}^{p} \mu_i$. The possible actions or decisions are $a_0, \ldots, a_d$, where $a_k$ stands for the decision that the rank of the matrix $M$ is $k$. Initially all the $\Omega_i$'s are assumed to be identical. An ad hoc rule is proposed from heuristic considerations for the case $d = 2$, based on the eigenvalues of the between sum of squares and products matrix $H = n \sum_{i=1}^{p} (\bar X_i - \bar X)(\bar X_i - \bar X)'$, where $\bar X_i$ is the sample mean for the $i$th population and $\bar X$ is the overall sample mean. It is shown that when the $X_{ij}$'s are normal, the proposed ad hoc rule is close to being Bayes for a suitable prior. The ad hoc rule is then slightly modified to incorporate some of the finer features of the Bayes rule. Simulation studies show that for $p = 3, 4, 5$ the ad hoc rule attains the Bayes risk up to a factor of 110%, so that no other procedure can dominate it uniformly by more than 10%; hence it is close to being admissible. The rule should therefore appeal to both frequentists and Bayesians. Corresponding results are also derived for general $p$ and $d$. The performance of the rule is studied in detail for the cases $d = 2$ and $d = 3$ in the frequentist paradigm. Some asymptotic results are obtained when $p$ or $n$ tends to infinity.
Estimates of the space in which the mean vectors lie, and of the mean vectors themselves, are also obtained. The ad hoc rule extends easily to the cases where (i) the $\Omega_i$'s are identical but unknown, (ii) the $\Omega_i$'s are different but known, and (iii) the $\Omega_i$'s are different and unknown. The performance of the rule in these cases is quite satisfactory. Further investigation is carried out by constructing reference and noninformative priors for the problem when $d = 2$. The performance of the resulting Bayes rules is studied and compared with that of the ad hoc rule. An 'automatic' model selection criterion is also proposed for the case where the priors are improper.
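
As a rough illustration of the quantities defined in the abstract, the following Python sketch computes the between sum of squares and products matrix H for the case d = 2 and returns its two eigenvalues. It does not reproduce the ad hoc rule, since the abstract does not state it. The function names, the simulated N(mu_i, I) data, and the particular mean vectors are assumptions made for illustration only.

```python
import math
import random


def between_sscp(samples):
    """Between sum of squares and products matrix H.

    samples[i][j] is the j-th observation (a list of d floats) from
    population i; every population is assumed to contribute the same
    number n of observations, as in the abstract.
    """
    p = len(samples)
    n = len(samples[0])
    d = len(samples[0][0])
    # Sample mean of each population.
    group_means = [
        [sum(obs[k] for obs in group) / n for k in range(d)]
        for group in samples
    ]
    # Overall sample mean; with equal n it is the average of the group means.
    grand_mean = [sum(m[k] for m in group_means) / p for k in range(d)]
    H = [[0.0] * d for _ in range(d)]
    for m in group_means:
        dev = [m[k] - grand_mean[k] for k in range(d)]
        for r in range(d):
            for c in range(d):
                H[r][c] += n * dev[r] * dev[c]
    return H


def eigenvalues_2x2(H):
    """Eigenvalues (largest first) of a symmetric 2 x 2 matrix, in closed form."""
    a, b, c = H[0][0], H[0][1], H[1][1]
    centre = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return centre + radius, centre - radius


def simulate(means, n, rng):
    """Draw n observations per population from N(mu_i, I_2) (illustrative choice)."""
    return [
        [[rng.gauss(mu[0], 1.0), rng.gauss(mu[1], 1.0)] for _ in range(n)]
        for mu in means
    ]


rng = random.Random(0)
# Hypothetical example: p = 4 mean vectors on a single line, so rank(M) = 1.
collinear_means = [(-3.0, -3.0), (-1.0, -1.0), (1.0, 1.0), (3.0, 3.0)]
H = between_sscp(simulate(collinear_means, n=50, rng=rng))
largest, smallest = eigenvalues_2x2(H)
print(largest, smallest)
```

When the mean vectors lie on one line, one would expect a single dominant eigenvalue and a second eigenvalue on the scale of the noise; any rule for choosing among $a_0$, $a_1$, $a_2$ from these eigenvalues would need thresholds, which the abstract does not give.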

Degree

Ph.D.

Advisors

McCabe, Purdue University.

Subject Area

Statistics

