The B-average divergence for m distinct classes, resulting from the linear transformation y = Bx, is proposed as a feature selection criterion, where B is a k by n matrix of rank k <= n. It is shown that if the B-average divergence resulting from B is large enough, then the probability of misclassification, considered as a function over the class of all k by n matrices, is essentially minimized by B. A computer program, utilizing a gradient procedure, is developed to numerically maximize the B-average divergence, and results are presented for the C1 flight line. For this example, corresponding to 9 distinct classes, most of the discriminatory information is found to lie in a 3-dimensional subspace, defined by an appropriately chosen 3 by 12 matrix B.
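The abstract gives no formulas, but the procedure it describes can be sketched under standard assumptions for this literature: Gaussian class-conditional densities, the symmetric (divergence) measure between each pair of projected classes, and a simple gradient ascent standing in for the paper's gradient procedure. All function names below are illustrative, and a finite-difference gradient is used for brevity; this is a minimal sketch, not the paper's implementation.

```python
import numpy as np

def pair_divergence(B, mu_i, cov_i, mu_j, cov_j):
    """Divergence between two Gaussian classes after projection y = Bx."""
    Si = B @ cov_i @ B.T          # projected covariance of class i
    Sj = B @ cov_j @ B.T          # projected covariance of class j
    Si_inv = np.linalg.inv(Si)
    Sj_inv = np.linalg.inv(Sj)
    d = B @ (mu_i - mu_j)         # projected mean difference
    k = B.shape[0]
    # Covariance term plus mean-separation term (symmetric KL divergence).
    term1 = 0.5 * np.trace(Si_inv @ Sj + Sj_inv @ Si) - k
    term2 = 0.5 * d @ (Si_inv + Sj_inv) @ d
    return term1 + term2

def average_divergence(B, mus, covs):
    """B-average divergence: mean pairwise divergence over all class pairs."""
    m = len(mus)
    total, n_pairs = 0.0, 0
    for i in range(m):
        for j in range(i + 1, m):
            total += pair_divergence(B, mus[i], covs[i], mus[j], covs[j])
            n_pairs += 1
    return total / n_pairs

def maximize(B0, mus, covs, steps=100, lr=0.05, eps=1e-5):
    """Gradient ascent on the B-average divergence over k-by-n matrices,
    using a finite-difference gradient (a stand-in for the paper's procedure)."""
    B = B0.copy()
    best_B, best_val = B.copy(), average_divergence(B, mus, covs)
    for _ in range(steps):
        base = average_divergence(B, mus, covs)
        grad = np.zeros_like(B)
        for idx in np.ndindex(B.shape):
            Bp = B.copy()
            Bp[idx] += eps
            grad[idx] = (average_divergence(Bp, mus, covs) - base) / eps
        B = B + lr * grad / (np.linalg.norm(grad) + 1e-12)
        val = average_divergence(B, mus, covs)
        if val > best_val:
            best_val, best_B = val, B.copy()
    return best_B
```

In the C1 flight line example this search would run over 3 by 12 matrices (k = 3, n = 12) with 9 classes; the sketch works for any k <= n. Note that the divergence is invariant under nonsingular transformations of y, so only the row space of B matters, not its scaling.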