Department of Electrical and Computer Engineering Technical Reports

CLASSIFICATION OF HIGH DIMENSIONAL DATA WITH LIMITED TRAINING SAMPLES

Saldju Tadjudin, Purdue University School of Electrical and Computer Engineering
David Landgrebe, Purdue University School of Electrical and Computer Engineering

Abstract

An important problem in pattern recognition is the effect of limited training samples on classification performance. When the ratio of the number of training samples to the dimensionality is small, parameter estimates become highly variable, causing the deterioration of classification performance. This problem has become more prevalent in remote sensing with the emergence of a new generation of sensors. While the new sensor technology provides higher spectral and spatial resolution, enabling a greater number of spectrally separable classes to be identified, the needed labeled samples for designing the classifier remain difficult and expensive to acquire. In this thesis, several issues concerning the classification of high dimensional data with limited training samples are addressed. First of all, better parameter estimates can be obtained using a large number of unlabeled samples in addition to training samples under the mixture model. However, the estimation method is sensitive to the presence of statistical outliers. In remote sensing data, classes with few samples are difficult to identify and may constitute statistical outliers. Therefore, a robust parameter estimation method for the mixture model is introduced. Motivated by the fact that covariance estimates become highly variable with limited training samples, a covariance estimator is developed using a Bayesian formulation. The proposed covariance estimator is advantageous when the training set size varies and reflects the prior of each class. Finally, a binary tree design is proposed to deal with the problem of varying training sample size. The proposed binary tree can function as both a classifier and a feature extraction method. The benefits and limitations of the proposed methods are discussed and demonstrated with experiments.

Date of this Version

April 1998

