Hierarchical classification in high-dimensional, numerous class cases

Byungyong Kim, Purdue University

Abstract

As sensor technology progresses, imaging sensors of increasingly high resolution have been developed. HIRIS (High Resolution Imaging Spectrometer), for example, will gather data simultaneously in 192 spectral bands in the 0.4-2.5 micrometer wavelength region at 30 m spatial resolution. Also, AVIRIS (Airborne Visible and Infrared Imaging Spectrometer) covers the 0.4-2.5 micrometer region in 224 spectral bands. These sensors give more detailed, complex data for each picture element and greatly increase the dimensionality of data over past systems. In applying pattern recognition methods to remote sensing problems, an inherent limitation is that there is almost always only a small number of training samples with which to design the classifier. Growth in both the dimensionality and the number of classes is likely to aggravate the already significant limitation on training samples. Thus, methods must be found for future data analysis that can perform effectively in the face of large numbers of classes without unduly aggravating the limitations on training. A valid list of classes for remote sensing data must satisfy two requirements: each class must be of informational value (i.e., useful in a pragmatic sense), and the classes must be spectrally or otherwise separable (i.e., distinguishable based on the available data). Therefore, simultaneously reconciling a property of the data (separability) with a property of the application (informational value) is central to developing a new approach to classifier design. In this work we propose decision tree classifiers that are more efficient and accurate in this situation of high dimensionality and large numbers of classes, and in particular, three methods for designing a decision tree classifier: a top-down approach, a bottom-up approach, and a hybrid approach.
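The bottom-up approach to tree design mentioned above can be illustrated with a minimal sketch: classes are agglomeratively merged, most-similar pair first, so that the hardest distinctions are deferred to the leaves and the most separable split sits at the root. This is not the dissertation's implementation; the Euclidean distance between class mean vectors is used here only as a stand-in separability measure, and the class names and mean values are synthetic.

```python
import numpy as np

def bottom_up_tree(class_means):
    """Agglomeratively merge the two most similar class groups until one
    root remains, recording the merge order (a binary tree, bottom-up).
    Separability proxy: Euclidean distance between group mean vectors."""
    # Each node is (set of member classes, mean vector); leaves are single classes.
    nodes = [({c}, np.asarray(m, dtype=float)) for c, m in class_means.items()]
    merges = []
    while len(nodes) > 1:
        # Find the closest (least separable) pair of current groups.
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                d = np.linalg.norm(nodes[i][1] - nodes[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        mi, mj = nodes[i], nodes[j]
        members = mi[0] | mj[0]
        # Size-weighted mean of the merged group.
        mean = (mi[1] * len(mi[0]) + mj[1] * len(mj[0])) / len(members)
        merges.append((sorted(mi[0]), sorted(mj[0]), d))
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
        nodes.append((members, mean))
    return merges  # earliest merge = least separable pair

# Hypothetical two-band class means for four informational classes.
means = {"water": [0.1, 0.2], "soil": [0.8, 0.7],
         "corn": [0.55, 0.6], "wheat": [0.6, 0.65]}
order = bottom_up_tree(means)
```

With these synthetic means, the spectrally similar "corn" and "wheat" are merged first, so distinguishing them becomes the deepest decision in the tree.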
Also, remote sensing systems that perform pattern recognition tasks on high-dimensional data with small training sets require efficient methods for feature extraction and for predicting the optimal number of features to achieve minimum classification error. Three feature extraction techniques are implemented. The canonical and extended canonical techniques depend mainly upon the mean difference between two classes, while the autocorrelation technique depends upon correlation differences. The mathematical relationship between sample size, dimensionality, and risk value is derived. The incremental error is affected simultaneously by two factors, dimensionality and separability. For predicting the optimal number of features, it is concluded that in a transformed coordinate space it is best to use only the single best feature when few samples are available. Empirical results indicate that a reasonable sample size is six to ten times the dimensionality.
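A canonical (Fisher-style) extraction of the kind described, driven by the mean difference between two classes, can be sketched as follows. This is only an illustrative sketch, not the dissertation's method: the synthetic Gaussian data, the small regularization term, and the choice of eight samples per dimension (within the six-to-ten heuristic above) are all assumptions made here for the example.

```python
import numpy as np

def canonical_direction(x1, x2):
    """Canonical discriminant direction for two classes,
    w = Sw^{-1} (m1 - m2), driven by the class-mean difference."""
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    # Pooled within-class scatter, lightly regularized for stability
    # when only small numbers of samples are available.
    sw = np.cov(x1, rowvar=False) + np.cov(x2, rowvar=False)
    sw += 1e-6 * np.eye(x1.shape[1])
    w = np.linalg.solve(sw, m1 - m2)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
dim = 5
# Sample-size heuristic from the abstract: ~6-10 times the dimensionality.
n = 8 * dim
x1 = rng.normal(0.0, 1.0, size=(n, dim))   # class 1: mean 0 in every band
x2 = rng.normal(1.0, 1.0, size=(n, dim))   # class 2: mean 1 in every band
w = canonical_direction(x1, x2)
# Projecting onto w reduces the problem to a single, highly separable feature.
p1, p2 = x1 @ w, x2 @ w
```

Projecting onto the single direction `w` matches the conclusion above that, with few samples, using only the best one transformed feature is preferable to retaining many raw bands.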

Degree

Ph.D.

Advisors

Landgrebe, Purdue University.

Subject Area

Electrical engineering|Remote sensing
