Pattern recognition in remote-sensing imagery using data mining and statistical techniques

Rajesh Kumar Singh, Purdue University


Remote sensing image classification has been examined by scientists in the past using classical statistical and machine-learning techniques. Statistical techniques such as Bayesian classifiers work well when the data are noise-free or normalized, while implicit models, or machine learning algorithms, such as artificial neural networks (ANN) are more of a "black box", relying on iterative training to adjust parameters using transfer functions to improve their predictive ability relative to training data for which the outputs are known. The statistical approach performs better when a priori information about categories is available, but it has limitations in the case of objective classification and when the distribution of data points is not known, as is the case with remote sensing satellite data. Data mining algorithms, which have potential advantages over classical statistical classifiers in analyzing remote sensing imagery data, were examined for use in land use classification of remote sensing data. Spectral classifications of LANDSAT TM imagery from 1989 were conducted using data mining and statistical techniques. The site selected for this research was NASA's Kennedy Space Center (KSC) in Florida. The raw satellite data used in classification were obtained using feature-extraction image processing techniques. The classification effort can broadly be divided into two major categories: (a) supervised classification with subjectively defined, previously known classes, and (b) unsupervised classification with objectively categorized natural groups of similar attributes. Several predictive models and segmentation classification schemes were developed. The techniques used for evaluation of spectral patterns were based on both statistical and data mining algorithms.
The statistical technique was the k-nearest neighbor method, while the data mining algorithms included: (1) a back-propagation artificial neural network technique for two architectures; (2) our decision-tree predictive (C4.5) models using 3 split-score methods and a 'stacking' process; and (3) six clustering segmentation schemes using the Expectation Maximization (EM) algorithm. The methodology developed for each classifier involved descriptive analysis and transformation techniques to obtain derived variables and categorized variables. Data mining techniques, like decision trees, neural networks and segmentation clustering, found patterns based on each algorithm's computational ability, were not guided by user interaction, and thus yielded better performance. It was concluded that data mining techniques have the potential to improve spectral classification of remote sensing imagery. (Abstract shortened by UMI.)
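
As an illustration of the k-nearest neighbor method named above, the following is a minimal sketch and not the dissertation's implementation. It assumes each pixel is a tuple of spectral band values and labels a pixel by majority vote among its k closest training pixels. The function name `knn_classify`, the three-band reflectance values, and the class names are all hypothetical, chosen only for the example.

```python
from collections import Counter
import math


def knn_classify(train_pixels, train_labels, pixel, k=3):
    """Label one pixel by majority vote among its k nearest training pixels."""
    # Euclidean distance in spectral (band) space to every training pixel.
    distances = [
        (math.dist(pixel, p), label)
        for p, label in zip(train_pixels, train_labels)
    ]
    # Sort by distance only, so labels never take part in the comparison.
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up three-band reflectance values and class names, for illustration only.
train_pixels = [
    (0.05, 0.04, 0.02), (0.06, 0.05, 0.03),  # "water"
    (0.08, 0.10, 0.45), (0.07, 0.09, 0.50),  # "vegetation"
    (0.30, 0.32, 0.35), (0.28, 0.31, 0.33),  # "bare_soil"
]
train_labels = [
    "water", "water",
    "vegetation", "vegetation",
    "bare_soil", "bare_soil",
]

predicted = knn_classify(train_pixels, train_labels, (0.07, 0.10, 0.48), k=3)
print(predicted)
```

In practice the choice of k and of the distance measure affects the result, and band values are usually rescaled first so that no single band dominates the distance.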




Engel, Purdue University.

Subject Area

Agricultural engineering|Remote sensing
