This study presents a newly developed system for pattern recognition in unsupervised environments, currently operational at the Marshall Space Flight Center, that can process large-volume data sets with minimal computational resources and human intervention. The system treats unsupervised learning as the problem of clustering large-volume multidimensional data sets and approaches it through the novel device of terrain development in the multidimensional histogram space. The terrain is developed by connecting each histogram cell to all of its higher-density neighbors, which amalgamates the cells belonging to each cluster. Cells that are connected to more than one cluster define fuzzy boundaries between the clusters. Discriminant hyperplanes are then derived that not only separate the clusters but also form least-squares fits to the centroids of the cells defining the fuzzy boundaries. These discriminant functions are designed with a new algorithm developed specifically for discriminating between clusters with fuzzy boundaries. The curse of dimensionality, the often-encountered growth in computational complexity with the dimensionality of the data sets, is tackled by a relatively simple preprocessing technique: features are ordered and selected on the basis of an ad hoc histogram information measure evaluated for each feature. The conceptual and computational claims of this HINDU system, presently operational on an IBM 360/65, have been verified by simulation tests using remotely sensed multispectral LANDSAT data.
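
The terrain-development step described above can be illustrated with a minimal sketch. It is not the HINDU implementation: the function name `climb_terrain`, the dictionary representation of the histogram, the choice of neighborhood (all cells whose indices differ by at most one in every dimension), and the treatment of equal-density cells as unconnected are all assumptions made for illustration. Each cell is linked to every strictly denser neighbor, cells with no denser neighbor are taken as cluster peaks, and a cell that reaches more than one peak is treated as part of a fuzzy boundary.

```python
from itertools import product


def climb_terrain(hist):
    """Map each histogram cell to the set of peaks it reaches.

    hist: dict mapping integer cell-index tuples to counts
          (only nonempty cells need to be present).
    Assumption: neighbors are cells whose indices differ by at most 1
    in every dimension; ties in density are not connected.
    """
    dim = len(next(iter(hist)))
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]
    peaks_of = {}
    # Visit cells from densest to sparsest, so every strictly denser
    # neighbor has already been resolved when a cell is processed.
    for cell in sorted(hist, key=hist.get, reverse=True):
        reached = set()
        for off in offsets:
            nb = tuple(c + d for c, d in zip(cell, off))
            if hist.get(nb, 0) > hist[cell]:
                reached |= peaks_of[nb]
        # A cell with no denser neighbor is itself a peak (cluster seed).
        peaks_of[cell] = frozenset(reached) if reached else frozenset({cell})
    return peaks_of


# Hypothetical one-dimensional histogram with two density peaks.
hist = {(0,): 1, (1,): 5, (2,): 2, (3,): 1, (4,): 3, (5,): 6, (6,): 2}
peaks_of = climb_terrain(hist)
fuzzy = sorted(cell for cell, peaks in peaks_of.items() if len(peaks) > 1)
```

In this toy histogram the cells `(1,)` and `(5,)` are the two peaks, and the valley cell `(3,)` climbs to both of them, so it is the only cell flagged as a fuzzy boundary. The centroids of such boundary cells are the points to which the abstract's discriminant hyperplanes are fitted; that fitting step is not sketched here because the abstract does not describe the algorithm.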

Date of this Version