PATTERN RECOGNITION WITH STRINGS, SUBSTRINGS AND BOUNDARIES

BASANTKUMAR JOHN OOMMEN, Purdue University

Abstract

The purpose of this research is to study similarity and dissimilarity measures between strings, substrings and polygons, and to use these measures in various pattern recognition problems. An abstract basis for many of the known similarity and dissimilarity measures involving a set of strings has been presented. By virtue of the abstract formulation, many of the numerical and non-numerical measures of similarity involving strings can be computed using a common computational scheme. A deterministic algorithm which possesses certain optimal computational properties has been proposed for the recognition of noisy strings. Further, a stochastic model for a channel causing deletion, insertion and substitution errors in strings according to an arbitrary distribution has been discussed. An algorithm to compute the probability of receiving a string Y, given that a string X was transmitted, has been presented. Using these results, error correction of strings can be achieved with a minimum probability of error. The question of estimating a set of words containing a certain string by processing a noisy version of this string has been studied. This problem, which has not previously been addressed in the literature, is called the noisy substring matching problem. A deterministic algorithm has been proposed to solve this problem. Finally, some geometrical dissimilarity measures between polygons have been proposed. These measures utilize the entire geometrical information in the boundaries of the contours, and not merely the global features of the boundaries. Using these dissimilarity measures, pattern recognition of closed contours can be performed. Experimental results have been included which justify the theoretical results presented. In the study of strings and substrings, the experiments have been conducted using subsets of the 1023 most common English words. Four of the Great Lakes of North America (Erie, Huron, Michigan and Superior) have been used in the experiments related to the recognition of closed contours.
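
As a rough illustration of the kind of string dissimilarity measure the abstract refers to, the sketch below computes the classical Levenshtein distance, which counts deletions, insertions and substitutions with unit cost, and uses it to pick the dictionary word nearest to a noisy string. This is not the dissertation's algorithm: the unit costs, the function names `edit_distance` and `recognize`, and the sample word list are all assumptions chosen for illustration.

```python
from typing import Iterable


def edit_distance(x: str, y: str) -> int:
    """Levenshtein distance from x to y with unit costs (an assumed cost model)."""
    # prev[j] holds the distance between the processed prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]  # turning x[:i] into the empty string takes i deletions
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # substitute cx by cy, or match for free
            ))
        prev = curr
    return prev[-1]


def recognize(noisy: str, dictionary: Iterable[str]) -> str:
    """Return the dictionary word with the smallest edit distance to the noisy string."""
    return min(dictionary, key=lambda word: edit_distance(word, noisy))


if __name__ == "__main__":
    words = ["pattern", "lantern", "patent"]  # hypothetical dictionary
    print(recognize("patern", words))
```

A minimum-distance rule like this is only the simplest deterministic baseline; the probabilistic channel model described in the abstract would instead choose the word X that maximizes the probability of receiving Y given X.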

Degree

Ph.D.

Subject Area

Electrical engineering
