Learning and convergence properties of linear threshold elements or percept,rons are well understood for the case where the input vectors (or the training sets) to the perceptron are linearly separable. However, little is known about the behavior of a linear threshold element when the training sets are linearly non-separable. In this paper we present the first known results on the structure of linearly non-separable training sets and on the behavior of perceptrons when the set of input vectors is linearly non-separable. More precisely, we show that using the well known perceptron learning algorithm a linear threshold element can learn the input vectors that are provably learnable, and identify those vectors that cannot be learned without committing errors. We also show how a linear threshold element can be used to learn large linearly separable subsets of any given non-separable training set. In order to develop our results, we first establish formal characterizations of linearly non-separable training sets and define learnable structures for such patterns. We also prove computational complexity results for the related learning problems. Next, based on such characterizations, we show that a perceptron do,es the best one can expect for linearly non-separable sets of input vectors and learns as much as is theoretically possible.
Date of this Version