Learning algorithms for multilayer neural networks with application to tree-structured classifiers

Heng Guo, Purdue University

Abstract

Much of the recent interest in training neural networks has focused on the backpropagation training algorithm, with applications to pattern recognition and machine learning. This thesis addresses two issues in training multilayer neural networks: the convergence properties of the backpropagation algorithm, and the construction of structured neural network systems. First, it is shown that the backpropagation algorithm exhibits some unique dynamical properties related to the multilayer structure of the net: there is a manifold of equilibrium (or very-near-equilibrium) points, and the weights converge to different equilibrium points from different initial conditions. The analysis proceeds by deriving a simplified deterministic model using a describing-function-type approach. The global and local properties of this model are then studied via an associated ordinary differential equation (ODE), and an estimated stability bound on the stepsize is obtained. A simple two-neuron net and an XOR net are used to illustrate the analysis. Next, a classification tree with neural network feature extraction is proposed. In this method, small multilayer neural nets are used at the decision nodes of a binary classification tree to perform local nonlinear feature extraction. The proposed method improves on standard classification tree methods such as CART, which use single-coordinate and linear features. It also provides a structured approach to neural network classifier design, transferring the problem of selecting the net size to the simpler problem of finding a right-sized tree. The proposed tree construction method consists of tree growing and pruning phases. The tree is grown by training the nets with the backpropagation algorithm in conjunction with a class aggregation scheme, and it is pruned by a bottom-up recursive algorithm that obtains a minimum-error subtree.
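The bottom-up pruning idea can be sketched as a simple recursion: at each internal node, compare the held-out error incurred by collapsing the node to a leaf against the combined error of its optimally pruned children, and keep whichever is smaller. This is a minimal, hypothetical illustration; the `Node` class, the `err_as_leaf` field, and the tie-breaking rule are assumptions for the sketch, not the thesis implementation.

```python
class Node:
    """A binary-tree node carrying err_as_leaf, the (held-out) error
    incurred if this node is collapsed to a leaf (illustrative data)."""
    def __init__(self, err_as_leaf, left=None, right=None):
        self.err_as_leaf = err_as_leaf
        self.left, self.right = left, right

def prune(node):
    """Return the error of the minimum-error subtree rooted at node,
    collapsing the node whenever keeping its (pruned) children is no better."""
    if node.left is None and node.right is None:
        return node.err_as_leaf          # already a leaf
    subtree_err = prune(node.left) + prune(node.right)
    if node.err_as_leaf <= subtree_err:
        node.left = node.right = None    # collapse: leaf is at least as good
        return node.err_as_leaf
    return subtree_err                   # keep the split
```

Breaking ties in favor of collapsing (`<=`) yields the smallest tree among those attaining the minimum error.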
On a waveform recognition problem and a handwritten character recognition problem, the proposed method compares favorably to CART in terms of error rate, tree size, and training time, and it achieves a similar error rate to, and requires less training time than, a single large multilayer net.
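The kind of small multilayer net trained by backpropagation that the abstract refers to can be sketched as follows: a 2-2-1 sigmoid network fitted to the XOR data by gradient descent on a squared-error cost. All sizes, the learning rate, and the random seed are illustrative assumptions, not the thesis configuration.

```python
import numpy as np

# Hypothetical 2-2-1 sigmoid net for XOR, trained by plain backpropagation
# (batch gradient descent on 0.5 * mean squared error).

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W1, b1, W2, b2, X):
    h = sigmoid(X @ W1 + b1)   # hidden-layer activations
    y = sigmoid(h @ W2 + b2)   # output activation
    return h, y

def loss(W1, b1, W2, b2, X, T):
    _, y = forward(W1, b1, W2, b2, X)
    return 0.5 * np.mean((y - T) ** 2)

def backprop(W1, b1, W2, b2, X, T):
    """Gradients of the loss w.r.t. all weights and biases."""
    n = X.shape[0]
    h, y = forward(W1, b1, W2, b2, X)
    dy = (y - T) * y * (1 - y) / n       # delta at the output unit
    dh = (dy @ W2.T) * h * (1 - h)       # deltas at the hidden units
    return X.T @ dh, dh.sum(0), h.T @ dy, dy.sum(0)

rng = np.random.default_rng(0)           # seed chosen arbitrarily
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
T = np.array([[0], [1], [1], [0]], float)
W1, b1 = rng.normal(0.0, 1.0, (2, 2)), np.zeros(2)
W2, b2 = rng.normal(0.0, 1.0, (2, 1)), np.zeros(1)

lr = 0.5                                 # stepsize (illustrative)
for _ in range(5000):
    g1, gb1, g2, gb2 = backprop(W1, b1, W2, b2, X, T)
    W1 -= lr * g1; b1 -= lr * gb1
    W2 -= lr * g2; b2 -= lr * gb2
```

As the abstract notes, which equilibrium the weights approach depends on the initial conditions, so different seeds can yield different final weight configurations.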

Degree

Ph.D.

Advisors

Gelfand, Purdue University.

Subject Area

Electrical engineering|Artificial intelligence
