Nonlinear techniques for signal processing and recognition promise systems that are superior to linear systems in a number of ways, such as better performance in terms of accuracy, fault tolerance, and resolution, highly parallel architectures, and closer similarity to biological intelligent systems. The nonlinear techniques proposed take the form of multistage neural networks in which each stage can be a particular neural network and all the stages operate in parallel. The specific approach focused upon is the parallel, self-organizing, hierarchical neural network (PSHNN). A new type of PSHNN is discussed in which the outputs are allowed to be continuous-valued. The performance of the resulting networks is tested on problems of prediction of speech and of chaotic time series. Three types of networks are described, in which the stages are learned by the delta rule, sequential least-squares, and the backpropagation (BP) algorithm, respectively. In all cases studied, the new networks achieve better performance than linear prediction. This is shown both theoretically and experimentally. A revised BP algorithm is discussed for learning input nonlinearities. The advantage of the revised BP algorithm is that the PSHNN with revised BP stages can be extended to use the sequential least-squares (SLS) or the least mean absolute value (LMAV) rule in the last stage. A forward-backward training algorithm for parallel, self-organizing, hierarchical neural networks is described. Using linear algebra, it is shown that the forward-backward training of an n-stage PSHNN until convergence is equivalent to the pseudo-inverse solution for a single, total network designed in the least-squares sense, with the total input vector consisting of the actual input vector and its additional nonlinear transformations. These results are also valid when a single long input vector is partitioned into smaller-length vectors.
The advantages achieved include small modules for easy and fast learning, parallel implementation of small modules during testing, a faster convergence rate, better numerical error reduction, and suitability for learning input nonlinear transformations by the backpropagation algorithm. Better performance, in terms of a deeper minimum of the error function and a faster convergence rate, is achieved when a single BP network is replaced by a PSHNN of equal complexity in which each stage is a BP network of smaller complexity than the single BP network.
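The equivalence stated above between forward-backward training and the pseudo-inverse solution can be illustrated numerically. The following is a minimal sketch, not the report's implementation. It assumes that each stage is a linear least-squares fit on its own block of the total input vector, that stage outputs are summed, and that each stage is refit to the residual left by the other stages. The function name `forward_backward_fit`, the choice of squares and cubes as the nonlinear transformations, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def forward_backward_fit(blocks, d, sweeps=100):
    """Fit one linear least-squares stage per input block (illustrative).

    Each stage is refit to the residual left by all other stages,
    visiting the stages forward (first to last) and then backward.
    """
    weights = [np.zeros(B.shape[1]) for B in blocks]
    order = list(range(len(blocks)))
    for _ in range(sweeps):
        # Forward pass over all stages, then backward pass over the rest.
        for k in order + order[-2::-1]:
            # Summed output of every stage except stage k.
            others = sum(blocks[j] @ weights[j] for j in order if j != k)
            # Stage k learns the remaining error in the least-squares sense.
            weights[k] = np.linalg.lstsq(blocks[k], d - others, rcond=None)[0]
    return weights

# Synthetic data (assumption): 200 samples of a 3-dimensional input.
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 3))
d = np.sin(x @ np.array([0.5, -1.0, 0.8])) + 0.1 * rng.standard_normal(200)

# Total input vector: actual input (with bias) plus nonlinear transformations.
blocks = [np.hstack([np.ones((200, 1)), x]), x**2, x**3]

# Stage-by-stage solution versus the single total network.
w_staged = np.concatenate(forward_backward_fit(blocks, d))
w_joint = np.linalg.pinv(np.hstack(blocks)) @ d

print("max weight difference:", np.max(np.abs(w_staged - w_joint)))
```

In this sketch each stage only ever solves a small least-squares problem on its own block, which mirrors the "small modules" advantage. The same loop applies unchanged if `blocks` is instead formed by partitioning one long input vector into shorter column groups.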

Date of this Version

April 1992