Optimal linear combinations of neural networks

Sherif Hashem, Purdue University

Abstract

Neural network (NN) based modeling often involves trying multiple networks with different architectures, learning techniques, and training parameters in order to achieve "acceptable" model accuracy. Typically, one of the trained networks is chosen as "best", while the rest are discarded. In this dissertation, using optimal linear combinations (OLCs) of the corresponding outputs of a set of NNs is proposed as an alternative to using a single network. Modeling accuracy is measured by mean squared error (MSE) with respect to the distribution of random inputs. Optimality is defined by minimizing the MSE, with the resultant combination referred to as MSE-OLC. MSE-OLCs are investigated for four cases: allowing (or not) a constant term in the combination and requiring (or not) combination-weights to sum to one. In each case, deriving the MSE-OLC is straightforward and the optimal combination-weights are simple, requiring only matrix manipulations of bias, variance, and covariance information. In practice, information on bias, variance, and covariance is often difficult to obtain. Thus, the optimal combination-weights need to be estimated from observed data: observed inputs, the corresponding true responses, and the corresponding outputs for each component network. Given the data, computing the estimated optimal combination-weights is straightforward. Collinearity among the outputs and/or the errors of the component NNs sometimes degrades the generalization ability of the estimated MSE-OLC. In the presence of degrading collinearity, some of the component NNs may be dropped from the combination in order to improve generalization. Thus, six algorithms for selecting subsets of the NNs for the MSE-OLC are developed and tested. Several examples, including a real-world problem and an empirical study, are discussed. The examples illustrate the importance of addressing collinearity and demonstrate significant improvements in model accuracy as a result of employing MSE-OLCs supported by the NN selection algorithms.

Degree

Ph.D.

Advisors

Yih, Purdue University.

Subject Area

Industrial engineering; Statistics; Computer science
