Parametric models such as linear models are widely used in constructing discriminant functions. They provide useful, interpretable descriptions of simple structure in data. However, such simple structure does not always extend across an entire data set, and a trade-off arises between the ease of interpretation of a model and its predictive power. We propose a multistage linear SVM method to address this problem. For this purpose, we develop a strategy to partition the data set into a rejected subset and an accepted subset based on the sample margin induced by a linear SVM model. The samples in the accepted subset are considered adequately modeled, whereas the samples in the rejected subset are not. A second linear SVM model is then trained on the accepted subset, and the rejected subset is forwarded to the next stage for further processing. This procedure continues until no further partitioning is necessary or possible. The recursive partitioning of the data set naturally leads to a multistage classifier. The aggregation of the simple models obtained upon termination gives us the power to handle complex data while maintaining ease of model interpretation. Empirical results on several data sets show that the multistage linear SVM method achieves substantial improvements in training and testing accuracy compared with a single-stage linear SVM model, and also circumvents the overfitting issue often faced by more complex nonlinear models.
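The staged partition-and-train loop described above can be sketched in code. This is a minimal illustration, not the paper's implementation: it uses a Pegasos-style subgradient solver as a stand-in for a linear SVM trainer, a single model per stage serves both to partition and to classify (the paper trains a second SVM on the accepted subset), and the margin threshold and stage limit are illustrative parameters chosen here, not values from the paper.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    # Minimal primal linear SVM via hinge-loss subgradient descent
    # (a stand-in for a full SVM solver); labels y are in {-1, +1}.
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1                      # margin violators
        gw = lam * w - (y[mask, None] * X[mask]).sum(axis=0) / n
        gb = -y[mask].sum() / n
        w -= lr * gw
        b -= lr * gb
    return w, b

def multistage_svm(X, y, margin_thresh=0.5, max_stages=5):
    # Hypothetical multistage scheme: train a linear SVM, accept samples
    # whose absolute functional margin reaches margin_thresh, and forward
    # the rejected samples to the next stage.
    stages = []
    idx = np.arange(len(y))
    for _ in range(max_stages):
        if len(idx) == 0 or len(np.unique(y[idx])) < 2:
            break                               # nothing left to partition
        w, b = train_linear_svm(X[idx], y[idx])
        accepted = np.abs(X[idx] @ w + b) >= margin_thresh
        stages.append((w, b, margin_thresh))
        if accepted.all():
            break                               # no rejected subset remains
        idx = idx[~accepted]                    # recurse on rejected samples
    return stages

def predict(stages, x):
    # Route x through the stages; the first stage whose margin accepts x
    # classifies it, and the final stage classifies unconditionally.
    for i, (w, b, t) in enumerate(stages):
        s = float(x @ w + b)
        if abs(s) >= t or i == len(stages) - 1:
            return 1 if s >= 0.0 else -1
```

On easy, well-separated data the loop terminates after one stage; samples that fall inside the margin band of early stages are the ones handed to later, more specialized linear models.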


Keywords

Support Vector Machines, Multistage Structure, Partitioning of Dataset, Aggregation of Sub-Models

Date of this Version

March 2007