Stability of machine learning algorithms
Abstract
In the literature, predictive accuracy is often the primary criterion for evaluating a learning algorithm. In this thesis, I introduce novel concepts of stability to the machine learning community. A learning algorithm is said to be stable if it produces consistent predictions under small perturbations of the training samples. Stability is an important aspect of a learning procedure because unstable predictions can reduce users' trust in the system and harm the reproducibility of scientific conclusions. As a prototypical example, the stability of classification procedures is discussed extensively. In particular, I present two new concepts of classification stability. The first is the decision boundary instability (DBI), which measures the variability of linear decision boundaries generated from homogeneous training samples. Combining DBI with the generalization error (GE), we propose a two-stage algorithm for selecting the most accurate and stable classifier. The proposed classifier selection method brings statistical inference thinking to the machine learning community. Our selection method is shown to be consistent in the sense that the optimal classifier simultaneously achieves the minimal GE and the minimal DBI. Various simulations and real examples further demonstrate the superiority of our method over several alternative approaches. The second concept is the classification instability (CIS). CIS is a general measure of stability and generalizes DBI to nonlinear classifiers. This allows us to establish a sharp convergence rate of CIS for general plug-in classifiers under a low-noise condition. As one of the simplest plug-in classifiers, the nearest neighbor classifier is studied extensively. Motivated by an asymptotic expansion formula for the CIS of the weighted nearest neighbor classifier, we propose a new classifier called the stabilized nearest neighbor (SNN) classifier.
Our theoretical developments further push the frontier of statistical theory in machine learning. In particular, we prove that SNN attains both the minimax optimal convergence rate in risk and the established sharp convergence rate in CIS. Extensive simulations and real-data experiments demonstrate that SNN achieves a considerable improvement in stability over existing classifiers with no sacrifice of predictive accuracy.
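The notion of stability described above can be illustrated with a small simulation. The abstract does not state the formal definition of CIS, so the sketch below rests on an assumption: instability is approximated by the rate at which two classifiers, trained on independent samples from the same distribution, disagree on common test points. The data model, the function names (`sample_data`, `knn_predict`, `estimate_instability`) and the use of an unweighted k-nearest-neighbor rule are hypothetical choices for illustration, not the thesis's SNN procedure.

```python
import numpy as np

def sample_data(n, rng):
    """Draw n points from a hypothetical two-class Gaussian mixture in 2-D."""
    y = rng.integers(0, 2, size=n)
    # Class 1 is shifted by 1.5 in both coordinates (an assumed toy model).
    x = rng.normal(size=(n, 2)) + 1.5 * y[:, None]
    return x, y

def knn_predict(x_train, y_train, x_test, k):
    """Unweighted k-nearest-neighbor majority vote (ties go to class 1)."""
    # Squared Euclidean distance between every test and training point.
    d2 = ((x_test[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest].mean(axis=1)
    return (votes >= 0.5).astype(int)

def estimate_instability(n_train, k, n_test=500, n_rep=20, seed=0):
    """Average disagreement between classifiers fit on independent samples."""
    rng = np.random.default_rng(seed)
    x_test, _ = sample_data(n_test, rng)
    disagreements = []
    for _ in range(n_rep):
        # Two independent training samples from the same distribution.
        x1, y1 = sample_data(n_train, rng)
        x2, y2 = sample_data(n_train, rng)
        p1 = knn_predict(x1, y1, x_test, k)
        p2 = knn_predict(x2, y2, x_test, k)
        disagreements.append(np.mean(p1 != p2))
    return float(np.mean(disagreements))

if __name__ == "__main__":
    for k in (1, 5, 25):
        print(k, estimate_instability(n_train=100, k=k))
```

Under this toy model, a larger k averages over more neighbors and would be expected to lower the disagreement rate. The trade-off against accuracy that the thesis analyzes is not captured by this sketch.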
Degree
Ph.D.
Advisors
Cheng, Purdue University.
Subject Area
Statistics|Computer science