Nonparametric Perspective of Deep Learning

Tianyang Hu, Purdue University

Abstract

Models built with deep neural networks (DNNs) can handle complicated real-world data extremely well, seemingly without suffering from the curse of dimensionality or from non-convex optimization. To contribute to the theoretical understanding of deep learning, this work studies the nonparametric perspective of DNNs by considering the following questions: (1) What is the underlying estimation problem, and what are the most appropriate data assumptions? (2) What is the corresponding optimal convergence rate, and does the curse of dimensionality occur? (3) Is the optimal rate achievable by DNN estimators, and is there any optimization guarantee? These questions are investigated in two of the most fundamental problems: regression and classification. Specifically, statistical optimality of DNN estimators is established under various settings, with a special focus on the curse of dimensionality and on optimization guarantees. For the classic binary classification problem, statistically optimal convergence rates that suffer less from the curse of dimensionality are established under two settings: (1) Under the smooth boundary assumption [1], I show that DNN classifiers with proper architectures can benefit from the compositional smoothness structure [2] underlying high-dimensional data, in the sense that the optimal convergence rates depend only on some effective dimension d*, potentially much smaller than the data dimension d. (2) Under a novel teacher-student framework that assumes the Bayes classifier can be expressed as a ReLU neural network, I obtain a dimension-free rate of convergence O(n^{-2/3}) for DNN classifiers, which is also proven to be optimal.
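To make the teacher-student setting concrete, the following is a minimal simulation sketch (not the dissertation's actual construction): a small "teacher" ReLU network defines the Bayes classifier, labels are generated from it, and a "student" ReLU network is fit by logistic loss. The network widths, depths, sample sizes, and optimizer settings are illustrative assumptions, and PyTorch is used only for convenience.

```python
# Illustrative teacher-student classification sketch; all sizes and training
# choices below are assumptions for demonstration, not the thesis's setup.
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 8                      # input dimension (illustrative)
n_train, n_test = 2000, 5000

def relu_net(widths):
    """Fully connected ReLU network; widths = [in, hidden..., 1]."""
    layers = []
    for w_in, w_out in zip(widths[:-1], widths[1:]):
        layers += [nn.Linear(w_in, w_out), nn.ReLU()]
    return nn.Sequential(*layers[:-1])  # drop final ReLU -> real-valued score

# Teacher network: its sign defines the Bayes classifier on X ~ Uniform[-1,1]^d.
teacher = relu_net([d, 16, 16, 1])
for p in teacher.parameters():
    p.requires_grad_(False)

def sample(n):
    X = 2 * torch.rand(n, d) - 1
    y = (teacher(X).squeeze(-1) > 0).float()   # labels given by the teacher's sign
    return X, y

X_tr, y_tr = sample(n_train)
X_te, y_te = sample(n_test)

# Student ReLU network trained with the logistic (cross-entropy) loss.
student = relu_net([d, 32, 32, 1])
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for epoch in range(500):
    opt.zero_grad()
    loss = loss_fn(student(X_tr).squeeze(-1), y_tr)
    loss.backward()
    opt.step()

# Misclassification error of the student relative to the teacher-defined labels.
with torch.no_grad():
    err = ((student(X_te).squeeze(-1) > 0).float() != y_te).float().mean()
print(f"student test misclassification error: {err:.3f}")
```

Repeating such a simulation over a range of training sizes n would let one trace how the excess misclassification error decays with n, which is the quantity the O(n^{-2/3}) rate describes.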

Degree

Ph.D.

Advisors

Guang Cheng, Purdue University.

Subject Area

Artificial intelligence
