Date of Award

8-2018

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Mathematics

Committee Chair

Hanxiang Peng

Committee Member 1

Benzion Boukai

Committee Member 2

Guang Lin

Committee Member 3

Zhongmin Shen

Committee Member 4

Fei Tan

Abstract

U-statistics have been widely studied and used in both statistics and machine learning. One challenge in applying U-statistics is their heavy computational cost. In this thesis, we propose subsampling methods for computing U-statistics quickly. Our contribution is fourfold: (1) we formally adapt uniform subsampling to the fast computation of U-statistics; (2) we propose an A-optimal subsampling method, which outperforms uniform subsampling in terms of mean squared error (MSE); (3) we provide a method to approximate the A-optimal subsampling probabilities, since computing them exactly takes as long as computing the full-sample U-statistic; (4) we derive the limiting distribution of the subsampling estimator. We then run simulations and use two real datasets to assess the performance of the uniform and A-optimal subsampling methods. Our simulation and real-data results show that the MSE of the A-optimal subsampling estimator is significantly smaller than that of the uniform subsampling estimator. Moreover, the A-optimal subsampling estimator takes much less computing time than the full-sample U-statistic, provided the subsample size is not too large relative to the full sample size.
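To make the idea of uniform subsampling concrete, here is a minimal Python sketch for a U-statistic of degree 2. It is an illustration under stated assumptions, not the thesis's implementation: the function names, the choice of the variance kernel h(x, y) = (x - y)^2 / 2, the sample size, and the subsample size r are all hypothetical. The A-optimal probabilities are not given in this abstract, so the sketch covers only the uniform case.

```python
import itertools
import random


def full_u_statistic(data, kernel):
    """Average the kernel over all unordered pairs: n(n-1)/2 kernel evaluations."""
    total = 0.0
    count = 0
    for x, y in itertools.combinations(data, 2):
        total += kernel(x, y)
        count += 1
    return total / count


def uniform_subsampled_u_statistic(data, kernel, r, rng):
    """Average the kernel over r pairs drawn uniformly, with replacement across draws."""
    n = len(data)
    total = 0.0
    for _ in range(r):
        # Two distinct indices; for a symmetric kernel this is uniform over pairs.
        i, j = rng.sample(range(n), 2)
        total += kernel(data[i], data[j])
    return total / r


def variance_kernel(x, y):
    """Symmetric kernel whose U-statistic is the unbiased sample variance."""
    return 0.5 * (x - y) ** 2


# Illustrative data (assumption): 400 standard normal draws.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(400)]

full = full_u_statistic(data, variance_kernel)  # 79,800 kernel evaluations
approx = uniform_subsampled_u_statistic(data, variance_kernel, 5000, rng)  # 5,000 evaluations
print(f"full: {full:.4f}  subsampled: {approx:.4f}")
```

The subsampled estimator is unbiased for the full-sample U-statistic given the data, and its cost grows with r rather than with the number of pairs. A non-uniform scheme such as the A-optimal one described above would replace the uniform draw with unequal pair probabilities and reweight each kernel value accordingly.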
