Date of Award

8-2018

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer and Information Technology

Committee Chair

Baijian Yang

Committee Member 1

Tonglin Zhang

Committee Member 2

Dmitri A. Gusev

Abstract

Ridge regression is a technique for handling highly correlated data in regression analysis. Like other traditional techniques, it has a common limitation when the data set is larger than the available memory or takes up most of it: the analysis cannot complete because of a memory error, either while loading the data into memory or during the calculation step. Sampling the data or extending the memory capacity are two possible ways to avoid the problem. However, sampling may introduce unknown bias when the population is enormous, and extending the hardware is costly.

The new method proposed by Zhang and Yang (2017b) addresses both problems: memory that cannot support computation on big data sets, and the cost of hardware. The new method needs to read the whole data set only once, processing it in separate parts. Unlike the traditional method, it does not need to read the entire data set repeatedly. This study aims to show that the new method provides a fast way to use ridge regression for analysis and yields an exact result without approximation. Three experiments examine (i) whether the new method provides the result sooner than other methods, (ii) whether the new method can handle bigger data sets than other methods can, and (iii) whether the result from the new method has better predictive accuracy than that of other methods.
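The abstract does not spell out the algorithm, so the following is only an illustrative sketch of one well-known way to obtain exact ridge coefficients from a single pass over the data: accumulate the cross-products X'X and X'y block by block, then solve the penalized normal equations once. It is not claimed to be the Zhang and Yang (2017b) procedure. The function name `ridge_one_pass`, the penalty argument `lam`, and the omission of an intercept are all assumptions made for illustration.

```python
import numpy as np


def ridge_one_pass(chunks, lam):
    """Exact ridge coefficients from one pass over row blocks (illustrative sketch).

    chunks: iterable of (X_block, y_block) pairs; each X_block has shape (n_i, p).
    lam:    ridge penalty (hypothetical parameter name). No intercept is fitted.
    """
    xtx = None  # running sum of X_block' X_block, shape (p, p)
    xty = None  # running sum of X_block' y_block, shape (p,)
    for X, y in chunks:
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        if xtx is None:
            p = X.shape[1]
            xtx = np.zeros((p, p))
            xty = np.zeros(p)
        # Only the current block and two small accumulators are held in memory.
        xtx += X.T @ X
        xty += X.T @ y
    if xtx is None:
        raise ValueError("no data blocks were supplied")
    p = xtx.shape[0]
    # Solve (X'X + lam * I) beta = X'y once, after the single pass.
    return np.linalg.solve(xtx + lam * np.eye(p), xty)
```

Because the accumulated sums equal the full-data cross-products, the result matches the in-memory ridge solution up to floating-point rounding, while memory use depends on the number of predictors rather than the number of rows.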

Recommended Citation

Chiang, Wan-Chih, "The approach to ridge regression for big data: An examination" (2018). Open Access Theses. 1519.
https://docs.lib.purdue.edu/open_access_theses/1519
