Date of Award
12-2017
Degree Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Science
Committee Chair
Jennifer Neville
Committee Co-Chair
Yuan Qi
Committee Member 1
Ninghui Li
Committee Member 2
Michael Gribskov
Abstract
Real-world data often encompass hidden relationships, such as interactions between modes in multidimensional arrays (or tensors), subsets of features correlated with specific responses, and associations between heterogeneous data sources. Uncovering these relationships is a key problem in machine learning and data mining, with applications ranging from information security to imaging genetics and computational advertising. However, mining these relationships poses several significant challenges. First, how can we design powerful models to capture the complicated, potentially highly nonlinear patterns in data? Second, how can we develop efficient model estimation algorithms to deal with real-world large data volumes, say, millions of features and billions of tensor elements? In this dissertation, we aim to address these challenges using Bayesian learning techniques. Compared with other types of methodologies, Bayesian learning has a unique advantage: it provides a highly principled, interpretable mathematical framework for data modeling and reasoning under uncertainty. We use two families of Bayesian approaches, namely Bayesian nonparametrics and sparse learning, to uncover two kinds of fundamental relationships hidden in data: the interactive relationships between multiple entities within tensors, where each mode represents a particular type of entity, e.g., a three-mode (user, movie, music) tensor, and the correlated relationships between features and responses in high-dimensional and multiview data.
Recommended Citation
Zhe, Shandian, "Scalable Bayesian Nonparametrics and Sparse Learning for Hidden Relationship Discovery" (2017). Open Access Dissertations. 1676.
https://docs.lib.purdue.edu/open_access_dissertations/1676