Date of Award

12-2017

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

Committee Chair

Jennifer Neville

Committee Co-Chair

Yuan Qi

Committee Member 1

Ninghui Li

Committee Member 2

Michael Gribskov

Abstract

Real-world data often contain hidden relationships, such as interactions between modes in multidimensional arrays (or tensors), subsets of features correlated with specific responses, and associations between heterogeneous data sources. Uncovering these relationships is a key problem in machine learning and data mining, with applications ranging from information security to imaging genetics and computational advertising. However, mining these relationships poses several significant challenges. First, how can we design powerful models to capture the complicated, potentially highly nonlinear patterns in data? Second, how can we develop efficient model estimation algorithms to deal with real-world large data volumes, say, millions of features and billions of tensor elements? In this dissertation, we aim to address these challenges using Bayesian learning techniques. Compared with other types of methodologies, Bayesian learning has a unique advantage: it provides a highly principled, interpretable mathematical framework for data modeling and reasoning under uncertainty. We use two families of Bayesian approaches, namely Bayesian nonparametrics and sparse learning, to uncover the fundamental relationships hidden in data: the interactive relationships between multiple entities within tensors, where each mode represents a particular type of entity, e.g., a three-mode (user, movie, music) tensor, and the correlated relationships between features and responses in high-dimensional and multiview data.
