Efficient learning algorithms for Gaussian processes

Feng Yan, Purdue University

Abstract

Data in many scientific and engineering applications are structured and contain multiple aspects. Predictive modeling of such data poses two significant challenges: how to model nonlinear relationships between different entities in the data, and how to perform scalable inference for the resulting predictive models. To address these challenges, we use Gaussian processes to provide data modeling solutions and develop efficient inference algorithms by exploiting the structures within the data. First, we present a sparse Gaussian process regression method, GPLasso, which uses KL divergence minimization and ℓ1 penalization to explicitly represent the tradeoff between accuracy and sparsity. GPLasso outperforms state-of-the-art methods in both predictive performance and speed. Second, we propose network models based on matrix-variate Gaussian processes, which generalize the popular bilinear models to nonlinear factorization models in infinite feature spaces. Experiments on both synthetic and real-world networks demonstrate that our models have superior link prediction and community discovery performance. Last, we propose to generalize the matrix-variate Gaussian processes to tensors and apply them to nonlinear tensor factorization. Our nonlinear tensor factorization models, called InfTucker, can be regarded as the classic Tucker decomposition in infinite feature spaces. In our experiments on chemometrics and social network datasets, our new models achieve significantly higher prediction accuracy than state-of-the-art tensor decomposition approaches.
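
For readers unfamiliar with the classic Tucker decomposition that the abstract uses as its point of reference, the following is a minimal sketch of the linear three-way case only: a small core tensor is multiplied along each mode by a factor matrix. It is not the InfTucker model or the author's code. The function name `tucker_reconstruct`, the tensor sizes, and the random inputs are all assumptions chosen for illustration.

```python
import numpy as np


def tucker_reconstruct(core, factors):
    """Rebuild a 3-way tensor from a Tucker core and three factor matrices.

    Illustrative helper (hypothetical name), linear Tucker model only:
    X[i, j, k] = sum_{a, b, c} core[a, b, c] * U1[i, a] * U2[j, b] * U3[k, c]
    """
    U1, U2, U3 = factors
    return np.einsum("abc,ia,jb,kc->ijk", core, U1, U2, U3)


# Assumed toy sizes: a 2x3x4 core expanded to a 5x6x7 tensor.
rng = np.random.default_rng(0)
core = rng.standard_normal((2, 3, 4))
factors = [
    rng.standard_normal((5, 2)),
    rng.standard_normal((6, 3)),
    rng.standard_normal((7, 4)),
]
X = tucker_reconstruct(core, factors)
print(X.shape)
```

Each factor matrix here acts linearly on one mode of the core. As the abstract describes it, the nonlinear models replace this linear construction with one defined in infinite feature spaces through tensor-variate Gaussian processes, which this sketch does not attempt to reproduce.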

Degree

Ph.D.

Advisors

Qi, Purdue University.

Subject Area

Computer science

