Practical Differential Privacy for High-dimensional and Graph Data

Wei-Yen Day, Purdue University

Abstract

Differential privacy has emerged as the de facto standard notion of privacy. It is widely adopted in various domains, including data publishing, data mining, and interactive database queries. However, applying differential privacy to complex data remains challenging because of the large increase in sensitivity. In this dissertation, we address three major topics on publishing information from high-dimensional and graph data under differential privacy. The first topic examines the possibility of publishing column counts from high-dimensional data under differential privacy, using a proposed technique called sensitivity control. The idea is to limit the contribution of each data record so that the sensitivity is bounded. We address the challenge of balancing the sensitivity level against the remaining data utility. The second topic addresses the problem of high-dimensional data classification under differential privacy. We propose PrivWalk, a greedy walking algorithm that iteratively searches for the optimal model and automatically determines the number of steps given a privacy budget. In the third topic, we advance techniques for publishing the degree distribution of a graph under node-differential privacy. We develop a projection technique that preserves the most utility while limiting the sensitivity. Based on this projection method, we propose two approaches for publishing degree histograms. Experiments on all three topics demonstrate that our proposed techniques significantly improve on the existing state of the art, making differential privacy on high-dimensional and graph data practical.
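
As a rough illustration of the sensitivity-control idea in the first topic, the sketch below caps each record's contribution before noisy column counts are released. It is not the dissertation's algorithm: the random truncation rule, the use of the Laplace mechanism, and the names `truncate_record`, `laplace_noise`, `noisy_column_counts`, `k`, and `epsilon` are all assumptions made for illustration. Each record is assumed to be a set of column indices where it has a 1, and neighboring datasets are assumed to differ by adding or removing one record.

```python
import random


def truncate_record(record, k, rng):
    """Keep at most k of the record's nonzero columns (hypothetical rule:
    a uniformly random subset; other truncation rules are possible)."""
    if len(record) <= k:
        return set(record)
    return set(rng.sample(sorted(record), k))


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two independent
    exponential variables that each have mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_column_counts(records, num_columns, k, epsilon, seed=None):
    """Release column counts after capping each record at k columns.

    After truncation, one record changes the count vector by at most k in
    L1 norm, so Laplace noise with scale k / epsilon is added per column.
    """
    rng = random.Random(seed)
    counts = [0] * num_columns
    for record in records:
        for col in truncate_record(record, k, rng):
            counts[col] += 1
    scale = k / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]
```

A smaller `k` lowers the noise scale but discards more of each record, which is the balance between sensitivity and remaining utility that the abstract describes.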

Degree

Ph.D.

Advisors

Li, Purdue University.

Subject Area

Computer science
