Achieving practical differential privacy
Abstract
Since the introduction of differential privacy to the field of privacy-preserving data analysis, many privacy-preserving algorithms that interact with sensitive data have been developed. In this dissertation, we consider practical issues that arise when applying algorithms from the literature. We start by examining the impact of the privacy parameter ε on the identifiability of individuals in the data. We demonstrate that, for apparently reasonable values of ε, an adversary can sometimes identify individuals with high confidence even when every data access is differentially private. This does not mean that the guarantee of differential privacy is weak; rather, it shows the difficulty of choosing a proper value of ε. To bridge the gap between the theoretical guarantee of differential privacy and individual identifiability, we provide an alternative parameterization, called ρ-differential identifiability, which ensures that the probability of identifying any individual is limited to ρ. We then turn to private pattern mining, where the main challenge is guiding the algorithm through a huge search space. We introduce a variant of the sparse vector technique and demonstrate how it can be used to find frequent itemsets. Finally, we study the feasibility of using the classical random projection technique to achieve differential privacy. We show that the distribution of the inner product of two randomly projected vectors already contains enough noise to satisfy differential privacy.
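To make the sparse vector idea above concrete, the following is a minimal Python sketch of the textbook AboveThreshold variant, not the dissertation's own algorithm; the function name, the `max_answers` parameter, and the noise scales follow the standard presentation and are assumptions rather than details taken from the abstract.

```python
import numpy as np

def above_threshold(queries, data, threshold, epsilon, max_answers=1):
    """Sketch of the textbook sparse vector technique (AboveThreshold).

    queries: callables mapping the dataset to a count (sensitivity 1
    assumed). Returns indices of queries whose noisy answer exceeds a
    noisy threshold, halting after max_answers positive reports.
    """
    rng = np.random.default_rng()
    # Split the budget: Laplace noise on the threshold and on each query,
    # scaled so the whole run is epsilon-differentially private.
    noisy_threshold = threshold + rng.laplace(scale=2.0 * max_answers / epsilon)
    positives = []
    for i, q in enumerate(queries):
        if q(data) + rng.laplace(scale=4.0 * max_answers / epsilon) >= noisy_threshold:
            positives.append(i)
            if len(positives) >= max_answers:
                break  # budget for positive answers is exhausted
    return positives
```

For frequent itemset search, each query would count the transactions containing a candidate itemset, and the threshold would be the minimum support.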
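The random projection result can be illustrated in the same spirit. This sketch shows only the classical Johnson-Lindenstrauss estimate of an inner product; the dimensions, seed, and variable names are illustrative assumptions, and calibrating the projection's intrinsic noise to a formal ε guarantee is the dissertation's contribution, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 1000, 50
# Two binary records (e.g. users' itemset indicator vectors).
x = rng.integers(0, 2, size=n).astype(float)
y = rng.integers(0, 2, size=n).astype(float)

# One shared Gaussian projection matrix with entries N(0, 1/k),
# so that E[(Px) . (Py)] = x . y.
P = rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, n))

# The projected inner product is an unbiased but noisy estimate of
# the exact one; the abstract's claim is that this projection noise
# itself can suffice for differential privacy.
print("exact:", x @ y, "projected:", (P @ x) @ (P @ y))
```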
Degree
Ph.D.
Advisors
Clifton, Purdue University.
Subject Area
Information science; Computer science