Cyber Center Publications

Efficient privacy-aware record integration

Mehmet Kuzu, University of Texas at Dallas, Richardson, TX, USA
Murat Kantarcioglu, University of Texas at Dallas, Richardson, TX, USA
Ali Inan, Isik University, Istanbul, Turkey
Elisa Bertino, Purdue University, USA
Elizabeth Durham, Vanderbilt University, Nashville, TN, USA
Bradley Malin, Vanderbilt University, Nashville, TN, USA

Abstract

The integration of information dispersed among multiple repositories is a crucial step for accurate data analysis in various domains. In support of this goal, it is critical to devise procedures for identifying similar records across distinct data sources. At the same time, to adhere to privacy regulations and policies, such procedures should protect the confidentiality of the individuals to whom the information corresponds. Various private record linkage (PRL) protocols have been proposed to achieve this goal, involving secure multi-party computation (SMC) and similarity-preserving data transformation techniques. SMC methods provide secure and accurate solutions to the PRL problem, but are prohibitively expensive in practice, mainly due to excessive computational requirements. Data transformation techniques offer more practical solutions, but incur the cost of information leakage and false matches.

In this paper, we introduce a novel model for practical PRL, which 1) affords controlled and limited information leakage and 2) avoids false matches resulting from data transformation. Initially, we partition the data sources into blocks to eliminate comparisons for records that are unlikely to match. Then, to identify matches, we apply an efficient SMC technique between the candidate record pairs. To enable efficiency and privacy, our model leaks a controlled amount of obfuscated data prior to the secure computations. The applied obfuscation relies on differential privacy, which provides strong privacy guarantees against adversaries with arbitrary background knowledge. In addition, we illustrate the practical nature of our approach through an empirical analysis with data derived from public voter records.
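
The block-then-compare idea can be illustrated with a minimal sketch. This is not the paper's protocol: the blocking rule (first letter of a surname), the record layout, and all function names are assumptions made for illustration, and the secure comparison step is not implemented here. The sketch only shows how blocking cuts the number of candidate pairs and how per-block counts might be released with Laplace noise, which is one standard way to obtain differential privacy for counting queries.

```python
import random
import string
from collections import defaultdict


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def block_key(record):
    # Hypothetical blocking rule: first letter of the surname.
    return record["surname"][0].upper()


def build_blocks(records):
    """Partition records into disjoint blocks by their blocking key."""
    blocks = defaultdict(list)
    for record in records:
        blocks[block_key(record)].append(record)
    return dict(blocks)


def noisy_block_counts(blocks, epsilon, rng, domain=string.ascii_uppercase):
    """Release a noisy count for every key in a fixed domain.

    Adding or removing one record changes exactly one count by 1, so
    Laplace noise with scale 1/epsilon is the usual calibration.
    """
    return {
        key: len(blocks.get(key, [])) + laplace_noise(1.0 / epsilon, rng)
        for key in domain
    }


def candidate_pairs(blocks_a, blocks_b):
    """Yield only the record pairs that share a block.

    In a real deployment each pair would go to a secure comparison;
    that step is deliberately left out of this sketch.
    """
    for key in sorted(blocks_a.keys() & blocks_b.keys()):
        for a in blocks_a[key]:
            for b in blocks_b[key]:
                yield a, b


source_a = [{"surname": "Smith"}, {"surname": "Jones"}]
source_b = [{"surname": "Smyth"}, {"surname": "Brown"}]

blocks_a = build_blocks(source_a)
blocks_b = build_blocks(source_b)
pairs = list(candidate_pairs(blocks_a, blocks_b))
counts_a = noisy_block_counts(blocks_a, epsilon=1.0, rng=random.Random(0))
```

With these toy inputs, blocking leaves a single candidate pair (Smith, Smyth) instead of the four pairs a full cross-comparison would need. A smaller `epsilon` adds more noise to the released counts, trading utility for stronger privacy.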

Keywords

differential privacy, experimentation, information filtering, performance, privacy, record linkage, security, integrity, and protection, statistical databases

Date of this Version

2013

DOI

10.1145/2452376.2452398

Comments

Published in: EDBT '13 Proceedings of the 16th International Conference on Extending Database Technology, pages 167-178
