Cyber Center Publications

Efficient k-Anonymization Using Clustering Techniques

Ji-Won Byun
Ashish Kamra
Elisa Bertino, Purdue University
Ninghui Li, Purdue University

Abstract

k-anonymization techniques have been the focus of intense research in the last few years. An important requirement for such techniques is to ensure anonymization of data while at the same time minimizing the information loss resulting from data modifications. In this paper we propose an approach that uses the idea of clustering to minimize information loss and thus ensure good data quality. The key observation here is that data records that are naturally similar to each other should be part of the same equivalence class. We thus formulate a specific clustering problem, referred to as the k-member clustering problem. We prove that this problem is NP-hard and present a greedy heuristic whose complexity is in O(n²). As part of our approach we develop a suitable metric to estimate the information loss introduced by generalizations, which works for both numeric and categorical data.
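
The abstract does not spell out the heuristic or the metric, so the following is only a minimal sketch of what a greedy k-member clustering could look like, not the authors' algorithm. It assumes numeric attributes only, a range-based loss (cluster size times the sum of each attribute's value range, normalized by that attribute's range over the whole table), and a seed rule that simply takes the first remaining record. The names `info_loss` and `greedy_k_member` are hypothetical.

```python
from typing import List, Sequence

Record = Sequence[float]


def info_loss(cluster: List[Record], ranges: Sequence[float]) -> float:
    """Assumed metric: cluster size times the sum of normalized attribute ranges."""
    if not cluster:
        return 0.0
    total = 0.0
    for j, domain_range in enumerate(ranges):
        values = [record[j] for record in cluster]
        total += (max(values) - min(values)) / domain_range
    return len(cluster) * total


def greedy_k_member(records: List[Record], k: int) -> List[List[Record]]:
    """Greedily partition records into clusters of at least k members each."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")

    dims = len(records[0])
    # Range of each attribute over the whole table; 1.0 avoids division by zero.
    ranges = [
        (max(r[j] for r in records) - min(r[j] for r in records)) or 1.0
        for j in range(dims)
    ]

    remaining = list(records)
    clusters: List[List[Record]] = []

    # Build clusters of exactly k records while enough records remain.
    while len(remaining) >= k:
        cluster = [remaining.pop(0)]  # assumed seed rule: first remaining record
        while len(cluster) < k:
            # Add the record that keeps the cluster's loss lowest.
            best = min(
                range(len(remaining)),
                key=lambda i: info_loss(cluster + [remaining[i]], ranges),
            )
            cluster.append(remaining.pop(best))
        clusters.append(cluster)

    # Fewer than k records are left: attach each to the cluster whose loss grows least.
    for record in remaining:
        target = min(
            clusters,
            key=lambda c: info_loss(c + [record], ranges) - info_loss(c, ranges),
        )
        target.append(record)

    return clusters
```

Every cluster returned has at least k members, so generalizing each cluster's records to a common range would yield equivalence classes of size k or more. Categorical attributes, which the abstract says the metric also covers, are left out of this sketch.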

Keywords

anonymization, data modifications, clustering, NP-hard

Date of this Version

2007

Comments

Advances in Databases: Concepts, Systems and Applications, Lecture Notes in Computer Science, 2007, Volume 4443/2007, 188-200
