Privacy -preserving distributed data mining and processing on horizontally partitioned data

Murat Kantarcioglu, Purdue University


Data mining can extract important knowledge from large data collections, but sometimes these collections are split among various parties. Data warehousing, bringing data from multiple sources under a single authority, increases risk of privacy violations. Furthermore, privacy concerns may prevent the parties from directly sharing even some meta-data. Distributed data mining and processing provide a means to address this issue, particularly if queries are processed in a way that avoids the disclosure of any information beyond the final result. This thesis presents methods to mine horizontally partitioned data without violating privacy and shows how to use the data mining results in a privacy-preserving way. The methods incorporate cryptographic techniques to minimize the information shared, while adding as little as possible overhead to the mining and processing task.




Clifton, Purdue University.

Subject Area

Computer science

Off-Campus Purdue Users:
To access this dissertation, please log in to our
proxy server