Efficient tree pattern queries on encrypted XML documents

Abstract

Outsourcing XML documents is a challenging task, because it encrypts the documents, while still requiring efficient query processing. Past approaches on this topic either leak structural information or fail to support searching that has constraints on XML node content. In addition, they adopt a filtering-and-refining framework, which requires the users to prune false positives from the query results. To address these problems, we present a solution for efficient evaluation of tree pattern queries (TPQs) on encrypted XML documents. We create a domain hierarchy, such that each XML document can be embedded in it. By assigning each node in the hierarchy a position, we create for each document a vector, which encodes both the structural and textual information about the document. Similarly, a vector is created also for a TPQ. Then, the matching between a TPQ and a document is reduced to calculating the distance between their vectors. For the sake of privacy, such vectors are encrypted before being outsourced. To improve the matching efficiency, we use a k-d tree to partition the vectors into non-overlapping subsets, such that non-matchable documents are pruned as early as possible. The extensive evaluation shows that our solution is efficient and scalable to large dataset.

Keywords

filtering and refining framework, tree pattern queries, TPQs, privacy

Date of this Version

2013

DOI

10.1145/2457317.2457338

Comments

Published in:
· Proceeding
EDBT '13 Proceedings of the Joint EDBT/ICDT 2013 Workshops
Pages 111-120

Share

COinS