Database support for uncertain data

Sarvjeet Singh, Purdue University

Abstract

In recent years, the field of uncertainty management in databases has received considerable interest due to the presence of numerous applications that handle probabilistic data. In this dissertation, we identify and solve important issues for managing uncertain data natively at the database level. We propose the semantics of join operation in the presence of attribute uncertainty and present various pruning techniques to significantly improve the join performance. Two index structures for indexing categorical uncertain data are also presented. For optimization of probabilistic queries, we discuss novel selectivity estimation techniques. We also introduce a new model for handling arbitrary pdf (both discrete and continuous) attributes natively at the database level. This model is consistent with Possible Worlds Semantics and is closed under the fundamental relation operations of selection, projection and join. We also present and discuss the implementation of Orion – a relational database with native support for uncertain data. Orion is developed as an extension of the open source relational database, PostgreSQL. The experiments performed in Orion show the effectiveness and efficiency of our approach.

Degree

Ph.D.

Advisors

Prabhakar, Purdue University.

Subject Area

Computer science

Off-Campus Purdue Users:
To access this dissertation, please log in to our
proxy server
.

Share

COinS