Managing uncertainty in constantly-evolving environments
Abstract
In systems that monitor continuously changing entities, such as temperature values and the locations of moving objects, data are obtained from sensors and streamed continuously to the database. Due to limited bandwidth and battery power, it is infeasible for the system to keep track of the actual values of the entities at all times. Queries that use the values stored in the database can therefore produce incorrect answers. In this dissertation, we model the uncertainty inherent in dynamic sensor data. Based on the uncertainty model, we propose probabilistic queries, which evaluate uncertain data and produce answers with probabilistic guarantees. We describe a query classification scheme and, for each class, present query evaluation algorithms and answer quality metrics. We also study the semantics of probabilistic comparison operators that return imprecise comparison results. We illustrate how uncertainty management techniques for sensor data can be extended to location data in mobile databases. Although probabilistic queries are more informative than traditional queries because of the probability values accompanying their answers, they are also more expensive to compute. We therefore investigate the efficiency of probabilistic query evaluation. In particular, we propose I/O-efficient and computationally efficient algorithms to enhance the performance of nearest-neighbor queries, range queries, and joins. Experimental evaluations show that these algorithms can perform significantly better than algorithms that do not consider uncertainty. We demonstrate how the proposed ideas are realized in a practical database system. Finally, we outline future directions for this research and explore the possibility of applying probabilistic queries to new problems, such as sensor selection and location privacy.
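As a rough illustration of what a probabilistic range query might look like, the sketch below assumes each sensor value is known only to lie in a closed interval and is uniformly distributed within it. The uniform distribution, the class name `UncertainValue`, and the functions `range_probability` and `probabilistic_range_query` are illustrative assumptions, not the model or algorithms defined in the dissertation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainValue:
    """A sensor value known only to lie in [low, high].

    Assumption for this sketch: the true value is uniformly
    distributed over the interval.
    """
    sensor_id: str
    low: float
    high: float


def range_probability(item, q_low, q_high):
    """Probability that the true value of `item` lies in [q_low, q_high]."""
    width = item.high - item.low
    if width <= 0:
        # Degenerate interval: the value is known exactly.
        return 1.0 if q_low <= item.low <= q_high else 0.0
    # Length of the intersection of the two intervals (negative if disjoint).
    overlap = min(item.high, q_high) - max(item.low, q_low)
    return max(0.0, overlap) / width


def probabilistic_range_query(items, q_low, q_high, threshold=0.0):
    """Return (sensor_id, probability) pairs whose probability exceeds threshold."""
    scored = [(it.sensor_id, range_probability(it, q_low, q_high)) for it in items]
    return [(sid, p) for sid, p in scored if p > threshold]


readings = [
    UncertainValue("s1", 18.0, 22.0),
    UncertainValue("s2", 25.0, 29.0),
    UncertainValue("s3", 20.0, 21.0),
    UncertainValue("s4", 30.0, 32.0),
]
answer = probabilistic_range_query(readings, 20.0, 26.0)
```

Instead of a yes/no membership test on a possibly outdated stored value, each answer carries the probability that the sensor's true value satisfies the query. Here `s4` is dropped because its interval does not intersect the query range.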
Degree
Ph.D.
Advisors
Prabhakar, Purdue University.
Subject Area
Computer science