Digital provenance - models, systems, and applications
Abstract
Data provenance refers to the history of creation and manipulation of a data object and is being widely used in various application domains including scientific experiments, grid computing, file and storage system, streaming data etc. However, existing provenance systems operate at a single layer of abstraction (workflow/process/OS) at which they record and store provenance whereas the provenance captured from different layers provide the highest benefit when integrated through a unified provenance framework. To build such a framework, a comprehensive provenance model able to represent the provenance of data objects with various semantics and granularity is the first step. In this thesis, we propose a such a comprehensive provenance model and present an abstract schema of the model. We further explore the secure provenance solutions for distributed systems, namely streaming data, wireless sensor networks (WSNs) and virtualized environments. We design a customizable file provenance system with an application to the provenance infrastructure for virtualized environments. The system supports automatic collection and management of file provenance metadata, characterized by our provenance model. Based on the proposed provenance framework, we devise a mechanism for detecting data exfiltration attack in a file system. We then move to the direction of secure provenance communication in streaming environment and propose two secure provenance schemes focusing on WSNs. The basic provenance scheme is extended in order to detect packet dropping adversaries on the data flow path over a period of time. We also consider the issue of attack recovery and present an extensive incident response and prevention system specifically designed for WSNs.
Degree
Ph.D.
Advisors
Bertino, Purdue University.
Subject Area
Computer Engineering|Computer science
Off-Campus Purdue Users:
To access this dissertation, please log in to our
proxy server.