Tagging Stream Data for Rich Real-Time

Rimma Nehme
Elke Rundensteiner
Elisa Bertino, Purdue University

Original Manuscript

Abstract

In recent years, data streams have become ubiquitous as technology improves and the prices of portable devices fall, for example in sensor networks and location-based services. Most data streams transmit only data tuples, over which continuous queries are evaluated. In this paper, we propose to enrich data streams with a new type of metadata called streaming tags, or tick-tags for short. The fundamental premise of tagging is that users can label data using an uncontrolled vocabulary, and that these tags can be exploited in a wide variety of applications, such as data exploration, data search, and the production of query results that are “enriched” with additional semantics and thus more informative. We focus primarily on the problem of continuous query processing with streaming tags and tagged objects, and address the semantic issues of tick-tags as well as efficiency concerns. Our main contributions are as follows. First, we specify a general and flexible Stream Tag Framework (STF for short) that supports a stream-centric approach to tagging, in which tick-tags attached to streaming objects are treated as first-class citizens. Second, under STF, users can query tags explicitly, as well as implicitly by having the tags of the base data output together with the query results. Finally, we have implemented STF in a prototype Data Stream Management System, and through a set of performance experiments we show that the cost of stream tagging is small and that the approach scales to a large percentage of tagged objects.