bdbms - A Database Management System for Biological Data

Mohamed Eltabakh
Mourad Ouzzani
Walid G. Aref, Purdue University

Original manuscript.

Abstract

Biologists are increasingly using databases for storing and managing their data. Biological databases typically consist of a mixture of raw data, metadata, sequences, annotations, and related data obtained from various sources. Current database technology lacks several functionalities that are needed by biological databases. In this paper, we introduce bdbms, an extensible prototype database management system for supporting biological data. bdbms extends the functionalities of current DBMSs with: (1) Annotation and provenance management, including storage, indexing, manipulation, and querying of annotation and provenance as first-class objects in bdbms, (2) Local dependency tracking to track the dependencies and derivations among data items, (3) Update authorization to support data curation via content-based authorization, in contrast to identity-based authorization, and (4) New access methods and their supporting operators that support pattern matching on various types of compressed biological data. This paper presents the design of bdbms along with the techniques proposed to support these functionalities, including an extension to SQL. We also outline some open issues in building bdbms.