Abstract

With the adoption of RDF (Resource Description Framework), OWL (Web Ontology Language), and SPARQL (SPARQL Protocol and RDF Query Language) as standards for the semantic web, it has become essential to examine data warehousing systems dedicated to RDF data (World Wide Web Consortium). Traditional data warehouses have focused on relational databases and have been optimized for relational data. Working with RDF data, however, involves exploiting the triple nature of the data, in which every statement is a subject-predicate-object triple. As the size of the database increases, the time required to evaluate queries on it increases as well (Rohloff & Dean, et al., 2007). Users, however, need not only fast access to information but also results that are relevant to their search (Spink & Wolfram, et al., 2000). In this project, the author examined different storage techniques for RDF data and attempted to strike a balance between the access time for information retrieval and parameters such as the storage space needed for the data and the complexity of the queries. BigOWLIM and Pellet, which are built around the open source frameworks Sesame and Jena respectively, were used for this study. The work is of significance mainly to small and medium enterprises, since small datasets of about a million triples were considered.

Date of this Version

7-12-2010

Department

Computer and Information Technology

Department Head

Gary Bertoline

Month of Graduation

August

Year of Graduation

2010

Degree

Master of Science

Head of Graduate Program

Gary Bertoline

Advisor 1 or Chair of Committee

Jeffrey Brewer

Advisor 2

James Mohler

Advisor 3

John Springer
