Efficient storage of semantic web data
Abstract
With the adoption of RDF (Resource Description Framework), OWL (Web Ontology Language) and SPARQL (SPARQL Protocol and RDF Query Language) as standards for the semantic web, it has become essential to examine data warehousing systems dedicated to RDF data (World Wide Web Consortium). Traditional data warehouses have focused on relational databases and are optimized for relational data. Working with RDF data, however, means exploiting its triple structure, in which every statement is a subject, a predicate and an object. As the size of the database increases, the time required to evaluate queries on it increases as well (Rohloff & Dean, et al., 2007). Users not only need access to information as quickly as possible, but the information presented to them must also be relevant to their search (Spink & Wolfram, et al., 2000). In this project, the author examined different storage techniques for RDF data and attempted to strike a balance between the access time for information retrieval and parameters such as the storage space needed for the data and the complexity of the queries. BigOWLIM and Pellet, which are built around the open source frameworks Sesame and Jena respectively, were used for this study. The work is of significance mainly to small and medium enterprises, because only small datasets of about a million triples were considered.
Degree
M.S.
Advisors
Brewer, Purdue University.
Subject Area
Computer Engineering | Computer Science