Comparison of clustered RDF data stores

Venkata N. Ramarekha Patchigolla, Purdue University

Abstract

Storing data in RDF format simplifies data interchange among researchers compared with present approaches. There has been a tremendous increase in the number of applications that use RDF data. RDF data sets tend to grow explosively, which makes it necessary to consider retrieval time and scalability when selecting an RDF data store for application development. This research compares the BigOWLIM, Bigdata, 4store and Virtuoso RDF stores on the basis of their scalability and their performance in storing and retrieving cancer proteomics and mass spectrometry data using SPARQL queries. The author compares the RDF data stores on a single machine as a baseline and extends the 4store and BigOWLIM data stores to a cluster for comparison. The author finds that Virtuoso has the best performance on data consisting of fewer than 250,000 triples, whereas 4store has better scalability and performance on larger data.

Degree

M.S.

Advisors

Springer, Purdue University.

Subject Area

Computer Engineering; Computer Science

