Date of Award
Spring 2015
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer and Information Technology
First Advisor
Marcus Rogers
Committee Chair
Marcus Rogers
Committee Member 1
John Springer
Committee Member 2
Eric Matson
Abstract
Internet Relay Chat (IRC) was one of the first real-time communication protocols on the internet. It was not designed with any authentication, authorization, and accounting features, which made IRC channels a place to conduct transactions in complete anonymity. Meanwhile, with the advent of Big Data, large quantities of data can now be processed in a very short time. This research presents a method that uses Apache Solr, a text indexing server built on top of Lucene, to index and search large quantities of IRC data collected over months from public IRC channels. It also presents a highly scalable approach to monitoring public IRC channels by creating IRC Client Bots, which are in turn controlled by a robust IRC Parent Bot. The collected data is analyzed using both Apache Solr and MS SQL Server, and the response times are compared. The research concludes that Apache Solr outperforms MS SQL Server by a wide margin and that such an implementation can be used by digital forensic investigators to monitor and search public IRC channels.
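As an illustration of the indexing step described in the abstract, the following is a minimal sketch of how a raw IRC channel message might be turned into a JSON document for Solr. It is not the thesis's implementation: the field names (`nick`, `user`, `host`, `channel`, `text`, `timestamp`, `id`), the helper names, and the sample message are all assumptions made for this example, and the actual schema used in the research is not stated here.

```python
import json
import re

# RFC 1459 style channel message: ":nick!user@host PRIVMSG #channel :text"
PRIVMSG_RE = re.compile(
    r"^:(?P<nick>[^!\s]+)!(?P<user>[^@\s]+)@(?P<host>\S+) "
    r"PRIVMSG (?P<channel>#\S+) :(?P<text>.*)$"
)


def irc_line_to_doc(line, timestamp, doc_id):
    """Turn one raw IRC line into a Solr-style document dict, or None.

    Field names are hypothetical; they are not taken from the thesis.
    """
    match = PRIVMSG_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None  # not a channel message (e.g. PING, JOIN)
    doc = match.groupdict()
    doc["id"] = doc_id
    doc["timestamp"] = timestamp
    return doc


def build_update_payload(docs):
    """Serialize documents as a JSON array, skipping lines that did not parse."""
    return json.dumps([d for d in docs if d is not None])


# Sample input (invented for illustration).
raw = ":alice!al@example.org PRIVMSG #test :hello world\r\n"
doc = irc_line_to_doc(raw, "2015-01-01T00:00:00Z", "msg-1")
skipped = irc_line_to_doc("PING :server", "2015-01-01T00:00:01Z", "msg-2")
payload = build_update_payload([doc, skipped])
```

A payload of this shape could then be sent with an HTTP POST (Content-Type `application/json`) to a Solr core's `/update` handler; the host, port, and core name would depend on the deployment and are not specified in this record.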
Recommended Citation
Boreddy, Nikhil Reddy, "IRC channel data analysis using Apache Solr" (2015). Open Access Theses. 551.
https://docs.lib.purdue.edu/open_access_theses/551