A Machine Learning Based Web Service for Malicious URL Detection in a Browser

Hafiz Muhammad Junaid Khan, Purdue University

Abstract

Malicious URLs pose serious cybersecurity threats to Internet users. It is critical to detect malicious URLs so that they can be blocked before users access them. In the past few years, several techniques have been proposed to differentiate malicious URLs from benign ones with the help of machine learning. Machine learning algorithms learn trends and patterns in a dataset and use them to identify anomalies. In this work, we attempt to find generic features for detecting malicious URLs by analyzing two publicly available malicious URL datasets. To this end, we identify a list of substantial features that can be used to classify all types of malicious URLs. Then, we select the most significant lexical features using Chi-Square and ANOVA based statistical tests. The effectiveness of these feature sets is then tested using a combination of single and ensemble machine learning algorithms. We build a machine learning based real-time malicious URL detection system as a web service to detect malicious URLs in a browser. We implement a Chrome extension that intercepts a browser’s URL requests and sends them to the web service for analysis. We also implement the web service, which classifies a URL as benign or malicious using the saved ML model. Finally, we evaluate the performance of the web service to test whether it is scalable.
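
To illustrate what "lexical features" of a URL can look like, the following is a minimal sketch using only the Python standard library. The function name `lexical_features` and the particular features chosen are assumptions made for illustration. They are not the feature set identified in the thesis, and the statistical selection step (Chi-Square, ANOVA) is not shown.

```python
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string.

    The feature list is hypothetical; it only demonstrates the idea of
    deriving numeric features from the URL text itself, without fetching
    the page or querying any external service.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None for URLs without a netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parsed.path),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
    }


if __name__ == "__main__":
    for example in ("https://example.com/a-b/1", "http://user@192.168.0.1/login"):
        print(example, lexical_features(example))
```

Feature dictionaries of this kind could then be assembled into a matrix and passed to a statistical feature selector and a classifier. Which features are actually significant would have to be determined on labeled data.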

Degree

M.Sc.

Advisors

Devabhaktuni, Purdue University.

Subject Area

Criminology|Web Studies

