Knowledge modeling of phishing emails

Courtney Falk, Purdue University


This dissertation investigates whether malicious phishing emails are detected better when a meaningful representation of the email bodies is available. The natural language processing theory of Ontological Semantics Technology is used for its ability to model the knowledge represented in the email messages. Known-good and phishing emails were analyzed, and their meaning representations were fed into machine learning binary classifiers. Unigram language models of the same emails served as a baseline against which the performance of the meaning-based data was compared. The results show that a binary classifier trained on meaning representations detects phishing emails better than a binary classifier trained on a unigram language model, at least for some of the selected machine learning algorithms.
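
The unigram baseline described above can be illustrated with a minimal sketch. The abstract does not name the machine learning algorithms that were used, so the multinomial naive Bayes classifier below is only a stand-in, and the four training emails, the class name `UnigramNaiveBayes`, and the `tokenize` helper are all invented for illustration. The Ontological Semantics Technology features are not shown, since producing them requires the OST system itself.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into unigram tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9']+", text.lower())


class UnigramNaiveBayes:
    """Multinomial naive Bayes over unigram counts with add-one smoothing.

    An illustrative stand-in for a unigram baseline classifier, not the
    dissertation's actual implementation.
    """

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training emails
        self.vocab = set()           # all tokens seen during training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data, used only to exercise the sketch.
train_texts = [
    "verify your account password now",
    "urgent account suspended click link",
    "meeting agenda attached for monday",
    "lunch on friday with the team",
]
train_labels = ["phishing", "phishing", "legitimate", "legitimate"]

model = UnigramNaiveBayes().fit(train_texts, train_labels)
print(model.predict("click to verify your password"))
print(model.predict("agenda for the team meeting"))
```

A meaning-based classifier of the kind the abstract describes would replace the token counts with features derived from the meaning representations while keeping the same binary classification setup, which is what makes the two approaches directly comparable.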




Taylor, Purdue University.

Subject Area

Information Technology|Information science|Computer science
