An event detection approach based on Twitter hashtags

Shih-Feng Yang, Purdue University


Twitter is one of the most popular microblogging services in the world. The large volume of information it carries has made Twitter an important channel through which people learn about and share news. Hashtags are a popular Twitter feature: they can be treated as human-labeled information and help readers identify the topic of a tweet. Many researchers have proposed event-detection approaches that monitor Twitter data and determine whether special events, such as accidents, extreme weather, earthquakes, or crimes, are happening. Although many of these approaches use hashtags as one of their features, few explicitly examine how effective hashtags are for event detection. In this study, we proposed an event-detection approach that utilizes the hashtags in tweets. We adopted the feature extraction used in STREAMCUBE (Feng et al., 2015) and applied K-means clustering (Lloyd, 1982) to the extracted features. The experiments were conducted on 20,514 tweets containing 8,616 hashtags, collected between November 13, 2015 and November 17, 2015 on the general topic of the Paris attacks. A randomly sampled subset of 200 tweets was also manually labeled by a human subject to verify the approach. Based on the collected tweets, we demonstrated that the K-means approach could produce better clustering results than STREAMCUBE. We also discussed how to set the value of K so that the K-means approach achieves better clustering performance.
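
The general idea of clustering hashtags with K-means can be illustrated with a minimal sketch. It is not the implementation used in this study: the bag-of-words representation of each hashtag is only an assumed stand-in for the STREAMCUBE feature extraction, and the function names (hashtag_vectors, kmeans), the sample tweets, and the choice of K = 2 are hypothetical. The K-means step follows Lloyd's iteration, seeded with the first K points so that the result is reproducible.

```python
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")
WORD = re.compile(r"[a-z']+")


def hashtag_vectors(tweets):
    """Represent each hashtag by the counts of words in the tweets that carry it.

    Assumption: a plain bag of words stands in for the real feature extraction.
    """
    bags = defaultdict(Counter)
    for text in tweets:
        lower = text.lower()
        tags = set(HASHTAG.findall(lower))
        # Drop the hashtags themselves so only the surrounding words are counted.
        words = WORD.findall(HASHTAG.sub(" ", lower))
        for tag in tags:
            bags[tag].update(words)
    vocab = sorted({w for bag in bags.values() for w in bag})
    names = sorted(bags)
    vectors = [[float(bags[t][w]) for w in vocab] for t in names]
    return names, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical sample data, not taken from the study's data set.
tweets = [
    "Explosions reported near the stadium #paris",
    "Explosions near stadium tonight #parisattacks",
    "Heavy snow closes the highway #snow",
    "Snow closes highway again #blizzard",
]
names, vectors = hashtag_vectors(tweets)
labels = kmeans(vectors, k=2)
for name, label in zip(names, labels):
    print(f"#{name}: cluster {label}")
```

In this toy example the two attack-related hashtags share a cluster and the two weather-related hashtags share the other. With real data the outcome depends strongly on K and on the initial centroids, which is why the choice of K receives separate attention in the study.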




Julia Taylor, Purdue University

Subject Area

Computer science
