Research Website
http://www.purdue.edu/discoverypark/vaccine/
Keywords
Social Media Data Analysis, Topic Extraction using NMF, Information Search and Retrieval
Presentation Type
Event
Research Abstract
With the fast growth of social media services, vast amounts of user-generated content with time and location stamps are produced every day. A considerable amount of these data is publicly available online, and some of it collectively conveys information of interest to data analysts. Social media data are dynamic and unstructured by nature, which makes it very hard for analysts to retrieve useful information efficiently and effectively. The Social Media Analytics Reporting Toolkit (SMART), a system developed at the Purdue VACCINE lab, aims to support such analysis. The current framework collects real-time Twitter messages and visualizes volume densities on a map. It uses Latent Dirichlet Allocation (LDA) to extract regional topics and can optionally apply Seasonal-Trend decomposition using Loess (STL) to detect abnormal events. While Twitter has a fair number of active users, they account for only a small portion of all active social media users, and data generated by many other social media services are not currently used by SMART. Therefore, my work focused on expanding the data sources of the SMART system by creating means to collect data from other services such as Facebook and Instagram. During a test run that searched on a collection of 88 specified keywords, over two million Facebook posts were collected in one week. In addition, the current SMART framework uses only one topic model, LDA, which is considered to be slower than the Non-negative Matrix Factorization (NMF) model, so I also worked on integrating the NMF algorithm into the system. The improved SMART system can be used for a variety of analysis tasks, such as monitoring regional social media responses from different sources during disaster events and detecting user-reported crimes. SMART is an ongoing and promising project that can be further improved by integrating new features.
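To illustrate the kind of topic extraction NMF performs, here is a minimal sketch in plain Python. It is not SMART's implementation: the toy posts, the helper names (`transpose`, `matmul`, `nmf`, `frobenius_error`, `top_terms`), and the choice of the standard multiplicative update rule for the squared-error objective are all assumptions made for illustration. The sketch factors a small post-by-term count matrix V into non-negative factors W (post-to-topic weights) and H (topic-to-term weights), then lists the highest-weighted terms of each topic.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative V (posts x terms) into W (posts x rank) and
    H (rank x terms) using multiplicative updates (squared-error objective)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start so the updates never divide 0 by 0.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        WtV = matmul(transpose(W), V)
        WtWH = matmul(matmul(transpose(W), W), H)
        H = [[H[k][j] * WtV[k][j] / (WtWH[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        VHt = matmul(V, transpose(H))
        WHHt = matmul(W, matmul(H, transpose(H)))
        W = [[W[i][k] * VHt[i][k] / (WHHt[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H

def frobenius_error(V, W, H):
    """Sum of squared differences between V and its reconstruction W H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, WH) for v, r in zip(vr, rr))

def top_terms(H, vocab, k=3):
    """For each topic (row of H), return the k terms with the largest weight."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:k]]
            for row in H]

# Hypothetical toy posts standing in for collected social media messages.
posts = [
    "flood water street closed",
    "flood warning river water",
    "concert tickets stage music",
    "music festival concert tonight",
]
vocab = sorted({word for post in posts for word in post.split()})
V = [[post.split().count(word) for word in vocab] for post in posts]

W, H = nmf(V, rank=2)
for topic_id, terms in enumerate(top_terms(H, vocab)):
    print(topic_id, terms)
```

In practice a library implementation working on sparse, TF-IDF-weighted matrices would be the more realistic choice; the pure-Python loops here are only meant to make the factorization step visible.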
Session Track
Data Analytics
Recommended Citation
Yuchen Cui, Junghoon Chae, and David Ebert,
"Social Media Analytics Reporting Toolkit"
(August 7, 2014).
The Summer Undergraduate Research Fellowship (SURF) Symposium.
Paper 100.
https://docs.lib.purdue.edu/surf/2014/presentations/100