Extract Restaurant Aspect Words by Using Word2vec Model from Yelp Reviews

Penghao Wang, Purdue University

Abstract

Many customers rely on Yelp restaurant reviews to decide where to eat. However, the sheer number of reviews makes it almost impossible for users to read them all. Customers also differ in what they care about, such as ambiance, service, and food quality. Surfacing reviews that focus on the aspects a customer cares about would save time and help them find better restaurants. This thesis investigates how to select such aspects, which first requires detecting which aspects a review mentions. The author used a Word2vec model to detect aspect words, tested its performance on manually labeled data, and compared the results to a statistical topic model. The author presents three tasks the Word2vec model can perform: (1) detecting food categories, (2) detecting informal words, and (3) detecting typos and abbreviations. The Word2vec model yielded better performance than the statistical topic model on the aspect extraction task.
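The abstract does not spell out how the embeddings are turned into aspect words. One common approach, sketched here purely as an illustration and not as the thesis's actual procedure, is to start from a seed word and collect every vocabulary word whose vector is close to it by cosine similarity. The vectors below are made-up toy values standing in for trained Word2vec embeddings, and the names `TOY_VECTORS`, `expand_aspect`, and the `threshold` value are hypothetical.

```python
import math

# Toy 3-dimensional vectors standing in for trained Word2vec embeddings.
# These values are invented for illustration; real embeddings would be
# learned from the review corpus and have far more dimensions.
TOY_VECTORS = {
    "pizza":    [0.90, 0.10, 0.00],
    "pasta":    [0.80, 0.20, 0.10],
    "burger":   [0.85, 0.15, 0.05],
    "waiter":   [0.10, 0.90, 0.10],
    "server":   [0.15, 0.85, 0.10],
    "waitor":   [0.12, 0.88, 0.12],  # a typo that lands near "waiter"
    "decor":    [0.05, 0.10, 0.95],
    "ambiance": [0.10, 0.05, 0.90],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_aspect(seed, vectors, threshold=0.95):
    """Return, in alphabetical order, the words whose vectors are at
    least `threshold` cosine-similar to the seed word's vector."""
    seed_vec = vectors[seed]
    return sorted(
        word
        for word, vec in vectors.items()
        if word != seed and cosine(seed_vec, vec) >= threshold
    )


if __name__ == "__main__":
    print(expand_aspect("pizza", TOY_VECTORS))   # food-like neighbours
    print(expand_aspect("waiter", TOY_VECTORS))  # service-like neighbours
```

In this toy setup the misspelling "waitor" is grouped with "waiter" because its vector is nearly identical, which is the kind of behavior the typo-detection task in the abstract depends on. With real data the threshold would need tuning against labeled examples.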

Degree

M.S.

Advisors

Rayz, Purdue University

Subject Area

Information Technology|Computer science
