Hidden themes in Tripadvisor reviews
Context
Consider the hotel_review.csv file, which contains 20,491 hotel reviews from Tripadvisor. Each row is one review: the first column holds the review text and the second column holds the rating. Using topic modeling, you aim to produce the features for training a classifier that distinguishes ‘bad rating’ reviews (e.g., 1 or 2 stars) from ‘good rating’ reviews (e.g., 4 or 5 stars).
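The binary target described above could be derived from the rating column roughly as follows. This is a minimal sketch, not a prescribed solution: the function names are hypothetical, the file is assumed to have a header row, and dropping 3-star reviews is an assumption (the brief only gives 1-2 and 4-5 stars as examples).

```python
import csv


def rating_to_label(rating, bad_max=2, good_min=4):
    """Map a star rating to 0 (bad), 1 (good), or None (neutral, to be dropped)."""
    if rating <= bad_max:
        return 0
    if rating >= good_min:
        return 1
    return None  # assumption: 3-star reviews belong to neither class


def load_labeled_reviews(rows):
    """Turn (text, rating) pairs into parallel lists of texts and binary labels."""
    texts, labels = [], []
    for text, rating in rows:
        label = rating_to_label(int(rating))
        if label is not None:
            texts.append(text)
            labels.append(label)
    return texts, labels


def read_rows(path="hotel_review.csv"):
    """Yield (text, rating) pairs, assuming text in column 1 and rating in column 2."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader)  # assumption: the first row is a header
        for row in reader:
            yield row[0], row[1]
```

With the file in the working directory, `texts, labels = load_labeled_reviews(read_rows())` would give the inputs for the topic-modeling step.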
Problem
Use Gensim, Gensim with Mallet, or Tomotopy to:
explore competing topic models, i.e., models that retain different numbers of topics
select the most appropriate topic model (i.e., the best number of topics) given the nature of the task, which is to create the features for training a classifier
motivate your choice of the best topic model