Context
Consider the hotel_review.csv file, which contains 20,491 hotel reviews from Tripadvisor. Each row in the file is a review: the first column contains the review text, and the second column contains the rating. Using topic modeling, you aim to produce features for training a classifier that distinguishes ‘bad rating’ reviews (e.g., 1 or 2 stars) from ‘good rating’ reviews (e.g., 4 or 5 stars).
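As a starting point, the binary target described above could be derived from the star ratings roughly as follows. This is a minimal sketch under stated assumptions, not a prescribed solution: the function names are hypothetical, the file is assumed to have a header row and UTF-8 encoding, and 3-star reviews are assumed to be dropped because the task only names 1-2 stars and 4-5 stars as examples.

```python
import csv
from typing import Iterable, List, Optional, Tuple


def rating_to_label(rating: int) -> Optional[int]:
    """Map a star rating to 0 (bad), 1 (good), or None (left out)."""
    if rating <= 2:
        return 0
    if rating >= 4:
        return 1
    return None  # assumption: 3-star reviews are excluded from the task


def build_dataset(rows: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[int]]:
    """Turn (review text, rating) pairs into parallel lists of texts and labels."""
    texts: List[str] = []
    labels: List[int] = []
    for text, rating in rows:
        label = rating_to_label(int(rating))
        if label is not None:
            texts.append(text)
            labels.append(label)
    return texts, labels


def load_reviews(path: str = "hotel_review.csv", has_header: bool = True):
    """Read the CSV: first column is the review text, second is the rating."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        if has_header:  # assumption: the first row holds column names
            next(reader)
        return build_dataset((row[0], row[1]) for row in reader)
```

Calling `texts, labels = load_reviews()` would then give the inputs for the topic models; if the file has no header row, pass `has_header=False`.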
Problem
Explore competing topic models, i.e., models retaining different numbers of topics.
Select the most appropriate topic model (i.e., the best number of topics) given the nature of the task, which consists of creating the features to train a classifier.
Motivate your choice of the best topic model.
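One way to approach the three steps above is to treat the number of topics as a hyperparameter and compare candidates by the cross-validated performance of the downstream classifier. The sketch below assumes scikit-learn, LDA as the topic model, logistic regression as the classifier, and macro-averaged F1 as the metric; none of these choices is fixed by the task, and the function names and candidate topic counts are hypothetical.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline


def score_topic_counts(texts, labels, topic_counts=(5, 10, 20, 40), cv=5, min_df=5):
    """Return {number of topics: mean cross-validated macro F1}."""
    scores = {}
    for k in topic_counts:
        # The vectorizer and the topic model sit inside the pipeline, so they
        # are refit on each training fold and never see the held-out reviews.
        pipeline = make_pipeline(
            CountVectorizer(stop_words="english", min_df=min_df),
            LatentDirichletAllocation(n_components=k, random_state=0),
            LogisticRegression(max_iter=1000),
        )
        fold_scores = cross_val_score(
            pipeline, texts, labels, cv=cv, scoring="f1_macro"
        )
        scores[k] = float(fold_scores.mean())
    return scores


def best_topic_count(scores):
    """Pick the number of topics with the highest mean score."""
    return max(scores, key=scores.get)
```

Given `texts` and `labels` (review strings and 0/1 targets), `scores = score_topic_counts(texts, labels)` followed by `best_topic_count(scores)` yields a candidate answer, and the score table itself can support the motivation. When several topic counts score almost equally, preferring the smaller model is a reasonable tie-breaker, since it gives fewer and more interpretable features.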