Mapping documents onto the vector space

Context

Consider the ai_in_finance.json file, which contains more than 4,000 articles on AI and financial services from The Wall Street Journal and The Financial Times. For a description of the corpus, see Lanzolla, Gianvito, Simone Santoni, and Christopher Tucci. "Unlocking value from AI in financial services: strategic and organizational tradeoffs vs. media narratives." In Artificial Intelligence for Sustainable Value Creation. Edward Elgar Publishing, 2021.

Problem

Use the bag-of-words (BoW) or TF-IDF approach to transform each document (i.e., article) in the corpus into a vector. Then use a dimensionality reduction technique of your choice (see scikit-learn's capabilities) to visualize the positions of the individual documents in the vector space. You may also want to color-code the positions by a document-level attribute (e.g., year of publication).
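
One possible starting point is sketched below; it is not a reference solution. It pairs TF-IDF with truncated SVD, which is only one of several reasonable choices because it accepts sparse input directly. The structure of ai_in_finance.json is not described here, so the sketch runs on a made-up toy corpus: `texts`, `years`, `embed_documents`, and `plot_embedding` are hypothetical names, and the texts and years would need to be replaced with values read from the actual file.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer


def embed_documents(texts, n_components=2, random_state=0):
    """Return the TF-IDF matrix and low-dimensional coordinates for raw texts."""
    # Swap in CountVectorizer here for a plain bag-of-words representation.
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(texts)  # sparse, shape (n_docs, n_terms)
    # TruncatedSVD works on sparse matrices without densifying them.
    svd = TruncatedSVD(n_components=n_components, random_state=random_state)
    coords = svd.fit_transform(tfidf)  # dense, shape (n_docs, n_components)
    return tfidf, coords


def plot_embedding(coords, labels):
    """Scatter the documents, color-coded by a numeric document-level attribute."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    points = ax.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="viridis")
    fig.colorbar(points, ax=ax, label="year of publication")
    ax.set_xlabel("component 1")
    ax.set_ylabel("component 2")
    return fig


# Toy stand-in for the corpus (assumption: invented for illustration only).
texts = [
    "Banks adopt machine learning to detect payment fraud.",
    "Robo-advisers use algorithms to manage retail portfolios.",
    "Regulators question the transparency of credit scoring models.",
    "Hedge funds experiment with natural language processing on news.",
    "Insurers deploy chatbots to handle customer claims.",
    "Central banks study artificial intelligence risks in lending.",
]
years = [2016, 2017, 2018, 2019, 2020, 2021]

tfidf, coords = embed_documents(texts)
```

Calling `plot_embedding(coords, years)` then draws the scatter plot with one point per document. On the full corpus, a nonlinear method such as t-SNE applied to a moderately reduced SVD output may separate clusters more clearly, at the cost of longer run time.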