This script shows how to interactively visualize words’ positions in a semantic space. To do so, it relies on the Python library Whatlies, part of the spaCy ecosystem.
Let’s start by importing ‘SpacyLanguage’, which allows us to load one of spaCy’s language models from within Whatlies. We also import ‘EmbeddingSet’ to create 2D plots whose dimensions are associated with specific word vectors (e.g., ‘fine’).
>>> from whatlies.language import SpacyLanguage
>>> from whatlies import EmbeddingSet
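Then, we load the language model of our choice, here spaCy’s large English model. The model must be installed beforehand (e.g., with python -m spacy download en_core_web_lg).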
>>> lang = SpacyLanguage("en_core_web_lg")
We aim to position animal words in an ad hoc semantic space whose axes are the vectors for ‘intelligent’ and ‘loyal’. Below are the set of animals (which closely resembles the animals of George Orwell’s Animal Farm) and the set of qualities (i.e., the axes). The list ‘items’, built in the next step, is the concatenation of these two lists.
>>> animals = ["cow", "dog", "duck", "horse", "pig", "rooster", "sheep"]
>>> qualities = ["intelligent", "loyal"]
Almost there. We concatenate the two lists into ‘items’ and get the embeddings for all of its words in one go using Whatlies’ ‘EmbeddingSet’.
>>> items = animals + qualities
>>> emb = EmbeddingSet(*[lang[item] for item in items])
Finally, we create an interactive scatter plot whose x- and y-axes are ‘intelligent’ and ‘loyal’, respectively.
>>> emb.plot_interactive(x_axis=emb["intelligent"], y_axis=emb["loyal"])
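To build some intuition for what the plot shows, here is a minimal sketch of how a word could be placed along a word-vector axis. It assumes that each coordinate is the word’s scalar projection onto the axis vector, normalized by the axis vector’s squared length; to our understanding this is how Whatlies treats an embedding passed as an axis, but take it as an assumption rather than a description of the library’s internals. The vectors, the ‘toy_vectors’ dictionary, and the helper functions are all made up for illustration; real en_core_web_lg vectors have 300 dimensions.

```python
# Toy 3-dimensional vectors, invented for illustration only; they are NOT
# taken from en_core_web_lg, whose vectors have 300 dimensions.
toy_vectors = {
    "dog": [0.9, 0.8, 0.1],
    "pig": [0.8, 0.2, 0.3],
    "sheep": [0.1, 0.4, 0.2],
    "intelligent": [1.0, 0.0, 0.0],
    "loyal": [0.0, 1.0, 0.0],
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def scalar_projection(vec, axis):
    """Size of `vec` along `axis`, measured in units of the axis vector.

    Assumption: this mirrors how an embedding used as a plot axis is
    turned into a coordinate; an axis word maps onto itself at 1.0.
    """
    return dot(vec, axis) / dot(axis, axis)


def coordinates(word, x_word, y_word, vectors):
    """(x, y) position of `word` relative to two axis words."""
    return (
        scalar_projection(vectors[word], vectors[x_word]),
        scalar_projection(vectors[word], vectors[y_word]),
    )


for animal in ["dog", "pig", "sheep"]:
    x, y = coordinates(animal, "intelligent", "loyal", toy_vectors)
    print(f"{animal}: x={x:.2f}, y={y:.2f}")
```

Under this assumption, the axis words themselves land at 1.0 on their own axis, which is why ‘intelligent’ and ‘loyal’ are included in ‘items’ and show up in the scatter plot alongside the animals.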
This snippet comes from the Python script “whatlies.py”, hosted in the GitHub repo simoneSantoni/NLP-orgs-markets.