This block highlights the origins of NLP, some key NLP frameworks, and what we can do with them. A series of examples and Python scripts show how to manipulate text with NLTK (the Swiss army knife of NLP) and spaCy (a top-class library for pre-processing and analyzing text corpora at scale).
Computational linguistics
This block covers the analytical and computational strategies used to model the meaning contained in a corpus of text: i) human-annotated dictionaries, and ii) word vectors. A series of examples and Python scripts show how to leverage human-annotated dictionaries and learn word vectors from text corpora on organizations and markets.
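As a rough illustration of the dictionary-based strategy, the sketch below counts how often words from each category of a dictionary appear in a text and reports them as a share of all tokens. The dictionary `TOY_DICTIONARY`, the helper names, and the sample sentence are hypothetical and chosen only for illustration; an actual analysis would rely on a published, human-annotated lexicon and a proper tokenizer.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary (an assumption for illustration only);
# a real study would load a published, human-annotated lexicon.
TOY_DICTIONARY = {
    "positive": {"growth", "profit", "strong"},
    "negative": {"loss", "decline", "weak"},
}


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def dictionary_shares(text, dictionary=TOY_DICTIONARY):
    """Return, per category, the share of tokens that belong to that category."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {category: 0.0 for category in dictionary}
    return {
        category: sum(counts[word] for word in words) / total
        for category, words in dictionary.items()
    }


sample = "Strong growth in profit offset a decline in margins"
shares = dictionary_shares(sample)
print(shares)
```

Normalizing by the number of tokens keeps scores comparable across documents of different lengths, which matters when the corpus mixes short and long texts.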
Human-annotated dictionaries
This block focuses on embeddings, a framework that relies on ML/DL to learn word vectors. A series of examples and Python scripts show how to harness word vectors for the analysis of organizations and markets.
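Once word vectors are available, a common way to use them is to compare words by cosine similarity. The sketch below shows that comparison only; it does not learn embeddings. The three-dimensional vectors in `TOY_VECTORS` are made up for illustration and are not the output of any trained model.

```python
import math

# Made-up 3-dimensional vectors (an assumption for illustration only);
# real embeddings are learned from a corpus and have many more dimensions.
TOY_VECTORS = {
    "bank": [0.9, 0.1, 0.0],
    "lender": [0.8, 0.2, 0.1],
    "retailer": [0.1, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(word, vectors=TOY_VECTORS):
    """Return the other word whose vector is closest to `word` by cosine."""
    return max(
        (other for other in vectors if other != word),
        key=lambda other: cosine(vectors[word], vectors[other]),
    )


print(most_similar("bank"))
```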
Revealing the hidden themes in a corpus of text is the subject of this block. We'll see how to design and evaluate a topic model and how to post-process its output. A series of examples and Python scripts show how to deploy topic modeling to analyze text corpora comprising corporate filings, financial analyst reports, or product reviews.
This block addresses the problem of text classification, the task behind sentiment analysis and many other NLP applications. A series of examples and Python scripts illustrate how to implement different classifiers, from the Naive Bayes classifier to deep-learning-powered classifiers. Special attention is devoted to product review data.
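To make the simplest of these classifiers concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written with the standard library only. The class name `NaiveBayes`, the tokenizer, and the four toy reviews are assumptions made for illustration; in practice one would typically use an established library implementation and a much larger labeled corpus.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative sketch)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.word_counts = defaultdict(Counter)      # word counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_label in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_label / n_docs)       # log prior
            for word in tokenize(text):
                if word not in self.vocab:
                    continue                         # ignore unseen words
                smoothed = (self.word_counts[label][word] + 1) / (
                    total_words + len(self.vocab)
                )
                score += math.log(smoothed)          # log likelihood
            scores[label] = score
        return max(scores, key=scores.get)


# Toy product reviews, invented for illustration only.
reviews = [
    "great product, works well",
    "love it, great value",
    "terrible quality, broke fast",
    "awful, waste of money",
]
sentiments = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(reviews, sentiments)
print(model.predict("great value"))
print(model.predict("awful quality"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing term keeps a single unseen word-class pair from driving a class probability to zero.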
Here, the focus is on various tasks that fall within the remit of information extraction. Examples include named entity recognition and the identification of events, times, and relations among entities. A series of Python scripts illustrate how to extract "structured" information from a variety of text corpora comprising data on organizations and markets.