Building block

Synopsis

Tags

This block highlights the origins of NLP, some key NLP frameworks, and what we can do with them. A series of examples and Python scripts show how to manipulate text with NLTK, the Swiss army knife of NLP, and spaCy, a top-class library for pre-processing and analyzing text corpora at scale.

Computational linguistics

NLP frameworks

spaCy

Text pipelines

NLTK

Text, meanings, and maths

This block focuses on two analytical and computational strategies for modeling the meaning contained in a corpus of text: i) human-annotated dictionaries and ii) word vectors. A series of examples and Python scripts show how to leverage human-annotated dictionaries and learn word vectors from text corpora about organizations and markets.

Computational linguistics

Distributional hypothesis

Connotations

Word vectors

Human-annotated dictionaries

NLTK
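
As a rough illustration of the dictionary-based strategy described in this block, the sketch below scores a sentence against a word list. The dictionary, the category names, and the helper functions (`tokenize`, `dictionary_scores`) are all hypothetical and written only for this example; an actual analysis would rely on a published, human-annotated dictionary and a proper tokenizer such as the one in NLTK.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary, invented for illustration only.
TOY_DICTIONARY = {
    "positive": {"growth", "profit", "innovative", "strong"},
    "negative": {"loss", "risk", "decline", "weak"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def dictionary_scores(text, dictionary=TOY_DICTIONARY):
    """Return, per category, the share of tokens found in that category's word list."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        # Avoid dividing by zero on empty input.
        return {category: 0.0 for category in dictionary}
    return {
        category: sum(counts[word] for word in words) / total
        for category, words in dictionary.items()
    }


sample = "Strong growth offset the decline in legacy products."
scores = dictionary_scores(sample)
print(scores)
```

Normalizing by the number of tokens keeps scores comparable across documents of different lengths, which is one common choice among several.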

Vector semantics & embeddings

This block focuses on embeddings, a framework that relies on machine learning and deep learning to learn word vectors. A series of examples and Python scripts show how to harness word vectors for the analysis of organizations and markets.

word2vec

Language model

GloVe

fastText

BERT

Gensim

NumPy

SciPy

sent2vec

doc2vec

Foundational models
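
One common way to put word vectors to work is to compare them with cosine similarity. The sketch below assumes a handful of hand-written, three-dimensional vectors (`TOY_VECTORS`) that are purely hypothetical; vectors learned with word2vec, GloVe, or fastText would normally have hundreds of dimensions and would be loaded through a library such as Gensim.

```python
import math

# Hypothetical 3-dimensional vectors, hand-written for illustration only.
TOY_VECTORS = {
    "bank": [0.9, 0.1, 0.3],
    "lender": [0.8, 0.2, 0.4],
    "startup": [0.1, 0.9, 0.5],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Similarity is undefined for a zero vector; return 0.0 by convention here.
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(word, vectors=TOY_VECTORS):
    """Rank all other words by cosine similarity to `word`, highest first."""
    target = vectors[word]
    ranked = [
        (other, cosine_similarity(target, vector))
        for other, vector in vectors.items()
        if other != word
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


neighbours = most_similar("bank")
print(neighbours)
```

With these toy values, "lender" ranks above "startup" as a neighbour of "bank"; with learned vectors, the ranking reflects the contexts in which the words appear in the training corpus.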

Topic modeling

Revealing the hidden themes in a corpus of text is the subject of this block. We’ll see how to design and evaluate a topic model and how to post-process its output. A series of examples and Python scripts show how to deploy topic modeling to analyze text corpora comprising corporate filings, financial analyst reports, or product reviews.

LDA

Tomotopy

Gensim

Unsupervised learning

Text classification

This block addresses text classification, the task behind sentiment analysis and many other NLP applications. A series of examples and Python scripts illustrate how to implement different classifiers, from the Naive Bayes classifier to deep-learning-powered classifiers. Special attention is devoted to product review data.

Sentiment analysis

Affect lexicons

NBC

Custom affect lexicons

PyTorch

NLP frameworks

Semi-supervised learning

Word vectors

Supervised learning
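
To make the Naive Bayes classifier mentioned in this block more concrete, here is a minimal from-scratch sketch of a multinomial Naive Bayes model with add-one (Laplace) smoothing. The class name, the tokenizer, and the four training reviews are hypothetical and invented for this example; this is not the course's own implementation, and a real project would more likely use an established library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with additive smoothing (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing constant
        self.class_counts = Counter()           # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocabulary.add(token)
        return self

    def log_scores(self, text):
        """Unnormalized log posterior of each class for the given text."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        tokens = [t for t in tokenize(text) if t in self.vocabulary]  # skip unseen words
        scores = {}
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical product reviews, invented for illustration only.
train_texts = [
    "great phone, love the battery",
    "excellent screen and great sound",
    "terrible battery, broke in a week",
    "awful sound and poor screen",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayesClassifier().fit(train_texts, train_labels)
print(model.predict("great battery and excellent sound"))
print(model.predict("awful, terrible phone"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing term keeps a word that never occurred in a class from driving that class's probability to zero.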

Information extraction [bonus track]

Here, the focus is on the tasks that fall within the remit of information extraction. Examples include named entity recognition and the identification of events, times, and relations among entities. A series of Python scripts illustrate how to extract ‘structured’ information from a variety of text corpora comprising data on organizations and markets.

Named entity recognition

Prodigy

flair

spaCy

Supervised learning