NLP Resources

Here are some links to resources about the core concepts of Natural Language Processing (NLP) that will help you get started with Haystack.

What is NLP?

Learn about what is possible when we apply computational power to language processing.

| Title | Type | Author | Description | Level |
| --- | --- | --- | --- | --- |
| Natural Language Processing (NLP) | Blog | IBM | High-level introduction to the tasks, tools, and use cases of NLP. | Beginner |
| Introduction to NLP | Video | Data Science Dojo | Covers many of the different tasks, from part-of-speech tagging to the creation of word embeddings. Contains some probabilistic notation. | Intermediate |
| Text Classification with NLP: Tf-Idf vs Word2Vec vs BERT | Blog with Code | Mauro Di Pietro | Hands-on, in-depth dive into text classification using TF-IDF, Word2Vec, and BERT. | Intermediate |
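
If you want a quick, hands-on feel for one of the classic NLP tasks mentioned above, part-of-speech tagging, here is a minimal sketch using spaCy. It assumes spaCy and its small English model `en_core_web_sm` are installed; the example sentence is just an illustration.

```python
# Minimal part-of-speech tagging sketch with spaCy.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Haystack lets you build search systems that work with large document collections.")

# Print each token together with its coarse part-of-speech tag.
for token in doc:
    print(f"{token.text:<12} {token.pos_}")
```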

Search and Question Answering

There are many different flavors of search. Learn the differences between them and understand how the task of question answering can improve the search experience.

| Title | Type | Author | Description | Level |
| --- | --- | --- | --- | --- |
| Question Answering at Scale With Haystack | Blog | Branden Chan (deepset) | High-level description of the Retriever-Reader pipeline that gives some intuition about how it works and how it can be deployed. | Beginner |
| Understanding Semantic Search | Blog | Branden Chan (deepset) | Disambiguates search jargon and explains the differences between various styles of search. | Beginner |
| Haystack: The State of Search in 2021 | Blog | Branden Chan (deepset) | Description of the Retriever-Reader pipeline and an introduction to some complementary tasks. | Beginner |
| Modern Question Answering Systems Explained | Blog | Branden Chan (deepset) | Illustrated deeper dive into the inner workings of the Reader model. | Beginner |
| How to Build an Open-Domain Question Answering System? | Blog | Lilian Weng | Comprehensive look into the inner workings of a question answering system. Contains a lot of mathematical notation. | Advanced |
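
To make the Retriever-Reader idea covered by these resources concrete, here is a minimal extractive question answering sketch. It assumes the Haystack 1.x API (`InMemoryDocumentStore`, `TfidfRetriever`, `FARMReader`, `ExtractiveQAPipeline`); the documents, query, and model name are placeholders for illustration, and the exact imports may differ between Haystack versions.

```python
# Minimal Retriever-Reader sketch, assuming the Haystack 1.x API.
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import TfidfRetriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

# Index a few toy documents (placeholders for a real corpus).
document_store = InMemoryDocumentStore()
document_store.write_documents([
    {"content": "Paris is the capital of France."},
    {"content": "Berlin is the capital of Germany."},
])

# The Retriever narrows the corpus down to a few candidate documents;
# the Reader then extracts an answer span from those candidates.
retriever = TfidfRetriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
pipeline = ExtractiveQAPipeline(reader, retriever)

result = pipeline.run(
    query="What is the capital of France?",
    params={"Retriever": {"top_k": 5}, "Reader": {"top_k": 1}},
)
print(result["answers"][0].answer)
```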

Text Vectorization and Embeddings

In NLP, text is often converted into a vector of numbers called an embedding. Learn how embeddings are generated and why they are useful.

| Title | Type | Author | Description | Level |
| --- | --- | --- | --- | --- |
| What Is Text Vectorization? Everything You Need to Know | Blog | Branden Chan (deepset) | High-level overview of text vectorization, starting from TF-IDF and going up to Transformers. | Beginner |
| Word Embeddings for NLP | Blog | Renu Khandelwal | Gives good intuition of what word embeddings are and how we use them. Contains some helpful illustrations. | Intermediate |
| Introduction to Word Embedding and Word2Vec | Blog | Dhruvil Karani | A deeper dive into the CBOW and Skip Gram versions of Word2Vec. | Advanced |
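
As a small illustration of the ideas in this section, the sketch below turns a few sentences into dense embedding vectors and compares them with cosine similarity. It assumes the `sentence-transformers` library and the `all-MiniLM-L6-v2` model; any other embedding model or library would work the same way in spirit.

```python
# Minimal embedding sketch, assuming the sentence-transformers library is installed.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "How do I reset my password?",
    "I forgot my login credentials.",
    "The weather is nice today.",
]

# Each sentence becomes a fixed-size dense vector (an embedding).
embeddings = model.encode(sentences)
print(embeddings.shape)  # e.g. (3, 384) for this model

# Semantically similar sentences end up close together in vector space.
print(util.cos_sim(embeddings[0], embeddings[1]))  # relatively high
print(util.cos_sim(embeddings[0], embeddings[2]))  # relatively low
```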

BERT and Transformers

The majority of the latest NLP systems use a machine learning architecture called the Transformer. BERT is one of the first models of this kind. Learn why these were so revolutionary and how they work.

| Title | Type | Author | Description | Level |
| --- | --- | --- | --- | --- |
| From Language Model to Haystack Reader | Documentation | deepset | High-level overview of how language models, Readers, and prediction heads are all related. | Beginner |
| Intuitive Explanation of BERT - Bidirectional Transformers for NLP | Blog | Renu Khandelwal | Touches upon many of the concepts that are essential to understanding how Transformers work. | Beginner |
| A dummy’s guide to BERT | Blog | Nicole Nair | A good high-level summary of the BERT paper. | Beginner |
| Learn About Transformers: A Recipe | Blog | Elvis Saravia | Links to many other resources that give explanations or implementations of the Transformer architecture. | Intermediate |
| The Illustrated Transformer | Blog | Jay Alammar | Excellent visualization of the inner workings of Transformer models. Gets quite deep into details. | Advanced |
| The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) | Blog | Jay Alammar | Excellent visualization of the inner workings of language models. Gets quite deep into details. | Advanced |
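
To see BERT's core pre-training objective, masked language modeling, in action, here is a minimal sketch using the Hugging Face `transformers` fill-mask pipeline with `bert-base-uncased`. The example sentence is only an illustration.

```python
# Minimal masked language modeling sketch, assuming the Hugging Face transformers library.
from transformers import pipeline

# BERT was pre-trained to predict tokens hidden behind a [MASK] placeholder.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Print the model's top guesses for the masked word, with their scores.
for prediction in unmasker("Haystack is a framework for building [MASK] systems."):
    print(f"{prediction['token_str']:<12} {prediction['score']:.3f}")
```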