Get started with BERT

BERT is the most important new tool in NLP. Ready to become a BERT expert?

With BERT, you can achieve high accuracy on a variety of NLP tasks with little design effort.

What you'll learn

The Collection includes ALL of my BERT-related content!

BERT's Architecture

The Inner Workings of BERT eBook provides an in-depth tutorial of BERT's architecture and why it works.

BERT's Applications

Tutorials and example code for a wide variety of common BERT use cases will help jump-start your own project.

Launch your BERT project

The BERT Collection includes 12 examples--all are written in Python, built on PyTorch and the huggingface/transformers library, and run on a free GPU in Google Colab!

Document Classification

Text Classification

Learn the basics of classifying longer pieces of text with BERT.


Text classification, but now on a dataset where document length is more crucial, and where GPU memory becomes a limiting factor.


Learn how to customize BERT's classification layer for different tasks--in this case, classifying text where each sample can have multiple labels.
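
To give a feel for what "multiple labels per sample" means at the output layer, here is a minimal sketch in plain Python. It assumes a common approach (not necessarily the one used in the Notebook): apply an independent sigmoid to each label's score instead of a softmax across labels, and keep every label whose probability clears a threshold. The function names, label names, scores, and the 0.5 threshold are all hypothetical.

```python
import math


def sigmoid(x):
    """Map a raw score (logit) to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, label_names, threshold=0.5):
    """Return every label whose own sigmoid probability meets the threshold.

    Each label is judged independently, so a sample can receive
    zero, one, or several labels.
    """
    probs = [sigmoid(z) for z in logits]
    return [name for name, p in zip(label_names, probs) if p >= threshold]


# Hypothetical scores from a classification head with three outputs.
label_names = ["toxic", "spam", "off_topic"]
logits = [2.0, -1.5, 0.3]
print(predict_labels(logits, label_names))
```

When training a head like this, the usual pairing is a binary cross-entropy loss computed per label, rather than the single cross-entropy loss used for one-label-per-sample classification.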

Domain-Specific Text

BERT Variants

Learn how to find and apply publicly-available variants of BERT tailored to specific domains such as medical text.

Adding Vocab

Add terms to BERT's vocabulary, and improve BERT's accuracy by continuing to pre-train BERT on unlabeled text from your domain.

Beyond English

Learn how Multilingual BERT models help you apply BERT to languages beyond English, even languages with limited training text!

Question Answering, NER

Question Answering Basics

Learn the details of how BERT searches a reference text for the answer to a given question. Try it with your own examples!
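
As a rough sketch of the idea: a BERT question answering model typically produces two scores for every token, one for "the answer starts here" and one for "the answer ends here", and the predicted answer is the span with the best combined score. The plain-Python example below shows only that final span-selection step. The function name `best_span`, the `max_answer_len` limit, and all of the scores are hypothetical, made up for illustration.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Pick the (start, end) token indices with the highest combined score.

    Only spans where end >= start and the length is at most
    max_answer_len tokens are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for i, start in enumerate(start_scores):
        last = min(i + max_answer_len, len(end_scores))
        for j in range(i, last):
            score = start + end_scores[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best


# Question and reference text packed into one token sequence.
tokens = ["[CLS]", "who", "wrote", "it", "?", "[SEP]",
          "chris", "mccormick", "wrote", "it", "[SEP]"]
# Hypothetical per-token scores, as a model might produce them.
start_scores = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 1.0, 0.0, 0.0, 0.0]
end_scores = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 4.0, 0.0, 0.0, 0.0]

start, end = best_span(start_scores, end_scores)
print(" ".join(tokens[start:end + 1]))
```

Requiring the end index to be at or after the start index matters: taking the best start and best end independently can produce a span that runs backwards.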

Fine-Tuning on SQuAD

Training BERT on the SQuAD question answering dataset is tricky, but this Notebook will walk you through it!

Named Entity Recognition

Fine-tune BERT to recognize custom entity classes in a restaurant dataset.

Basics of BERT on PyTorch

Word Embeddings

Learn the basics of BERT's input formatting, and how to extract "contextualized" word and sentence embeddings from text.
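
For a quick preview of the input formatting, here is a minimal plain-Python sketch. BERT expects a `[CLS]` token at the start, a `[SEP]` token after each text segment, segment IDs that tell the first segment from the second, and an attention mask that marks which positions are real tokens and which are padding. The helper name `format_bert_input` and the `max_len` parameter are hypothetical; in practice the tokenizer in the huggingface/transformers library handles these steps for you, and the tokens would also be converted to vocabulary IDs.

```python
def format_bert_input(tokens_a, tokens_b=None, max_len=12):
    """Build BERT-style tokens, segment IDs, and attention mask.

    tokens_a and tokens_b are lists of already-tokenized strings.
    The result is padded with [PAD] up to max_len positions.
    """
    tokens = ["[CLS]"] + list(tokens_a) + ["[SEP]"]
    segment_ids = [0] * len(tokens)

    if tokens_b:
        tokens += list(tokens_b) + ["[SEP]"]
        # The second segment and its closing [SEP] get segment ID 1.
        segment_ids += [1] * (len(tokens_b) + 1)

    if len(tokens) > max_len:
        raise ValueError("input is longer than max_len; truncate it first")

    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [1] * len(tokens)

    pad_count = max_len - len(tokens)
    tokens += ["[PAD]"] * pad_count
    segment_ids += [0] * pad_count
    attention_mask += [0] * pad_count

    return tokens, segment_ids, attention_mask


tokens, segment_ids, attention_mask = format_bert_input(
    ["hello", "world"], ["hi"], max_len=8
)
print(tokens)
print(segment_ids)
print(attention_mask)
```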

Sentence Classification

Learn the basics of fine-tuning BERT with PyTorch and the huggingface/transformers library.


See how to adapt any of our examples to train on a multi-GPU system.

Hi, I'm Chris McCormick

I help researchers, students, and developers like you to master the most difficult concepts in AI...

with legible code, simple illustrations, and video walkthroughs.

The Inner Workings of BERT eBook

Learn BERT's architecture and implementation, and gain insight into why it works so well!

