Get Started with BERT

BERT sparked a revolution in NLP that has made BERT-based models some of the most important tools in the field. With BERT, you can achieve high accuracy with relatively little design effort on a wide variety of NLP tasks.

The Inner Workings of BERT will provide you with a detailed but approachable explanation of BERT’s entire architecture. You’ll learn how it fits within the broader field of NLP, and why it performs so well.

Ready to become a BERT expert?


What you'll learn

Intro to Transfer Learning

BERT’s strength has everything to do with a technique called Transfer Learning, so we start with a tutorial on this powerful approach to machine learning problems.

Inputs & Outputs

Before diving into the internals of BERT’s architecture, I’ve found it helpful to take a “black box” view of the model and start by understanding:

  • What BERT does to prepare your text for processing
  • How it handles unknown words
  • What it produces as its output
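To give a flavor of the second point: BERT handles unknown words by breaking them into known subword pieces (“WordPiece” tokenization). Here’s a simplified sketch of the greedy longest-match idea behind it; the tiny vocabulary is purely illustrative (BERT’s real vocabulary has roughly 30,000 entries):

```python
# Simplified sketch of WordPiece-style subword tokenization.
# The toy `vocab` below is an illustration, not BERT's actual vocabulary.
def wordpiece_tokenize(word, vocab):
    """Greedily split a word into the longest subword pieces found in vocab."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # pieces after the first get a '##' prefix
            if piece in vocab:
                match = piece
                break
            end -= 1  # no match yet; try a shorter piece
        if match is None:
            return ["[UNK]"]  # the word can't be decomposed at all
        tokens.append(match)
        start = end
    return tokens

vocab = {"em", "##bed", "##ding", "##s", "play", "##ing"}
print(wordpiece_tokenize("embeddings", vocab))  # ['em', '##bed', '##ding', '##s']
print(wordpiece_tokenize("playing", vocab))     # ['play', '##ing']
```

Because rare words decompose into common subwords like this, BERT almost never has to fall back to a true “unknown” token.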


One of BERT’s greatest strengths is its wide applicability to many common NLP tasks. It can’t do everything, though, so we’ll look at which types of applications it supports, and talk about BERT’s general strengths and weaknesses.


The bulk of this eBook is devoted to explaining the internals of BERT’s architecture, and the key concept for understanding this is a mechanism called Self-Attention. I’ll provide an intuitive explanation, as well as walk you through the actual matrix operations. 
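As a small preview of those matrix operations, here is a minimal NumPy sketch of single-head scaled dot-product self-attention. The random weight matrices stand in for the parameters a trained BERT model would have learned:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q = X @ Wq  # queries
    K = X @ Wk  # keys
    V = X @ Wv  # values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of every token pair
    # softmax over each row, so each token's attention weights sum to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output is a weighted mix of all value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))  # 5 tokens, each an 8-dimensional embedding
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # one updated 8-dim vector per token: (5, 8)
```

The key idea: every token’s output vector is a weighted blend of all the other tokens’ values, with the weights computed from query–key similarity.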

Building on this understanding, we’ll look at Multi-headed Attention.

Ready to learn a whole new skill?

Hi, I'm Chris McCormick

I help researchers, students, and developers like you to master the most difficult concepts in AI...

with legible code, simple illustrations, and video walkthroughs.

Sneak Peek

