Multilingual BERT

How do you apply BERT's magic to languages beyond just English? Why isn't it always as simple as re-training BERT on text from your language?

In four Colab Notebooks with a video walkthrough, this tutorial explains, implements, and compares several approaches:

  • Monolingual Models 
  • Augmenting with Machine Translated Text
  • Multilingual Models 


4-Part Tutorial

4x Colab Notebooks   +   Video Walkthrough
PyTorch   +   huggingface/transformers

1. Introduction & Concepts

  • What's a "monolingual" model vs. a "multilingual" model? 
  • Why isn't monolingual the obvious choice?
  • What about Machine Translation?

2. Inspect XLM-R's Vocabulary

A model trained on 100 different languages must have a pretty strange vocabulary--let's see what's in there!
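As a taste of what that inspection looks like, here is a minimal sketch using the `transformers` library. It only downloads XLM-R's tokenizer files (not the full model weights); the example strings are my own, chosen to show the shared vocabulary handling different scripts.

```python
from transformers import AutoTokenizer

# Download just XLM-R's tokenizer -- the model weights aren't needed
# to poke around in the vocabulary.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

# XLM-R uses a single SentencePiece vocabulary shared across all 100 languages.
print("Vocabulary size:", tokenizer.vocab_size)  # 250,002 tokens

# Tokenize the same greeting in a few different scripts to see how
# the shared vocabulary segments each one.
for text in ["hello", "مرحبا", "здравствуйте"]:
    print(text, "->", tokenizer.tokenize(text))
```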

3. Multilingual Approach with XLM-R

  • Code tutorial applying XLM-R on Arabic.
  • Leverages cross-lingual transfer: we'll fine-tune on English data, then test on Arabic data!
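The core idea of cross-lingual transfer can be sketched in a few lines. To keep this runnable without downloading the real 1.1 GB checkpoint, the snippet below builds a tiny, randomly-initialized XLM-R with made-up sizes and fake token IDs; in the actual Notebook you would call `XLMRobertaForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=3)` and feed real tokenized XNLI examples.

```python
import torch
from transformers import XLMRobertaConfig, XLMRobertaForSequenceClassification

# Tiny stand-in config (hypothetical sizes) so this sketch runs quickly;
# the real tutorial loads the pre-trained "xlm-roberta-base" checkpoint instead.
config = XLMRobertaConfig(vocab_size=250002, hidden_size=64, num_hidden_layers=2,
                          num_attention_heads=2, intermediate_size=128, num_labels=3)
model = XLMRobertaForSequenceClassification(config)

# Cross-lingual transfer: there is ONE shared model, so a fine-tuning step on an
# English batch updates the same weights later used to classify Arabic text.
english_ids = torch.randint(0, config.vocab_size, (2, 8))  # stand-in English batch
labels = torch.tensor([0, 2])                              # NLI labels are 0, 1, 2
outputs = model(input_ids=english_ids, labels=labels)
outputs.loss.backward()                                    # gradients for one training step

# At test time, Arabic sentences pass through the exact same model.
arabic_ids = torch.randint(0, config.vocab_size, (2, 8))   # stand-in Arabic batch
logits = model(input_ids=arabic_ids).logits
print(logits.shape)                                        # one score per NLI class
```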

4. Monolingual Approach

  • Code tutorial using a community-created Arabic BERT model.
  • Train with machine-translated text. 
    • (Note: This Notebook uses existing translated text from the XNLI dataset--it does not include code for translating new text).
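Switching to a monolingual model is mostly a one-line change: point `from_pretrained()` at a community checkpoint instead of `"xlm-roberta-base"`. The sketch below uses `asafaya/bert-base-arabic`, one community-created Arabic BERT on the Hugging Face Hub, as an illustrative choice; any Arabic BERT checkpoint would work the same way.

```python
from transformers import AutoTokenizer

# Load the tokenizer of a community-created Arabic BERT from the Hub.
# (Model choice is illustrative -- substitute whichever checkpoint you find
# for your language.)
arabic_tok = AutoTokenizer.from_pretrained("asafaya/bert-base-arabic")

# A dedicated Arabic vocabulary tends to segment Arabic text into fewer,
# more meaningful pieces than XLM-R's shared 100-language vocabulary.
tokens = arabic_tok.tokenize("مرحبا بالعالم")  # "Hello, world"
print(tokens)
```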


Ready to learn a whole new skill?

NLP Base Camp Members have complete access to this tutorial
and all of my NLP content!

Supported Languages

These Notebooks can be easily modified to run for any of the 15 languages included in the XNLI benchmark!

  1. Arabic
  2. Bulgarian
  3. German
  4. Greek
  5. English
  6. Spanish
  7. French
  8. Hindi
  9. Russian
  10. Swahili
  11. Thai
  12. Turkish
  13. Urdu
  14. Vietnamese
  15. Chinese

(Note: the Monolingual Notebook requires finding a BERT model trained on your language.)

