What is BERT

An introduction to the BERT model

BERT is a large-scale transformer-based language model developed by Google that can be fine-tuned for downstream tasks. Details can be found in the paper at https://arxiv.org/abs/1810.04805

Transformer: an advancement over recurrent neural networks. It processes all tokens of an input instance in parallel, during both training and inference, and its output sequence has the same length as its input, which is convenient.
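The two properties above can be seen in a minimal self-attention sketch (the core operation of a transformer layer): every position is computed in one matrix product rather than step by step, and the output has one row per input token. The weight matrices and dimensions below are arbitrary illustrative values, not BERT's actual parameters.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a token sequence X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # (n, n) pairwise scores, all at once
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # one output vector per input token

rng = np.random.default_rng(0)
n, d = 5, 8                                            # 5 tokens, 8-dim embeddings (toy sizes)
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                                       # (5, 8): same length as the input
```

Unlike an RNN, no step here depends on the previous token's hidden state, so the whole sequence can be computed in parallel.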

From Pre-Training to Fine-Tuning: BERT is first pre-trained on large unlabeled corpora, then fine-tuned on labeled data for a specific downstream task.

BERT model overview
  1. Token Embedding: maps each token to its ID in the vocabulary: e.g. "my" -> 1

  2. Segment Embedding: indicates which segment the token belongs to: e.g. sentence A or sentence B

  3. Position Embedding: encodes the position of the token within the whole input sequence
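BERT's input representation is the element-wise sum of these three embeddings. The sketch below illustrates that with toy lookup tables; the vocabulary, dimensions, and random values are hypothetical, not BERT's real ones.

```python
import numpy as np

# Toy vocabulary and hyperparameters (illustrative values only).
vocab = {"[CLS]": 0, "my": 1, "dog": 2, "[SEP]": 3, "it": 4, "barks": 5}
d_model, max_len, n_segments = 8, 16, 2

rng = np.random.default_rng(0)
token_table = rng.normal(size=(len(vocab), d_model))    # one vector per token ID
segment_table = rng.normal(size=(n_segments, d_model))  # sentence A vs. sentence B
position_table = rng.normal(size=(max_len, d_model))    # one vector per position

# "[CLS] my dog [SEP] it barks": first four tokens belong to segment A, the rest to B.
token_ids = [0, 1, 2, 3, 4, 5]
segment_ids = [0, 0, 0, 0, 1, 1]
position_ids = list(range(len(token_ids)))

# The input to the transformer stack: sum of the three embeddings per token.
embeddings = (token_table[token_ids]
              + segment_table[segment_ids]
              + position_table[position_ids])
print(embeddings.shape)  # (6, 8): one d_model-dim vector per input token
```

In real BERT these tables are learned parameters and the token IDs come from a WordPiece tokenizer; the summation itself works exactly as shown.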
