What is BERT

An introduction to the BERT model

BERT is a large-scale transformer-based language model developed by Google that can be fine-tuned for downstream tasks. Details can be found in the paper at https://arxiv.org/abs/1810.04805

Transformer: an advancement over recurrent neural networks. It processes all tokens of an input instance in parallel, during both training and inference, and its output sequence has the same length as its input, which is convenient.
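The two properties above can be seen in a minimal self-attention sketch (the core operation of a transformer layer): every position is computed in one matrix product rather than step by step, and the output has one row per input token. The weight matrices and dimensions below are arbitrary illustrative values, not BERT's actual parameters.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a token sequence X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # (n, n) pairwise scores, all at once
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # one output vector per input token

rng = np.random.default_rng(0)
n, d = 5, 8                                            # 5 tokens, 8-dim embeddings (toy sizes)
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                                       # (5, 8): same length as the input
```

Unlike an RNN, no step here depends on the previous token's hidden state, so the whole sequence can be computed in parallel.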

From Pre-Training to Fine-Tuning: BERT is first pre-trained on large unlabeled corpora, then fine-tuned on labeled data for a specific downstream task.

BERT model overview
  1. Token Embedding: maps each token to its ID in the vocabulary: e.g. "my" -> 1

  2. Segment Embedding: indicates which segment the token belongs to: e.g. sentence A or sentence B

  3. Position Embedding: encodes the position of the token within the whole input sequence
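BERT's input representation is the element-wise sum of these three embeddings. The sketch below illustrates that with toy lookup tables; the vocabulary, dimensions, and random values are hypothetical, not BERT's real ones.

```python
import numpy as np

# Toy vocabulary and hyperparameters (illustrative values only).
vocab = {"[CLS]": 0, "my": 1, "dog": 2, "[SEP]": 3, "it": 4, "barks": 5}
d_model, max_len, n_segments = 8, 16, 2

rng = np.random.default_rng(0)
token_table = rng.normal(size=(len(vocab), d_model))    # one vector per token ID
segment_table = rng.normal(size=(n_segments, d_model))  # sentence A vs. sentence B
position_table = rng.normal(size=(max_len, d_model))    # one vector per position

# "[CLS] my dog [SEP] it barks": first four tokens belong to segment A, the rest to B.
token_ids = [0, 1, 2, 3, 4, 5]
segment_ids = [0, 0, 0, 0, 1, 1]
position_ids = list(range(len(token_ids)))

# The input to the transformer stack: sum of the three embeddings per token.
embeddings = (token_table[token_ids]
              + segment_table[segment_ids]
              + position_table[position_ids])
print(embeddings.shape)  # (6, 8): one d_model-dim vector per input token
```

In real BERT these tables are learned parameters and the token IDs come from a WordPiece tokenizer; the summation itself works exactly as shown.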
