Added Chapters 12 and 13

Christian Risi
2025-09-16 20:54:51 +02:00
parent f5bc43c59b
commit ed35a6196d
2 changed files with 64 additions and 7 deletions


@@ -21,8 +21,8 @@ to the first `encoder` and `decoder`.
Here we transform each word of the input into an ***embedding*** and add a vector to account for
its position. This positional encoding can either be learnt or follow these formulas:
- Even size:
$$
\text{positional\_encoding}_{
(position,\ 2\text{size})
} =
\sin\left(
\frac{position}{
10000^{\frac{2\text{size}}{\text{embedding\_size}}}
}
\right)
$$
- Odd size:
$$
\text{positional\_encoding}_{
(position,\ 2\text{size} + 1)
} =
\cos\left(
\frac{position}{
10000^{\frac{2\text{size}}{\text{embedding\_size}}}
}
\right)
$$
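
A minimal NumPy sketch of these two formulas (the function and argument names here are illustrative, not from the notes; it assumes an even embedding size):

```python
import numpy as np

def positional_encoding(max_position: int, embedding_size: int) -> np.ndarray:
    """Sinusoidal positional encodings, one row per position."""
    positions = np.arange(max_position)[:, np.newaxis]         # (max_position, 1)
    two_size = np.arange(0, embedding_size, 2)[np.newaxis, :]  # even indices 2*size
    angles = positions / np.power(10000.0, two_size / embedding_size)

    encoding = np.zeros((max_position, embedding_size))
    encoding[:, 0::2] = np.sin(angles)  # even dimensions use sine
    encoding[:, 1::2] = np.cos(angles)  # odd dimensions use cosine
    return encoding

# Each input embedding is then summed element-wise with the row
# corresponding to its position in the sequence.
```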
### Encoder
> [!CAUTION]
@@ -164,17 +165,17 @@ It can be used as a classifier and can be fine tuned.
Pre-training happens by **masking** the input and **predicting** the **masked words** (a sketch follows the list below):
- 15% of total words in the input are masked
  - 80% will become a `[masked]` token
  - 10% will become random words
  - 10% will remain unchanged
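
A minimal sketch of this 15% / 80-10-10 scheme (real BERT operates on subword ids from a tokenizer; the plain word strings and names below are illustrative):

```python
import random

MASK_TOKEN = "[masked]"

def mask_tokens(tokens: list[str], vocab: list[str], mask_rate: float = 0.15):
    """Return the corrupted tokens plus {position: original} targets to predict."""
    masked, targets = list(tokens), {}
    for i, token in enumerate(tokens):
        if random.random() >= mask_rate:  # ~85% of words are not selected
            continue
        targets[i] = token                # the model must recover this word
        roll = random.random()
        if roll < 0.8:                    # 80% of selected: [masked] token
            masked[i] = MASK_TOKEN
        elif roll < 0.9:                  # 10%: replaced by a random word
            masked[i] = random.choice(vocab)
        # remaining 10%: the word is left unchanged (but still predicted)
    return masked, targets
```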
#### BERT tasks
- **Classification**
- **Fine Tuning**
- **Two-sentence tasks**
  - **Are they paraphrases?**
  - **Does one sentence follow from the other?**
- **Feature Extraction**: allows us to extract features to use in our own models (see the sketch below)
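
As one concrete illustration of feature extraction (the notes do not name a library; the Hugging Face `transformers` package and the `bert-base-uncased` checkpoint are one common choice), hidden states from a pre-trained BERT can be reused as input features for another model:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Transformers produce contextual features.", return_tensors="pt")
outputs = model(**inputs)

# One hidden vector per token; the first ([CLS]) position is commonly
# taken as a fixed-size sentence feature for a downstream classifier.
features = outputs.last_hidden_state[:, 0, :]  # shape: (1, 768)
```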
### GPT-2