Added Chapters 12 and 13

Here we transform each word of the input into an ***embedding*** and add a vector to account for
position. This positional encoding can either be learnt or can follow these formulas (a short code sketch follows the two cases below):

- Even size:

$$
\text{positional\_encoding}_{(position, 2\text{size})} = \sin\left(\frac{position}{10000^{2\text{size}/d_{\text{model}}}}\right)
$$

- Odd size:

$$
\text{positional\_encoding}_{(position, 2\text{size} + 1)} = \cos\left(\frac{position}{10000^{2\text{size}/d_{\text{model}}}}\right)
$$
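
Below is a minimal NumPy sketch of how these two formulas can be evaluated and summed with the word embeddings, as described above. The function name `positional_encoding`, the `d_model` dimension, and the random stand-in embeddings are illustrative assumptions, not something fixed by these notes.

```python
import numpy as np

def positional_encoding(max_position: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding: sin on even dimensions, cos on odd ones."""
    positions = np.arange(max_position)[:, np.newaxis]        # (max_position, 1)
    two_size = np.arange(0, d_model, 2)[np.newaxis, :]        # the even indices 2*size
    angles = positions / np.power(10000, two_size / d_model)  # position / 10000^(2*size / d_model)
    encoding = np.zeros((max_position, d_model))
    encoding[:, 0::2] = np.sin(angles)                        # even size -> sin
    encoding[:, 1::2] = np.cos(angles)                        # odd size  -> cos
    return encoding

# The notes say the embedding and the positional vector are added together;
# the embeddings here are random stand-ins for the learnt word embeddings.
embeddings = np.random.randn(10, 512)                         # (sequence_length, d_model)
encoder_input = embeddings + positional_encoding(10, 512)
```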

### Encoder

> [!CAUTION]

BERT can be used as a classifier and can be fine-tuned.
The fine-tuning happens by **masking** part of the input and **predicting** the **masked words** (see the sketch after this list):

- 15% of the total words in the input are masked:
  - 80% will become a `[MASK]` token
  - 10% will become random words
  - 10% will remain unchanged
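
A rough sketch of this masking scheme, assuming whitespace-tokenised input and a generic `[MASK]` string; the function name, the toy vocabulary, and the label convention are illustrative assumptions rather than BERT's actual preprocessing code.

```python
import random

def mask_tokens(tokens, vocabulary, mask_rate=0.15):
    """Pick ~15% of the tokens; of those, 80% become [MASK],
    10% become a random word, and 10% are left unchanged."""
    masked = list(tokens)
    labels = [None] * len(tokens)          # only masked positions get a label to predict
    for i, token in enumerate(tokens):
        if random.random() < mask_rate:
            labels[i] = token              # the model must recover the original word
            roll = random.random()
            if roll < 0.8:
                masked[i] = "[MASK]"
            elif roll < 0.9:
                masked[i] = random.choice(vocabulary)
            # else: keep the original token (the remaining 10%)
    return masked, labels

words = "the cat sat on the mat".split()
print(mask_tokens(words, vocabulary=["dog", "tree", "running"]))
```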

#### BERT tasks

- **Classification**
- **Fine Tuning**
- **2-sentence tasks**
  - **Are they paraphrases?**
  - **Does one sentence follow from the other one?**
- **Feature Extraction**: allows us to extract features to use in our model (see the sketch below)
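
One way to do such feature extraction is with a pretrained BERT checkpoint. The sketch below assumes the Hugging Face `transformers` library (with PyTorch) and the `bert-base-uncased` checkpoint, neither of which is prescribed by these notes.

```python
# Extract contextual features from a pretrained BERT encoder.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("The transformer encoder builds contextual features.",
                   return_tensors="pt")
outputs = model(**inputs)

# One vector per input token, usable as features in a downstream model.
token_features = outputs.last_hidden_state    # shape: (1, num_tokens, 768)
sentence_feature = token_features[:, 0]       # the [CLS] position is a common sentence summary
```
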
### GPT-2