From 72eb937b471ac863d17bdac4c7e8cb28412b844e Mon Sep 17 00:00:00 2001
From: Christian Risi <75698846+CnF-Gris@users.noreply.github.com>
Date: Wed, 17 Sep 2025 12:51:14 +0200
Subject: [PATCH] Fixed Markdown violations

---
 docs/RESOURCES.md | 43 ++++++++++++++++++++++++++-----------------
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/docs/RESOURCES.md b/docs/RESOURCES.md
index 65c47d6..225b83c 100644
--- a/docs/RESOURCES.md
+++ b/docs/RESOURCES.md
@@ -1,28 +1,31 @@
 # Byte-Pair Encoding (BPE)
 
 ## Overview
-Byte-Pair Encoding (BPE) is a simple but powerful text compression and tokenization algorithm.
+
+Byte-Pair Encoding (BPE) is a simple but powerful text compression and tokenization algorithm.
 Originally introduced as a data compression method, it has been widely adopted in **Natural Language Processing (NLP)** to build subword vocabularies for models such as GPT and BERT.
 
 ---
 
 ## Key Idea
-BPE works by iteratively replacing the most frequent pair of symbols (initially characters) with a new symbol.
+
+BPE works by iteratively replacing the most frequent pair of symbols (initially characters) with a new symbol.
 Over time, frequent character sequences (e.g., common morphemes, prefixes, suffixes) are merged into single tokens.
 
 ---
 
 ## Algorithm Steps
-1. **Initialization**
+
+1. **Initialization**
    - Treat each character of the input text as a token.
 
-2. **Find Frequent Pairs**
+2. **Find Frequent Pairs**
    - Count all adjacent token pairs in the sequence.
 
-3. **Merge Most Frequent Pair**
+3. **Merge Most Frequent Pair**
    - Replace the most frequent pair with a new symbol not used in the text.
 
-4. **Repeat**
+4. **Repeat**
    - Continue until no frequent pairs remain or a desired vocabulary size is reached.
 
 ---
@@ -31,14 +34,15 @@ Over time, frequent character sequences (e.g., common morphemes, prefixes, suffi
 
 Suppose the data to be encoded is:
 
-```
+```text
 aaabdaaabac
 ```
 
 ### Step 1: Merge `"aa"`
+
 Most frequent pair: `"aa"` → replace with `"Z"`
 
-```
+```text
 ZabdZabac
 Z = aa
 ```
@@ -46,9 +50,10 @@ Z = aa
 ---
 
 ### Step 2: Merge `"ab"`
+
 Most frequent pair: `"ab"` → replace with `"Y"`
 
-```
+```text
 ZYdZYac
 Y = ab
 Z = aa
@@ -57,9 +62,10 @@ Z = aa
 ---
 
 ### Step 3: Merge `"ZY"`
+
 Most frequent pair: `"ZY"` → replace with `"X"`
 
-```
+```text
 XdXac
 X = ZY
 Y = ab
@@ -73,9 +79,10 @@ At this point, no pairs occur more than once, so the process stops.
 ---
 
 ## Decompression
+
 To recover the original data, replacements are applied in **reverse order**:
 
-```
+```text
 XdXac
 → ZYdZYac
 → ZabdZabac
@@ -85,13 +92,15 @@ XdXac
 ---
 
 ## Advantages
-- **Efficient vocabulary building**: reduces the need for massive word lists.
-- **Handles rare words**: breaks them into meaningful subword units.
-- **Balances character- and word-level tokenization**.
+
+- **Efficient vocabulary building**: reduces the need for massive word lists.
+- **Handles rare words**: breaks them into meaningful subword units.
+- **Balances character- and word-level tokenization**.
 
 ---
 
 ## Limitations
-- Does not consider linguistic meaning—merges are frequency-based.
-- May create tokens that are not linguistically natural.
-- Vocabulary is fixed after training.
+
+- Does not consider linguistic meaning—merges are frequency-based.
+- May create tokens that are not linguistically natural.
+- Vocabulary is fixed after training.
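The merge loop and reverse-order decompression described in the patched `docs/RESOURCES.md` can be reproduced in a few lines. Below is a minimal Python sketch, not part of the patch itself: the helper names (`most_frequent_pair`, `bpe_compress`, `bpe_decompress`) and the pool of fresh symbols are invented for this illustration, and ties between equally frequent pairs are broken arbitrarily, so the recorded merges may differ from the Z/Y/X table in the document even though the final output (`XdXac`) and the round trip match.

```python
from collections import Counter


def most_frequent_pair(tokens):
    """Count all adjacent token pairs and return the most common one with its count."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return None, 0
    return pairs.most_common(1)[0]


def bpe_compress(text, fresh_symbols="ZYXWVUTSRQ"):
    """Greedy BPE as in the document: repeatedly replace the most frequent
    adjacent pair with a fresh symbol until no pair occurs more than once.
    The fresh symbols are assumed not to occur in the input text."""
    tokens = list(text)          # Step 1: every character is its own token
    merges = []                  # (new_symbol, pair) in the order they were applied
    for symbol in fresh_symbols:
        pair, count = most_frequent_pair(tokens)
        if pair is None or count < 2:
            break                # no pair repeats -> stop
        merges.append((symbol, pair))
        merged, i = [], 0
        while i < len(tokens):   # replace every occurrence of the pair
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
                merged.append(symbol)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return "".join(tokens), merges


def bpe_decompress(encoded, merges):
    """Undo the merges in reverse order to recover the original text."""
    text = encoded
    for symbol, (left, right) in reversed(merges):
        text = text.replace(symbol, left + right)
    return text


if __name__ == "__main__":
    encoded, merges = bpe_compress("aaabdaaabac")
    print(encoded)       # XdXac
    print(merges)        # the three recorded merges
    assert bpe_decompress(encoded, merges) == "aaabdaaabac"
```

Running the script prints the compressed string and the merge table; as the document notes, subword tokenizers typically stop at a target vocabulary size rather than waiting for pairs to stop repeating.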