From 26e0f11b428d0e6c44bc0c1049cc7014ce63aae6 Mon Sep 17 00:00:00 2001
From: Christian Risi <75698846+CnF-Gris@users.noreply.github.com>
Date: Sun, 20 Apr 2025 11:59:16 +0200
Subject: [PATCH] Added Lion

---
 Chapters/5-Optimization/Fancy-Methods/LION.md | 48 +++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100644 Chapters/5-Optimization/Fancy-Methods/LION.md

diff --git a/Chapters/5-Optimization/Fancy-Methods/LION.md b/Chapters/5-Optimization/Fancy-Methods/LION.md
new file mode 100644
index 0000000..ed945ca
--- /dev/null
+++ b/Chapters/5-Optimization/Fancy-Methods/LION.md
@@ -0,0 +1,48 @@
+# Lion (evoLved sIgn mOmeNtum)[^official-paper]
+
+`Lion` is an `optimizer` ***discovered by an evolutionary
+(genetic) program search*** over `optimization algorithms`.
+
+The search starts from a population of `AdamW` algorithms to
+***speed it up***. Unlike
+`Adam` and `AdamW`, `Lion` keeps track
+***only of the momentum*** and applies the ***sign of the update***,
+requiring ***less `memory`***.
+
+Since ***uniform-magnitude sign updates yield larger norms***,
+`Lion` requires a ***smaller `learning-rate`***
+and a ***larger decoupled `weight-decay`***
+$\lambda$[^official-paper-1].
+
+The ***advantages of `Lion` over `Adam` and `AdamW`
+increase with the size of
+the `mini-batch`***[^official-paper-1].
+
+## Symbolic Representation[^official-paper-2]
+
+***Candidate algorithms*** are represented
+`symbolically`, as programs, bringing these advantages:
+
+- `Algorithms` must be ***implemented*** as `programs` anyway
+- These `algorithms` are ***easier to analyze, comprehend and transfer to
+  new tasks*** than parameterized
+  models such as `NeuralNetworks`
+- We can **estimate the *complexity*** by looking
+  at the ***length of the code***
+
+## Tournament[^official-paper-3]
+
+The best code is found with a ***tournament-style
+evolution***. Each cycle picks the ***best
+`algorithm`*** among a random sample of the population; it is
+***copied and mutated*** into a child, and the ***oldest `algorithm` is removed***.
+
+
+
+[^official-paper]: [Official Lion Paper | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)
+
+[^official-paper-1]: [Official Lion Paper | Paragraph 1 pg. 3 | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)
+
+[^official-paper-2]: [Official Lion Paper | Paragraph 1 pg. 3 | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)
+
+[^official-paper-3]: [Official Lion Paper | Paragraph 2 pg. 4-5 | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)
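The notes added by this patch describe `Lion` as tracking only the momentum and stepping along a sign, with a decoupled `weight-decay` $\lambda$. A minimal sketch of one such step follows, assuming the update rule published in the Lion paper cited in the footnotes. The names `sign` and `lion_step` and the demo values are hypothetical, chosen only for illustration. `beta1=0.9` and `beta2=0.99` are the defaults reported in the paper, and the `lr` default is just a placeholder.

```python
import math


def sign(x: float) -> float:
    """Return -1.0, 0.0 or 1.0 depending on the sign of x."""
    return float((x > 0) - (x < 0))


def lion_step(params, grads, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """One Lion step on flat lists of floats (illustrative sketch only).

    Returns the new parameters and the new momentum; inputs are not mutated.
    """
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Interpolate momentum and current gradient, then keep only the sign,
        # so every coordinate moves by the same magnitude.
        update = sign(beta1 * m + (1.0 - beta1) * g)
        # Decoupled weight decay is added to the update before scaling by lr.
        new_params.append(p - lr * (update + weight_decay * p))
        # The momentum, the only optimizer state kept, uses a second factor.
        new_momentum.append(beta2 * m + (1.0 - beta2) * g)
    return new_params, new_momentum


# Demo with gradients of very different magnitudes (made-up values).
params = [1.0, -2.0, 0.5, 3.0]
grads = [0.3, -0.001, 12.0, -4.0]
momentum = [0.0, 0.0, 0.0, 0.0]

new_params, new_momentum = lion_step(params, grads, momentum, lr=0.1)

# Every coordinate moved by exactly lr, whatever the gradient size, so the
# step norm is lr * sqrt(number of parameters), up to floating-point rounding.
step = [old - new for old, new in zip(params, new_params)]
step_norm = math.sqrt(sum(s * s for s in step))
```

Because each coordinate moves by the full `lr`, the step norm grows with the square root of the parameter count. This is one way to read the note's claim that uniform updates yield larger norms and therefore call for a smaller `learning-rate`.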