
2025-04-20 11:59:16 +02:00
# Lion (evoLved sIgn mOmeNtum)[^official-paper]
`Lion` is an `optimizer` ***discovered by a genetic search algorithm***
aimed at finding the best `optimizer`.
The search starts from a population of `AdamW` algorithms to
***speed it up***. As opposed to
`Adam` and `AdamW`, `Lion` keeps track
***only of the momentum*** and applies only the ***sign*** of the update direction,
requiring ***less `memory`***.
Since the ***sign produces uniform updates with a larger norm***,
`Lion` requires a ***smaller `learning-rate`***
and a ***larger decoupled `weight-decay`***
$\lambda$[^official-paper-1].
The ***advantages of `Lion` over `Adam` and `AdamW`
increase with the size of
the `mini-batch`***[^official-paper-1].
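
The update described above can be sketched in a few lines of plain Python.
This is a minimal illustration, not a reference implementation: the function
name `lion_step`, the list-of-floats representation and the default
`lr` are assumptions made for the example, while `beta1 = 0.9` and
`beta2 = 0.99` follow the defaults commonly quoted for `Lion`.

```python
from typing import List, Tuple


def sign(x: float) -> float:
    """Return -1.0, 0.0 or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def lion_step(
    params: List[float],
    grads: List[float],
    momentum: List[float],
    lr: float = 1e-4,           # assumed default, usually smaller than for AdamW
    beta1: float = 0.9,
    beta2: float = 0.99,
    weight_decay: float = 0.0,  # decoupled weight decay (lambda)
) -> Tuple[List[float], List[float]]:
    """Apply one Lion-style update and return (new_params, new_momentum)."""
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Only the sign of the interpolated direction is used,
        # so every coordinate moves by the same magnitude.
        update = sign(beta1 * m + (1.0 - beta1) * g)
        # Decoupled weight decay is added to the update, not to the gradient.
        new_params.append(p - lr * (update + weight_decay * p))
        # The momentum is the only state kept between steps.
        new_momentum.append(beta2 * m + (1.0 - beta2) * g)
    return new_params, new_momentum


if __name__ == "__main__":
    params, momentum = lion_step(
        params=[1.0, -2.0, 0.5],
        grads=[0.3, -4.0, 0.0],
        momentum=[0.0, 0.0, 0.0],
        lr=0.1,
    )
    print(params, momentum)
```

With `weight_decay = 0`, each coordinate with a non-zero direction moves by
exactly `lr`, whatever the size of its gradient. This is why the
`learning-rate` is usually chosen smaller than for `AdamW`.
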
## Symbolic Representation[^official-paper-2]
Newly ***discovered algorithms*** are represented
`symbolically`, bringing these advantages:
- It matches the fact that `algorithms` must be ***implemented*** as `programs`
  in order to be executed
- It is ***easier to analyze, comprehend and transfer to
  new tasks*** these `algorithms`, rather than
  parameterized models such as `NeuralNetworks`
- We can **estimate the *complexity*** of an `algorithm` by looking
  at the ***length of its code***
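
As a rough illustration of the last point, the length of a program can be
measured by counting its statements. The sketch below uses Python's `ast`
module. The two candidate programs are simplified, hypothetical stand-ins
written for this example (no bias correction, no weight decay), and
`program_length` is an assumed helper name, not something taken from the paper.

```python
import ast

# Hypothetical, simplified candidate programs stored as source code.
# They are only parsed, never executed, so `sign` and `sqrt` need no import.
LION_LIKE = """
def train(w, g, m, lr):
    update = sign(0.9 * m + 0.1 * g)
    m = 0.99 * m + 0.01 * g
    return w - lr * update, m
"""

ADAMW_LIKE = """
def train(w, g, m, v, lr):
    m = 0.9 * m + 0.1 * g
    v = 0.999 * v + 0.001 * g * g
    update = m / (sqrt(v) + 1e-8)
    return w - lr * update, m, v
"""


def program_length(source: str) -> int:
    """Count the statements of a program as a crude complexity proxy."""
    tree = ast.parse(source)
    return sum(isinstance(node, ast.stmt) for node in ast.walk(tree))


if __name__ == "__main__":
    for name, src in [("lion-like", LION_LIKE), ("adamw-like", ADAMW_LIKE)]:
        print(name, program_length(src))
```

Counting statements is only one possible measure. Counting `ast` nodes or
source lines would serve the same purpose.
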
## Tournament[^official-paper-3]
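The best code is found with a ***tournament-style
evolution***. At each cycle, a few `algorithms` are sampled at random from
the population and the ***best `algorithm`*** among them is
***copied and mutated***. The resulting child is added to the population
and the ***oldest `algorithm` is removed***.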
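
A minimal sketch of such a loop is shown below, assuming that each cycle
samples a small random subset of the population as contestants. The candidates
here are plain floats with a toy fitness, standing in for real programs.
`evolve`, `toy_fitness` and `toy_mutate` are hypothetical names chosen for
the example.

```python
import random
from collections import deque
from typing import Callable, List, Tuple


def evolve(
    fitness: Callable[[float], float],
    mutate: Callable[[float, random.Random], float],
    initial_population: List[float],
    cycles: int,
    tournament_size: int,
    rng: random.Random,
) -> Tuple[float, List[float]]:
    """Run a tournament-style evolution and return (best_seen, final_population)."""
    population = deque(initial_population)  # the oldest candidate sits on the left
    best = max(population, key=fitness)
    for _ in range(cycles):
        # Tournament: sample a few candidates, the fittest becomes the parent.
        contestants = rng.sample(list(population), tournament_size)
        parent = max(contestants, key=fitness)
        # The parent is copied and mutated to produce a child.
        child = mutate(parent, rng)
        population.append(child)
        # The oldest candidate is removed, even if it was a good one.
        population.popleft()
        if fitness(child) > fitness(best):
            best = child
    return best, list(population)


def toy_fitness(x: float) -> float:
    """Toy objective with its maximum at x = 3."""
    return -(x - 3.0) ** 2


def toy_mutate(x: float, rng: random.Random) -> float:
    """Perturb a candidate with a small amount of Gaussian noise."""
    return x + rng.gauss(0.0, 0.5)


if __name__ == "__main__":
    rng = random.Random(0)
    start = [rng.uniform(-10.0, 10.0) for _ in range(8)]
    best, _ = evolve(toy_fitness, toy_mutate, start, 200, 3, rng)
    print(best, toy_fitness(best))
```

Because the oldest candidate is dropped regardless of its fitness, the sketch
tracks the best candidate seen so far separately from the population.
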
<!-- Footnotes -->
[^official-paper]: [Official Lion Paper | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)
[^official-paper-1]: [Official Lion Paper | Paragraph 1 pg. 3 | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)
[^official-paper-2]: [Official Lion Paper | Paragraph 1 pg. 3 | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)
[^official-paper-3]: [Official Lion Paper | Paragraph 2 pg. 4-5 | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)