Added Lion
parent
0dbb35b43a
commit
26e0f11b42
48
Chapters/5-Optimization/Fancy-Methods/LION.md
Normal file
@ -0,0 +1,48 @@
# Lion (evoLved sIgn mOmeNtum)[^official-paper]

`Lion` is an `optimizer` discovered by a ***genetic search algorithm*** aimed at
finding the best `optimizer`.

The search starts from a population of `AdamW` algorithms to
***speed it up***. Unlike
`Adam` and `AdamW`, `Lion` keeps track
***only of the momentum*** and computes its update with the ***sign operation***,
requiring ***less `memory`***.

Since the sign operation gives ***uniform updates, which yield larger norms***,
`Lion` requires a ***smaller `learning-rate`***
and a ***larger decoupled `weight-decay`***
$\lambda$[^official-paper-1].

The ***advantages of `Lion` over `Adam` and `AdamW`
increase with the size of
the `mini-batch`***[^official-paper-1].

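The update described above can be sketched in a few lines. This is only a minimal sketch, assuming the update rule from the cited paper (sign of an interpolation between momentum and gradient, plus decoupled weight decay). The name `lion_step`, the plain-list representation and the default hyperparameter values are illustrative assumptions, not an official implementation.

```python
from typing import List, Tuple


def sign(x: float) -> float:
    """Return -1.0, 0.0 or 1.0 depending on the sign of x."""
    return float((x > 0) - (x < 0))


def lion_step(
    params: List[float],
    grads: List[float],
    momentum: List[float],
    lr: float = 1e-4,          # assumed value, smaller than a typical AdamW rate
    beta1: float = 0.9,        # assumed interpolation factor for the update
    beta2: float = 0.99,       # assumed decay factor for the stored momentum
    weight_decay: float = 0.0,  # decoupled weight decay (lambda)
) -> Tuple[List[float], List[float]]:
    """Apply one Lion-style step and return (new_params, new_momentum)."""
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Interpolate momentum and gradient, then keep only the sign.
        update = sign(beta1 * m + (1.0 - beta1) * g)
        # Decoupled weight decay is added to the update, as in AdamW.
        new_params.append(p - lr * (update + weight_decay * p))
        # The momentum is the only state carried over to the next step.
        new_momentum.append(beta2 * m + (1.0 - beta2) * g)
    return new_params, new_momentum


# Hypothetical toy values: a large, a tiny and a zero gradient.
params, momentum = lion_step(
    [1.0, -2.0, 0.5], [0.3, -0.001, 0.0], [0.0, 0.0, 0.0], lr=0.1
)
```

In this toy call the large and the tiny gradient move their parameters by the same amount (`lr`), which is the uniform update magnitude mentioned above, and a single state list is kept where `Adam` and `AdamW` keep two.
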
## Symbolic Representation[^official-paper-2]

New ***candidate algorithms*** are represented
`symbolically`, bringing these advantages:

- It matches the fact that `algorithms` must be ***implemented*** as `programs`
- These `algorithms` are ***easier to analyze, comprehend and transfer to
new tasks*** than parameterized
models such as `NeuralNetworks`
- We can **estimate the *complexity*** by looking
at the ***length of code***

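The last point of the list above can be illustrated with a toy encoding. This is a hedged sketch: the `(output, operation, inputs)` statement format, the operation names and both example programs are hypothetical and only roughly mimic the optimizers they are named after; the search space used in the paper is not reproduced here.

```python
from typing import List, Tuple

# Hypothetical encoding: one statement is (output, operation, inputs).
Statement = Tuple[str, str, Tuple[str, ...]]
Program = List[Statement]

# Rough, illustrative program in the spirit of a sign-momentum update.
lion_like: Program = [
    ("c", "interp", ("m", "g", "beta1")),
    ("update", "sign", ("c",)),
    ("m", "interp", ("m", "g", "beta2")),
]

# Rough, illustrative program in the spirit of an Adam-style update.
adam_like: Program = [
    ("m", "interp", ("m", "g", "beta1")),
    ("g2", "square", ("g",)),
    ("v", "interp", ("v", "g2", "beta2")),
    ("s", "sqrt", ("v",)),
    ("d", "add", ("s", "eps")),
    ("update", "div", ("m", "d")),
]


def complexity(program: Program) -> int:
    """Estimate complexity as the number of statements in the program."""
    return len(program)


def simplest(programs: List[Program]) -> Program:
    """Pick the shortest program among the candidates."""
    return min(programs, key=complexity)


shortest = simplest([adam_like, lion_like])
```

Because a program is plain data, its statements can be read, compared and counted directly, which is not possible in the same way with the weights of a neural network.
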
## Tournament[^official-paper-3]

The best code is found with a ***tournament style
evolution***. Each cycle it picks, through a tournament, the ***best
`algorithm`***, which is
***copied and mutated*** into a new `algorithm` added to the population, while the ***oldest is removed***.

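The cycle described above can be sketched as follows. This is a minimal sketch under stated assumptions: programs are stood in for by lists of integers, and `toy_fitness`, `toy_mutate`, the tournament size and the number of cycles are all hypothetical. Only the loop structure (pick the best of a tournament, copy and mutate it, drop the oldest) reflects the text.

```python
import random
from collections import deque
from typing import Callable, Deque, List

Program = List[int]  # hypothetical stand-in for a symbolic program


def evolve(
    initial: List[Program],
    fitness: Callable[[Program], float],
    mutate: Callable[[Program, random.Random], Program],
    cycles: int,
    tournament_size: int,
    rng: random.Random,
) -> List[Program]:
    """Run the tournament cycles and return the final population."""
    population: Deque[Program] = deque(initial)  # oldest program on the left
    for _ in range(cycles):
        # Tournament: sample a few programs and keep the best one as parent.
        contestants = rng.sample(list(population), tournament_size)
        parent = max(contestants, key=fitness)
        # Copy and mutate the parent, then add the child to the population.
        population.append(mutate(list(parent), rng))
        # Remove the oldest program, whatever its fitness.
        population.popleft()
    return list(population)


def toy_fitness(program: Program) -> float:
    """Hypothetical fitness: programs closer to all zeros score higher."""
    return -float(sum(abs(x) for x in program))


def toy_mutate(program: Program, rng: random.Random) -> Program:
    """Hypothetical mutation: nudge one randomly chosen entry by +1 or -1."""
    i = rng.randrange(len(program))
    program[i] += rng.choice((-1, 1))
    return program


# Warm start: every member of the initial population is the same program,
# mirroring a search that starts from copies of one known algorithm.
start = [[3, -2, 1] for _ in range(4)]
final = evolve(start, toy_fitness, toy_mutate, cycles=50,
               tournament_size=2, rng=random.Random(0))
best = max(final, key=toy_fitness)
```

Removing the oldest program instead of the worst one keeps the population size constant and prevents a single early winner from staying in the population forever.
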
<!-- Footnotes -->

[^official-paper]: [Official Lion Paper | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)

[^official-paper-1]: [Official Lion Paper | Paragraph 1 pg. 3 | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)

[^official-paper-2]: [Official Lion Paper | Paragraph 1 pg. 3 | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)

[^official-paper-3]: [Official Lion Paper | Paragraph 2 pg. 4-5 | arXiv:2302.06675v4](https://arxiv.org/pdf/2302.06675)