From 24cde3c8c1e689ea242160b896ea7cef333fefa5 Mon Sep 17 00:00:00 2001
From: Christian Risi <75698846+CnF-Gris@users.noreply.github.com>
Date: Thu, 20 Nov 2025 18:51:34 +0100
Subject: [PATCH] Fixed typo

---
 Chapters/5-Optimization/INDEX.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Chapters/5-Optimization/INDEX.md b/Chapters/5-Optimization/INDEX.md
index 90d9542..b9e9928 100644
--- a/Chapters/5-Optimization/INDEX.md
+++ b/Chapters/5-Optimization/INDEX.md
@@ -280,15 +280,15 @@ small value, usually in the order of $10^{-8}$
 
 > This example is tough to understand if we where to apply it to a matrix $W$
 > instead of a vector. To make it easier to understand in matricial notation:
 >
-> $$
+$$
 \begin{aligned}
 \nabla L^{(k + 1)} &= \frac{d \, Loss^{(k)}}{d \, W^{(k)}} \\
 G^{(k + 1)} &= G^{(k)} +(\nabla L^{(k+1)}) ^2 \\
 W^{(k+1)} &= W^{(k)} - \eta \frac{\nabla L^{(k + 1)}}
 {\sqrt{G^{(k+1)} + \epsilon}}
 \end{aligned}
-> $$
->
+$$
+>
 > In other words, compute the gradient and scale it for the sum of its squares
 > until that point
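
Note: the update rule in the hunk above is the AdaGrad step from the chapter. For reference, here is a minimal NumPy sketch of one such step; it is not part of the patch or the chapter, and the names `adagrad_step`, `eta`, and `eps` are illustrative assumptions.

```python
import numpy as np

def adagrad_step(W, G, grad, eta=0.01, eps=1e-8):
    """One AdaGrad update for a weight matrix W (illustrative sketch).

    G accumulates the element-wise squares of all past gradients;
    eps (on the order of 1e-8, as the chapter notes) guards against
    division by zero.
    """
    G = G + grad ** 2                      # G^(k+1) = G^(k) + (grad L^(k+1))^2
    W = W - eta * grad / np.sqrt(G + eps)  # scale each entry by its own history
    return W, G
```

Because G only ever grows, the effective step size for each parameter shrinks over time, which is exactly the "scale the gradient by the sum of its squares until that point" behaviour the quoted equations describe.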