Fixed typo

This commit is contained in:
Christian Risi 2025-11-20 18:51:34 +01:00
parent 1a30dc6400
commit 24cde3c8c1

@@ -280,15 +280,15 @@ small value, usually in the order of $10^{-8}$
> This example is tough to understand if we were to apply it to a matrix $W$
> instead of a vector. To make it easier to understand, in matrix notation:
>
> $$
> \begin{aligned}
> \nabla L^{(k + 1)} &= \frac{d \, Loss^{(k)}}{d \, W^{(k)}} \\
> G^{(k + 1)} &= G^{(k)} + (\nabla L^{(k+1)})^2 \\
> W^{(k+1)} &= W^{(k)} - \eta \frac{\nabla L^{(k + 1)}}{\sqrt{G^{(k+1)} + \epsilon}}
> \end{aligned}
> $$
>
> In other words, compute the gradient and scale it by the square root of the
> accumulated sum of its elementwise squares up to that point
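
The update above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production optimizer: the function name `adagrad_step`, the learning rate, and the toy quadratic loss $\|W\|_F^2$ are all assumptions made for the example; only the two update equations come from the text.

```python
import numpy as np

def adagrad_step(W, grad, G, lr=0.1, eps=1e-8):
    """One AdaGrad step in matrix form (all operations are elementwise)."""
    G = G + grad ** 2                      # G^(k+1) = G^(k) + (grad L^(k+1))^2
    W = W - lr * grad / np.sqrt(G + eps)   # W^(k+1) = W^(k) - eta * grad / sqrt(G + eps)
    return W, G

# Toy usage: minimize the Frobenius norm squared of W, whose gradient is 2W.
W = np.array([[1.0, -2.0], [3.0, 0.5]])
G = np.zeros_like(W)                       # squared-gradient accumulator starts at zero
for _ in range(50):
    grad = 2.0 * W                         # gradient of ||W||_F^2
    W, G = adagrad_step(W, grad, G)
```

Note that because $G$ only grows, the effective per-element step size $\eta / \sqrt{G + \epsilon}$ shrinks over time, which is the defining behavior of AdaGrad.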