Added receptive fields section and fixed some info
parent
fc7cefb93e
commit
d23d847c2e
@ -5,14 +5,14 @@
 > [!WARNING]
 > We apply this concept ***mainly*** to `images`

-Usually, for `images`, `fcnn` (short for `f`ully
-`c`onnected `n`eural `n`etworks), are not suitable,
+Usually, for `images`, `fcnn` (short for **f**ully
+**c**onnected **n**eural **n**etworks) are not suitable,
 as `images` have a ***large number of `inputs`*** that is
 ***highly dimensional*** (e.g. a `32x32`, `RGB` picture
-has dimension of `weights`)[^anelli-convolutional-networks-1]
+has a dimension of 3072 input values)[^anelli-convolutional-networks-1]

 Combine this with the fact that ***nowadays pictures
-have (the least) `1920x1080` pixels*** makes `FCnn`
+have (at least) `1920x1080` pixels***. This makes `FCnn`
 ***prone to overfitting***[^anelli-convolutional-networks-1]
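
The arithmetic behind these numbers can be sketched as follows. This is only an illustration: the helper names and the layer width of 1000 units are assumptions made for the example, not values from the notes.

```python
def flattened_inputs(height: int, width: int, channels: int) -> int:
    """Number of values obtained by flattening an image into one vector."""
    return height * width * channels


def fc_weight_count(n_inputs: int, n_units: int) -> int:
    """Weights (biases excluded) of one fully connected layer."""
    return n_inputs * n_units


# Hypothetical layer width, chosen only for illustration.
N_UNITS = 1000

small = flattened_inputs(32, 32, 3)        # 32x32 RGB picture
full_hd = flattened_inputs(1080, 1920, 3)  # 1920x1080 RGB picture

print(small, fc_weight_count(small, N_UNITS))      # 3072 3072000
print(full_hd, fc_weight_count(full_hd, N_UNITS))  # 6220800 6220800000
```

Even one modest fully connected layer on a Full HD picture would need billions of weights under these assumptions, which is the scale problem the paragraph above points at.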

 > [!NOTE]
@ -61,13 +61,13 @@ concerning the `width` and `height`***

 <!-- TODO: Add image -->

-#### Filters
+#### Filters (aka Kernels)

 These are the ***work-horse*** of the whole `layer`.
 A filter is a ***small window that contains weights***
 and produces the `outputs`.

-<!-- TODO: Add image -->
+

 We have a ***number of `filter` equal to the `depth` of
 the `output`***.
@ -80,6 +80,9 @@ Each `filter` share the same `height` and `width` and
 has a `depth` equal to the one in the `input`, and their
 `output` is usually called `activation-map`.

+> [!WARNING]
+> Don't forget about biases, one for each `kernel`
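
As a quick check on the shapes described here, the sketch below counts the learnable parameters of one convolutional layer: each `filter` spans the full `input` `depth` and carries one bias. The function name and the example sizes are assumptions made for illustration, and square filters are assumed.

```python
def conv_layer_params(filter_side: int, in_depth: int, n_filters: int) -> int:
    """Learnable parameters of a convolutional layer with square filters.

    Every filter has filter_side * filter_side * in_depth weights
    plus a single bias; the layer has n_filters such filters, which
    is also the depth of its output.
    """
    weights_per_filter = filter_side * filter_side * in_depth
    return n_filters * (weights_per_filter + 1)  # +1 is the bias


# Hypothetical example: six 5x5 filters applied to an RGB input.
print(conv_layer_params(filter_side=5, in_depth=3, n_filters=6))  # 456
```

Note that the count does not depend on the `width` or `height` of the `input`, only on the filter size, the `input` `depth`, and the number of filters.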
+
 > [!NOTE]
 > Usually what the first `activation-maps` *learn* are
 > oriented edges, opposing colors, etc.
@ -95,8 +98,8 @@ $$
 out_{side\_len} = \frac{
 in_{side\_len} - filter_{side\_len}
 }{
-stride + 1
-}
+stride
+} + 1
 $$
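
A minimal sketch of the corrected formula, $(in_{side\_len} - filter_{side\_len}) / stride + 1$, assuming square inputs and filters and no padding. The function name is hypothetical, and raising an error when the `stride` does not divide evenly is an assumption of this example rather than something the notes prescribe.

```python
def conv_output_side(in_side: int, filter_side: int, stride: int = 1) -> int:
    """Side length of the output of a convolution without padding."""
    span = in_side - filter_side
    if span < 0:
        raise ValueError("filter is larger than the input")
    if span % stride != 0:
        # Assumption: treat a non-integer output size as an error.
        raise ValueError("stride does not fit the input evenly")
    return span // stride + 1


print(conv_output_side(32, 5, 1))  # 28
print(conv_output_side(7, 3, 2))   # 3
```

With the pre-fix version of the formula (dividing by `stride + 1`), the first case would give 13.5 instead of 28, which is why the `+ 1` has to sit outside the fraction.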

 Whenever the `stride` makes $out_{side\_len}$ ***not
@ -144,6 +147,23 @@ Pooling](#average-pooling)

 This `layer` ***introduces space invariance***

+## Receptive Fields[^youtube-video-receptive-fields]
+
+At the end of our convolutions we may want each output value to have been influenced by all
+pixels in our picture.
+
+The number of input pixels that influence an output value is called its receptive field. With a
+`stride` of 1 it grows by $k - 1$ pixels per side (an addition, not a multiplication) at each
+convolution, where $k$ is the kernel size. This happens because every kernel output is computed
+from $k$ neighbouring inputs, each of which was already influenced by several pixels.
+
+However, this means that we need to go very deep before a single output is influenced by
+all pixels.
+
+To mitigate this, we can downsample by striding. Later layers then read a sparser grid of
+positions, so each further convolution widens the receptive field by $k - 1$ times the product
+of the earlier strides, and fewer layers are needed to cover the whole picture.
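
The growth of the receptive field can be sketched with the usual recurrence: each layer adds $(k - 1)$ times the product of the strides of the layers before it. The function name and the `(kernel_size, stride)` layer description below are assumptions made for this example; only one side of a square receptive field is tracked.

```python
from typing import Sequence, Tuple


def receptive_field(layers: Sequence[Tuple[int, int]]) -> int:
    """Side length of the receptive field after a stack of convolutions.

    `layers` lists (kernel_size, stride) pairs, first layer first.
    """
    field = 1  # before any convolution, a value only sees itself
    jump = 1   # distance, in input pixels, between neighbouring outputs
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field


# Three 3x3 convolutions, first with stride 1, then with stride 2.
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7
print(receptive_field([(3, 2), (3, 2), (3, 2)]))  # 15
```

With the same three layers, striding roughly doubles the side of the receptive field in this example, which is the effect the paragraph above relies on.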
+
 ## Tips[^anelli-convolutional-networks-2]

 - `1x1` `filters` make sense. ***They allow us
@ -176,3 +196,5 @@ This `layer` ***introduces space invariance***
 [^anelli-convolutional-networks-2]: Vito Walter Anelli | Deep Learning Material 2024/2025 | PDF 7 pg. 85

 [^anelli-convolutional-networks-3]: Vito Walter Anelli | Deep Learning Material 2024/2025 | PDF 7 pg. 70
+
+[^youtube-video-receptive-fields]: [CNN Receptive Fields | YouTube | 23rd October 2025](https://www.youtube.com/watch?v=ip2HYPC_T9Q)