# Convolutional Networks[^anelli-convolutional-networks]

> [!WARNING]
> We apply this concept ***mainly*** to `images`

Usually, for `images`, `FCnn` (short for `f`ully `c`onnected `n`eural `n`etworks) are not suitable, as `images` have a ***large number of `inputs`*** that is ***highly dimensional*** (e.g. a `32x32`, `RGB` picture already requires $32 \cdot 32 \cdot 3 = 3072$ `weights` per fully connected neuron)[^anelli-convolutional-networks-1]

Combine this with the fact that ***nowadays pictures have at least `1920x1080` pixels***, and `FCnn` become ***prone to overfitting***[^anelli-convolutional-networks-1]

> [!NOTE]
>
> - From here on, `depth` is the **3rd dimension of the
>   activation volume**
> - `FCnn` are just ***traditional `NeuralNetworks`***

## ConvNet

The basic network we can achieve with a `convolutional-layer` is a `ConvNet`. It is composed of:

1. `input` (picture)
2. [`Convolutional Layer`](#convolutional-layer)
3. [`ReLU`](./../3-Activation-Functions/INDEX.md#relu)
4. [`Pooling layer`](#pooling-layer)
5. `FCnn` (normal `NeuralNetwork`)
6. `output` (class tags)

A minimal runnable sketch of this pipeline appears at the end of the [Building Blocks](#building-blocks) section below.

## Building Blocks

### Convolutional Layer

`Convolutional Layers` are `layers` that ***reduce the computational load*** by creating `activation maps` ***computed from a `subset` of all the available `data`***

#### Local Connectivity

To achieve this, we introduce the concept of `local connectivity`: ***each `output` is linked to a `volume` smaller than the original one in `width` and `height`*** (along the `depth`, it is always fully connected)

#### Filters

These are the ***work-horses*** of the whole `layer`. A `filter` is a ***small window of `weights`*** that produces the `outputs`.

We have a ***number of `filters` equal to the `depth` of the `output`***. This means that ***every `output-value` at the same `depth` has been generated by the same `filter`***, and as such, ***a `volume` shares `weights` across each single `depth` slice***.

All `filters` share the same `height` and `width` and have a `depth` equal to the `input`'s; their `output` is usually called an `activation-map`.

> [!NOTE]
> Usually what the first `activation-maps` *learn* are
> oriented edges, opposing colors, etc...

Another parameter for `filters` is the `stride`, which is the number of "hops" made between one convolution and the next. The formula to determine the `output` size for any side is:

$$
out_{side\_len} =
\frac{
    in_{side\_len} - filter_{side\_len}
}{
    stride
} + 1
$$

Whenever the `stride` makes $out_{side\_len}$ ***not an integer value, we add $0$ `padding`*** to correct this: with a `padding` of size $P$ per border, the numerator becomes $in_{side\_len} - filter_{side\_len} + 2P$ (a small helper that checks this arithmetic appears at the end of this section).

> [!NOTE]
>
> To avoid downsizing, it is not uncommon to apply a
> $0$ padding of size 1 (per dimension) before applying
> a `filter` with `stride` equal to 1
>
> However, for a ***fast downsizing*** we can increase
> the `stride`

> [!CAUTION]
> Don't shrink too fast, it doesn't bring good results

### Pooling Layer[^pooling-layer-wikipedia]

It ***downsamples the image without resorting to `learnable-parameters`***

There are many `algorithms` for this `layer`, such as:

#### Max Pooling

Takes the max element in the `window`

#### Average Pooling

Takes the average of the elements in the `window`

#### Mixed Pooling

A linear combination of [Max Pooling](#max-pooling) and [Average Pooling](#average-pooling)

> [!NOTE]
> This list is **NOT EXHAUSTIVE**, please refer to
> [this article](https://en.wikipedia.org/wiki/Pooling_layer)
> to know more.

This `layer` ***introduces space invariance***
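To sanity-check the `output`-size formula from the [Filters](#filters) section, here is a small helper (a plain Python sketch; the `227/11/4` example is AlexNet's well-known first layer, not something from the source material):

```python
def conv_out_size(in_len: int, filter_len: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a convolution: (in - filter + 2*padding) / stride + 1."""
    span, rem = divmod(in_len - filter_len + 2 * padding, stride)
    # A non-zero remainder means the filter doesn't fit evenly:
    # fix it with more zero padding (or a different stride).
    assert rem == 0, "non-integer output size, adjust padding/stride"
    return span + 1

print(conv_out_size(32, 3, stride=1, padding=1))    # 32 -> the (3, 1, 1) setting preserves size
print(conv_out_size(227, 11, stride=4, padding=0))  # 55 -> AlexNet's first layer
```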
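And here, the promised minimal sketch of the [ConvNet](#convnet) pipeline, assuming PyTorch, a CIFAR-10-like `32x32` `RGB` `input`, and 10 classes (the layer sizes are illustrative, not prescribed by the source):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1),  # 2. convolutional layer: depth 3 -> 16, 32x32 preserved
    nn.ReLU(),                                             # 3. ReLU
    nn.MaxPool2d(kernel_size=2),                           # 4. pooling layer: 32x32 -> 16x16
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),                           # 5. FCnn head -> 6. class scores
)

x = torch.randn(1, 3, 32, 32)  # 1. input: one fake RGB picture
print(model(x).shape)          # torch.Size([1, 10])
```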
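Finally, a preview of the first tip below: a `1x1` `filter` leaves `width` and `height` untouched and changes only the `depth` of the `volume` (again a PyTorch sketch; the channel counts are made up):

```python
import torch
import torch.nn as nn

reduce_depth = nn.Conv2d(in_channels=256, out_channels=64, kernel_size=1)  # the (1, 1, 0) setting
volume = torch.randn(1, 256, 28, 28)
print(reduce_depth(volume).shape)  # torch.Size([1, 64, 28, 28]): depth 256 -> 64, 28x28 untouched
```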
## Tips[^anelli-convolutional-networks-2]

- `1x1` `filters` make sense.
  ***They allow us to reduce the `depth` of the next `volume`*** (see the sketch just above)
- ***The trend goes towards increasing the `depth` and having smaller `filters`***
- ***The trend is to remove [`pooling-layers`](#pooling-layer) and use only [`convolutional-layers`](#convolutional-layer)***
- ***Common settings for [`convolutional-layers`](#convolutional-layer) are:***
  - number of filters: $K = 2^{a}$ [^anelli-convolutional-networks-3]
  - tuples of `filter-size` $F$, `stride` $S$, `0-padding` $P$:
    - (3, 1, 1)
    - (5, 1, 2)
    - (5, 2, *whatever fits*)
    - (1, 1, 0)
- See ResNet/GoogLeNet

[^anelli-convolutional-networks]: Vito Walter Anelli | Deep Learning Material 2024/2025 | PDF 7
[^anelli-convolutional-networks-1]: Vito Walter Anelli | Deep Learning Material 2024/2025 | PDF 7 pg. 2
[^pooling-layer-wikipedia]: [Pooling Layer | Wikipedia | 22nd April 2025](https://en.wikipedia.org/wiki/Pooling_layer)
[^anelli-convolutional-networks-2]: Vito Walter Anelli | Deep Learning Material 2024/2025 | PDF 7 pg. 85
[^anelli-convolutional-networks-3]: Vito Walter Anelli | Deep Learning Material 2024/2025 | PDF 7 pg. 70