# Learning Representations
## Linear Classifiers and Limitations
A `Linear Classifier` can be defined as a **hyperplane** with the following expression:
$$
\sum_{i=1}^{N} w_{i}x_{i} + b = 0
$$
<!-- TODO: Add images -->
So, the output is $o = \operatorname{sign} \left( \sum_{i=1}^{N} w_{i}x_{i} + b \right)$
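A minimal NumPy sketch of this decision rule (the weights `w` and bias `b` below are illustrative placeholders, not learned values):

```python
import numpy as np

def linear_classifier(x, w, b):
    """Output sign(sum_i w_i * x_i + b): which side of the hyperplane x is on."""
    return np.sign(np.dot(w, x) + b)

# Illustrative 2-D example: the hyperplane x1 + x2 - 1 = 0
w, b = np.array([1.0, 1.0]), -1.0
print(linear_classifier(np.array([2.0, 2.0]), w, b))  # ->  1.0
print(linear_classifier(np.array([0.0, 0.0]), w, b))  # -> -1.0
```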
### Limitations
The question is: how likely is it that $P$ `points` in $N$ `dimensions` can be divided linearly?
> *The probability that a dichotomy over $P$ points in $N$ dimensions is* ***linearly separable***
> *goes to zero as $P$ grows larger than $N$.* [Cover's theorem, 1965]

So, once the number of features $d$ is fixed, the number of dimensions needed to keep points
separable becomes **exponentially higher**, on the order of $N^{d}$.
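This can be checked numerically with Cover's counting function, $C(P, N) = 2\sum_{k=0}^{N-1}\binom{P-1}{k}$, which counts the linearly separable dichotomies of $P$ points in general position (a sketch):

```python
from math import comb

def separable_fraction(P, N):
    """Fraction of all 2**P dichotomies of P points (in general position)
    in N dimensions that are linearly separable (Cover, 1965)."""
    return 2 * sum(comb(P - 1, k) for k in range(N)) / 2**P

for P in (10, 20, 40, 80):
    print(P, separable_fraction(P, N=10))
# -> 1.0, 0.5, ~5e-4, ~4e-13: once P outgrows N, separability collapses
```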
## Deep vs Shallow Networks
While it is **theoretically possible** to use only `shallow-networks` as **universal predictors**,
this is **not practically feasible**, as a shallow network may require far more **hardware** (units).
Instead, `deep-networks` trade `time` for `space`: they take longer (more layers run in sequence),
but require **less hardware**.
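A loose arithmetic analogy for this trade-off (not the neural-network result itself): computing $x^{2^k}$ "deep", by $k$ sequential squarings that reuse intermediate results, versus "shallow", as one flat product of $2^k$ factors:

```python
def power_deep(x, k):
    """'Deep': k sequential squarings, each reusing the previous result."""
    for _ in range(k):
        x = x * x                  # one multiplication per "layer"
    return x

def power_shallow(x, k):
    """'Shallow': one flat product, 2**k - 1 multiplications."""
    result = x
    for _ in range(2 ** k - 1):
        result = result * x
    return result

# Same function, but 5 multiplications instead of 31 for k = 5
assert power_deep(3, 5) == power_shallow(3, 5) == 3 ** 32
```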
## Invariant Feature Learning
When we need to learn something whose appearance may vary (**backgrounds**, **colors**, **shapes**, etc.),
it is useful to follow these steps (see the sketch after the list):
1. Embed data into **high-dimensional** spaces
2. Bring **closer** data that are **similar** and **reduce dimensions**
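A minimal sketch of those two steps with fixed random weights standing in for learned ones (in a real system both the projection `W` and the pooling would be trained so that similar inputs end up close together):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 16, 256, 8

# Step 1: embed into a high-dimensional space (a random projection + ReLU
# here; a trained network would learn W instead)
W = rng.normal(size=(d_hidden, d_in))
def embed(x):
    return np.maximum(0.0, W @ x)

# Step 2: reduce dimensions by pooling groups of units together, which
# aggregates related responses into a single coarser feature
def pool(h):
    return h.reshape(d_out, -1).max(axis=1)

x = rng.normal(size=d_in)
print(embed(x).shape, pool(embed(x)).shape)  # (256,) (8,)
```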
## Sparse Non-Linear Expansion
In this case, we first break our data apart into a **sparse**, high-dimensional code, and then we aggregate the pieces back together.
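A sketch of the idea, under the assumption that "breaking apart" means an over-complete thresholded projection and "aggregating" means summing groups of the resulting sparse code:

```python
import numpy as np

rng = np.random.default_rng(1)

# Break apart: expand 8 features onto 64 random directions, keeping only
# strong positive responses -- most units end up exactly zero (sparse)
x = rng.normal(size=8)
D = rng.normal(size=(64, 8))
code = np.maximum(0.0, D @ x - 2.0)

print(f"{np.mean(code > 0):.0%} of units active")

# Aggregate: sum the sparse pieces back together, group by group
aggregated = code.reshape(8, 8).sum(axis=1)
print(aggregated.shape)  # (8,)
```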
## Manifold Hypothesis
We can assume that whatever we are trying to represent **doesn't need that many features**, but rather
**is a point in a latent space with far fewer dimensions**.
Thus, our goal is to find this **low-dimensional latent-space** and **disentangle** its features, so that each
`direction` of our **latent-space** corresponds to one `feature`. Essentially, what we want to achieve with `Deep-Learning` is
a **system** capable of *learning* this **latent-space** on ***its own***.
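A toy, linear stand-in for this idea (the real hypothesis concerns non-linear manifolds): data generated from a 3-dimensional latent space but observed in 50 dimensions, where the singular values reveal the true intrinsic dimensionality:

```python
import numpy as np

rng = np.random.default_rng(2)

# 1000 points that truly live in a 3-D latent space...
z = rng.normal(size=(1000, 3))
# ...observed through a (linear, for simplicity) embedding into 50-D
A = rng.normal(size=(3, 50))
X = z @ A

# Singular values of the centered data: only 3 directions carry
# variance, the other 47 are numerically zero
s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
print(np.round(s[:6], 1))  # 3 large values, then ~0
```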