# Graph ML

## Graph Introduction

- **Nodes**: Pieces of Information
- **Edges**: Relationship between nodes
    - **Mutual**
    - **One-Sided**
- **Directionality**
    - **Directed**: We care about the order of connections
        - **Unidirectional**
        - **Bidirectional**
    - **Undirected**: We don't care about order of connections

We can attach attributes to:

- **nodes**
- **edges**
- **master nodes** (a collection of nodes and edges)

For example, an image can be represented as a graph in which every pixel is a vertex and each non-border pixel is
connected to its 8 neighbors. The information stored at each vertex is a 3- or 4-dimensional vector (RGB or RGBA).
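
The pixel-to-graph mapping above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name `image_to_graph`, the row-by-row node numbering, and the choice to store each undirected edge once are all assumptions made for the example.

```python
def image_to_graph(image):
    """Turn an H x W image (nested lists of channel tuples) into node features and edges."""
    height, width = len(image), len(image[0])
    # One node per pixel, numbered row by row; its feature is the channel vector.
    node_features = [image[r][c] for r in range(height) for c in range(width)]
    edges = set()
    for r in range(height):
        for c in range(width):
            # Visit the (up to) 8 surrounding pixels.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr, dc) != (0, 0) and 0 <= nr < height and 0 <= nc < width:
                        a, b = r * width + c, nr * width + nc
                        edges.add((min(a, b), max(a, b)))  # undirected: store once
    return node_features, sorted(edges)


# 3 x 3 RGB image: only the centre pixel (node 4) is a non-border pixel.
image = [[(r * 10, c * 10, 0) for c in range(3)] for r in range(3)]
node_features, edges = image_to_graph(image)
centre_degree = sum(1 for a, b in edges if 4 in (a, b))
print(len(node_features), len(edges), centre_degree)
```

Border pixels end up with fewer than 8 neighbors (3 for a corner, 5 along a side), which is why the text restricts the "8 neighbors" claim to non-border pixels.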

### Adjacency Matrix

Take an image of height $h$ and width $w$ and build a matrix in $\{0, 1\}^{(h \cdot w) \times (h \cdot w)}$, with one
row and one column per pixel. An entry is 1 if the two corresponding nodes are connected (share an edge) and 0 if
they are not.

> [!NOTE]
> For a $300 \times 250$ image our matrix would be $\{0, 1\}^{(250 \cdot 300) \times (250 \cdot 300)}$

Where each 1 goes follows this convention:

- The **row element** has a connection **towards** the **column element**
- The **column element** has a connection **coming from** the **row element**
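
A small sketch of this row-to-column convention, using plain nested lists. The helper name `adjacency_matrix` and the three-node example graph are made up for illustration.

```python
def adjacency_matrix(num_nodes, directed_edges):
    """Build a 0/1 matrix where matrix[i][j] == 1 means an edge from node i to node j."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for source, target in directed_edges:
        matrix[source][target] = 1  # row = source, column = target
    return matrix


# Hypothetical directed graph: 0 -> 1, 1 -> 2, 2 -> 1
A = adjacency_matrix(3, [(0, 1), (1, 2), (2, 1)])

# Reading a row gives outgoing connections, reading a column gives incoming ones.
out_neighbors_of_1 = [j for j, value in enumerate(A[1]) if value]
in_neighbors_of_1 = [i for i, row in enumerate(A) if row[1]]
print(A, out_neighbors_of_1, in_neighbors_of_1)
```

For an undirected graph the same construction would set both `matrix[i][j]` and `matrix[j][i]`, giving a symmetric matrix.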

### Tasks

#### Graph-Level

We want to predict a graph property

#### Node-Level

We want to predict a property of each node, for example its class.

#### Edge-Level

We want to predict relationships between nodes, such as whether they share an edge or the value of the edge they
share.

For this task we may start with a fully connected graph and prune edges as predictions are made, ending up with a
sparse graph.
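
The prune-from-fully-connected idea can be sketched like this. The scoring function, the `0.5` threshold, and the toy scores are all hypothetical; in practice the scores would come from a trained model.

```python
from itertools import combinations


def prune_edges(num_nodes, score_fn, threshold=0.5):
    """Start from every possible node pair and keep only pairs scored at or above threshold."""
    candidates = combinations(range(num_nodes), 2)  # fully connected, undirected
    return [(i, j) for i, j in candidates if score_fn(i, j) >= threshold]


# Hypothetical edge scores standing in for model predictions.
toy_scores = {
    (0, 1): 0.9, (0, 2): 0.2, (0, 3): 0.1,
    (1, 2): 0.7, (1, 3): 0.3, (2, 3): 0.8,
}
kept = prune_edges(4, lambda i, j: toy_scores[(i, j)])
print(kept)
```

Of the 6 candidate pairs, only the 3 scored at or above the threshold survive, leaving a sparse graph.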

### Downsides of Graphs

- Their structure is not consistent, and representing something as a graph is sometimes difficult
- If we don't care about the order of nodes, the model must respect that: its result should not depend on how the
  nodes happen to be numbered (**node-order equivariance**)
- Graphs may be too large to store or process in full
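
To make the node-order point concrete, the sketch below renumbers the nodes of a small graph. The adjacency matrix changes, yet a per-node quantity such as the degree simply moves along with the renumbering. The example graph, the permutation, and the helper names are assumptions chosen for illustration.

```python
def adjacency_matrix(num_nodes, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for a, b in edges:
        matrix[a][b] = matrix[b][a] = 1
    return matrix


def degrees(matrix):
    """Number of neighbors of each node (row sums)."""
    return [sum(row) for row in matrix]


edges = [(0, 1), (1, 2)]  # path graph 0 - 1 - 2
perm = [2, 0, 1]          # old node i is renamed perm[i]
relabelled = [(perm[a], perm[b]) for a, b in edges]

A = adjacency_matrix(3, edges)
B = adjacency_matrix(3, relabelled)
deg_A, deg_B = degrees(A), degrees(B)

# Same graph, different matrices; degrees are permuted, not changed.
equivariant = all(deg_B[perm[i]] == deg_A[i] for i in range(3))
print(A != B, deg_A, deg_B, equivariant)
```

A graph-level summary such as the total degree (`sum(deg_A)`) is the same under any renumbering, which is the invariant counterpart of the per-node behaviour shown here.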

## Representing Graphs

### Adjacency List

We store information about:

- **Nodes**: a list of values; the entry at index $k$ is the value of node $k$
- **Edges**: a list of values; the entry at index $k$ is the value of edge $k$
- **Adjacency list**: a list of tuples of node indices; the tuple at index $k$
    holds the nodes joined by the $k$-th edge
- **Graph**: the value of the graph as a whole

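```python
from typing import Any

# Value of each node; the list index is the node id.
nodes: list[Any] = [
    "fork", "spaghetti", "knife", "spoon", "broth"
]

# Value of each edge; edges[k] describes the pair adj_list[k].
edges: list[Any] = [
    "used for eating", "utensil", "food",
    "utensil", "utensil", "used for eating"
]

# adj_list[k] holds the indices of the two nodes joined by edge k.
adj_list: list[tuple[int, int]] = [
    (0, 1), (0, 2), (1, 4),
    (0, 3), (2, 3), (3, 4)
]

# Value attached to the graph as a whole.
graph: Any = "table"
```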

If some parts of the graph are disconnected, we can simply skip storing and computing over those parts.
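
As a rough sketch of how this representation might be used, the helper below turns the list of index pairs into a per-node neighbor lookup. The name `build_neighbors` is hypothetical, and treating every pair as an undirected edge is an assumption; the pairs are the same ones used in the example above.

```python
def build_neighbors(num_nodes, adj_list):
    """Map each node index to a list of (neighbor index, edge index) pairs."""
    neighbors = {node: [] for node in range(num_nodes)}
    for edge_index, (a, b) in enumerate(adj_list):
        # Assumption: edges are undirected, so register both directions.
        neighbors[a].append((b, edge_index))
        neighbors[b].append((a, edge_index))
    return neighbors


adj_list = [(0, 1), (0, 2), (1, 4), (0, 3), (2, 3), (3, 4)]
neighbors = build_neighbors(5, adj_list)

stored_pairs = len(adj_list)  # one tuple per existing edge
matrix_entries = 5 * 5        # an adjacency matrix stores every node pair
print(neighbors[0], stored_pairs, matrix_entries)
```

Only the edges that exist are stored (6 tuples here, against 25 entries for a 5-node adjacency matrix), and the edge index kept with each neighbor lets us look up the matching edge value.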

## Graph Neural Networks (GNNs)