From 0e9e33a28195f8d651ef223334c2627e53f39931 Mon Sep 17 00:00:00 2001
From: chris-admin
Date: Thu, 11 Sep 2025 10:02:02 +0200
Subject: [PATCH] Added Introduction about GNNs

---
 Chapters/14-GNN-GCN/INDEX.md | 91 ++++++++++++++++++++++++++++++++++++
 1 file changed, 91 insertions(+)

diff --git a/Chapters/14-GNN-GCN/INDEX.md b/Chapters/14-GNN-GCN/INDEX.md
index e69de29..c605c9e 100644
--- a/Chapters/14-GNN-GCN/INDEX.md
+++ b/Chapters/14-GNN-GCN/INDEX.md
@@ -0,0 +1,91 @@
+# Graph ML
+
+## Graph Introduction
+
+- **Nodes**: Pieces of information
+- **Edges**: Relationships between nodes
+    - **Mutual**
+    - **One-Sided**
+- **Directionality**
+    - **Directed**: We care about the direction of each connection
+        - **Unidirectional**
+        - **Bidirectional**
+    - **Undirected**: We don't care about the direction of connections
+
+We can attach attributes to:
+
+- **nodes**
+- **edges**
+- **master nodes** (a collection of nodes and edges)
+
+For example, an image may be represented as a graph where each non-border pixel is a vertex connected to its 8 neighbouring pixels.
+The information stored at each vertex is a 3- (or 4-) dimensional vector (think of RGB and RGBA).
+
+### Adjacency Matrix
+
+Take a picture of height $h$ and width $w$ and build a matrix in $\{0, 1\}^{(h \cdot w) \times (h \cdot w)}$, with one row and one column per pixel. An entry is 1 if the two
+nodes are connected (share an edge), or 0 if they are not.
+
+> [!NOTE]
+> For a $300 \times 250$ image our matrix would be in $\{0, 1\}^{(250 \cdot 300) \times (250 \cdot 300)}$, i.e. $75000 \times 75000$
+
+Entries are set to 1 or 0 according to these rules:
+ - **Row element** has a connection **towards** **Column element**
+ - **Column element** has a connection **coming** from **Row element**
+
+### Tasks
+
+#### Graph-Level
+
+We want to predict a property of the whole graph.
+
+#### Node-Level
+
+We want to predict a property of each node, such as its class.
+
+#### Edge-Level
+
+We want to predict relationships between nodes, such as whether they share an edge or the value of the edge they share.
+
+For this task we may start with a fully connected graph and then prune edges as predictions are made, arriving at a sparse graph.
+
+### Downsides of Graphs
+
+- Their structure is not consistent, and sometimes representing something as a graph is difficult
+- If we don't care about the order of nodes, we need to find a way to represent this **node-order equivariance**
+- Graphs may be too large to store or process (an adjacency matrix grows quadratically with the number of nodes)
+
+## Representing Graphs
+
+### Adjacency List
+
+We store info about:
+
+- **Nodes**: list of values; the entry at index $k$ is the value of node $k$
+- **Edges**: list of values; the entry at index $k$ is the value of edge $k$
+- **Adjacency list**: list of tuples of node indices; the tuple at index $k$ gives the nodes joined by the $k$-th edge
+- **Graph**: value of the whole graph
+
+```python
+from typing import Any
+
+nodes: list[Any] = [  # nodes[k] is the value of node k
+    "fork", "spaghetti", "knife", "spoon", "broth"
+]
+
+edges: list[Any] = [  # edges[k] is the value of edge k
+    "used for eating", "tool", "food",
+    "tool", "tool", "used for eating"
+]
+
+adj_list: list[tuple[int, int]] = [  # adj_list[k] = nodes joined by edge k
+    (0, 1), (0, 2), (1, 4),
+    (0, 3), (2, 3), (3, 4)
+]
+
+graph: Any = "table"  # value of the whole graph
+```
+
+If some parts of the graph are disconnected, we can simply skip storing and computing them.
+
+## Graph Neural Networks (GNNs)