# Graph ML

## Graph Introduction

- **Nodes**: Pieces of information
- **Edges**: Relationships between nodes
  - **Mutual**
  - **One-Sided**
- **Directionality**
  - **Directed**: We care about the order of connections
    - **Unidirectional**
    - **Bidirectional**
  - **Undirected**: We don't care about the order of connections

We can attach attributes to:

- **nodes**
- **edges**
- **master nodes** (a collection of nodes and edges)

For example, an image may be represented as a graph where every pixel is a vertex and each non-border pixel is connected to its 8 neighbouring pixels. The information stored at each vertex is a 3 (or 4) dimensional vector (think of RGB and RGBA).

### Adjacency Matrix

Take a picture of height $h$ and width $w$ and build a matrix $A \in \{0, 1\}^{(h \cdot w) \times (h \cdot w)}$, where we put a 1 if two nodes are connected (share an edge) and a 0 if they are not.

> [!NOTE]
> For a $300 \times 250$ image our matrix would be $\{0, 1\}^{(250 \cdot 300) \times (250 \cdot 300)}$, that is, 75,000 rows and 75,000 columns

Whether an entry is 1 or 0 follows these rules:

- The **row element** has a connection **towards** the **column element**
- The **column element** has a connection **coming from** the **row element**

### Tasks

#### Graph-Level

We want to predict a property of the whole graph

#### Node-Level

We want to predict a property of a node, such as its class

#### Edge-Level

We want to predict relationships between nodes, such as whether they share an edge or the value of the edge they share.

For this task we may start with a fully connected graph and then prune edges as predictions go on, ending up with a sparse graph

### Downsides of Graphs

- Their structure is not consistent, and sometimes representing something as a graph is difficult
- If we don't care about the order of nodes, we need a representation that respects this **node-order equivariance**
- Graphs may be too large

## Representing Graphs

### Adjacency List

We store info about:

- **Nodes**: list of values. The entry at index $k$, $Node_k$, is the value of the $k$-th node
- **Edges**: list of values. The entry at index $k$, $Edge_k$, is the value of the $k$-th edge
- **Adjacency list**: list of tuples of node indices. The entry at index $k$, $Tuple_k$, holds the nodes involved in the $k$-th edge
- **Graph**: value of the whole graph

```python
from typing import Any

# Value of each node; the list index is the node id
nodes: list[Any] = [
    "fork", "spaghetti", "knife", "spoon", "broth"
]

# Value of each edge; edges[k] describes the pair adj_list[k]
edges: list[Any] = [
    "used to eat", "tool", "food", "tool", "tool", "used to eat"
]

# Pairs of node indices; adj_list[k] holds the endpoints of the k-th edge
adj_list: list[tuple[int, int]] = [
    (0, 1), (0, 2), (1, 4), (0, 3), (2, 3), (3, 4)
]

# Value attached to the graph as a whole
graph: Any = "table"
```

If some parts of the graph are disconnected, we can simply avoid storing and computing those parts

## Graph Neural Networks (GNNs)
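
Before building on these representations, it can help to see that the adjacency list and the adjacency matrix encode the same connectivity. The following is a minimal sketch, not a definitive implementation: the helper name `adj_list_to_matrix` and its `directed` flag are hypothetical, and it reuses the edge tuples from the example above. It applies the rule that the row element points towards the column element, and mirrors each entry when the graph is treated as undirected.

```python
def adj_list_to_matrix(
    num_nodes: int,
    adj_list: list[tuple[int, int]],
    directed: bool = False,
) -> list[list[int]]:
    """Expand a list of (row, column) index pairs into a dense 0/1 matrix."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for src, dst in adj_list:
        # Row element has a connection towards the column element
        matrix[src][dst] = 1
        if not directed:
            # Undirected: the connection also holds in the other direction
            matrix[dst][src] = 1
    return matrix


# Edge tuples taken from the adjacency list example (5 nodes, 6 edges)
adj_list = [(0, 1), (0, 2), (1, 4), (0, 3), (2, 3), (3, 4)]

undirected_matrix = adj_list_to_matrix(5, adj_list)
directed_matrix = adj_list_to_matrix(5, adj_list, directed=True)

for row in undirected_matrix:
    print(row)
```

The dense matrix always holds `num_nodes * num_nodes` entries, while the adjacency list holds one tuple per edge, which is why the list form tends to be preferred when graphs are large and sparse.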