
Graph ML

Graph Introduction

  • Nodes: Pieces of Information
  • Edges: Relationship between nodes
    • Mutual
    • One-Sided
  • Directionality
    • Directed: We care about the order of connections
      • Unidirectional
      • Bidirectional
    • Undirected: We don't care about order of connections

Now, we can have attributes over

  • nodes
  • edges
  • master nodes (a collection of nodes and edges)

For example, an image can be represented as a graph in which every pixel is a vertex, and each pixel that is not on the image border is connected to its 8 neighboring pixels. The information stored at each vertex is a 3- (or 4-) dimensional vector (think of RGB or RGBA).

Adjacency Matrix

Take an image and build a matrix in \{0, 1\}^{(h \cdot w) \times (h \cdot w)}, where h and w are the image height and width in pixels. An entry is 1 if the two corresponding nodes are connected (share an edge) and 0 if they are not.
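As a minimal sketch of the construction above, the following builds this matrix for a tiny image. The function name `grid_adjacency` and the row-major numbering of pixels are assumptions made for illustration, not something the notes prescribe.

```python
def grid_adjacency(h: int, w: int) -> list[list[int]]:
    """Adjacency matrix of an h x w image with 8-neighbor connectivity."""
    n = h * w
    adj = [[0] * n for _ in range(n)]
    for r in range(h):
        for c in range(w):
            i = r * w + c  # assumed row-major index of pixel (r, c)
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a pixel is not its own neighbor
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        adj[i][rr * w + cc] = 1
    return adj


adj = grid_adjacency(3, 3)
center = 1 * 3 + 1
print(sum(adj[center]))  # 8: the only non-border pixel of a 3 x 3 image
print(sum(adj[0]))       # 3: a corner pixel has fewer neighbors
```

Border pixels end up with fewer than 8 neighbors, which is why the 8-neighbor statement only holds for interior pixels.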

Note

For a 300 \times 250 image, the matrix would be in \{0, 1\}^{(250 \cdot 300) \times (250 \cdot 300)}, that is 75,000 \times 75,000, or about 5.6 billion entries.

Entries are set to 1 or 0 according to these rules:

  • The row element has a connection towards the column element
  • The column element has a connection coming from the row element
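The row-to-column convention can be sketched for a small directed graph as follows. The helper `directed_adjacency` and the toy edge list are hypothetical and only serve to show which index is the source and which is the target.

```python
def directed_adjacency(n: int, edges: list[tuple[int, int]]) -> list[list[int]]:
    """adj[src][dst] = 1 when there is an edge going from src to dst."""
    adj = [[0] * n for _ in range(n)]
    for src, dst in edges:
        adj[src][dst] = 1  # row = source node, column = target node
    return adj


# Toy directed graph (illustrative data only)
edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
adj = directed_adjacency(3, edges)

# Summing a row counts outgoing edges; summing a column counts incoming ones
out_degree = [sum(row) for row in adj]
in_degree = [sum(adj[r][c] for r in range(3)) for c in range(3)]
print(out_degree)  # [2, 1, 1]
print(in_degree)   # [1, 1, 2]
```

For an undirected graph both `adj[u][v]` and `adj[v][u]` would be set, which makes the matrix symmetric.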

Tasks

Graph-Level

We want to predict a graph property

Node-Level

We want to predict a property of each node, such as its class

Edge-Level

We want to predict relationships between nodes, such as whether they share an edge, or the value of the edge they share.

For this task we may start with a fully connected graph and then prune edges as predictions are made, ending up with a sparse graph.
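A minimal sketch of this pruning idea, assuming an undirected graph and a per-edge score in [0, 1]. The `toy_scores` table is a hypothetical stand-in for the output of a trained model, and the 0.5 threshold is an arbitrary choice.

```python
from itertools import combinations
from typing import Callable


def prune_edges(
    num_nodes: int,
    score: Callable[[int, int], float],
    threshold: float = 0.5,
) -> list[tuple[int, int]]:
    """Start from every possible node pair and keep only the well-scored ones."""
    candidates = combinations(range(num_nodes), 2)  # fully connected graph
    return [(u, v) for u, v in candidates if score(u, v) >= threshold]


# Hypothetical scores standing in for model predictions
toy_scores = {
    (0, 1): 0.9, (0, 2): 0.2, (0, 3): 0.1,
    (1, 2): 0.7, (1, 3): 0.4, (2, 3): 0.8,
}
kept = prune_edges(4, lambda u, v: toy_scores[(u, v)])
print(kept)  # [(0, 1), (1, 2), (2, 3)]
```

Three of the six candidate edges survive, so the result is already sparser than the fully connected starting point.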

Downsides of Graphs

  • They are not consistent in their structure and sometimes representing something as a graph is difficult
  • If we don't care about the order of the nodes, we need to find a way to represent this node-order equivariance, because the same graph can be written with its nodes in many different orders
  • Graphs may be too large
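The node-order issue can be illustrated with a small sketch: relabeling the nodes of the same graph gives a different adjacency matrix, while a per-node quantity such as the degree simply moves along with the relabeling. The helpers `adjacency` and `relabel` and the 3-node path graph are assumptions chosen for illustration.

```python
def adjacency(n: int, edges: list[tuple[int, int]]) -> list[list[int]]:
    """Symmetric adjacency matrix of an undirected graph."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj


def relabel(edges: list[tuple[int, int]], perm: list[int]) -> list[tuple[int, int]]:
    """Rename node i to perm[i] in every edge."""
    return [(perm[u], perm[v]) for u, v in edges]


edges = [(0, 1), (1, 2)]  # path graph 0 - 1 - 2
perm = [1, 0, 2]          # old index -> new index

a = adjacency(3, edges)
b = adjacency(3, relabel(edges, perm))
deg_a = [sum(row) for row in a]
deg_b = [sum(row) for row in b]

print(a == b)                                         # False: the matrices differ
print(all(deg_b[perm[i]] == deg_a[i] for i in range(3)))  # True: degrees follow the relabeling
print(sorted(deg_a) == sorted(deg_b))                 # True: the multiset of degrees is unchanged
```

A model that consumes the raw matrix would see `a` and `b` as different inputs even though they describe the same graph.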

Representing Graphs

Adjacency List

We store info about:

  • Nodes: list of values. The entry at index k is the value of node k
  • Edges: list of values. The entry at index k is the value of edge k
  • Adjacency list: list of tuples of node indices. The tuple at index k gives the nodes involved in the k-th edge
  • Graph: the value of the graph as a whole

from typing import Any

# Node values: nodes[k] is the value of node k
nodes: list[Any] = [
    "fork", "spaghetti", "knife", "spoon", "broth"
]

# Edge values: edges[k] is the value of edge k
edges: list[Any] = [
    "used for eating", "tool", "food",
    "tool", "tool", "used for eating"
]

# adj_list[k] holds the indices of the two nodes joined by edge k
adj_list: list[tuple[int, int]] = [
    (0, 1), (0, 2), (1, 4),
    (0, 3), (2, 3), (3, 4)
]

# Value attached to the graph as a whole
graph: Any = "table"
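To show how this layout might be used, here is a sketch that turns the index pairs from the example above into a per-node neighbor lookup. Treating every edge as undirected and the helper name `build_neighbors` are assumptions; the notes do not say whether these edges are directed.

```python
from collections import defaultdict

# Same index pairs as in the example above
adj_list: list[tuple[int, int]] = [
    (0, 1), (0, 2), (1, 4),
    (0, 3), (2, 3), (3, 4),
]


def build_neighbors(
    adj_list: list[tuple[int, int]],
) -> dict[int, list[tuple[int, int]]]:
    """Map each node to (neighbor, edge index) pairs, assuming undirected edges."""
    neighbors: dict[int, list[tuple[int, int]]] = defaultdict(list)
    for edge_idx, (u, v) in enumerate(adj_list):
        neighbors[u].append((v, edge_idx))
        neighbors[v].append((u, edge_idx))
    return neighbors


neighbors = build_neighbors(adj_list)
print(neighbors[3])  # [(0, 3), (2, 4), (4, 5)]
```

Keeping the edge index next to each neighbor makes it possible to look up the matching edge value in the edges list.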

Because only existing edges are listed, pairs of nodes that are not connected need neither storage nor computation, unlike in an adjacency matrix.
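The saving can be made concrete by counting stored cells for the 300 \times 250 image mentioned earlier, assuming 8-neighbor connectivity and two integers per stored edge. The helpers `count_grid_edges` and `storage_cells` are hypothetical names introduced for this sketch.

```python
def count_grid_edges(h: int, w: int) -> int:
    """Number of undirected edges in an h x w pixel grid with 8-neighbor links."""
    horizontal = h * (w - 1)
    vertical = (h - 1) * w
    diagonal = 2 * (h - 1) * (w - 1)
    return horizontal + vertical + diagonal


def storage_cells(num_nodes: int, num_edges: int) -> tuple[int, int]:
    """Cells needed by a dense adjacency matrix vs. a list of index pairs."""
    return num_nodes * num_nodes, 2 * num_edges


h, w = 250, 300
matrix_cells, list_cells = storage_cells(h * w, count_grid_edges(h, w))
print(matrix_cells)  # 5625000000
print(list_cells)    # 596704
```

Under these assumptions the list of index pairs needs roughly four orders of magnitude fewer cells than the dense matrix.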

Graph Neural Networks (GNNs)