From 1a3e33b5ed937ced86c2c3cd9a3232cf12a6b9d4 Mon Sep 17 00:00:00 2001
From: Christian Risi <75698846+CnF-Gris@users.noreply.github.com>
Date: Sun, 23 Mar 2025 20:00:05 +0100
Subject: [PATCH] Added all Uninformed search Algorithms

---
 Chapters/3-SOLVING-PROBLEMS-BY-SEARCHING.md | 745 ++++++++++++++++++++
 1 file changed, 745 insertions(+)

diff --git a/Chapters/3-SOLVING-PROBLEMS-BY-SEARCHING.md b/Chapters/3-SOLVING-PROBLEMS-BY-SEARCHING.md
index 8df0296..3473ce3 100644
--- a/Chapters/3-SOLVING-PROBLEMS-BY-SEARCHING.md
+++ b/Chapters/3-SOLVING-PROBLEMS-BY-SEARCHING.md
@@ -48,3 +48,748 @@ anything else.
This is an `agent` which has a `factored` or `structured` representation of states.

## Search Problem

A search problem is the union of the following:

- **State Space**
  The set of *possible* `states`.

  It can be represented as a `graph` where each `state` is a `node`
  and each `action` is an `edge`, leading from one `state` to another
- **Initial State**
  The `state` the `agent` starts in
- **Goal State(s)**
  The `state` where the `agent` will have reached its goal. There can be multiple
  `goal-states`
- **Available Actions**
  All the `actions` available to the `agent` in a given `state`:
  ```python
  def get_actions(state: State) -> set[Action]: ...
  ```
- **Transition Model**
  A `function` which returns the `next-state` reached by taking
  an `action` in the `current-state`:
  ```python
  def move_to_next_state(state: State, action: Action) -> State: ...
  ```
- **Action Cost Function**
  A `function` which denotes the cost of taking that
  `action` to reach a `new-state` from the `current-state`:
  ```python
  def action_cost(
      current_state: State,
      action: Action,
      new_state: State
  ) -> float: ...
  ```

A `sequence` of `actions` going from one `state` to another is called a `path`.
A `path` leading to the `goal` is called a `solution`.

The ***shortest*** `path` to the `goal` is called the `optimal-solution`, or,
in other words, it is the `path` with the ***lowest*** `cost`.

Obviously we always need a level of ***abstraction*** to get our `agent` to
perform at its best. For example, we don't need to express any detail
about the ***physics*** of the real world to go from *point-A* to *point-B*.

## Searching Algorithms

Most algorithms used to solve [Search Problems](#search-problem) rely
on a `tree` based representation, where the `root-node` is the `initial-state`
and each `child-node` is a `next-available-state` of its parent `node`.

Since the `data-structure` is a `search-tree`, each `node` has a ***unique***
`path` back to the `root`, as each `node` keeps a ***reference*** to its `parent-node`.

For each `action` we generate a `node`, and each `generated-node`, whether
***further explored*** or not, becomes part of the `frontier` (or `fringe`).

> [!TIP]
> Before going on to how to implement `search` algorithms,
> let's say that we'll use these `data-structures` for
> `frontiers`:
>
> - `priority-queue` when we need to evaluate for `lowest-costs` first
> - `FIFO` when we want to explore the `tree` ***horizontally*** first
> - `LIFO` when we want to explore the `tree` ***vertically*** first
>
> Then we need to take care of ***redundant-paths*** in some way:
>
> - Remember all previous `states` and only keep the best `paths` to these
>   `states`, ***best when the `problem` fits into memory***
> - Ignore the problem when it is ***rare*** or ***impossible*** to repeat `states`,
>   like in an ***assembly line*** in a factory
> - Check for repeated `states` along the `parent-chain`, up to the `root` or only the
>   first `n-links`. This allows us to ***save up on memory***
>
> If we check for `redundant-paths` we have a `graph-search`, otherwise a `tree-like-search`
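All the snippets in this chapter assume a `Problem` object exposing roughly this interface. The following is a minimal sketch of my own (not the book's code), with a tiny two-node `state-space` only to make it runnable; the method names match the calls used by the pseudocode below:

```python
    from dataclasses import dataclass, field


    @dataclass(frozen=True)
    class Problem:
        """Minimal search-problem description: states are plain strings here"""

        initial_state: str
        goal_states: frozenset[str]
        # graph[state][action] = (next_state, cost)
        graph: dict[str, dict[str, tuple[str, float]]] = field(default_factory=dict)

        def actions(self, state: str) -> set[str]:
            """Available actions in a state"""
            return set(self.graph.get(state, {}))

        def result(self, state: str, action: str) -> str:
            """Transition model"""
            return self.graph[state][action][0]

        def action_cost(self, state: str, action: str, new_state: str) -> float:
            """Cost of taking `action` in `state` to reach `new_state`"""
            return self.graph[state][action][1]

        def is_goal(self, state: str) -> bool:
            return state in self.goal_states


    # A toy problem: go from A to C, either directly or through B
    toy = Problem(
        initial_state="A",
        goal_states=frozenset({"C"}),
        graph={
            "A": {"to_B": ("B", 1.0), "to_C": ("C", 5.0)},
            "B": {"to_C": ("C", 1.0)},
        },
    )
```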
### Measuring Performance

We have 4 parameters:

- #### Completeness:
  Is the `algorithm` guaranteed to find a `solution` when there is one, and to report
  ***failure*** when there is none?

  This is easy for `finite` `state-spaces`, while we need a ***systematic***
  algorithm for `infinite` ones; even then, reporting
  ***no solution*** is impossible, as we cannot explore the ***whole `space`***.

- #### Cost Optimality:
  Does it find the `optimal-solution`?

- #### Time Complexity:
  How long it takes to find a `solution`, in `O()` terms

- #### Space Complexity:
  How much memory it needs, in `O()` terms, measured on the ***explicit*** `graph`
  (when there is one) or by means of:

  - the `depth` $d$ of `actions` of an `optimal-solution`
  - the `max-number-of-actions` $m$ in **any** `path`
  - the `branching-factor` $b$ of a node

### Uninformed Algorithms

These `algorithms` know **nothing** about the `space`

#### Breadth-First Search

```python

    def expand(
        problem: Problem,
        node: Node
    ):
        """
        Generator: yields a child node for every
        action available in the node's state
        """

        # Initialize variables
        state = node.state

        for action in problem.actions(state):
            new_state = problem.result(state, action)
            cost = node.path_cost + problem.action_cost(state, action, new_state)

            # See https://docs.python.org/3/reference/expressions.html#yield-expressions
            yield Node(
                state = new_state,
                parent = node,
                action = action,
                path_cost = cost
            )



    def breadth_first_search(
        problem: Problem
    ) -> Node | None:
        """
        Graph-Search

        Expands the shallowest nodes first,
        using a FIFO queue as frontier.
        """

        # Initialize variables
        root = Node(
            state = problem.initial_state,
            parent = None,
            action = None,
            path_cost = 0
        )

        # Check if root is goal
        if problem.is_goal(root.state):
            return root

        # This will change according to the algorithms
        frontier = FIFO_Queue()
        frontier.push(root)

        reached_states = set()
        reached_states.add(root.state)


        # Repeat until all states have been expanded
        while len(frontier) != 0:

            node = frontier.pop()

            # Get all reachable states
            for child in expand(problem, node):

                state = child.state

                # If state is goal, return the node
                # Early Goal Checking
                if problem.is_goal(state):
                    return child

                # Check if state is new and add it
                if state not in reached_states:
                    reached_states.add(state)
                    frontier.push(child)


        # We get here if we have no
        # more nodes to expand from
        return None

```

In this `algorithm` we use the ***depth*** of a `node` as the `cost` to
reach it.

In comparison to the [Best-First Search](#best-first-search), we have these
differences:

- A `FIFO Queue` instead of a `Priority Queue`:
  Since we expand on ***breadth***, a `FIFO` guarantees that
  nodes come out in order, as the `nodes` at the same `depth`
  are generated before those at `depth + 1`.

- An `early-goal test` instead of a `late-goal test`:
  We can check whether a `state` is the `goal-state` as soon as it is generated,
  since it is already reached at its ***minimum `cost`***

- `reached_states` is now a `set` instead of a `dict`:
  Since `depth + 1` has a ***higher `cost`*** than `depth`, we have
  already reached the ***minimum `cost`*** for a `state`
  the first time we reach it.

However the `space-complexity` and `time-complexity` are
***high***, both $O(b^d)$, where $b$ is the
`branching-factor` and $d$ is the `depth` of the shallowest `solution`[^breadth-first-performance]

This algorithm is:
- `complete`
- `cost-optimal` (as long as each `action` has the same `cost`)

> [!CAUTION]
> The `cost-optimality` claim only holds as long as each `edge` has a `uniform-cost`

#### Dijkstra's Algorithm | AKA Uniform-Cost Search[^dijkstra-algorithm]

This algorithm is basically [Best-First Search](#best-first-search) but with `path_cost()`
as the `cost_function`.

It works by `expanding` the `nodes` with the ***lowest*** `path-cost` first and
evaluating them for the `goal` only after `popping` them out of the `queue`
(a `late-goal test`), otherwise it could return a `non-optimal solution`.

Its ***performance*** depends on $C^{*}$, the cost of the `optimal-solution`, and on $\epsilon > 0$, a lower
bound on the `cost` of each `action`. The `worst-case` is
$O(b^{1 + \frac{C^*}{\epsilon}})$ for both `time` and `space-complexity`.

When all `actions` have the same `cost` $\epsilon$, this reduces to $O(b^{d + 1})$

This algorithm is:
- `cost-optimal`
- `complete`

> [!TIP]
> Notice that, at ***worst***, we may have to expand $\frac{C^*}{\epsilon}$ layers of nodes,
> since each action costs at least $\epsilon$ and $C^*$ is the `optimal-cost`, plus the
> ***last layer of expansions*** before realizing we got the `optimal-solution`
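Since only the ordering function changes, a minimal sketch of `uniform-cost search` is just a thin wrapper around the `best_first_search` defined in the [Best-First Search](#best-first-search) section below:

```python
    def uniform_cost_search(problem: Problem) -> Node | None:
        """
        Dijkstra / Uniform-Cost Search as a special case of
        best_first_search: order the frontier by path_cost
        """
        return best_first_search(
            problem,
            cost_function = lambda node: node.path_cost
        )
```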
#### Depth-First Search

```python

    # expand(problem, node) is the same generator
    # defined in the Breadth-First Search snippet


    def depth_first_search(
        problem: Problem
    ) -> Node | None:
        """
        Tree-like-Search

        Expands the deepest nodes first,
        using a LIFO queue as frontier.
        """

        # Initialize variables
        root = Node(
            state = problem.initial_state,
            parent = None,
            action = None,
            path_cost = 0
        )

        # Check if root is goal
        if problem.is_goal(root.state):
            return root

        # This will change according to the algorithms
        frontier = LIFO_Queue()
        frontier.push(root)


        # Repeat until all states have been expanded
        while len(frontier) != 0:

            node = frontier.pop()

            # Get all reachable states
            for child in expand(problem, node):

                state = child.state

                # If state is goal, return the node
                # Early Goal Checking
                if problem.is_goal(state):
                    return child

                # We don't care if we reached that state
                # or not before
                frontier.push(child)


        # We get here if we have no
        # more nodes to expand from
        return None

```

This is basically a [Best-First Search](#best-first-search) with the `cost_function`
being the ***negative*** of the `depth`. However, we can simply use a `LIFO Queue` instead of a
`cost_function` and drop the `reached_nodes` `dict` altogether (which is what makes it a `tree-like-search`).

This algorithm is:
- `non-optimal`, as it returns the ***first*** `solution` it finds, not the ***best***
- `incomplete`, as it is `non-systematic`; it is however `complete` for finite `acyclic graphs`
  and `trees`
- $O(b^{m})$ `time-complexity`, with $m$ being the `max-depth` of the `space`
- $O(b\, m)$ `space-complexity`, with $m$ being the `max-depth` of the `space`


One evolution of this algorithm is the ***backtracking search***, which generates
only one `successor` at a time and brings the memory usage down to $O(m)$.

> [!TIP]
> While it is `non-optimal` and `incomplete` and has a ***huge*** `time-complexity`,
> the low `space-complexity` makes it appealing, as we usually have ***much more time than space***
> available.

> [!CAUTION]
> This algorithm needs a way to handle `cycles`, for example the `parent-chain` check sketched below
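One common way to handle them, already mentioned in the redundant-paths note above, is to refuse to push a `child` whose `state` already appears along its own `parent-chain`. A minimal sketch (the helper name and the optional `max_links` cut-off are my own):

```python
    def is_cycle(node: Node, max_links: int | None = None) -> bool:
        """
        True if node.state already appears among its ancestors,
        optionally checking only the first max_links parent links
        """
        ancestor = node.parent
        checked = 0

        while ancestor is not None:
            if max_links is not None and checked >= max_links:
                break
            if ancestor.state == node.state:
                return True
            ancestor = ancestor.parent
            checked += 1

        return False
```

Inside `depth_first_search` (and its variants below) a `child` would then be pushed onto the `frontier` only `if not is_cycle(child)`.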
#### Depth-Limited

```python

    # Raised when the depth limit cut the search short,
    # to differentiate from "no solution at all"
    class MaxDepthError(Exception):
        pass


    # expand(problem, node) is the same generator
    # defined in the Breadth-First Search snippet


    def depth_limited_search(
        problem: Problem,
        max_depth: int
    ) -> Node | None:
        """
        Tree-like-Search

        Depth-First Search that never expands
        nodes deeper than max_depth.
        """

        # Initialize variables
        root = Node(
            state = problem.initial_state,
            parent = None,
            action = None,
            path_cost = 0
        )

        # Check if root is goal
        if problem.is_goal(root.state):
            return root

        # This will change according to the algorithms
        frontier = LIFO_Queue()
        frontier.push(root)

        depth_limit_reached = False


        # Repeat until all states have been expanded
        while len(frontier) != 0:

            node = frontier.pop()

            # Depth = number of actions from the root,
            # counted by walking the parent-chain
            depth = 0
            ancestor = node
            while ancestor.parent is not None:
                depth += 1
                ancestor = ancestor.parent

            # Do not expand over max_depth, but keep
            # exploring the rest of the frontier
            if depth >= max_depth:
                depth_limit_reached = True
                continue

            # Get all reachable states
            for child in expand(problem, node):

                state = child.state

                # If state is goal, return the node
                # Early Goal Checking
                if problem.is_goal(state):
                    return child

                # We don't care if we reached that state
                # or not before
                frontier.push(child)


        # We raise an error to differentiate a cut-off search
        # from one where there is no solution at all
        if depth_limit_reached:
            raise MaxDepthError("We got over max_depth")

        # We get here if we have no
        # more nodes to expand from
        return None

```

This is a ***flavour*** of [Depth-First Search](#depth-first-search).

The one addition is the `max_depth` parameter: once a `node` sits at
`max_depth`, it is not expanded any further.

This algorithm is:
- `non-optimal` (see [Depth-First Search](#depth-first-search))
- `incomplete` (see [Depth-First Search](#depth-first-search)), and it now also depends
  on `max_depth` being large enough
- $O(b^{max\_depth})$ `time-complexity`
- $O(b\, max\_depth)$ `space-complexity`

> [!CAUTION]
> This algorithm needs a way to handle `cycles`, like its ***parent***

> [!TIP]
> Depending on the domain of the `problem`, we can estimate a good `max_depth`. For
> example, ***graphs*** have a number called the `diameter`, the
> ***maximum number of `actions` needed to reach any `node` from any other `node`***,
> which is a natural choice for the limit

#### Iterative Deepening Search

```python

    def iterative_deepening_search(
        problem: Problem
    ) -> Node | None:

        done = False
        max_depth = 0
        solution = None

        while not done:

            try:

                solution = depth_limited_search(
                    problem,
                    max_depth
                )

                done = True

            # We only catch this exception: the limit was hit,
            # so retry with a deeper one
            except MaxDepthError:
                max_depth += 1

        return solution

```

This is a ***flavour*** of `depth-limited search`:
whenever the `search` hits `max_depth`, it is ***restarted*** with a larger limit,
until a `solution` is found.

This algorithm is:
- `cost-optimal` when all `actions` have the same `cost` (like [Breadth-First Search](#breadth-first-search))
- `complete` on finite `state-spaces`, provided `cycles` are handled
- $O(b^{d})$ `time-complexity` when there is a `solution` at `depth` $d$
  (and $O(b^{m})$ when there is none)
- $O(b\, d)$ `space-complexity`

> [!TIP]
> This is the ***preferred*** method for `uninformed-search` when
> ***we have no idea of the `depth` of the `solution`*** and ***the `space` is larger than memory***

#### Bidirectional Search

```python

    # expand(problem, node) is the same generator
    # defined in the Breadth-First Search snippet


    def join_nodes(node_1: Node, node_2: Node) -> Node:
        """
        Joins the two half-paths that meet in the same state
        into a single solution node (implementation not shown)
        """
        ...


    def bidirectional_search(
        problem_1: Problem,  # Initial Problem
        cost_function_1: Callable[[Node], float],
        problem_2: Problem,  # Reverse Problem
        cost_function_2: Callable[[Node], float]
    ) -> Node | None:
        """
        Graph-Search

        Expands the forward and backward frontiers,
        lowest cost first, until they meet.
        """

        # Initialize variables
        root_1 = Node(problem_1.initial_state, None, None, 0)
        root_2 = Node(problem_2.initial_state, None, None, 0)

        # This will change according to the algorithms
        frontier_1 = Priority_Queue(order_by = cost_function_1)
        frontier_1.push(root_1)

        frontier_2 = Priority_Queue(order_by = cost_function_2)
        frontier_2.push(root_2)

        reached_nodes_1: dict[State, Node] = {root_1.state: root_1}
        reached_nodes_2: dict[State, Node] = {root_2.state: root_2}

        # Keep track of the best solution found so far
        solution = None


        # Repeat until one frontier runs out of nodes
        # (a full implementation would also stop as soon as no
        # remaining path can improve on the current solution)
        while len(frontier_1) != 0 and len(frontier_2) != 0:

            # Expand the frontier whose best node has the lowest cost
            if cost_function_1(frontier_1[0]) < cost_function_2(frontier_2[0]):

                node_1 = frontier_1.pop()

                # Get all reachable states for 1
                for child in expand(problem_1, node_1):

                    state = child.state

                    # Check if state is new or has
                    # lower cost and add to frontier
                    if (
                        state not in reached_nodes_1
                        or
                        child.path_cost < reached_nodes_1[state].path_cost
                    ):
                        # Add node to frontier
                        reached_nodes_1[state] = child
                        frontier_1.push(child)

                        # Check if state has previously been
                        # reached by the other frontier
                        if state in reached_nodes_2:

                            tmp_solution = join_nodes(
                                reached_nodes_1[state],
                                reached_nodes_2[state]
                            )

                            # Check if this solution is better
                            if (
                                solution is None
                                or tmp_solution.path_cost < solution.path_cost
                            ):
                                solution = tmp_solution

            else:

                node_2 = frontier_2.pop()

                # Get all reachable states for 2
                for child in expand(problem_2, node_2):

                    state = child.state

                    # Check if state is new or has
                    # lower cost and add to frontier
                    if (
                        state not in reached_nodes_2
                        or
                        child.path_cost < reached_nodes_2[state].path_cost
                    ):
                        # Add node to frontier
                        reached_nodes_2[state] = child
                        frontier_2.push(child)

                        # Check if state has previously been
                        # reached by the other frontier
                        if state in reached_nodes_1:

                            tmp_solution = join_nodes(
                                reached_nodes_1[state],
                                reached_nodes_2[state]
                            )

                            # Check if this solution is better
                            if (
                                solution is None
                                or tmp_solution.path_cost < solution.path_cost
                            ):
                                solution = tmp_solution


        # We get here if we have no
        # more nodes to expand from
        return solution

```

This method `expands` from both the `starting-state` and the `goal-state`, like
two [Best-First Searches](#best-first-search) meeting in the middle, which saves both ***memory*** and ***time***:
two searches of `depth` $d/2$ expand roughly $2\, b^{d/2}$ `nodes`, far fewer than $b^{d}$.
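What the "reverse problem" (`problem_2`) looks like is problem-dependent. As a rough sketch, assuming every `action` can be inverted and there is a single `goal-state` (the `ReverseProblem` wrapper and its `inverse_*` helpers are hypothetical, not part of the book's formulation):

```python
    class ReverseProblem:
        """
        Illustrative wrapper that searches backwards,
        from the goal-state towards the initial-state
        """

        def __init__(self, problem: Problem, goal_state: State):
            self.forward = problem
            # The backward search starts from the goal
            self.initial_state = goal_state

        def actions(self, state: State) -> set[Action]:
            # Hypothetical helper: actions that could have led INTO this state
            return self.forward.inverse_actions(state)

        def result(self, state: State, action: Action) -> State:
            # Undo the action instead of applying it
            return self.forward.inverse_result(state, action)

        def action_cost(self, state: State, action: Action, new_state: State) -> float:
            # The cost of an edge is the same in both directions
            return self.forward.action_cost(new_state, action, state)

        def is_goal(self, state: State) -> bool:
            # The backward search's goal is the original initial-state
            return state == self.forward.initial_state
```

With something along these lines, the call would be `bidirectional_search(problem, path_cost, ReverseProblem(problem, goal), path_cost)`, with `path_cost` being `lambda node: node.path_cost`.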
On the other hand, the algorithm itself is ***harder*** to implement.

This algorithm is:
- `cost-optimal` (see [Best-First Search](#best-first-search))
- `complete` (see [Best-First Search](#best-first-search))
- $O(b^{1 + \frac{C^*}{2\epsilon}})$ `time-complexity`
  (see [Uniform-Cost Search](#dijkstras-algorithm--aka-uniform-cost-search))
- $O(b^{1 + \frac{C^*}{2\epsilon}})$ `space-complexity`
  (see [Uniform-Cost Search](#dijkstras-algorithm--aka-uniform-cost-search))

> [!TIP]
> If the `cost_function` is the `path_cost`, the search is `bi-directional` Dijkstra and we can make the following
> consideration:
>
> - with $C^*$ being the cost of the `optimal-path`, no `node` with $path\_cost > \frac{C^*}{2}$ will be expanded
>
> This is an important speedup. However, without a `bi-directional`-friendly `cost_function`,
> we need to check several times whether the ***best `solution`*** found so far can still be improved.

### Informed Algorithms | AKA Heuristic Search

These `algorithms` know something about how ***close*** a `node` is to the `goal`

#### Best-First Search

```python

    # expand(problem, node) is the same generator
    # defined in the Breadth-First Search snippet


    def best_first_search(
        problem: Problem,
        cost_function: Callable[[Node], float]
    ) -> Node | None:
        """
        Graph-Search

        Expands the node with the lowest
        cost_function value first.
        """

        # Initialize variables
        root = Node(
            state = problem.initial_state,
            parent = None,
            action = None,
            path_cost = 0
        )

        # This will change according to the algorithms
        frontier = Priority_Queue(order_by = cost_function)
        frontier.push(root)

        reached_nodes: dict[State, Node] = {}
        reached_nodes[root.state] = root


        # Repeat until all states have been expanded
        while len(frontier) != 0:

            node = frontier.pop()

            # If state is goal, return the node
            # Late Goal Checking
            if problem.is_goal(node.state):
                return node

            # Get all reachable states
            for child in expand(problem, node):

                state = child.state

                # Check if state is new and add it
                if state not in reached_nodes:
                    reached_nodes[state] = child
                    frontier.push(child)
                    continue

                # Here we know the state has been reached before:
                # check if the new path has a lower cost and add it
                if child.path_cost < reached_nodes[state].path_cost:
                    reached_nodes[state] = child
                    frontier.push(child)


        # We get here if we have no
        # more nodes to expand from
        return None

```

In a `best_first_search` we start from the `root`, then we
`expand` it and add the resulting `states` to the `frontier` as `nodes`, if they are
***either new or reached at a lower path_cost***.

Whenever we turn a `state` into a `node`, we keep track of (see the sketch right after this list):

- the `state`
- the `parent-node`
- the `action` used to go from the `parent-node` to here
- the `path_cost`, accumulated from the `parent-node`'s cost plus the `action`'s cost
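A minimal sketch of such a `Node` is just a small record plus a way to walk the `parent-chain` back to the `root` (this is an assumption about how the snippets above could represent it, not code from the book):

```python
    from dataclasses import dataclass


    @dataclass
    class Node:
        """Bookkeeping wrapper around a state in the search tree"""

        state: "State"
        parent: "Node | None" = None    # None only for the root
        action: "Action | None" = None  # action that generated this node
        path_cost: float = 0.0          # cumulative cost from the root


    def actions_to(node: Node) -> list["Action"]:
        """Rebuilds the path (the sequence of actions) by
        following the parent references back to the root"""
        actions = []
        while node.parent is not None:
            actions.append(node.action)
            node = node.parent
        return list(reversed(actions))
```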
If there's no `node` left in the `frontier` and we didn't find
the `goal-state`, then the solution is `None`.

#### Greedy Best-First Search

#### A* Search

#### Weighted A*

#### Bidirectional Heuristic Search

[^breadth-first-performance]: Artificial Intelligence: A Modern Approach, Global Edition, 4th ed. |
Ch. 3, pg. 95

[^dijkstra-algorithm]: Artificial Intelligence: A Modern Approach, Global Edition, 4th ed. |
Ch. 3, pg. 96