Rework of chapters 2 and 3
parent 1a3e33b5ed
commit 642559ee4d
@@ -49,32 +49,37 @@ This is an `agent` which has a `factored` or `structured` representation of states

## Search Problem

A search problem is the union of the following:

- **State Space**

  Set of *possible* `states`.

  It can be represented as a `graph` where each `state` is a `node`
  and each `action` is an `edge`, leading from a `state` to another

- **Initial State**

  The initial `state` the `agent` is in

- **Goal State(s)**

  The `state` where the `agent` will have reached its goal. There can be multiple
  `goal-states`

- **Available Actions**

  All the `actions` available to the `agent`:

  ```python
  def get_actions(state: State) -> set[Action]
  ```

- **Transition Model**

  A `function` which returns the `next-state` after taking
  an `action` in the `current-state`:

  ```python
  def move_to_next_state(state: State, action: Action) -> State
  ```

- **Action Cost Function**

  A `function` which denotes the cost of taking an
  `action` to reach a `new-state` from the `current-state`:

  ```python
  def action_cost(
      current_state: State,
@@ -86,28 +91,28 @@ A search problem is the union of the following:

A `sequence` of `actions` to go from a `state` to another is called a `path`.
A `path` leading to the `goal` is called a `solution`.

The ***shortest*** `path` to the `goal` is called the `optimal-solution`, or
in other words, this is the `path` with the ***lowest*** `cost`.

Obviously we always need a level of ***abstraction*** to get our `agent` to
perform at its best. For example, we don't need to express any detail
about the ***physics*** of the real world to go from *point-A* to *point-B*.

## Searching Algorithms

Most algorithms used to solve [Searching Problems](#search-problem) rely
on a `tree` based representation, where the `root-node` is the `initial-state`
and each `child-node` is the `next-available-state` from a `node`.

Since the `data-structure` is a `search-tree`, each `node` has a ***unique***
`path` back to the `root`, as each `node` has a ***reference*** to the `parent-node`.

For each `action` we generate a `node` and each `generated-node`, whether
***further explored*** or not, becomes part of the `frontier` or `fringe`.

> [!TIP]
> Before going on with how to implement `search` algorithms,
> let's say that we'll use these `data-structures` for
> `frontiers`:
>
> - `priority-queue` when we need to evaluate for `lowest-costs` first
@@ -116,41 +121,45 @@ For each `action` we generate a `node` and each `generated-node`, whether
>
> Then we need to take care of ***redundant-paths*** in some way:
>
> - Remember all previous `states` and only care for the best `paths` to these
>   `states`, ***best when the `problem` fits into memory***.
> - Ignore the problem when it is ***rare*** or ***impossible*** to repeat them,
>   like in an ***assembly line*** in factories.
> - Check for repeated `states` along the `parent-chain` up to the `root` or
>   the first `n-links`. This allows us to ***save up on memory***
>   (see the sketch below)
>
> If we check for `redundant-paths` we have a `graph-search`, otherwise a `tree-like-search`
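
A minimal sketch of that third option, assuming each `node` keeps its `state` and a
***reference*** to its `parent-node` as described above:

```python
def is_cycle(node, max_links=None):
    """Walk the parent-chain and tell whether node.state already appears
    among its ancestors (optionally only within the first max_links links)."""
    ancestor = node.parent
    links = 0
    while ancestor is not None:
        if ancestor.state == node.state:
            return True           # redundant path: we came back to a state on this path
        links += 1
        if max_links is not None and links >= max_links:
            break                 # stop early to save time on very deep paths
        ancestor = ancestor.parent
    return False
```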

### Measuring Performance

We have 4 parameters:

- #### Completeness

  Is the `algorithm` guaranteed to find the `solution`, if any, and report
  ***no solution*** otherwise?

  This is easy for `finite` `state-spaces`, while we need a ***systematic***
  algorithm for `infinite` ones, though it would be difficult to report
  ***no solution*** as it is impossible to explore the ***whole `space`***.

- #### Cost Optimality

  Can it find the `optimal-solution`?

- #### Time Complexity

  `O(n) time` performance

- #### Space Complexity

  `O(n) space` performance, explicit (if the `graph` is ***explicit***) or
  by means of:

  - `depth` of `actions` for an `optimal-solution`
  - `max-number-of-actions` in **any** `path`
  - `branching-factor` for a node

### Uninformed Algorithms
@@ -238,19 +247,19 @@ These `algorithms` know **nothing** about the `space`

```

In this `algorithm` we use the ***depth*** of `nodes` as the `cost` to
reach such nodes.

In comparison to the [Best-First Search](#best-first-search), we have these
differences:

- `FIFO Queue` instead of a `Priority Queue`:
  Since we expand on ***breadth***, a `FIFO` guarantees us that
  all nodes are in order, as the `nodes` generated at the same `depth`
  are generated before those at `depth + 1`.

- `early-goal test` instead of a `late-goal test`:
  We can immediately see if the `state` is the `goal-state` as
  it would already have the ***minimum `cost`***

- The `reached_states` is now a `set` instead of a `dict`:
@@ -258,11 +267,12 @@ differences:
that we already reached the ***minimum `cost`*** for that `state`
after the first time we reached it.

However the `space-complexity` and `time-complexity` are
***high***, with $O(b^d)$ space, where $b$ is the
`max-branching-factor` and $d$ is the `search-depth`[^breadth-first-performance]

This algorithm is:

- `complete`
- `optimal` (as long as each `action` has the same `cost`)
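
A minimal sketch of the `breadth-first search` just described, assuming the
`get_actions` / `move_to_next_state` interface from the [Search Problem](#search-problem)
section and an `is_goal` predicate passed in for illustration:

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Optional


@dataclass
class Node:
    state: Any
    parent: Optional["Node"]
    action: Any
    path_cost: float


def breadth_first_search(
    initial_state: Any,
    is_goal: Callable[[Any], bool],
    get_actions: Callable[[Any], Iterable[Any]],
    move_to_next_state: Callable[[Any, Any], Any],
) -> Optional[Node]:
    node = Node(initial_state, None, None, 0)
    if is_goal(node.state):           # early-goal test on the root
        return node
    frontier = deque([node])          # FIFO queue
    reached_states = {initial_state}  # a set is enough: the first visit already has minimum cost
    while frontier:
        node = frontier.popleft()
        for action in get_actions(node.state):
            child_state = move_to_next_state(node.state, action)
            if child_state in reached_states:
                continue
            child = Node(child_state, node, action, node.path_cost + 1)
            if is_goal(child_state):  # early-goal test when the node is generated
                return child
            reached_states.add(child_state)
            frontier.append(child)
    return None                       # no solution
```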

@@ -274,23 +284,24 @@ This algorithm is:

This algorithm is basically [Best-First Search](#best-first-search) but with `path_cost()`
as the `cost_function`.

It works by `expanding` all `nodes` that have the ***lowest*** `path-cost` and
evaluating them for the `goal` after `popping` them out of the `queue`, otherwise
it would pick up one of the `non-optimal solutions`.

Its ***performance*** depends on $C^{*}$, the `cost` of the `optimal-solution`, and $\epsilon > 0$, the lower
bound over the `cost` of each `action`. The `worst-case` would be
$O(b^{1 + \frac{C^*}{\epsilon}})$ for both `time` and `space-complexity`

When all `actions` cost the same $\epsilon$, the `complexity` reduces to $O(b^{d + 1})$

This algorithm is:

- `optimal`
- `complete`

> [!TIP]
> Notice that at ***worst***, we will have to expand $\frac{C^*}{\epsilon}$ levels if ***each***
> action cost at most $\epsilon$, since $C^*$ is the `optimal-cost`, plus one
> ***last expansion*** before realizing it got the `optimal-solution`
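
A minimal sketch, assuming the same hypothetical `get_actions` / `move_to_next_state` /
`action_cost` interface from the [Search Problem](#search-problem) section; note the
***late-goal test***: a `state` is only checked once it is popped with the lowest `path-cost`:

```python
import heapq
from itertools import count


def uniform_cost_search(initial_state, is_goal, get_actions,
                        move_to_next_state, action_cost):
    tie = count()                                    # tie-breaker for equal costs
    frontier = [(0, next(tie), initial_state, [])]   # (path_cost, _, state, actions so far)
    best_cost = {initial_state: 0}
    while frontier:
        path_cost, _, state, path = heapq.heappop(frontier)
        if is_goal(state):           # late-goal test: only now is path_cost guaranteed optimal
            return path, path_cost
        if path_cost > best_cost.get(state, float("inf")):
            continue                 # stale queue entry, a cheaper path was already found
        for action in get_actions(state):
            next_state = move_to_next_state(state, action)
            new_cost = path_cost + action_cost(state, action, next_state)
            if new_cost < best_cost.get(next_state, float("inf")):
                best_cost[next_state] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost, next(tie), next_state, path + [action])
                )
    return None, float("inf")        # no solution
```
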
#### Depth-First Search
@@ -371,10 +382,11 @@ This algorithm is:

```

This is basically a [Best-First Search](#best-first-search) but with the `cost_function`
being the ***negative*** of `depth`. However we can use a `LIFO Queue`, instead of a
`cost_function`, and delete the `reached_states` `dict`.

This algorithm is:

- `non-optimal` as it returns the ***first*** `solution`, not the ***best***
- `incomplete` as it is `non-systematic`, but it is `complete` for `acyclic graphs`
  and `trees`
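
A minimal tree-like sketch, assuming the same hypothetical `get_actions` /
`move_to_next_state` / `is_goal` interface; a plain `list` plays the role of the
`LIFO Queue` and, as said above, there is no `reached_states` `dict` (only a cheap
check along the current `path`):

```python
def depth_first_search(initial_state, is_goal, get_actions, move_to_next_state):
    frontier = [[initial_state]]            # each entry is the path of states so far
    while frontier:
        path = frontier.pop()               # LIFO: take the most recently added path
        state = path[-1]
        if is_goal(state):
            return path                     # first solution found, not necessarily the best
        for action in get_actions(state):
            next_state = move_to_next_state(state, action)
            if next_state in path:          # avoid trivial cycles along the current path only
                continue
            frontier.append(path + [next_state])
    return None                             # may never terminate on infinite spaces
```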

@@ -479,12 +491,13 @@ One evolution of this algorithm is the ***backtracking search***

This is a ***flavour*** of [Depth-First Search](#depth-first-search).

One addition of this algorithm is the `max_depth` parameter: once we reach
`max_depth`, we don't expand anymore.

This algorithm is:

- `non-optimal` (see [Depth-First Search](#depth-first-search))
- `incomplete` (see [Depth-First Search](#depth-first-search)) and it now depends
  on the `max_depth` as well
- $O(b^{max\_depth})$ `time-complexity`
- $O(b\, max\_depth)$ `space-complexity`
@@ -493,8 +506,8 @@ This algorithm is:
> This algorithm needs a way to handle `cycles`, as its ***parent*** does

> [!TIP]
> Depending on the domain of the `problem`, we can estimate a good `max_depth`: for
> example, ***graphs*** have a number called the `diameter` that tells us the
> ***max number of `actions` needed to reach any `node` in the `graph`***

#### Iterative Deepening Search
@@ -531,15 +544,16 @@ This is a ***flavour*** of `depth-limited search`. Whenever it reaches the `max_
the `search` is ***restarted*** until one is found.

This algorithm is:

- `non-optimal` (see [Depth-First Search](#depth-first-search))
- `incomplete` (see [Depth-First Search](#depth-first-search)) and it now depends
  on the `max_depth` as well
- $O(b^{m})$ `time-complexity`
- $O(b\, m)$ `space-complexity`

> [!TIP]
> This is the ***preferred*** method for `uninformed-search` when
> ***we have no idea of the `max_depth`*** and ***the `space` is larger than memory***
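
A minimal sketch, assuming the same hypothetical `get_actions` / `move_to_next_state` /
`is_goal` interface used above; the inner function plays the role of the
`depth-limited search` of the previous section:

```python
from itertools import count


def iterative_deepening_search(initial_state, is_goal, get_actions, move_to_next_state):
    def depth_limited(state, path, max_depth):
        if is_goal(state):
            return path
        if max_depth == 0:
            return "cutoff"            # the limit was hit: a deeper retry may still succeed
        outcome = None
        for action in get_actions(state):
            next_state = move_to_next_state(state, action)
            if next_state in path:     # handle cycles, as the parent algorithm must
                continue
            result = depth_limited(next_state, path + [next_state], max_depth - 1)
            if result == "cutoff":
                outcome = "cutoff"
            elif result is not None:
                return result
        return outcome

    # Restart the depth-limited search with an ever larger max_depth
    for max_depth in count(start=0):
        result = depth_limited(initial_state, [initial_state], max_depth)
        if result != "cutoff":
            return result              # either a solution or None (no solution exists)
```
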
#### Bidirectional Search
@@ -663,9 +677,10 @@ a [Best-First Search](#best-first-search), making us save up on ***memory*** and

On the other hand, the algorithm is ***harder*** to implement

This algorithm is:

- `optimal` (see [Best-First Search](#best-first-search))
- `complete` (see [Best-First Search](#best-first-search))
- $O(b^{1 + \frac{C^*}{2\epsilon}})$ `time-complexity`
  (see [Uniform-Cost Search](#dijkstraalgorithm--aka-uniform-cost-search))
- $O(b^{1 + \frac{C^*}{2\epsilon}})$ `space-complexity`
  (see [Uniform-Cost Search](#dijkstraalgorithm--aka-uniform-cost-search))
@@ -676,7 +691,7 @@ This algorithm is:
>
> - $C^*$ is the cost of the `optimal-path` and no `node` with $path\_cost > \frac{C^*}{2}$ will be expanded
>
> This is an important speedup; however, without a `bi-directional` `cost_function`,
> we need to check for the ***best `solution`*** several times.

### Informed Algorithms | AKA Heuristic Search
@@ -766,8 +781,8 @@ These `algorithms` know something about the ***closeness*** of nodes

```

In a `best_first_search` we start from the `root` and then we
`expand` and add these `states` as `nodes` to the `frontier` if they are
***either new or at a lower path_cost***.

Whenever we get from a `state` to a `node`, we keep track of:
@@ -792,4 +807,4 @@ the `goal-state`, then the solution is `null`.
Ch. 3 pg. 95

[^dijkstra-algorithm]: Artificial Intelligence: A Modern Approach Global Edition 4th
Ch. 3 pg. 96

docs/2-INTELLIGENT-AGENTS.md (new file, 249 lines)

@@ -0,0 +1,249 @@
# Intelligent Agents

## Task Environment

### Performance

How well the job is done

### Environment

Something we must accept as it is

### Actuators

Something we can use to change our Environment

### Sensors

Something we can use to perceive our Environment

## Properties of a Task Environment

- Observability
  - Fully Observable: Sensors gather all info
  - Partially Observable: Sensors gather info, but some is unavailable
  - Unobservable: No Sensors at all
- Number of Agents:
  - Multi-Agent: Other Agents (even people may be considered agents) live in the environment
    > [!NOTE]
    > In order to be considered an agent, the other entity must maximize an objective
    > that depends on our agent's behaviour
    - Cooperative: All agents try to maximize the same objective
    - Competitive: An agent's objective can be maximized by penalizing the other agents' objectives
  - Single-Agent: Only one agent exists
- Predictability
  - Deterministic: We can predict everything
  - Stochastic: We can predict outcomes according to some probabilities
  - Nondeterministic: We cannot predict everything nor the probabilities
- Memory-Dependence
  - Episodic: Each stimulus-action is independent from previous actions
  - Sequential: Current actions may influence ones in the future, so we need to keep memory
- Staticity
  - Static: Environment does not change **while our agent is deciding**
  - Dynamic: Environment changes **while our agent is deliberating**
  - Semi-Dynamic: Environment is Static, but the agent's performance changes with time
- Continuity **(applies to States)**
  - Continuous: State has continuous elements
  - Discrete: State has no continuous elements
- Knowledge
  > [!CAUTION]
  > It is not influenced by Observability, as this refers to the **outcomes
  > of actions** and **not to the state of the agent**
  - Known: Each rule is known a priori (known outcomes)
  - Unknown: The agent must discover environment rules (unknown outcomes)
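
As a summary of this taxonomy, a hypothetical sketch of how a task environment could be
described in code (all names here are illustrative, not from the book):

```python
from dataclasses import dataclass
from enum import Enum, auto


class Observability(Enum):
    FULLY_OBSERVABLE = auto()
    PARTIALLY_OBSERVABLE = auto()
    UNOBSERVABLE = auto()


class Predictability(Enum):
    DETERMINISTIC = auto()
    STOCHASTIC = auto()
    NONDETERMINISTIC = auto()


@dataclass
class TaskEnvironmentProperties:
    observability: Observability
    agents: int               # 1 -> single-agent, >1 -> multi-agent
    predictability: Predictability
    episodic: bool            # False -> sequential
    static: bool              # False -> dynamic
    discrete: bool            # False -> continuous
    known: bool               # False -> the agent must discover the rules


# Example: the environment assumed in the next chapter (searching)
SEARCH_ENVIRONMENT = TaskEnvironmentProperties(
    observability=Observability.FULLY_OBSERVABLE,
    agents=1,
    predictability=Predictability.DETERMINISTIC,
    episodic=True,
    static=True,
    discrete=True,
    known=True,
)
```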

### Environment classes

According to these properties, we can define a class of Environments on which we can
test our agents

### Architecture

The actual *hardware* (actuators) where our program will run, like a robot, or
a pc.

## Agent Programs

> [!NOTE]
> All programs have the same pseudo code:
>
> ```python
> def agent(percept) -> Action:
> ```

### Table Driven Agent

It basically has all possible reactions to stimuli at time $\mathcal{T}_i$, thus
a space of $\sum_{t=1}^{T}|\mathcal{S}|^{t}$, which quickly becomes enormous

> [!TIP]
> It is actually complete and makes us react at best

> [!CAUTION]
> It is very memory consuming, so it is only suitable for small problems

```python
class TableDrivenAgent(Agent):

    def __init__(
        self,
        # Keys are tuples (not lists): dict keys must be hashable
        action_table: dict[tuple[Percept, ...], Action]
    ):
        self.__action_table = action_table
        self.__percept_sequence: list[Percept] = []


    def agent(self, percept: Percept) -> Action:
        # Python lists have append(), not push()
        self.__percept_sequence.append(percept)

        # Look up the action for the whole percept sequence seen so far
        return self.__action_table.get(
            tuple(self.__percept_sequence)
        )
```
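
A hypothetical usage sketch for the class above (the `Percept`/`Action` values are made up
for illustration, and an `Agent` base class is assumed to exist as in the rest of this chapter):

```python
# Illustrative stand-ins for the types used by the class above
Percept = str
Action = str

action_table: dict[tuple[Percept, ...], Action] = {
    ("A-dirty",): "suck",
    ("A-clean",): "move-right",
    ("A-clean", "B-dirty"): "suck",
    ("A-clean", "B-clean"): "do-nothing",
}

agent = TableDrivenAgent(action_table)
print(agent.agent("A-clean"))   # -> "move-right"
print(agent.agent("B-dirty"))   # -> "suck" (the whole percept sequence so far is looked up)
```
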
### Reflex Agent

Acts only based on stimuli at that very time $t$

> [!TIP]
> It only reacts to stimuli based on the current state, so it's smaller and very fast

> [!CAUTION]
> It is very limited in its capabilities and **requires a fully observable environment**

```python
class ReflexAgent(Agent):

    def __init__(
        self,
        rules: list[Rule],  # It depends on how the actual function is implemented
    ):
        self.__rules = rules


    def agent(self, percept: Percept) -> Action:
        status = self.__get_status(percept)
        rule = self.__get_rule(status)
        return rule.action


    # MUST BE IMPLEMENTED
    def __get_status(self, percept: Percept) -> Status:
        pass


    # MUST BE IMPLEMENTED (it uses our rules)
    def __get_rule(self, status: Status) -> Rule:
        pass
```

### Model-Based Reflex Agent

This agent has 2 models that help it keep an internal representation of the world:

- **Transition Model**:\
  Knowledge of *how the world works*
  (What my actions do and how the world evolves without me)
- **Sensor Model**:\
  Knowledge of *how the world is reflected in the agent's percepts*

> [!TIP]
> It only reacts to stimuli based on the current state, but is also capable of predicting the
> next state, thus having some info on unobserved states

> [!CAUTION]
> It is more complicated to code and slightly worse in terms of raw speed. While it is
> more flexible, it is still somewhat limited

```python
class ModelBasedReflexAgent(Agent):

    def __init__(
        self,
        initial_state: State,
        transition_model: TransitionModel,  # Implementation dependent
        sensor_model: SensorModel,          # Implementation dependent
        rules: Rules                        # Implementation dependent
    ):
        self.__current_state = initial_state
        self.__transition_model = transition_model
        self.__sensor_model = sensor_model
        self.__rules = rules
        self.__last_action: Action = None


    def agent(self, percept: Percept) -> Action:
        self.__update_state(percept)
        rule = self.__get_rule()
        self.__last_action = rule.action  # remember it for the next state update
        return rule.action


    # MUST BE IMPLEMENTED
    def __update_state(self, percept: Percept) -> None:
        """
        Uses:
        - percept
        - self.__current_state
        - self.__last_action
        - self.__transition_model
        - self.__sensor_model
        """
        # Do something
        self.__current_state = state


    # MUST BE IMPLEMENTED (it uses our rules)
    def __get_rule(self) -> Rule:
        """
        Uses:
        - self.__current_state
        - self.__rules
        """
        # Do something
        return rule
```

### Goal-Based Agent

It's an agent that **has an internal state representation**, **can predict the next state** and **chooses an action that satisfies its goals**.

> [!TIP]
> It is very flexible and needs less hardcoded info compared to a reflex-based approach, allowing a rapid
> change of goals without reprogramming the agent

> [!CAUTION]
> It is more computationally expensive to implement; moreover it requires a strategy to choose the best action

### Utility-Based Agent

It's an agent that **performs to maximize its expected utility**, useful when **goals are incompatible** and there's **uncertainty
about how to achieve a goal**

> [!TIP]
> Useful when we have many goals with different importance and when we need to balance some incompatible ones

> [!CAUTION]
> It adds another layer of computation and, since it chooses based on **estimations**, it could be wrong

> [!NOTE]
> Not every goal-based agent has a model to guide it

## Learning Agent

Each agent may be a Learning Agent, and it is composed of:

- **Learning Element**: Responsible for improvements
- **Performance Element**: The entire agent, which takes actions
- **Critic**: Gives feedback to the Learning Element on how to improve the agent
- **Problem Generator**: Suggests new actions to promote exploration of possibilities, instead of always repeating the **best known action**

## States Representation

> [!NOTE]
> Read pg 76 to 78 of Artificial Intelligence: A Modern Approach

docs/3-SOLVING-PROBLEMS-BY-SEARCHING.md (new file, 303 lines)

@@ -0,0 +1,303 @@
# Solving Problems by Searching

> [!WARNING]
> In this chapter we'll be talking about an environment with these properties:
>
> - episodic
> - single agent
> - fully observable
> - deterministic
> - static
> - discrete
> - known

## Problem-solving agent

It uses a **search method** to solve a problem. This agent is useful when it is not possible
to define an immediate action to perform, but rather it is needed to **plan ahead**

However, this agent only uses **Atomic States**, making it faster, but limiting its
power.

## Algorithms

They can be of 2 types:

- **Informed**: The agent can estimate how far it is from the goal
- **Uninformed**: The agent can't estimate how far it is from the goal

And it is usually composed of 4 phases:

- **Goal Formulation**: Choose a goal
- **Problem Formulation**: Create a representation of the world
- **Search**: Before taking any action, search for a sequence that brings the agent to the goal
- **Execution**: Execute the solution (the sequence of actions to the goal) in the real world

> [!NOTE]
> This is technically an **open loop** as the agent, after the search phase, **will ignore any other percept**

### Problem

This is what we are trying to solve with our agent, and it's a description of our environment:

- Set of States - **State space**
- Initial State
- One or more Goal States
- Available Actions - `def actions(state: State) -> list[Action]`
- Transition Model - `def result(state: State, action: Action) -> State`
- Action Cost Function - `def action_cost(current_state: State, action: Action, new_state: State) -> float`

The sequence of actions that brings us to the goal is a solution, and if it has the lowest cost, then it is optimal.

> [!TIP]
> A State-Space can be represented as a graph
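
A minimal sketch of such a problem description for a hypothetical 3-city map
(the city names and costs are made up for illustration):

```python
State = str
Action = str

# Hypothetical road map: state -> {reachable state: cost of driving there}
ROADS: dict[State, dict[State, float]] = {
    "Start": {"A": 2.0, "B": 5.0},
    "A": {"Goal": 4.0},
    "B": {"Goal": 1.0},
    "Goal": {},
}

INITIAL_STATE: State = "Start"
GOAL_STATES: set[State] = {"Goal"}


def actions(state: State) -> list[Action]:
    return list(ROADS[state])


def result(state: State, action: Action) -> State:
    return action  # here an action simply means "drive to that city"


def action_cost(current_state: State, action: Action, new_state: State) -> float:
    return ROADS[current_state][new_state]
```
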
### Search algorithms

We usually use a tree to represent our state space:

- Root node: Initial State
- Child nodes: Reachable States
- Frontier: Unexplored Nodes

```python
class Node:

    def __init__(
        self,
        state: State,          # State represented
        parent: Node | None,   # Node that generated this node
        action: Action,        # Action that generated this node
        path_cost: float       # Total cost to reach this node
    ):
        # Store the fields so a solution can be rebuilt by following parent links
        self.state = state
        self.parent = parent
        self.action = action
        self.path_cost = path_cost
```

```python
class Frontier:
    """
    Can inherit from / wrap:
    - heapq -> Best-first search
    - queue -> Breadth-first search
    - stack -> Depth-first search
    """


    def is_empty(self) -> bool:
        """
        True if no nodes in Frontier
        """
        pass


    def pop(self) -> Node:
        """
        Returns the top node and removes it
        """
        pass


    def top(self) -> Node:
        """
        Returns top node without removing it
        """
        pass


    def add(self, node: Node):
        """
        Adds node to frontier
        """
        pass
```
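
A minimal sketch of how these two pieces could be tied together in a generic best-first loop,
assuming a hypothetical `problem` object exposing the `actions` / `result` / `action_cost`
interface above plus `initial_state` and `is_goal`, and a `Frontier` that pops the node with
the best evaluation first:

```python
def best_first_search(problem, frontier: Frontier) -> Node | None:
    root = Node(state=problem.initial_state, parent=None, action=None, path_cost=0)
    frontier.add(root)
    reached: dict = {problem.initial_state: root}   # best node found so far for each state
    while not frontier.is_empty():
        node = frontier.pop()
        if problem.is_goal(node.state):
            return node                              # rebuild the path by following parents
        for action in problem.actions(node.state):
            child_state = problem.result(node.state, action)
            cost = node.path_cost + problem.action_cost(node.state, action, child_state)
            if child_state not in reached or cost < reached[child_state].path_cost:
                child = Node(state=child_state, parent=node, action=action, path_cost=cost)
                reached[child_state] = child
                frontier.add(child)
    return None
```
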
#### Loop Detection

There are 3 main techniques to avoid loopy paths:

- Remember all previous states and keep only the best path
  - Used in best-first search
  - Fast computation
  - Huge memory requirement
- Just forget about it (Tree-like search, because it does not check for redundant paths)
  - Not every problem has repeated states, or only a few
  - No memory overhead
  - May potentially slow computation by a lot, or halt it completely
- Check for repeated states along a chain
  - Slows computation based on how many links are inspected
  - Computation overhead
  - No memory impact
#### Performance metrics for search algorithms

- Completeness:\
  Can the algorithm **always** find a solution **if any exists**, or report if **none** can be found?
- Cost optimality:\
  Does the found solution have the **lowest cost**?
- Time complexity:\
  O(n) notation for computation
- Space complexity:\
  O(n) notation for memory

> [!NOTE]
> We'll be using the following notation
>
> - $b$: branching factor (max number of branches generated by one action)
> - $d$: solution depth
> - $m$: maximum tree depth
> - $C$: cost
> - $C^*$: Theoretical optimal cost
> - $\epsilon$: Minimum cost

#### Uninformed Strategies
##### Best First Search

- Take the node with the least cost and expand it
- Frontier must be a Priority Queue (or Heap Queue)
- Equivalent to a Breadth-First approach when each node has the same cost
- If $\epsilon = 0$ then the algorithm may stall, thus give every action a minimal cost

It is:

- Complete
- Optimal
- $O(b^{1 + \frac{C^*}{\epsilon}})$ Time Complexity
- $O(b^{1 + \frac{C^*}{\epsilon}})$ Space Complexity

> [!NOTE]
> If everything costs $\epsilon$ this is basically a breadth-first search that costs 1 more on depth
##### Breadth-First Search

- Take the nodes that were generated first
- Frontier should be a Queue (FIFO)
- Equivalent to a Best-First Search with an evaluation function that takes depth as the cost

It is:

- Complete
- Optimal as **long as the cost is uniform across all States**
- $O(b^d)$ Time Complexity
- $O(b^d)$ Space Complexity
##### Depth-First Search

It is the best whenever we need to save up on memory, implemented as a tree-like search, but it has many drawbacks

- Take the nodes that were generated last
- Frontier should be a Stack (LIFO)
- Equivalent to a Best-First Search when the evaluation function has -depth as the cost

It is:

- Complete as long as the space is finite
- Non-Optimal
- $O(b^m)$ Time Complexity
- $O(bm)$ Space complexity

> [!TIP]
> It has 2 variants
>
> - Backtracking Search: Uses less memory, going from $O(bm)$ to $O(m)$ Space complexity, at the expense of making the algorithm more complex
> - Depth-Limited: We limit depth at a certain value and we don't expand once we reach it. By increasing such limit iteratively, we have an **Iterative deepening search**
##### Bidirectional Search

Instead of searching only from the Initial State, we may search from the Goal State as well, potentially reducing our
complexity down to $O(b^{\frac{d}{2}})$

- We need two Frontiers
- Two tables of reached states
- Need a way to detect a collision between the two searches

It is:

- Complete if both directions are breadth-first or uniform-cost and the space is finite
- Optimal (as for breadth-first)
- $O(b^{\frac{d}{2}})$ Time Complexity
- $O(b^{\frac{d}{2}})$ Space Complexity
#### Informed Strategies

A strategy is informed if it uses info about the domain to make a decision.

Here we introduce a new function called the **heuristic function** $h(n)$, which is the estimated cost of
the cheapest path from the state at node $n$ to the goal state
##### Greedy best-first search

- It is a best-first search with $h(n)$ as the evaluation function
- In the best case it expands nothing that is not on the solution path
- May yield a worse outcome than other strategies
- Can amortize Complexity to $O(bm)$ with good heuristics

> [!CAUTION]
> In other words, a greedy search only takes into account the heuristic distance from a state to the goal, not the
> whole cost

> [!NOTE]
> There exists a version called **speedy search** that uses the estimated number of actions to reach the goal as its heuristic,
> regardless of the actual cost

It is:

- Complete when the state space is finite
- Non-Optimal
- $O(|V|)$ Time complexity
- $O(|V|)$ Space complexity
##### A\*

- Best-First search with the evaluation function equal to the path cost to node $n$ plus the heuristic: $f(n) = g(n) + h(n)$
- Optimality depends on the **admissibility** of its heuristic (a function that never overestimates the cost to reach a goal; see the sketch below)
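
For instance, a minimal sketch of an admissible heuristic for a hypothetical 4-connected grid
world where every move costs at least 1 (Manhattan distance can never overestimate the true
cost in that setting):

```python
def manhattan_distance(state: tuple[int, int], goal: tuple[int, int]) -> int:
    # At least this many unit-cost moves are needed, so h(n) never overestimates
    return abs(state[0] - goal[0]) + abs(state[1] - goal[1])


def f(path_cost: float, state: tuple[int, int], goal: tuple[int, int]) -> float:
    # A* evaluation: f(n) = g(n) + h(n)
    return path_cost + manhattan_distance(state, goal)
```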

It is:

- Complete
- Optimal as long as the heuristic is admissible (demonstration on page 106 of the book)
- $O(b^d)$ Time complexity
- $O(b^d)$ Space complexity

> [!TIP]
> While A\* is a great algorithm, we may sacrifice its optimality for a faster approach by introducing a weight over the
> heuristic function: $f(n) = g(n) + W \cdot h(n)$.
>
> The higher $W$, the faster and less accurate A\* will be

> [!NOTE]
> There exists a version called **Iterative-Deepening A\*** which cuts off at a certain value of $f(n)$ and, if no goal is reached, it
> restarts by increasing the cut-off using the smallest cost among the nodes that went over the previous $f(n)$, practically searching over a contour
## Search Contours

See pg 107 to 110 of the book

## Memory bounded search

See pg 110 to 115

### Reference Count and Separation

Discard nodes that have been visited by all of their neighbours

### Beam Search

Discard all Frontier nodes that are not among the K strongest, or that are more than $\delta$ away from the strongest candidate

### RBFS

It's an algorithm similar to IDA\*, but faster. It tries to mimic a best-first search in linear space, but it regenerates a lot of nodes already explored.
At its core, it keeps the second-best value in memory and, each time it expands, if it has not reached the goal, it overwrites the cost of a node with its cheapest child
(in order to resume the exploration), then goes back to the second best it had in memory, and keeps the other result as the new second best.

### MA\* and SMA\*

These are 2 algorithms that were born to make use of all the available memory, to address the "too little memory" problem of both IDA\* and RBFS

## Heuristic Functions

See pg 115 to 122