# Intelligent Agents

## What is an Agent

An `agent` is something that acts in an `environment` through the help of `actuators`
and takes its decisions by evaluating the information gathered by `sensors`.

- `Sensor`: Provides info by analyzing the `Environment`
- `Actuator`: Changes the `Environment`
- `Environment`: The world the `agent` operates in; it holds the source of truth of the current state
- `Agent`: Anything that can change the `Environment` through some `Decisions`

The ultimate `goal` for us is to have a `rational agent` which takes
`rational` decisions to reach a `goal`.

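To make the `sensor` / `agent` / `actuator` cycle concrete, here is a tiny self-contained sketch; the thermostat-like environment and all of its names are made up just for illustration:

```python
# A minimal sketch of the sensor/actuator loop described above.
# The environment and the "decision rule" are made up for illustration.
class ToyEnvironment:
    def __init__(self):
        self.temperature = 30          # the source of truth held by the environment

    def sense(self):                   # sensor: provides info about the environment
        return self.temperature

    def apply(self, action):           # actuator: changes the environment
        self.temperature += -1 if action == "cool" else 0

def agent(percept):                    # agent: decides based on what the sensors report
    return "cool" if percept > 25 else "idle"

env = ToyEnvironment()
for _ in range(10):
    env.apply(agent(env.sense()))
```
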
> [!WARNING]
> `Rationality` is not `Omniscience`.
>
> For a moment, think of trying to reach a big structure in a city that is always visible.
> The first thing you may think of is trying to walk towards this structure and turning only
> when you find a wall in front of you.
>
> However, one turn left and you find yourself stuck in a dead end that is **very** close
> to your `goal`. This doesn't mean that you were `Irrational` to begin with, but just that
> you didn't have enough information.
>
> If you apply this concept to our `agent`, this is what we refer to as `Rationality`.

### Morality of an agent

*Be careful what you ask for, because you may receive it* - W.W. Jacobs

An interesting example of this phenomenon is probably ***King Midas***.
Now I want to bring you another one from a videogame called ***SOMA***.

In this game, some robots were programmed with the goal of keeping humans alive.
So far, so good, you may say. The thing is that most humans kept alive by these
machines were suffering and asking for the sweet relief of death.

The problem is that the `Goal` of these machines did not include any
information about the **Wellness** of human beings, but only required them
to be alive, transforming a wish into a nightmare.

### Extra

#### Percept Sequence

This is the sequence of `Percepts` received until time $t_1$.

Let's say that we have $n$ possible `Percepts` for our `Agent`, then
a `percept-sequence` has $n^{t_1}$ possible sequences.

> [!WARNING]
> Notice that a `percept` may be repeated many times

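As a quick sanity check of this counting, the tiny snippet below (with made-up percept labels) enumerates every possible sequence for $n = 3$ and $t_1 = 4$:

```python
from itertools import product

# Tiny illustration of the counting argument above:
# with n possible percepts and t time steps there are n**t sequences.
percepts = ["A", "B", "C"]   # n = 3 (made-up percept labels)
t = 4

sequences = list(product(percepts, repeat=t))
assert len(sequences) == len(percepts) ** t   # 3**4 == 81
```
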
## Modelling an Agent

We usually use ***PEAS*** to represent our `task-environment`.

- **P**erformance Measure
- **E**nvironment
- **A**ctuators
- **S**ensors

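As a concrete example, here is one possible ***PEAS*** description of the classic vacuum-cleaner world, written as a small Python structure; the field values are just an illustration, not a fixed notation:

```python
from dataclasses import dataclass

# A hypothetical PEAS description of the classic vacuum-cleaner world.
# The field values are illustrative, not an official notation.
@dataclass
class PEAS:
    performance_measure: tuple
    environment: tuple
    actuators: tuple
    sensors: tuple

vacuum_peas = PEAS(
    performance_measure=("amount of dirt cleaned", "time taken"),
    environment=("two squares, A and B, each possibly dirty",),
    actuators=("move left", "move right", "suck"),
    sensors=("current location", "dirt sensor"),
)
```
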
### Task Environment

<!-- TODO: Add explanations -->

- Observability
  - Fully Observable
  - Partially Observable
  - Non-Observable
- Number of Agents
  - Single
  - Multiagent
    - Competitive
    - Cooperative
- Predictability
  - Deterministic
  - Non-Deterministic
  - Stochastic (when you know the probability of random events)
- Reactions
  - Episodic (do not depend on past actions)
  - Sequential (depend on past actions)
- State of Environment
  - Static
  - Dynamic
  - Semi-Dynamic
- Continuousness
  - Continuous
  - Discrete
- Rules of Environment
  - Known
  - Unknown

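To make these dimensions concrete, here is one possible classification of *chess with a clock* along these axes; the labels are just an illustration and other readings are possible:

```python
# One possible classification of "chess with a clock" along the axes above.
# The labels are illustrative; other reasonable classifications exist.
chess_with_clock = {
    "observability": "fully observable",
    "number_of_agents": "multiagent (competitive)",
    "predictability": "deterministic",
    "reactions": "sequential",
    "state_of_environment": "semi-dynamic",   # the clock keeps running
    "continuousness": "discrete",
    "rules": "known",
}
```
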
However, `Experiments` will often include more than one `Task-Environment`.

### Agent Architectures

<!-- TODO: Insert images -->

### Table Driven Agent

> [!WARNING]
> This is used for educational purposes only. It is
> not intended to be used in a real environment

This agent reacts to a table of `Perception` - `Action`
as described in this `pseudo-code`.

```python
# This is pseudo code in python
def percept_action(self, percept: Perception) -> Action:

    # Add the new perception to the sequence of perceptions
    self.perception_sequence.append(percept)

    # Look up the action for the whole perception sequence
    # (as a tuple, so it can be used as a dictionary key)
    action = self.percept_action_table[tuple(self.perception_sequence)]

    return action
```

However, this is very inefficient as it takes the whole **sequence of perceptions**, hence requiring the
`LookUp Table` to have $\sum_{t=1}^{T}|Perception|^{t}$ entries in `Memory`.

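For a sense of scale, with $|Perception| = 10$ possible percepts and a lifetime of just $T = 4$ time steps (numbers chosen only for illustration), the table already needs $10 + 100 + 1000 + 10000 = 11110$ entries.
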
### Reflex Agent

This agent reacts ***only*** to the newest `Perception`.

```python
def percept_action(self, percept: Perception) -> Action:
    # The state is derived from the newest percept only
    state = self.get_state(percept)
    return self.state_action_table[state]
```

This time we have only $|Perception|$ entries in our `LookUp Table`.
Of course our agent will be less `intelligent`, but more `reactive` and
`small`.

#### Randomization Problem

These `agents` are prone to infinite loops. In this case it is useful to have them
`randomize` their `Actions`[^randomizing-reflex-agents].

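In the same pseudo-code style, a minimal sketch of this idea; the `stuck()` check and the `possible_actions` list are hypothetical helpers, not from the book:

```python
import random

# Hypothetical escape hatch: when the reflex agent detects it is looping,
# it picks a random action instead of the one prescribed by the table.
def percept_action(self, percept: Perception) -> Action:
    state = self.get_state(percept)
    if self.stuck():  # e.g. the same few states keep repeating
        return random.choice(self.possible_actions)
    return self.state_action_table[state]
```
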
### Model Based Reflex Agent

These models are very useful in case of `partial-observability` as they keep
a sort of `internal-state` dependent on the `perception-history`, recovering some
info about the `non-observable-space`.

Now, differently from before, we need to keep track of:

- `internal-state`: Something the agent has
- How the `Environment` evolves independently of our `agent` | AKA ***Transition Model***
- How our `agent`'s actions will change the `Environment`
- How our `agent` perceives the world | AKA ***Sensor Model***

However, even though we may perceive the world around us, this perception can only
`partially` reconstruct the whole `Environment`, so we rely on our best `educated guesses`.

```python
def percept_action(self, percept: Perception) -> Action:

    # Get the new state based on the agent's internal info
    self.state = self.state_update(
        self.state,
        self.recent_action,
        percept,
        self.transition_model,
        self.sensor_model
    )

    return self.state_action_table[self.state]
```

Since we are dealing with `uncertainty`, our `agent` needs to be able to **take
decisions** even **when** it's **unsure** about the `Environment`.

### Goal Based Agent

This is useful when we need to achieve a particular situation, rather than ***blindly*** responding
to the stimuli provided by the `Environment`[^reflex-vs-goal-agents].

These `agents`, while requiring more resources as they need to take a *deliberate* decision,
will result in more **flexible** `models` that can adapt to more situations, like enabling
an *autopilot* to travel to multiple destinations, instead of `hardcoding` all the actions
in a `LookUp Table`.

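In the same pseudo-code style as before, a minimal sketch of the idea; the `predict` (one-step lookahead through the transition model), `goal_test`, `possible_actions`, and `default_action` helpers are hypothetical:

```python
# Hypothetical goal-based selection: instead of reading the action from a
# fixed table, pick an action whose predicted outcome satisfies the goal.
def percept_action(self, percept: Perception) -> Action:
    self.state = self.state_update(self.state, self.recent_action, percept,
                                   self.transition_model, self.sensor_model)

    for action in self.possible_actions:
        predicted_state = self.predict(self.state, action)  # one-step lookahead
        if self.goal_test(predicted_state):
            return action

    return self.default_action  # no single action reaches the goal from here
```
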
### Utility-Based Agents

However, just reaching the `goal` is probably not all we desire: the ***how*** is important
as well. For this, imagine the `goal` as the desired state and all of our ***secondary goals*** as
what happens on our way to the `goal`.

For example, in **SOMA** our `agents` needed to keep people alive, but as we could see, they did not
care about ***quality of life***. If they had used a `utility-based` `agent` that kept track of
***quality of life***, then probably they wouldn't have had such problems.

Moreover, a `utility-based` `agent` is capable of making tradeoffs on **incompatible goals**, or multiple
`goals` that may not be achievable with ***certainty***.

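A minimal sketch of how such a tradeoff could be made, again with hypothetical `predict`, `utility`, and `possible_actions` helpers: every candidate action is scored by the utility of its predicted outcome, and the best one wins.

```python
# Hypothetical utility-based selection: replace the binary goal test with a
# numeric score and take the action whose predicted outcome scores highest.
def percept_action(self, percept: Perception) -> Action:
    self.state = self.state_update(self.state, self.recent_action, percept,
                                   self.transition_model, self.sensor_model)

    return max(
        self.possible_actions,
        key=lambda action: self.utility(self.predict(self.state, action)),
    )
```
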
> [!CAUTION]
> These agents maximize the output of the `Utility Function`, thus *beware of what you wish for*

### Learning Agents

A learning agent is divided into 4 parts and can be of any of the previous categories.

#### Learning Element

Responsible for ***making improvements***

#### Performance Element

Responsible for ***taking external actions***, which is basically what we described until now
as our `agent`

#### Critic Element

Gives our performance evaluation to the ***[Learning element](#learning-element)***.
This element is not something that the `agent` can *learn* or *modify*

#### Problem Generator

Responsible for giving our `agent` stimuli that lead to ***informative and new experiences***, so
that our ***[Performance Element](#performance-element)*** doesn't dominate our behaviour by
***taking only the actions that are best according to its own knowledge***

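One possible way to picture how these four parts fit together is the skeleton below; the class and method names (`evaluate`, `improve`, `suggest`, `act`) are made up for illustration:

```python
# Illustrative skeleton of a learning agent; the names are not from the book.
class LearningAgent:
    def __init__(self, performance_element, learning_element, critic, problem_generator):
        self.performance_element = performance_element  # takes external actions (the "agent" so far)
        self.learning_element = learning_element        # makes improvements to the performance element
        self.critic = critic                            # fixed evaluation of how well we are doing
        self.problem_generator = problem_generator      # proposes new, informative experiences

    def percept_action(self, percept):
        feedback = self.critic.evaluate(percept)
        self.learning_element.improve(self.performance_element, feedback)
        # Occasionally explore what the problem generator suggests,
        # otherwise exploit the performance element's current knowledge
        exploratory = self.problem_generator.suggest(percept)
        return exploratory if exploratory is not None else self.performance_element.act(percept)
```
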
### States of an Agent[^states-of-an-agent]

They are usually one of these 3:

- #### Atomic

  These states have no other variables and are indivisible

- #### Factored

  States are represented by a vector and each transition makes these values change.

  So, each state may have something in common
  with the next one

- #### Structured

  States are objects that may contain other objects, as well as relationships between objects

Moreover, these representations can be `localist`, where each concept maps to a single memory location, or `distributed`, where a concept is spread across many of them.

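As a rough illustration (the traffic-light domain is made up), the same world state could be encoded in each of the three ways:

```python
# The same (made-up) traffic-light world state in the three representations.

# Atomic: the state is a single indivisible label.
atomic_state = "intersection_blocked"

# Factored: the state is a vector of variables that change between transitions.
factored_state = {"light": "red", "cars_waiting": 4, "pedestrians": 2}

# Structured: the state contains objects and relationships between them.
structured_state = {
    "objects": ["car_1", "car_2", "traffic_light"],
    "relations": [("car_1", "behind", "car_2"),
                  ("traffic_light", "controls", "car_1")],
}
```
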
<!-- Footnotes-->
[^randomizing-reflex-agents]: Artificial Intelligence: A Modern Approach, 4th edition | Ch. 2, pg. 69
[^reflex-vs-goal-agents]: Artificial Intelligence: A Modern Approach, 4th edition | Ch. 2, pg. 71
[^states-of-an-agent]: Artificial Intelligence: A Modern Approach, 4th edition | Ch. 2, pg. 77