# Intelligent Agents

## Task Environment

### Performance

How well the job is done

### Environment

Something we must accept as it is

### Actuators

Something we can use to change our Environment

### Sensors

Something we can use to perceive our Environment
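
For instance, a simple vacuum-cleaner agent could have: Performance = amount of dirt cleaned, Environment = the rooms to be cleaned, Actuators = wheels and suction, Sensors = dirt and location sensors.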
## Properties of a Task Environment
- Observability
  - Fully Observable: Sensors gather all info
  - Partially Observable: Sensors gather info, but some is unavailable
  - Unobservable: No Sensors at all
- Number of Agents
  - Multi-Agent: Other Agents (even people may be considered agents) live in the environment

    > [!NOTE]
    > In order to be considered an agent, the other entity must maximize an objective
    > that depends on our agent's behaviour

    - Cooperative: All agents try to maximize the same objective
    - Competitive: An agent's objective can be maximized by penalizing the other agents' objectives
  - Single-Agent: Only one agent exists
- Predictability
  - Deterministic: We can predict everything
  - Stochastic: We can predict outcomes according to some probabilities
  - Nondeterministic: We cannot predict the outcomes, nor their probabilities
- Memory dependence
  - Episodic: Each stimulus-action pair is independent of previous actions
  - Sequential: Current actions may influence future ones, so we need to keep memory
- Staticity
  - Static: Environment does not change **while our agent is deciding**
  - Dynamic: Environment changes **while our agent is deliberating**
  - Semi-Dynamic: Environment is static, but the agent's performance changes with time
- Continuity **(Applies to States)**
  - Continuous: State has continuous elements
  - Discrete: State has no continuous elements
- Knowledge

  > [!CAUTION]
  > It is not influenced by Observability, as this refers to the **outcomes
  > of actions** and **not to the state of the agent**

  - Known: Each rule is known a priori (known outcomes)
  - Unknown: The agent must discover the environment's rules (unknown outcomes)
### Environment classes

According to these properties, we can define a class of environments on which we can
test our agents.
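
For example, chess with a clock is typically classified as fully observable, multi-agent, deterministic, sequential, semi-dynamic and discrete, while taxi driving is partially observable, multi-agent, stochastic, sequential, dynamic and continuous (classifications as given in Artificial Intelligence: A Modern Approach).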
### Architecture

The actual *hardware* (actuators) where our program will run, like a robot or
a PC.
## Agent Programs
> [!NOTE]
> All programs have the same pseudo code:
>
> ```python
> def agent(percept) -> Action:
> ```
### Table Driven Agent
It basically stores a reaction to every possible sequence of stimuli up to time
$T$, thus a table of $\sum_{t=1}^{T}|\mathcal{S}|^{t}$ entries, which quickly becomes enormous.
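
As a quick sanity check on the growth: with only $|\mathcal{S}| = 10$ possible stimuli and a lifetime of $T = 3$ steps, the table already needs $10 + 100 + 1000 = 1110$ entries; at $T = 6$ it needs over a million.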

> [!TIP]
> It is actually complete and, given the right table, makes us react in the best possible way

> [!CAUTION]
> It is very memory consuming, so it is suitable only for small problems
```python
class TableDrivenAgent(Agent):

    def __init__(
        self,
        # Keys are full percept sequences (tuples, so they are hashable)
        action_table: dict[tuple[Percept, ...], Action]
    ):
        self.__action_table = action_table
        self.__percept_sequence: list[Percept] = []

    def agent(self, percept: Percept) -> Action:
        # Remember the new percept...
        self.__percept_sequence.append(percept)

        # ...and look up the action for the whole sequence seen so far
        return self.__action_table.get(
            tuple(self.__percept_sequence)
        )
```
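
A hypothetical usage sketch, assuming `Percept` is a `(location, status)` tuple and `Action` a plain string (names chosen only for illustration):

```python
# Tiny table for a two-cell vacuum world (illustrative only)
table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}

vacuum = TableDrivenAgent(table)
print(vacuum.agent(("A", "Clean")))  # -> "Right"
print(vacuum.agent(("B", "Dirty")))  # -> "Suck"
```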
### Reflex Agent
Acts only based on the stimuli at that very time $t$.

> [!TIP]
> It only reacts to stimuli based on the current state, so it's smaller and very fast

> [!CAUTION]
> It is very limited in its capabilities and **requires a fully observable environment**
```python
class ReflexAgent(Agent):

    def __init__(
        self,
        rules: list[Rule],  # It depends on how the actual functions are implemented
    ):
        self.__rules = rules

    def agent(self, percept: Percept) -> Action:
        state = self.__get_state(percept)
        rule = self.__get_rule(state)
        return rule.action

    # MUST BE IMPLEMENTED (interprets the raw percept)
    def __get_state(self, percept: Percept) -> State:
        pass

    # MUST BE IMPLEMENTED (it matches the state against our rules)
    def __get_rule(self, state: State) -> Rule:
        pass
```
### Model-Based Reflex Agent
This agent has 2 models that help it keep an internal representation of the world:

- **Transition Model**:\
  Knowledge of *how the world works* (what my actions do and how the world evolves on its own)
- **Sensor Model**:\
  Knowledge of *how the world is reflected in the agent's percepts*

> [!TIP]
> It only reacts to stimuli based on the current state, but it is also capable of predicting the
> next state, thus having some info on unobserved states

> [!CAUTION]
> It is more complicated to code and slightly worse in terms of raw speed. While it is
> more flexible, it is still somewhat limited
```python
class ModelBasedReflexAgent(Agent):

    def __init__(
        self,
        initial_state: State,
        transition_model: TransitionModel,  # Implementation dependent
        sensor_model: SensorModel,          # Implementation dependent
        rules: Rules                        # Implementation dependent
    ):
        self.__current_state = initial_state
        self.__transition_model = transition_model
        self.__sensor_model = sensor_model
        self.__rules = rules
        self.__last_action: Action = None

    def agent(self, percept: Percept) -> Action:
        self.__update_state(percept)
        rule = self.__get_rule()
        # Remember the chosen action: it is needed to update the state next time
        self.__last_action = rule.action
        return rule.action

    # MUST BE IMPLEMENTED
    def __update_state(self, percept: Percept) -> None:
        """
        Uses:
        - percept
        - self.__current_state
        - self.__last_action
        - self.__transition_model
        - self.__sensor_model
        """
        # Do something to estimate the new state, then store it
        self.__current_state = state

    # MUST BE IMPLEMENTED (it uses our rules)
    def __get_rule(self) -> Rule:
        """
        Uses:
        - self.__current_state
        - self.__rules
        """
        # Do something to match the current state against the rules
        return rule
```
### Goal-Based Agent
It's an agent that **has an internal state representation**, **can predict the next state** and **chooses an action that satisfies its goals**.

> [!TIP]
> It is very flexible and needs less hardcoded info compared to a reflex-based approach, allowing a rapid
> change of goals without reprogramming the agent

> [!CAUTION]
> It is more computationally expensive; moreover, it requires a strategy (typically search or planning) to choose the best action
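
A minimal sketch in the same style as the previous agents, assuming hypothetical `TransitionModel.predict` and `Goal.is_satisfied` helpers (not defined in these notes):

```python
class GoalBasedAgent(Agent):

    def __init__(
        self,
        initial_state: State,
        transition_model: TransitionModel,  # Implementation dependent
        goals: list[Goal],                  # Implementation dependent
        actions: list[Action],
    ):
        self.__current_state = initial_state
        self.__transition_model = transition_model
        self.__goals = goals
        self.__actions = actions

    def agent(self, percept: Percept) -> Action:
        self.__update_state(percept)  # As in the model-based reflex agent

        # Pick an action whose predicted outcome satisfies every goal
        for action in self.__actions:
            predicted = self.__transition_model.predict(self.__current_state, action)
            if all(goal.is_satisfied(predicted) for goal in self.__goals):
                return action
        return None
```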
### Utility-Based Agent
It's an agent that **acts to maximize its expected utility**, useful when **goals are incompatible** and there's **uncertainty
about how to achieve a goal**.

> [!TIP]
> Useful when we have many goals with different importance and when we need to balance incompatible ones

> [!CAUTION]
> It adds another layer of computation and, since it chooses based on **estimations**, it could be wrong

> [!NOTE]
> Not every goal-based agent has a model to guide it
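
A minimal sketch, assuming a hypothetical `TransitionModel.outcomes` that returns `(next_state, probability)` pairs and a `UtilityFunction` that maps a state to a real number (both names are illustrative):

```python
class UtilityBasedAgent(Agent):

    def __init__(
        self,
        initial_state: State,
        transition_model: TransitionModel,  # Implementation dependent
        utility: UtilityFunction,           # State -> float
        actions: list[Action],
    ):
        self.__current_state = initial_state
        self.__transition_model = transition_model
        self.__utility = utility
        self.__actions = actions

    def agent(self, percept: Percept) -> Action:
        self.__update_state(percept)  # As in the model-based reflex agent

        # Choose the action with the highest expected utility over its possible outcomes
        return max(
            self.__actions,
            key=lambda action: sum(
                probability * self.__utility(next_state)
                for next_state, probability in self.__transition_model.outcomes(
                    self.__current_state, action
                )
            ),
        )
```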
## Learning Agent
Each agent may be a Learning Agent, and it is composed of (see the sketch after this list):

- **Learning Element**: Responsible for improvements
- **Performance Element**: The entire agent, which takes actions
- **Critic**: Gives feedback to the Learning Element on how to improve the agent
- **Problem Generator**: Suggests new actions to promote exploration of possibilities, avoiding always repeating the **best known action**
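
A minimal sketch of how the four elements could be wired together, assuming hypothetical `Critic.evaluate`, `LearningElement.improve` and `ProblemGenerator.suggest` helpers (illustrative names only):

```python
class LearningAgent(Agent):

    def __init__(
        self,
        performance_element: Agent,           # The acting agent being improved
        learning_element: LearningElement,    # Implementation dependent
        critic: Critic,                       # Implementation dependent
        problem_generator: ProblemGenerator,  # Implementation dependent
    ):
        self.__performance_element = performance_element
        self.__learning_element = learning_element
        self.__critic = critic
        self.__problem_generator = problem_generator

    def agent(self, percept: Percept) -> Action:
        # The critic judges how well the agent is doing with respect to the performance standard
        feedback = self.__critic.evaluate(percept)
        # The learning element uses the feedback to improve the performance element
        self.__learning_element.improve(self.__performance_element, feedback)

        # Occasionally explore a new action instead of repeating the best known one
        exploratory_action = self.__problem_generator.suggest(percept)
        if exploratory_action is not None:
            return exploratory_action

        return self.__performance_element.agent(percept)
```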
## States Representation
> [!NOTE]
> Read pp. 76-78 of Artificial Intelligence: A Modern Approach