# Intelligent Agents

## Task Environment

### Performance

How well the job is done

### Environment

Something we must accept as it is

### Actuators

Something we can use to change our Environment

### Sensors

Something we can use to perceive our Environment
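
For instance, a simple vacuum-cleaner agent could have: Performance = amount of dirt cleaned, Environment = the rooms to be cleaned, Actuators = wheels and suction, Sensors = dirt and location sensors.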
## Properties of a Task Environment
- Observability
  - Fully Observable: Sensors gather all info
  - Partially Observable: Sensors gather info, but some is unavailable
  - Unobservable: No Sensors at all
- Number of Agents
  - Multi-Agent: Other Agents (even people may be considered agents) live in the environment

    > [!NOTE]
    > In order to be considered an agent, the other entity must maximize an objective
    > that depends on our agent's behaviour

    - Cooperative: All agents try to maximize the same objective
    - Competitive: An agent's objective can be maximized by penalizing the other agents' objectives
  - Single-Agent: Only one agent exists
- Predictability
  - Deterministic: We can predict everything
  - Stochastic: We can predict outcomes according to some probabilities
  - Nondeterministic: We cannot predict the outcomes, nor their probabilities
- Memory dependence
  - Episodic: Each stimulus-action pair is independent of previous actions
  - Sequential: Current actions may influence future ones, so we need to keep memory
- Staticity
  - Static: Environment does not change **while our agent is deciding**
  - Dynamic: Environment changes **while our agent is deliberating**
  - Semi-Dynamic: Environment is static, but the agent's performance changes with time
- Continuity **(Applies to States)**
  - Continuous: State has continuous elements
  - Discrete: State has no continuous elements
- Knowledge

  > [!CAUTION]
  > It is not influenced by Observability, as this refers to the **outcomes
  > of actions** and **not to the state of the agent**

  - Known: Each rule is known a priori (known outcomes)
  - Unknown: The agent must discover the environment's rules (unknown outcomes)
### Environment classes

According to these properties, we can define a class of environments on which we can
test our agents.
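
For example, chess with a clock is typically classified as fully observable, multi-agent, deterministic, sequential, semi-dynamic and discrete, while taxi driving is partially observable, multi-agent, stochastic, sequential, dynamic and continuous (classifications as given in Artificial Intelligence: A Modern Approach).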
### Architecture

The actual *hardware* (actuators) where our program will run, like a robot or
a PC.
## Agent Programs
> [!NOTE]
> All programs have the same pseudo code:
>
> ```python
> def agent(percept) -> Action:
> ```
### Table Driven Agent
It basically stores a reaction to every possible sequence of stimuli up to time
$T$, thus a table of $\sum_{t=1}^{T}|\mathcal{S}|^{t}$ entries, which quickly becomes enormous.
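
As a quick sanity check on the growth: with only $|\mathcal{S}| = 10$ possible stimuli and a lifetime of $T = 3$ steps, the table already needs $10 + 100 + 1000 = 1110$ entries; at $T = 6$ it needs over a million.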

> [!TIP]
> It is actually complete and, given the right table, makes us react in the best possible way

> [!CAUTION]
> It is very memory consuming, so it is suitable only for small problems
```python
class TableDrivenAgent(Agent):

    def __init__(
        self,
        # Keys are full percept sequences (tuples, so they are hashable)
        action_table: dict[tuple[Percept, ...], Action]
    ):
        self.__action_table = action_table
        self.__percept_sequence: list[Percept] = []

    def agent(self, percept: Percept) -> Action:
        # Remember the new percept...
        self.__percept_sequence.append(percept)

        # ...and look up the action for the whole sequence seen so far
        return self.__action_table.get(
            tuple(self.__percept_sequence)
        )
```
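
A hypothetical usage sketch, assuming `Percept` is a `(location, status)` tuple and `Action` a plain string (names chosen only for illustration):

```python
# Tiny table for a two-cell vacuum world (illustrative only)
table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}

vacuum = TableDrivenAgent(table)
print(vacuum.agent(("A", "Clean")))  # -> "Right"
print(vacuum.agent(("B", "Dirty")))  # -> "Suck"
```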
### Reflex Agent
Acts only based on the stimuli at that very time $t$.

> [!TIP]
> It only reacts to stimuli based on the current state, so it's smaller and very fast

> [!CAUTION]
> It is very limited in its capabilities and **requires a fully observable environment**
```python
class ReflexAgent(Agent):

    def __init__(
        self,
        rules: list[Rule],  # It depends on how the actual functions are implemented
    ):
        self.__rules = rules

    def agent(self, percept: Percept) -> Action:
        state = self.__get_state(percept)
        rule = self.__get_rule(state)
        return rule.action

    # MUST BE IMPLEMENTED (interprets the raw percept)
    def __get_state(self, percept: Percept) -> State:
        pass

    # MUST BE IMPLEMENTED (it matches the state against our rules)
    def __get_rule(self, state: State) -> Rule:
        pass
```
### Model-Based Reflex Agent
This agent has 2 models that help it keep an internal representation of the world:

- **Transition Model**:\
  Knowledge of *how the world works* (what my actions do and how the world evolves on its own)
- **Sensor Model**:\
  Knowledge of *how the world is reflected in the agent's percepts*

> [!TIP]
> It only reacts to stimuli based on the current state, but it is also capable of predicting the
> next state, thus having some info on unobserved states

> [!CAUTION]
> It is more complicated to code and slightly worse in terms of raw speed. While it is
> more flexible, it is still somewhat limited
```python
class ModelBasedReflexAgent(Agent):

    def __init__(
        self,
        initial_state: State,
        transition_model: TransitionModel,  # Implementation dependent
        sensor_model: SensorModel,          # Implementation dependent
        rules: Rules                        # Implementation dependent
    ):
        self.__current_state = initial_state
        self.__transition_model = transition_model
        self.__sensor_model = sensor_model
        self.__rules = rules
        self.__last_action: Action = None

    def agent(self, percept: Percept) -> Action:
        self.__update_state(percept)
        rule = self.__get_rule()
        # Remember the chosen action: it is needed to update the state next time
        self.__last_action = rule.action
        return rule.action

    # MUST BE IMPLEMENTED
    def __update_state(self, percept: Percept) -> None:
        """
        Uses:
        - percept
        - self.__current_state
        - self.__last_action
        - self.__transition_model
        - self.__sensor_model
        """
        # Do something to estimate the new state, then store it
        self.__current_state = state

    # MUST BE IMPLEMENTED (it uses our rules)
    def __get_rule(self) -> Rule:
        """
        Uses:
        - self.__current_state
        - self.__rules
        """
        # Do something to match the current state against the rules
        return rule
```
### Goal-Based Agent
It's an agent that **has an internal state representation**, **can predict the next state** and **chooses an action that satisfies its goals**.

> [!TIP]
> It is very flexible and needs less hardcoded info compared to a reflex-based approach, allowing a rapid
> change of goals without reprogramming the agent

> [!CAUTION]
> It is more computationally expensive; moreover, it requires a strategy (typically search or planning) to choose the best action
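
A minimal sketch in the same style as the previous agents, assuming hypothetical `TransitionModel.predict` and `Goal.is_satisfied` helpers (not defined in these notes):

```python
class GoalBasedAgent(Agent):

    def __init__(
        self,
        initial_state: State,
        transition_model: TransitionModel,  # Implementation dependent
        goals: list[Goal],                  # Implementation dependent
        actions: list[Action],
    ):
        self.__current_state = initial_state
        self.__transition_model = transition_model
        self.__goals = goals
        self.__actions = actions

    def agent(self, percept: Percept) -> Action:
        self.__update_state(percept)  # As in the model-based reflex agent

        # Pick an action whose predicted outcome satisfies every goal
        for action in self.__actions:
            predicted = self.__transition_model.predict(self.__current_state, action)
            if all(goal.is_satisfied(predicted) for goal in self.__goals):
                return action
        return None
```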
### Utility-Based Agent
It's an agent that **acts to maximize its expected utility**, useful when **goals are incompatible** and there's **uncertainty
about how to achieve a goal**.

> [!TIP]
> Useful when we have many goals with different importance and when we need to balance incompatible ones

> [!CAUTION]
> It adds another layer of computation and, since it chooses based on **estimations**, it could be wrong

> [!NOTE]
> Not every goal-based agent has a model to guide it
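
A minimal sketch, assuming a hypothetical `TransitionModel.outcomes` that returns `(next_state, probability)` pairs and a `UtilityFunction` that maps a state to a real number (both names are illustrative):

```python
class UtilityBasedAgent(Agent):

    def __init__(
        self,
        initial_state: State,
        transition_model: TransitionModel,  # Implementation dependent
        utility: UtilityFunction,           # State -> float
        actions: list[Action],
    ):
        self.__current_state = initial_state
        self.__transition_model = transition_model
        self.__utility = utility
        self.__actions = actions

    def agent(self, percept: Percept) -> Action:
        self.__update_state(percept)  # As in the model-based reflex agent

        # Choose the action with the highest expected utility over its possible outcomes
        return max(
            self.__actions,
            key=lambda action: sum(
                probability * self.__utility(next_state)
                for next_state, probability in self.__transition_model.outcomes(
                    self.__current_state, action
                )
            ),
        )
```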
## Learning Agent
Each agent may be a Learning Agent, and it is composed of (see the sketch after this list):

- **Learning Element**: Responsible for improvements
- **Performance Element**: The entire agent, which takes actions
- **Critic**: Gives feedback to the Learning Element on how to improve the agent
- **Problem Generator**: Suggests new actions to promote exploration of possibilities, avoiding always repeating the **best known action**
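
A minimal sketch of how the four elements could be wired together, assuming hypothetical `Critic.evaluate`, `LearningElement.improve` and `ProblemGenerator.suggest` helpers (illustrative names only):

```python
class LearningAgent(Agent):

    def __init__(
        self,
        performance_element: Agent,           # The acting agent being improved
        learning_element: LearningElement,    # Implementation dependent
        critic: Critic,                       # Implementation dependent
        problem_generator: ProblemGenerator,  # Implementation dependent
    ):
        self.__performance_element = performance_element
        self.__learning_element = learning_element
        self.__critic = critic
        self.__problem_generator = problem_generator

    def agent(self, percept: Percept) -> Action:
        # The critic judges how well the agent is doing with respect to the performance standard
        feedback = self.__critic.evaluate(percept)
        # The learning element uses the feedback to improve the performance element
        self.__learning_element.improve(self.__performance_element, feedback)

        # Occasionally explore a new action instead of repeating the best known one
        exploratory_action = self.__problem_generator.suggest(percept)
        if exploratory_action is not None:
            return exploratory_action

        return self.__performance_element.agent(percept)
```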
## States Representation
> [!NOTE]
> Read pp. 76-78 of Artificial Intelligence: A Modern Approach