# Intelligent Agents

## What is an Agent

An `agent` is something that acts in an `environment` through the help of `actuators` and takes its decisions by evaluating the information gathered by `sensors`.

- `Sensor`: Provides info by analyzing the `Environment`
- `Actuator`: Changes the `Environment`
- `Environment`: Holds the source of truth of the world the `agent` acts in
- `Agent`: Anything that can change the `Environment` through some `Decisions`

The ultimate `goal` for us is to have a `rational agent` which takes `rational` decisions to reach a `goal`.

> [!WARNING]
> `Rationality` is not `Omniscience`.
>
> For a moment, think of trying to reach a big structure in a city, one that is always visible.
> The first thing you may think of is walking towards this structure and turning only
> when you find a wall in front of you.
>
> However, one left turn and you find yourself stuck in a dead end that is **very** close
> to your `goal`. This doesn't mean that you were `Irrational` to begin with, just that
> you didn't have enough information.
>
> If you apply this concept to our `agent`, this is what we refer to as `Rationality`

### Morality of an agent

*Be careful what you ask for, because you may receive it* - W.W. Jacobs

An interesting example of this phenomenon is probably ***King Midas***. Now I want to bring you another one from a videogame called ***SOMA***. In this game, some robots were programmed with the goal of keeping humans alive. So far, so good, you may say; the problem is that most humans kept alive by these machines were suffering and asking for the sweet relief of death. The `Goal` of these machines did not include any information about the **Wellness** of human beings, only that they stay alive, transforming a wish into a nightmare.

### Extra

#### Percept Sequence

This is the sequence of `Perceptions` received up to time $t_1$. Let's say that our `Agent` can receive $n$ possible `Perceptions` at each step; then a `percept-sequence` has $n^{t_1}$ possible values.

> [!WARNING]
> Notice that a `perception` may be repeated many times

## Modelling an Agent

We usually use ***PEAS*** to represent our `task-environment`.

- **P**erformance Measure
- **E**nvironment
- **A**ctuators
- **S**ensors

### Task Environment

- Observability
  - Fully Observable
  - Partially Observable
  - Non-Observable
- Number of Agents
  - Single
  - Multiagent
    - Competitive
    - Cooperative
- Predictability
  - Deterministic
  - Non-Deterministic
  - Stochastic (when you know the probability of random events)
- Reactions
  - Episodic (do not depend on past actions)
  - Sequential (depend on past actions)
- State of Environment
  - Static
  - Dynamic
  - Semi-Dynamic
- Continuity
  - Continuous
  - Discrete
- Rules of Environment
  - Known
  - Unknown

However, `Experiments` will often include more than one `Task-Environment`

### Agent Architectures

### Table Driven Agent

> [!WARNING]
> This is used for educational purposes only. It is
> not intended to be used in a real environment

This agent reacts according to a table of `Perception` - `Action` pairs, as described in this `pseudo-code`.

```python
# This is pseudo-code in Python
def percept_action(self, percept: Perception) -> Action:
    # Add the perception to the sequence of
    # perceptions seen so far
    self.perception_sequence.append(percept)
    # Look up the whole sequence in the table
    action = self.percept_action_table[tuple(self.perception_sequence)]
    return action
```

However, this is very inefficient as it keys on the whole **sequence of perceptions**, hence requiring the `LookUp Table` to have $\sum_{t=1}^{T}|Perception|^{t}$ entries in `Memory`.
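To get a feel for how quickly this blows up, here is a minimal sketch of that sum (the numbers are made up, purely to show the growth rate):

```python
# Minimal sketch: how many entries the table-driven agent's
# LookUp Table needs, i.e. the sum of |Perception|^t for t = 1..T.
def table_entries(num_percepts: int, horizon: int) -> int:
    return sum(num_percepts ** t for t in range(1, horizon + 1))

# Hypothetical example: 10 distinct percepts over 60 time steps
print(table_entries(10, 60))  # ~1.1e60 entries, already intractable
```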
### Reflex Agent

This agent reacts ***only*** to the newest `Perception`

```python
def percept_action(self, percept: Perception) -> Action:
    # The state depends only on the newest perception
    state = self.get_new_state(percept)
    return self.state_action_table[state]
```

This time we have only $|Perception|$ entries in our `LookUp Table`. Of course, our agent will be less `intelligent`, but more `reactive` and `smaller`

#### Randomization Problem

These `agents` are prone to infinite loops. In this case it is useful to have them `randomize` their `Actions`[^randomizing-reflex-agents]

### Model Based Reflex Agent

These models are very useful in case of `partial-observability` as they keep a sort of `internal-state` dependent on the `perception-history`, recovering some info about the `non-observable-space`. Now, differently from before, we need to keep track of:

- `internal-state`: Something the agent keeps internally
- How the `Environment` evolves independently of our `agent` | AKA ***Transition Model***
- How our `agent`'s actions will change the `Environment`
- How our `agent` perceives the world | AKA ***Sensor Model***

However, even though we may perceive the world around us, this perception can only `partially` reconstruct the whole `Environment`, leaving us with our best `educated guesses`

```python
def percept_action(self, percept: Perception) -> Action:
    # Get the new state based on what the agent knows
    self.state = self.state_update(
        self.state,
        self.recent_action,
        percept,
        self.transition_model,
        self.sensor_model,
    )
    return self.state_action_table[self.state]
```

Since we are dealing with `uncertainty`, our `agent` needs to be able to **take decisions** even **when** it is **unsure** about the `Environment`

### Goal Based Agent

This is useful when we need to achieve a particular situation, rather than ***blindly*** responding to the stimuli provided by the `Environment`[^reflex-vs-goal-agents]. These `agents`, while requiring more resources as they need to take a more *deliberate* decision, will result in more **flexible** `models` that can adapt to more situations, like enabling an *autopilot* to travel to multiple destinations, instead of `hardcoding` all the actions in a `LookUp Table`

### Utility-Based Agents

However, just reaching the `goal` is probably not all we desire: the ***how*** is important as well. For this, imagine the `goal` as the desired state and all of our ***secondary goals*** as what happens along our way to the `goal`. For example, in **SOMA** our `agents` needed to keep people alive, but as we could see, they did not care about ***quality of life***. If they had used a `utility-based` `agent` that kept track of ***quality of life***, then probably they wouldn't have had such problems.

Moreover, a `utility-based` `agent` is capable of making tradeoffs between **incompatible goals**, or between multiple `goals` that may not be achievable with ***certainty***.

> [!CAUTION]
> These agents maximize the output of the `Utility Function`, thus *beware of what you wish for*
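As a rough sketch of the idea (the function and field names below are hypothetical, not taken from the book), a `utility-based` `agent` can pick the action whose expected outcome has the highest utility, which is exactly where the tradeoffs between uncertain or incompatible `goals` happen:

```python
# Sketch of utility-based action selection (hypothetical names).
# transition_model(state, action) yields (next_state, probability) pairs,
# utility(state) scores how desirable a state is.
def choose_action(state, actions, transition_model, utility):
    def expected_utility(action):
        return sum(
            prob * utility(next_state)
            for next_state, prob in transition_model(state, action)
        )
    # Maximize expected utility over the available actions
    return max(actions, key=expected_utility)
```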
### Learning Agents

It is divided into 4 parts and can be of any of the previous categories

#### Learning Element

Responsible for ***making improvements***

#### Performance Element

Responsible for ***taking external actions***, which is basically what we have described until now as our `agent`

#### Critic Element

Gives the ***[Learning Element](#learning-element)*** our performance evaluation.
This element is not something that the `agent` can *learn* or *modify*

#### Problem Generator

Responsible for giving our `agent` stimuli that lead to ***informative and new experiences***, so that our ***[Performance Element](#performance-element)*** doesn't dominate our behaviour, thus ***making only actions that are the best according to its own knowledge***

### States of an Agent[^states-of-an-agent]

They usually are one of these 3:

- #### Atomic
  These states have no other variables and are indivisible
- #### Factored
  States are represented by a vector and each transition makes these values change. So, each state may have something in common with the next one
- #### Structured
  States are objects that may contain other objects or relationships between objects

Moreover, these states can be `localist` or `distributed`, depending on whether they live on a single machine or across distributed machines. A small toy sketch of the three representations follows after the footnotes.

[^randomizing-reflex-agents]: Artificial Intelligence: A Modern Approach, 4th edition | Ch. 2, pg. 69
[^reflex-vs-goal-agents]: Artificial Intelligence: A Modern Approach, 4th edition | Ch. 2, pg. 71
[^states-of-an-agent]: Artificial Intelligence: A Modern Approach, 4th edition | Ch. 2, pg. 77
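As a purely illustrative sketch (a toy domain of my own, not from the book) of the three kinds of state described above:

```python
# Toy examples of the three kinds of state (hypothetical domain).

# Atomic: the state is an opaque label; nothing inside it is visible.
atomic_state = "city_B"

# Factored: a fixed set of variables; successive states usually share
# most of their values, only a few change at each transition.
factored_state = {"x": 4.0, "y": 2.5, "fuel": 0.7, "door_open": False}

# Structured: objects plus the relationships between them.
structured_state = {
    "objects": {"truck": {"pos": (4, 2)}, "crate": {"pos": (4, 2)}},
    "relations": [("on", "crate", "truck")],
}
```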