# Search in Complex Environments

## Local Search

Unlike path search, local search algorithms do not store the steps taken: we only need the **final state**, not the path that led to it.

This comes at the cost of not being systematic, so they do not explore all of the potential space; in exchange they have the upper hand on memory and computation in infinite or very large spaces.

> [!TIP]
> These are good when we only care about the final state, not the process of achieving it.
> Moreover, they use little memory and can be used for large spaces where conventional search fails.

> [!CAUTION]
> They are not systematic and may never explore the region of the space where the solution resides.

## Optimization problems

We want to find a point of the objective function with a **reasonably high value** (or low, for minimization).
Usually the landscape of our function has:

- Local maxima
- Ridges: places with many local maxima close to each other but not connected to each other (difficult to navigate with greedy search)
- Plateaus: places where the difference in height is minimal or nonexistent

We have some algorithms and strategies (a sketch of random-restart hill climbing follows this list):

- Stochastic hill climbing: choose an uphill point with a probability proportional to its height
- First-choice hill climbing: generate successor nodes randomly until one is better than the current state (an implementation of stochastic hill climbing)
- Random-restart hill climbing: if you fail, try again from a new random initial state

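A minimal sketch of plain hill climbing plus the random-restart wrapper, assuming a hypothetical `problem.random_state()` helper and the `state.children` / `state.value` interface used by the other snippets in these notes:

```python
def hill_climbing(problem) -> State:
    # Greedy ascent: move to the best successor until nothing improves
    current = problem.random_state()
    while True:
        best = max(current.children, key=lambda s: s.value)
        if best.value <= current.value:
            return current
        current = best

def random_restart_hill_climbing(problem, max_restarts: int) -> State:
    # Run hill climbing from fresh random states and keep the best result found
    best = hill_climbing(problem)
    for _ in range(max_restarts):
        candidate = hill_climbing(problem)
        if candidate.value > best.value:
            best = candidate
    return best
```
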
> [!NOTE]
> Computing the expected number of steps of a random restart:
>
> - $p$: probability of success of the algorithm without restart
> - $S$: number of steps of a **successful** iteration
> - $s$: number of steps of an **unsuccessful** iteration
> - $\frac{1}{p}$: expected number of iterations before finding a solution (hence $\frac{1-p}{p}$ expected failures)
>
> $\text{Expected Steps} = 1 \cdot S + \frac{1-p}{p} \cdot s$

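As a purely illustrative example (the numbers are made up just for the arithmetic): with $p = 0.25$, $S = 20$ and $s = 50$ we expect $\frac{1}{p} = 4$ iterations, i.e. $3$ failures, so

$$\text{Expected Steps} = 20 + \frac{0.75}{0.25} \cdot 50 = 20 + 150 = 170$$
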
### Simulated Annealing

We search for our objective as in hill climbing, but we keep a parameter called **Temperature**: the higher it is, the more random the choices the algorithm is allowed to take.

At each step a random successor is picked; if it is not better than the current state, it is still accepted with a probability given by the **Boltzmann distribution** ($e^{\frac{\Delta E}{T}}$), where $\Delta E$ is the (non-positive) change in value of the move.

```python
import math
import random
import time

def simulated_annealing(problem, schedule) -> State:

    # Take initial state, start time and initial temperature
    current = problem.initial_state
    start_time = time.time()
    T = schedule(0)

    # While temperature is higher than 0
    while T > 0:

        # Here I pick a random successor, but this can
        # be implemented in other ways
        successors: list[State] = current.children
        candidate = random.choice(successors)

        # Compute the fitness of the move
        delta_energy = candidate.value - current.value

        if delta_energy > 0:
            current = candidate
        else:
            # Since delta_energy is non-positive, the probability is at most 1;
            # when T -> 0, e^(delta_energy / T) -> 0
            boltzmann_probability = math.exp(delta_energy / T)
            if random.random() < boltzmann_probability:
                current = candidate

        # Update temperature
        T = schedule(time.time() - start_time)

    return current
```

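The notes do not fix a particular cooling schedule; as a sketch, an exponential-decay schedule is a common choice (the constants below are arbitrary):

```python
import math

def exponential_schedule(t: float, T0: float = 100.0, decay: float = 0.05, limit: float = 1e-3) -> float:
    # Temperature decays exponentially with elapsed time and is cut to 0
    # once it becomes negligible, so the main loop terminates
    T = T0 * math.exp(-decay * t)
    return T if T > limit else 0.0

# Usage: simulated_annealing(problem, exponential_schedule)
```
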
> [!NOTE]
> The lower the temperature, the more the search concentrates around a local maximum.

> [!TIP]
> Low memory usage and fast computation.

> [!CAUTION]
> May return suboptimal solutions with respect to other algorithms.

### Local Beam Search

- Generate all successors of the $k$ states kept in memory
- If one of them is a goal, stop
- Else keep the best $k$ successors in memory and start again

```python
def local_beam_search(problem, k: int) -> State:

    # Get k initial states
    current: list[State] = problem.random_states(k)

    while True:

        successors: list[State] = []

        # Note: this loop may be parallelizable

        # Add all successors of all current states
        for state in current:

            children = state.children

            # Check whether one of the children is a goal
            for child in children:

                # Found, stop execution
                if problem.is_goal(child):
                    return child  # If parallelized, signal the other workers instead

            successors.extend(children)

        # Note: this is sequential, unless we want to implement
        # a distributed sort
        current = sorted(successors, key=lambda s: s.value, reverse=True)[:k]
```

> [!TIP]
> Better use of available memory and fast computation (parallelizable)

> [!CAUTION]
> Lack of diversity among its $k$ states: they can quickly concentrate in the same region of the space

> [!NOTE]
> Stochastic Beam Search instead picks the $k$ successors with a probability proportional to their value (a selection sketch follows)

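A minimal sketch of that stochastic selection step (it would replace the final `sorted(...)[:k]` line above); the `value` attribute is the one assumed by the other snippets, and note that `random.choices` samples with replacement:

```python
import random

def stochastic_selection(successors: list[State], k: int) -> list[State]:
    # Pick k successors with probability proportional to their value
    # (assumes non-negative values; shift them first otherwise)
    weights = [s.value for s in successors]
    return random.choices(successors, weights=weights, k=k)
```
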
### Evolutionary algorithms

Book from pg. 133 to 137.

Similar to **Stochastic Beam Search**.

They have the following properties:

- Population size
- Individual representation
  - Boolean string (Genetic Algorithms)
  - Real numbers (Evolution Strategies)
  - Computer program (Genetic Programming)
- Mixing number $\rho$ (usually $\rho = 2$): number of parents per offspring
- Selection process
  - Select the $n$ fittest
  - Select the $n$ fittest and the $\rho$ fittest as parents
- Recombination procedure: randomly select a **crossover point**, split the parents there and then recombine (sketched after this list)
- Mutation rate: probability of an offspring having a spontaneous mutation
- Making the next generation
  - Only children
  - Mix children and fittest parents

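A minimal sketch of single-point crossover and mutation for the boolean-string (Genetic Algorithm) representation; representing individuals as plain Python lists of 0/1 is an assumption of this sketch:

```python
import random

def crossover(parent_a: list[int], parent_b: list[int]) -> list[int]:
    # Randomly select a crossover point, split the parents there and recombine
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(individual: list[int], mutation_probability: float) -> list[int]:
    # Each gene independently flips with the given probability
    return [
        gene ^ 1 if random.random() < mutation_probability else gene
        for gene in individual
    ]
```
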
> [!NOTE]
> In Genetic Algorithms we usually talk about:
>
> - Schema: a string with unspecified cells (e.g. 1*98**00)
> - Instance: a string that matches a schema with all cells specified (e.g. 12980000 or 13987500)
>
> The fitter a schema is, the more instances of that schema will appear over time

```python
# This is somewhat a pseudo-implementation and
# many details must still be implemented
def genetic_evolution(
    problem,
    population_size,
    fitness,
    breed,
    mutate,
    mutation_probability,
    max_steps
):

    population: list[tuple[State, float]] = []

    # We are assuming different initial states
    for _ in range(population_size):
        candidate = problem.random_unique_state()

        if problem.is_goal(candidate):
            return candidate

        population.append((candidate, fitness(candidate)))

    # Keep the population sorted from fittest to least fit
    population.sort(key=lambda entry: entry[1], reverse=True)

    step = 0

    while step < max_steps:
        step += 1

        # Create the new generation
        new_generation = breed(population)

        # Mutate the kids
        new_generation = [mutate(kid, mutation_probability) for kid in new_generation]

        # Check whether a kid is better than the parents
        for kid in new_generation:

            kid_score = fitness(kid)

            if problem.is_goal(kid):
                return kid

            # If the kid is better than the last of the population,
            # the kid is fit enough
            if kid_score > population[-1][1]:

                # Remove the least fit entity,
                # replace it with the kid and SORT
                # (unless you used a heap)
                population.pop()
                population.append((kid, kid_score))
                population.sort(key=lambda entry: entry[1], reverse=True)

    return population[0][0]
```

## Local Search in Continuous Spaces

We have these techniques (a gradient-ascent sketch follows the list):

- Gradient descent (or ascent)
- Discretize the space and reuse the discrete methods
- Newton-Raphson

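A minimal gradient-ascent sketch on a one-dimensional function; the objective, its derivative, the step size and the stopping threshold are all illustrative choices, not fixed by the notes (Newton-Raphson would additionally divide by the second derivative):

```python
def gradient_ascent(x0: float, derivative, step_size: float = 0.01,
                    tolerance: float = 1e-6, max_iterations: int = 10_000) -> float:
    # Repeatedly move in the direction of the gradient until it (almost) vanishes
    x = x0
    for _ in range(max_iterations):
        gradient = derivative(x)
        if abs(gradient) < tolerance:
            break
        x += step_size * gradient
    return x

# Example: maximize f(x) = -(x - 3)^2, whose derivative is -2(x - 3)
best_x = gradient_ascent(x0=0.0, derivative=lambda x: -2 * (x - 3))  # converges towards x = 3
```
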
## Search with non-deterministic actions

Since our agent **plans** ahead, it must know what will happen after each action.

> [!NOTE]
> Brief recap
>
> - **Deterministic**: $result(state_{i_1}, action_j) = state_{i_2}$
> - **Non-Deterministic**: $result(state_{i_1}, action_j) = \{ state_{k_1}, state_{k_2}, \dots \}$
>
> In a non-deterministic world an action therefore yields a **set of states resulting from that action**

If we were in our agent's shoes for a moment, we would plan accordingly: **if** we end up in $state_a$ **then** we act
one way, **else** another.

Basically, we build a plan that covers every possibility and delay the choice of action to the moment when we have more information,
discarding the branches whose conditions were not met.

In our agent's world this translates roughly to **conditional plans** (or contingency plans).
Our plan is no longer a plain sequence of actions, but contains some **if-else** or **switch-case** actions.

```python
# This is just a pseudo-implementation
# and it's supposed to be used only
# on a plan, not while planning
class ConditionalAction(Action):

    def __init__(
        self,
        conditional_map: dict[State, Action]
    ):
        self.__conditional_map = conditional_map

    def decide(self, percept: Percept) -> Action:
        # Look up which action the plan prescribes for the observed state
        state = get_state(percept)
        return self.__conditional_map[state]
```

### AND - OR Search Trees

These trees have the following properties:

- **OR Nodes**: nodes where the agent chooses one of its possible actions
- **AND Nodes**: nodes where a single action leads to multiple possible resulting states

> [!NOTE]
> We may think of AND Nodes as **meta states** or **Schrödinger states**: all of them are technically real
> and not at the same time, until we observe.
>
> Contrary to the exploration of OR Nodes, we are **forced** to explore all AND Nodes, as we need to plan
> for each of them in case it occurs.

> [!WARNING]
> If any child of an AND Node ends in **failure**, then the AND Node fails

> [!TIP]
> We will succeed only if all children of the AND Nodes return a success

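A recursive sketch of AND-OR search in the spirit of the book's pseudocode; the `problem.actions` / `problem.results` interface (with `results` returning the set of possible outcomes) and the nested list/dict plan representation are assumptions of this sketch:

```python
def and_or_search(problem):
    return or_search(problem, problem.initial_state, path=[])

def or_search(problem, state, path):
    # OR node: it is enough that one action leads to a valid sub-plan
    if problem.is_goal(state):
        return []
    if state in path:
        return None  # Cycle: give up on this branch instead of retrying forever
    for action in problem.actions(state):
        plan = and_search(problem, problem.results(state, action), path + [state])
        if plan is not None:
            return [action, plan]
    return None  # Failure

def and_search(problem, states, path):
    # AND node: every possible outcome state must have a valid sub-plan
    plans = {}
    for state in states:
        plan = or_search(problem, state, path)
        if plan is None:
            return None  # One outcome fails, so the whole AND node fails
        plans[state] = plan
    return plans
```
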
Some search techniques do not implement a retry when an action fails, so that they can
avoid cycles. Retrying requires allowing cycles in the plan, and this calls for a **while-action**:

```python
class RetryAction(Action):

    def __init__(
        self,
        stuck_state: State,
        action: Action
    ):
        self.__stuck_state = stuck_state
        self.__action = action

    def act(self, percept: Percept) -> State:
        current_state = get_state(percept)

        # Keep retrying the action while we remain stuck in the same state
        while current_state == self.__stuck_state:
            current_state = result(current_state, self.__action)

        return current_state
```

## Search in Partially Observable Environments

This is the situation where you are unsure about the state you are in: you know the rules, so you know the possible outcome(s) of each action,
but you do not know where you are.

If you were blindfolded, you would have to assess where you are and proceed from there. But imagine your home had some pits:
in that case you would not wander aimlessly, but rather check each step you take.

### No observation (Sensorless Problem)

This is an extreme version of the preceding problem. Here you can't perceive anything at all, so you can't narrow down the states according
to percepts, but only by acting.

Imagine that moving is always an option even when blocked, and that you know the geometry of the room exactly. In this case, by acting in a certain way,
you can be certain to end up in a specific point. This is called **coercion**: the ability to make a belief state converge to a
single state.

Since we can coerce states and we have no percepts, there are **no contingency plans** and **solutions are plain sequences of actions**.

> [!TIP]
> A sensorless plan is not useless; it is sometimes preferable when you are resource constrained, as it may be faster

- **Initial State**: the set of all states ($\mathcal{S}$)
- **Number of Belief States**: $2^{|\mathcal{S}|}$[^number-of-subsets]
- **Legal Actions**
  - **If any action is legal in any state**: all actions that can be performed in some state of the belief state, $\cup_{s \in b}a_{\mathcal{P}}(s)$
  - **If actions may be illegal in some state**: only actions that are legal in **all** states of the belief state, $\cap_{s \in b}a_{\mathcal{P}}(s)$
- **Transition Model** (a sketch follows the list)
  - **If actions are deterministic**: the new belief state is the **set** of all states resulting from the action, and its size is equal to or smaller than that of the previous belief state
  - **If actions are non-deterministic**: the new belief state is the **set** of all states resulting from the action, and it can be larger than the previous belief state
- **Goal**
  - **Possibly achievable**: at least one state in the belief state is a goal state
  - **Necessarily achievable**: all states in the belief state are goal states
- **Cost**
  - **Uniform**
  - **Variable**

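A minimal sketch of the belief-state transition model, assuming hypothetical `result(s, a)` / `results(s, a)` helpers for the deterministic and non-deterministic physical problems respectively:

```python
def predict_deterministic(belief: set[State], action) -> set[State]:
    # Apply the action to every state we might be in; duplicates collapse,
    # so the new belief state can only shrink or stay the same size
    return {result(state, action) for state in belief}

def predict_nondeterministic(belief: set[State], action) -> set[State]:
    # Union of all possible outcomes of the action, so the belief state may grow
    return {outcome for state in belief for outcome in results(state, action)}
```
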
#### Pruning

Whenever we encounter a belief state that is a superset of one we have already encountered, we can prune it.
This comes from the fact that a superset is a harder problem than its subset: any plan solving the superset would also solve our
subset, so it is easier to directly explore the subset.

Conversely, if we have already found a solution for the superset, then we have also found a solution for the subset,
proving it solvable.

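As a tiny sketch, the pruning check is a subset test against the belief states explored so far (the `explored` collection is an assumption of this sketch):

```python
def should_prune(belief: set[State], explored: list[set[State]]) -> bool:
    # Prune a belief state if it is a superset of one we have already reached
    return any(seen <= belief for seen in explored)
```
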
#### Incremental Searching

Instead of searching for a solution for every state of the belief state **at the same time**, we can search for a solution for just one of
those states, check whether it also works for the others and, if not, search again.

This works because all states must share the same sequence of actions, so if we run out of candidate solutions for a single state,
then our problem is unsolvable.

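A minimal sketch of the idea, assuming hypothetical `solutions_for(state)` (a generator of candidate action sequences), `execute(state, plan)` (the state reached by running the plan) and `is_goal` helpers:

```python
def incremental_sensorless_search(belief: set[State]):
    first, *others = list(belief)

    # Enumerate candidate plans for a single state of the belief state
    for plan in solutions_for(first):
        # Keep the plan only if it also reaches the goal from every other state
        if all(is_goal(execute(state, plan)) for state in others):
            return plan

    return None  # Ran out of candidates: the whole problem is unsolvable
```
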
### Partial Observability

Now we can observe at least a portion of the state we are in, so it is easier to construct our belief state and coerce it towards a single state.

Here we do the following (a sketch of this predict-and-update cycle follows the list):

1. Check the environment (get the initial belief state)
2. Choose an action and predict the next possible states
3. Check which belief states I could find myself in after perceiving in those predicted states (if sensing is deterministic, these belief states are disjoint: it is either one or the other)
4. Repeat from step 2 until the goal is reached

As for [Non-deterministic](#search-with-non-deterministic-actions) actions, we need to plan for each possible state, as we won't
be sure of where we are until we execute our plan.

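A minimal sketch of steps 2 and 3, assuming the hypothetical `result(s, a)` helper from the previous sketch and a `percept_in(state)` helper returning the (deterministic) percept received in a state:

```python
def predicted_beliefs(belief: set[State], action) -> set[State]:
    # Step 2: apply the action to every state we might be in
    return {result(state, action) for state in belief}

def possible_updated_beliefs(belief: set[State], action) -> list[set[State]]:
    # Step 3: group the predicted states by the percept we would receive there;
    # with deterministic sensing these groups are disjoint belief states
    groups: dict = {}
    for state in predicted_beliefs(belief, action):
        groups.setdefault(percept_in(state), set()).add(state)
    return list(groups.values())
```
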
## Online Searching

Book from pg. 152 to 159.

<!-- Footnotes -->

[^number-of-subsets]: [Proof](https://math.stackexchange.com/questions/546414/what-is-the-proof-that-the-total-number-of-subsets-of-a-set-is-2n)