Search in Complex Environments
Local Search
Unlike path search, local search algorithms do not store the steps taken, because we only need the final state, not the path that led to it.
This comes at the cost of not being systematic, so they may not explore the whole search space; in exchange they have a clear advantage in memory and computation on infinite or very large spaces.
Tip
These are good when we only care about the final state, not the process of reaching it. Moreover, they use little memory and can be applied to large spaces where conventional search fails.
Caution
They are not systematic and may never explore the region of the space where the solution resides.
Optimization problems
We need to find a point where the function takes a reasonably high (or low) value. The landscape usually contains:
- Local Maxima
- Ridges: Regions with many local maxima close to each other but not connected to each other (difficult to navigate with greedy search)
- Plateaus: Regions where the difference in height is minimal or nonexistent
We have some algorithms and strategies:
- Stochastic Hill Climbing: Choose a higher point with a probability proportional to its height
- First-Choice Hill Climbing: Generate successor nodes randomly until one is better than the current (an implementation of Stochastic Hill Climbing; sketched below)
- Random-Restart Hill Climbing: If you fail, try again
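A minimal sketch of First-Choice Hill Climbing, assuming each state exposes a hypothetical random_successor() method and a value attribute (neither is defined in these notes):

def first_choice_hill_climbing(state, max_tries=100):
    while True:
        for _ in range(max_tries):
            # Generate random successors until one improves on the current state
            candidate = state.random_successor()
            if candidate.value > state.value:
                state = candidate
                break
        else:
            # No improving successor found within max_tries:
            # treat the current state as a (local) maximum
            return state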
Note
Computing expected number of steps in a Random Restart:
- p: probability that a single run of the algorithm succeeds
- S: number of steps of a successful run
- s: number of steps of an unsuccessful run
- \frac{1}{p}: expected number of runs needed to find a solution
\text{Expected Steps} = 1 \cdot S + \frac{1-p}{p} \cdot s
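For example (with made-up numbers), if p = 0.2, S = 4 and s = 3:
\text{Expected Steps} = 1 \cdot 4 + \frac{1-0.2}{0.2} \cdot 3 = 4 + 4 \cdot 3 = 16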
Simulated Annealing
We start searching for our objective, but initially we search with a parameter called Temperature, which allows the model to make more random choices the higher it is.
If a candidate is not better than our current state, it is accepted with a probability given by the Boltzmann distribution (e^{\frac{\Delta E}{T}}), where \Delta E is the (negative) change in value of the candidate successor.
import math
import random
import time

def simulated_annealing(problem, schedule) -> State:
    # Take initial state, start time and initial temperature
    current = problem.initial_state
    start_time = time.time()
    T = schedule(0)
    # While temperature is higher than 0
    while T > 0:
        # Here I generate successors, but it can
        # be implemented in other ways
        successors: list[State] = current.children
        move = random.randrange(len(successors))
        candidate = successors[move]
        # Compute the fitness of the move
        delta_energy = candidate.value - current.value
        if delta_energy > 0:
            current = candidate
        else:
            # Notice that since the energy difference is negative, the
            # probability is less than 1; when T -> 0, e^(-inf) == 0
            boltzmann_probability = math.exp(delta_energy / T)
            # Probability helper
            if random.random() < boltzmann_probability:
                current = candidate
        # Update temperature
        T = schedule(time.time() - start_time)
    return current
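A possible cooling schedule to pass as schedule; the initial temperature and decay rate here are made-up values, not from the notes:

import math

def exponential_schedule(elapsed_seconds, T0=100.0, decay=0.5):
    # Temperature decays exponentially with elapsed time;
    # values below a small threshold are clamped to 0 so the loop stops
    T = T0 * math.exp(-decay * elapsed_seconds)
    return T if T > 1e-3 else 0

# result = simulated_annealing(problem, exponential_schedule)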
Note
The lower the temperature, the more the search concentrates around a local maximum.
Tip
Low memory usage and fast computation
Caution
May return suboptimal solutions compared to other algorithms
Local Beam Search
- Generate the successors of all k states kept in memory
- If one is a goal, stop
- Else keep the best k successors in memory and start again
def local_beam_search(problem, k: int) -> State:
    # Get k initial states
    current = problem.random_states(k)
    while True:
        successors = []
        # Note: this loop may be parallelizable
        # Add all successors
        for state in current:
            children = state.children
            # Check whether a child is a goal
            for child in children:
                # Found, stop execution
                if is_goal(child):
                    return child  # If parallelized, signal instead of returning
            successors.extend(children)
        # Note: this is sequential, unless we want to implement
        # a distributed sort
        successors.sort(key=lambda s: s.value, reverse=True)
        current = successors[:k]
Tip
Better use of available memory and fast computation (parallelizable)
Caution
Lack of diversity in its k nodes
Note
Stochastic Beam Search picks the k nodes with a probability proportional to each successor's value
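A minimal sketch of that stochastic selection step, assuming each successor has a non-negative numeric value attribute; note that random.choices samples with replacement, which is only one possible design choice:

import random

def stochastic_selection(successors, k):
    # Pick k successors with probability proportional to their value
    weights = [s.value for s in successors]
    return random.choices(successors, weights=weights, k=k)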
Evolutionary algorithms
pg 133 to 137
Similar to Stochastic Beam Search.
They have the following properties:
- Population size
- Individual Representation
  - Boolean String (Genetic Algorithms)
  - Real numbers (Evolution Strategies)
  - Computer Program (Genetic Programming)
- Mixing number \rho (usually \rho = 2): Number of parents per offspring
- Selection Process
  - Select the n fittest
  - Select n individuals and take the \rho fittest among them as parents
- Recombination Procedure: Randomly select a crossover point, split both parents there and recombine
- Mutation Rate: Probability that an offspring undergoes a spontaneous mutation
- Making the next generation
  - Only children
  - Mix children and fittest parents
Note
In Genetic Algorithms we usually talk about:
- Schema: A string with incomplete cells (eg: 1*98**00)
- Instance: A string matching the schema with all cells filled in (eg: 12980000, 13987500)
The fitter the instances of a schema are, the more instances of that schema will appear over time
# This is somewhat a pseudo-implementation and
# many details must be filled in
def genetic_evolution(
    problem,
    population_size,
    fitness,
    breed,
    mutate,
    mutation_probability,
    max_steps
):
    population: list[tuple[State, float]] = []
    # We are assuming different initial states
    for _ in range(population_size):
        candidate = problem.random_unique_state()
        if is_goal(candidate):
            return candidate
        population.append((candidate, fitness(candidate)))
    # Keep the population sorted from fittest to least fit
    population.sort(key=lambda entry: entry[1], reverse=True)
    step = 0
    while step < max_steps:
        step += 1
        # Create new generation
        new_generation = breed(population)
        # Mutate kids
        new_generation = [
            mutate(kid, mutation_probability) for kid in new_generation
        ]
        # Check whether a kid is better than the parents
        for kid in new_generation:
            kid_score = fitness(kid)
            if is_goal(kid):
                return kid
            # If the kid is better than the least fit member,
            # the kid is fit enough
            if kid_score > population[-1][1]:
                # Remove the least fit entity,
                # replace it with the kid and SORT
                # (unless you used a heap)
                population.pop()
                population.append((kid, kid_score))
                population.sort(key=lambda entry: entry[1], reverse=True)
    return population[0][0]
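A possible sketch of the breed and mutate helpers used above, for bit-string individuals (Genetic Algorithms); the single-point crossover and the pairing strategy are assumptions, not fixed by the notes:

import random

def crossover(parent_a: str, parent_b: str) -> str:
    # Single-point crossover: split both parents at a random point and recombine
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def breed(population, rho=2):
    # Pair up individuals (fittest first, since the population is kept sorted)
    offspring = []
    for i in range(0, len(population) - 1, rho):
        parent_a, _ = population[i]
        parent_b, _ = population[i + 1]
        offspring.append(crossover(parent_a, parent_b))
    return offspring

def mutate(individual: str, mutation_probability: float) -> str:
    # Each bit may flip with the given probability
    return "".join(
        bit if random.random() > mutation_probability
        else ("1" if bit == "0" else "0")
        for bit in individual
    )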
Local Search in Continuous Spaces
We have these techniques:
- Gradient Descent (sketched after this list)
- Discretize
- Newton Raphson
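A minimal sketch of gradient descent in one dimension, assuming we already know the gradient of the objective; the learning rate and step count are made-up values:

def gradient_descent(gradient, x0, learning_rate=0.01, steps=1000):
    # Repeatedly move against the gradient to minimize the objective
    x = x0
    for _ in range(steps):
        x = x - learning_rate * gradient(x)
    return x

# Example: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3)
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)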
Search with non-deterministic actions
Since our agent plans ahead, it must know what will happen after an action.
Note
Brief recap
- Deterministic: result(state_{i_1}, action_j) = state_{i_2}
- Non-Deterministic: result(state_{i_1}, action_j) = \{ state_{k_1}, \dots, state_{k_n} \}
It can be seen that in a non-deterministic world, an action results in a set of states.
If we put ourselves in our agent's shoes for a moment, we would plan accordingly: if we end up in state_a we act one way, otherwise another.
Basically, we build a plan that covers all possible outcomes and delay the moment of acting until we have more information, discarding the branches whose conditions were not met.
In our agent's world this translates roughly to conditional plans (or contingency plans). Our plan is no longer a sequence of actions, but rather contains some if-else or switch-case actions.
# This is just a pseudo implementation
# and it's supposed to be used only
# on a plan, not while planning
class ConditionalAction(Action):
    def __init__(
        self,
        conditional_map: dict[State, Action]
    ):
        self.__conditional_map = conditional_map

    def decide(self, percept: Percept):
        # Pick the action associated with the observed state
        state = get_state(percept)
        return self.__conditional_map[state]
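A hypothetical usage; state_a, state_b and the two actions are illustrative names only, not defined anywhere in these notes:

plan_step = ConditionalAction({
    state_a: go_left,    # if we observe we are in state_a, go left
    state_b: go_right,   # if we observe we are in state_b, go right
})
next_action = plan_step.decide(current_percept)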
AND - OR Search Trees
These trees have the following properties:
- OR Nodes: Nodes resulting from a choice of an action
- AND Nodes: Nodes resulting from having multiple states coming from our action
Note
We may think of AND nodes as meta-states or Schrödinger states: they are all technically real and not real at the same time, until we observe.
Contrary to the exploration for OR Nodes, we are forced to explore all AND Nodes, as we need to plan for them in case they exist.
Warning
If any AND Node has a failure, then we fail
Tip
We will succeed only if all AND Nodes return a success
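A minimal recursive sketch of AND-OR search under these rules, assuming hypothetical helpers is_goal, actions and results (the latter returning the set of possible outcome states):

def and_or_search(problem):
    return or_search(problem.initial_state, problem, path=[])

def or_search(state, problem, path):
    # OR node: it is enough that one action leads to a valid plan
    if is_goal(state):
        return []
    if state in path:
        return None  # cycle detected: give up on this branch (no retry)
    for action in actions(state):
        plan = and_search(results(state, action), problem, [state] + path)
        if plan is not None:
            return [action, plan]
    return None  # failure

def and_search(states, problem, path):
    # AND node: we must find a plan for every possible outcome state
    plan = {}
    for s in states:
        sub_plan = or_search(s, problem, path)
        if sub_plan is None:
            return None  # one outcome fails, so the whole AND node fails
        plan[s] = sub_plan
    return plan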
Some search techniques do not retry an action after a failure, so that they can avoid cycles. Retrying requires allowing cycles, and this calls for a while-action
class RetryAction(Action):
    def __init__(
        self,
        stuck_state: State,
        action: Action
    ):
        self.__stuck_state = stuck_state
        self.__action = action

    def act(self, percept: Percept):
        # Keep retrying the action while we remain in the stuck state
        current_state = get_state(percept)
        while current_state == self.__stuck_state:
            current_state = result(current_state, self.__action)
        return current_state
Search in Partially Observable Environments
This is a situation where you are unsure about the state you are in. You know the rules, so you know the outcome or outcomes of an action, but you do not know where you are.
If you were blindfolded, you would have to assess where you are and proceed from there. But imagine your home had some pits: in that case you would not wander aimlessly, but rather check each step you take.
No observation (Sensorless Problem)
This is an extreme version of the preceding problem. Here you can't perceive anything at all, so you can't narrow down states according to percepts, only by acting.
Imagine that moving is always an option even if blocked, and that you know the geometry of the room exactly. In this case, by acting in a certain way, you can always know with certainty that you will end up at a certain point. This is called coercion: the ability to force a belief state to converge to a single state.
Since we can coerce states and we have no percepts, there are no contingency plans and solutions are sequences of actions
Tip
A sensorless plan is not useless, and sometimes it is better when you are resource constrained, as it may be faster
- Initial State: the set of all states ($\mathcal{S}$)
- Number of Belief States: $2^{|\mathcal{S}|}$
- Legal Actions
  - If every action is legal in every state: all actions that can be performed in any state contained in the belief state, $\cup_{s \in b} a_{\mathcal{P}}(s)$
  - If some actions may be illegal in some states: only the actions that are legal in all states of the belief state, $\cap_{s \in b} a_{\mathcal{P}}(s)$
- Transition Model (see the sketch after this list)
  - If actions are deterministic: the new belief state is the set of all states resulting from the action, and its size is equal to or smaller than the previous belief state
  - If actions are non-deterministic: the new belief state is the set of all states resulting from the action, and it can be larger than the previous belief state
- Goal
  - Possibly Achievable: at least one state in the belief state is a goal state
  - Necessarily Achievable: all states in the belief state are goal states
- Cost
  - Uniform
  - Variable
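A minimal sketch of the transition model over belief states, assuming a hypothetical results(state, action) that returns the set of possible outcome states (a singleton if the action is deterministic):

def belief_result(belief_state, action):
    # The new belief state is the union of all states reachable from
    # any state in the current belief state via the given action
    new_belief = set()
    for state in belief_state:
        new_belief |= results(state, action)
    return frozenset(new_belief)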
Pruning
Whenever we encounter a belief state that is a superset of one we already encountered, we can prune it. A superset is a harder problem than its subset, and its solution would likely include the subset's solution as well, so it is easier to explore the subset directly.
Conversely, if we have already found a solution for the superset, then we have also found a solution for the subset, proving it solvable.
Incremental Searching
Instead of searching for a solution for every state in the belief state simultaneously, we can search for a solution for just one of those states, check whether it also works for the others, and if not, search again (see the sketch below).
This works because all states must share the same sequence of actions, so if we run out of candidate solutions for a single state, the whole problem is unsolvable.
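A minimal sketch of this idea, assuming hypothetical helpers candidate_plans(state) (yields action sequences that solve that single state) and execute(state, plan) (returns the state reached by applying the plan):

def incremental_search(belief_state):
    first, *rest = list(belief_state)
    for plan in candidate_plans(first):
        # A plan found for one state must also reach the goal
        # from every other state in the belief state
        if all(is_goal(execute(state, plan)) for state in rest):
            return plan
    return None  # no shared plan exists: the problem is unsolvable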
Partial Observability
Now we can observe at least part of the state we are in, so it will be easier to construct our belief state and coerce it into a single state
Here we do the following:
- Check my environment (get the initial belief state)
- Choose an action and predict the next states
- Check which percepts I could receive in those predicted states (if sensing is deterministic, the resulting belief states are disjoint: either one or the other)
- Repeat from step 2 until the goal is reached
As with non-deterministic actions, we need to plan for each possible state, as we won't be sure where we are until we execute our plan (see the sketch below).
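A minimal sketch of the sense-predict-update cycle above, assuming hypothetical helpers results(state, action) and percept(state) for the environment model:

def predict(belief, action):
    # States we might be in after acting, before sensing
    return {s2 for s in belief for s2 in results(s, action)}

def possible_percepts(belief):
    # Percepts we could receive in the predicted belief state
    return {percept(s) for s in belief}

def update(belief, observed):
    # Keep only the states consistent with the observed percept
    return {s for s in belief if percept(s) == observed}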
Online Searching
Book from pg. 152 to 159