Search in Complex Environments
Local Search
Unlike path search, local search algorithms do not store the steps taken, because we only need the final state, not the path that led to it.
This comes at the cost of not being systematic, so they may not explore the whole search space; in exchange they have a clear advantage in memory and computation on infinite or very large spaces.
Tip
These are good when we only care about the final state, not the process of reaching it. Moreover, they use little memory and can be applied to large spaces where conventional search fails.
Caution
They are not systematic and may never explore the region of the space where the solution resides.
Optimization problems
We need to find a point where the function takes a reasonably high (or low) value. The landscape usually contains:
- Local Maxima
- Ridges: Regions with many local maxima close to each other but not connected to each other (difficult to navigate with greedy search)
- Plateaus: Regions where the difference in height is minimal or nonexistent
We have some algorithms and strategies:
- Stochastic Hill Climbing: Choose a higher point with a probability proportional to its height
- First-Choice Hill Climbing: Generate successor nodes randomly until one is better than the current (an implementation of Stochastic Hill Climbing; sketched below)
- Random-Restart Hill Climbing: If you fail, try again
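A minimal sketch of First-Choice Hill Climbing, assuming each state exposes a hypothetical random_successor() method and a value attribute (neither is defined in these notes):

def first_choice_hill_climbing(state, max_tries=100):
    while True:
        for _ in range(max_tries):
            # Generate random successors until one improves on the current state
            candidate = state.random_successor()
            if candidate.value > state.value:
                state = candidate
                break
        else:
            # No improving successor found within max_tries:
            # treat the current state as a (local) maximum
            return state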
Note
Computing expected number of steps in a Random Restart:
- p: probability that a single run of the algorithm succeeds
- S: number of steps of a successful run
- s: number of steps of an unsuccessful run
- \frac{1}{p}: expected number of runs needed to find a solution
\text{Expected Steps} = 1 \cdot S + \frac{1-p}{p} \cdot s
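For example (with made-up numbers), if p = 0.2, S = 4 and s = 3:
\text{Expected Steps} = 1 \cdot 4 + \frac{1-0.2}{0.2} \cdot 3 = 4 + 4 \cdot 3 = 16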
Simulated Annealing
We start searching for our objective, but initially we search with a parameter called Temperature, which allows the model to make more random choices the higher it is.
If a candidate is not better than our current state, it is accepted with a probability given by the Boltzmann distribution (e^{\frac{\Delta E}{T}}), where \Delta E is the (negative) change in value of the candidate successor.
import math
import random
import time

def simulated_annealing(problem, schedule) -> State:
    # Take initial state, start time and initial temperature
    current = problem.initial_state
    start_time = time.time()
    T = schedule(0)
    # While temperature is higher than 0
    while T > 0:
        # Here I generate successors, but it can
        # be implemented in other ways
        successors: list[State] = current.children
        move = random.randrange(len(successors))
        candidate = successors[move]
        # Compute the fitness of the move
        delta_energy = candidate.value - current.value
        if delta_energy > 0:
            current = candidate
        else:
            # Notice that since the energy difference is negative, the
            # probability is less than 1; when T -> 0, e^(-inf) == 0
            boltzmann_probability = math.exp(delta_energy / T)
            # Probability helper
            if random.random() < boltzmann_probability:
                current = candidate
        # Update temperature
        T = schedule(time.time() - start_time)
    return current
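A possible cooling schedule to pass as schedule; the initial temperature and decay rate here are made-up values, not from the notes:

import math

def exponential_schedule(elapsed_seconds, T0=100.0, decay=0.5):
    # Temperature decays exponentially with elapsed time;
    # values below a small threshold are clamped to 0 so the loop stops
    T = T0 * math.exp(-decay * elapsed_seconds)
    return T if T > 1e-3 else 0

# result = simulated_annealing(problem, exponential_schedule)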
Note
The lower the temperature, the more the search concentrates around a local maximum.
Tip
Low memory usage and fast computation
Caution
May return suboptimal solutions compared to other algorithms
Local Beam Search
- Generate the successors of all k states kept in memory
- If one is a goal, stop
- Else keep the best k successors in memory and start again
def local_beam_search(problem, k: int) -> State:
    # Get k initial states
    current = problem.random_states(k)
    while True:
        successors = []
        # Note: this loop may be parallelizable
        # Add all successors
        for state in current:
            children = state.children
            # Check whether a child is a goal
            for child in children:
                # Found, stop execution
                if is_goal(child):
                    return child  # If parallelized, signal instead of returning
            successors.extend(children)
        # Note: this is sequential, unless we want to implement
        # a distributed sort
        successors.sort(key=lambda s: s.value, reverse=True)
        current = successors[:k]
Tip
Better use of available memory and fast computation (parallelizable)
Caution
Lack of diversity in its k nodes
Note
Stochastic Beam Search picks the k nodes with a probability proportional to each successor's value
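A minimal sketch of that stochastic selection step, assuming each successor has a non-negative numeric value attribute; note that random.choices samples with replacement, which is only one possible design choice:

import random

def stochastic_selection(successors, k):
    # Pick k successors with probability proportional to their value
    weights = [s.value for s in successors]
    return random.choices(successors, weights=weights, k=k)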
Evolutionary algorithms
pg 133 to 137
Similar to Stochastic Beam Search.
They have the following properties:
- Population size
- Individual Representation
  - Boolean String (Genetic Algorithms)
  - Real numbers (Evolution Strategies)
  - Computer Program (Genetic Programming)
- Mixing number \rho (usually \rho = 2): Number of parents per offspring
- Selection Process
  - Select the n fittest
  - Select n individuals and take the \rho fittest among them as parents
- Recombination Procedure: Randomly select a crossover point, split both parents there and recombine
- Mutation Rate: Probability that an offspring undergoes a spontaneous mutation
- Making the next generation
  - Only children
  - Mix children and fittest parents
Note
In Genetic Algorithms we usually talk about:
- Schema: A string with incomplete cells (eg: 1*98**00)
- Instance: A string matching the schema with all cells filled in (eg: 12980000, 13987500)
The fitter the instances of a schema are, the more instances of that schema will appear over time
# This is somewhat a pseudo-implementation and
# many details must be filled in
def genetic_evolution(
    problem,
    population_size,
    fitness,
    breed,
    mutate,
    mutation_probability,
    max_steps
):
    population: list[tuple[State, float]] = []
    # We are assuming different initial states
    for _ in range(population_size):
        candidate = problem.random_unique_state()
        if is_goal(candidate):
            return candidate
        population.append((candidate, fitness(candidate)))
    # Keep the population sorted from fittest to least fit
    population.sort(key=lambda entry: entry[1], reverse=True)
    step = 0
    while step < max_steps:
        step += 1
        # Create new generation
        new_generation = breed(population)
        # Mutate kids
        new_generation = [
            mutate(kid, mutation_probability) for kid in new_generation
        ]
        # Check whether a kid is better than the parents
        for kid in new_generation:
            kid_score = fitness(kid)
            if is_goal(kid):
                return kid
            # If the kid is better than the least fit member,
            # the kid is fit enough
            if kid_score > population[-1][1]:
                # Remove the least fit entity,
                # replace it with the kid and SORT
                # (unless you used a heap)
                population.pop()
                population.append((kid, kid_score))
                population.sort(key=lambda entry: entry[1], reverse=True)
    return population[0][0]
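A possible sketch of the breed and mutate helpers used above, for bit-string individuals (Genetic Algorithms); the single-point crossover and the pairing strategy are assumptions, not fixed by the notes:

import random

def crossover(parent_a: str, parent_b: str) -> str:
    # Single-point crossover: split both parents at a random point and recombine
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def breed(population, rho=2):
    # Pair up individuals (fittest first, since the population is kept sorted)
    offspring = []
    for i in range(0, len(population) - 1, rho):
        parent_a, _ = population[i]
        parent_b, _ = population[i + 1]
        offspring.append(crossover(parent_a, parent_b))
    return offspring

def mutate(individual: str, mutation_probability: float) -> str:
    # Each bit may flip with the given probability
    return "".join(
        bit if random.random() > mutation_probability
        else ("1" if bit == "0" else "0")
        for bit in individual
    )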
Local Search in Continuous Spaces
We have these techniques:
- Gradient Descent (sketched after this list)
- Discretize
- Newton Raphson
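A minimal sketch of gradient descent in one dimension, assuming we already know the gradient of the objective; the learning rate and step count are made-up values:

def gradient_descent(gradient, x0, learning_rate=0.01, steps=1000):
    # Repeatedly move against the gradient to minimize the objective
    x = x0
    for _ in range(steps):
        x = x - learning_rate * gradient(x)
    return x

# Example: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3)
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)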
Search with non-deterministic actions
Since our agent plans ahead, it must know what will happen after an action.
Note
Brief recap
- Deterministic: result(state_{i_1}, action_j) = state_{i_2}
- Non-Deterministic: result(state_{i_1}, action_j) = \{ state_{k_1}, \dots, state_{k_n} \}
It can be seen that in a non-deterministic world, an action results in a set of states.
If we put ourselves in our agent's shoes for a moment, we would plan accordingly: if we end up in state_a we act one way, otherwise another.
Basically, we build a plan that covers all possible outcomes and delay the moment of acting until we have more information, discarding the branches whose conditions were not met.
In our agent's world this translates roughly to conditional plans (or contingency plans). Our plan is no longer a sequence of actions, but rather contains some if-else or switch-case actions.
# This is just a pseudo implementation
# and it's supposed to be used only
# on a plan, not while planning
class ConditionalAction(Action):
    def __init__(
        self,
        conditional_map: dict[State, Action]
    ):
        self.__conditional_map = conditional_map

    def decide(self, percept: Percept):
        # Pick the action associated with the observed state
        state = get_state(percept)
        return self.__conditional_map[state]
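A hypothetical usage; state_a, state_b and the two actions are illustrative names only, not defined anywhere in these notes:

plan_step = ConditionalAction({
    state_a: go_left,    # if we observe we are in state_a, go left
    state_b: go_right,   # if we observe we are in state_b, go right
})
next_action = plan_step.decide(current_percept)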
AND - OR Search Trees
These trees have the following properties:
- OR Nodes: Nodes resulting from a choice of an action
- AND Nodes: Nodes resulting from having multiple states coming from our action
Note
We may think of AND nodes as meta-states or Schrödinger states: they are all technically real and not real at the same time, until we observe.
Contrary to the exploration for OR Nodes, we are forced to explore all AND Nodes, as we need to plan for them in case they exist.
Warning
If any AND Node has a failure, then we fail
Tip
We will succeed only if all AND Nodes return a success
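A minimal recursive sketch of AND-OR search under these rules, assuming hypothetical helpers is_goal, actions and results (the latter returning the set of possible outcome states):

def and_or_search(problem):
    return or_search(problem.initial_state, problem, path=[])

def or_search(state, problem, path):
    # OR node: it is enough that one action leads to a valid plan
    if is_goal(state):
        return []
    if state in path:
        return None  # cycle detected: give up on this branch (no retry)
    for action in actions(state):
        plan = and_search(results(state, action), problem, [state] + path)
        if plan is not None:
            return [action, plan]
    return None  # failure

def and_search(states, problem, path):
    # AND node: we must find a plan for every possible outcome state
    plan = {}
    for s in states:
        sub_plan = or_search(s, problem, path)
        if sub_plan is None:
            return None  # one outcome fails, so the whole AND node fails
        plan[s] = sub_plan
    return plan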
Some search techniques do not retry an action after a failure, so that they can avoid cycles. Retrying requires allowing cycles, and this calls for a while-action
class RetryAction(Action):
    def __init__(
        self,
        stuck_state: State,
        action: Action
    ):
        self.__stuck_state = stuck_state
        self.__action = action

    def act(self, percept: Percept):
        # Keep retrying the action while we remain in the stuck state
        current_state = get_state(percept)
        while current_state == self.__stuck_state:
            current_state = result(current_state, self.__action)
        return current_state
Search in Partially Observable Environments
This is a situation where you are unsure about the state you are in. You know the rules, so you know the outcome or outcomes of an action, but you do not know where you are.
If you were blindfolded, you would have to assess where you are and proceed from there. But imagine your home had some pits: in that case you would not wander aimlessly, but rather check each step you take.
No observation (Sensorless Problem)
This is an extreme version of the preceding problem. Here you can't perceive anything at all, so you can't narrow down states according to percepts, only by acting.
Imagine that moving is always an option even if blocked, and that you know the geometry of the room exactly. In this case, by acting in a certain way, you can always know with certainty that you will end up at a certain point. This is called coercion: the ability to force a belief state to converge to a single state.
Since we can coerce states and we have no percepts, there are no contingency plans and solutions are sequences of actions
Tip
A sensorless plan is not useless, and sometimes it is better when you are resource constrained, as it may be faster
- Initial State: the set of all states ($\mathcal{S}$)
- Number of Belief States: $2^{|\mathcal{S}|}$
- Legal Actions
  - If every action is legal in every state: all actions that can be performed in any state contained in the belief state, $\cup_{s \in b} a_{\mathcal{P}}(s)$
  - If some actions may be illegal in some states: only the actions that are legal in all states of the belief state, $\cap_{s \in b} a_{\mathcal{P}}(s)$
- Transition Model (see the sketch after this list)
  - If actions are deterministic: the new belief state is the set of all states resulting from the action, and its size is equal to or smaller than the previous belief state
  - If actions are non-deterministic: the new belief state is the set of all states resulting from the action, and it can be larger than the previous belief state
- Goal
  - Possibly Achievable: at least one state in the belief state is a goal state
  - Necessarily Achievable: all states in the belief state are goal states
- Cost
  - Uniform
  - Variable
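A minimal sketch of the transition model over belief states, assuming a hypothetical results(state, action) that returns the set of possible outcome states (a singleton if the action is deterministic):

def belief_result(belief_state, action):
    # The new belief state is the union of all states reachable from
    # any state in the current belief state via the given action
    new_belief = set()
    for state in belief_state:
        new_belief |= results(state, action)
    return frozenset(new_belief)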
Pruning
Whenever we encounter a belief state that is a superset of one we already encountered, we can prune it. A superset is a harder problem than its subset, and its solution would likely include the subset's solution as well, so it is easier to explore the subset directly.
Conversely, if we have already found a solution for the superset, then we have also found a solution for the subset, proving it solvable.
Incremental Searching
Instead of searching for a solution for every state in the belief state simultaneously, we can search for a solution for just one of those states, check whether it also works for the others, and if not, search again (see the sketch below).
This works because all states must share the same sequence of actions, so if we run out of candidate solutions for a single state, the whole problem is unsolvable.
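A minimal sketch of this idea, assuming hypothetical helpers candidate_plans(state) (yields action sequences that solve that single state) and execute(state, plan) (returns the state reached by applying the plan):

def incremental_search(belief_state):
    first, *rest = list(belief_state)
    for plan in candidate_plans(first):
        # A plan found for one state must also reach the goal
        # from every other state in the belief state
        if all(is_goal(execute(state, plan)) for state in rest):
            return plan
    return None  # no shared plan exists: the problem is unsolvable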
Partial Observability
Now we can observe at least part of the state we are in, so it will be easier to construct our belief state and coerce it into a single state
Here we do the following:
- Check my environment (get the initial belief state)
- Choose an action and predict the next states
- Check which percepts I could receive in those predicted states (if sensing is deterministic, the resulting belief states are disjoint: either one or the other)
- Repeat from step 2 until the goal is reached
As with non-deterministic actions, we need to plan for each possible state, as we won't be sure where we are until we execute our plan (see the sketch below).
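A minimal sketch of the sense-predict-update cycle above, assuming hypothetical helpers results(state, action) and percept(state) for the environment model:

def predict(belief, action):
    # States we might be in after acting, before sensing
    return {s2 for s in belief for s2 in results(s, action)}

def possible_percepts(belief):
    # Percepts we could receive in the predicted belief state
    return {percept(s) for s in belief}

def update(belief, observed):
    # Keep only the states consistent with the observed percept
    return {s for s in belief if percept(s) == observed}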
Online Searching
Book from pg. 152 to 159