Rework of chapter 5

This commit is contained in:
Christian Risi 2025-08-29 18:12:36 +02:00
parent 4a6a542f8d
commit 440741fa3d

# Constraint Satisfaction Problems
Book pg. 164 to 169
> [!NOTE]
> From now on, states have a **factored representation** with the following components:
>
> - **Variables** ($\mathcal{X}$): the actual variables in the state
> - **Domains** ($\mathcal{D}$): the set of values a variable can take (one domain per variable)
> - **Constraints** ($\mathcal{C}$): constraints on the values assignable to the variables
These problems are about finding an assignment of values to all variables such that the assignment respects all constraints. For example, imagine we could only place cats in spaces with other pets, as long as those pets are not small. This is indeed a CSP, where we need to find a solution in which cats are never together with small pets.
Here we try to assign a value to each variable, such that every variable holds a value contained in its domain, without violating any constraint.

```python
class StateCSP(State):
    def __init__(
        self,
        variables: dict[str, Value],
        domains: dict[str, set[Value]],
        # The scope is represented as a tuple of strings
        # (the variable names), since dict keys must be hashable
        constraints: dict[tuple[str, ...], Rule | set[Value]],
    ):
        self.__variables = variables
        self.__domains = domains
        self.__constraints = constraints
```

A solution to this problem must be:

- **Consistent**: each assignment is legal (no constraint is violated)
- **Complete**: all variables have a value assigned
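As a concrete illustration (not from the book; names and values are made up), a tiny map-coloring instance can be written with plain dictionaries. The `consistent` and `complete` checks below mirror the two solution criteria above:

```python
# A minimal map-coloring CSP: three regions, each colored so that
# neighbouring regions get different colors.
variables = ["WA", "NT", "SA"]
domains = {v: {"red", "green", "blue"} for v in variables}
# Binary constraints as (scope, rule) pairs: neighbours must differ
constraints = [(("WA", "NT"), lambda a, b: a != b),
               (("NT", "SA"), lambda a, b: a != b),
               (("WA", "SA"), lambda a, b: a != b)]

def consistent(assignment):
    """True if no constraint with a fully-assigned scope is violated."""
    for (x, y), rule in constraints:
        if x in assignment and y in assignment:
            if not rule(assignment[x], assignment[y]):
                return False
    return True

def complete(assignment):
    """True if every variable has been assigned a value."""
    return set(assignment) == set(variables)

solution = {"WA": "red", "NT": "green", "SA": "blue"}
assert consistent(solution) and complete(solution)
```

Note that a partial assignment can be consistent without being complete, which is exactly the situation backtracking search exploits.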
In this space, in contrast with the atomic one, we can see when we are making a mistake during the assignments. Whenever we detect such a mistake, we **prune** that path, since it is **not admissible**. This makes the computation a bit faster, since we can detect failures early.
## Constraints

> [!CAUTION]
> Constraints can be of these types:
>
> - **Unary**: one variable involved (for example, Bob does not like shirts)
> - **Binary**: two variables involved
> - **Global**: more than two variables involved

### Preference Constraints

These are constraints that **should be** followed, but that can technically be ignored as far as the problem is concerned.

In order to account for these preferences, it is possible to **assign a higher cost** to moves that **break these constraints**.

> [!NOTE]
> In these cases we also talk about COPs (Constraint Optimization Problems)
### Common Global Constraints
- **Alldiff (all different)**: each variable in the constraint must have a unique value
  - **Heuristic**: if the set of all the domains' values is smaller than the number of variables, the constraint is impossible to satisfy (pigeonhole principle)
  - **Heuristic Algorithm**:
```python
def alldiff_heuristic(csp_problem):
    # May be optimized with a heap or priority queue
    # sorted by domain size
    variables: Queue = csp_problem.variables
    possible_values = set()
    for variable in variables:
        possible_values.update(variable.domain)

    halt = False
    while not halt:
        # Snapshot the count so re-inserted variables are not
        # examined twice in the same pass
        remaining_variables = len(variables)
        # Count singleton-domain variables eliminated in this pass
        variables_with_singleton_domain = 0
        for _ in range(remaining_variables):
            candidate_variable = variables.pop()
            domain_size = len(candidate_variable.domain)
            # Check the domain size
            match domain_size:
                # Empty domain: the problem can't be solved
                case 0:
                    return False
                # Singleton domain: its value is forced
                case 1:
                    variables_with_singleton_domain += 1
                    # Take the forced value
                    value = next(iter(candidate_variable.domain))
                    # Remove it from the pool of possible values
                    possible_values.discard(value)
                    # ...and from the domains of the other variables
                    for variable in variables:
                        variable.domain.discard(value)
                # The variable has more values: reinsert it in the queue
                case _:
                    variables.push(candidate_variable)
        # No singleton variables found: the algorithm would
        # stall, so halt now
        if variables_with_singleton_domain == 0:
            halt = True
    # Check that we have enough values to assign to all variables
    return len(possible_values) >= len(variables)
```
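The pigeonhole check at the core of the algorithm above can be condensed into a short runnable sketch (plain sets and dicts, illustrative names):

```python
def alldiff_feasible(domains):
    """Alldiff is impossible when there are fewer distinct values
    across all domains than there are variables (pigeonhole)."""
    pool = set().union(*domains.values())
    return len(pool) >= len(domains)

# 3 distinct values for 2 variables: may be satisfiable
assert alldiff_feasible({"A": {1, 2}, "B": {2, 3}})
# Only 2 distinct values for 3 variables: impossible
assert not alldiff_feasible({"A": {1, 2}, "B": {1, 2}, "C": {1, 2}})
```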
- **Atmost**: defines the maximum amount of a resource
  - **Heuristic**: if the sum of the minimum values of all the scope variables' domains exceeds the maximum, the constraint cannot be satisfied
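A minimal runnable sketch of the Atmost check (function and variable names are illustrative): sum the smallest value each scope variable can take and compare it against the resource cap.

```python
def atmost_consistent(domains, cap):
    """Bound check for Atmost: the sum of the smallest value each
    variable can take must not exceed the resource cap."""
    return sum(min(d) for d in domains.values()) <= cap

# Two variables needing at least 2 units each, cap of 10: fine
assert atmost_consistent({"P1": {2, 3, 4}, "P2": {2, 3}}, cap=10)
# Minimums sum to 11 > 10: inconsistent
assert not atmost_consistent({"P1": {5, 6}, "P2": {6, 7}}, cap=10)
```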
## Constraint Propagation
It's a step in which we **prune** some **values** from the **domains**. This step can be alternated with search or be a preprocessing step.

For the sake of this step, we construct a graph where **variables** are nodes and **binary constraints** are edges.
> [!WARNING]
> We are not talking about search graphs; this is a separate step, unrelated to search

> [!TIP]
> Even though it's not a search, it can sometimes solve the problem before the search even begins. Moreover, every global constraint can be converted into binary constraints, making this step always applicable (at a higher computational cost, for either the machine or the human transforming these constraints)
### Node Consistency
A **node** is **node-consistent** when **all the values in its domain satisfy its unary constraints**
### Arc Consistency

A **node** is **arc-consistent** with respect to another node when, **for every value in its domain, there is at least one value in the other node's domain that satisfies the binary constraint between them**. A node is arc-consistent when this holds for all the binary constraints it is involved in.

> [!WARNING]
> Arc consistency is not useful for every problem

> [!NOTE]
> AC-3 is an algorithm to achieve arc consistency (algorithm on page 171 of the book). Its time complexity is $O(c \cdot d^3)$, where $c$ is the number of constraints and $d$ is the size of the largest domain
```python
def AC3(csp_problem):
    # Take all binary constraints (the edges of the constraint graph)
    queue: Queue[Constraint] = csp_problem.edges
    # Examine all constraints
    while len(queue) > 0:
        # Take the first and initialize the relevant variables
        constraint = queue.pop()
        X = constraint.scope[0]
        Y = constraint.scope[1]
        # Correct X's domain if needed
        correction_applied = apply_constraint([X, Y], constraint.rule)
        # If a correction was applied, check that there
        # are still admissible values
        if correction_applied:
            # If there aren't, notify that the problem is not
            # solvable
            if len(X.domain) == 0:
                return False
            # If there are, re-examine the constraints between
            # X and its other neighbours
            for neighbour in (X.neighbours - {Y}):
                queue.push(csp_problem.search(neighbour, X))
    # Notify that the problem has been correctly constrained
    return True

def apply_constraint(variables: list[Variable], rule: Rule):
    # Gather both domains (copy X's so we can modify the original
    # while iterating)
    x_domain = set(variables[0].domain)
    y_domain = set(variables[1].domain)
    # Nothing modified yet
    x_domain_modified = False
    # Check that every x is supported by at least one y
    for x in x_domain:
        rule_satisfied = False
        # If a single y makes the couple satisfy the constraint,
        # break early
        for y in y_domain:
            if rule.check(x, y):
                rule_satisfied = True
                break
        # If no y satisfies the couple (x, y), remove x from
        # X's actual domain
        if not rule_satisfied:
            variables[0].domain.remove(x)
            x_domain_modified = True
    # Report whether the domain was modified
    return x_domain_modified
```
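The pseudocode above can be made concrete. Here is a minimal, runnable AC-3 over plain dict domains (an illustrative formulation, not the book's exact one): constraints are given as one rule per directed arc, and each revision keeps only supported values.

```python
from collections import deque

def ac3(domains, constraints):
    """domains: {var: set of values}; constraints: {(X, Y): rule},
    one entry per directed arc, with rule(x, y) -> bool."""
    queue = deque(constraints)
    while queue:
        X, Y = queue.popleft()
        rule = constraints[(X, Y)]
        # Keep only the x-values supported by some y-value
        revised = {x for x in domains[X]
                   if any(rule(x, y) for y in domains[Y])}
        if revised != domains[X]:
            domains[X] = revised
            if not revised:
                return False  # empty domain: unsolvable
            # Re-examine the arcs pointing at X (except the one from Y)
            queue.extend(arc for arc in constraints
                         if arc[1] == X and arc[0] != Y)
    return True

# X < Y over {1, 2, 3}: AC-3 prunes 3 from X and 1 from Y
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
constraints = {("X", "Y"): lambda a, b: a < b,
               ("Y", "X"): lambda a, b: a > b}
assert ac3(domains, constraints)
assert domains["X"] == {1, 2} and domains["Y"] == {2, 3}
```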
### Path Consistency

A **set** of **2 variables** is **path-consistent** with respect to a **3rd variable** if, for **every assignment of the pair that satisfies the constraints between the two**, there exists **at least one value** for the **3rd variable** that **satisfies its binary constraints with the other two variables**
### K-Consistency

A **CSP** is **k-consistent** if, for any consistent assignment to any set of $k-1$ variables, a consistent value can always be assigned to any $k$-th variable. It is **strongly k-consistent** if it is also **(k-1)-consistent**, down to **1-consistent (node-consistent)**. A strongly $n$-consistent problem (with $n$ variables) can be solved in $O(n^2 d)$ time, but enforcing strong consistency is expensive, so in practice usually only 2-consistency (arc consistency) is computed.
## Global Constraints

Constraints that take an arbitrary number of variables (but not necessarily all of them).

### Bound Propagation

Sometimes, when we deal with lots of numbers, it's better to represent the possible values as a bound (or a union of bounds).

In this case, to have **bound-consistency**, we need to check that, for both the minimum and the maximum value of each variable, there exists a consistent assignment for the other variables in the constraint.
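A sketch of bound propagation for a single sum constraint (the numbers follow the book's flight example; the function name is mine): given $X + Y = c$ and interval domains, each bound of one variable can be tightened from the bounds of the other.

```python
def propagate_sum(x, y, c):
    """Tighten interval domains (lo, hi) under the constraint X + Y = c."""
    x_lo, x_hi = x
    y_lo, y_hi = y
    # X = c - Y, so X's bounds follow from Y's (and vice versa)
    x_lo, x_hi = max(x_lo, c - y_hi), min(x_hi, c - y_lo)
    y_lo, y_hi = max(y_lo, c - x_hi), min(y_hi, c - x_lo)
    return (x_lo, x_hi), (y_lo, y_hi)

# Two flights must carry 420 people, with capacities 165 and 385
x, y = propagate_sum((0, 165), (0, 385), 420)
assert x == (35, 165) and y == (255, 385)
```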
## Backtracking Search

This is a step coming **after** **constraint propagation**, when we still have multiple values assignable to variables, and it is a *traditional search* over the problem.

It can be interleaved with [constraint propagation](#constraint-propagation) at each step, to see if we are breaking a constraint.

### Commutativity

CSP assignments are **commutative**: the outcome does not depend on the order in which values are assigned. In other words, we don't care much about the order of the assignments, but rather about the assignment itself, so the search can branch on a single variable at each node.

### Heuristics

To make the algorithm faster, we can use some heuristics to help us choose the best strategy:
- **Choose Variable to Assign**:
  - **minimum-remaining-values (MRV)**: consists of choosing the **variable with the fewest values left in its domain**
    - **Pros**:
      - Detection of failures is faster
    - **Cons**:
      - There may still be ties among variables
  - **degree heuristic**: choose the **variable involved in the largest number of constraints**
    - **Pros**:
      - Fast detection of failures
      - Can be a tie-breaker combined with MRV
    - **Cons**:
      - Worse results than MRV on its own
- **Choose Value to Assign**:
  - **least-constraining-value**: choose the **value that rules out the fewest choices for the neighbouring variables**
    - **Pros**:
      - We only care about completing the assignment, not about which values we use, so keeping the neighbours' options open helps
- **Inference**:
  - **forward checking**: whenever a variable is assigned, **propagate its constraints to all unassigned neighbouring variables**, pruning inconsistent values from their domains
    - **Pros**:
      - Fast
    - **Cons**:
      - Does not detect all inconsistencies
  - **MAC (Maintaining Arc Consistency)**: a **call to [AC-3](#arc-consistency)** starting only with the **constraints involving the variable just assigned**
    - **Pros**:
      - Detects more inconsistencies than forward checking
    - **Cons**:
      - Slower than forward checking
- **Backtracking** (where should we go back once we fail):
  - **chronological backtracking**: go one step back
    - **Pros**:
      - Easy to implement
    - **Cons**:
      - Slower computation
  - **backjumping**: go back to the **assignment that led to the conflict**. This is accomplished by keeping **conflict sets**, which track, for each variable, the **previous assignments involved in constraints with it**
    - **Pros**:
      - Faster computation
    - **Cons**:
      - Needs to track a conflict set
      - Useless if any **inference** heuristic has been used
  - **conflict-directed backjumping**: jump back to the assignment that led to a **chain of conflicts**. Each time we detect a failure, we propagate the conflict set to the variable we backtrack to, **recursively**
    - **Pros**:
      - Faster computation
    - **Cons**:
      - Harder to implement
- **Constraint Learning**:
  - **no-good**: record a set of assignments that leads to a failure; whenever we meet such a set again, we fail immediately or avoid it completely
    - **Pros**:
      - Faster computation
    - **Cons**:
      - Memory expensive
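The pieces above can be combined. Here is a minimal runnable sketch of backtracking search with MRV variable ordering and forward checking (my own simplified formulation: a single symmetric rule is assumed to hold between every pair of neighbours):

```python
def backtracking_search(variables, domains, neighbors, rule):
    """Backtracking with MRV and forward checking.
    rule(x_val, y_val) -> bool for every pair of neighbouring variables."""
    def backtrack(assignment, domains):
        if len(assignment) == len(variables):
            return assignment
        # MRV: pick the unassigned variable with the smallest domain
        var = min((v for v in variables if v not in assignment),
                  key=lambda v: len(domains[v]))
        for value in domains[var]:
            # Forward checking: prune unassigned neighbours' domains
            pruned = {v: {w for w in domains[v] if rule(value, w)}
                      for v in neighbors[var] if v not in assignment}
            if all(pruned.values()):  # no neighbour left with an empty domain
                result = backtrack({**assignment, var: value},
                                   {**domains, var: {value}, **pruned})
                if result is not None:
                    return result
        return None  # every value failed: backtrack
    return backtrack({}, domains)

# Map coloring: three mutually adjacent regions, three colors
variables = ["A", "B", "C"]
domains = {v: {"red", "green", "blue"} for v in variables}
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
solution = backtracking_search(variables, domains, neighbors,
                               lambda x, y: x != y)
assert solution is not None
assert all(solution[a] != solution[b] for a in variables for b in neighbors[a])
```

Passing immutable copies of `domains` down the recursion makes undoing a pruning step trivial: abandoning a branch simply discards its dictionaries.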
## Local Search

Book pg. 175 to 178

The idea is that, instead of **building** our goal state assignment by assignment, we start from a **completely assigned state** (usually generated randomly) and we proceed from there.

It can also be useful during online search.

### Local Search Heuristics

- **min-conflicts**: change a random variable (that has conflicts) to a value that would minimize conflicts
  - **Pros**:
    - Almost independent of problem size
    - Somewhat fast
  - **Cons**:
    - Susceptible to plateaus (use tabu search or simulated annealing)
- **constraint weighting**: concentrate the search on the constraints with a higher weight (each one starting with a weight of 1). Choose the variable/value pair that minimizes the total weight of the violated constraints, then increment the weight of each still-violated constraint by one
  - **Pros**:
    - Adds topology to plateaus
  - **Cons**:
    - Difficult to implement

## The Structure of Problems

Book pg. 183 to 187

<!-- TODO -->
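To close the chapter, a runnable sketch of the **min-conflicts** heuristic from the local search section, on the classic n-queens problem (an illustrative implementation; `board[c]` holds the row of the queen in column c):

```python
import random

def conflicts(board, col, row):
    """Number of queens attacking square (col, row)."""
    return sum(1 for c, r in enumerate(board)
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def min_conflicts(n, max_steps=100_000, seed=0):
    rng = random.Random(seed)
    # Start from a complete (random) assignment
    board = [rng.randrange(n) for _ in range(n)]
    for _ in range(max_steps):
        conflicted = [c for c in range(n) if conflicts(board, c, board[c])]
        if not conflicted:
            return board  # no conflicts left: solution found
        # Pick a random conflicted variable...
        col = rng.choice(conflicted)
        # ...and move its queen to a row minimizing conflicts
        # (breaking ties at random to help escape plateaus)
        scores = [conflicts(board, col, r) for r in range(n)]
        best = min(scores)
        board[col] = rng.choice([r for r, s in enumerate(scores) if s == best])
    return None  # stalled: a restart or tabu list would help here

solution = min_conflicts(8)
assert solution is not None
```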