Announcements
Assignment 4 due Thursday
Midterm prep session Thursday
Midterm in class next Tuesday, Feb. 8
To take the test at a different time:
[Link]
CS 188: Artificial Intelligence
Fall 2010
Lecture 9: Logic and Planning
Peter Norvig and Sebastian Thrun
Slide credits: Stuart Russell, Rina Dechter,
Rao Kambhampati
Finding Actions
Search
Over atomic states. Deterministic.
A*, etc.
Policy finding
Atomic or abstract (features). Stochastic.
MDP; POMDP
Introduced today: Planning
Features or structured states
Plan can be hierarchical, partial order
Other refinements can be added
State Representation
Representation
Atomic Representation
s1 = s2; Result(s, a) = s′; GoalTest(s); h(s)
Factored Representation
Attribute(s) = val … (val numeric or Boolean)
Result and GoalTest in terms of attributes
Structured Representation
All of above
Objects
Relations, Functions: Rel(a, b, c); F(a, b) = c
Quantifiers (for all ∀, there exists ∃)
Factored State Variable Models
World is made up of states which are defined in
terms of state variables
Can be Boolean (or multi-valued, or continuous)
States are complete assignments over state
variables (fully observable world)
So, k Boolean state variables can represent how
many states? (See the sketch below.)
Actions change the values of the state variables
Applicability conditions of actions are also specified in
terms of partial assignments over state variables
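As a quick check, a minimal Python sketch (the variable names are illustrative, not from the lecture): k Boolean variables admit exactly 2^k complete assignments, i.e. 2^k states.

    from itertools import product

    # k Boolean state variables (illustrative names)
    variables = ["At_SFO", "At_JFK", "Loaded"]

    # A state is one complete assignment over the variables,
    # so there are 2**k of them.
    states = [dict(zip(variables, vals))
              for vals in product([True, False], repeat=len(variables))]
    assert len(states) == 2 ** len(variables)   # 2**3 = 8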
“Classical” Planning
State: conjunction of Boolean state variables
Action Schema:
Action(Fly(p, from, to),
Precond: At(p, from) ∧ Plane(p)
∧ Airport(from) ∧ Airport(to)
Effect: ¬At(p, from) ∧ At(p, to))
Implicitly defines Actions(s) and Result(s, a)
Expressiveness of the language
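A minimal Python sketch of this idea, under assumed names (the Action class and the applicable/result helpers are ours, not a standard library): states are sets of ground positive literals, and Actions(s) and Result(s, a) fall out of the schema once it is ground.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Action:
        name: str
        precond: frozenset   # literals that must hold in s
        add: frozenset       # literals the action makes true
        delete: frozenset    # literals the action makes false

    def applicable(state, a):
        # Actions(s): a is applicable in s when its preconditions hold there
        return a.precond <= state

    def result(state, a):
        # Result(s, a): drop the delete list, add the add list
        return (state - a.delete) | a.add

    fly = Action("Fly(P1,SFO,JFK)",
                 precond=frozenset({"At(P1,SFO)", "Plane(P1)",
                                    "Airport(SFO)", "Airport(JFK)"}),
                 add=frozenset({"At(P1,JFK)"}),
                 delete=frozenset({"At(P1,SFO)"}))

    s0 = frozenset({"At(P1,SFO)", "Plane(P1)", "Airport(SFO)", "Airport(JFK)"})
    assert applicable(s0, fly)
    s1 = result(s0, fly)   # At(P1,JFK) now holds; At(P1,SFO) no longer does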
Advantages of the Language
Natural to read, write, verify
Abstract over similar actions
Split task into parts (two plane flights)
Easy to extend with more complex
syntax and semantics
Can be compiled or interpreted
Planning Algorithms
Forward (progression) state-space search
… it’s just search
Backward (regression) state-space search
Consider Goal: Own(0136042597)
Action(Buy(i), Pre: ISBN(i), Eff: Own(i))
In general, may involve unbound variables
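Continuing the same hypothetical sketch (reusing the Action class above), ground goal regression can be written as: the action must not delete a goal literal, must achieve at least one, and contributes its preconditions as new subgoals.

    def regress(goal, a):
        # Regress a goal set through ground action a.
        if a.delete & goal:
            return None          # a would undo part of the goal
        if not (a.add & goal):
            return None          # a does not achieve any goal literal
        return (goal - a.add) | a.precond

    buy = Action("Buy(0136042597)",
                 precond=frozenset({"ISBN(0136042597)"}),
                 add=frozenset({"Own(0136042597)"}),
                 delete=frozenset())
    print(regress(frozenset({"Own(0136042597)"}), buy))
    # -> frozenset({'ISBN(0136042597)'})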
Plan-space search
Start with empty plan, add branches
Plan-space search
[Figure: the plan grows from just Start → Finish into a partial-order plan, with Right Shoe and Left Sock steps added on separate branches between Start and Finish]
Progression vs. Regression
Progression: higher branching factor; searches in the space of complete (and consistent) states (2^n complete states for n variables).
Regression: lower branching factor; searches in the space of partial states (3^n partial states, since each variable is true, false, or unspecified).
You can also do bidirectional search:
Stop when a (leaf) state in the progression tree entails a (leaf) state (formula) in the regression tree.
[Figure: blocks-world example — a progression tree over complete states (Ontable(A), Ontable(B), Clear(A), Clear(B), hand-empty, expanded with Pickup(A), Pickup(B), Putdown(A), Putdown(B)) meets a regression tree grown back from the goal through Stack(A,B)]
Planning vs. Search
Search assumes successor and goal-test functions which know how to
make sense of the states and generate new states
Planning makes the additional assumption that the states can be
represented in terms of state variables and their values
Initial and goal states are specified in terms of assignments over state variables
Which means goal-test doesn’t have to be a black-box procedure
That the actions modify these state variable values
The preconditions and effects of the actions are in terms of partial assignments over
state variables
Given these assumptions certain generic goal-test and successor functions can
be written
Specifically, we discussed one Successor called “Progression”, another called
“Regression” and a third called “Partial-order”
Notice that the additional assumptions made by planning do not change the
search algorithms (A*, DFS, etc)—they only change the successor and
goal-test functions
In particular, search still happens in terms of search nodes that have parent
pointers etc.
The “state” part of the search node will correspond to
“Complete state variable assignments” in the case of progression
“Partial state variable assignments” in the case of regression
“A collection of steps, orderings, causal commitments and open-conditions” in the case of
partial-order planning
State of the art
Annual planning competitions
Best technique has varied over time
Currently: mostly forward state-space
Largely due to good heuristics (relaxed problems)
Heuristics for atomic (state search) problem
Can only come from outside analysis of domain
Heuristics for factored (planning) problem
Can be domain-independent
8-puzzle state space
8-puzzle action schema
Action(Slide(t, a, b),
Pre: On(t, a) ∧ Tile(t) ∧ Blank(b) ∧ Adjacent(a, b)
Eff: On(t, b) ∧ Blank(a) ∧ ¬On(t, a) ∧ ¬Blank(b))
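Under the same hypothetical Action sketch as before, grounding this schema is one constructor call per instance (tiles and cells encoded as strings, a convention of this sketch):

    def slide(t, a, b):
        # Ground one instance of the Slide schema
        return Action(f"Slide({t},{a},{b})",
                      precond=frozenset({f"On({t},{a})", f"Tile({t})",
                                         f"Blank({b})", f"Adjacent({a},{b})"}),
                      add=frozenset({f"On({t},{b})", f"Blank({a})"}),
                      delete=frozenset({f"On({t},{a})", f"Blank({b})"}))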
8-puzzle heuristics
Convex search: ignore delete lists (a relaxed problem whose solution length gives a heuristic)
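A sketch of the ignore-delete-lists relaxation, again reusing the hypothetical Action class: repeatedly apply every applicable action without its delete list and count the levels until the goals appear; that count can serve as h(s).

    def relaxed_level(state, actions, goals):
        # Ignore delete lists: literals only accumulate, so the loop terminates.
        reached = set(state)
        level = 0
        while not goals <= reached:
            new = set()
            for a in actions:
                if a.precond <= reached:
                    new |= a.add
            if new <= reached:
                return None       # goals unreachable even in the relaxation
            reached |= new
            level += 1
        return level              # number of relaxed levels, usable as h(s)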
Factored Rep allows control
Planning Graphs
Planning graphs are an efficient way to
create a representation of a planning
problem that can be used to
Achieve better heuristic estimates
Directly construct plans
Planning graphs only work for
propositional problems
Compile to propositional if necessary
Planning Graphs
A planning graph consists of a sequence of levels
that correspond to time steps in the plan.
Level 0 is the initial state.
Each level consists of a set of literals and a
set of actions that represent what might be
possible at that step in the plan
Might be is the key to efficiency
Records only a restricted subset of possible
negative interactions among actions.
Planning Graphs
Each level consists of
Literals = all those that could be true at
that time step, depending upon the actions
executed at preceding time steps.
Actions = all those actions that could
have their preconditions satisfied at that
time step, depending on which of the
literals actually hold.
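A simplified sketch of one level expansion, continuing the hypothetical Action class. Mutex bookkeeping is omitted, and negative literals are encoded as "¬"-prefixed strings purely as a convention of this sketch.

    def expand_level(literals, actions):
        # Actions whose preconditions all appear in the current literal level.
        layer = [a for a in actions if a.precond <= literals]
        nxt = set(literals)       # persistence (no-op) actions copy literals forward
        for a in layer:
            nxt |= a.add
            nxt |= {"¬" + p for p in a.delete}   # negated effects become literals
        return layer, frozenset(nxt)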
Planning Graph Example
Init(Have(Cake))
Goal(Have(Cake) ∧ Eaten(Cake))
Action(Eat(Cake),
PRECOND: Have(Cake)
EFFECT: ¬Have(Cake) ∧ Eaten(Cake))
Action(Bake(Cake),
PRECOND: ¬Have(Cake)
EFFECT: Have(Cake))
Planning Graph Example
Create level 0 from initial problem state.
Planning Graph Example
Add all applicable actions.
Add all effects to the next state.
Planning Graph Example
Add persistence actions (inaction = no-ops) to
map all literals in state Si to state Si+1.
Planning Graph Example
Identify mutual exclusions between actions and
literals based on potential conflicts.
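Running the expand_level sketch above on the cake problem (same string conventions) reproduces these steps:

    eat = Action("Eat(Cake)",
                 precond=frozenset({"Have(Cake)"}),
                 add=frozenset({"Eaten(Cake)"}),
                 delete=frozenset({"Have(Cake)"}))
    bake = Action("Bake(Cake)",
                  precond=frozenset({"¬Have(Cake)"}),
                  add=frozenset({"Have(Cake)"}),
                  delete=frozenset())

    s0 = frozenset({"Have(Cake)", "¬Eaten(Cake)"})
    a0, s1 = expand_level(s0, [eat, bake])
    # Only Eat(Cake) is applicable at A0 (Bake needs ¬Have(Cake)).
    # s1 = {Have(Cake), ¬Have(Cake), Eaten(Cake), ¬Eaten(Cake)}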
Mutual exclusion
A mutex relation holds between two actions
when:
Inconsistent effects: one action negates the effect of another.
Interference: one of the effects of one action is the negation
of a precondition of the other.
Competing needs: one of the preconditions of one action is
mutually exclusive with the precondition of the other.
A mutex relation holds between two literals
when:
one is the negation of the other OR
each possible action pair that could achieve the
literals is mutex (inconsistent support).
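The three action-mutex conditions translate almost verbatim into the sketch; literal_mutexes, a set of mutex literal pairs from the previous level, is an assumed data layout.

    def actions_mutex(a1, a2, literal_mutexes):
        # Inconsistent effects: one action negates an effect of the other.
        inconsistent_effects = bool((a1.add & a2.delete) | (a2.add & a1.delete))
        # Interference: an effect of one negates a precondition of the other.
        interference = bool((a1.delete & a2.precond) | (a2.delete & a1.precond))
        # Competing needs: mutually exclusive preconditions.
        competing_needs = any((p, q) in literal_mutexes or (q, p) in literal_mutexes
                              for p in a1.precond for q in a2.precond)
        return inconsistent_effects or interference or competing_needs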
Cake example
Level S1 contains all literals that could result from
picking any subset of actions in A0
Conflicts between literals that cannot occur together
(as a consequence of the selected actions) are
represented by mutex links.
S1 defines multiple states and the mutex links are the
constraints that define this set of states.
Cake example
Repeat process until graph levels off:
two consecutive levels are identical
PG and Heuristic Estimation
PG’s provide information about the problem
PG is a relaxed problem.
A literal that does not appear in the final level
of the graph cannot be achieved by any plan.
h(n) = ∞
Level Cost: First level in which a goal appears
Very low estimate, since several actions can occur at each level
Improvement: restrict to one action per level
using serial PG (add mutex links between every pair
of actions, except persistence actions).
PG and Heuristic Estimation
Cost of a conjunction of goals
Max-level: maximum first level of any of
the goals
Sum-level: sum of first levels of all the
goals
Set-level: First level in which all goals
appear without being mutex
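A sketch of set-level over an already-built graph; levels (a list of literal sets) and mutexes (per-level sets of mutex literal pairs) are assumed data layouts, not a fixed API.

    def set_level(levels, mutexes, goals):
        # First level where every goal appears and no pair of goals is mutex.
        for i, lits in enumerate(levels):
            if goals <= lits and not any(
                    (g1, g2) in mutexes[i]
                    for g1 in goals for g2 in goals if g1 != g2):
                return i
        return float("inf")       # goal never appears: h(n) = ∞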
The GRAPHPLAN Algorithm
Extract a solution directly from the PG
function GRAPHPLAN(problem) returns solution or failure
   graph ← INITIAL-PLANNING-GRAPH(problem)
   goals ← GOALS[problem]
   loop do
      if goals all non-mutex in last level of graph then do
         solution ← EXTRACT-SOLUTION(graph, goals, LEN(graph))
         if solution ≠ failure then return solution
      else if NO-SOLUTION-POSSIBLE(graph) then return failure
      graph ← EXPAND-GRAPH(graph, problem)
GRAPHPLAN example
Initially the graph consists of the 5 literals from the initial state (S0).
EXPAND-GRAPH adds the actions whose preconditions are satisfied (A0).
It also adds persistence actions and mutex relations.
Add the effects at level S1.
Repeat until the goal appears in level Si.
GRAPHPLAN example
EXPAND-GRAPH also looks for mutex relations
Inconsistent effects
E.g. Remove(Spare, Trunk) and LeaveOverNight, due to At(Spare, Ground) and ¬At(Spare, Ground)
Interference
E.g. Remove(Flat, Axle) and LeaveOverNight: At(Flat, Axle) as PRECOND and ¬At(Flat, Axle) as
EFFECT
Competing needs
E.g. PutOn(Spare, Axle) and Remove(Flat, Axle), due to At(Flat, Axle) and ¬At(Flat, Axle)
Inconsistent support
E.g. in S2, At(Spare,Axle) and At(Flat,Axle)
GRAPHPLAN example
In S2, all goal literals appear and no pair of them is mutex
A solution might exist, so EXTRACT-SOLUTION will try to find it
EXTRACT-SOLUTION can search with:
Initial state = last level of PG; Goal = goals of the planning problem
Actions = select any set of non-conflicting actions that cover the goals in the state
Goal = reach level S0 such that all goals are satisfied
Cost = 1 for each action.
GRAPHPLAN Termination
Termination of graph construction? YES
PG are monotonically increasing or decreasing:
Literals increase monotonically
Actions increase monotonically
Mutexes decrease monotonically
Because of these properties and because
there is a finite number of actions and literals,
every PG will eventually level off
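The level-off test itself is a single comparison (same assumed layouts as the earlier sketches):

    def leveled_off(levels, mutexes):
        # Two consecutive levels identical in both literals and mutex sets.
        return (len(levels) >= 2
                and levels[-1] == levels[-2]
                and mutexes[-1] == mutexes[-2])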
Planning with Structured Rep
Situation Calculus
First-order Logic
Situation Calculus
Convenient to have a more expressive language
“Move all the cargo from SFO to JFK”
Use existing mechanisms for logical proof
Strong foundation for studying planning
Still, less used in practice than other
techniques
Situation Calculus
Possibility Axioms (for each action)
SomeFormula(s) ⇒ Poss(a, s)
Alive(Agent, s) ∧ Have(Agent, Arrow, s) ⇒
Poss(Shoot, s)
Successor-state Axiom (for each fluent)
Poss(a, s) ⇒ (fluent is true ⇔ a made it true
∨ it was true and a left it alone)
Poss(a, s) ⇒ (Holding(Agent, g, Result(s, a)) ⇔
a = Grab(g) ∨
(Holding(Agent, g, s) ∧ a ≠ Release(g)))
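One way to read the successor-state axiom is as a recursive definition over situations. In this hypothetical Python rendering (names and encoding are ours), a situation is simply the tuple of actions applied so far to the initial situation.

    def holding(g, situation):
        # Successor-state axiom as recursion: Holding(Agent, g, Result(s, a)) ⇔
        #   a = Grab(g) ∨ (Holding(Agent, g, s) ∧ a ≠ Release(g))
        if not situation:                  # initial situation: nothing is held
            return False
        *rest, a = situation
        return a == ("Grab", g) or (holding(g, tuple(rest)) and a != ("Release", g))

    assert holding("gold", (("Grab", "gold"),))
    assert not holding("gold", (("Grab", "gold"), ("Release", "gold")))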