Planning/acting in
Nondeterministic Domains
So far only looked at classical planning,
i.e. environments are fully observable, static, deterministic
Also assumed that action descriptions are correct and
complete
Unrealistic in many real-world applications:
Don’t know everything; may even hold incorrect
information
Actions can go wrong
Distinction: bounded vs. unbounded indeterminacy: can
possible preconditions and effects be listed at all?
Unbounded indeterminacy is related to the qualification problem.
Methods for handling
indeterminacy
Sensor-less/conformant planning: achieve goal in all
possible circumstances, relies on coercion
Contingency planning: for partially observable and non-
deterministic environments; includes sensing actions and
describes different paths for different circumstances
Online planning and re-planning: check whether plan
requires revision during execution and re-plan accordingly
Sensor-less planning
Agent has no sensors to tell which state it is in, therefore each
action might lead to one of several possible outcomes
Must reason about sets of states (belief states), and make sure
it arrives in a goal state regardless of its initial state and of the
results of its actions
Non-determinism of the environment does not matter – the
agent cannot detect the difference anyway
The required reasoning is often not feasible, and sensor-less
planning is therefore often not applicable
Sensor-less (Conformant)
Planning
Handles domains where the state of the world is not fully known.
Comes up with plans that work in all possible cases.
Example:
– You have a wall made of bricks.
– You have a can of white paint.
– Action: Paint(brick), effect: Color(brick, white).
– Goal: every brick should be painted white.
In a fully observable domain, you could:
– Know the initial color of every brick.
– Make a plan to paint all the bricks that are not white initially.
– No need to paint bricks that are already white.
Suppose the world is not fully observable.
– We actually cannot observe the color of a brick.
Suppose that the world is deterministic.
– The effects of an action are known in advance.
What plan would ensure achieving the goal?
– Paint all bricks, regardless of their initial color (which we
don't know anyway).
– It may be overkill, since some bricks may already be white,
but it is the only plan that guarantees achieving the goal.
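In code, conformant planning amounts to reasoning over belief states, i.e. sets of possible worlds. The sketch below (all names such as BRICKS, paint and update are illustrative assumptions, not from the slides) checks that the "paint everything" plan achieves the goal in every possible initial state of a three-brick wall:

from itertools import product

BRICKS = ["b1", "b2", "b3"]     # an illustrative three-brick wall
COLORS = ["white", "red"]       # possible (unobserved) initial colours

# A world state assigns each brick a colour; the initial belief state is the
# set of ALL states consistent with what we know (here: nothing at all).
initial_belief = {frozenset(zip(BRICKS, cs))
                  for cs in product(COLORS, repeat=len(BRICKS))}

def paint(state, brick):
    # Deterministic effect of Paint(brick): Color(brick, white).
    return frozenset((b, "white" if b == brick else c) for b, c in state)

def update(belief, brick):
    # Belief-state update: apply the action in every possible world.
    return {paint(s, brick) for s in belief}

def goal(state):
    return all(c == "white" for _, c in state)

# The conformant plan paints every brick, whatever its unknown colour.
belief = initial_belief
for b in BRICKS:
    belief = update(belief, b)

# The plan works: every world in the final belief state satisfies the goal.
assert all(goal(s) for s in belief)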
Example: Paint table and chair the same colour
Initial State: We have two cans of paint, a table and a chair, but
the colours of the paint and of the furniture are unknown:
Object(Table) ∧ Object(Chair) ∧ Can(C1) ∧ Can(C2) ∧
InView(Table)
Goal State: Chair and table same colour:
Color(Chair, c) ∧ Color(Table, c)
Actions: To look at something; to open a can; to paint.
Formal Representation of the Three Actions
Now we allow variables in preconditions that aren't part of the
action's variable list!
Action(LookAt(x),
   Precond: InView(y) ∧ (x ≠ y)
   Effect: InView(x) ∧ ¬InView(y))

Action(RemoveLid(can),
   Precond: Can(can)
   Effect: Open(can))

Action(Paint(x, can),
   Precond: Object(x) ∧ Can(can) ∧ Color(can, c) ∧ Open(can)
   Effect: Color(x, c))
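The same three schemas could be encoded as plain data, e.g. in Python; this is only an illustrative sketch (the dictionary layout is ours, not from the slides). Note that y (in LookAt) and c (in Paint) are exactly the free variables that do not appear in the action's parameter list.

ACTIONS = [
    # Each schema: parameters, precondition literals, effect literals.
    {"name": "LookAt", "params": ["x"],
     "precond": ["InView(y)", "x != y"],
     "effect": ["InView(x)", "not InView(y)"]},
    {"name": "RemoveLid", "params": ["can"],
     "precond": ["Can(can)"],
     "effect": ["Open(can)"]},
    {"name": "Paint", "params": ["x", "can"],
     "precond": ["Object(x)", "Can(can)", "Color(can, c)", "Open(can)"],
     "effect": ["Color(x, c)"]},
]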
Sensing with Percepts
A percept schema models the agent’s sensors.
It tells the agent what it knows, given certain conditions about
the state it’s in.
Percept(Color(x, c),
Precond:Object(x) ∧ InView(x))
Percept(Color(can, c),
Precond:Can(can) ∧ Open(can) ∧ InView(can) )
A fully observable environment has a percept axiom for each
fluent with no preconditions!
A sensor-less planner has no percept schemata at all!
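As a rough illustration of how percept schemata generate percepts (the state representation and the percepts function below are our own assumptions, not from the slides):

def percepts(state):
    # Yield a Color(x, c) percept for every true Color fact whose
    # percept-schema precondition is satisfied in this state.
    out = []
    for (x, c) in state["Color"]:
        if x not in state["InView"]:
            continue                                    # both schemas need InView(x)
        if x in state["Object"]:                        # Object(x) ∧ InView(x)
            out.append(("Color", x, c))
        elif x in state["Can"] and x in state["Open"]:  # Can ∧ Open ∧ InView
            out.append(("Color", x, c))
    return out

# Example: the chair is in view, so its colour is perceived;
# the can is in view but still closed, so its colour is not.
s = {"Object": {"Chair"}, "Can": {"C1"}, "Open": set(),
     "InView": {"Chair", "C1"}, "Color": {("Chair", "Red"), ("C1", "White")}}
assert percepts(s) == [("Color", "Chair", "Red")]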
Planning
One could force the table and chair to be the same colour by
painting them both—a sensor-less planner would have to do
this!
But a contingent planner can do better than this:
1. Look at the table and chair to sense their colours.
2. If they’re the same colour, you’re done.
3. If not, look at the paint cans.
4. If one of the cans is the same colour as one of the pieces of
furniture, then apply that paint to the other piece of furniture.
5. Otherwise, paint both pieces with one of the cans.
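The five steps translate directly into branching code. The sketch below is only illustrative: look_at and paint are hypothetical helpers standing in for the LookAt/Paint actions, and the RemoveLid steps are omitted for brevity.

def contingent_paint_plan(look_at, paint):
    chair, table = look_at("Chair"), look_at("Table")   # 1. sense the colours
    if chair == table:                                  # 2. already the same: done
        return
    c1, c2 = look_at("C1"), look_at("C2")               # 3. look at the paint cans
    for can, colour in (("C1", c1), ("C2", c2)):
        if colour == chair:                             # 4. a can matches one piece:
            paint("Table", can)                         #    paint the other piece
            return
        if colour == table:
            paint("Chair", can)
            return
    paint("Chair", "C1")                                # 5. otherwise paint both
    paint("Table", "C1")                                #    with the same can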
Sensor-less Planning
There are no InView fluents, because there are no sensors!
There are unchanging facts:
Object(Table) ∧ Object(Chair) ∧ Can(C1) ∧ Can(C2)
And we know that the objects and cans have colours:
∀x ∃c Color(x, c)
After skolemization of this first-order formula we get the initial
belief state (C(x) is a Skolem function denoting the unknown
colour of x):
b0 = Color(x, C(x))
A belief state corresponds exactly to the set of possible worlds
that satisfy the formula—open world assumption.
The Plan
[RemoveLid(C1), Paint(Chair, C1), Paint(Table, C1)]
Rules:
You can only apply actions whose preconditions are satisfied by
the current belief state b.
The update of a belief state b given an action a is the set of all
states that result (in the physical transition model) from doing a
in each possible state s that satisfies belief state b:
b′ = Result(b, a) = {s′ : s′ = Result_P(s, a) ∧ s ∈ b}
Or, when a belief b is expressed as a formula:
1. If the action adds l, then l becomes a conjunct of the formula b′ (and
the conjunct ¬l is removed, if present); so b′ |= l
2. If the action deletes l, then ¬l becomes a conjunct of b′ (and l is removed).
3. If the action says nothing about l, then l keeps its value from b.
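These three rules amount to simple literal bookkeeping on the conjunction. A minimal sketch, assuming a belief is represented as a set of literal strings (our own representation, not from the slides):

def update_belief(belief, add, delete):
    # belief: conjunction as a set of literals; add/delete: the action's effects.
    neg = lambda l: l[1:] if l.startswith("¬") else "¬" + l
    b = set(belief)
    for l in add:            # rule 1: l becomes a conjunct, ¬l is dropped
        b.discard(neg(l))
        b.add(l)
    for l in delete:         # rule 2: ¬l becomes a conjunct, l is dropped
        b.discard(l)
        b.add(neg(l))
    return b                 # rule 3: everything else keeps its value from b

# e.g. RemoveLid(C1) adds Open(C1):
b1 = update_belief({"Color(x, C(x))"}, add={"Open(C1)"}, delete=set())
assert b1 == {"Color(x, C(x))", "Open(C1)"}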
Showing the Plan Works
b0 = Color(x, C(x))
b1 = Result(b0, RemoveLid(C1))
   = Color(x, C(x)) ∧ Open(C1)
b2 = Result(b1, Paint(Chair, C1))
   (the binding {x/C1, c/C(C1)} satisfies the Precond)
   = Color(x, C(x)) ∧ Open(C1) ∧ Color(Chair, C(C1))
b3 = Result(b2, Paint(Table, C1))
   = Color(x, C(x)) ∧ Open(C1) ∧
     Color(Chair, C(C1)) ∧ Color(Table, C(C1))
So far, we have only considered actions that have the same
effects on all states where the preconditions are satisfied.
This means that any initial belief state that is a conjunction is
updated by the actions to a belief state that is also a
conjunction.
But some actions are best expressed with conditional effects.
This is especially true if the effects are non-deterministic, but
in a bounded way.
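For example (a standard vacuum-world illustration in the same schema notation, not taken from these slides), a Suck action whose outcome depends on where the robot is can be written with "when" clauses:
Action(Suck,
   Effect: when AtL: CleanL ∧ when AtR: CleanR)
If the robot's location is unknown, applying Suck to an empty belief state yields
(AtL ∧ CleanL) ∨ (AtR ∧ CleanR),
which is no longer a conjunction of literals.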
Multi-Agent Planning
Why is multi-agent planning
needed?
PERFORMANCE/EFFICIENCY
If new agents are introduced into a single-agent environment
and the single agent does not change its basic algorithms, it
may perform poorly.
Agents are not indifferent to other agents' intentions (unlike
nature). Agents can therefore cooperate, compete, or coordinate.
Sometimes distributed computations are easier to understand
and develop.
Multi-agent planning
Multi-agent planning coordinates multiple autonomous agents
so that they achieve collective goals.
It covers a range of strategies and methods for coordinating the
decision-making of several agents acting in dynamic
environments.
Multiagent Planning Components
Agents: The agents in a multi-agent system are autonomous: they
perceive the environment through sensors and act on it through
actuators. Agents may have internal processes, such as planning
algorithms or learning mechanisms, that determine how they act.
Environment: The environment is where the agents operate. Its
characteristics can change over time, and its complexity stems from
its scale, its interconnections and its unpredictability.
Communication: A key aspect of multi-agent planning is the agents'
ability to exchange information and synchronize their actions.
Communication can be realized through techniques such as message
passing or shared memory, and it is a prerequisite for cooperation,
synchronization and conflict resolution among agents.
Collaboration: Collaborative strategies encourage agents to interact
and act jointly. This includes task sharing, information exchange,
conflict management and team building. Working together increases
the collective capability and the overall efficiency of the system.
Multiagent Planning System
Architecture
At its core, a multi-agent planning system involves:
Goal Specification: Agents are grouped and coordinated around a
shared objective towards which they direct their efforts.
Knowledge Sharing: Agents exchange relevant information that
feeds into each other's decision making.
Action Coordination: Agents' actions are coordinated so that
conflicts are avoided and synergies are exploited.
Adaptation: Plans must be able to adapt to challenges and goals
that change continuously.
Types of Multi-agent Planning
Centralized Planning: A single unit or central controller decides
what every agent should do, based on the state of the whole system.
This makes coordination easier, but the controller can become a
bottleneck and a single point of failure.
Decentralized Planning: Each agent makes its own decisions based
on locally available information and limited communication with
other agents. This approach is more robust and scalable, but proper
coordination is harder to achieve.
Distributed Planning: A hybrid approach in which agents share
some information and adjust their plans in order to achieve common
objectives. It combines advantages of the centralized and
decentralized approaches, trying to balance coordination and
autonomy.
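The contrast between the first two approaches can be sketched in a few lines of Python (purely illustrative; choose_action stands in for any single-agent decision procedure and is a hypothetical placeholder):

def centralized_plan(global_state, agents, choose_action):
    # One controller sees the whole state and picks an action for every agent.
    return {agent: choose_action(global_state, agent) for agent in agents}

def decentralized_step(agent, local_view, messages, choose_action):
    # Each agent decides from its own observation plus limited communication.
    return choose_action((local_view, tuple(messages)), agent)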
The doubles tennis problem: Example
Agents (A, B)
Init(At(A, [Left, Baseline]) ∧ At(B, [Right, Net]) ∧
     Approaching(Ball, [Right, Baseline]) ∧ Partner(A, B) ∧
     Partner(B, A))
Goal(Returned(Ball) ∧ At(agent, [x, Net]))
Action(Hit(agent, Ball),
   PRECOND: Approaching(Ball, [x, y]) ∧ At(agent, [x, y]) ∧
            Partner(agent, partner) ∧ ¬At(partner, [x, y])
   EFFECT: Returned(Ball))
Action(Go(agent, [x, y]),
   PRECOND: At(agent, [a, b])
   EFFECT: At(agent, [x, y]) ∧ ¬At(agent, [a, b]))
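One joint plan consistent with these schemas is for A to run to the right baseline and hit the ball while B stays at the net. The following sketch (our own state representation, not from the slides) checks that this plan reaches the goal:

PARTNER = {"A": "B", "B": "A"}
state = {
    "At": {"A": ("Left", "Baseline"), "B": ("Right", "Net")},
    "Approaching": ("Right", "Baseline"),
    "Returned": False,
}

def go(s, agent, loc):
    s["At"][agent] = loc                      # Go(agent, [x, y])

def hit(s, agent):
    loc = s["At"][agent]
    partner = PARTNER[agent]
    # Hit preconditions: ball approaching the agent's square, partner elsewhere.
    assert s["Approaching"] == loc and s["At"][partner] != loc
    s["Returned"] = True                      # Effect: Returned(Ball)

# Joint plan: A: [Go(A, [Right, Baseline]), Hit(A, Ball)]; B: [NoOp, NoOp]
go(state, "A", ("Right", "Baseline"))
hit(state, "A")

goal_reached = state["Returned"] and any(pos[1] == "Net" for pos in state["At"].values())
assert goal_reached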