Platform-Aware Mission Planning: Stefan Panjkovic, Alessandro Cimatti, Andrea Micheli, Stefano Tonetta
Abstract

Planning for autonomous systems typically requires reasoning with models at different levels of abstraction, and the harmonization of two competing sets of objectives: high-level mission goals that refer to an interaction of the system with the external environment, and low-level platform constraints that aim to preserve the integrity and the correct interaction of the subsystems. The complicated interplay between these two models makes it very hard to reason on the system as a whole, especially when the objective is to find plans with robustness guarantees, considering the non-deterministic behavior of the lower layers of the system.

In this paper, we introduce the problem of Platform-Aware Mission Planning (PAMP), addressing it in the setting of temporal durative actions. The PAMP problem differs from standard temporal planning for its exists-forall nature: the high-level plan dealing with mission goals is required to satisfy safety and executability constraints, for all the possible non-deterministic executions of the low-level model of the platform and the environment. We propose two approaches for solving PAMP. The first baseline approach amalgamates the mission and platform levels, while the second is based on an abstraction-refinement loop that leverages the combination of a planner and a verification engine. We prove the soundness and completeness of the proposed approaches and validate them experimentally, demonstrating the importance of heterogeneous modeling and the superiority of the technique based on abstraction-refinement.

Introduction

A commonly employed architecture to realize autonomous systems consists in using automated planning for the synthesis of plans to achieve given mission goals, which are then executed on the system’s platform (and the environment the system operates in). This separation of concerns allows the planner to reason on a high-level model, disregarding details and non-deterministic behaviors of the platform. However, in safety-critical applications, the mission objectives (also called “science objectives”, in the space domain) often conflict with the safety constraints dictated by the platform and the environment (also called “engineering constraints”). The former are easily expressible as goals in the planning problem and are existential in nature (i.e., one needs to find any plan achieving the goals), whereas safety constraints are universally quantified over all the possible executions of the given plan (i.e., one needs to prevent any execution violating the safety). To represent this scenario and to offer reasoning guarantees on the autonomous system as a whole, we need to model both the planning objectives and the possible platform behaviors, formalizing the link between these levels.

In this paper, we aim at finding solution plans achieving high-level mission goals and offering formal robustness guarantees on the execution of such plans on the platform. We formalize and tackle the “Platform-Aware Mission Planning” (PAMP) problem, which aims at finding a plan guaranteed to achieve the mission objectives, such that all the possible evolutions of the platform controlled by the plan satisfy a set of safety and executability properties, taking into account both the flexibility of the execution platform (that is, the possible choices the platform can operate while obeying a given plan) and the non-determinism from the environment. Concretely, we propose a formal framework, in which a high-level temporal planning representation is coupled with a low-level description of the platform that executes the generated plans. We use a standard temporal planning model adapted from Gigante et al. (2022) and a timed automaton (Alur and Dill 1994) to represent the platform level. This formal framework uses existing models and is thus easily instantiated in practice; moreover, it mirrors the architecture of real-world autonomous systems (Gat 1998).

We propose two techniques to solve the PAMP problem. First, we develop a baseline “amalgamated” approach grounded in Satisfiability Modulo Theory (SMT) (Barrett et al. 2009): we combine a standard encoding for temporal planning with a novel encoding for the executability and safety of a symbolic temporal plan. The resulting formula is quantified (∃∀), mirroring the intuitive quantifier alternation in the problem definition. Second, we propose a much more efficient approach exploiting the subdivision of the framework in two layers of abstraction. The technique uses a state-of-the-art heuristic search temporal planner to generate candidate plans, which are then checked for safety and executability, explaining the conflicts as sequences of events that the planner is required to avoid for subsequent candidates. This check employs a specialized version of the amalgamated SMT encoding. We formally prove the soundness and completeness of the proposed approaches and we develop two scalable case-studies to empirically evaluate them, showing their empirical effectiveness.
Background

Temporal Planning. We start by defining the syntax of a temporal planning problem: we adapt the formal model used by Gigante et al. (2022), which is quite close to PDDL 2.1 level 3 (Fox and Long 2003).

Definition 1 (Temporal Planning Problem). A temporal planning problem $\Pi$ is a tuple $\langle P, A, I, G \rangle$, where $P$ is a set of propositions, $A$ is a set of durative actions, $I : P \to \{\top, \bot\}$ is the initial state and $G \subseteq P$ is the goal condition. A snap (instantaneous) action is a tuple $h = \langle \mathrm{pre}(h), \mathrm{eff}^+(h), \mathrm{eff}^-(h) \rangle$, where $\mathrm{pre}(h) \subseteq P$ is the set of preconditions and $\mathrm{eff}^+(h), \mathrm{eff}^-(h) \subseteq P$ are two disjoint sets of propositions, called the positive and negative effects of $h$, respectively. We write $\mathrm{eff}(h)$ for $\mathrm{eff}^+(h) \cup \mathrm{eff}^-(h)$. A durative action $a \in A$ is a tuple $\langle a_\vdash, a_\dashv, \mathrm{pre}_\leftrightarrow(a), [L_a, U_a] \rangle$, where $a_\vdash$ and $a_\dashv$ are the start and end snap actions, respectively, $\mathrm{pre}_\leftrightarrow(a) \subseteq P$ is the over-all condition, and $L_a \in \mathbb{Q}_{>0}$ and $U_a \in \mathbb{Q}_{>0} \cup \{\infty\}$ are the bounds on the action duration.

A (time-triggered) plan is defined as a set of triples, each specifying an action, its starting time and its duration.

Definition 2 (Plan). Let $\Pi = \langle P, A, I, G \rangle$ be a temporal planning problem. A plan for $\Pi$ is a set of tuples $\pi = \{\langle a_1, t_1, d_1 \rangle, \ldots, \langle a_n, t_n, d_n \rangle\}$, where, for each $1 \le i \le n$, $a_i \in A$ is a durative action, $t_i \in \mathbb{Q}_{\ge 0}$ is its start time, and $d_i \in \mathbb{Q}_{>0}$ is its duration.

We will call length of a time-triggered plan $\pi$ (denoted with $|\pi|$) the number of snap actions in $\pi$ (i.e. twice the number of durative actions).

A time-triggered plan $\pi$ is a solution plan for the problem $\Pi$ if, starting from the initial state $I$, each durative action in the plan can be applied at the specified time with the given duration (the preconditions of its start and end snap actions are true at the start and at the end of the action respectively), and if by applying all the effects a final state is reached after the end of the last action in which the goal condition is satisfied. The formal semantics is presented in (Gigante et al. 2022), which we omit here for the sake of brevity.

We assume a semantics without self-overlapping of actions (Gigante et al. 2022), which makes the temporal planning problem decidable: it is not possible for two instances of the same ground action to overlap in time.

Definition 3 (Action self-overlapping). A plan $\{\langle a_1, t_1, d_1 \rangle, \ldots, \langle a_n, t_n, d_n \rangle\}$ is without self-overlapping if there exist no distinct $i, j \in \{1, \ldots, n\}$ such that $a_i = a_j$ and $t_i \le t_j < t_i + d_i$.

This formal model of temporal planning is simplified with respect to concrete planning languages (e.g. for the sake of simplicity we only defined the ground model, while most languages allow a first-order lifted representation for compactness), but it already achieves the full computational complexity of very expressive languages such as ANML (Gigante, Micheli, and Scala 2022).

Timed Automata. Here, we recall the standard definitions.

Definition 4 (Clock constraints). Let $\mathcal{X}$ be a finite set of elements called clocks. A clock constraint is a conjunctive formula of atomic constraints of the form $x \sim n$ or $x - y \sim n$, where $x, y \in \mathcal{X}$, ${\sim} \in \{\le, <, =, >, \ge\}$ and $n \in \mathbb{N}$. We use $\mathcal{C}(\mathcal{X})$ to denote the set of clock constraints on $\mathcal{X}$.

A Timed Automaton generalizes finite-state automata by means of clock variables that can be reset and track the advancement of time.

Definition 5 (Timed Automaton). A Timed Automaton (TA) is a tuple $\mathcal{T} = \langle \Sigma, L, l_0, \mathcal{X}, \Delta, Inv \rangle$, where:
• $\Sigma$ is the alphabet;
• $L$ is a finite set of locations;
• $l_0 \in L$ is the initial location;
• $\mathcal{X}$ is a finite set of clocks;
• $\Delta \subseteq L \times \mathcal{C}(\mathcal{X}) \times \Sigma \times 2^{\mathcal{X}} \times L$ is the transition relation;
• $Inv : L \to \mathcal{C}(\mathcal{X})$ maps each location to its invariant.
We will write $l \xrightarrow{g,a,r} l'$ when $\langle l, g, a, r, l' \rangle \in \Delta$.

Definition 6 (State of TAs). Given a TA $\mathcal{T} = \langle \Sigma, L, l_0, \mathcal{X}, \Delta, Inv \rangle$, a state of $\mathcal{T}$ is a pair $\langle l, u \rangle$, where $l \in L$ and $u : \mathcal{X} \to \mathbb{R}_{\ge 0}$ is a clock assignment.

We use $u \models g$ to mean that the clock values denoted by $u$ satisfy the guard $g \in \mathcal{C}(\mathcal{X})$. For $d \in \mathbb{R}_{\ge 0}$, we use $u + d$ to denote the clock assignment that maps all clocks $c \in \mathcal{X}$ to $u(c) + d$. For $r \subseteq \mathcal{X}$, we use $[r \to 0]u$ to denote the clock assignment that maps all $c \in r$ to 0, and all $c \in \mathcal{X} \setminus r$ to $u(c)$.

Definition 7 (Semantics of TAs). The semantics of a TA is defined in terms of a transition system, with states of the form $\langle l, u \rangle$ and transitions defined by the following rules:
• $\langle l, u \rangle \xrightarrow{d} \langle l, u + d \rangle$ if $u \in Inv(l)$ and $(u + d) \in Inv(l)$, for $d \in \mathbb{R}_{\ge 0}$;
• $\langle l, u \rangle \xrightarrow{a} \langle l', u' \rangle$ if $l \xrightarrow{g,a,r} l'$, $u \models g$, $u' = [r \to 0]u$ and $u' \in Inv(l')$.

Definition 8 (Timed trace). Let $\mathcal{T} = \langle \Sigma, L, l_0, \mathcal{X}, \Delta, Inv \rangle$ be a TA. A timed action is a pair $\langle t, a \rangle$, where $t \in \mathbb{R}_{\ge 0}$ and $a \in \Sigma$. A timed trace is a (possibly infinite) sequence of timed actions $\xi = \langle \langle t_1, a_1 \rangle, \langle t_2, a_2 \rangle, \ldots, \langle t_i, a_i \rangle, \ldots \rangle$, where $t_i \le t_{i+1}$ for all $i \ge 1$.

Definition 9 (Run of a TA). The run of a TA $\mathcal{T} = \langle \Sigma, L, l_0, \mathcal{X}, \Delta, Inv \rangle$ with initial state $\langle l_0, u_0 \rangle$ over a timed trace $\xi = \langle \langle t_1, a_1 \rangle, \langle t_2, a_2 \rangle, \ldots \rangle$ is the sequence of transitions $\langle l_0, u_0 \rangle \xrightarrow{d_1} \xrightarrow{a_1} \langle l_1, u_1 \rangle \xrightarrow{d_2} \xrightarrow{a_2} \langle l_2, u_2 \rangle \ldots$, where $d_1 = t_1$ and $d_i = t_i - t_{i-1}$ for all $i \ge 2$.

Problem definition

In this section, we formalize our composite framework and the Platform-Aware Mission Planning (PAMP) problem.

In our framework, we consider an autonomous system architecture with two layers of abstraction: a planning layer, represented as a temporal planning problem, describing the high-level durative actions and a mission goal; and a platform layer, represented as a TA, which describes the low-level details and internal actions of the platform that is controlled by the planner. We consider an interface between the two layers where each start and end event of an action of the planning problem is associated with a signal of the TA (a letter of its alphabet), and define the execution of a time-triggered plan by synchronizing the action start/end
commands of the plan with transitions of the platform labeled with the corresponding events. In the time between two high-level commands, the platform can freely evolve by performing internal transitions and advancing time.

[Figure 1 not reproduced: (a) timed automaton with locations OFF, P_STARTED, W_STARTING, W_STARTED, W_ENDED, W_RESUMING, C_STARTED, P_ENDED and BAD; (b) timelines of π1 (Processing [55] with two Work [20] actions at distance ≥ 10), π2 (Processing [45] with two Work [20] actions) and π3 (Processing [45] with Work [20], Cool [2], Work [20]).]

Figure 1: Running example TA platform model (a) and example plans (b). The first two plans violate safety (π1) and executability (π2) constraints, while the third one (π3) is correct for all platform executions.

Example. Figures 1a and 1b show a small running example of the considered framework. An industrial process needs to be completed by starting a “Process” (P) action and applying in parallel two “Work” (W) actions. In the planning model, we have a Boolean variable processing (initially false) and a bounded integer variable completed-steps (initially 0).[1] The “Process” action sets the processing variable to true at the start, and sets it back to false at the end. The “Work” action has an over-all condition requiring the processing variable to be true during the entire duration of the action, and an end effect that increments the value of the completed-steps variable by 1. There is also a “Cooldown” action, which does not have any effect on the variables of the planning model. The goal requires that the “Work” action is applied two times (i.e. completed-steps == 2). At the platform layer, modeled by the TA shown in Fig. 1a, there are transitions with labels corresponding to the start/end events of the high-level durative actions (e.g. $W_\vdash$ corresponds to the start of the “Work” action), and internal transitions that the platform can take, which are not linked to high-level events (the transitions with label $\tau$ in this example). The TA encodes a low-level constraint between successive applications of the “Work” actions that is not modeled in the planning problem: when a “Work” action is performed (reaching the W_ENDED location), a component becomes heated and needs to cool down before the next “Work” action can be applied, and this occurs either by waiting 10 time units (transition from W_ENDED to W_RESUMING with guard $c \ge 10$), or by explicitly applying a “Cooldown” action which cools the component after 2 time units (transitions to C_STARTED and P_STARTED with labels $C_\vdash$ and $C_\dashv$). Moreover, the process has a deadline of 50 time units, after which the platform can reach an undesirable state (transition from W_ENDED to BAD with guard $c_P > 50$).

[1] For simplicity, we use a bounded integer variable to count the number of completed “Work” actions. Such a variable can be compiled in our planning model using unary or binary encodings.

We start by introducing the notion of states that are reachable by executing a plan on a TA. Intuitively, the states that can be reached by executing a sequence of timed snap actions $\rho$ on a TA $\mathcal{T}$, are all the states that belong to a run of $\mathcal{T}$ where all and only the snap actions in $\rho$ are applied, by taking the corresponding transitions at the times specified in $\rho$. We formally define this with a function $H$, that maps the snap actions of $\rho$ with steps in the run of $\mathcal{T}$ where the transitions with the corresponding labels are taken.

Definition 10 (States reachable by plan execution). Let $\Pi = \langle P, A, I, G \rangle$ be a temporal planning problem, and let $\mathcal{T} = \langle \Sigma, L, l_0, \mathcal{X}, \Delta, Inv \rangle$ be a TA such that $\tau^{a_\vdash}, \tau^{a_\dashv} \in \Sigma$, for all actions $a \in A$. Let $\rho = \langle (t_1, e_1), \ldots, (t_n, e_n) \rangle$ be a (possibly empty) ordered sequence of timed snap actions of $\Pi$, where $t_i < t_{i+1}$ for all $i \in \{1, \ldots, n-1\}$. A state $r_s$ is “reachable” by executing $\rho$ on $\mathcal{T}$ from the initial state $r_0 = \langle l_0, u_0 \rangle$ if and only if there exists a run $r_0 \xrightarrow{d_1} \xrightarrow{\sigma_1} \ldots \xrightarrow{d_k} \xrightarrow{\sigma_k} r_k$, with $0 \le s \le k$, and an injective function $H : \{0, 1, \ldots, n\} \to \{0, 1, \ldots, k\}$ with the following properties:
1. $H(0) = 0$ (required to handle the case with $\rho = \langle\rangle$);
2. for all $i \in \{1, \ldots, n\}$, for all $j \in \{1, \ldots, k\}$, if $H(i) = j$ then $\tau^{e_i} = \sigma_j$ and $t_i = \sum_{l=1}^{j} d_l$;
3. for all $j \in \{1, \ldots, k\}$, if $j \notin Im(H)$ then for all $e \in \{a_\vdash, a_\dashv : a \in A\}$, $\sigma_j \ne \tau^{e}$.

We define analogously the set of states that are reachable after executing $\rho$ on $\mathcal{T}$ from the initial state $r_0$, denoted by $\mathrm{ReachableAfter}_{\mathcal{T}}(r_0, \rho)$ (while in the previous definition we consider all states $r_s$ along the run, here we include in the set only the final state $r_k$).

In the running example, consider the sequence $\rho = \langle (0, P_\vdash), (1, W_\vdash), (21, W_\dashv) \rangle$. Then we have that

$\mathrm{Reachable}_{\mathcal{T}}(\mathrm{OFF}, \rho) = \{\mathrm{OFF}, \mathrm{P\_STARTED}, \mathrm{W\_STARTING}, \mathrm{W\_STARTED}, \mathrm{W\_ENDED}, \mathrm{BAD}\}$
$\mathrm{ReachableAfter}_{\mathcal{T}}(\mathrm{OFF}, \rho) = \{\mathrm{W\_ENDED}, \mathrm{BAD}\}$

For instance, $\mathrm{BAD} \in \mathrm{ReachableAfter}_{\mathcal{T}}(\mathrm{OFF}, \rho)$ since there exists the run
$\mathrm{OFF} \xrightarrow{d_1=0} \xrightarrow{\sigma_1=P_\vdash} \mathrm{P\_STARTED} \xrightarrow{d_2=1} \xrightarrow{\sigma_2=W_\vdash} \mathrm{W\_STARTING} \xrightarrow{d_3=2} \xrightarrow{\sigma_3=\tau} \mathrm{W\_STARTED} \xrightarrow{d_4=18} \xrightarrow{\sigma_4=W_\dashv} \mathrm{W\_ENDED} \xrightarrow{d_5=31} \xrightarrow{\sigma_5=\tau} \mathrm{BAD}$
and the function $H$ s.t. $H(1) = 1$, $H(2) = 2$ and $H(3) = 4$, satisfying Definition 10.

Next, we formalize the notion of executability of a time-triggered plan on the platform. Intuitively, we say that a
time-triggered plan is executable on a TA if every snap action of the plan is applicable at the prescribed time, for any possible internal behavior of the platform, assuming that the platform applied all the previous commands of the plan. A snap action is applicable if a corresponding transition can be taken at the time specified in the plan.

Formally, given a state $\langle l, u \rangle$ of $\mathcal{T}$, a snap action $a_{\vdash/\dashv}$ is applicable in $\langle l, u \rangle$ if and only if there exists a transition $l \xrightarrow{g,\, \tau^{a_{\vdash/\dashv}},\, r} l'$ such that $u \models g$ and $[r \to 0]u \in Inv(l')$.

For a time-triggered plan $\pi = \{\langle a_1, t_1, d_1 \rangle, \ldots, \langle a_n, t_n, d_n \rangle\}$, we indicate with $\rho_\pi = \langle (t'_1, e_1), \ldots, (t'_{2n}, e_{2n}) \rangle$ the ordered sequence of timed snap actions of $\pi$, with $t'_i < t'_{i+1}$ for all $i \in \{1, \ldots, 2n-1\}$. For simplicity, we assume that all the valid plans of the considered planning problems do not contain simultaneous events, i.e. snap actions scheduled at the same time: since the semantics of TA is super-dense (multiple discrete steps can be taken at the same time in a specific order), in order to properly define and check the executability of a plan with simultaneous events for all platform behaviors, all the possible orderings for the sets of simultaneous events would need to be considered. Given a sequence of timed snap actions $\rho = \langle (t_1, e_1), \ldots, (t_n, e_n) \rangle$, we denote with $\rho_i = \langle (t_1, e_1), \ldots, (t_i, e_i) \rangle$ the prefix obtained by considering the first $i \le n$ timed snap actions. We denote with $\rho_0 = \langle\rangle$ the empty sequence.

Definition 11 (Time-triggered plan executability on TA). Let $\Pi$ be a temporal planning problem and let $\mathcal{T}$ be a TA with initial state $r_0 = \langle l_0, u_0 \rangle$. Suppose that $\mathcal{T}$ has a global clock $\gamma$ that is not reset in any transition and has value 0 in the initial state. An ordered sequence of timed snap actions $\rho = \langle (t_1, e_1), \ldots, (t_n, e_n) \rangle$ is executable on $\mathcal{T}$ if and only if for all $i \in \{0, \ldots, n-1\}$, for all $r = \langle l, u \rangle \in \mathrm{ReachableAfter}_{\mathcal{T}}(r_0, \rho_i)$, if $u(\gamma) = t_{i+1}$ then $e_{i+1}$ is applicable in $r$. A time-triggered plan $\pi$ of $\Pi$ is executable on $\mathcal{T}$ if its sequence of timed snap actions $\rho_\pi$ is executable on $\mathcal{T}$.

For example, the sequence $\rho = \langle (0, P_\vdash), (1, W_\vdash), (21, W_\dashv), (22, W_\vdash), (42, W_\dashv), (45, P_\dashv) \rangle$, which corresponds to the second plan in Fig. 1b, is not executable on the TA of Fig. 1a, because it is possible to reach location W_ENDED with $\gamma = 22$ and $c = 1$ (this state belongs to $\mathrm{ReachableAfter}_{\mathcal{T}}(r_0, \rho_3)$) and the transition with label $W_\vdash$ is not applicable (the guard $c \ge 10$ is false).

We formalize the notion of safety for a plan w.r.t. a TA, given a set of bad states $B$, by requiring that all the states that can be reached by executing $\rho_\pi = \langle (t_1, e_1), \ldots, (t_n, e_n) \rangle$, within time $t_n$, do not belong to $B$.

Definition 12 (Plan safety w.r.t. TA). Let $\Pi$ be a temporal planning problem and let $\mathcal{T}$ be a TA with initial state $r_0 = \langle l_0, u_0 \rangle$. Suppose that $\mathcal{T}$ has a global clock $\gamma$ that is not reset in any transition and has value 0 in the initial state. Let $B \subseteq L \times \mathbb{R}^{\mathcal{X}}$ be a set of bad states for $\mathcal{T}$. An ordered sequence of timed snap actions $\rho = \langle (t_1, e_1), \ldots, (t_n, e_n) \rangle$ is $B$-safe w.r.t. $\mathcal{T}$ if and only if for all states $r = \langle l, u \rangle \in \mathrm{Reachable}_{\mathcal{T}}(r_0, \rho)$ such that $u(\gamma) \le t_n$, $r \notin B$. A time-triggered plan $\pi$ of $\Pi$ is $B$-safe w.r.t. $\mathcal{T}$ if its sequence of timed snap actions $\rho_\pi$ is $B$-safe w.r.t. $\mathcal{T}$.

Consider the running example, and suppose that $B$ is the set of all states with location BAD. The sequence $\rho = \langle (0, P_\vdash), (1, W_\vdash), (21, W_\dashv), (32, W_\vdash), (52, W_\dashv), (55, P_\dashv) \rangle$, which corresponds to the first plan in Fig. 1b, is not $B$-safe, because it is possible to reach location BAD with $\gamma = 53$ between the application of the last two snap actions $(52, W_\dashv)$ and $(55, P_\dashv)$ (this state belongs to $\mathrm{Reachable}_{\mathcal{T}}(r_0, \rho)$).

We can now formally define the PAMP problem, where the objective is to find a solution plan for the planning problem, such that it is safe and executable for all the platform traces that are compliant with the plan.

Definition 13 (PAMP). A Platform-Aware Mission Planning (PAMP) problem is a tuple $\Upsilon = \langle \Pi, \mathcal{T}, B \rangle$, where $\Pi$ is a temporal planning problem, $\mathcal{T}$ is a TA, and $B \subseteq L \times \mathbb{R}^{\mathcal{X}}$ is a set of bad states for $\mathcal{T}$. A solution for $\Upsilon$ is a plan $\pi$ such that: (i) $\pi$ is a valid solution plan for $\Pi$; (ii) $\pi$ is executable on $\mathcal{T}$; (iii) $\pi$ is $B$-safe w.r.t. $\mathcal{T}$.

The third plan in Fig. 1b is a solution for the example PAMP problem. The application of the “Cool” action between the two “Work” actions makes it fully executable and safe, since the BAD location is unreachable ($c_P > 50$ remains false).

Solution approaches

In this section, we propose two approaches for solving the PAMP problem. We assume that a constant $\kappa$ is given, which represents the maximum possible ratio between the length of a platform trace and the length of the executed plan. Hence, when considering a plan $\pi$ of length $L$ (the number of snap actions in the plan), we will analyze its safety and executability for platform traces of length up to $\kappa L$. It is reasonable to assume that such a constant exists and that it can be computed for a platform, as plans have a finite duration and in most practical systems only a finite number of transitions can be taken in a given time.

Encoding-based approach. We will now describe our SMT encoding of the PAMP problem (Fig. 2). Consider a temporal planning problem $\Pi$, a timed automaton $\mathcal{T}$ modeling the platform, a set of bad states $B$, and a bound $h$ on the length of the plan. We assume that $\mathcal{T}$ contains a global clock $\gamma$ with initial value 0, that is never reset in any transition. The encoding represents two distinct traces: a plan trace with $h$ timed steps, and a platform trace with $\kappa h$ timed steps. In each step of a plan trace at most one snap action can be applied (as we discussed in the previous section).

We start by defining the variables of our encoding. For every step $i \in \{1, \ldots, h\}$ of the plan trace, we use the real variable $t_i$ to denote the time associated to step $i$; for every action $a$, we use the Boolean variable $a_i$ to denote whether the action $a$ is started at step $i$, and the real variable $d^a_i$ to represent the duration of action $a$ when started at step $i$. For every step $i \in \{1, \ldots, \kappa h\}$ of the platform trace, we use the variable $l_i$ to denote the location of $\mathcal{T}$ at step $i$; for every clock $c$, we use the variable $c_i$ to represent the value of clock $c$ at step $i$ (the value of the global clock $\gamma$ at step $i$ is $\gamma_i$); finally, for every action $a$, we use the Boolean variable $\tau^{a_\vdash}_i$ (respectively $\tau^{a_\dashv}_i$) to denote whether a transition with label $\tau^{a_\vdash}$ (respectively $\tau^{a_\dashv}$) will be taken by $\mathcal{T}$ at step $i$.
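To make the size of the encoding concrete, the following is a minimal sketch that only enumerates the variable families listed above for given `h` and `kappa`. The function name `encoding_variables`, the string naming scheme, and the assumption that `kappa` is a positive integer are illustrative choices, not part of the original formalization.

```python
from typing import Dict, List, Sequence


def encoding_variables(
    actions: Sequence[str],
    clocks: Sequence[str],
    h: int,
    kappa: int,
) -> Dict[str, List[str]]:
    """Enumerate (as plain strings) the variables of a bounded encoding.

    Hypothetical helper: `clocks` is assumed to include the global clock,
    and `kappa` is assumed to be an integer so that kappa * h is a step count.
    """
    plan_steps = range(1, h + 1)               # steps 1..h of the plan trace
    platform_steps = range(1, kappa * h + 1)   # steps 1..kappa*h of the platform trace
    return {
        # Plan trace: one time per step, one start flag and duration per action and step.
        "plan_times": [f"t_{i}" for i in plan_steps],
        "action_starts": [f"{a}_{i}" for a in actions for i in plan_steps],
        "durations": [f"d^{a}_{i}" for a in actions for i in plan_steps],
        # Platform trace: one location per step, one value per clock and step.
        "locations": [f"l_{i}" for i in platform_steps],
        "clock_values": [f"{c}_{i}" for c in clocks for i in platform_steps],
        # One Boolean per action and platform step for start and end labels.
        "start_labels": [f"tau^{a}_start_{i}" for a in actions for i in platform_steps],
        "end_labels": [f"tau^{a}_end_{i}" for a in actions for i in platform_steps],
    }


if __name__ == "__main__":
    variables = encoding_variables(["P", "W", "C"], ["gamma", "c", "cP"], h=2, kappa=3)
    for family, names in variables.items():
        print(family, len(names))
```

The counts grow linearly in `h` for the plan trace and in `kappa * h` for the platform trace, which is why the choice of the ratio constant directly affects the size of the formula.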
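Figure 2 (SMT encoding of the PAMP problem):

\[
\begin{aligned}
\Phi_h :\ & \exists \vec{t}, \vec{a}, \vec{d}.\ \mathrm{PLANVALID}_\Pi(\vec{t}, \vec{a}, \vec{d}, h) \wedge \forall \vec{l}, \vec{c}.\ \Big[ \bigwedge_{i=0}^{h-1} \Big( \mathrm{TRACEVALID}_{\mathcal{T}}(\vec{l}, \vec{c}, h, \kappa) \wedge \mathrm{COMPLIANT}_{\mathcal{T}}(\vec{t}, \vec{a}, \vec{d}, \vec{l}, \vec{c}, i, \kappa) \rightarrow \mathrm{APPLICABLE}_{\mathcal{T}}(\vec{t}, \vec{a}, \vec{d}, \vec{l}, \vec{c}, i+1, \kappa) \Big) \\
& \wedge \Big( \mathrm{TRACEVALID}_{\mathcal{T}}(\vec{l}, \vec{c}, h, \kappa) \wedge \mathrm{COMPLIANT}_{\mathcal{T}}(\vec{t}, \vec{a}, \vec{d}, \vec{l}, \vec{c}, h, \kappa) \rightarrow \mathrm{SAFETY}_{\mathcal{T}}(\vec{l}, \vec{c}, B, h, \kappa) \Big) \Big]
\end{aligned}
\]

\[
\mathrm{TRACEVALID}_{\mathcal{T}}(\vec{l}, \vec{c}, h, \kappa) :\ \mathrm{INIT}_{\mathcal{T}}(\vec{l}, \vec{c}, 1) \wedge \bigwedge_{i=2}^{\kappa h} \mathrm{TRANS}_{\mathcal{T}}(\vec{l}, \vec{c}, i-1, i)
\]

\[
\begin{aligned}
\mathrm{COMPLIANT}_{\mathcal{T}}(\vec{t}, \vec{a}, \vec{d}, \vec{l}, \vec{c}, h, \kappa) :\ & \bigwedge_{a \in A} \bigwedge_{i=1}^{h} \Big( a_i \rightarrow \bigvee_{j=1}^{\kappa h} \Big( \tau^{a_\vdash}_j \wedge \gamma_j = t_i \wedge \bigwedge_{\substack{j' \in \{1, \ldots, \kappa h\} \\ j' \ne j}} \big( \gamma_{j'} = t_i \rightarrow \neg \tau^{a_\vdash}_{j'} \big) \Big) \Big) \wedge \\
& \bigwedge_{a \in A} \bigwedge_{s=1}^{h} \bigwedge_{i=s+1}^{h} \Big( (a_s \wedge t_s + d^a_s = t_i) \rightarrow \bigvee_{j=1}^{\kappa h} \Big( \tau^{a_\dashv}_j \wedge \gamma_j = t_i \wedge \bigwedge_{\substack{j' \in \{1, \ldots, \kappa h\} \\ j' \ne j}} \big( \gamma_{j'} = t_i \rightarrow \neg \tau^{a_\dashv}_{j'} \big) \Big) \Big) \wedge \\
& \bigwedge_{a \in A} \bigwedge_{i=1}^{\kappa h} \Big( \tau^{a_\vdash}_i \rightarrow \bigvee_{j=1}^{h} (a_j \wedge t_j = \gamma_i) \Big) \wedge \bigwedge_{a \in A} \bigwedge_{i=1}^{\kappa h} \Big( \tau^{a_\dashv}_i \rightarrow \bigvee_{s=1}^{h} \bigvee_{j=s+1}^{h} (a_s \wedge t_s + d^a_s = t_j \wedge t_j = \gamma_i) \Big)
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{APPLICABLE}_{\mathcal{T}}(\vec{t}, \vec{a}, \vec{d}, \vec{l}, \vec{c}, h, \kappa) :\ & \bigwedge_{a \in A} \Big( a_h \rightarrow \bigwedge_{i=1}^{\kappa h} \Big( \Big( \gamma_i = t_h \wedge \bigwedge_{j=1}^{i-1} \big( \gamma_j < t_h \vee \neg \tau^{a_\vdash}_j \big) \Big) \rightarrow \bigvee_{\delta = \langle l_i, g, \tau^{a_\vdash}, r, l'_i \rangle \in \Delta} \mathrm{ENABLED}(\vec{l}, \vec{c}, \delta, i) \Big) \Big) \wedge \\
& \bigwedge_{a \in A} \bigwedge_{s=1}^{h-1} \Big( (a_s \wedge t_s + d^a_s = t_h) \rightarrow \bigwedge_{i=1}^{\kappa h} \Big( \Big( \gamma_i = t_h \wedge \bigwedge_{j=1}^{i-1} \big( \gamma_j < t_h \vee \neg \tau^{a_\dashv}_j \big) \Big) \rightarrow \bigvee_{\delta = \langle l_i, g, \tau^{a_\dashv}, r, l'_i \rangle \in \Delta} \mathrm{ENABLED}(\vec{l}, \vec{c}, \delta, i) \Big) \Big)
\end{aligned}
\]

\[
\mathrm{SAFETY}_{\mathcal{T}}(\vec{l}, \vec{c}, B, h, \kappa) :\ \bigwedge_{i=1}^{\kappa h} \Big( \gamma_i \le t_h \rightarrow \neg \mathrm{BAD}(\vec{l}, \vec{c}, B, i) \Big)
\]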
We will use the notation $\vec{t}$ to denote the set of variables of the form $t_i$, and analogously for $\vec{a}$, $\vec{d}$, $\vec{l}$ and $\vec{c}$.

The formula $\Phi_h$ represents time-triggered plans of length up to $h$ that satisfy Definition 13: the plan must be a valid solution for $\Pi$ ($\mathrm{PLANVALID}_\Pi$); for all possible traces of $\mathcal{T}$, the plan must be executable, i.e. for all plan prefixes $i$ from 0 to $h-1$, if a trace of $\mathcal{T}$ is valid ($\mathrm{TRACEVALID}_{\mathcal{T}}$) and all the snap actions up to $i$ have been applied ($\mathrm{COMPLIANT}_{\mathcal{T}}$), then the $(i+1)$-th snap action will be applicable at the prescribed time ($\mathrm{APPLICABLE}_{\mathcal{T}}$); finally, for all possible traces of $\mathcal{T}$, all the states of $\mathcal{T}$ that can be visited by executing the plan do not intersect the set of bad states $B$, i.e. if a trace of $\mathcal{T}$ is valid and all the snap actions of the plan have been applied, then the safety property is satisfied ($\mathrm{SAFETY}_{\mathcal{T}}$).

The formula $\mathrm{PLANVALID}_\Pi$ is a standard bounded encoding of temporal planning (Shin and Davis 2005), which we omit for the sake of brevity. Similarly, the formula $\mathrm{TRACEVALID}_{\mathcal{T}}$ is a standard unrolling of the transition relation of the timed automaton $\mathcal{T}$ up to step $\kappa h$, where we denote with $\mathrm{INIT}_{\mathcal{T}}$ the formula for the initial state of $\mathcal{T}$, and with $\mathrm{TRANS}_{\mathcal{T}}$ the transition relation of $\mathcal{T}$.

The formula $\mathrm{COMPLIANT}_{\mathcal{T}}$ encodes the fact that a platform trace applies all the snap actions of a plan up to step $h$, and that no transition corresponding to a snap action is triggered at the wrong time or without the action being present in the plan (it characterizes the traces that appear in the definition of $\mathrm{Reachable}_{\mathcal{T}}$): when an action $a$ is started at step $i$ ($a_i$), there exists a step $j$ in the platform trace where a transition with the corresponding label is triggered ($\tau^{a_\vdash}_j$), the value of the global clock corresponds to the time at which $a$ is started ($\gamma_j = t_i$), and there are no multiple occurrences of the corresponding label at the same time; the same applies for actions ending at step $i$, which were started at a previous step $s$ ($a_s \wedge t_s + d^a_s = t_i$); if a transition with label $a_\vdash$ is taken at step $i$ ($\tau^{a_\vdash}_i$), then there must exist a step $j$ in the plan trace at which $a$ is started and the times are the same ($a_j \wedge t_j = \gamma_i$), and similarly for transitions with label $a_\dashv$.

The applicability of snap actions is encoded by $\mathrm{APPLICABLE}_{\mathcal{T}}$. If the action $a$ is started at step $h$, then for all the steps $i$ in the platform trace where the value of the global clock corresponds to the time at which $a$ is started and the corresponding transition has not already been taken in a previous step $j$ at the current time ($\gamma_j < t_h \vee \neg \tau^{a_\vdash}_j$), there must exist a transition from the current location $l_i$ with label $\tau^{a_\vdash}$ that is enabled ($\mathrm{ENABLED}$ encodes the fact that the guard is true under the current clock evaluation and that the invariant of the reached location is true after the necessary clocks are reset). The applicability of ends is handled similarly.

Finally, the formula $\mathrm{SAFETY}_{\mathcal{T}}$ states that for all the steps in the platform trace that occur before the end of the plan ($\gamma_i \le t_h$), the current state is not included in the set of bad states $B$ ($\mathrm{BAD}$ encodes the set $B \subseteq L \times \mathbb{R}^{\mathcal{X}}$).

The overall procedure (PAMP-ENC) builds the formulae $\Phi_h$ for increasing bounds $h$ and checks them with an SMT solver: if it returns UNSAT, then there is no safe and executable plan within bound $h$ and the bound is increased; if a model is returned, it corresponds to a solution to the PAMP problem, as it satisfies the planning constraints and the executability and safety properties for all platform traces.

We now show the soundness and completeness of the approach. Here we provide proof sketches for the theorems, while the full details are included in the additional material.
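Before the formal statement, the bound-increasing loop of PAMP-ENC described above can be sketched as follows. This is a minimal illustration under stated assumptions: `check_bound` is a hypothetical callback standing in for "build $\Phi_h$ and call an SMT solver" (returning a model, or `None` on UNSAT), the starting bound 1 is assumed, and `max_bound` is an artificial cap added only so the sketch always terminates; the procedure in the text has no such cap.

```python
from typing import Callable, Optional, Tuple, TypeVar

Model = TypeVar("Model")


def pamp_enc_loop(
    check_bound: Callable[[int], Optional[Model]],
    max_bound: int,
) -> Optional[Tuple[int, Model]]:
    """Try increasing plan-length bounds until a model is found.

    check_bound(h) is a hypothetical stand-in for solving the formula for
    bound h: it returns a model if the formula is satisfiable and None if
    the solver reports UNSAT.
    """
    for h in range(1, max_bound + 1):
        model = check_bound(h)
        if model is not None:
            # A model of the formula for bound h yields a plan of length up to h.
            return h, model
        # UNSAT: no safe and executable plan within bound h, so increase h.
    return None  # cap reached without finding a model (sketch-only outcome)


if __name__ == "__main__":
    # Toy checker: pretends the formula becomes satisfiable from bound 3 on.
    def toy_checker(h: int) -> Optional[str]:
        return f"model-for-h={h}" if h >= 3 else None

    print(pamp_enc_loop(toy_checker, max_bound=10))
```

In a real instantiation the callback would construct the quantified formula and delegate to a solver supporting quantifiers over reals; that part is deliberately left abstract here.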
Theorem 1 (Soundness and completeness of encoding-based algorithm). For every PAMP problem $\Upsilon = \langle \Pi, \mathcal{T}, B \rangle$ and every bound $\kappa$:
1. if PAMP-ENC$(\Pi, \mathcal{T}, B, \kappa)$ terminates and returns plan $\pi$, then $\pi$ is a valid solution for $\Upsilon$ (soundness);
2. if there exists a solution for $\Upsilon$, then PAMP-ENC$(\Pi, \mathcal{T}, B, \kappa)$ will eventually terminate and return a solution for $\Upsilon$ (completeness).

Proof. (Sketch)
1. Suppose that the procedure returns plan $\pi$ at step $h$. Let $\mu$ be the model of $\Phi_h$ from which $\pi$ was extracted. We need to prove that $\pi$ satisfies Definition 13:
(a) First, we show that $\pi$ is a valid solution plan for $\Pi$. $\mu$ satisfies $\mathrm{PLANVALID}_\Pi(\vec{t}, \vec{a}, \vec{d}, h)$, which is a standard bounded encoding of the temporal planning problem $\Pi$, hence the plan $\pi$, which is extracted from $\mu$, is a solution plan for $\Pi$ of length up to $h$.
(b) Second, we need to prove that $\pi$ is executable on $\mathcal{T}$, i.e. it satisfies the requirements of Definition 11. Let $r_0 = \langle l_0, u_0 \rangle$ be the initial state of $\mathcal{T}$ and let $\rho_\pi = \langle (t_1, e_1), \ldots, (t_n, e_n) \rangle$ be the ordered sequence of timed snap actions of $\pi$. Consider a prefix $i \in \{0, \ldots, n-1\}$ of $\rho_\pi$ and a state $r = \langle l, u \rangle \in \mathrm{ReachableAfter}_{\mathcal{T}}(r_0, \rho_{\pi_i})$ such that $u(\gamma) = t_{i+1}$ and $r$ is reachable within $\kappa h$ steps starting from $r_0$, i.e. there exists a run $r_0 \xrightarrow{d_1} \xrightarrow{\sigma_1} \ldots \xrightarrow{d_k} \xrightarrow{\sigma_k} r_k \equiv r$ with $k \le \kappa h$. Then, it can be shown that $e_{i+1}$ is applicable in $r$: the run reaching $r$ together with the model $\mu$ satisfy $\mathrm{TRACEVALID}_{\mathcal{T}}(\vec{l}, \vec{c}, h, \kappa)$ and $\mathrm{COMPLIANT}_{\mathcal{T}}(\vec{t}, \vec{a}, \vec{d}, \vec{l}, \vec{c}, i, \kappa)$; from the encoding $\Phi_h$ this implies that $\mathrm{APPLICABLE}_{\mathcal{T}}(\vec{t}, \vec{a}, \vec{d}, \vec{l}, \vec{c}, i+1, \kappa)$ is satisfied, and this implies that $e_{i+1}$ is applicable in $r$ (full details in the additional material). Therefore, by Definition 11, $\pi$ is executable on $\mathcal{T}$.
(c) Third, we need to prove that $\pi$ is $B$-safe w.r.t. $\mathcal{T}$, i.e. it satisfies the requirements of Definition 12. Let $r_0 = \langle l_0, u_0 \rangle$ be the initial state of $\mathcal{T}$ and let $\rho_\pi = \langle (t_1, e_1), \ldots, (t_n, e_n) \rangle$ be the ordered sequence of timed snap actions of $\pi$. Consider a state $r = \langle l, u \rangle \in \mathrm{Reachable}_{\mathcal{T}}(r_0, \rho_\pi)$ such that $u(\gamma) \le t_n$ and $r$ is reachable within $\kappa h$ steps starting from $r_0$, i.e. there exists a run $r_0 \xrightarrow{d_1} \xrightarrow{\sigma_1} \ldots \xrightarrow{d_k} \xrightarrow{\sigma_k} r_k$, with $k \le \kappa h$ and $r = r_i$ for some $i \in \{0, \ldots, k\}$. Then, it can be shown that $r \notin B$: the run reaching $r$ together with the model $\mu$ satisfy $\mathrm{TRACEVALID}_{\mathcal{T}}(\vec{l}, \vec{c}, h, \kappa)$ and $\mathrm{COMPLIANT}_{\mathcal{T}}(\vec{t}, \vec{a}, \vec{d}, \vec{l}, \vec{c}, h, \kappa)$; from the encoding $\Phi_h$ this implies that $\mathrm{SAFETY}_{\mathcal{T}}(\vec{l}, \vec{c}, B, h, \kappa)$ is satisfied, and this implies that $r \notin B$.
2. Let $\pi$ be the solution plan for $\Upsilon$ that exists by assumption. Since $\mathrm{PLANVALID}_\Pi(\vec{t}, \vec{a}, \vec{d}, h)$ is a standard bounded encoding of the temporal planning problem $\Pi$ with completeness guarantees, there exists a step $h$ for which there is a model $\mu \models \mathrm{PLANVALID}_\Pi(\vec{t}, \vec{a}, \vec{d}, h)$ such that $\pi$ can

Abstraction-refinement approach. In our second approach, based on an abstraction-refinement loop, we consider the planning problem and the validation problem separately: a temporal planner generates solution plans considering only the planning problem, and then the produced candidate plans are checked for executability and safety at the platform layer. Since we are considering time explicitly, it is not feasible to exclude single time-triggered plans at each validation check, as the planner, that is not aware of the platform constraints, can in most cases just slightly change the timing of the actions and the same problem would occur at the platform layer. Instead, at each failed validation check we want to exclude classes of plans, by determining that a certain sequence of discrete choices is infeasible, for any possible scheduling of the chosen snap actions.

For solving the planning problem, we rely on the TAMER temporal planner (Valentini, Micheli, and Cimatti 2019), which is a sound and complete approach for temporal planning that is able to return plans expressed as Simple Temporal Networks (STN) (Dechter, Meiri, and Pearl 1991): a returned solution $\pi_{STN}$ is characterized by a fixed ordering of snap actions $e_1, \ldots, e_n \leftarrow \mathrm{PATH}(\pi_{STN})$, with each snap action $e_i$ associated to a symbolic time $t_i$. The times are ordered increasingly, with additional constraints between pairs of start/end snap actions representing the duration constraints of the corresponding durative actions. The planning algorithm implements an explicit-state heuristic-search approach, that works by exploring all the possible ordered sequences of snap actions, and updating in each state a STN whenever a new snap action is added to the sequence. If the set of STN constraints of a state becomes infeasible, then it can be pruned, as it means that the chosen sequence of discrete events cannot be scheduled while respecting the temporal constraints of the problem. If a goal state is reached, then all the time-triggered plans that satisfy the STN constraints of that state are valid solution plans, and a specific solution can be extracted by solving the constraints.

The main idea of our approach is to validate on the platform the set of STN constraints $\pi_{STN}$ produced by the planner, using an encoding similar to the one of the previous approach (Fig. 2). If there exists a solution to the STN constraints, that satisfies the executability and safety notions for all the platform traces, then it corresponds to an answer to the PAMP problem. Otherwise, we determine the shortest prefix $e_1, \ldots, e_i$ of the sequence of snap actions $\mathrm{PATH}(\pi_{STN})$, such that by considering only the STN constraints of $e_1, \ldots, e_i$, there does not exist a way to schedule them while guaranteeing executability and safety for all platform traces. If such a prefix is found, it can be learned by the planner, and all the states that are found during exploration whose path starts with a learned prefix can be pruned.

The overall procedure is detailed in Algorithm 1. The planning problem $\Pi$ is solved, and we obtain a set of solution plans $\pi_{STN}$, characterized by a fixed order of snap actions $e_1, \ldots, e_n \leftarrow \mathrm{PATH}(\pi_{STN})$, together with a set of temporal constraints between their associated times $t_1, \ldots, t_n$.
be extracted from it. We can then show that µ |= Φh , The solution πSTN is then passed to C HECK, together with
which implies that PAMP-E NC (Π, T , B, κ) terminates at the set of bad states B. The C HECK procedure iterates over
a step lower or equal than h (because µ |= Φh ). all the prefixes i ∈ {1, . . . , n}, and builds the formula
$\psi_\pi^i$, which is the subset of constraints of $\pi_{STN}$ considering only $t_1, \ldots, t_i$ ($[\pi_{STN}]_i$ is the conjunction of all the constraints containing $t_i$ and one of the times in $\{t_1, \ldots, t_{i-1}\}$). The constraints $\psi_\pi^i$ are then used to produce the formula $\Phi_i$, which is an encoding of the PAMP problem, considering only candidate plans represented by $\psi_\pi^i$: in the formula of Fig. 2, $\textsc{PlanValid}_\Pi$ is replaced with $\psi_\pi^i$, while the forall formula is simplified considering the specific discrete choices $e_1, \ldots, e_i$ that are made by each plan represented by $\psi_\pi^i$ (for each step $j \in \{1, \ldots, i\}$, the truth value of all the variables $a_i$ is known and can be substituted in the formula). The formula $\Phi_i$ is then provided to an SMT solver: if it is

Algorithm 1: Abstraction-refinement algorithm
procedure PAMP-REF(Π, T, B, κ)
    bad_prefixes = {}
    while True do
        π_STN ← PLAN(Π, bad_prefixes)
        pass, π ← CHECK(T, π_STN, B)
        if pass then return π
        else
            bad_prefixes ← bad_prefixes ∪ π
procedure CHECK(T, π_STN, B)
    e_1, ..., e_n ← PATH(π_STN)
    ψ_π^0 ← ⊤
    for i = 1 to n do
        ψ_π^i ← ψ_π^{i−1} ∧ [π_STN]_i
        Φ_i ← ENCODE(ψ_π^i, T, B, i, κ)
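The refinement loop of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the planner and the per-prefix SMT query are replaced by hypothetical callables (`plan`, `prefix_ok`), plans are reduced to tuples of event names without timing, and returning `None` when the planner runs out of candidates is an assumption added so that the toy loop terminates.

```python
from typing import Callable, Optional, Sequence, Set, Tuple

EventPath = Tuple[str, ...]


def check(path: Sequence[str],
          prefix_ok: Callable[[EventPath], bool]) -> Tuple[bool, EventPath]:
    """Scan prefixes from shortest to longest; return the first failing one."""
    for i in range(1, len(path) + 1):
        prefix = tuple(path[:i])
        # Hypothetical oracle standing in for the SMT query on Phi_i.
        if not prefix_ok(prefix):
            return False, prefix
    return True, tuple(path)


def pamp_ref(plan: Callable[[Set[EventPath]], Optional[EventPath]],
             prefix_ok: Callable[[EventPath], bool]) -> Optional[EventPath]:
    """Alternate planning and checking, learning bad prefixes on failure."""
    bad_prefixes: Set[EventPath] = set()
    while True:
        candidate = plan(bad_prefixes)
        if candidate is None:          # assumption: planner exhausted
            return None
        ok, result = check(candidate, prefix_ok)
        if ok:
            return result
        bad_prefixes.add(result)       # learned prefix prunes later searches


def make_toy_planner(candidates: Sequence[EventPath]):
    """Toy planner: first candidate that does not start with a bad prefix."""
    def plan(bad_prefixes: Set[EventPath]) -> Optional[EventPath]:
        for path in candidates:
            if not any(path[:len(p)] == p for p in bad_prefixes):
                return path
        return None
    return plan


def toy_prefix_ok(prefix: EventPath) -> bool:
    """Toy safety rule: no long move once a message has been sent."""
    seen_comm = False
    for event in prefix:
        if event == "comm":
            seen_comm = True
        elif event == "move_far" and seen_comm:
            return False
    return True


candidates = [
    ("comm", "move_far", "comm"),
    ("comm", "move_far", "move_near", "comm"),
    ("comm", "move_near", "comm"),
]
solution = pamp_ref(make_toy_planner(candidates), toy_prefix_ok)
```

In this toy run the first candidate fails at the prefix `("comm", "move_far")`; learning that prefix also rules out the second candidate without checking it, which is the pruning effect the approach relies on.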
Domain        BMC   REF
Factory2-k2    10    10
Factory2-k3     3     6
Factory2-k4     0     3
Factory2-k5     0     0
Rover-k2       35    44
Rover-k3       18    25
Rover-k4        8    18
Rover-k5        8    13
Total          87   130
[Plots (b) and (c): cactus plot with x-axis "Instances Solved"; scatter plot with x-axis "BMC"; legend entries Rover-2, Rover-3, Rover-4, Rover-5.]
Figure 3: Experimental results: coverage table (a), cactus (b) and scatter (c) plots. The k values represent the bound on platform traces w.r.t. plan lengths. BMC is the encoding-based approach, while REF is the algorithm based on abstraction-refinement.
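The feasibility test that the planner applies to the STN constraints of each search state can be illustrated with a small, self-contained sketch. It uses the textbook reduction of difference constraints to shortest paths (Bellman-Ford with an implicit zero-weight source), which is one standard way to decide STN consistency and is not necessarily what the authors' planner does. The function name `solve_stn` and the constraint triples are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# A constraint (i, j, w) means: t_j - t_i <= w.
Constraint = Tuple[int, int, float]


def solve_stn(num_points: int,
              constraints: Sequence[Constraint]) -> Optional[List[float]]:
    """Return one schedule satisfying all constraints, or None if infeasible."""
    if num_points == 0:
        return []
    # Starting every distance at 0 models a virtual source that reaches
    # each time point through a zero-weight edge.
    dist = [0.0] * num_points
    for _ in range(num_points):
        changed = False
        for i, j, w in constraints:
            if dist[i] + w < dist[j]:
                dist[j] = dist[i] + w
                changed = True
        if not changed:
            # Shifting by a constant keeps all differences intact.
            shift = min(dist)
            return [d - shift for d in dist]
    # Still relaxing after num_points rounds: there is a negative cycle,
    # so the constraints cannot be scheduled.
    return None


# Durative action with start t0 and end t1, duration in [2, 5];
# a later snap action at t2 must not precede t1.
feasible = [(0, 1, 5), (1, 0, -2), (2, 1, 0)]
schedule = solve_stn(3, feasible)

# Additionally requiring t2 - t0 <= 1 contradicts the minimum duration.
infeasible = feasible + [(0, 2, 1)]
no_schedule = solve_stn(3, infeasible)
```

A search state whose constraint set yields `None` corresponds to a sequence of snap actions that cannot be scheduled and can be pruned; a returned list is one concrete time-triggered assignment for the symbolic times.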
Experimental Evaluation
We developed a solver written in Python based on pySMT (Gario and Micheli 2015) implementing both the presented approaches. The solvers accept temporal planning problems written in either PDDL 2.1 or ANML, and platform models written in timed SMV (Cimatti et al. 2019), which allows modeling TAs in a symbolic setting. We experimentally evaluated both approaches on two novel sets of benchmarks, ROVER and FACTORY, which are both available in the additional material. In ROVER, there are n locations l_0, ..., l_{n−1} connected by edges, and a robot which is initially at location l_0. The robot can move between consecutive locations in 1 unit of time, while moving between non-consecutive locations takes 100 units of time. The robot can also communicate at each location. The goal of the planning problem is to communicate while at certain locations, and reach location l_{n−1} in the end. At the platform layer, modeled as a network of TAs, there is a task component (synchronized with the high-level communicate action) that controls a communication component: when a message is sent, the communication component moves to a standby location if no other message is sent within 30 units of time; if this happens, the task needs to resume the component by transitioning to the resuming location before sending the next message. In this problem, we include a safety property by requiring that the platform never transitions to the resuming location, to avoid consuming excessive energy for the resumption process. Therefore, solution plans are required to travel only between consecutive locations after the first message is sent, so that the communication component does not need to be resumed. We scale the instances by increasing the number of locations, by considering all the possible combinations of locations in which to send messages, and by increasing the bound κ.
FACTORY is the same domain that we used in the running example shown in Fig. 1a and Fig. 1b. We consider two different ways of modeling the deadline: either at the planning layer (domain FACTORY1), by having a durative "Process" action that needs to be run in parallel with all other actions in the plan, or at the platform layer (domain FACTORY2), by having a component that synchronizes with the task associated with the "Work" action, and that disables the synchronization after the deadline has passed (we experiment with both versions of the problem). The instances are scaled by increasing the number of required Work actions, by considering different deadlines, and by increasing the bound κ.
We performed all the experiments on a cluster of identical machines with an AMD EPYC 7413 24-Core Processor, running Ubuntu 20.04.6. We used a timeout of 14400 seconds (4 hours) and a memory limit of 20 GB. The experimental results are shown in Fig. 3. We can observe that both approaches are effective at solving the tested benchmarks, with the approach based on abstraction-refinement having a wider coverage and faster solving times. This is expected, especially if the number of necessary refinement loops is low, as heuristic-search based planners are typically much faster at finding solution plans than encoding-based approaches, and checking the executability and safety of an STN plan is computationally much cheaper than combining the check with the full encoding of the planning problem. In the tested benchmarks, the number of necessary loops in the second approach ranged from 1 to 8. It is evident from the coverage table that the bound κ on the length of the platform traces greatly influences the performance of both approaches, as platform traces are universally quantified in the encoding formula. A future direction is to check the executability and safety notions using an unbounded technique, possibly proving the non-existence of bad traces.

Conclusions
In this paper, we formally defined the "Platform-Aware Mission Planning" problem, motivated by the need to synthesize plans that not only achieve the mission objectives, but also ensure executability and the satisfaction of safety properties during execution. We devised an amalgamated method and a decomposition approach that can solve the problem, and showed the superiority of the latter experimentally.
As future work, we plan to generalize our model to the case of hybrid automata, allowing the representation of continuous behaviors and resources in the platform. Moreover, we are interested in other problems that can be defined in the formal framework we proposed, such as synthesizing plans that guarantee other formal properties like diagnosability.
Acknowledgments
This work has been partly supported by the PNRR project iNEST – Interconnected Nord-Est Innovation Ecosystem (ECS00000043) funded by the European Union NextGenerationEU program and by the STEP-RL project funded by the European Research Council (grant n. 101115870).

References
Alur, R.; and Dill, D. L. 1994. A Theory of Timed Automata. Theoretical Computer Science, 126(2): 183–235.
Barrett, C. W.; Sebastiani, R.; Seshia, S. A.; and Tinelli, C. 2009. Satisfiability Modulo Theories. In Biere, A.; Heule, M.; van Maaren, H.; and Walsh, T., eds., Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications, 825–885. IOS Press. ISBN 978-1-58603-929-5.
Bozzano, M.; Cimatti, A.; and Roveri, M. 2021. A Comprehensive Approach to On-Board Autonomy Verification and Validation. ACM Trans. Intell. Syst. Technol., 12(4).
Cashmore, M.; Fox, M.; Long, D.; Magazzeni, D.; Ridder, B.; Carrera, A.; Palomeras, N.; Hurtós, N.; and Carreras, M. 2015. ROSPlan: Planning in the Robot Operating System. In Brafman, R. I.; Domshlak, C.; Haslum, P.; and Zilberstein, S., eds., Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling, ICAPS 2015, Jerusalem, Israel, June 7-11, 2015, 333–341. AAAI Press.
Cimatti, A.; Griggio, A.; Magnago, E.; Roveri, M.; and Tonetta, S. 2019. Extending nuXmv with Timed Transition Systems and Timed Temporal Properties. In Dillig, I.; and Tasiran, S., eds., Computer Aided Verification - 31st International Conference, CAV 2019, New York City, NY, USA, July 15-18, 2019, Proceedings, Part I, volume 11561 of Lecture Notes in Computer Science, 376–386. Springer.
Cimatti, A.; Roveri, M.; and Bertoli, P. 2004. Conformant planning via symbolic model checking and heuristic search. Artificial Intelligence, 159(1): 127–206.
Dechter, R.; Meiri, I.; and Pearl, J. 1991. Temporal constraint networks. Artificial Intelligence.
Fox, M.; and Long, D. 2003. PDDL2.1: An extension to PDDL for expressing temporal planning domains. Journal of Artificial Intelligence Research.
Gario, M.; and Micheli, A. 2015. pySMT: a Solver-Agnostic Library for Fast Prototyping of SMT-Based Algorithms. In SMT Workshop.
Gat, E. 1998. On three-layer architectures. Artificial intelligence and mobile robots, 195: 210.
Ghallab, M.; Nau, D. S.; and Traverso, P. 2004. Automated planning - theory and practice. Elsevier. ISBN 978-1-55860-856-6.
Gigante, N.; Micheli, A.; Montanari, A.; and Scala, E. 2022. Decidability and complexity of action-based temporal planning over dense time. Artif. Intell., 307: 103686.
Gigante, N.; Micheli, A.; and Scala, E. 2022. On the Expressive Power of Intermediate and Conditional Effects in Temporal Planning. In Kern-Isberner, G.; Lakemeyer, G.; and Meyer, T., eds., Proceedings of the 19th International Conference on Principles of Knowledge Representation and Reasoning, KR 2022, Haifa, Israel, July 31 - August 5, 2022.
Shin, J.-A.; and Davis, E. 2005. Processes and continuous change in a SAT-based planner. Artificial Intelligence.
Valentini, A.; Micheli, A.; and Cimatti, A. 2019. Temporal Planning with Intermediate Conditions and Effects. CoRR, abs/1909.11581.
Viehmann, T.; Hofmann, T.; and Lakemeyer, G. 2021. Transforming Robotic Plans with Timed Automata to Solve Temporal Platform Constraints. In Zhou, Z.-H., ed., Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, 2083–2089. International Joint Conferences on Artificial Intelligence Organization. Main Track.
Zanetti, A.; Moro, D. D.; Vreto, R.; Robol, M.; Roveri, M.; and Giorgini, P. 2023. Implementing BDI Continual Temporal Planning for Robotic Agents. In IEEE International Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2023, Venice, Italy, October 26-29, 2023, 378–382. IEEE.