
Dynamic Vehicle Routing For Robotic Systems: Proceedings of The IEEE September 2011

This document summarizes a paper on dynamic vehicle routing algorithms for robotic systems. It discusses how recent advancements in robotics and networking enable groups of robots to complete tasks in uncertain, dynamically changing environments where new task requests arise over time. Specifically, it contrasts static vehicle routing problems, where all tasks are known upfront, with dynamic vehicle routing problems, where tasks arrive over time. For dynamic problems, the paper surveys algorithms that blend ideas from optimization, control, queueing theory and more to enable real-time task allocation and vehicle routing as new requests occur. The goal is to minimize expected wait times for tasks or maximize the fraction of tasks serviced successfully.


Dynamic Vehicle Routing for Robotic Systems

Article  in  Proceedings of the IEEE · September 2011


DOI: 10.1109/JPROC.2011.2158181


All content following this page was uploaded by Emilio Frazzoli on 01 June 2014.




Dynamic Vehicle Routing for Robotic Systems


Francesco Bullo Emilio Frazzoli Marco Pavone Ketan Savla Stephen L. Smith

Abstract: Recent years have witnessed great advancements in the science and technology of autonomy, robotics and networking. This paper surveys recent concepts and algorithms for dynamic vehicle routing (DVR), that is, for the automatic planning of optimal multi-vehicle routes to perform tasks that are generated over time by an exogenous process. We consider a rich variety of scenarios relevant for robotic applications. We begin by reviewing the basic DVR problem: demands for service arrive at random locations at random times and a vehicle travels to provide on-site service while minimizing the expected wait time of the demands. Next, we treat different multi-vehicle scenarios based on different models for demands (e.g., demands with different priority levels and impatient demands), vehicles (e.g., motion constraints, communication and sensing capabilities), and tasks. The performance criterion used in these scenarios is either the expected wait time of the demands or the fraction of demands serviced successfully. In each specific DVR scenario, we adopt a rigorous technical approach that relies upon methods from queueing theory, combinatorial optimization and stochastic geometry. First, we establish fundamental limits on the achievable performance, including limits on stability and quality of service. Second, we design algorithms, and provide provable guarantees on their performance with respect to the fundamental limits.

This research was partially supported by AFOSR award FA 8650-07-2-3744, ARO MURI award W911NF-05-1-0219, NSF awards ECCS-0705451 and CMMI-0705453, and ONR award N00014-07-1-0721.
F. Bullo is with the Center for Control, Dynamical Systems and Computation and with the Department of Mechanical Engineering, University of California, Santa Barbara, CA 93106 ([email protected]).
E. Frazzoli and M. Pavone are with the Laboratory for Information and Decision Systems, Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, MA 02139 ({pavone,frazzoli}@mit.edu).
K. Savla is with the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA 02139 (ksavla@mit.edu).
S. L. Smith is with the Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139 ([email protected]).
The authors are listed in alphabetical order.

I. INTRODUCTION

This survey presents a joint algorithmic and queueing approach to the design of cooperative control and task allocation strategies for networks of uninhabited vehicles and robots. The approach enables groups of robots to complete tasks in uncertain and dynamically changing environments, where new task requests are generated in real-time. Applications include surveillance and monitoring missions, as well as transportation networks and automated material handling.

As a motivating example, consider the following scenario: a sensor network is deployed in order to detect suspicious activity in a region of interest. (Alternatively, the sensor network is replaced by a high-altitude sensory-rich aircraft loitering over the region.) In addition to the sensor network, a team of unmanned aerial vehicles (UAVs) is available and each UAV is equipped with close-range high-resolution on-board sensors. Whenever a sensor detects a potential event, a request for close-range observation by one of the UAVs is generated. In response to this request, a UAV visits the location to gather close-range information and investigates the cause of the alarm. Each request for close-range observation might include priority levels or time windows during which the inspection must occur and it might require an on-site service time. In summary, from a control algorithmic viewpoint, each time a new request arises, the UAVs need to decide which vehicle will inspect that location and along which route. Thus, the problem is to design algorithms that enable real-time task allocation and vehicle routing.

Accordingly, this paper surveys allocation and routing algorithms that typically blend ideas from receding-horizon resource allocation, distributed optimization, combinatorics and control. The key novelty in our approach is the simultaneous introduction of stochastic, combinatorial and queueing aspects in the distributed coordination of robotic networks.

Static vehicle routing: In the recent past, considerable effort has been devoted to the problem of how to cooperatively assign and schedule demands for service that are defined over an extended geographical area [1], [2], [3], [4], [5]. In these papers, the main focus is on developing distributed algorithms that operate with knowledge about the demand locations and with limited communication between robots. However, the underlying mathematical model is static, in that no new demands arrive over time. Thus, the centralized version of the problem fits within the framework of the static vehicle routing problem (see [6] for a thorough introduction to this problem), whereby: (i) a team of $m$ vehicles is required to service a set of $n$ demands in a 2-dimensional space; (ii) each demand requires a certain amount of on-site service; (iii) the goal is to compute a set of routes that optimizes the cost of servicing (according to some quality of service metric) the demands. In general, most of the available literature on routing for robotic networks focuses on static environments and does not properly account for scenarios in which dynamic, stochastic and adversarial events take place.

Dynamic vehicle routing: The problem of planning routes through service demands that arrive during a mission execution is known as the "dynamic vehicle routing problem" (abbreviated as the DVR problem in the operations research literature). See Figure 1 for an illustration of DVR. There are two key differences between static and dynamic vehicle routing problems. First, planning algorithms should actually provide policies (in contrast to pre-planned routes) that prescribe how the routes should evolve as a function of those inputs that evolve in real-time. Second, dynamic demands (i.e., demands that vary over time) add queueing phenomena to the combinatorial nature of vehicle routing. In such a dynamic setting, it is natural to focus on steady-state performance instead of optimizing the performance for a single demand.

Additionally, system stability in terms of the number of waiting demands is an issue to be addressed.

Fig. 1. An illustration of dynamic vehicle routing for a robotic system. From panel #1 to #2: vehicles are assigned to customers and select routes. Panel #3: the DVR problem is how to re-allocate and re-plan routes when new customers appear.

Algorithmic queueing theory for DVR: The objective of this work is to present a joint algorithmic and queueing approach to the design of cooperative control and task allocation strategies for networks of uninhabited vehicles required to operate in dynamic and uncertain environments. This approach is based upon the pioneering work of Bertsimas and Van Ryzin [7], [8], [9], who introduced queueing methods to solve the simplest DVR problem (a vehicle moves along straight lines and visits demands whose time of arrival, location and on-site service are stochastic; information about demand location is communicated to the vehicle upon demand arrival); see also the earlier related work [10].

Starting with these works [7], [8], [9] and integrating ideas from dynamics, combinatorial optimization, teaming, and distributed algorithms, we have recently developed a systematic approach to tackle complex dynamic routing problems for robotic networks. We refer to this approach as "algorithmic queueing theory" for dynamic vehicle routing. The power of algorithmic queueing theory stems from the wide spectrum of aspects, critical to the routing of robotic networks, for which it enables a rigorous study; specific examples taken from our work in the past few years include complex models for the demands such as time constraints [11], [12], service priorities [13], and translating demands [14], problems concerning robotic implementation such as adaptive and decentralized algorithms [15], [16], complex vehicle dynamics [17], [18], limited sensing range [19], and team forming [20], and even integration of humans in the design space [21].

Survey content: In this work we provide a detailed account of algorithmic queueing theory for DVR, with an emphasis on robotic applications. We start in Section II by reviewing the possible approaches to dynamic vehicle routing problems. Then, in Section III, we describe the foundations of algorithmic queueing theory, which rest on the aforementioned works of Bertsimas and Van Ryzin. In the following four sections we discuss some of our recent efforts in applying algorithmic queueing theory to realistic dynamic routing problems for robotic systems.

Specifically, in Section IV we present routing policies for DVR problems that (i) are spatially distributed, scalable to large networks, and adaptive to network changes, (ii) have remarkably good performance guarantees in both the light-load regime (i.e., when the arrival rate for the demands is small) and in the heavy-load regime (i.e., when the arrival rate for the demands is large). Here, by network changes we mean changes in the number of vehicles, the arrival rate of demands, and the characterization of the on-site service requirement.

In Section V we discuss time-constrained and prioritized service. For time-constrained DVR problems, we establish upper and lower bounds on the optimal number of vehicles for a given level of service quality (defined as the desired fraction of demands that must receive service within the deadlines). Additionally, we rigorously characterize two service policies: in light load the DVR problem with time constraints is closely related to a particular facility location problem, and in moderate and heavy load, static vehicle routing methods, such as solutions of traveling salesman problems, can provide good performance. We then study DVR problems in which demands have an associated level of priority (or importance). The problem is characterized by the number of different priority classes $n$ and their relative levels of importance. We provide lower bounds on the optimal performance and a service policy which is guaranteed to perform within a factor $2n^2$ of the optimal in the heavy-load.

We then study the implications of vehicle motion constraints in Section VI. We focus on the Dubins vehicle, namely, a nonholonomic vehicle that is constrained to move along paths of bounded curvature without reversing direction. For $m$ Dubins vehicles, the DVR problem with arrival rate $\lambda$ and with uniform spatial distribution has the following properties: the system time is (i) of the order $\lambda^2/m^3$ in heavy-load, (ii) of the order $1/\sqrt{m}$ in the light-load if the vehicle density is small, and of the order $1/\sqrt[3]{m}$ in the light-load if the density of the vehicles is high.

In Section VII we discuss the case when vehicles are heterogeneous, each capable of providing a specific type of service. Each demand may require several different services, implying that collaborative teams of vehicles must be formed to service a demand. We present three simple policies for this problem. For each policy we show that there is a broad class of system parameters for which the policy's performance is within a constant factor of the optimal.

Finally, in Section VIII we summarize other recent results in DVR and draw our conclusions.

II. ALGORITHMIC APPROACHES TO DVR PROBLEMS

In this section we review possible approaches to DVR problems and motivate our proposed algorithmic queueing theory approach.

A. One-Step Sequential Optimization

A naive approach to DVR is to re-optimize every time a new demand arrives, by using an algorithm that is optimal for the corresponding static vehicle routing problem. However, this approach can lead to highly undesirable behaviors as the following example shows. Assume that a unit-velocity vehicle provides service along a line segment of unit length (see Figure 2(a)). New demands arrive either at endpoint $x = 0$ or at endpoint $x = 1$. Assume that the objective is to minimize the average waiting time of the demands (as is common in the DVR literature); hence, at any time, a re-optimization

algorithm provides a route that minimizes $\sum_{j=1}^{n} W_j$, where $n$ is the number of outstanding demands at that time, and $W_j$ is the waiting time for the $j$th demand. Assume that at time 0 the vehicle is at $x = 0$ and a new demand arrives at $x = 1$. Hence, the vehicle travels immediately toward that demand. Assume that just before reaching $x = 1/2$ a new demand arrives at $x = 0$. It is easy to show that the optimal strategy is to reverse motion and provide service first to the demand at $x = 0$. However, assume that just before reaching $x = 1/3$ a new demand arrives at $x = 1$. It is easy to show that the optimal strategy is to reverse motion and provide service first to the demands at $x = 1$. In general, let $k, n$ be positive integers and let $\varepsilon_k = k/(2k + 1)$. Assume that just before time $t_{2n-1} = 1/2 + \sum_{k=1}^{n-1} (1 - 2\varepsilon_k)$ a new demand arrives at $x = 0$, and that just before time $t_{2n} = t_{2n-1} + 1/2 - \varepsilon_n$ a new demand arrives at $x = 1$. (Assume that at time $t_0 = 0$ the vehicle is at $x = 0$ and a new demand arrives at $x = 1$.) It is possible to show that at each new arrival the optimal strategy ensuing from a re-optimization algorithm is to reverse motion before one of the two endpoints is reached. Note that $\lim_{n \to +\infty} t_n = +\infty$, hence the vehicle will travel forever without servicing any demand!

Fig. 2. Example where re-optimization causes a vehicle to travel forever without providing service to any demand. The vehicle is represented by a blue chevron object, a newly arrived demand is represented by a black circle, and old demands are represented by grey circles. Panels: (a) A new demand arrives at $x = 1$. (b) A new demand arrives at $x = 0$ just before the vehicle reaches $x = 0.5$. (c) The vehicle re-optimizes its route and reverses its motion. (d) A new demand arrives at $x = 1$ and the vehicle, after re-optimizing, reverses its motion.

This example therefore illustrates the pitfalls of the straightforward application of static routing and sequential re-optimization algorithms to dynamic problems. Broadly speaking, we argue that DVR problems require tailored routing algorithms with provable performance guarantees. There are currently two main algorithmic approaches that allow both a rigorous synthesis and an analysis of routing algorithms for DVR problems; we review these two approaches next.

B. Online Algorithms

An online algorithm is one that operates based on input information given up to the current time. Thus, these algorithms are designed to operate in scenarios where the entire input is not known at the outset, and new pieces of the input should be incorporated as they become available. The distinctive feature of the online algorithm approach is the method that is used to evaluate the performance of online algorithms, which is called competitive analysis [22]. In competitive analysis, the performance of an online algorithm is compared to the performance of a corresponding offline algorithm (i.e., an algorithm that has a priori knowledge of the entire input) in the worst case scenario. Specifically, an online algorithm is $c$-competitive if its cost on any problem instance is at most $c$ times the cost of an optimal offline algorithm:

$$\mathrm{Cost}_{\text{online}}(I) \le c \, \mathrm{Cost}_{\text{optimal offline}}(I), \quad \forall \text{ problem instances } I.$$

In the recent past, dynamic vehicle routing problems have been studied in this framework, under the name of the online traveling repairman problem [23], [24], [25].

While the online algorithm approach applied to DVR has led to numerous results and interesting insights, it leaves some questions unanswered, especially in the context of robotic networks. First, competitive analysis is a worst-case analysis; hence, the results are often overly pessimistic for normal problem instances. Moreover, in many applications there is some probabilistic problem structure (e.g., distribution of the inter-arrival times, spatial distribution of future demands, distribution of on-site service times, etc.) that can be advantageously exploited by the vehicles. In online algorithms, this additional information is not taken into account. Second, competitive analysis is used to bound the performance relative to the optimal offline algorithm, and thus it does not give an absolute measure of performance. In other words, an optimal online algorithm is an algorithm with minimum "cost of causality" in the worst-case scenario, but not necessarily with the minimum worst-case cost. Finally, many important real-world constraints for DVR, such as time windows, priorities, differential constraints on vehicle's motion and the requirement of teams to fulfill a demand "have so far proved to be too complex to be considered in the online framework" [26, page 206]. Some of these drawbacks have been recently addressed by [27] where a combined stochastic and online approach is proposed for a general class of combinatorial optimization problems and is analyzed under some technical assumptions.

This discussion motivates an alternative approach for DVR in the context of robotic networks, based on probabilistic modeling, and average-case analysis.

C. Algorithmic Queueing Theory

Algorithmic queueing theory embeds the dynamic vehicle routing problem within the framework of queueing theory and overcomes most of the limitations of the online algorithm approach; in particular, it allows one to take into account several real-world constraints, such as time constraints and differential constraints on vehicles' dynamics. We call this approach algorithmic queueing theory since its objective is to synthesize an efficient control policy, whereas in traditional queueing theory the objective is usually to analyze the performance of a specific policy. Here, an efficient policy is one whose expected performance is either optimal or optimal within a constant factor. (The expected performance of a policy is the expected value of the performance over all possible inputs, i.e., demand arrival sequences. A policy performs within a constant factor $\kappa$ of the optimal if the ratio between the policy's expected performance and the optimal expected performance is upper bounded by $\kappa$.) Algorithmic queueing theory basically consists of the

following steps:
(i) queueing model of the robotic system and analysis of its structure;
(ii) establishment of fundamental limitations on performance, independent of algorithms; and
(iii) design of algorithms that are either optimal or constant-factor away from optimal, possibly in specific asymptotic regimes.

Finally, the proposed algorithms are evaluated via numerical, statistical and experimental studies, including Monte-Carlo comparisons with alternative approaches.

In order to make the model tractable, customers are usually considered "statistically independent" and their arrival process is assumed stationary (with possibly unknown parameters). Because these assumptions can be unrealistic in some scenarios, this approach has its own limitations. The aim of this paper is to show that algorithmic queueing theory, despite these disadvantages, is a very useful framework for the design of routing algorithms for robotic networks and a valuable complement to the online algorithm approach.

III. ALGORITHMIC QUEUEING THEORY FOR DVR

In this section we describe algorithmic queueing theory. We start with a short review of some fundamental concepts from the locational optimization literature, and then we introduce the general approach.

A. Preliminary Tools

The Euclidean Traveling Salesman Problem (in short, TSP) is formulated as follows: given a set $D$ of $n$ points in $\mathbb{R}^d$, find a minimum-length tour (i.e., a closed path that visits all points exactly once) of $D$. More properties of the TSP tour can be found in Section A of the Appendix. In this paper, we will present policies that require real-time solutions of TSPs over possibly large point sets; this can indeed be achieved by using efficient approximation algorithms presented in Section B of the Appendix.

Let the environment $Q \subset \mathbb{R}^2$ be a bounded, convex set (the following concepts can be similarly defined in higher dimensions). Let $P = (p_1, \dots, p_m)$ be an array of $m$ distinct points in $Q$. The Voronoi diagram of $Q$ generated by $P$ is an array of sets, denoted by $V(P) = (V_1(P), \dots, V_m(P))$, defined by

$$V_i(P) = \{x \in Q \mid \|x - p_i\| \le \|x - p_j\|, \ \forall j \in \{1, \dots, m\}\},$$

where $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^2$. We refer to $P$ as the set of generators of $V(P)$, and to $V_i(P)$ as the Voronoi cell or the region of dominance of the $i$th generator.

The expected distance between a random point $q$, generated according to a probability density function $\varphi$, and the closest point in $P$ is given by

$$H_m(P, Q) := \mathbb{E}\Big[\min_{k \in \{1, \dots, m\}} \|p_k - q\|\Big].$$

The function $H_m$ is known in the locational optimization literature as the continuous Weber function or the continuous multi-median function; see [28], [29] and the references therein. The $m$-median of the set $Q$ with density $\varphi$ is the global minimizer

$$P_m^*(Q) = \arg\min_{P \in Q^m} H_m(P, Q).$$

We let $H_m^*(Q) = H_m(P_m^*(Q), Q)$ be the global minimum of $H_m$. The set of critical points of $H_m$ contains all arrays $(p_1, \dots, p_m)$ with distinct entries and with the property that each point $p_k$ is simultaneously the generator of the Voronoi cell $V_k(P)$ and the median of $V_k(P)$. We refer to such Voronoi diagrams as median Voronoi diagrams. It is possible to show that a median Voronoi diagram always exists for any bounded convex domain $Q$ and density $\varphi$. More properties of the multi-median function are discussed in Section C of the Appendix.

B. Queueing Model for DVR

Here we review the model known in the literature as the $m$-vehicle Dynamic Traveling Repairman Problem ($m$-DTRP) and introduced in [7], [8].

Consider $m$ vehicles free to move, at a constant speed $v$, within the environment $Q$ (even though we are assuming $Q \subset \mathbb{R}^2$, the extension to three-dimensional environments is often straightforward). The vehicles are identical, and have unlimited range and demand servicing capacity.

Demands are generated according to a homogeneous (i.e., time-invariant) spatio-temporal Poisson process, with time intensity $\lambda \in \mathbb{R}_{>0}$, and spatial density $\varphi : Q \to \mathbb{R}_{>0}$. In other words, demands arrive to $Q$ according to a Poisson process with intensity $\lambda$, and their locations $\{X_j ; j \ge 1\}$ are i.i.d. (i.e., independent and identically distributed) and distributed according to a density $\varphi$ whose support is $Q$. A demand's location becomes known (is realized) at its arrival epoch; thus, at time $t$ we know with certainty the locations of demands that arrived prior to time $t$, but future demand locations form an i.i.d. sequence. The density $\varphi$ satisfies:

$$\mathbb{P}[X_j \in S] = \int_S \varphi(x)\, dx \quad \forall S \subseteq Q, \qquad \text{and} \qquad \int_Q \varphi(x)\, dx = 1.$$

At each demand location, vehicles spend some time $s \ge 0$ in on-site service that is i.i.d. and generally distributed with finite first and second moments denoted by $\bar{s} > 0$ and $\overline{s^2}$. A realized demand is removed from the system after one of the vehicles has completed its on-site service. We define the load factor $\varrho := \lambda \bar{s}/m$.

The system time of demand $j$, denoted by $T_j$, is defined as the elapsed time between the arrival of demand $j$ and the time one of the vehicles completes its service. The waiting time of demand $j$, $W_j$, is defined by $W_j = T_j - s_j$. The steady-state system time is defined by $\overline{T} := \limsup_{j \to \infty} \mathbb{E}[T_j]$. A policy for routing the vehicles is said to be stable if the expected number of demands in the system is uniformly bounded at all times. A necessary condition for the existence of a stable policy is that $\varrho < 1$; we shall assume $\varrho < 1$ throughout the paper. When we refer to light-load conditions, we consider the case $\varrho \to 0^+$, in the sense that $\lambda \to 0^+$; when we refer to heavy-load conditions, we consider the case $\varrho \to 1^-$, in the sense that $\lambda \to (m/\bar{s})^-$.
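The queueing model above can be made concrete with a small simulation. The following is a minimal sketch, not the authors' code: it assumes the unit square as the environment, a uniform spatial density, exponentially distributed on-site service times, and a single vehicle that serves demands in first-come, first-served order from wherever it last stopped. The function names (`load_factor`, `sample_demands`, `fcfs_single_vehicle`) are hypothetical and chosen only for illustration.

```python
import math
import random


def load_factor(lam, s_bar, m):
    """Load factor rho = lambda * s_bar / m, as defined in the model."""
    return lam * s_bar / m


def sample_demands(lam, s_bar, horizon, rng):
    """Sample demands arriving on [0, horizon] in the unit square.

    Illustrative assumptions (not from the paper): uniform spatial
    density and exponential on-site service times with mean s_bar.
    Returns a list of (arrival_time, (x, y), service_time) tuples.
    """
    demands = []
    t = rng.expovariate(lam)  # Poisson process: i.i.d. exponential gaps
    while t <= horizon:
        location = (rng.random(), rng.random())
        service = rng.expovariate(1.0 / s_bar)
        demands.append((t, location, service))
        t += rng.expovariate(lam)
    return demands


def fcfs_single_vehicle(demands, v, start=(0.5, 0.5)):
    """Serve demands first-come, first-served with one vehicle of speed v.

    Returns one (T_j, W_j) pair per demand, where W_j is the time from
    arrival until on-site service begins and T_j = W_j + s_j.
    """
    position, free_at = start, 0.0
    results = []
    for arrival, location, service in demands:
        depart = max(free_at, arrival)  # the vehicle may be idle or busy
        travel = math.dist(position, location) / v
        waiting = (depart - arrival) + travel
        results.append((waiting + service, waiting))
        position, free_at = location, depart + travel + service
    return results


if __name__ == "__main__":
    rng = random.Random(0)
    lam, s_bar, m, v = 0.5, 0.2, 1, 1.0
    demands = sample_demands(lam, s_bar, horizon=2000.0, rng=rng)
    times = fcfs_single_vehicle(demands, v)
    mean_T = sum(T for T, _ in times) / len(times)
    print(f"rho = {load_factor(lam, s_bar, m):.2f}, "
          f"demands = {len(demands)}, mean system time = {mean_T:.3f}")
```

The sample mean of the simulated system times is only an empirical stand-in for the steady-state system time. Note also that travel time adds to the effective service time, so a policy can be unstable even when the load factor is below one; this is consistent with the condition on the load factor being necessary rather than sufficient.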

Let $\mathcal{P}$ be the set of all causal, stable, and time-invariant routing policies and $\overline{T}_\pi$ be the system time of a particular policy $\pi \in \mathcal{P}$. The $m$-DTRP is then defined as the problem of finding a policy $\pi^* \in \mathcal{P}$ (if one exists) such that

$$\overline{T}^* := \overline{T}_{\pi^*} = \inf_{\pi \in \mathcal{P}} \overline{T}_\pi.$$

In general, it is difficult to characterize the optimal achievable performance $\overline{T}^*$ and to compute the optimal policy $\pi^*$ for arbitrary values of the problem parameters $\lambda$, $m$, etc. It is instead possible and useful to consider particular ranges of parameter values and, specifically, asymptotic regimes such as the light-load and the heavy-load regimes. For the purpose of characterizing asymptotic performance, we briefly review some useful notation. For $f, g : \mathbb{N} \to \mathbb{R}$, $f \in O(g)$ (respectively, $f \in \Omega(g)$) if there exist $N_0 \in \mathbb{N}$ and $k \in \mathbb{R}_{>0}$ such that $|f(N)| \le k|g(N)|$ for all $N \ge N_0$ (respectively, $|f(N)| \ge k|g(N)|$ for all $N \ge N_0$). If $f \in O(g)$ and $f \in \Omega(g)$, then the notation $f \in \Theta(g)$ is used.

C. Lower Bounds on the System Time

As in many queueing problems, the analysis of the DTRP problem for all the values of the load factor $\varrho$ in $(0, 1)$ is difficult. In [7], [8], [9], [30], lower bounds for the optimal steady-state system time are derived for the light-load case (i.e., $\varrho \to 0^+$), and for the heavy-load case (i.e., $\varrho \to 1^-$). Subsequently, policies are designed for these two limiting regimes, and their performance is compared to the lower bounds.

For the light-load case, a tight lower bound on the system time is derived in [8]. In the light-load case, the lower bound on the system time is strongly related to the solution of the $m$-median problem:

$$\overline{T}^* \ge \frac{1}{v} H_m^*(Q) + \bar{s}, \quad \text{as } \varrho \to 0^+. \tag{1}$$

The bound is tight: there exist policies whose system times, in the limit $\varrho \to 0^+$, attain this bound; we present such asymptotically optimal policies for the light-load case below.

Two lower bounds exist for the heavy-load case [9], [30] depending on whether one is interested in biased policies or unbiased policies.

Definition III.1 (Spatially biased and unbiased policies). Let $X$ be the location of a randomly chosen demand and $W$ be its wait time. A policy $\pi$ is said to be
(i) spatially unbiased if for every pair of sets $S_1, S_2 \subseteq Q$, $\mathbb{E}[W \mid X \in S_1] = \mathbb{E}[W \mid X \in S_2]$; and
(ii) spatially biased if there exist sets $S_1, S_2 \subseteq Q$ such that $\mathbb{E}[W \mid X \in S_1] > \mathbb{E}[W \mid X \in S_2]$.

Within the class of spatially unbiased policies in $\mathcal{P}$, the optimal system time is lower bounded by

$$\overline{T}_U^* \ge \frac{\beta_{\mathrm{TSP},2}^2}{2} \, \frac{\lambda \left[\int_Q \varphi^{1/2}(x)\, dx\right]^2}{m^2 v^2 (1 - \varrho)^2} \quad \text{as } \varrho \to 1^-, \tag{2}$$

where $\beta_{\mathrm{TSP},2} \simeq 0.7120 \pm 0.0002$ (for more detail on the constant $\beta_{\mathrm{TSP},2}$, we refer the reader to Appendix A).

Within the class of spatially biased policies in $\mathcal{P}$, the optimal system time is lower bounded by

$$\overline{T}_B^* \ge \frac{\beta_{\mathrm{TSP},2}^2}{2} \, \frac{\lambda \left[\int_Q \varphi^{2/3}(x)\, dx\right]^3}{m^2 v^2 (1 - \varrho)^2} \quad \text{as } \varrho \to 1^-. \tag{3}$$

Both bounds (2) and (3) are tight: there exist policies whose system times, in the limit $\varrho \to 1^-$, attain these bounds; therefore the inequalities in (2) and (3) could indeed be replaced by equalities. We present asymptotically optimal policies for the heavy-load case below. It is shown in [9] that the lower bound in equation (3) is always less than or equal to the lower bound in equation (2) for all densities $\varphi$.

We conclude with some remarks. First, it is possible to show (see [9], Proposition 1) that a uniform spatial density function leads to the worst possible performance and that any deviation from uniformity in the demand distribution will strictly lower the optimal mean system time in both the unbiased and biased case. Additionally, allowing biased service results in a strict reduction of the optimal expected system time for any non-uniform density $\varphi$. Finally, when the density is uniform there is nothing to be gained by providing biased service.

D. Centralized and Ad-Hoc Policies

In this section we present centralized, ad-hoc policies that are either optimal in light-load or optimal in heavy-load. Here, we say that a policy is ad-hoc if it performs "well" only for a limited range of values of $\varrho$. In light-load, the SQM policy provides optimal performance (i.e., $\lim_{\varrho \to 0^+} \overline{T}_{\mathrm{SQM}} / \overline{T}^* = 1$):

The $m$ Stochastic Queue Median (SQM) Policy [8]: Locate one vehicle at each of the $m$ median locations for the environment $Q$. When demands arrive, assign them to the vehicle corresponding to the nearest median location. Have each vehicle service its respective demands in First-Come, First-Served (FCFS) order returning to its median after each service is completed.

This policy, although optimal in light-load, has two characteristics that limit its application to robotic networks: First, it quickly becomes unstable as the load increases, i.e., there exists $\varrho_c < 1$ such that for all $\varrho > \varrho_c$ the system time $\overline{T}_{\mathrm{SQM}}$ is infinite (hence, this policy is ad-hoc). Second, a central entity needs to compute the $m$-median locations and assign them to the vehicles (hence, from this viewpoint the policy is centralized).

In heavy-load, the UTSP policy provides optimal unbiased performance (i.e., $\lim_{\varrho \to 1^-} \overline{T}_{\mathrm{UTSP}} / \overline{T}_U^* = 1$):

The Unbiased TSP (UTSP) Policy [9]: Let $r$ be a fixed positive, large integer. From a central point in the interior of $Q$, subdivide the service region into $r$ wedges $Q_1, \dots, Q_r$ such that $\int_{Q_k} \varphi(x)\, dx = 1/r$, $k \in \{1, \dots, r\}$. Within each subregion, form sets of demands of size $n/r$ ($n$ is a design parameter). As sets are formed, deposit them in a queue and service them FCFS with the first available vehicle
by forming a TSP on the set and following it in an arbitrary direction. Optimize over n (see [9] for details).

It is possible to show that, as ϱ → 1−,

\[ T_{\mathrm{UTSP}} \le \left( 1 + \frac{m}{r} \right) \frac{\beta_{\mathrm{TSP},2}^2}{2} \, \frac{\lambda \left[ \int_Q \varphi^{1/2}(x)\,dx \right]^2}{m^2 v^2 (1-\varrho)^2}; \tag{4} \]

thus, letting r → ∞, the lower bound in (2) is achieved.

The same paper [9] presents an optimal biased policy. This policy, called Biased TSP (BTSP) Policy, relies on an even finer partition of the environment and requires ϕ to be piecewise constant.

Although both the UTSP and the BTSP policies are optimal within their respective classes, they have two characteristics that limit their application to robotic networks: First, in the UTSP policy, to ensure stability, n should be chosen so that (see [9], page 961)

\[ n > \frac{\lambda^2 \beta_{\mathrm{TSP},2}^2 \left[ \int_Q \varphi^{1/2}(x)\,dx \right]^2}{m^2 v^2 (1-\varrho)^2}; \]

therefore, to ensure stability over a wide range of values of ϱ, the system designer is forced to select a large value for n. However, if during the execution of the policy the load factor turns out to be only moderate, demands have to wait for an excessively large set to be formed, and the overall system performance deteriorates significantly. Similar considerations hold for the BTSP policy. Hence, these two policies are ad-hoc. Second, both policies require a centralized data structure (the demands' queue is shared by the vehicles); hence, both policies are centralized.

Remark III.2 (System time bounds in heavy-load with zero service time). If s̄ = 0, then the heavy-load regime is defined as λ/m → +∞, and all the performance bounds we provide in this and in the next two sections hold by simply substituting ϱ = 0. For example, equation (2) reads

\[ T_U^* \ge \frac{\beta_{\mathrm{TSP},2}^2}{2} \, \frac{\lambda \left[ \int_Q \varphi^{1/2}(x)\,dx \right]^2}{m^2 v^2} \quad \text{as } \lambda/m \to +\infty. \] □

IV. ROUTING FOR ROBOTIC NETWORKS: DECENTRALIZED AND ADAPTIVE POLICIES

In this section we first discuss routing algorithms that are both adaptive and amenable to decentralized implementation; then, we present a decentralized and adaptive routing algorithm that does not require any explicit communication between the vehicles while still being optimal in the light-load case.

A. Decentralized and Adaptive Policies

Here, we say that a policy is adaptive if it performs "well" for every value of ϱ in the range [0, 1). A candidate decentralized and adaptive control policy is the simple Nearest Neighbor (NN) policy: at each service completion epoch, each vehicle chooses to visit next the closest unserviced demand, if any, otherwise it stops at the current position. Because of the dependencies among the inter-demand travel distances, the analysis of the NN policy is difficult and no rigorous results have been obtained so far [7]; in particular, there are no rigorous results about its stability properties. Simulation experiments show that the NN policy performs like a biased policy and is not optimal in the light-load case or in the heavy-load case [7], [9]. Therefore, the NN policy lacks provable performance guarantees (in particular about stability), and does not seem to achieve optimal performance in light-load or in heavy-load.

In [15], we study decentralized and adaptive routing policies that are optimal in light-load and that are optimal unbiased algorithms in heavy-load. The key idea we pursue is that of partitioning policies:

Definition IV.1 (Partitioning policies). Given a policy π for the 1-DTRP and m vehicles, a π-partitioning policy is a family of multi-vehicle policies such that
(i) the environment Q is partitioned into m openly disjoint subregions $Q_k$, k ∈ {1, . . . , m}, whose union is Q,
(ii) one vehicle is assigned to each subregion (thus, there is a one-to-one correspondence between vehicles and subregions),
(iii) each vehicle executes the single-vehicle policy π in order to service demands that fall within its own subregion.

Because Definition IV.1 does not specify how the environment is actually partitioned, it describes a family of policies (one for each partitioning strategy) for the m-DTRP. The SQM policy, which is optimal in light-load, is indeed a partitioning policy whereby Q is partitioned according to a median Voronoi diagram and each vehicle executes inside its own Voronoi region the policy "service FCFS and return to the median after each service completion." Moreover, specific partitioning policies, which will be characterized in Theorem IV.2, are optimal or within a constant factor of the optimal in heavy-load.

In the following, given two functions $\varphi_j : Q \to \mathbb{R}_{>0}$, j ∈ {1, 2}, with $\int_Q \varphi_j(x)\,dx = c_j$, an m-partition (i.e., a partition into m subregions) is simultaneously equitable with respect to $\varphi_1$ and $\varphi_2$ if $\int_{Q_i} \varphi_j(x)\,dx = c_j/m$ for all i ∈ {1, . . . , m} and j ∈ {1, 2}. Theorem 12 in [31] shows that, given two such functions $\varphi_j$, j ∈ {1, 2}, there always exists an m-partition that is simultaneously equitable with respect to $\varphi_1$ and $\varphi_2$, and whose subregions $Q_i$ are convex. Then, the following results characterize the optimality of two classes of partitioning policies [15].

Theorem IV.2 (Optimality of partitioning policies). Assume π* is a single-vehicle, unbiased optimal policy in the heavy-load regime (i.e., ϱ → 1−). For m vehicles,
(i) a π*-partitioning policy based on an m-partition which is simultaneously equitable with respect to ϕ and $\varphi^{1/2}$ is an optimal unbiased policy in heavy-load.
(ii) a π*-partitioning policy based on an m-partition which is equitable with respect to ϕ does not achieve, in general, the optimal unbiased performance; however, it is always within a factor m of it in heavy-load.

The above results lead to the following strategy: First, for
the 1-DTRP, one designs an adaptive and unbiased (in heavy-load) control policy with provable performance guarantees. Then, by using decentralized algorithms for environment partitioning, such as those recently developed in [32], one extends such single-vehicle policy to a decentralized and adaptive multi-vehicle policy.

Consider, first, the single vehicle case.

The single-vehicle Divide & Conquer (DC) Policy: Compute an r-partition $\{Q_k\}_{k=1}^{r}$ of Q that is simultaneously equitable with respect to ϕ and $\varphi^{1/2}$. Let $\tilde{P}_1^*$ be the point minimizing the sum of distances to demands serviced in the past (if no points have been visited in the past, $\tilde{P}_1^*$ is set to be a random point in Q), and let D be the set of outstanding demands waiting for service. If D = ∅, move to $\tilde{P}_1^*$. If, instead, D ≠ ∅, randomly choose a k ∈ {1, . . . , r} and move to subregion $Q_k$; compute the TSP tour through all demands in subregion $Q_k$ and service all demands in $Q_k$ by following this TSP tour. If D ≠ ∅ repeat the service process in subregion k + 1 (modulo r).

This policy is unbiased in heavy-load. In particular, if r → +∞, the policy (i) is optimal in light-load and achieves optimal unbiased performance in heavy-load, and (ii) is stable in every load condition. It is possible to show that with r = 10 the DC policy is already guaranteed to be within 10% of the optimal (for unbiased policies) performance in heavy-load. If, instead, r = 1, the policy (i) is optimal in light-load and within a factor 2 of the optimal unbiased performance in heavy-load, (ii) is stable in every load condition, and (iii) its implementation does not require the knowledge of ϕ. This last property implies that, remarkably, when r = 1, the DC policy adapts to all problem data (both ϱ and ϕ). It is worth noting that when r = 1 and ϕ is constant over Q the DC policy is similar to the generation policy presented in [33].

The optimality of the SQM policy and Theorem IV.2(i) suggest the following decentralized and adaptive multi-vehicle version of the DC policy:
(i) compute an m-median of Q that induces a Voronoi partition that is equitable with respect to ϕ and $\varphi^{1/2}$,
(ii) assign one vehicle to each Voronoi region,
(iii) each vehicle executes the single-vehicle DC policy in order to service demands that fall within its own subregion, by using the median of the subregion instead of $\tilde{P}_1^*$.

For a given Q and ϕ, if there exists an m-median of Q that induces a Voronoi partition that is equitable with respect to ϕ and $\varphi^{1/2}$, then the above policy is both optimal in light-load and arbitrarily close to optimality in heavy-load, and stabilizes the system in every load condition. There are two main issues with the above policy, namely (i) existence of an m-median of Q that induces a Voronoi partition that is equitable with respect to ϕ and $\varphi^{1/2}$, and (ii) how to compute it. In [32], we showed that for some choices of Q and ϕ a median Voronoi diagram that is equitable with respect to ϕ and $\varphi^{1/2}$ fails to exist. Additionally, in [32], we presented a decentralized partitioning algorithm that, for any possible choice of Q and ϕ, provides a partition that is equitable with respect to ϕ and represents a "good" approximation of a median Voronoi diagram (see [32] for details on the metrics that we use to judge "closeness" to median Voronoi diagrams). Moreover, if an m-median of Q that induces a Voronoi partition that is equitable with respect to ϕ exists, the algorithm will locally converge to it. This partitioning algorithm is related to the classic Lloyd algorithm from vector quantization theory, and exploits the unique features of power diagrams, a generalization of Voronoi diagrams.

Accordingly, we define the multi-vehicle Divide & Conquer policy as follows.

The multi-vehicle Divide & Conquer (m-DC) Policy: The vehicles run the decentralized partitioning algorithm discussed above (see [32] for more details) and assign themselves to the subregions (this part is indeed a by-product of the algorithm in [32]). Simultaneously, each vehicle executes the single-vehicle DC policy inside its own subregion.

The m-DC policy is within a factor m of the optimal unbiased performance in heavy-load (since the algorithm in [32] always provides a partition that is equitable with respect to ϕ), and stabilizes the system in every load condition. In general, the m-DC policy is only suboptimal in light-load; note, however, that the computation of the global minimum of the Weber function $H_m$ (which is non-convex for m > 1) is difficult for m > 1 (it is NP-hard for the discrete version of the problem); therefore, for m > 1, suboptimality has also to be expected from any practical implementation of the SQM policy. If an m-median of Q that induces a Voronoi partition that is equitable with respect to ϕ exists, the m-DC will locally converge to it; thus we say that the m-DC policy is "locally" optimal in light-load.

Note that, when the density is uniform, a partition that is equitable with respect to ϕ is also equitable with respect to $\varphi^{1/2}$; therefore, when the density is uniform the m-DC policy is arbitrarily close to optimality in heavy-load (see Theorem IV.2(i)).

The m-DC policy adapts to arrival rate λ, expected on-site service s̄, and vehicle's velocity v; however, it requires the knowledge of ϕ.

Tables I and II provide a synoptic view of the results available so far; in particular, our policies are compared with the best unbiased policy available in the literature, i.e., the UTSP policy with r → ∞. In Table I, an asterisk * signals that the result is heuristic. Note that there are currently no results about decentralized and adaptive routing policies that are optimal in light-load and that are optimal biased algorithms in heavy-load.

B. A Policy with No Explicit Inter-vehicle Communication

A common theme in cooperative control is the investigation of the effects of different communication and information sharing protocols on the system performance. Clearly, the ability to access more information at each single vehicle cannot decrease the performance level; hence, it is commonly believed that providing better communication among vehicles
TABLE I
POLICIES FOR THE 1-DTRP

Properties               | DC Policy, r → ∞ | DC Policy, r = 1           | RH Policy [15]              | UTSP Policy, r → ∞
Light-load performance   | optimal          | optimal                    | optimal                     | not optimal
Heavy-load performance   | optimal          | within 100% of the optimal | within 100% of the optimal* | optimal
Adaptive to λ, s̄, and v  | yes              | yes                        | yes                         | no
Adaptive to ϕ            | no               | yes                        | yes                         | no

TABLE II
POLICIES FOR THE m-DTRP

Properties               | m-DC Policy, r → ∞                                                  | UTSP Policy, r → ∞
Light-load performance   | "locally" optimal                                                   | not optimal
Heavy-load performance   | optimal for uniform ϕ, within m of optimal unbiased in general      | optimal
Adaptive to λ, s̄, and v  | yes                                                                 | no
Adaptive to ϕ            | no                                                                  | no
Distributed              | yes                                                                 | no
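
The heavy-load entries of Table II depend on which densities the partition equalizes: ϕ alone, or ϕ and $\varphi^{1/2}$ simultaneously. As a minimal sketch (this is not the decentralized partitioning algorithm of [32]), the following Python code checks on a grid whether a given partition is equitable with respect to a density, so that it can be called on both ϕ and its square root. The unit-square environment, the grid resolution, the vertical-strip partitions, the example density ϕ(x, y) = 2x, and the tolerance are all illustrative assumptions.

```python
import numpy as np

def partition_shares(density, labels, m):
    """Fraction of the total mass of `density` that falls in each of the m subregions."""
    total = density.sum()
    return np.array([density[labels == k].sum() / total for k in range(m)])

def is_equitable(density, labels, m, tol=1e-2):
    """True if every subregion holds a 1/m share of the mass, up to `tol`."""
    return bool(np.all(np.abs(partition_shares(density, labels, m) - 1.0 / m) <= tol))

# Hypothetical setup: Q is the unit square, sampled at cell centres.
n, m = 200, 4
xs = (np.arange(n) + 0.5) / n
X, _ = np.meshgrid(xs, xs, indexing="ij")

phi_uniform = np.ones_like(X)
phi_skewed = 2.0 * X                      # density growing linearly in x

# Equal-width vertical strips: equitable for the uniform density only.
equal_strips = np.minimum((X * m).astype(int), m - 1)

# Strips whose edges equalise the mass of phi_skewed (edges at sqrt(k/m)).
edges = np.sqrt(np.arange(1, m) / m)
mass_strips = np.searchsorted(edges, X)

uniform_both = (is_equitable(phi_uniform, equal_strips, m)
                and is_equitable(np.sqrt(phi_uniform), equal_strips, m))
skewed_phi_only = (is_equitable(phi_skewed, mass_strips, m),
                   is_equitable(np.sqrt(phi_skewed), mass_strips, m))
```

In this toy setting the uniform density passes both checks with equal-width strips, whereas the strips that equalize the skewed density do not equalize its square root, which is the situation in which only the factor-m guarantee applies.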

will improve the system's performance. In [16], we propose a policy for the DVR that does not rely on dedicated communication links between vehicles, but only on the vehicles' knowledge of outstanding demands. An example is when outstanding demands broadcast their location, but vehicles are not aware of one another. We show that, under light load conditions, the inability of vehicles to communicate explicitly does not limit the steady-state performance. In other words, the information contained in the outstanding demands (and hence the effects of others on them) is sufficient to provide, in light load conditions, the same convergence properties attained when vehicles are able to communicate explicitly.

The No (Explicit) Communication (NC) Policy: Let D be the set of outstanding demands waiting for service. If D = ∅, move to the point minimizing the average distance to demands serviced in the past by each vehicle. If there is no unique minimizer, then move to the nearest one. If, instead, D ≠ ∅, move towards the nearest outstanding demand location.

In the NC policy, whenever one or more service requests are outstanding, all vehicles will be pursuing a demand; in particular, when only one service request is outstanding, all vehicles will move towards it. When the demand queue is empty, vehicles will either (i) stop at the current location, if they have visited no demands yet, or (ii) move to their reference point, as determined by the set of demands previously visited.

In [16], we prove that the system time provided by the NC policy converges to a critical point (either a saddle point or a local minimum) of $H_m^*(Q)$ with high probability as λ → 0+. Let us underline that, in general, the achieved critical point strictly depends on the initial positions of the vehicles inside the environment Q. We cannot exclude that the algorithm so designed will converge indeed to a saddle point instead of a local minimum. This is due to the fact that the algorithm does not follow the steepest direction of the gradient of the function $H_m$, but just the gradient with respect to one of the variables. On the other hand, since the algorithm is based on a sequence of demands and at each phase we are trying to minimize a different cost function, it can be proved that the critical points reached by this algorithm are no worse than the critical points reached knowing a priori the distribution ϕ.

Interestingly, the NC policy can be regarded as a learning algorithm in the context of the following game [16]. The service requests are considered as resources and the vehicles as selfish entities. The resources offer rewards in a continuous fashion and the vehicles can collect these rewards by traveling to the resource locations. Every resource offers reward at a unit rate when there is at most one vehicle present at its location and the life of the resource ends as soon as more than one vehicle is present at its location. This setup can be understood to be an extreme form of congestion game, where the resource cannot be shared between vehicles and where the resource expires at the first attempt to share it. The total reward for vehicle i from a particular resource is the time difference between its arrival and the arrival of the next vehicle, if i is the first vehicle to reach the location of the resource, and zero otherwise. The utility function of vehicle i is then defined to be the expected value of reward, where the expectation is taken over the location of the next resource. Hence, the goal of every vehicle is to select its reference location to maximize the expected value of the reward from the next resource. In [16], we prove that the median locations, as a choice for reference positions, are an efficient pure Nash equilibrium for this game. Moreover, we prove that by maximizing their own utility function, the vehicles also maximize the common global utility function, which is the negative of the average wait time for service requests.

V. ROUTING FOR ROBOTIC NETWORKS: TIME CONSTRAINTS AND PRIORITIES

In many vehicle routing applications, there are strict service requirements for demands. This can be modeled in two ways. In the first case, demands have (possibly stochastic) deadlines on their waiting times. In the second case, demands have different urgency or "threat" levels, which capture the relative importance of each demand. In this section we study these two related problems and provide routing policies for both
scenarios. We discuss hard time constraints in Section V-A and priorities in Section V-B.

In this section we focus only on the case of a uniform spatial density ϕ. However, the algorithms we present below extend directly to non-uniform density. One simply replaces the equal area partitions with simultaneously equitable (with respect to ϕ and $\varphi^{1/2}$) partitions, as described for the DC policy in Section IV. The presentation, on the other hand, would become more involved, and thus we restrict our attention to uniform densities.

A. Time Constraints

In [11], [12] we introduced and analyzed DVR with time constraints. Specifically, the setup is the same as that of the m-DTRP, but now each demand j waits for the beginning of its service no longer than a stochastic patience time $G_j$, which is generally distributed according to a distribution function $F_G$. A vehicle can start the on-site service for the jth demand only within the stochastic time window $[A_j, A_j + G_j)$, where $A_j$ is the arrival time of the jth demand. If the on-site service for the jth demand is not started before the time instant $A_j + G_j$, then the jth demand is considered lost; in other words, such demand leaves the system and never returns. If, instead, the on-site service for the jth demand is started before the time instant $A_j + G_j$, then the demand is considered successfully serviced. The waiting time of demand j, denoted again by $W_j$, is the elapsed time between the arrival of demand j and the time either one of the vehicles starts its service or such demand departs from the system due to impatience, whichever happens first. Hence, the jth demand is considered serviced if and only if $W_j < G_j$. Accordingly, we denote by $P_\pi[W_j < G_j]$ the probability that the jth demand is serviced under a routing policy π. The aim is to find the minimum number of vehicles needed to ensure that the steady-state probability that a demand is successfully serviced is larger than a desired value $\phi_d \in (0, 1)$, and to determine the policy the vehicles should execute to ensure that such objective is attained.

Formally, define the success factor of a policy π as $\phi_\pi := \lim_{j \to +\infty} P_\pi[W_j < G_j]$. We identify four types of information on which a control policy can rely: 1) Arrival time and location: we assume that the information on arrivals and locations of demands is immediately available to control policies; 2) On-site service: the on-site service requirement of demands may either (i) be available, or (ii) be available only through prior statistics, or (iii) not be available to control policies; 3) Patience time: the patience time of demands may either (i) be available, or (ii) be available only through prior statistics; 4) Departure notification: the information that a demand leaves the system due to impatience may or may not be available to control policies (if the patience time is available, such information is clearly available). Hence, several information structures are relevant. The least informative case is when on-site service requirements and departure notifications are not available, and patience times are available only through prior statistics; the most informative case is when on-site service requirements and patience times are available.

Given an information structure, we then study the following optimization problem OPT:

\[ \mathrm{OPT}: \quad \min_{\pi} |\pi|, \quad \text{subject to } \lim_{j \to \infty} P_\pi[W_j < G_j] \ge \phi_d, \]

where |π| is the number of vehicles used by π (the existence of the limit $\lim_{j \to \infty} P_\pi[W_j < G_j]$ and equivalent formulations in terms of time averages are discussed in [12], [34]). Let m* denote the optimal cost for the problem OPT (for a given information structure).

In principle, one should study the problem OPT for each of the possible information structures. In [12], instead, we considered the following strategy: first, we derived a lower bound that is valid under the most informative information structure (this implies validity under any information structure), then we presented and analyzed two service policies that are amenable to implementation under the least informative information structure (this implies implementability under any information structure). Such an approach gives general insights into the problem OPT.

1) Lower Bound: We next present a lower bound for the optimization problem OPT that holds under any information structure. Let $P = (p_1, \dots, p_m)$ and define

\[ L_m(P, Q) := 1 - \frac{1}{|Q|} \int_Q F_G\!\left( \min_{k \in \{1, \dots, m\}} \frac{\| x - p_k \|}{v} \right) dx. \]

Theorem V.1 (Lower bound on OPT). Under any information structure, the optimal cost for the minimization problem OPT is lower bounded by the optimal cost for the minimization problem

\[ \min_{m \in \mathbb{N}_{>0}} m \quad \text{subject to } \sup_{P \in Q^m} L_m(P, Q) \ge \phi_d. \tag{5} \]

The proof of this lower bound relies on some nearest-neighbor arguments. Algorithms to find the solution to the minimization problem in equation (5) have been presented in [12].

2) The Nearest-Depot Assignment (NDA) Policy: We next present the Nearest-Depot Assignment (NDA) policy, which requires the least amount of information and is optimal in light-load.

The Nearest-Depot Assignment (NDA) Policy: Let $\tilde{P}_m^*(Q) := \arg\max_{P \in Q^m} L_m(P, Q)$ (if there are multiple maxima, pick one arbitrarily), and let $\tilde{p}_k^*$ be the location of the depot for the kth vehicle, k ∈ {1, . . . , m}. Assign a newly arrived demand to the vehicle whose depot is the nearest to that demand's location, and let $D_k$ be the set of outstanding demands assigned to vehicle k. If the set $D_k$ is empty, move to $\tilde{p}_k^*$; otherwise, visit demands in $D_k$ in first-come, first-served order, by taking the shortest path to each demand location. Repeat.

In [12] we prove that the NDA policy is optimal in light-load under any information structure. Note that the NDA policy is very similar to the SQM policy described in Section III-D; the only difference is that the depot locations are now the maximizers of $L_m$, instead of the minimizers of $H_m$.
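
To make the NDA ingredients concrete, the sketch below estimates the objective $L_m(P, Q)$ by Monte Carlo integration and applies the nearest-depot assignment rule. It is a minimal illustration under stated assumptions, not the algorithm of [12] for maximizing $L_m$: the unit-square environment, unit vehicle speed, exponential patience times with mean 1, the depot layouts, and the function names `success_objective` and `nearest_depot` are all hypothetical.

```python
import numpy as np

def success_objective(depots, samples, v, patience_cdf):
    """Monte Carlo estimate of L_m(P, Q) from points sampled uniformly in Q."""
    # Distance from every sample point to every depot, shape (num_samples, m).
    dist = np.linalg.norm(samples[:, None, :] - depots[None, :, :], axis=2)
    travel_time = dist.min(axis=1) / v          # travel time from the nearest depot
    return float(np.mean(1.0 - patience_cdf(travel_time)))

def nearest_depot(depots, demand):
    """Index of the vehicle whose depot is closest to the demand (NDA assignment rule)."""
    return int(np.argmin(np.linalg.norm(depots - demand, axis=1)))

# Hypothetical data: unit-square Q, unit speed, exponential patience with mean 1.
rng = np.random.default_rng(0)
samples = rng.random((20000, 2))
exp_cdf = lambda t: 1.0 - np.exp(-t)

one_depot = np.array([[0.5, 0.5]])
four_depots = np.array([[0.25, 0.25], [0.25, 0.75], [0.75, 0.25], [0.75, 0.75]])

L1 = success_objective(one_depot, samples, 1.0, exp_cdf)
L4 = success_objective(four_depots, samples, 1.0, exp_cdf)
k = nearest_depot(four_depots, np.array([0.9, 0.2]))   # vehicle assigned to a new demand
```

Comparing `L1` and `L4` against a target $\phi_d$ mirrors the feasibility test in (5) for these fixed depot layouts only; the supremum over P would additionally require an optimizer over depot locations.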
3) The Batch (B) Policy: Finally, we present the Batch (B) policy, which is well-defined for any information structure; however, it is particularly tailored for the least informative case and is most effective in moderate and heavy-loads.

The Batch (B) Policy: Partition Q into m equal area regions $Q_k$, k ∈ {1, . . . , m}, and assign one vehicle to each region. Assign a newly arrived demand that falls in $Q_k$ to the vehicle responsible for region k, and let $D_k$ be the set of locations of outstanding demands assigned to vehicle k. For each vehicle-region pair k: if the set $D_k$ is empty, move to the median (the "depot") of $Q_k$; otherwise, compute a TSP tour through all demands in $D_k$ and vehicle's current position, and service demands by following the TSP tour, skipping demands that are no longer outstanding. Repeat.

Note that this policy is basically a simplified version of the m-DC policy (with r = 1).

The following theorem characterizes the batch policy, under the assumption of zero on-site service, and assuming the least informative information structure.

Theorem V.2 (Vehicles required by batch policy). Assuming zero on-site service, the batch policy guarantees a success factor at least as large as $\phi_d$ if the number of vehicles is equal to or larger than:

\[ \min \Big\{ m \;\Big|\; \sup_{\theta \in \mathbb{R}_{>0}} \big(1 - F_G(\theta)\big) \big(1 - 2 g(m)/\theta\big) \ge \phi_d \Big\}, \]

where

\[ g(m) := \frac{1}{2} \left( \frac{\bar{\beta}^2}{v^2} |Q| \frac{\lambda}{m^2} + \sqrt{ \frac{\bar{\beta}^4}{v^4} |Q|^2 \frac{\lambda^2}{m^4} + 8 \frac{\bar{\beta}^2}{v^2} |Q| \frac{1}{m} } \right), \]

and where β̄ is a constant that depends on the shape of the service regions.

Furthermore, in [11] we show that when (i) the system is in heavy-load, (ii) $\phi_d$ tends to one, and (iii) the deadlines are deterministic, the batch policy requires a number of vehicles that is within a factor 3.78 of the optimal.

B. Priorities

In this section we look at a DVR problem in which demands for service have different levels of importance. The service vehicles must then prioritize, providing a quality of service which is proportional to each demand's importance. We introduced this problem in [13]. Formally, we assume the environment $Q \subset \mathbb{R}^2$, with area |Q|, contains m vehicles, each with maximum speed v. Demands of type α ∈ {1, . . . , n}, called α-demands, arrive in the environment according to a Poisson process with rate $\lambda_\alpha$. Upon arrival, demands assume an independently and uniformly distributed location in Q. An α-demand requires on-site service with finite mean $\bar{s}_\alpha$.

For this problem the load factor can be written as

\[ \varrho := \frac{1}{m} \sum_{\alpha=1}^{n} \lambda_\alpha \bar{s}_\alpha. \tag{6} \]

The condition ϱ < 1 is necessary for the existence of a stable policy. For a stable policy π, the average system time per demand is

\[ T_\pi = \frac{1}{\Lambda} \sum_{\alpha=1}^{n} \lambda_\alpha T_{\pi,\alpha}, \]

where $\Lambda := \sum_{\alpha=1}^{n} \lambda_\alpha$, and $T_{\pi,\alpha}$ is the expected system time of α-demands (under routing policy π). The average system time per demand is the standard cost functional for queueing systems with multiple classes of demands. Notice that we can write $T_\pi = \sum_{\alpha=1}^{n} c_\alpha T_{\pi,\alpha}$ with $c_\alpha = \lambda_\alpha / \Lambda$. Thus, if we aim to assign distinct importance levels, we can model priority among classes by allowing any convex combination of $T_{\pi,1}, \dots, T_{\pi,n}$. If $c_\alpha > \lambda_\alpha / \Lambda$, then the system time of α-demands is being weighted more heavily than in the average case. In other words, the quantity $c_\alpha \Lambda / \lambda_\alpha$ gives the priority of α-demands compared to that given in the average system time case. Without loss of generality we can assume that priority classes are labeled so that

\[ \frac{c_1}{\lambda_1} \ge \frac{c_2}{\lambda_2} \ge \cdots \ge \frac{c_n}{\lambda_n}, \tag{7} \]

implying that if α < β for some α, β ∈ {1, . . . , n}, then the priority of α-demands is at least as high as that of β-demands.

The problem is as follows. Consider a set of coefficients $c_\alpha > 0$, α ∈ {1, . . . , n}, with $\sum_{\alpha=1}^{n} c_\alpha = 1$, and satisfying expression (7). Determine the policy π (if it exists) which minimizes the cost

\[ T_{\pi,c} := \sum_{\alpha=1}^{n} c_\alpha T_{\pi,\alpha}. \]

In the light-load case where ϱ → 0+ we can use existing policies to solve the problem. This is summarized in the following remark.

Remark V.3 (Light-load regime). In light-load, it can be verified that the Stochastic Queue Median policy (see Section III-D) provides optimal performance. That is, the vehicles can simply ignore the priorities and service the demands in the FCFS order, returning to their median locations between each service. □

1) Lower Bound in Heavy-Load: In this section we present a lower bound on the weighted system time $T_{\pi,c}$ for every policy π.

Theorem V.4 (Heavy-load lower bound). The system time of any policy π is lower bounded by

\[ T_{\pi,c} \ge \frac{\beta_{\mathrm{TSP},2}^2 |Q|}{2 m^2 v^2 (1-\varrho)^2} \sum_{\alpha=1}^{n} \Big( c_\alpha + 2 \sum_{j=\alpha+1}^{n} c_j \Big) \lambda_\alpha, \tag{8} \]

as ϱ → 1−, where $c_1, \dots, c_n$ satisfy expression (7).

Remark V.5 (Lower bound for all ϱ ∈ [0, 1)). Lower bound (8) holds only in heavy-load. We can also obtain a lower bound that is valid for all values of ϱ. However, in the heavy-load limit it is less tight than bound (8). Under the labeling in expression (7), this general bound for any policy
π is

\[ T_{\pi,c} \ge \frac{\gamma^2 |Q|}{m^2 v^2 (1-\varrho)^2} \sum_{\alpha=1}^{n} \Big( c_\alpha + 2 \sum_{j=\alpha+1}^{n} c_j \Big) \lambda_\alpha - \frac{m c_1}{2 \lambda_1} + \sum_{\alpha=1}^{n} c_\alpha \bar{s}_\alpha, \tag{9} \]

where ϱ ∈ [0, 1) and $\gamma = 2/(3\sqrt{2\pi}) \approx 0.266$. □

2) The Separate Queues Policy: In this section we present the Separate Queues (SQ) policy. This policy utilizes a probability distribution $p = [p_1, \dots, p_n]$, where $p_\alpha > 0$ for each α ∈ {1, . . . , n}, defined over the priority classes. The distribution p is a set of parameters to be used to optimize performance.

Separate Queues (SQ) Policy: Partition Q into m equal area regions and assign one vehicle to each region. For each vehicle, if the region contains no demands, then move to the median location of the region until a demand arrives. Otherwise, select a class according to the distribution p. Compute a TSP tour through all demands in the region of the selected class and service all of these demands by following the TSP tour. When the tour is completed, repeat by selecting a new class.

Figure 3 shows an illustrative example of the SQ policy. In the first frame the vehicle is servicing only class 2 (diamond shaped) demands, whereas in the second frame, the vehicle is servicing class 1 (circle shaped) demands.

Fig. 3. A representative simulation of the SQ policy for one vehicle and two priority classes. Circle shaped demands are high priority, and diamond shaped are low priority. The vehicle is marked by a chevron shaped object and TSP tour is shown in a solid line. The left-figure shows the vehicle computing a tour through class 2 demands. The right-figure shows the vehicle after completing the class 2 tour and computing a new tour through all outstanding class 1 demands.

3) Performance of the SQ Policy: By upper bounding the expected steady-state number of demands in each class, we are able to obtain the following expression for the system time of the SQ policy in heavy-load:

\[ T_{\mathrm{SQ},c} \le \frac{\beta_{\mathrm{TSP},2}^2 |Q|}{m^2 v^2 (1-\varrho)^2} \sum_{\alpha=1}^{n} \frac{c_\alpha}{p_\alpha} \left( \sum_{i=1}^{n} \sqrt{\lambda_i p_i} \right)^{2}. \tag{10} \]

Thus, we can minimize this upper bound by appropriately selecting the probability distribution $p = [p_1, \dots, p_n]$. With the selection

\[ p_\alpha := c_\alpha \quad \text{for each } \alpha \in \{1, \dots, n\}, \]

we obtain the following result.

Theorem V.6 (SQ policy performance). As ϱ → 1−, the system time of the SQ policy is within a factor $2n^2$ of the optimal system time. This factor is independent of the arrival rates $\lambda_1, \dots, \lambda_n$, coefficients $c_1, \dots, c_n$, service times $\bar{s}_1, \dots, \bar{s}_n$, and the number of vehicles m.

4) Heuristic Improvements: We now present two heuristic improvements on the SQ policy. The first improvement, called the queue merging heuristic, is guaranteed to never increase the upper bound on the expected system time, and in certain instances it significantly decreases the upper bound. To motivate the modification, consider the case when all classes have equal priority (i.e., $c_1/\lambda_1 = \cdots = c_n/\lambda_n$), and we use the probability assignment $p_\alpha = c_\alpha$ for each class α. Then, the upper bound for the Separate Queues policy is n times larger than if we (i) ignore priorities, (ii) merge the n classes into a single class, and (iii) run the SQ policy on the merged class (i.e., at each iteration, service all outstanding demands in Q via the TSP tour).

Motivated by this discussion, we define a merge configuration to be a partition of n classes {1, . . . , n} into ℓ sets $C_1, \dots, C_\ell$, where ℓ ∈ {1, . . . , n}. The idea is to run the Separate Queues policy on the ℓ classes, where class i ∈ {1, . . . , ℓ} has arrival rate $\sum_{\alpha \in C_i} \lambda_\alpha$ and convex combination coefficient $\sum_{\alpha \in C_i} c_\alpha$. Given a merge configuration $\{C_1, \dots, C_\ell\}$, and using the probability assignment $p_i = \sum_{\alpha \in C_i} c_\alpha$ for each class i ∈ {1, . . . , ℓ}, the analysis leading to (10) can easily be modified to yield an upper bound of

\[ \frac{\beta_{\mathrm{TSP},2}^2 |Q| \, \ell}{m^2 v^2 (1-\varrho)^2} \left( \sum_{i=1}^{\ell} \sqrt{ \sum_{\alpha \in C_i} c_\alpha \sum_{\beta \in C_i} \lambda_\beta } \right)^{2}. \tag{11} \]

The SQ-policy with merging can be summarized as follows:

Separate Queues (SQ) with Merging Policy: Find the merge configuration $\{C_1, \dots, C_\ell\}$ which minimizes equation (11). Run the Separate Queues policy on ℓ classes, where class i has arrival rate $\sum_{\alpha \in C_i} \lambda_\alpha$ and convex combination coefficient $\sum_{\alpha \in C_i} c_\alpha$.

Now, to minimize equation (11) in the SQ with Merging policy, one must search over all possible partitions of a set of n elements. The number of partitions is given by the Bell Number and thus search becomes infeasible for more than approximately 10 classes. However, one can also limit the search space in order to increase the number of classes that can be considered as in [13].

The second heuristic improvement for the SQ policy which can be used in implementation is called the tube heuristic. The heuristic improvement is as follows:

The Tube Heuristic: When following a tour, service all newly arrived demands that lie within distance ε > 0 of the tour.

The idea behind the heuristic is to utilize the fact that some newly arrived demands will be "close" to the demands in the current service batch, and thus can be serviced with minimal additional travel cost. Analysis of the tube heuristic is
12

complicated by the fact that it introduces correlation between


demand locations.
The parameter  should be chosen such that the total tour
length is not increased by more than, say, 10%. A rough
calculation shows that  should scale as
s
µ|Q|
∼ ,
total expected number of demands
where µ is the fractional increase in tour length (e.g., 10%).
Numerical simulations have shown that this heuristic, with an Q
appropriately chosen value of , improves the SQ performance
Fig. 4. Illustration of the Median Circling policy. The squares represent
by a factor of approximately 2. In a more sophisticated im- Pm∗ (Q), the m-median of Q. Each vehicle loiters about its respective
plementation we define an α for each α ∈ {1, . . . , n}, where generator at a radius ρ. The regions of dominance are the Voronoi partition
generated by Pm ∗ (Q). In this figure, a demand has appeared in the subregion
the magnitude of α is proportional to the probability pα .
roughly in the upper-right quarter of the domain. The vehicle responsible for
this subregion has left its loitering orbit and is en route to service the demand.
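The exhaustive search over merge configurations in the SQ with Merging policy can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `set_partitions`, `merge_cost` and `best_merge` are hypothetical, and the prefactor of (11) that does not depend on the partition, β²_TSP,2 |Q| / (m² v² (1 − ϱ)²), is dropped because it does not affect the minimizer.

```python
import math
from typing import Iterator, List, Sequence


def set_partitions(items: Sequence[int]) -> Iterator[List[List[int]]]:
    """Yield every partition of `items` into non-empty blocks.

    The number of partitions is the Bell number, so this is only
    practical for a small number of classes.
    """
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partial


def merge_cost(blocks: List[List[int]], c: Sequence[float], lam: Sequence[float]) -> float:
    """Partition-dependent factor of bound (11): ell * (sum_i sqrt(c_i * lam_i))^2,
    where c_i and lam_i are the summed coefficients and arrival rates of block i."""
    ell = len(blocks)
    s = sum(
        math.sqrt(sum(c[a] for a in blk) * sum(lam[a] for a in blk))
        for blk in blocks
    )
    return ell * s ** 2


def best_merge(c: Sequence[float], lam: Sequence[float]) -> List[List[int]]:
    """Brute-force minimizer of `merge_cost` over all merge configurations."""
    classes = list(range(len(c)))
    return min(set_partitions(classes), key=lambda blocks: merge_cost(blocks, c, lam))
```

For two classes with equal coefficients and equal arrival rates the search merges them into one class, whereas for c = [0.99, 0.01] and λ = [0.01, 0.99] it keeps them separate. Classes are indexed from 0 here, unlike the 1-based indexing in the text.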
VI. ROUTING FOR ROBOTIC NETWORKS: CONSTRAINTS ON VEHICLE MOTION

In this section, we consider the m-DTRP described in the earlier sections with the addition of differential constraints on the vehicle’s motion [18]. In particular, we concentrate on vehicles that are constrained to move on the plane at constant speed v > 0 along paths with a minimum radius of curvature ρ > 0. Such vehicles, often referred to as Dubins vehicles, have been extensively studied in the robotics and control literature [35], [36], [37]. Moreover, the Dubins vehicle model is widely accepted as a reasonably accurate model to represent aircraft kinematics, e.g., for air traffic control [38], [39], and UAV mission planning purposes [40], [4], [41]. Accordingly, the DVR problem studied in this section will be referred to as the m-Dubins DTRP. In this section we focus only on the case of a uniform spatial density ϕ.

A feasible path for the Dubins vehicle (called Dubins path) is defined as a path that is twice differentiable almost everywhere, and such that its radius of curvature is bounded below by ρ. Since a Dubins vehicle can not stop, we only consider zero on-site service time. Hence, the generic load factor ϱ = λs̄/m, as defined in subsection III-B, becomes inappropriate for this setup and, in accordance with Remark III.2, the heavy-load regime is defined as λ/m → +∞. We correspondingly define the light-load regime as λ/m → 0+.

A. Lower Bounds

In this section we provide lower bounds on the system time for the m-Dubins DTRP.

Theorem VI.1 (System time lower bounds). The optimal system time for the m-Dubins DTRP satisfies the following lower bounds:
$$ \overline{T}^* \ge \frac{H_m^*(Q)}{v}, \tag{12} $$
$$ \liminf_{d_\rho \to +\infty} \overline{T}^* \left( \frac{m}{\rho |Q|} \right)^{1/3} \ge \frac{3 \sqrt[3]{3}}{4 v}, \tag{13} $$
$$ \liminf_{\lambda/m \to +\infty} \overline{T}^* \frac{m^3}{\lambda^2} \ge \frac{81}{64} \frac{\rho |Q|}{v^3}, \tag{14} $$
where |Q| is the area of Q and $d_\rho := \rho^2 m / |Q|$ is the nonholonomic vehicle density.

The lower bound (12) follows from equation (1); however this bound is obtained by approximating the Dubins distance (i.e., the length of the shortest feasible path for a Dubins vehicle) with the Euclidean distance. The lower bound (13) is obtained by explicitly taking into account the Dubins turning cost. Although the first two lower bounds of Theorem VI.1 are valid for any λ, they are particularly useful in the light-load regime. The lower bound (14) is valid and useful in the heavy-load regime.

B. Routing Policies for the m-Dubins DTRP

We start by considering two policies that are particularly efficient in light load. The first light-load policy, called the Median Circling policy, imitates the optimal policy for Euclidean vehicles, assigning static regions of responsibility. As usual, let $P_m^*(Q)$ be the m-median of Q. The policy is formally described as follows.

The Median Circling (MC) Policy — Let the loitering orbits for the vehicles be circular trajectories of radius ρ centered at entries of $P_m^*(Q)$, with each vehicle allotted one trajectory. Each vehicle visits the demands in the Voronoi region $V_i(P_m^*(Q))$ in the order in which they arrive. When no demands are available, the vehicle returns to its loitering orbit; the direction in which the orbit is followed is not important, and can be chosen in such a way that the orbit is reached in minimum time.

An illustration of the MC policy is shown in Figure 4. We next introduce a second light-load policy, namely the Strip Loitering policy, which is more efficient than the MC policy when the nonholonomic vehicle density is large and relies on dynamic regions of responsibility for the vehicles. An illustration of the Strip Loitering policy is shown in Figure 5.

The Strip Loitering (SL) Policy — Bound the environment Q with a rectangle of minimum height, where height denotes the smaller of the two side lengths of a rectangle. Let R and S be the width and height of this bounding rectangle, respectively.
Divide Q into strips of width r, where
$$ r = \min\left\{ \left( \frac{4}{3\sqrt{\rho}} \, \frac{RS + 10.38 \rho S}{m} \right)^{2/3}, \; 2\rho \right\}. $$
Orient the strips along the side of length R. Construct a closed Dubins path, henceforth referred to as the loitering path, which runs along the longitudinal bisector of each strip, visiting all strips in top-to-bottom sequence, making U-turns between strips at the edges of Q, and finally returning to the initial configuration. The m vehicles are allotted loitering positions on this path, equally spaced, in terms of path length.
When a demand arrives, it is allocated to the closest vehicle among those that lie within the same strip as the demand and that have the demand in front of them. When a vehicle has no outstanding demands, the vehicle returns to its loitering position as follows. (We restrict the exposition to the case when a vehicle has only one outstanding demand when it leaves its loitering position and no more demands are allotted to it before it returns to its loitering position; other cases can be handled similarly.) After making a left turn of length d2 (as shown in Figure 6) to service the demand, the vehicle makes a right turn of length 2d2 followed by another left turn of length d2, and then returns to the loitering path. However, the vehicle has fallen behind in the loitering path by a distance 4(d2 − ρ sin(d2/ρ)). To rectify this, as it nears the end of the current strip, it takes its U-turn a distance 2(d2 − ρ sin(d2/ρ)) early.

Fig. 5. Illustration of the Strip Loitering policy. The trajectory providing closure of the loitering path (along which the vehicles travel from the end of the last strip to the beginning of the first strip) is not shown here for clarity of the drawing.

Fig. 6. Close-up of the Strip Loitering policy with construction of the point of departure and the distances d1, and d2 for a given demand, at the instant of appearance.

Note that the loitering path must cover Q, but it need not cover the entire bounding box of Q. The bounding box is merely a construction used to place an upper bound on the total path length.

The MC and SL policies will be proven to be efficient in light-load. We now propose the Bead Tiling policy which will be proven to be efficient in heavy-load. A key component of the algorithm is the construction of a novel geometric set, tuned to the kinetic constraints of the Dubins vehicle, called the bead [17]. The construction of a bead Bρ(ℓ) for a given ρ and an additional parameter ℓ > 0 is illustrated in Figure 7. We start with the policy for a single vehicle.

Fig. 7. An illustration for the construction of the bead for a given ρ and ℓ.

The Bead Tiling (BT) Policy — Bound the environment Q with a rectangle of minimum height, where height denotes the smaller of the two side lengths of a rectangle. Let R and S be the width and height of this bounding rectangle, respectively. Tile the plane with identical beads Bρ(ℓ) with ℓ = min{C_BTA v/λ, 4ρ}, where
$$ C_{\mathrm{BTA}} = \frac{7 - \sqrt{17}}{4} \left( 1 + \frac{7 \pi \rho S}{3 |Q|} \right)^{-1}. $$
The beads are oriented to be along the width of the bounding rectangle. The Dubins vehicle visits all beads intersecting Q in a row-by-row fashion in top-to-bottom sequence, servicing at least one demand in every nonempty bead. This process is repeated indefinitely.

The BT policy is extended to the m-vehicle Bead Tiling (mBT) policy in the following way (see Figure 8).

Fig. 8. An illustration of the mBT policy.

The m-vehicle Bead Tiling (mBT) Policy — Divide the environment into regions of dominance with
lines parallel to the bead rows. Let the area and height of the i-th vehicle’s region be denoted with |Q|_i and S_i. Place the subregion dividers in such a way that
$$ |Q|_i + \frac{7}{3} \pi \rho S_i = \frac{1}{m} \left( |Q| + \frac{7}{3} \pi \rho S \right) $$
for all i ∈ {1, …, m}. Allocate one subregion to every vehicle and let each vehicle execute the BT policy in its own region.

C. Analysis of Routing Policies

We now present the performance analysis of the routing policies we introduced in the previous section.

Theorem VI.2 (MC policy performance in light-load). The MC policy is a stabilizing policy in light-load, i.e., as λ/m → 0+. The system time of the Median Circling policy in light-load satisfies, as λ/m → 0+,
$$ \limsup_{\lambda/m \to 0^+} \frac{\overline{T}_{\mathrm{MC}}}{\overline{T}^*} \le 1 + 25 \sqrt{d_\rho}, $$
and, in particular,
$$ \lim_{d_\rho \to 0^+} \limsup_{\lambda/m \to 0^+} \frac{\overline{T}_{\mathrm{MC}}}{\overline{T}^*} = 1. $$

Theorem VI.2 implies that the MC policy is optimal in the asymptotic regime where λ/m → 0+ and dρ → 0+. Hence, the MC policy is particularly efficient in light-load for low values of the nonholonomic vehicle density. Moreover, Theorem VI.2 together with Theorem VI.1 and Equation (21) (provided in the Appendix) implies that the optimal system time in the aforementioned asymptotic regime belongs to Θ(1/(v√m)).

We now characterize the performance of the SL policy.

Theorem VI.3 (SL policy performance in light-load). The SL policy is a stabilizing policy in light-load, i.e., when λ/m → 0+. Moreover, the system time of the SL policy satisfies, as λ/m → 0+,
$$ \overline{T}_{\mathrm{SL}} \le \begin{cases} \dfrac{1.238}{v} \left( \dfrac{\rho R S + 10.38 \rho^2 S}{m} \right)^{1/3} + \dfrac{R + S + 6.19 \rho}{m v}, & \text{for } m \ge 0.471 \left( \dfrac{RS}{\rho^2} + \dfrac{10.38 S}{\rho} \right), \\[2ex] \dfrac{RS + 10.38 \rho S}{4 \rho m v} + \dfrac{R + S + 6.19 \rho}{m v} + \dfrac{1.06 \rho}{v}, & \text{otherwise.} \end{cases} $$

Theorem VI.3 together with Theorem VI.1 implies that the SL policy is within a constant factor of the optimal in the asymptotic regime where λ/m → 0+ and dρ → +∞. Moreover, in such asymptotic regime the optimal system time belongs to Θ(1/(v ∛m)).

Finally, we characterize the performance of the mBT policy.

Theorem VI.4 (mBT policy performance in heavy-load). The mBT policy is a stabilizing policy. Moreover, the system time for the mBT policy satisfies the following
$$ \lim_{\lambda/m \to +\infty} \overline{T}_{\mathrm{mBT}} \frac{m^3}{\lambda^2} \le 71 \, \frac{\rho |Q|}{v^3} \left( 1 + \frac{7 \pi \rho S}{3 |Q|} \right)^3. $$

Theorem VI.4 together with Theorem VI.1 implies that the mBT policy is within a constant factor of the optimal in heavy-load, and that the optimal system time in this case belongs to Θ(λ²/(mv)³).

It is instructive to compare the scaling of the optimal system time with respect to λ, m and v for the m-DTRP and for the m-Dubins DTRP. Such comparison is shown in Table III.

TABLE III
A COMPARISON BETWEEN THE SCALING OF THE OPTIMAL SYSTEM TIME FOR THE m-DTRP AND FOR THE m-DUBINS DTRP.

                   m-DTRP               m-Dubins DTRP
T* (λ/m → +∞)      Θ(λ/(mv)²) [8]       Θ(λ²/(mv)³) [18]
T* (λ/m → 0+)      Θ(1/(v√m)) [42]      Θ(1/(v√m)) (dρ → 0+); Θ(1/(v ∛m)) (dρ → +∞) [18]

One can observe that in heavy-load the optimal system time for the m-Dubins DTRP is of the order λ²/(mv)³, whereas for the m-DTRP it is of the order λ/(mv)². Therefore, our analysis rigorously establishes the following intuitive fact: bounded-curvature constraints make the optimal system time much more sensitive to increases in the demand generation rate. Perhaps less intuitive is the fact that the optimal system time is also more sensitive with respect to the number of vehicles and the vehicle speed in the m-Dubins DTRP as compared to the m-DTRP.

A close observation of the system time in the light-load case shows that the territorial MC policy is optimal as dρ → 0+ and the gregarious SL policy is constant-factor optimal as dρ → +∞. This suggests the existence of a phase transition in the optimal policy for the light-load scenario as one increases the number of vehicles for a fixed ρ and Q (recall that dρ = ρ²m/|Q|). It is desirable to study the fundamental factors driving this transition, ignoring its dependence on the shape of Q. Towards this end, envision an infinite number of vehicles operating on the unbounded plane. In this case, the configuration $P_m^*(Q)$ yielding the minimum of the function H_m is that in which the Voronoi partition induced by $P_m^*(Q)$ is a network of regular hexagons [42]. Moreover, in this scenario, the SL policy reduces to vehicles moving straight on infinite strips. In this setup, it is observed that the phase transition can be characterized by a critical value of the dimensionless parameter of the nonholonomic density [18], estimated to be $d_\rho^{\mathrm{unbd}} \approx 0.0587$. An alternate interpretation is that the transition occurs when each vehicle is responsible for a region of area 5.42 times that of a minimum turning-radius disk. This critical value of dρ obtained for unbounded domain has been found to be very close to the values obtained for bounded domains [18]. This result provides a system architect with valuable information to decide upon the optimal strategy in the light-load scenario for given problem parameters. Similar phase transition phenomena for other vehicles have been studied in [43].
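To make the light-load quantities of this section concrete, the following sketch evaluates the nonholonomic vehicle density, the SL strip width, and the SL upper bound of Theorem VI.3. It is only an illustration under stated assumptions: the function names are hypothetical, and using the unbounded-plane estimate 0.0587 as a hard switch between MC and SL is a simplification made for this sketch, since the text only reports that the transition in bounded domains occurs near that value.

```python
import math

# Critical nonholonomic density estimated in the text for the unbounded plane.
D_CRIT_UNBOUNDED = 0.0587


def nonholonomic_density(rho: float, m: int, area: float) -> float:
    """d_rho = rho^2 * m / |Q|."""
    return rho ** 2 * m / area


def strip_width(rho: float, m: int, R: float, S: float) -> float:
    """Strip width r used by the Strip Loitering policy."""
    r = (4.0 / (3.0 * math.sqrt(rho)) * (R * S + 10.38 * rho * S) / m) ** (2.0 / 3.0)
    return min(r, 2.0 * rho)


def sl_light_load_bound(rho: float, m: int, R: float, S: float, v: float) -> float:
    """Upper bound on the SL system time in light load (Theorem VI.3)."""
    common = (R + S + 6.19 * rho) / (m * v)
    if m >= 0.471 * (R * S / rho ** 2 + 10.38 * S / rho):
        return 1.238 / v * ((rho * R * S + 10.38 * rho ** 2 * S) / m) ** (1.0 / 3.0) + common
    return (R * S + 10.38 * rho * S) / (4.0 * rho * m * v) + common + 1.06 * rho / v


def suggested_light_load_policy(rho: float, m: int, area: float) -> str:
    """Assumed rule of thumb: MC below the critical density, SL above it."""
    return "MC" if nonholonomic_density(rho, m, area) < D_CRIT_UNBOUNDED else "SL"
```

The threshold on m in `sl_light_load_bound` is the same value at which the strip width saturates at 2ρ, which is why the bound switches form there. The reported area ratio can also be checked directly: 1/(π · 0.0587) ≈ 5.42.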

VII. ROUTING FOR ROBOTIC NETWORKS: TEAM FORMING FOR COOPERATIVE TASKS

Here we study demands (or tasks) that require the simultaneous services of several vehicles [20]. In particular, consider m vehicles, each capable of providing one of k services. We assume that there are m_j > 0 vehicles capable of providing service j (called vehicles of service-type j), for each j ∈ {1, …, k}, and thus $m := \sum_{j=1}^{k} m_j$.

In addition, we assume there are K different types of tasks. Tasks of type α ∈ {1, …, K} arrive according to a Poisson process with rate λα, and assume a location i.i.d. uniformly in Q.² The total arrival rate is $\lambda := \sum_{\alpha=1}^{K} \lambda_\alpha$. Each task-type α requires a subset of the k services. We record the required services in a zero-one (column) vector Rα ∈ {0, 1}^k. The jth entry of Rα is 1 if service j is required for task-type α, and 0 otherwise. The on-site service time for each task-type α has mean s̄α. To complete a task of type α, a team of vehicles capable of providing the required services must travel to the task location and remain there simultaneously for the on-site service time. We refer to this problem as the dynamic team forming problem [20].

² As in Section V, the algorithms in this section extend directly to a non-uniform spatial density by utilizing simultaneously equitable partitions.

As a motivating example, consider the scenario given in Section I where each demand (or task) corresponds to an event that requires close-range observation. The sensors required to properly assess each event will depend on that event’s properties. In particular, an event may require several different sensing modalities, such as electro-optical, infra-red, synthetic aperture radar, foliage penetrating radar, and moving target indication radar [44]. One solution would be to equip each UAV with all sensing modalities (or services). However, in many cases, most events will require only a few sensing modalities. Thus, we might increase our efficiency by having a larger number of UAVs, each equipped with a single modality, and then forming the appropriate sensing team to observe each event.

We restrict our attention to task-type unbiased policies; policies π for which the system time of each task (denoted by $\overline{T}_{\pi,\alpha}$) is equal, and thus $\overline{T}_{\pi,1} = \overline{T}_{\pi,2} = \cdots = \overline{T}_{\pi,K} =: \overline{T}_\pi$. We seek policies π which minimize the expected system time of tasks $\overline{T}_\pi$. Policies of this type are amenable to analysis because the task-type unbiased constraint collapses the feasible set of system times from a subset of R^K to a subset of R.

Defining the matrix
$$ R := [R_1 \cdots R_K] \in \{0, 1\}^{k \times K}, \tag{15} $$
a necessary condition for stability is
$$ R \, [\lambda_1 \bar{s}_1 \cdots \lambda_K \bar{s}_K]^T < [m_1 \cdots m_k]^T \tag{16} $$
component-wise. Note that this condition is akin to the “load factor” in Subsection III-B. However, the space of load factors is much richer, and thus light and heavy-load are no longer simply defined. To simplify the problem we take an alternative approach. We study the performance as the number of vehicles becomes very large, i.e., m → +∞. In addition, we simply look at the order of the expected delay, without concern for the constant factors. It turns out that this analysis provides substantial insight into the performance of different team forming policies. This type of asymptotic analysis is frequently performed in computational complexity [45] and ad-hoc networking [46].

A. Three Team Forming Policies

We now present three team forming policies.

The Complete Team (CT) Policy — Form m/k teams of k vehicles, where each team contains one vehicle of each service-type. Each team meets and moves as a single entity. As tasks arrive, service them by one of the m/k teams according to the UTSP policy.

For the second policy, recall that the vector R1_K records in its jth entry the number of task-types that require service j, where 1_K is a K × 1 vector of ones. Thus, if
$$ R \mathbf{1}_K \le [m_1, \ldots, m_k]^T \tag{17} $$
component-wise, then there are enough vehicles of each service-type to create
$$ m_{\mathrm{TST}} := \left\lfloor \min\left\{ \frac{m_j}{e_j^T R \mathbf{1}_K} \;\middle|\; j \in \{1, \ldots, k\} \right\} \right\rfloor $$
dedicated teams for each task-type, where e_j is the jth vector of the standard basis of R^k. Thus, when equation (17) is satisfied, we have the following policy.

The Task-Specific Team (TT) Policy — For each of the K task-types, create m_TST teams of vehicles, where there is one vehicle in the team for each service required by the task-type. Service each task by one of its m_TST corresponding teams, according to the UTSP policy.

The task-specific team policy can be applied only when there is a sufficient number of vehicles of each service-type. The following policy requires only a single vehicle of each service type. The policy partitions the task-types into groups, where each group is chosen such that there is a sufficient number of vehicles to create a dedicated team for each task-type in the group. The task-specific team policy is then run on each group sequentially. The groups are defined via a service schedule which is a partition of the K task-types into L ≤ K time slots, such that each task-type appears in precisely one time slot, and the task-types in each time slot are pairwise disjoint (i.e., in a given time slot, each service appears in at most one task-type).³ We now formally present the third policy.

³ Computing an optimal schedule is equivalent to solving a vertex coloring problem, which is NP-hard. However, an approximate schedule can be computed via known vertex coloring heuristics; see [20] for more details.

The Scheduled Task-Specific Team (STT) Policy — Partition Q into m_CT := min{m1, …, mk} equal area regions and assign one vehicle of each service-type to each region. In each region form a queue for each of the K task-types. For each time slot in the schedule and each task-type in the time slot, create

a team containing one vehicle for each required service. For each team, service the first n tasks (n is a design parameter) in the corresponding queue by following an optimal TSP tour. When the end of the service schedule is reached, repeat. Optimize over n (see [20] for details).
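The quantities that the three team forming policies rely on can be sketched in a few lines. The code below checks the necessary stability condition (16), computes the number of dedicated teams m_TST, and builds a service schedule with a simple first-fit greedy coloring. This is an illustrative sketch only: the function names are hypothetical, the requirement matrix R is stored as a list of k rows with K zero-one entries each, services and task-types are indexed from 0, and the greedy rule is one possible vertex coloring heuristic rather than the specific one used in [20].

```python
from typing import List, Sequence


def stable(R: Sequence[Sequence[int]], lam: Sequence[float],
           sbar: Sequence[float], m: Sequence[int]) -> bool:
    """Necessary condition (16): R [lam_a * sbar_a] < [m_j], component-wise."""
    load = [
        sum(R[j][a] * lam[a] * sbar[a] for a in range(len(lam)))
        for j in range(len(m))
    ]
    return all(l < mj for l, mj in zip(load, m))


def dedicated_teams(R: Sequence[Sequence[int]], m: Sequence[int]) -> int:
    """m_TST: smallest value over services j of floor(m_j / #task-types needing j).

    Services required by no task-type are skipped (an assumption of this sketch).
    """
    return min(m[j] // sum(R[j]) for j in range(len(m)) if sum(R[j]) > 0)


def greedy_schedule(R: Sequence[Sequence[int]]) -> List[List[int]]:
    """First-fit schedule: each task-type goes into the first time slot
    whose task-types share no service with it."""
    k, K = len(R), len(R[0])
    services = [{j for j in range(k) if R[j][a]} for a in range(K)]
    slots: List[List[int]] = []   # task-types in each time slot
    used: List[set] = []          # services already claimed in each slot
    for a in range(K):
        for slot, busy in zip(slots, used):
            if not (services[a] & busy):
                slot.append(a)
                busy |= services[a]
                break
        else:
            slots.append([a])
            used.append(set(services[a]))
    return slots
```

For example, with three services and three task-types requiring services {0, 1}, {1, 2} and {2} respectively, `greedy_schedule` places the first and third task-types in one time slot and the second in another, giving a schedule of length L = 2. The greedy result is not guaranteed to be the shortest schedule.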
B. Performance of Policies

To analyze the performance of the policies we make the following simplifying assumptions: (A1) There are m/k vehicles of each service-type. (A2) The arrival rate is λ/K for each task-type. (A3) The on-site service time has mean s̄ and is upper bounded by s_max for all task-types. (A4) There exists p ∈ [1/k, 1] such that for each j ∈ {1, …, k} the service j appears in pK of the K task-types. Thus, each task will require service j with probability p.

With these assumptions, the stability condition in equation (16) simplifies to
$$ \frac{\lambda}{m} < \frac{1}{p k \bar{s}}. \tag{18} $$
We say that λ is the total throughput of the system (i.e., the total number of tasks served per unit time), and B_m := λ/m is the per-vehicle throughput.

Finally, we study the system time as the number of vehicles m becomes large. As m increases, if the density of vehicles is to remain constant, then the environment must grow. In fact, the ratio √|Q|/v must scale as √m, [47]. In [2] this scaling is referred to as a critical environment. Thus we will study the performance in the asymptotic regime where (i) the number of vehicles m → +∞; (ii) on-site service times are independent of m; (iii) |Q(m)|/(m v²(m)) → constant > 0.

To characterize the system time as a function of the per-vehicle throughput B_m we introduce the canonical throughput vs. system time profile $f_{T_{\min}, T_{\mathrm{ord}}, B_{\mathrm{crit}}} : \mathbb{R}_{>0} \to \mathbb{R}_{>0} \cup \{+\infty\}$ which has the form
$$ B_m \mapsto \begin{cases} \max\left\{ T_{\min}, \; T_{\mathrm{ord}} \dfrac{B_m / B_{\mathrm{crit}}}{(1 - B_m / B_{\mathrm{crit}})^2} \right\}, & \text{if } B_m < B_{\mathrm{crit}}, \\[2ex] +\infty, & \text{if } B_m \ge B_{\mathrm{crit}}. \end{cases} \tag{19} $$
This profile (see Figure 9) is described by the three positive parameters T_min, T_ord and B_crit, where T_ord ≥ T_min. These parameters have the following interpretation:
• T_min is the minimum achievable system time for any positive throughput.
• B_crit is the maximum achievable throughput (or capacity).
• T_ord is the system time when operating at (3 − √5)/2 ≈ 38% of capacity B_crit. Additionally, T_ord captures the order of the system time when operating at a constant fraction of capacity.

Fig. 9. The canonical throughput vs. system time profile for the dynamic team forming problem. The semi-log plot is for parameter values of T_min = 1, T_ord = 10, and B_crit = 1. If B_m ≥ B_crit, then the system time is +∞.

For each of the three policies π, we can write the system time as $\overline{T}_\pi \in O\big(f_{T_{\min}, T_{\mathrm{ord}}, B_{\mathrm{crit}}}(B_m)\big)$ for some values of T_min, T_ord, and B_crit. In addition, we can write the lower bound in the form $\overline{T}^* \in \Omega\big(f_{T_{\min}, T_{\mathrm{ord}}, B_{\mathrm{crit}}}(B_m)\big)$. We summarize the corresponding parameter values for the policies and the lower bound in Table IV. We refer the reader to [20] for the proof of each upper and lower bound. In this table, k is the number of services, K is the number of task-types, L is the length of the service schedule, and p is the probability that a task-type requires service j for each j ∈ {1, …, k}.

TABLE IV
A COMPARISON OF THE CANONICAL THROUGHPUT VS. SYSTEM TIME PARAMETERS FOR THE THREE POLICIES. HERE pK ≤ L ≤ K IS THE SCHEDULE LENGTH.

                    T_min      T_ord    B_crit
Lower bound (T*)    √k         k        1/(p k s̄)
CT Policy           √k         k        1/(k s̄)
TT Policy           √(pkK)     pkK      1/(C s̄ p k)
STT Policy          L√k        Lk       K/(s_max L k)

From these results we draw several conclusions. First, if the throughput is very low, then the CT Policy has an expected system time of Θ(√k), which is within a constant factor of the optimal. In addition, if p is close to one and each task requires nearly every service, then CT is within a constant factor of the optimal in terms of capacity and system time. Second, if p ∼ 1/k and each task requires few services, then the capacity of CT is sub-optimal, and the capacity of both TT and STT are within a constant factor of optimal. However, the system time of the TT and STT policies may be much higher than the lower bound when the number of task-types K is very large. Third, the TT policy performs at least as well as the STT policy, both in terms of capacity and system time. Thus, one should use the TT policy if there is a sufficient number of vehicles of each service-type. However, if p ∼ 1/k and if resources are limited such that the TT policy cannot be used, then the STT Policy should be used to maximize capacity.

VIII. SUMMARY AND FUTURE DIRECTIONS

In this paper we presented a joint algorithmic and queueing approach to the design of cooperative control, task allocation and dynamic routing strategies for networks of uninhabited vehicles required to operate in dynamic and uncertain environments. The approach integrates ideas from dynamics, combinatorial optimization, teaming, and distributed algorithms. We

have presented dynamic vehicle routing algorithms with provable performance guarantees for several important problems. These include adaptive and decentralized implementations, demands with time constraints and priority levels, vehicles with motion constraints, and team forming. These results complement those from the online algorithms literature, in that they characterize average case performance (rather than worst-case), and exploit probabilistic knowledge about future demands.

Dynamic vehicle routing is an active area of research and, in recent years, several directions have been pursued which were not covered in this paper. In [14], [48], we consider moving demands. The work focuses on demands that arrive on a line segment, and move in a perpendicular direction at fixed speed. The problem has applications in perimeter defense as well as robotic pick-and-place operations. In [19], a setup is considered where the information on outstanding demands is provided to the vehicles through limited-range on-board sensors, thus adding a search component to the DVR problem with full information. The work in [49] and [50] considers the dynamic pickup and delivery problem, where each demand consists of a source-destination pair, and the vehicles are responsible for picking up a message at the source, and delivering it to the destination. In [8], the authors consider the case in which each vehicle can serve at most a finite number of demands before returning to a depot for refilling. In [51], a DVR problem is considered involving vehicles whose dynamics can be modeled by state space models that are affine in control and have an output in R². Finally, in [21] we consider a setup where the servicing of a demand needs to be done by a vehicle under the supervision of a remotely located human operator.

The dynamic vehicle routing approach presented in this paper provides a new way of studying robotic systems in dynamically changing environments. We have presented results for a wide variety of problems. However, this is by no means a closed book. There is great potential for obtaining more general performance guarantees by developing methods to deal with correlation between demand positions. In addition, there are many other key problems in robotic systems that could benefit from being studied from the perspective presented in this paper. Some examples include search and rescue missions, force protection, map maintenance, and pursuit-evasion.

ACKNOWLEDGMENTS

The authors wish to thank Alessandro Arsie, Shaunak D. Bopardikar, and John J. Enright for numerous helpful discussions about topics related to this paper.

REFERENCES

[1] B. J. Moore and K. M. Passino, “Distributed task assignment for mobile agents,” IEEE Transactions on Automatic Control, vol. 52, no. 4, pp. 749–753, 2007.
[2] S. L. Smith and F. Bullo, “Monotonic target assignment for robotic networks,” IEEE Transactions on Automatic Control, vol. 54, no. 9, pp. 2042–2057, 2009.
[4] R. W. Beard, T. W. McLain, M. A. Goodrich, and E. P. Anderson, “Coordinated target assignment and intercept for unmanned air vehicles,” IEEE Transactions on Robotics and Automation, vol. 18, no. 6, pp. 911–922, 2002.
[5] G. Arslan, J. R. Marden, and J. S. Shamma, “Autonomous vehicle-target assignment: A game theoretic formulation,” ASME Journal on Dynamic Systems, Measurement, and Control, vol. 129, no. 5, pp. 584–596, 2007.
[6] P. Toth and D. Vigo, eds., The Vehicle Routing Problem. Monographs on Discrete Mathematics and Applications, SIAM, 2001.
[7] D. J. Bertsimas and G. J. van Ryzin, “A stochastic and dynamic vehicle routing problem in the Euclidean plane,” Operations Research, vol. 39, pp. 601–615, 1991.
[8] D. J. Bertsimas and G. J. van Ryzin, “Stochastic and dynamic vehicle routing in the Euclidean plane with multiple capacitated vehicles,” Operations Research, vol. 41, no. 1, pp. 60–76, 1993.
[9] D. J. Bertsimas and G. J. van Ryzin, “Stochastic and dynamic vehicle routing with general interarrival and service time distributions,” Advances in Applied Probability, vol. 25, pp. 947–978, 1993.
[10] H. N. Psaraftis, “Dynamic programming solution to the single vehicle many-to-many immediate request dial-a-ride problem,” Transportation Science, vol. 14, no. 2, pp. 130–154, 1980.
[11] M. Pavone, N. Bisnik, E. Frazzoli, and V. Isler, “A stochastic and dynamic vehicle routing problem with time windows and customer impatience,” ACM/Springer Journal of Mobile Networks and Applications, vol. 14, no. 3, pp. 350–364, 2009.
[12] M. Pavone and E. Frazzoli, “Dynamic vehicle routing with stochastic time constraints,” in Proc. IEEE Conf. on Robotics and Automation, (Anchorage, Alaska), May 2010. To Appear.
[13] S. L. Smith, M. Pavone, F. Bullo, and E. Frazzoli, “Dynamic vehicle routing with priority classes of stochastic demands,” SIAM Journal on Control and Optimization, vol. 48, no. 5, pp. 3224–3245, 2010.
[14] S. D. Bopardikar, S. L. Smith, F. Bullo, and J. P. Hespanha, “Dynamic vehicle routing for translating demands: Stability analysis and receding-horizon policies,” IEEE Transactions on Automatic Control, vol. 55, no. 11, 2010. (Submitted, Mar 2009) to appear.
[15] M. Pavone, E. Frazzoli, and F. Bullo, “Adaptive and distributed algorithms for vehicle routing in a stochastic and dynamic environment,” IEEE Transactions on Automatic Control, 2009. Provisionally Accepted, http://arxiv.org/abs/0903.3624.
[16] A. Arsie, K. Savla, and E. Frazzoli, “Efficient routing algorithms for multiple vehicles with no explicit communications,” IEEE Transactions on Automatic Control, vol. 54, no. 10, pp. 2302–2317, 2009.
[17] K. Savla, E. Frazzoli, and F. Bullo, “Traveling Salesperson Problems for the Dubins vehicle,” IEEE Transactions on Automatic Control, vol. 53, no. 6, pp. 1378–1391, 2008.
[18] J. J. Enright, K. Savla, E. Frazzoli, and F. Bullo, “Stochastic and dynamic routing problems for multiple UAVs,” AIAA Journal of Guidance, Control, and Dynamics, vol. 34, no. 4, pp. 1152–1166, 2009.
[19] J. J. Enright and E. Frazzoli, “Cooperative UAV routing with limited sensor range,” in AIAA Conf. on Guidance, Navigation and Control, (Keystone, CO), Aug. 2006.
[20] S. L. Smith and F. Bullo, “The dynamic team forming problem: Throughput and delay for unbiased policies,” Systems & Control Letters, vol. 58, no. 10-11, pp. 709–715, 2009.
[21] K. Savla, T. Temple, and E. Frazzoli, “Human-in-the-loop vehicle routing policies for dynamic environments,” in IEEE Conf. on Decision and Control, (Cancún, México), pp. 1145–1150, Dec. 2008.
[22] D. D. Sleator and R. E. Tarjan, “Amortized efficiency of list update and paging rules,” Communications of the ACM, vol. 28, no. 2, pp. 202–208, 1985.
[23] S. O. Krumke, W. E. de Paepe, D. Poensgen, and L. Stougie, “News from the online traveling repairman,” Theoretical Computer Science, vol. 295, no. 1-3, pp. 279–294, 2003.
[24] S. Irani, X. Lu, and A. Regan, “On-line algorithms for the dynamic traveling repair problem,” Journal of Scheduling, vol. 7, no. 3, pp. 243–258, 2004.
[25] P. Jaillet and M. R. Wagner, “Online routing problems: Value of advanced information and improved competitive ratios,” Transportation Science, vol. 40, no. 2, pp. 200–210, 2006.
[26] B. Golden, S. Raghavan, and E. Wasil, The Vehicle Routing Problem: Latest Advances and New Challenges, vol. 43 of Operations Research/Computer Science Interfaces. Springer, 2008.
[3] M. Alighanbari and J. P. How, “A robust approach to the UAV task [27] P. Van Hentenryck, R. Bent, and E. Upfal, “Online stochastic optimiza-
assignment problem,” International Journal on Robust and Nonlinear tion under time constraints,” Annals of Operations Research, 2009. To
Control, vol. 18, no. 2, pp. 118–134, 2008. appear.
[28] P. K. Agarwal and M. Sharir, “Efficient algorithms for geometric optimization,” ACM Computing Surveys, vol. 30, no. 4, pp. 412–458, 1998.
[29] Z. Drezner, ed., Facility Location: A Survey of Applications and Methods. Series in Operations Research, Springer, 1995.
[30] H. Xu, Optimal Policies for Stochastic and Dynamic Vehicle Routing Problems. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA, 1995.
[31] S. Bespamyatnikh, D. Kirkpatrick, and J. Snoeyink, “Generalizing ham sandwich cuts to equitable subdivisions,” Discrete & Computational Geometry, vol. 24, pp. 605–622, 2000.
[32] M. Pavone, A. Arsie, E. Frazzoli, and F. Bullo, “Equitable partitioning policies for robotic networks,” IEEE Transactions on Automatic Control, 2009. Provisionally Accepted, available at http://arxiv.org/abs/0903.5267.
[33] J. D. Papastavrou, “A stochastic and dynamic routing policy using branching processes with state dependent immigration,” European Journal of Operational Research, vol. 95, pp. 167–177, 1996.
[34] M. Pavone, Dynamic Vehicle Routing for Robotic Networks. Dept. of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, MA, 2010.
[35] L. E. Dubins, “On curves of minimal length with a constraint on average curvature and with prescribed initial and terminal positions and tangents,” American Journal of Mathematics, vol. 79, pp. 497–516, 1957.
[36] S. M. LaValle, Planning Algorithms. Cambridge University Press, 2006. Available at http://planning.cs.uiuc.edu.
[37] U. Boscain and B. Piccoli, Optimal Syntheses for Control Systems on 2-D Manifolds. Mathématiques et Applications, Springer, 2004.
[38] L. Pallottino and A. Bicchi, “On optimal cooperative conflict resolution for air traffic management systems,” IEEE Transactions on Intelligent Transportation Systems, vol. 1, no. 4, pp. 221–231, 2000.
[39] C. Tomlin, I. Mitchell, and R. Ghosh, “Safety verification of conflict resolution manoeuvres,” IEEE Transactions on Intelligent Transportation Systems, vol. 2, no. 2, pp. 110–120, 2001.
[40] P. Chandler, S. Rasmussen, and M. Pachter, “UAV cooperative path planning,” in AIAA Conf. on Guidance, Navigation and Control, (Denver, CO), Aug. 2000.
[41] C. Schumacher, P. R. Chandler, S. J. Rasmussen, and D. Walker, “Task allocation for wide area search munitions with variable path length,” in American Control Conference, (Denver, CO), pp. 3472–3477, 2003.
[42] E. Zemel, “Probabilistic analysis of geometric location problems,” Annals of Operations Research, vol. 1, no. 3, pp. 215–238, 1984.
[43] K. Savla and E. Frazzoli, “On endogenous reconfiguration for mobile robotic networks,” in Workshop on Algorithmic Foundations of Robotics, (Guanajuato, Mexico), Dec. 2008.
[44] E. K. P. Chong, C. M. Kreucher, and A. O. Hero III, “Monte-Carlo-based partially observable Markov decision process approximations for adaptive sensing,” in Int. Workshop on Discrete Event Systems, (Göteborg, Sweden), pp. 173–180, May 2008.
[45] B. Korte and J. Vygen, Combinatorial Optimization: Theory and Algorithms, vol. 21 of Algorithmics and Combinatorics. Springer, 4th ed., 2007.
[46] P. Gupta and P. R. Kumar, “The capacity of wireless networks,” IEEE Transactions on Information Theory, vol. 46, no. 2, pp. 388–404, 2000.
[47] V. Sharma, M. Savchenko, E. Frazzoli, and P. Voulgaris, “Transfer time complexity of conflict-free vehicle routing with no communications,” International Journal of Robotics Research, vol. 26, no. 3, pp. 255–272, 2007.
[48] S. L. Smith, S. D. Bopardikar, and F. Bullo, “A dynamic boundary guarding problem with translating demands,” in IEEE Conf. on Decision and Control, (Shanghai, China), pp. 8543–8548, Dec. 2009.
[49] H. A. Waisanen, D. Shah, and M. A. Dahleh, “A dynamic pickup and delivery problem in mobile networks under information constraints,” IEEE Transactions on Automatic Control, vol. 53, no. 6, pp. 1419–1433, 2008.
[50] M. R. Swihart and J. D. Papastavrou, “A stochastic and dynamic model for the single-vehicle pick-up delivery problem,” European Journal of Operational Research, vol. 114, pp. 447–464, 1999.
[51] S. Itani, E. Frazzoli, and M. A. Dahleh, “Dynamic travelling repairperson problem for dynamic systems,” in IEEE Conf. on Decision and Control, (Cancún, México), pp. 465–470, Dec. 2008.
[52] J. Beardwood, J. Halton, and J. Hammersley, “The shortest path through many points,” in Proceedings of the Cambridge Philosophical Society, vol. 55, pp. 299–327, 1959.
[53] J. M. Steele, “Probabilistic and worst case analyses of classical problems of combinatorial optimization in Euclidean space,” Mathematics of Operations Research, vol. 15, no. 4, p. 749, 1990.
[54] A. G. Percus and O. C. Martin, “Finite size and dimensional dependence of the Euclidean traveling salesman problem,” Physical Review Letters, vol. 76, no. 8, pp. 1188–1191, 1996.
[55] R. C. Larson and A. R. Odoni, Urban Operations Research. Prentice Hall, 1981.
[56] D. Applegate, R. Bixby, V. Chvátal, and W. Cook, “On the solution of traveling salesman problems,” in Documenta Mathematica, Journal der Deutschen Mathematiker-Vereinigung, (Berlin, Germany), pp. 645–656, Aug. 1998. Proceedings of the International Congress of Mathematicians, Extra Volume ICM III.
[57] N. Christofides, “Worst-case analysis of a new heuristic for the traveling salesman problem,” Tech. Rep. 388, Carnegie Mellon University, Pittsburgh, PA, Apr. 1976.
[58] S. Arora, “Nearly linear time approximation scheme for Euclidean TSP and other geometric problems,” in IEEE Symposium on Foundations of Computer Science, (Miami Beach, FL), pp. 554–563, Oct. 1997.
[59] S. Lin and B. W. Kernighan, “An effective heuristic algorithm for the traveling-salesman problem,” Operations Research, vol. 21, pp. 498–516, 1973.
[60] N. Megiddo and K. J. Supowit, “On the complexity of some common geometric location problems,” SIAM Journal on Computing, vol. 13, no. 1, pp. 182–196, 1984.
[61] C. H. Papadimitriou, “Worst-case and probabilistic analysis of a geometric location problem,” SIAM Journal on Computing, vol. 10, no. 3, 1981.

APPENDIX

A. Asymptotic Properties of the Traveling Salesman Problem in the Euclidean Plane

Let $D$ be a set of $n$ points in $\mathbb{R}^d$ and let $\mathrm{TSP}(D)$ denote the minimum length of a tour through all the points in $D$; by convention, $\mathrm{TSP}(\emptyset) = 0$. Assume that the locations of the $n$ points are random variables independently and identically distributed in a compact set $Q$; in [52], [53] it is shown that there exists a constant $\beta_{\mathrm{TSP},d}$ such that, almost surely,
\[
\lim_{n \to +\infty} \frac{\mathrm{TSP}(D)}{n^{1-1/d}} = \beta_{\mathrm{TSP},d} \int_{Q} \bar{\varphi}(q)^{1-1/d} \, dq, \tag{20}
\]
where $\bar{\varphi}$ is the density of the absolutely continuous part of the point distribution. For the case $d = 2$, the constant $\beta_{\mathrm{TSP},2}$ has been estimated numerically as $\beta_{\mathrm{TSP},2} \simeq 0.7120 \pm 0.0002$ [54]. Notice that the limit (20) holds for all compact sets: the shape of the set only affects the convergence rate to the limit. According to [55], if $Q$ is a “fairly compact and fairly convex” set in the plane, then equation (20) provides a “good” estimate of the optimal TSP tour length for values of $n$ as low as 15.

B. Tools for Solving TSPs

The TSP is known to be NP-hard, which suggests that there is no general algorithm capable of finding the optimal tour in an amount of time polynomial in the size of the input. Even though the exact optimal solution of a large TSP can be very hard to compute, several exact and heuristic algorithms and software tools are available for the numerical solution of TSPs.

The most advanced TSP solver to date is arguably concorde [56]. Polynomial-time algorithms are available for constant-factor approximations of TSP solutions, among which we mention Christofides’ algorithm [57]. On a more theoretical side, Arora proved the existence of polynomial-time approximation schemes for the Euclidean TSP, providing a $(1 + \varepsilon)$-approximation for any $\varepsilon > 0$ [58].
A modified version of the Lin-Kernighan heuristic [59] is implemented in linkern; this powerful solver yields tours within roughly 5% of the optimal tour cost very quickly for many instances. For example, in our numerical experiments on a machine with a 2 GHz Intel Core Duo processor, approximations of random TSPs with 10,000 points typically required about twenty seconds of CPU time.⁴
We presented algorithms that require online solutions of possibly large TSPs. Practical implementations of the algorithms would rely on heuristics, such as Lin-Kernighan’s, or on polynomial-time approximation algorithms, such as Christofides’. If a constant-factor approximation algorithm is used, the effect on the asymptotic performance guarantees of our algorithms can be simply modeled as a scaling of the constant $\beta_{\mathrm{TSP},d}$.
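
As a rough illustration of how a heuristic tour compares with the asymptotic estimate of Appendix A, the following Python sketch builds a nearest-neighbor tour, improves it with 2-opt, and compares the resulting length with $\beta_{\mathrm{TSP},2}\sqrt{n|Q|}$, which is what the right-hand side of (20) multiplied by $n^{1/2}$ reduces to for points uniformly distributed in a planar set of area $|Q|$. The 2-opt heuristic is used here only as a simple stand-in for linkern; it is not the Lin-Kernighan heuristic, and the function names, the instance size, and the random seed are assumptions made for the example.

```python
import math
import random

BETA_TSP_2 = 0.7120  # numerical estimate of beta_{TSP,2} quoted from [54]


def tour_length(points, tour):
    """Length of the closed tour that visits points in the given order."""
    n = len(tour)
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]])
               for i in range(n))


def nearest_neighbor_tour(points):
    """Greedy construction: always move to the closest unvisited point."""
    unvisited = set(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(points, tour):
    """Reverse tour segments for as long as a reversal shortens the tour."""
    n = len(tour)
    tour = tour[:]  # work on a copy
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # the two edges share a vertex
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Change in length if edges (a,b),(c,d) become (a,c),(b,d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour


def compare_with_estimate(n=200, seed=0):
    """Return (nearest-neighbor length, 2-opt length, asymptotic estimate)
    for n points drawn uniformly in the unit square (area |Q| = 1)."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    nn = nearest_neighbor_tour(points)
    opt = two_opt(points, nn)
    estimate = BETA_TSP_2 * math.sqrt(n * 1.0)
    return tour_length(points, nn), tour_length(points, opt), estimate


if __name__ == "__main__":
    nn_len, opt_len, estimate = compare_with_estimate()
    print(f"nearest neighbor: {nn_len:.3f}")
    print(f"after 2-opt:      {opt_len:.3f}")
    print(f"estimate (20):    {estimate:.3f}")
```

For moderate $n$ the heuristic tour is expected to come out somewhat longer than the estimate, both because 2-opt is suboptimal and because (20) is an asymptotic statement; the ratio between the two plays the role of the scaling factor on $\beta_{\mathrm{TSP},d}$ mentioned above.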

C. Properties of the Multi-Median Function


In this subsection, we state certain useful properties of the multi-median function, introduced in Section III-A. The multi-median function can be written as
\[
H_m(P, Q) := \mathbb{E}\Big[ \min_{k \in \{1,\dots,m\}} \| p_k - q \| \Big] = \sum_{k=1}^{m} \int_{V_k(P)} \| p_k - q \| \, \varphi(q) \, dq,
\]
where $V(P) = (V_1(P), \dots, V_m(P))$ is the Voronoi partition of the set $Q$ generated by the points $P$. It is straightforward to show that the map $P \mapsto H_1(P, Q)$ is differentiable and strictly convex on $Q$. Therefore, it is a simple computational task to compute $P_1^*(Q)$. On the other hand, when $m > 1$, the map $P \mapsto H_m(P, Q)$ is differentiable (whenever $(p_1, \dots, p_m)$ are distinct) but not convex, thus making the solution of the continuous $m$-median problem hard in the general case. It is known [28], [60] that the discrete version of the $m$-median problem is NP-hard for $d \geq 2$. However, one can provide tight bounds on the map $m \mapsto H_m^*(Q)$. This problem is studied thoroughly in [61] for square regions and in [42] for more general compact regions. It is shown that, in the limit $m \to +\infty$, $H_m^*(Q) \sqrt{m} \to c_{\mathrm{hex}} \sqrt{|Q|}$ almost surely [42], where $c_{\mathrm{hex}} \approx 0.377$ is the first moment of a hexagon of unit area about its center. This optimal asymptotic value is achieved by placing the $m$ points at the centers of the hexagons in a regular hexagonal lattice within $Q$ (the honeycomb heuristic). Working towards the above result, it is also shown [42], [18] that for any $m \in \mathbb{N}$:
\[
0.3761 \sqrt{\frac{|Q|}{m}} \leq H_m^*(Q) \leq c(Q) \sqrt{\frac{|Q|}{m}}, \tag{21}
\]
where $c(Q)$ is a constant depending on the shape of $Q$.
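
To make the bounds in (21) concrete, the sketch below evaluates $H_m(P, Q)$ by midpoint quadrature for the unit square under a uniform density (an assumption of the example, as are the function names and the grid resolution) and compares the value obtained for a regular square grid of generators with the lower bound $0.3761\sqrt{|Q|/m}$. Since $H_m(P, Q) \geq H_m^*(Q)$ for any choice of $P$, the computed values must lie above that bound.

```python
import math


def multi_median_cost(generators, grid=200):
    """Approximate H_m(P, Q) for Q = [0, 1]^2 with uniform density.

    Midpoint rule on a grid x grid mesh: each cell contributes the distance
    from its center to the nearest generator, weighted by the cell area.
    """
    h = 1.0 / grid
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) * h
        for j in range(grid):
            y = (j + 0.5) * h
            total += min(math.hypot(x - px, y - py) for px, py in generators)
    return total * h * h


def square_grid_generators(k):
    """Centers of a k x k partition of the unit square into equal squares."""
    return [((a + 0.5) / k, (b + 0.5) / k)
            for a in range(k) for b in range(k)]


def lower_bound(m, area=1.0):
    """Left-hand side of (21)."""
    return 0.3761 * math.sqrt(area / m)


if __name__ == "__main__":
    for k in (1, 2, 3):
        m = k * k
        cost = multi_median_cost(square_grid_generators(k))
        print(f"m={m}: H_m ~ {cost:.4f}, lower bound {lower_bound(m):.4f}")
```

For a single generator at the center of the unit square the exact value is $(\sqrt{2} + \ln(1 + \sqrt{2}))/6 \approx 0.3826$, which gives a quick check on the quadrature. Square cells are not optimal, so the grid values sit slightly above the lower bound; for large $m$ the honeycomb arrangement would come closer to it.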

⁴ The concorde and linkern solvers are freely available for academic research use at www.tsp.gatech.edu/concorde/index.html.
