Chapter 2
QUEUES
Simulation is often used in the analysis of queueing models. In a simple but typical
queueing model, shown in Figure 2.1, customers arrive from time to time and join a queue
(waiting line), are eventually served, and finally leave the system. The term customer refers to
any type of entity that can be viewed as requesting service from a system. Therefore, many
service facilities, production systems, repair and maintenance facilities, communications and
computer systems, and transport and material-handling systems can be viewed as queueing
systems.
Queueing models, whether solved mathematically or analyzed through simulation, provide the
analyst with a powerful tool for designing and evaluating the performance of queueing systems.
Typical measures of system performance include server utilization (percentage of time a server
is busy), length of waiting lines, and delays of customers. Quite often, when designing or
attempting to improve a queueing system, the analyst (or decision maker) is involved in
tradeoffs between server utilization and customer satisfaction in terms of line lengths and
delays. Queueing theory and simulation analysis are used to predict these measures of system
performance as a function of the input parameters. The input parameters include the arrival rate
of customers, the service demands of customers, the rate at which a server works, and the
number and arrangement of servers.
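As a minimal sketch of how simulation predicts these measures from the input parameters, the following simulates a single-server, first-come-first-served queue. The exponential interarrival and service times, the function name, and the parameter names are assumptions made for illustration only.

```python
import random

def simulate_single_server(arrival_rate, service_rate, num_customers, seed=0):
    """Simulate a single-server FIFO queue (assumed exponential times)."""
    rng = random.Random(seed)
    clock = 0.0            # arrival time of the current customer
    server_free_at = 0.0   # time at which the server finishes all earlier work
    total_delay = 0.0      # sum of waiting times in the line
    total_busy = 0.0       # sum of service times
    for _ in range(num_customers):
        clock += rng.expovariate(arrival_rate)
        service = rng.expovariate(service_rate)
        start = max(clock, server_free_at)   # wait only if the server is busy
        total_delay += start - clock
        total_busy += service
        server_free_at = start + service
    return {
        # fraction of the simulated period during which the server was busy
        "utilization": total_busy / server_free_at,
        "mean_delay": total_delay / num_customers,
    }

if __name__ == "__main__":
    for lam in (0.5, 0.8, 0.9):
        print(lam, simulate_single_server(lam, 1.0, 100_000))
```

Raising the arrival rate toward the service rate increases utilization but also lengthens delays, which is the tradeoff described above.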
The key elements of a queueing system are the customers and servers. The term customer can
refer to people, machines, trucks, mechanics, patients, pallets, airplanes, e-mail, cases, orders,
or dirty clothes: anything that arrives at a facility and requires service. The term server might
refer to receptionists, repair personnel, mechanics, medical personnel, automatic storage and
retrieval machines (e.g., cranes), runways at an airport, automatic packers, order pickers, CPUs
in a computer, or washing machines: any resource (person, machine, etc.) that provides the
requested service. Although the terminology employed will be that of a customer arriving at a
service facility, sometimes the server moves to the customer; for example, a repair person
moving to a broken machine. This in no way invalidates the models but is merely a matter of
terminology.
Service times may be constant or of random duration. In the latter case, the service times are
usually characterized as a sequence of independent and identically distributed random variables.
Sometimes service times are identically distributed for all customers of a given type or class or
priority, whereas customers of different types might have completely different service time
distributions. In addition, in some systems, service times depend upon the time of day or upon
the length of the waiting line. For example, servers might work faster than usual when the
waiting line is long, thus effectively reducing the service times.
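The two cases above (service times that differ by customer class, and servers that speed up when the line is long) can be sketched as follows. The class names, mean values, threshold, and speedup factor are hypothetical, and the exponential distribution is an assumption.

```python
import random

# Hypothetical mean service times (in minutes) for two customer classes.
MEAN_SERVICE = {"regular": 4.0, "express": 1.5}

def sample_service_time(customer_class, queue_length, rng,
                        speedup_threshold=5, speedup=0.8):
    """Draw one service time that depends on class and on the line length."""
    # Each class has its own (assumed exponential) service time distribution.
    base = rng.expovariate(1.0 / MEAN_SERVICE[customer_class])
    # Assumption: servers work faster once the line exceeds the threshold.
    if queue_length > speedup_threshold:
        base *= speedup
    return base

if __name__ == "__main__":
    rng = random.Random(42)
    print(sample_service_time("regular", queue_length=2, rng=rng))
    print(sample_service_time("express", queue_length=9, rng=rng))
```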
Service Mechanism (m)
A queueing system consists of a number of service centers and interconnecting queues. Each
service center consists of some number of servers, m, working in parallel; that is, upon getting
to the head of the line, a customer takes the first available server. Parallel service mechanisms
are either single server (m = 1), multiple server (1 < m < ∞), or unlimited servers (m = ∞). A
self-service facility is usually characterized as having an unlimited number of servers.
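The rule that a customer at the head of the line takes the first available of m parallel servers can be sketched with a priority queue of server release times. The function name and the list-based inputs are illustrative assumptions, and first-come-first-served order is assumed.

```python
import heapq

def service_start_times(arrivals, services, m):
    """Return the time at which each customer starts service.

    arrivals: arrival times in nondecreasing order
    services: service time of each customer
    m: number of identical parallel servers
    """
    free_at = [0.0] * m          # time at which each server next becomes free
    heapq.heapify(free_at)
    starts = []
    for arrival, service in zip(arrivals, services):
        earliest = heapq.heappop(free_at)   # first server to become available
        start = max(arrival, earliest)      # wait only if every server is busy
        starts.append(start)
        heapq.heappush(free_at, start + service)
    return starts

if __name__ == "__main__":
    print(service_start_times([0, 0, 0], [5, 5, 5], m=1))
    print(service_start_times([0, 0, 0], [5, 5, 5], m=2))
```

With unlimited servers (m = ∞) no customer ever waits, so each start time simply equals the arrival time.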
System Capacity (N)
In many queueing systems, there is a limit to the number of customers that may be in the
waiting line or system. For example, an automatic car wash might have room for only 10 cars
to wait in line to enter the mechanism. An arriving customer who finds the system full does not
enter but returns immediately to the calling population.
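A minimal sketch of the capacity rule for a single-server, first-come-first-served system follows. Here the limit is taken to apply to the whole system (waiting plus in service); that interpretation, the function name, and the inputs are assumptions for illustration.

```python
from collections import deque

def count_blocked(arrivals, services, capacity):
    """Count arrivals turned away because the system is full."""
    departures = deque()   # departure times of customers now in the system
    blocked = 0
    for arrival, service in zip(arrivals, services):
        # Remove customers who have already left by this arrival time.
        while departures and departures[0] <= arrival:
            departures.popleft()
        if len(departures) >= capacity:
            blocked += 1   # system full: the customer does not enter
            continue
        # Service starts when the customer ahead leaves, or at once if idle.
        start = departures[-1] if departures else arrival
        departures.append(start + service)
    return blocked

if __name__ == "__main__":
    print(count_blocked([0, 1, 2, 3], [10, 10, 10, 10], capacity=2))
```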
Queue Discipline (SD)
Queue discipline refers to the logical ordering of customers in a queue and determines which
customer will be chosen for service when a server becomes free. Common queue disciplines
include first-in-first-out (FIFO), last-in-first-out (LIFO), service in random order (SIRO),
shortest processing time first (SPT), and service according to priority (PR). In a manufacturing
system, queue disciplines are sometimes based on due dates and on expected processing time
for a given type of job.
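The common disciplines listed above differ only in how the next customer is picked from the waiting line. A sketch follows, in which the Job fields and the convention that a smaller priority number is more urgent are assumptions.

```python
import random
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    arrival: float    # time the job joined the queue
    service: float    # expected processing time
    priority: int     # assumption: smaller number means more urgent

def choose_next(waiting, discipline, rng=random):
    """Remove and return the job to serve next under the given discipline."""
    if discipline == "FIFO":
        job = min(waiting, key=lambda j: j.arrival)
    elif discipline == "LIFO":
        job = max(waiting, key=lambda j: j.arrival)
    elif discipline == "SIRO":
        job = rng.choice(waiting)
    elif discipline == "SPT":
        job = min(waiting, key=lambda j: j.service)
    elif discipline == "PR":
        # Ties in priority are broken by arrival order.
        job = min(waiting, key=lambda j: (j.priority, j.arrival))
    else:
        raise ValueError(f"unknown discipline: {discipline}")
    waiting.remove(job)
    return job

if __name__ == "__main__":
    jobs = [Job("a", 0.0, 5.0, 2), Job("b", 1.0, 2.0, 3), Job("c", 2.0, 8.0, 1)]
    for rule in ("FIFO", "LIFO", "SPT", "PR"):
        print(rule, choose_next(list(jobs), rule).name)
```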
KENDALL NOTATION
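A/S/m/N/K/SD
where A is the interarrival time distribution, S is the service time distribution, m is the
number of servers, N is the system capacity, K is the size of the calling population, and SD is
the service (queue) discipline. The letter M denotes an exponential distribution.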
For example, M/M/1/∞/∞ indicates a single-server system that has unlimited queue capacity
and an infinite population of potential arrivals. The interarrival times and service times are
exponentially distributed. When N and K are infinite, they may be dropped from the notation.
For example, M/M/1/∞/∞ is often shortened to M/M/1.
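The shortening rule can be illustrated with a small parser that fills in the dropped fields. The dictionary keys and the FIFO default for an omitted discipline are assumptions for this sketch.

```python
import math

def parse_kendall(spec):
    """Split a Kendall descriptor A/S/m/N/K/SD into named fields."""
    parts = [p.strip() for p in spec.split("/")]
    if not 3 <= len(parts) <= 6:
        raise ValueError("expected between 3 and 6 fields")
    # Fill in dropped trailing fields: N = ∞, K = ∞, SD = FIFO (assumed).
    defaults = ["∞", "∞", "FIFO"]
    parts += defaults[len(parts) - 3:]

    def to_number(text):
        return math.inf if text in ("∞", "inf") else int(text)

    return {
        "arrival": parts[0],
        "service": parts[1],
        "servers": to_number(parts[2]),
        "capacity": to_number(parts[3]),
        "population": to_number(parts[4]),
        "discipline": parts[5],
    }

if __name__ == "__main__":
    print(parse_kendall("M/M/1"))
    print(parse_kendall("M/M/1/∞/∞") == parse_kendall("M/M/1"))
```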
Performance Metrics
λ = mean arrival rate. In some systems, this can be a function of the state of the system. For
example, it can depend upon the number of jobs already in the system.
s = service time per job.
μ = mean service rate per server = 1/E[s]. The total service rate for m servers is mμ.
n = number of jobs in the system. This is also called the queue length.
Notice that this includes jobs currently receiving service as well as those waiting in the queue.
nq = number of jobs waiting to receive service. This is never greater than n, since it does not
include the jobs currently receiving service.
ns = number of jobs receiving service.
r = response time, or the time in the system. This includes both the time waiting for service and
the time receiving service.
w = waiting time, that is, the time interval between the arrival time and the instant the service
begins.
Figure 2.2 Performance Metrics
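1. Stability Condition: If the number of jobs in a system grows without bound, the system is
said to be unstable. For stability, the mean arrival rate must be less than the total mean
service rate:
λ < mμ
Here, m is the number of servers. This stability condition does not apply to finite population
and finite buffer systems. In finite population systems, the queue length is always finite;
the system can never become unstable.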
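Assuming the stability condition referred to here is the usual one (mean arrival rate λ below the total service rate mμ), a quick check can be sketched as follows. The function names are illustrative.

```python
def traffic_intensity(arrival_rate, service_rate, servers):
    """Ratio of the arrival rate to the total service rate m * mu."""
    return arrival_rate / (servers * service_rate)

def is_stable(arrival_rate, service_rate, servers):
    """True when lambda < m * mu (infinite population, infinite buffer)."""
    return arrival_rate < servers * service_rate

if __name__ == "__main__":
    # Two servers, each completing 2 jobs per unit time.
    print(is_stable(3.0, 2.0, 2), traffic_intensity(3.0, 2.0, 2))
    print(is_stable(4.0, 2.0, 2), traffic_intensity(4.0, 2.0, 2))
```

When the arrival rate equals mμ exactly, the check reports the system as not stable, because the condition is a strict inequality.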
2. Number in System versus Number in Queue: The number of jobs in the system is always
equal to the sum of the number in the queue and the number receiving service:
n = nq + ns
Notice that n, nq, and ns are random variables. In particular, this equality leads to the following
relationship among their means:
E[n]=E[nq] + E[ns]
The mean number of jobs in the system is equal to the sum of the mean number in the queue
and the mean number in service.
3. Number versus Time: If jobs are not lost due to insufficient buffers, the mean number of jobs
in a system is related to its mean response time as follows:
Mean number of jobs in system = arrival rate × mean response time
Similarly, mean number of jobs in queue = arrival rate × mean waiting time. This relationship
is known as Little's law.
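The relationship between number and time can be checked on a small sample path. The sketch below computes the time-average number of jobs in the system directly from arrival and departure events and compares it with arrival rate × mean response time. The sample data are made up, and the observation period is assumed to run from time 0 until the last departure, with the system empty at both ends.

```python
def time_average_in_system(arrivals, departures):
    """Time-average number of jobs in the system over [0, last departure]."""
    events = sorted([(t, +1) for t in arrivals] + [(t, -1) for t in departures])
    horizon = events[-1][0]
    area, n, last = 0.0, 0, 0.0
    for t, delta in events:
        area += n * (t - last)   # n jobs were present during (last, t)
        n += delta
        last = t
    return area / horizon

if __name__ == "__main__":
    arrivals = [0.0, 1.0, 2.0]       # hypothetical arrival times
    departures = [3.0, 4.0, 6.0]     # matching departure times
    horizon = max(departures)
    arrival_rate = len(arrivals) / horizon
    mean_response = sum(d - a for a, d in zip(arrivals, departures)) / len(arrivals)
    print(time_average_in_system(arrivals, departures))
    print(arrival_rate * mean_response)
```

Both printed values agree because the area under the number-in-system curve equals the sum of the individual response times.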
4. Time in System versus Time in Queue: The time spent by a job in a queueing system is equal
to the sum of the time waiting in the queue and the time receiving service:
r=w+s
Notice that r, w, and s are random variables. In particular, this equality leads to the following
relationship among their means:
E[r] = E[w] + E[s]
That is, the mean response time is equal to the sum of the mean waiting time and the mean
service time. If the service rate is independent of the number of jobs in the queue, the waiting
time and the service time are uncorrelated, so Cov(w, s) = 0 and
Var[r] = Var[w] + Var[s]
QUEUEING NETWORKS
So far, we have dealt with a single isolated queueing system. It is a natural extension now to
look at a collection of interacting queueing systems, or networks of queues, where the departures
of some queues form the arrivals of others. The analysis of a queueing network is much more
complicated because of the interactions among the various queues, so the queues have to be
examined as a whole. The state of one queue is generally dependent on the others because
of feedback loops. From the network topology point of view, queueing networks can be
categorized into two generic classes, namely, open queueing networks and closed queueing
networks.
Open Queueing Networks
In an open queueing network, customers arrive from external
sources outside the domain of interest, go through several queues or even revisit a particular
queue more than once, and finally leave the system. Under steady-state conditions, the sum of
the external arrival rates is equal to the total departure rate.
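The steady-state flow balance can be illustrated with a small sketch. It assumes customers move between queues according to fixed routing probabilities (an assumption not stated in the text above), so that the total arrival rate at each queue satisfies rate[i] = external[i] + Σ_j rate[j] × routing[j][i]. The function names and the two-queue example are hypothetical.

```python
def solve_arrival_rates(external, routing, iterations=1000):
    """Solve the flow-balance equations by fixed-point iteration.

    external[i]: arrival rate from outside the network into queue i
    routing[j][i]: probability that a job leaving queue j goes to queue i
    (any remaining probability in row j means the job leaves the network)
    """
    k = len(external)
    rates = list(external)
    for _ in range(iterations):
        rates = [external[i] + sum(rates[j] * routing[j][i] for j in range(k))
                 for i in range(k)]
    return rates

def total_departure_rate(rates, routing):
    """Rate at which jobs leave the network altogether."""
    return sum(r * (1.0 - sum(row)) for r, row in zip(rates, routing))

if __name__ == "__main__":
    external = [1.0, 0.0]
    # Queue 0 always feeds queue 1; queue 1 sends half its jobs back to queue 0.
    routing = [[0.0, 1.0], [0.5, 0.0]]
    rates = solve_arrival_rates(external, routing)
    print(rates)
    print(sum(external), total_departure_rate(rates, routing))
```

Because of the feedback loop, each queue sees a higher arrival rate than the external rate alone, yet the total external arrival rate still matches the total departure rate.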
Closed Queueing Networks
A closed queueing network is one in which customers neither
arrive at nor depart from the system. The existing customers in the network simply circulate
through the various queues and may revisit a particular queue more than once, as in the case of
open queueing networks.