Advanced Database Systems
Spring 2025
Lecture #12:
Query Evaluation: Processing Models
R&G: Chapter 14
PROCESSING MODEL
A processing model defines how the DBMS executes a query plan
Different trade-offs for different workloads
Three main approaches:
Iterator model
Vectorised (batch) model
Materialisation model
ITERATOR MODEL
Each query plan operator implements three functions:
open() – initialise the operator’s internal state
next() – return either the next result tuple or a null marker if there are no more tuples
close() – clean up all allocated resources
Each operator instance maintains an internal state
Any operator can be the input to any other (composability), since they all implement the same interface
Also called Volcano or Pipeline Model
Goetz Graefe. Volcano – An Extensible and Parallel Query Evaluation System. IEEE TKDE 1994
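The three-function interface above can be sketched in Python as follows. This is a minimal illustration, not code from the lecture: the `Operator` base class, the `ListScan` leaf operator, and the choice of `None` as the EOF marker are all assumptions made for the example.

```python
from abc import ABC, abstractmethod

EOF = None  # assumed null marker meaning "no more tuples"


class Operator(ABC):
    """Common interface implemented by every plan operator."""

    @abstractmethod
    def open(self): ...

    @abstractmethod
    def next(self): ...

    @abstractmethod
    def close(self): ...


class ListScan(Operator):
    """Hypothetical leaf operator scanning an in-memory list of tuples."""

    def __init__(self, rows):
        self.rows = rows
        self.pos = None  # internal state: index of the next tuple to return

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return EOF
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        self.pos = None


scan = ListScan([(1, "a"), (2, "b")])
scan.open()
first, second, third = scan.next(), scan.next(), scan.next()
scan.close()
```

Because every operator exposes the same three methods, a parent never needs to know whether its input is a scan, a join, or anything else.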
ITERATOR MODEL
Top-down plan processing
The whole plan is initially reset by calling open() on the root operator
The open() call is forwarded through the plan by the operators themselves
Control returns to the query processor
The root is requested to produce its next() result record
Operators forward the next() request as needed. As soon as the next result record is produced, control returns to the query processor again
Used in almost every DBMS
ITERATOR MODEL
Query processor uses the following routine to evaluate a query plan
Function eval(q)
    q.open()
    r = q.next()
    while r != EOF do
        /* deliver record r (print, ship to DB client) */
        emit(r)
        r = q.next()
    /* resource deallocation now */
    q.close()
Output control (e.g., LIMIT) works easily with this model
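To see why output control is easy here, consider this sketch of the evaluation routine with an optional limit. The `Counter` operator, the `eval_plan` name, and the `limit` parameter are hypothetical and exist only for this example. The driver simply stops calling next() once enough records have arrived, so the rest of the input is never produced.

```python
EOF = None  # assumed null marker meaning "no more tuples"


class Counter:
    """Hypothetical operator producing 0, 1, 2, ... up to stop (exclusive)."""

    def __init__(self, stop):
        self.stop = stop
        self.cur = None
        self.calls = 0  # counts next() calls, for illustration only

    def open(self):
        self.cur = 0

    def next(self):
        self.calls += 1
        if self.cur >= self.stop:
            return EOF
        value = self.cur
        self.cur += 1
        return value

    def close(self):
        self.cur = None


def eval_plan(q, limit=None):
    """Drive the root operator q, collecting at most `limit` records."""
    out = []
    q.open()
    try:
        while limit is None or len(out) < limit:
            r = q.next()
            if r is EOF:
                break
            out.append(r)  # stands in for emit(r)
    finally:
        q.close()  # resource deallocation, even on early exit
    return out


root = Counter(1_000_000)
result = eval_plan(root, limit=3)
```

With `limit=3` the root is asked for exactly three records, although the operator could have produced a million.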
EXAMPLE: SELECTION σp (ON-THE-FLY)
A streaming operator: small amount of work per tuple
Predicate p stored in internal state
open()
    child.open()
close()
    child.close()
next()
    while (r = child.next()) != EOF do
        if p(r) return r
    return EOF
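The selection pseudocode translates almost line for line into Python. In this sketch the `ListScan` helper and the `None` EOF marker are assumptions added so that the example runs on its own.

```python
EOF = None  # assumed null marker meaning "no more tuples"


class ListScan:
    """Hypothetical leaf operator over an in-memory list."""

    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return EOF
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass


class Select:
    """Streaming selection: the predicate is kept in the operator's state."""

    def __init__(self, child, pred):
        self.child = child
        self.pred = pred

    def open(self):
        self.child.open()

    def next(self):
        # Pull from the child until a tuple satisfies the predicate.
        while (r := self.child.next()) is not EOF:
            if self.pred(r):
                return r
        return EOF

    def close(self):
        self.child.close()


sel = Select(ListScan([5, 150, 42, 300]), lambda v: v > 100)
sel.open()
matches = []
while (t := sel.next()) is not EOF:
    matches.append(t)
sel.close()
```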
EXAMPLE: HEAP SCAN
Leaf of the query plan, often includes a selection predicate
open()
    heap = open heap file for this relation     // file handle
    cur_page = heap.first_page()                // first page
    cur_slot = cur_page.first_slot()            // first slot on that page
next()
    if cur_page == NULL return EOF
    current = tuple at (cur_page, cur_slot)     // tuple to be returned
    cur_slot = cur_slot.advance()               // advance slot for subsequent calls
    if cur_slot == NULL                         // advance to next page, first slot
        cur_page = cur_page.advance()
        if cur_page != NULL
            cur_slot = cur_page.first_slot()
    return current
close()
    heap.close()
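A runnable approximation of the heap scan is sketched below. It assumes the heap file can be modelled as a list of pages, each page being a list of tuples, and that `None` marks EOF. Unlike the pseudocode, the sketch also skips empty pages.

```python
EOF = None  # assumed null marker meaning "no more tuples"


class HeapScan:
    """Leaf operator; `pages` is a stand-in for a heap file on disk."""

    def __init__(self, pages):
        self.pages = pages
        self.page_no = None  # current page
        self.slot_no = None  # current slot on that page

    def open(self):
        self.page_no, self.slot_no = 0, 0
        self._skip_exhausted_pages()

    def _skip_exhausted_pages(self):
        # Advance to the next page while the slot is past the page's end.
        while (self.page_no < len(self.pages)
               and self.slot_no >= len(self.pages[self.page_no])):
            self.page_no += 1
            self.slot_no = 0

    def next(self):
        if self.page_no >= len(self.pages):
            return EOF
        current = self.pages[self.page_no][self.slot_no]
        self.slot_no += 1  # advance for subsequent calls
        self._skip_exhausted_pages()
        return current

    def close(self):
        self.page_no = self.slot_no = None


scan = HeapScan([[1, 2], [], [3]])
scan.open()
seen = [scan.next() for _ in range(4)]
scan.close()
```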
EXAMPLE: NESTED LOOPS JOIN
Volcano-style implementation of nested loops join R ⋈p S
open()
    left_child.open()
    right_child.open()
    r = left_child.next()
close()
    left_child.close()
    right_child.close()
next()
    while r != EOF do
        while (s = right_child.next()) != EOF do
            if p(r,s) return <r,s>
        /* reset inner join input */
        right_child.close()
        right_child.open()
        r = left_child.next()
    return EOF
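The key point of the nested loops join is that the current outer tuple r lives in the operator's state, so each next() call resumes the inner loop where the previous call stopped. A Python sketch follows, in which the `ListScan` helper and the `None` EOF marker are assumptions made for the example.

```python
EOF = None  # assumed null marker meaning "no more tuples"


class ListScan:
    """Hypothetical leaf operator over an in-memory list."""

    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return EOF
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass


class NestedLoopsJoin:
    def __init__(self, left_child, right_child, pred):
        self.left_child = left_child
        self.right_child = right_child
        self.pred = pred
        self.r = EOF  # current outer tuple, kept across next() calls

    def open(self):
        self.left_child.open()
        self.right_child.open()
        self.r = self.left_child.next()

    def next(self):
        while self.r is not EOF:
            while (s := self.right_child.next()) is not EOF:
                if self.pred(self.r, s):
                    return (self.r, s)
            # Inner input exhausted: reset it and advance the outer input.
            self.right_child.close()
            self.right_child.open()
            self.r = self.left_child.next()
        return EOF

    def close(self):
        self.left_child.close()
        self.right_child.close()


join = NestedLoopsJoin(ListScan([1, 2, 3]), ListScan([2, 3, 3]),
                       lambda r, s: r == s)
join.open()
pairs = []
while (t := join.next()) is not EOF:
    pairs.append(t)
join.close()
```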
EXAMPLE: SORT (2-PASS)
open()
    // first, all of pass 0, a blocking call
    child.open()
    repeatedly call child.next() and generate the sorted runs on disk, until child gives EOF
    // second, set up for pass 1 (assumes enough buffers to merge)
    open each sorted run file and load one page per run into input buffer for pass 1
next()   // pass 1 merge (assumes enough buffers to merge)
    output = min tuple across all buffers
    if min tuple was last one in its buffer
        fetch next page from that run into buffer
    return output   // (or EOF if no tuples remain)
close()
    deallocate the run files
    child.close()
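The blocking behaviour of the sort operator can be seen in the following simplified sketch. It departs from the slide in one labelled way: sorted runs are kept as in-memory lists instead of files on disk, and a heap from Python's `heapq` module stands in for the per-run input buffers. `ListScan`, `run_size`, and the `None` EOF marker are assumptions for the example.

```python
import heapq

EOF = None  # assumed null marker meaning "no more tuples"


class ListScan:
    """Hypothetical leaf operator over an in-memory list."""

    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return EOF
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass


class Sort:
    def __init__(self, child, run_size):
        self.child = child
        self.run_size = run_size  # tuples per sorted run (stand-in for memory size)
        self.runs = []
        self.heap = []

    def open(self):
        # Pass 0 (blocking): drain the child and cut its output into sorted runs.
        self.child.open()
        self.runs, buf = [], []
        while (r := self.child.next()) is not EOF:
            buf.append(r)
            if len(buf) == self.run_size:
                self.runs.append(sorted(buf))
                buf = []
        if buf:
            self.runs.append(sorted(buf))
        # Set up pass 1: the heap holds the current head of every run.
        self.heap = [(run[0], i, 0) for i, run in enumerate(self.runs)]
        heapq.heapify(self.heap)

    def next(self):
        # Pass 1: return the minimum across all runs, then refill from that run.
        if not self.heap:
            return EOF
        value, i, pos = heapq.heappop(self.heap)
        if pos + 1 < len(self.runs[i]):
            heapq.heappush(self.heap, (self.runs[i][pos + 1], i, pos + 1))
        return value

    def close(self):
        self.runs, self.heap = [], []
        self.child.close()


op = Sort(ListScan([5, 1, 4, 2, 3]), run_size=2)
op.open()
ordered = []
while (t := op.next()) is not EOF:
    ordered.append(t)
op.close()
```

All of the child's input is consumed inside open(), so the first next() call can only return after pass 0 has finished.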
ITERATOR MODEL
Query: SELECT [Link], [Link] FROM R, S WHERE [Link] = [Link] AND S.value > 100
Plan operators, root first (emit returns control to the caller):
π [Link], [Link]
    for t in child.next():
        emit(projection(t))
⋈ [Link] = [Link]
    for t1 in left_child.next():
        buildHashTable(t1)
    for t2 in right_child.next():
        if probe(t2): emit(t1 ⋈ t2)
σ value > 100
    for t in child.next():
        if evalPred(t): emit(t)
Scan of R
    for t in R:
        emit(t)
Scan of S
    for t in S:
        emit(t)
ITERATOR MODEL
Query: SELECT [Link], [Link] FROM R, S WHERE [Link] = [Link] AND S.value > 100
Plan operators, root first (numbers mark the order in which the operators are invoked):
(1) π [Link], [Link]
    for t in child.next():
        emit(projection(t))
(2) ⋈ [Link] = [Link]
    for t1 in left_child.next():
        buildHashTable(t1)
    for t2 in right_child.next():
        if probe(t2): emit(t1 ⋈ t2)
(4) σ value > 100
    for t in child.next():
        if evalPred(t): emit(t)
(3) Scan of R
    for t in R:
        emit(t)
(5) Scan of S
    for t in S:
        emit(t)
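Python generators give a compact way to reproduce this control flow, because `yield` hands one tuple to the caller and suspends the operator until the next request. The sketch below is an illustration under stated assumptions: the schemas R(id, name) and S(id, value), the join on the first column, and the projected columns are hypothetical, since the slide's column names did not survive.

```python
def scan(table):
    for t in table:
        yield t  # emit(t): control returns to the caller


def select(child, pred):
    for t in child:
        if pred(t):
            yield t


def hash_join(left_child, right_child, left_key, right_key):
    table = {}
    for t1 in left_child:  # build phase: consumes the whole left input
        table.setdefault(left_key(t1), []).append(t1)
    for t2 in right_child:  # probe phase: pipelined, one tuple at a time
        for t1 in table.get(right_key(t2), []):
            yield t1 + t2


def project(child, cols):
    for t in child:
        yield tuple(t[c] for c in cols)


R = [(1, "x"), (2, "y"), (3, "z")]            # hypothetical (id, name)
S = [(1, 50), (2, 200), (3, 300), (3, 150)]   # hypothetical (id, value)

plan = project(
    hash_join(scan(R),
              select(scan(S), lambda s: s[1] > 100),
              left_key=lambda r: r[0],
              right_key=lambda s: s[0]),
    cols=[0, 3],
)
first = next(plan)   # one next() request at the root
rest = list(plan)    # drain the remaining results
```

Nothing runs until the root is asked for its first tuple. That request travels down through the join to the scan of R, then to the selection and the scan of S.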
ITERATOR MODEL
Allows for tuple pipelining
The DBMS processes a tuple through as many operators as possible before having to retrieve the next tuple
Reduces memory requirements and response time since each chunk of input is propagated to the output immediately
Some operators will block until children emit all of their tuples
E.g., sorting, hash join, grouping and duplicate elimination over unsorted input, subqueries
The data is typically buffered (“materialised”) on disk
ITERATOR MODEL
+ Nice & simple interface
+ Allows for easy combination of operators
– next() is called for every single tuple and operator
– Virtual call via function pointer
Degrades branch prediction of modern CPUs
– Poor code locality and complex bookkeeping
Each operator keeps state to know where to resume
VECTORISATION MODEL
As in the iterator model, each operator implements a next() function
Each operator emits a batch of tuples instead of a single tuple
The operator’s internal loop processes multiple tuples at a time
The size of the batch can vary based on hardware and query properties
Ideal for OLAP queries
Greatly reduces the number of invocations per operator
Operators can use vectorised (SIMD) instructions to process batches of tuples
VECTORISATION MODEL
Query: SELECT [Link], [Link] FROM R, S WHERE [Link] = [Link] AND S.value > 100
Plan operators, root first (numbers mark the order in which the operators are invoked):
(1) π [Link], [Link]
    out = { }
    for t in child.next():
        out.add(projection(t))
        if |out| > n: emit(out)
(2) ⋈ [Link] = [Link]
    out = { }
    for t1 in left_child.next():
        buildHashTable(t1)
    for t2 in right_child.next():
        if probe(t2): out.add(t1 ⋈ t2)
        if |out| > n: emit(out)
(4) σ value > 100
    out = { }
    for t in child.next():
        if evalPred(t): out.add(t)
        if |out| > n: emit(out)
(3) Scan of R
    out = { }
    for t in R:
        out.add(t)
        if |out| > n: emit(out)
(5) Scan of S
    out = { }
    for t in S:
        out.add(t)
        if |out| > n: emit(out)
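The batching idea can be sketched with generators that yield lists of tuples instead of single tuples. This is an illustration covering only a scan and a selection, with hypothetical data and a batch size `n` chosen for the example. It differs from the pseudocode in two labelled ways: the scan flushes its final partial batch, and the selection emits each filtered batch as is and does not refill it to size n.

```python
def scan(table, n):
    """Emit the table in batches of at most n tuples."""
    out = []
    for t in table:
        out.append(t)
        if len(out) >= n:
            yield out
            out = []
    if out:
        yield out  # flush the final partial batch


def select(child, pred):
    """Filter a whole batch per request; batches may shrink."""
    for batch in child:
        out = [t for t in batch if pred(t)]  # tight loop over the batch
        if out:
            yield out


values = [5, 150, 42, 300, 7, 120]  # hypothetical input column
scanned = list(scan(values, n=2))
filtered = list(select(scan(values, n=2), lambda v: v > 100))
```

Six tuples cross the scan/selection boundary in three requests, not six, which is where the reduction in per-tuple call overhead comes from.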
MATERIALISATION MODEL
Each operator processes its input all at once and then emits its output
The operator “materialises” its output as a single result
Bottom-up plan processing
Data is not pulled by operators but pushed towards them
Leads to better code and data locality
Better for OLTP workloads
OLTP queries typically only access a small number of tuples at a time
Not good for OLAP queries with large intermediate results
MATERIALISATION MODEL
Query: SELECT [Link], [Link] FROM R, S WHERE [Link] = [Link] AND S.value > 100
Plan operators, root first (numbers mark the order in which the operators run):
(5) π [Link], [Link]
    out = { }
    for t in [Link]():
        out.add(projection(t))
(4) ⋈ [Link] = [Link]
    out = { }
    for t1 in [Link]():
        buildHashTable(t1)
    for t2 in [Link]():
        if probe(t2): out.add(t1 ⋈ t2)
(3) σ value > 100
    out = { }
    for t in [Link]():
        if evalPred(t): out.add(t)
(1) Scan of R
    out = { }
    for t in R:
        out.add(t)
(2) Scan of S
    out = { }
    for t in S:
        out.add(t)
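In the materialisation model each operator can be viewed as a plain function from complete inputs to a complete output. The sketch below runs the plan bottom-up in the numbered order. As before, the schemas R(id, name) and S(id, value), the join on the first column, and the projected columns are hypothetical stand-ins for the column names lost from the slide.

```python
def scan(table):
    return list(table)  # the whole relation is materialised at once


def select(rows, pred):
    return [t for t in rows if pred(t)]


def hash_join(left_rows, right_rows, left_key, right_key):
    table = {}
    for t1 in left_rows:  # build
        table.setdefault(left_key(t1), []).append(t1)
    out = []
    for t2 in right_rows:  # probe
        for t1 in table.get(right_key(t2), []):
            out.append(t1 + t2)
    return out


def project(rows, cols):
    return [tuple(t[c] for c in cols) for t in rows]


R = [(1, "x"), (2, "y"), (3, "z")]            # hypothetical (id, name)
S = [(1, 50), (2, 200), (3, 300), (3, 150)]   # hypothetical (id, value)

r_out = scan(R)                                          # (1)
s_out = scan(S)                                          # (2)
sel_out = select(s_out, lambda s: s[1] > 100)            # (3)
join_out = hash_join(r_out, sel_out,
                     lambda r: r[0], lambda s: s[0])     # (4)
result = project(join_out, cols=[0, 3])                  # (5)
```

Every intermediate list exists in full before the next operator starts, which is cheap when few tuples are touched and costly when intermediate results are large.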
PROCESSING MODELS: SUMMARY
Iterator / Volcano
Direction: Top-Down
Emits: Single Tuple
Target: General Purpose
Vectorised
Direction: Top-Down
Emits: Tuple Batch
Target: OLAP
Materialisation
Direction: Bottom-Up
Emits: Entire Tuple Set
Target: OLTP