Common Table Expressions
in MariaDB 10.2
Igor Babaev | Principal Software Engineer, MariaDB
Sergei Petrunia | Senior Software Engineer, MariaDB
{igor,sergey}@mariadb.com
M|17, April 10-11th, 2017
2
Common Table Expressions

A standard SQL feature

Two kinds of CTEs
● Recursive
● Non-recursive

Supported by Oracle, MS SQL Server, PostgreSQL, SQLite, …

Available in MariaDB 10.2 (since Sept, 2016)

MySQL

Available in MySQL-8.0.0-labs-optimizer tree (Sept, 2016)

Available in MySQL 8.0.1 (appeared on github as of Apr 9, 2017)
3
Plan

Non-recursive CTEs

Use cases

Optimizations

Recursive CTEs

Basics

Transitive closure

Paths

(Non-)linear recursion

Mutual recursion
5
CTE syntax

with engineers as (            -- WITH keyword, then the CTE name
  select *                     -- CTE body
  from employees
  where dept='Engineering'
)
select *                       -- CTE usage
from engineers
where ...

Similar to DERIVED tables

“Query-local VIEWs”
6
CTEs are like derived tables

Derived table:

select *
from
  (
    select *
    from employees
    where
      dept='Engineering'
  ) as engineers
where
  ...

CTE:

with engineers as (
  select *
  from employees
  where dept='Engineering'
)
select *
from engineers
where
  ...
7
with engineers as (
select * from employees
where dept in ('Development','Support')
),
eu_engineers as (
select * from engineers where country IN ('NL',...)
)
select
...
from
eu_engineers;
Use case #1: CTEs refer to CTEs

More readable than nested FROM(SELECT …)
8
with engineers as (
select * from employees
where dept in ('Development','Support')
)
select *
from
engineers E1
where not exists (select 1
from engineers E2
where E2.country=E1.country
and E2.name <> E1.name);
Use case #2: Multiple uses of CTE

Anti-self-join
9
Use case #2: example 2

Year-over-year comparisons

with sales_product_year as (
  select
    product,
    year(ship_date) as year,
    sum(price) as total_amt
  from
    item_sales
  group by
    product, year
)
select *
from
  sales_product_year CUR,
  sales_product_year PREV
where
  CUR.product=PREV.product and
  CUR.year=PREV.year + 1 and
  CUR.total_amt > PREV.total_amt
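A minimal, self-contained sketch of the year-over-year comparison. The table layout and the four sample rows are assumptions made up for illustration (only the table and column names come from the slide); run it in a scratch schema, since `create or replace table` overwrites an existing table of that name.

```sql
-- Hypothetical sample data, not from the talk.
create or replace table item_sales (
  product   varchar(32),
  ship_date date,
  price     decimal(10,2)
);

insert into item_sales values
  ('widget', '2015-03-01', 100.00),
  ('widget', '2016-04-01', 150.00),
  ('gadget', '2015-05-01', 200.00),
  ('gadget', '2016-06-01', 120.00);

-- The CTE is defined once and referenced twice (CUR and PREV).
with sales_product_year as (
  select
    product,
    year(ship_date) as year,
    sum(price) as total_amt
  from item_sales
  group by product, year
)
select
  CUR.product,
  CUR.year,
  PREV.total_amt as prev_amt,
  CUR.total_amt  as cur_amt
from
  sales_product_year CUR,
  sales_product_year PREV
where
  CUR.product = PREV.product and
  CUR.year = PREV.year + 1 and
  CUR.total_amt > PREV.total_amt;
-- With the sample rows above, only 'widget' grew from 2015 to 2016,
-- so a single row is returned.
```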
10
Use case #2: example 3

Compare individuals against their group

with sales_product_year as (
  select
    product,
    year(ship_date) as year,
    sum(price) as total_amt
  from
    item_sales
  group by
    product, year
)
select *
from
  sales_product_year S1
where
  total_amt > (select
                 0.1*sum(total_amt)
               from
                 sales_product_year S2
               where
                 S2.year=S1.year)
11
Conclusions so far

Non-recursive CTEs are “Query-local VIEWs”

One CTE can refer to another

Better than nested FROM (SELECT …)

Can refer to a CTE from multiple places

Better than copy-pasting FROM(SELECT …)

CTE adoption

TPC-H (1999) - no CTEs

SQL:1999 introduces CTEs

TPC-DS (2011) - 38 of 100 queries use CTEs.
12
Plan

Non-recursive CTEs

Use cases

Optimizations

Recursive CTEs

Basics

Transitive closure

Paths

(Non-)linear recursion

Mutual recursion
13
with engineers as (
select * from employees
where
dept='Engineering' or dept='Support'
)
select
...
from
engineers,
other_table, ...
Base algorithm: materialize in a temporary table

Always works

Often not optimal
14
with engineers as (
select * from employees
where
dept='Development'
)
select
...
from
engineers E,
support_cases SC
where
E.name=SC.assignee and
SC.created='2016-09-30' and
E.location='Amsterdam'
select
...
from
employees E,
support_cases SC
where
E.dept='Development' and
E.name=SC.assignee and
SC.created='2016-09-30' and
E.location='Amsterdam'
Optimization #1: CTE Merging

Join optimizer can pick any plan

e.g. support → employee
15
Optimization #1: CTE Merging (2)

Requirement

CTE is just a JOIN: no GROUP BY, DISTINCT, etc.

Output

CTE is merged into parent’s join

Optimizer can pick the best query plan

This is the same as ALGORITHM=MERGE for VIEWs
16
Optimization #2: condition pushdown

Original query:

with sales_per_year as (
  select
    year(`order`.date) as year,
    sum(`order`.amount) as sales
  from
    `order`
  group by
    year
)
select *
from sales_per_year
where
  year in ('2015','2016')

After condition pushdown (conceptually):

with sales_per_year as (
  select
    year(`order`.date) as year,
    sum(`order`.amount) as sales
  from
    `order`
  where
    year(`order`.date) in ('2015','2016')
  group by
    year
)
select *
from sales_per_year
17
Condition pushdown summary

Used when merging is not possible

CTE has a GROUP BY

Makes temp. table smaller

Allows filtering out whole GROUP BY groups

Besides CTEs, works for derived tables and VIEWs

Based on Galina Shalygina’s GSOC 2016 project:

“Pushing conditions into non-mergeable views and derived tables in MariaDB”
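A small sketch of why pushdown is safe: the two queries below return the same rows, but in the second form the filter is applied before grouping, so fewer rows reach the temporary table. The table name `orders`, its columns and the sample rows are assumptions chosen for illustration (and to avoid quoting the reserved word ORDER); the second query is only a hand-written picture of what the optimizer does internally.

```sql
-- Hypothetical sample data, not from the talk. Use a scratch schema.
create or replace table orders (
  order_date date,
  amount     decimal(10,2)
);

insert into orders values
  ('2014-01-10', 10.00),
  ('2015-02-11', 20.00),
  ('2015-03-12', 30.00),
  ('2016-04-13', 40.00);

-- As the user writes it: the filter sits outside the grouped CTE.
with sales_per_year as (
  select year(order_date) as year, sum(amount) as sales
  from orders
  group by year
)
select *
from sales_per_year
where year in (2015, 2016);

-- Hand-written equivalent with the condition pushed into the CTE:
-- the 2014 row is discarded before GROUP BY runs.
with sales_per_year as (
  select year(order_date) as year, sum(amount) as sales
  from orders
  where year(order_date) in (2015, 2016)
  group by year
)
select *
from sales_per_year;
-- Both queries return the same two groups: 2015 (50.00) and 2016 (40.00).
```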
18
Optimization #3: CTE reuse

with product_sales as (
  select
    product_name,
    year(sale_date) as year,
    count(*) as count
  from
    product_sales
  group by product_name, year)
select *
from
  product_sales P1,
  product_sales P2
where
  P1.year = 2010 AND
  P2.year = 2011 AND ...

The idea

Fill the CTE once

Then use multiple times

Hard to do with condition pushdown
19
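CTE Optimizations summary

+----------------------------+-----------+--------------------+-----------+
|                            | CTE Merge | Condition pushdown | CTE reuse |
+----------------------------+-----------+--------------------+-----------+
| MariaDB 10.2               | ✔         | ✔                  | ✘         |
| MS SQL Server              | ✔         | ✔                  | ✘         |
| PostgreSQL                 | ✘         | ✘                  | ✔         |
| MySQL 8.0.0-labs-optimizer | ✔         | ✘                  | ✔*        |
+----------------------------+-----------+--------------------+-----------+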

Merge and condition pushdown are most important

MariaDB supports them, like MS SQL.

PostgreSQL’s approach is *weird*

“CTEs are optimization barriers”

MySQL’s labs tree: “try merging, otherwise reuse”
Recursive CTEs
21
Plan

Non-recursive CTEs

Use cases

Optimizations

Recursive CTEs

Basics

Transitive closure

Paths

(Non-)linear recursion

Mutual recursion
22
Recursive CTEs

SQL is poor at “recursive” data structures/algorithms

First attempt: Oracle’s CONNECT BY syntax (80’s)

Superseded by Recursive CTEs

SQL:1999, implementations in 2007-2009

[Figures: ● Trees: a wheel and its parts (bolt, cap, nut, tire, valve, rim, spokes); ● Graphs: cities (Chicago, Nashville, Atlanta, Orlando)]
23
Recursive CTE syntax

with recursive ancestors as (          -- “recursive” keyword
  select * from folks                  -- anchor part
  where name = 'Alex'
  union [all]
  select f.*                           -- recursive part
  from folks as f, ancestors AS a      -- recursive use of CTE
  where
    f.id = a.father or f.id = a.mother
)
select * from ancestors;
24
Recursive CTE computation

Consider a dataset

+------+--------------+--------+--------+
| id   | name         | father | mother |
+------+--------------+--------+--------+
|  100 | Alex         |     20 |     30 |
|   20 | Dad          |     10 |   NULL |
|   30 | Mom          |   NULL |   NULL |
|   10 | Grandpa Bill |   NULL |   NULL |
|   98 | Sister Amy   |     20 |     30 |
+------+--------------+--------+--------+

[Figure: family tree with Alex, Sister Amy, Mom, Dad, Grandpa Bill]
25
with recursive ancestors as (
select * from folks
where name = 'Alex'
union
select f.*
from folks as f, ancestors AS a
where
f.id = a.father or f.id = a.mother
)
select * from ancestors;
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
+------+--------------+--------+--------+
Result table
Step #1: execute the anchor part
Computation
26
with recursive ancestors as (
select * from folks
where name = 'Alex'
union
select f.*
from folks as f, ancestors AS a
where
f.id = a.father or f.id = a.mother
)
select * from ancestors;
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
+------+--------------+--------+--------+
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
| 10 | Grandpa Bill | NULL | NULL |
| 98 | Sister Amy | 20 | 30 |
+------+--------------+--------+--------+
Step #2: execute the recursive part
Computation
Result table
27
with recursive ancestors as (
select * from folks
where name = 'Alex'
union
select f.*
from folks as f, ancestors AS a
where
f.id = a.father or f.id = a.mother
)
select * from ancestors;
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
+------+--------------+--------+--------+
Result table
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
| 10 | Grandpa Bill | NULL | NULL |
| 98 | Sister Amy | 20 | 30 |
+------+--------------+--------+--------+
Step #2: Add results to the result table
Computation
28
with recursive ancestors as (
select * from folks
where name = 'Alex'
union
select f.*
from folks as f, ancestors AS a
where
f.id = a.father or f.id = a.mother
)
select * from ancestors;
Result table
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
+------+--------------+--------+--------+
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
| 10 | Grandpa Bill | NULL | NULL |
| 98 | Sister Amy | 20 | 30 |
+------+--------------+--------+--------+
Computation
Step #3: Execute the recursive part again
29
with recursive ancestors as (
select * from folks
where name = 'Alex'
union
select f.*
from folks as f, ancestors AS a
where
f.id = a.father or f.id = a.mother
)
select * from ancestors;
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
| 10 | Grandpa Bill | NULL | NULL |
+------+--------------+--------+--------+
Result table
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
| 10 | Grandpa Bill | NULL | NULL |
| 98 | Sister Amy | 20 | 30 |
+------+--------------+--------+--------+
Computation
● Step #3: Add results to the result table
● Dad and Mom are already there
30
with recursive ancestors as (
select * from folks
where name = 'Alex'
union
select f.*
from folks as f, ancestors AS a
where
f.id = a.father or f.id = a.mother
)
select * from ancestors;
Result table
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
| 10 | Grandpa Bill | NULL | NULL |
| 98 | Sister Amy | 20 | 30 |
+------+--------------+--------+--------+
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
| 10 | Grandpa Bill | NULL | NULL |
+------+--------------+--------+--------+
Computation
Step #4: Execute the recursive part again
31
with recursive ancestors as (
select * from folks
where name = 'Alex'
union
select f.*
from folks as f, ancestors AS a
where
f.id = a.father or f.id = a.mother
)
select * from ancestors;
Result table
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
| 10 | Grandpa Bill | NULL | NULL |
| 98 | Sister Amy | 20 | 30 |
+------+--------------+--------+--------+
+------+--------------+--------+--------+
| id | name | father | mother |
+------+--------------+--------+--------+
| 100 | Alex | 20 | 30 |
| 20 | Dad | 10 | NULL |
| 30 | Mom | NULL | NULL |
| 10 | Grandpa Bill | NULL | NULL |
+------+--------------+--------+--------+
Computation
● Step #4: No [new] results
● The process finishes.
32
Summary so far

with recursive R as (
  select anchor_part
  union [all]
  select recursive_part
  from R, …
)
select …

1. Compute anchor_part
2. Compute recursive_part to get the new data
3. if (new data is non-empty) goto 2;
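The walkthrough on the preceding slides can be reproduced with the script below. The rows and the query are the ones shown on the slides; the column types are an assumption, and `create or replace table` overwrites any existing `folks` table, so a scratch schema is advisable.

```sql
-- Dataset from the slides; column types are assumed.
create or replace table folks (
  id     int primary key,
  name   varchar(32),
  father int,
  mother int
);

insert into folks values
  (100, 'Alex',         20,   30),
  ( 20, 'Dad',          10,   null),
  ( 30, 'Mom',          null, null),
  ( 10, 'Grandpa Bill', null, null),
  ( 98, 'Sister Amy',   20,   30);

with recursive ancestors as (
  -- Anchor part: executed once, seeds the result with Alex.
  select * from folks
  where name = 'Alex'
  union
  -- Recursive part: re-executed until it yields no new rows.
  select f.*
  from folks as f, ancestors as a
  where f.id = a.father or f.id = a.mother
)
select * from ancestors;
-- Returns Alex, Dad, Mom and Grandpa Bill.
-- Sister Amy is nobody's parent here, so she is never reached.
```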
33
Plan

Non-recursive CTEs

Use cases

Optimizations

Recursive CTEs

Basics

Transitive closure

Paths

(Non-)linear recursion

Mutual recursion
34
bus_routes
+------------+------------+
| origin     | dst        |
+------------+------------+
| New York   | Boston     |
| Boston     | New York   |
| New York   | Washington |
| Washington | Boston     |
| Washington | Raleigh    |
+------------+------------+
New York
Boston Washington
Raleigh
Transitive closure
35
with recursive bus_dst as
(
select origin as dst
from bus_routes
where origin='New York'
union
select bus_routes.dst
from bus_dst, bus_routes
where
bus_dst.dst= bus_routes.origin
)
select * from bus_dst
Transitive closure
New York
Boston Washington
Raleigh
● bus_dst is the set of places one can reach
● Start from New York (with a datatype trick)
36
with recursive bus_dst as
(
select origin as dst
from bus_routes
where origin='New York'
union
select bus_routes.dst
from bus_dst, bus_routes
where
bus_dst.dst= bus_routes.origin
)
select * from bus_dst
Transitive closure
New York
Boston Washington
Raleigh
● Put into the work table
+------------+
| dst |
+------------+
| New York |
+------------+
37
with recursive bus_dst as
(
select origin as dst
from bus_routes
where origin='New York'
union
select bus_routes.dst
from bus_dst, bus_routes
where
bus_dst.dst= bus_routes.origin
)
select * from bus_dst
+------------+
| dst |
+------------+
| New York |
+------------+
Transitive closure
New York
Boston Washington
Raleigh
● Join bus_dst with bus_routes.
● New destinations: Boston, Washington
38
with recursive bus_dst as
(
select origin as dst
from bus_routes
where origin='New York'
union
select bus_routes.dst
from bus_routes, bus_dst
where
bus_dst.dst= bus_routes.origin
)
select * from bus_dst
+------------+
| dst |
+------------+
| New York |
| Boston |
| Washington |
+------------+
Transitive closure
New York
Boston Washington
Raleigh
● Add new destinations to the temp. table
39
with recursive bus_dst as
(
select origin as dst
from bus_routes
where origin='New York'
union
select bus_routes.dst
from bus_routes, bus_dst
where
bus_dst.dst= bus_routes.origin
)
select * from bus_dst
+------------+
| dst |
+------------+
| New York |
| Boston |
| Washington |
+------------+
Transitive closure
New York
Boston Washington
Raleigh
● Join bus_dst with bus_routes
  – Raleigh, Boston, New York
40
with recursive bus_dst as
(
select origin as dst
from bus_routes
where origin='New York'
union
select bus_routes.dst
from bus_routes, bus_dst
where
bus_dst.dst= bus_routes.origin
)
select * from bus_dst
+------------+
| dst |
+------------+
| New York |
| Boston |
| Washington |
| Raleigh |
+------------+
Transitive closure
New York
Boston Washington
Raleigh
● Join bus_dst with bus_routes
  – Raleigh, Boston, New York
41
Summary so far

Can compute transitive closure

UNION prevents loops.
New York
Boston Washington
Raleigh
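A self-contained version of the transitive-closure example, using the routes from the slides. The column types are an assumption; run it in a scratch schema because `create or replace table` overwrites an existing `bus_routes`.

```sql
-- Routes from the slides; column types are assumed.
create or replace table bus_routes (
  origin varchar(50),
  dst    varchar(50)
);

insert into bus_routes values
  ('New York',   'Boston'),
  ('Boston',     'New York'),
  ('New York',   'Washington'),
  ('Washington', 'Boston'),
  ('Washington', 'Raleigh');

with recursive bus_dst as (
  -- Anchor: the starting city.
  select origin as dst
  from bus_routes
  where origin = 'New York'
  union
  -- Recursive step: every city one hop away from a city already reached.
  select bus_routes.dst
  from bus_routes, bus_dst
  where bus_dst.dst = bus_routes.origin
)
select * from bus_dst;
-- Returns New York, Boston, Washington and Raleigh.
-- The graph has a loop (New York -> Boston -> New York); UNION drops
-- rows that are already in the result, so the recursion still ends.
```

With UNION ALL instead of UNION, the New York/Boston loop would keep producing rows until some limit stops the query.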
42
Plan

Non-recursive CTEs

Use cases

Optimizations

Recursive CTEs

Basics

Transitive closure

Paths

(Non-)linear recursion

Mutual recursion
43
bus_routes
+------------+------------+
| origin     | dst        |
+------------+------------+
| New York   | Boston     |
| Boston     | New York   |
| New York   | Washington |
| Washington | Boston     |
| Washington | Raleigh    |
+------------+------------+
Computing “Paths”
New York
Boston Washington
Raleigh
● Want paths like New York → Washington → Raleigh
44
with recursive paths (cur_path, cur_dest) as
(
select origin, origin
from bus_routes
where origin='New York'
union
select
concat(paths.cur_path, ',',
bus_routes.dst),
bus_routes.dst
from paths, bus_routes
where
paths.cur_dest= bus_routes.origin and
locate(bus_routes.dst, paths.cur_path)=0
)
select * from paths
Computing “Paths”
New York
Boston Washington
Raleigh
Collect a path
Don’t construct loops
45
select
concat(paths.cur_path, ',',
bus_routes.dst),
bus_routes.dst
from paths, bus_routes
where
paths.cur_dest= bus_routes.origin and
locate(bus_routes.dst, paths.cur_path)=0
+-----------------------------+------------+
| cur_path                    | cur_dest   |
+-----------------------------+------------+
| New York                    | New York   |
| New York,Boston             | Boston     |
| New York,Washington         | Washington |
| New York,Washington,Boston  | Boston     |
| New York,Washington,Raleigh | Raleigh    |
+-----------------------------+------------+
Computing “Paths”
New York
Boston Washington
Raleigh
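A runnable sketch of the path query against the `bus_routes` data from the slides. Two things here are assumptions rather than part of the talk: the column types, and the `cast(... as char(200))` in the anchor, added as a precaution in case the server sizes the `cur_path` column from the anchor row and would otherwise cut longer paths short. `create or replace table` overwrites an existing table, so use a scratch schema.

```sql
-- Routes from the slides; column types are assumed.
create or replace table bus_routes (
  origin varchar(50),
  dst    varchar(50)
);

insert into bus_routes values
  ('New York',   'Boston'),
  ('Boston',     'New York'),
  ('New York',   'Washington'),
  ('Washington', 'Boston'),
  ('Washington', 'Raleigh');

with recursive paths (cur_path, cur_dest) as (
  -- Anchor: a path that consists of the starting city only.
  -- The cast is an added precaution (see the note above).
  select cast(origin as char(200)), origin
  from bus_routes
  where origin = 'New York'
  union
  select
    concat(paths.cur_path, ',', bus_routes.dst),   -- collect a path
    bus_routes.dst
  from paths, bus_routes
  where
    paths.cur_dest = bus_routes.origin and
    locate(bus_routes.dst, paths.cur_path) = 0     -- don't construct loops
)
select * from paths;
-- Returns five paths: New York; New York,Boston; New York,Washington;
-- New York,Washington,Boston; New York,Washington,Raleigh.
```

The `locate()` test is a substring match, so it would misfire if one city name were contained in another; it is fine for this small dataset.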
46
How recursion stops

Tree or Directed Acyclic Graph walking

Execution is guaranteed to stop

Computing transitive closure

Use UNION

Computing “Paths” over graph with loops

Put condition into WHERE to stop loops/growth

Safety measure: @@max_recursive_iterations

Like in SQL Server

MySQL-8.0: @@max_execution_time ?
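The two stopping mechanisms can be sketched together: a WHERE condition that ends the recursion by itself, and the `max_recursive_iterations` variable as a backstop. The counter query and the value 1000 are arbitrary choices for illustration.

```sql
-- Backstop: cap the number of recursive iterations for this session.
set session max_recursive_iterations = 1000;

-- UNION ALL never removes duplicates, so the WHERE condition in the
-- recursive part is what ends the recursion here.
with recursive counter (n) as (
  select 1
  union all
  select n + 1
  from counter
  where n < 10
)
select n from counter;
-- Returns the numbers 1 through 10.
```

Dropping the `where n < 10` condition would leave only the session limit to stop the query.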
47
Plan

Non-recursive CTEs

Use cases

Optimizations

Recursive CTEs

Basics

Transitive closure

Paths

(Non-)linear recursion

Mutual recursion
48
with recursive R as (
select anchor_part
union [all]
select recursive_part
from R, …
)
select …
[Non-]linear recursion
The SQL standard requires that recursion is linear:

recursive_part must refer to R only once

No self-joins

Not from subqueries

Not from inner side of an outer join
...
49
Linearity of SELECT statements
[Figure: diagram with labels R, x, a, +y, +b]
50
with recursive R as (
select anchor_part
union [all]
select recursive_part
from R, …
)
select …
Linear recursion

New data is generated by “wave-front” elements

Contents of R are always growing
51
Plan

Non-recursive CTEs

Use cases

Optimizations

Recursive CTEs

Basics

Transitive closure

Paths

(Non-)linear recursion

Mutual recursion
52
with recursive C1 as (
select …
from anchor_table
union
select …
from C2
),
C2 as (
select …
from C1
)
select ...
Mutual recursion

Multiple CTEs refer to each other

Useful for “bi-partite” graphs

MariaDB supports it

No other database does
53
Modules and objects

A module consumes objects and produces other objects

[Figure: module M1 consumes objects v3 and v9 and produces v4]

Tables:
objects (v): v3, v9, v4, ...
modules (m): m1, ...
module_arguments (m, v): (m1, v3), (m1, v9), ...
module_results (m, v): (m1, v4), ...
54
Modules and objects
[Figure: modules M1, M2, M3 and objects v3, v9, v4, v7, v1, v6, v10]

A module consumes objects and produces other objects
55
Modules and objects

What objects can be produced from objects v3, v9, v7?
[Figure: modules M1, M2, M3 and objects v3, v9, v4, v7, v1, v6, v10]
56
Query part #1: objects produced from modules
with recursive
reached_objects as
(
select v, "init"
from objects
where v in ('v3','v7','v9')
union
select module_results.v, module_results.m
from module_results, applied_modules
where module_results.m = applied_modules.m
),
57
Query part #2: modules ready to be applied
applied_modules as
(
select * from modules where 1=0
union
select modules.m
from modules left join (
module_arguments left join reached_objects
on module_arguments.v = reached_objects.v )
on reached_objects.v is null and
modules.m = module_arguments.m
where module_arguments.m is null
)
select * from reached_objects;
58
Query result
+------+------+
| v    | init |
+------+------+
| v3   | init |
| v7   | init |
| v9   | init |
| v4   | m1   |
| v1   | m2   |
| v6   | m2   |
| v10  | m3   |
+------+------+
59
Further plans

MariaDB 10.3

Non-recursive CTEs

Temporary table re-use

Recursive CTEs

10.3 feature: SELECT … EXCEPT SELECT ….

Make recursive CTEs support EXCEPT in addition to UNION.
60
Conclusions

MariaDB 10.2 has Common Table Expressions

Both Recursive and Non-recursive are supported

Non-recursive

“Query-local VIEWs”

Competitive set of query optimizations

Recursive

Useful for tree/graph-walking queries

Mutual and non-linear recursion are supported.
Thanks!
Q&A

Common Table Expressions in MariaDB 10.2

  • 1.
    Common Table Expressions inMariaDB 10.2 Igor Babaev | Principal Software Engineer, MariaDB Sergei Petrunia | Senior Software Engineer, MariaDB {igor,sergey}@mariadb.com M|17, April 10-11th , 2017
  • 2.
    2 Common Table Expressions  Astandard SQL feature  Two kinds of CTEs ● Recursive ● Non-recursive  Supported by Oracle, MS SQL Server, PostgreSQL, SQLite, …  Available in MariaDB 10.2 (since Sept, 2016)  MySQL  Available in MySQL-8.0.0-labs-optimizer tree (Sept, 2016)  Available in MySQL 8.0.1 (appeared on github as of Apr 9, 2017)
  • 3.
    3 Plan  Non-recursive CTEs  Use cases  Optimizations  RecursiveCTEs  Basics  Transitive closure  Paths  (Non-)linear recursion  Mutual recursion
  • 4.
    4 Plan  Non-recursive CTEs  Use cases  Optimizations  RecursiveCTEs  Basics  Transitive closure  Paths  (Non-)linear recursion  Mutual recursion
  • 5.
    5 CTE name CTE Body CTEUsage with engineers as ( select * from employees where dept='Engineering' ) select * from engineers where ... WITH CTE syntax  Similar to DERIVED tables  “Query-local VIEWs”
  • 6.
    6 select * from ( select * fromemployees where dept='Engineering' ) as engineers where ... with engineers as ( select * from employees where dept='Engineering' ) select * from engineers where ... CTEs are like derived tables
  • 7.
    7 with engineers as( select * from employees where dept in ('Development','Support') ), eu_engineers as ( select * from engineers where country IN ('NL',...) ) select ... from eu_engineers; Use case #1: CTEs refer to CTEs  More readable than nested FROM(SELECT …)
  • 8.
    8 with engineers as( select * from employees where dept in ('Development','Support') ), select * from engineers E1 where not exists (select 1 from engineers E2 where E2.country=E1.country and E2.name <> E1.name); Use case #2: Multiple uses of CTE  Anti-self-join
  • 9.
    9 select * from sales_product_year CUR, sales_product_yearPREV, where CUR.product=PREV.product and CUR.year=PREV.year + 1 and CUR.total_amt > PREV.total_amt with sales_product_year as ( select product, year(ship_date) as year, sum(price) as total_amt from item_sales group by product, year ) Use case #2: example 2  Year-over-year comparisons
  • 10.
    10 select * from sales_product_year S1 where total_amt> (select 0.1*sum(total_amt) from sales_product_year S2 where S2.year=S1.year) with sales_product_year as ( select product, year(ship_date) as year, sum(price) as total_amt from item_sales group by product, year ) Use case #2: example 3  Compare individuals against their group
  • 11.
    11 Conclusions so far  Non-recursiveCTEs are “Query-local VIEWs”  One CTE can refer to another  Better than nested FROM (SELECT …)  Can refer to a CTE from multiple places  Better than copy-pasting FROM(SELECT …)  CTE adoption  TPC-H (1999) - no CTEs  SQL:1999 introduces CTEs  TPC-DS (2011) - 38 of 100 queries use CTEs.
  • 12.
    12 Plan  Non-recursive CTEs  Use cases  Optimizations  RecursiveCTEs  Basics  Transitive closure  Paths  (Non-)linear recursion  Mutual recursion
  • 13.
    13 with engineers as( select * from employees where dept='Engineering' or dept='Support' ) select ... from engineers, other_table, ... Base algorithm: materialize in a temporary table  Always works  Often not optimal
  • 14.
    14 with engineers as( select * from employees where dept='Development' ) select ... from engineers E, support_cases SC where E.name=SC.assignee and SC.created='2016-09-30' and E.location='Amsterdam' select ... from employees E, support_cases SC where E.dept='Development' and E.name=SC.assignee and SC.created='2016-09-30' and E.location='Amsterdam' Optimization #1: CTE Merging  Join optimizer can pick any plan  e.g. support employee→
  • 15.
    15 Optimization #1: CTEMerging (2)  Requirement  CTE is just a JOIN : no GROUP BY, DISTINCT, etc  Output  CTE is merged into parent’s join  Optimizer can pick the best query plan  This is the same as ALGORITHM=MERGE for VIEWs
  • 16.
    16 with sales_per_year as( select year(order.date) as year sum(order.amount) as sales from order group by year ) select * from sales_per_year where year in ('2015','2016') with sales_per_year as ( select year(order.date) as year sum(order.amount) as sales from order where year in ('2015','2016') group by year ) select * from sales_per_year Optimization #2: condition pushdown
  • 17.
    17 Condition pushdown summary  Usedwhen merging is not possible  CTE has a GROUP BY  Makes temp. table smaller  Allows to filter out whole GROUP-BY groups  Besides CTEs, works for derived tables and VIEWs  Based on Galina Shalygina’s GSOC 2016 project:  “Pushing conditions into non-mergeable views and derived tables in MariaDB”
  • 18.
    18 with product_sales as( select product_name, year(sale_date), count(*) as count from product_sales group by product, year) select * from product_sales P1, product_sales P2 where P1.year = 2010 AND P2.year = 2011 AND ... Optimization #3: CTE reuse The idea  Fill the CTE once  Then use multiple times  Hard to do with condition pushdown
  • 19.
    19 CTE Merge Conditionpushdown CTE reuse MariaDB 10.2 ✔ ✔ ✘ MS SQL Server ✔ ✔ ✘ PostgreSQL ✘ ✘ ✔ MySQL 8.0.0-labs-optimizer ✔ ✘ ✔* CTE Optimizations summary  Merge and condition pushdown are most important  MariaDB supports them, like MS SQL.  PostgreSQL’s approach is *weird*  “CTEs are optimization barriers”  MySQL’s labs tree: “try merging, otherwise reuse”
  • 20.
  • 21.
    21 Plan  Non-recursive CTEs  Use cases  Optimizations  RecursiveCTEs  Basics  Transitive closure  Paths  (Non-)linear recursion  Mutual recursion
  • 22.
    22 wheel boltcapnut tire valve rimtirespokes Chicago Nashville Atlanta Orlando RecursiveCTEs  SQL is poor at “recursive” data structures/algorithms  First attempt: Oracle’s CONNECT BY syntax (80’s)  Superseded by Recursive CTEs  SQL:1999, implementations in 2007-2009 ● Trees ● Graphs
  • 23.
    23 Recursive part Anchor part Recursiveuse of CTE “recursive” Recursive CTE syntax with recursive ancestors as ( select * from folks where name = 'Alex' union [all] select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother ) select * from ancestors;
  • 24.
    24 Sister AmyAlex Mom Dad GrandpaBill +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | | 10 | Grandpa Bill | NULL | NULL | | 98 | Sister Amy | 20 | 30 | +------+--------------+--------+--------+ Recursive CTE computation  Consider a dataset
  • 25.
    25 with recursive ancestorsas ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother ) select * from ancestors; +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | +------+--------------+--------+--------+ Result table Step #1: execute the anchor part Computation
  • 26.
    26 with recursive ancestorsas ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother ) select * from ancestors; +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | +------+--------------+--------+--------+ +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | | 10 | Grandpa Bill | NULL | NULL | | 98 | Sister Amy | 20 | 30 | +------+--------------+--------+--------+ Step #2: execute the recursive part Computation Result table
  • 27.
    27 with recursive ancestorsas ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother ) select * from ancestors; +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | +------+--------------+--------+--------+ Result table +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | | 10 | Grandpa Bill | NULL | NULL | | 98 | Sister Amy | 20 | 30 | +------+--------------+--------+--------+ Step #2: Add results the result table Computation
  • 28.
    28 with recursive ancestorsas ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother ) select * from ancestors; Result table+------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | +------+--------------+--------+--------+ +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | | 10 | Grandpa Bill | NULL | NULL | | 98 | Sister Amy | 20 | 30 | +------+--------------+--------+--------+ Computation Step #3: Execute the recursive part again
  • 29.
    29 with recursive ancestorsas ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother ) select * from ancestors; +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | | 10 | Grandpa Bill | NULL | NULL | +------+--------------+--------+--------+ Result table +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | | 10 | Grandpa Bill | NULL | NULL | | 98 | Sister Amy | 20 | 30 | +------+--------------+--------+--------+ Computation ● Step #2: Add results the result table ● Dad and Mom are already there
  • 30.
    30 with recursive ancestorsas ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother ) select * from ancestors; Result table +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | | 10 | Grandpa Bill | NULL | NULL | | 98 | Sister Amy | 20 | 30 | +------+--------------+--------+--------+ +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | | 10 | Grandpa Bill | NULL | NULL | +------+--------------+--------+--------+ Computation Step #4: Execute the recursive part again
  • 31.
    31 with recursive ancestorsas ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother ) select * from ancestors; Result table +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | | 10 | Grandpa Bill | NULL | NULL | | 98 | Sister Amy | 20 | 30 | +------+--------------+--------+--------+ +------+--------------+--------+--------+ | id | name | father | mother | +------+--------------+--------+--------+ | 100 | Alex | 20 | 30 | | 20 | Dad | 10 | NULL | | 30 | Mom | NULL | NULL | | 10 | Grandpa Bill | NULL | NULL | +------+--------------+--------+--------+ Computation ● Step #4: No [new] results ● The process finishes.
  • 32.
    32 1. Compute anchor_part 2.Compute recursive_part to get the new data 3. if (new data is non-empty) goto 2; with recursive R as ( select anchor_part union [all] select recursive_part from R, … ) select … Summary so far
  • 33.
    33 Plan  Non-recursive CTEs  Use cases  Optimizations  RecursiveCTEs  Basics  Transitive closure  Paths  (Non-)linear recursion  Mutual recursion
  • 34.
    34 bus_routes +------------+------------+ | origin |dst | +------------+------------+ | New York | Boston | | Boston | New York | | New York | Washington | | Washington | Boston | | Washington | Raleigh | +------------+------------+ New York Boston Washington Raleigh Transitive closure
  • 35.
    35 Transitive closure

       with recursive bus_dst as (
         select origin as dst from bus_routes where origin='New York'
         union
         select bus_routes.dst from bus_dst, bus_routes
         where bus_dst.dst = bus_routes.origin
       )
       select * from bus_dst

       ● bus_dst is the set of places one can be
       ● Start from New York (with a datatype trick)
  • 36.
    36 Transitive closure

       with recursive bus_dst as (
         select origin as dst from bus_routes where origin='New York'
         union
         select bus_routes.dst from bus_dst, bus_routes
         where bus_dst.dst = bus_routes.origin
       )
       select * from bus_dst

       ● Put into the work table
       +------------+
       | dst        |
       +------------+
       | New York   |
       +------------+
  • 37.
    37 Transitive closure

       with recursive bus_dst as (
         select origin as dst from bus_routes where origin='New York'
         union
         select bus_routes.dst from bus_dst, bus_routes
         where bus_dst.dst = bus_routes.origin
       )
       select * from bus_dst

       +------------+
       | dst        |
       +------------+
       | New York   |
       +------------+

       ● Join bus_dst with bus_routes.
       ● New destinations: Boston, Washington
  • 38.
    38 Transitive closure

       with recursive bus_dst as (
         select origin as dst from bus_routes where origin='New York'
         union
         select bus_routes.dst from bus_routes, bus_dst
         where bus_dst.dst = bus_routes.origin
       )
       select * from bus_dst

       +------------+
       | dst        |
       +------------+
       | New York   |
       | Boston     |
       | Washington |
       +------------+

       ● Add new destinations to the temp. table
  • 39.
    39 Transitive closure

       with recursive bus_dst as (
         select origin as dst from bus_routes where origin='New York'
         union
         select bus_routes.dst from bus_routes, bus_dst
         where bus_dst.dst = bus_routes.origin
       )
       select * from bus_dst

       +------------+
       | dst        |
       +------------+
       | New York   |
       | Boston     |
       | Washington |
       +------------+

       ● Join bus_dst with bus_routes
         - Raleigh, Boston, New York
  • 40.
    40 Transitive closure

       with recursive bus_dst as (
         select origin as dst from bus_routes where origin='New York'
         union
         select bus_routes.dst from bus_routes, bus_dst
         where bus_dst.dst = bus_routes.origin
       )
       select * from bus_dst

       +------------+
       | dst        |
       +------------+
       | New York   |
       | Boston     |
       | Washington |
       | Raleigh    |
       +------------+

       ● Join bus_dst with bus_routes
         - Raleigh, Boston, New York
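The bus_dst walkthrough can be checked with a small script. As an assumption for illustration only, it runs the slide's query on SQLite through Python's sqlite3 module rather than on MariaDB; the `text` column types are also assumed.

```python
import sqlite3

# Assumption: SQLite stands in for MariaDB; column types are guessed.
conn = sqlite3.connect(":memory:")
conn.execute("create table bus_routes (origin text, dst text)")
conn.executemany(
    "insert into bus_routes values (?, ?)",
    [
        ("New York", "Boston"),
        ("Boston", "New York"),
        ("New York", "Washington"),
        ("Washington", "Boston"),
        ("Washington", "Raleigh"),
    ],
)

# The transitive-closure query from the slides.
BUS_DST_SQL = """
with recursive bus_dst as (
  select origin as dst from bus_routes where origin = 'New York'
  union
  select bus_routes.dst from bus_routes, bus_dst
  where bus_dst.dst = bus_routes.origin
)
select * from bus_dst
"""

reachable = {row[0] for row in conn.execute(BUS_DST_SQL)}
print(sorted(reachable))
```

Although Boston and New York point at each other, the query terminates: UNION discards destinations that are already in bus_dst, so the last pass adds nothing.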
  • 41.
    41 Summary so far
       ● Can compute transitive closure
       ● UNION prevents loops.

       [Route diagram: New York, Boston, Washington, Raleigh]
  • 42.
    42 Plan
       ● Non-recursive CTEs
         - Use cases
         - Optimizations
       ● Recursive CTEs
         - Basics
         - Transitive closure
         - Paths
         - (Non-)linear recursion
         - Mutual recursion
  • 43.
    43 Computing “Paths”

       bus_routes
       +------------+------------+
       | origin     | dst        |
       +------------+------------+
       | New York   | Boston     |
       | Boston     | New York   |
       | New York   | Washington |
       | Washington | Boston     |
       | Washington | Raleigh    |
       +------------+------------+

       [Route diagram: New York, Boston, Washington, Raleigh]

       ● Want paths like New York → Washington → Raleigh
  • 44.
    44 Computing “Paths”

       with recursive paths(cur_path, cur_dest) as (
         select origin, origin from bus_routes where origin='New York'
         union
         select
           concat(paths.cur_path, ',', bus_routes.dst),   -- Collect a path
           bus_routes.dst
         from paths, bus_routes
         where paths.cur_dest = bus_routes.origin
           and locate(bus_routes.dst, paths.cur_path)=0   -- Don’t construct loops
       )
       select * from paths
  • 45.
    45 Computing “Paths”

       select
         concat(paths.cur_path, ',', bus_routes.dst),
         bus_routes.dst
       from paths, bus_routes
       where paths.cur_dest = bus_routes.origin
         and locate(bus_routes.dst, paths.cur_path)=0

       +-----------------------------+------------+
       | cur_path                    | cur_dest   |
       +-----------------------------+------------+
       | New York                    | New York   |
       | New York,Boston             | Boston     |
       | New York,Washington         | Washington |
       | New York,Washington,Boston  | Boston     |
       | New York,Washington,Raleigh | Raleigh    |
       +-----------------------------+------------+
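The paths query can be tried out the same way. This sketch assumes SQLite as a stand-in for MariaDB, so two functions are swapped for their SQLite counterparts: `concat(a, ',', b)` becomes `a || ',' || b`, and `locate(needle, haystack)` becomes `instr(haystack, needle)` (note the reversed argument order). It also assumes the destination column is named `dst`, as in the bus_routes table shown earlier.

```python
import sqlite3

# Assumption: SQLite stands in for MariaDB; column types are guessed.
conn = sqlite3.connect(":memory:")
conn.execute("create table bus_routes (origin text, dst text)")
conn.executemany(
    "insert into bus_routes values (?, ?)",
    [
        ("New York", "Boston"),
        ("Boston", "New York"),
        ("New York", "Washington"),
        ("Washington", "Boston"),
        ("Washington", "Raleigh"),
    ],
)

PATHS_SQL = """
with recursive paths(cur_path, cur_dest) as (
  select origin, origin from bus_routes where origin = 'New York'
  union
  select paths.cur_path || ',' || bus_routes.dst,       -- collect a path
         bus_routes.dst
  from paths, bus_routes
  where paths.cur_dest = bus_routes.origin
    and instr(paths.cur_path, bus_routes.dst) = 0       -- don't construct loops
)
select * from paths
"""

# Row order is not guaranteed, so sort before printing.
paths = sorted(conn.execute(PATHS_SQL))
for cur_path, cur_dest in paths:
    print(f"{cur_path:30} {cur_dest}")
```

The substring test is a simple loop guard that works for this data; it would misfire if one city name were contained in another, so a production query would need a stricter check.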
  • 46.
    46 How recursion stops
       ● Tree or Directed Acyclic Graph walking
         - Execution is guaranteed to stop
       ● Computing transitive closure
         - Use UNION
       ● Computing “Paths” over graph with loops
         - Put condition into WHERE to stop loops/growth
       ● Safety measure: @@max_recursive_iterations
         - Like in SQL Server
         - MySQL-8.0: @@max_execution_time ?
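To see why UNION stops on a graph with loops while UNION ALL needs a safety limit, the plain-Python sketch below emulates both on the bus_routes data. The `max_iterations` parameter only imitates the idea behind @@max_recursive_iterations; the function names are hypothetical and the server's exact behavior at the limit is not reproduced here.

```python
ROUTES = [
    ("New York", "Boston"),
    ("Boston", "New York"),
    ("New York", "Washington"),
    ("Washington", "Boston"),
    ("Washington", "Raleigh"),
]


def next_stops(places):
    """Recursive part: every destination reachable in one hop from `places`."""
    return [dst for place in places for origin, dst in ROUTES if origin == place]


def evaluate(anchor, distinct, max_iterations):
    """Return (rows, hit_limit). distinct=True imitates UNION, False UNION ALL."""
    rows = list(anchor)
    frontier = list(anchor)
    for _ in range(max_iterations):
        produced = next_stops(frontier)
        if distinct:
            # UNION: drop rows that are already known.
            fresh = []
            for row in produced:
                if row not in rows and row not in fresh:
                    fresh.append(row)
            produced = fresh
        if not produced:
            return rows, False          # no new data: recursion stops by itself
        rows.extend(produced)
        frontier = produced
    return rows, True                   # stopped only by the iteration cap


union_rows, union_capped = evaluate(["New York"], distinct=True, max_iterations=100)
all_rows, all_capped = evaluate(["New York"], distinct=False, max_iterations=10)
print(union_rows, union_capped)
print(len(all_rows), all_capped)
```

With duplicates removed the loop ends after three passes; without that, the New York/Boston cycle keeps producing rows until the cap is reached.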
  • 47.
    47 Plan
       ● Non-recursive CTEs
         - Use cases
         - Optimizations
       ● Recursive CTEs
         - Basics
         - Transitive closure
         - Paths
         - (Non-)linear recursion
         - Mutual recursion
  • 48.
    48 [Non-]linear recursion

       with recursive R as (
         select anchor_part
         union [all]
         select recursive_part from R, …
       )
       select …

       The SQL standard requires that recursion is linear:
       ● recursive_part must refer to R only once
         - No self-joins
       ● Not from subqueries
       ● Not from inner side of an outer join
       ● ...
  • 49.
    49 Linearity of SELECT statements
       [Figure: labels R, x, a, +y, +b]
  • 50.
    50 Linear recursion

       with recursive R as (
         select anchor_part
         union [all]
         select recursive_part from R, …
       )
       select …

       ● New data is generated by “wave-front” elements
       ● Contents of R are always growing
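The “wave-front” remark can be illustrated by comparing two evaluation strategies for a linear recursive part: feeding all of R back in on every pass, or feeding in only the rows added by the previous pass. This is a sketch of the general idea under set (UNION) semantics, with hypothetical function names; it does not claim to mirror MariaDB's executor.

```python
ROUTES = [
    ("New York", "Boston"),
    ("Boston", "New York"),
    ("New York", "Washington"),
    ("Washington", "Boston"),
    ("Washington", "Raleigh"),
]


def naive(anchor):
    """Re-join the whole of R with bus_routes on every pass."""
    result = set(anchor)
    sizes = [len(result)]
    while True:
        new = {dst for origin, dst in ROUTES if origin in result} - result
        if not new:
            return result, sizes
        result |= new
        sizes.append(len(result))


def wave_front(anchor):
    """Join only the rows added by the previous pass (the wave front)."""
    result = set(anchor)
    front = set(anchor)
    sizes = [len(result)]
    while front:
        front = {dst for origin, dst in ROUTES if origin in front} - result
        result |= front
        if front:
            sizes.append(len(result))
    return result, sizes


naive_result, naive_sizes = naive(["New York"])
front_result, front_sizes = wave_front(["New York"])
print(sorted(front_result), front_sizes)
```

Because the recursive part refers to R only once, a row that is new in this pass can only come from a row that was new in the previous pass, so both strategies reach the same result and R grows monotonically along the way.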
  • 51.
    51 Plan
       ● Non-recursive CTEs
         - Use cases
         - Optimizations
       ● Recursive CTEs
         - Basics
         - Transitive closure
         - Paths
         - (Non-)linear recursion
         - Mutual recursion
  • 52.
    52 Mutual recursion

       with recursive C1 as (
         select … from anchor_table
         union
         select … from C2
       ),
       C2 as (
         select … from C1
       )
       select ...

       ● Multiple CTEs refer to each other
       ● Useful for “bi-partite” graphs
       ● MariaDB supports it
       ● No other database does
  • 53.
    53 Modules and objects
       ● A module consumes objects and produces other objects

       [Diagram: module M1 with objects v3, v9, v4]

       objects (v):              v3, v9, v4, ...
       modules (m):              m1, ...
       module_arguments (m, v):  (m1, v3), (m1, v9), ...
       module_results (m, v):    (m1, v4), ...
  • 54.
    54 Modules and objects
       ● A module consumes objects and produces other objects

       [Diagram: modules M1, M2, M3 and objects v3, v9, v4, v7, v1, v6, v10]
  • 55.
    55 Modules and objects
       ● What objects can be produced from objects v3, v9, v7

       [Diagram: modules M1, M2, M3 and objects v3, v9, v4, v7, v1, v6, v10]
  • 56.
    56 Query part #1: objects produced from modules

       with recursive reached_objects as (
         select v, "init" from objects where v in ('v3','v7','v9')
         union
         select module_results.v, module_results.m
         from module_results, applied_modules
         where module_results.m = applied_modules.m
       ),
  • 57.
    57 Query part #2: modules ready to be applied

       applied_modules as (
         select * from modules where 1=0
         union
         select modules.m
         from modules
           left join (
             module_arguments left join reached_objects
             on module_arguments.v = reached_objects.v
           ) on reached_objects.v is null and
                modules.m = module_arguments.m
         where module_arguments.m is null
       )
       select * from reached_objects;
  • 58.
    58 Query result
       +------+------+
       | v    | init |
       +------+------+
       | v3   | init |
       | v7   | init |
       | v9   | init |
       | v4   | m1   |
       | v1   | m2   |
       | v6   | m2   |
       | v10  | m3   |
       +------+------+
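The mutually recursive query alternates between two sets: objects reached so far and modules whose arguments are all reached. The plain-Python sketch below emulates that alternation. The slides only give m1's arguments (v3, v9) and the producers in the result table; the arguments assigned to m2 and m3 here are hypothetical, chosen so that the output matches the result shown. The function name is likewise an illustration, and each object records only the first module that produced it, a simplification of the query's UNION over (v, m) pairs.

```python
# m1's arguments come from the slides; m2's and m3's are ASSUMED for illustration.
MODULE_ARGUMENTS = {
    "m1": ["v3", "v9"],
    "m2": ["v4", "v7"],   # hypothetical
    "m3": ["v1", "v6"],   # hypothetical
}
# (module, produced object) pairs, taken from the query result on the slides.
MODULE_RESULTS = [("m1", "v4"), ("m2", "v1"), ("m2", "v6"), ("m3", "v10")]


def reachable_objects(initial, module_arguments, module_results):
    """Alternate between 'applied modules' and 'reached objects' until stable."""
    reached = {v: "init" for v in initial}    # anchor of reached_objects
    applied = set()                           # applied_modules starts empty
    changed = True
    while changed:
        changed = False
        # A module is ready once none of its arguments is still unreached.
        for module, args in module_arguments.items():
            if module not in applied and all(v in reached for v in args):
                applied.add(module)
                changed = True
        # Every applied module contributes its results to reached_objects.
        for module, obj in module_results:
            if module in applied and obj not in reached:
                reached[obj] = module
                changed = True
    return reached


reached = reachable_objects(["v3", "v7", "v9"], MODULE_ARGUMENTS, MODULE_RESULTS)
for obj, source in reached.items():
    print(obj, source)
```

Neither set can be computed without the other, which is why the two CTEs have to refer to each other: m2 only becomes applicable after m1 has produced v4, and m3 only after m2 has run.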
  • 59.
    59 Further plans
       ● MariaDB 10.3
       ● Non-recursive CTEs
         - Temporary table re-use
       ● Recursive CTEs
         - 10.3 feature: SELECT … EXCEPT SELECT …
         - Make recursive CTEs support EXCEPT in addition to UNION.
  • 60.
    60 Conclusions
       ● MariaDB 10.2 has Common Table Expressions
       ● Both Recursive and Non-recursive are supported
       ● Non-recursive
         - “Query-local VIEWs”
         - Competitive set of query optimizations
       ● Recursive
         - Useful for tree/graph-walking queries
         - Mutual and non-linear recursion is supported.
  • 61.