Impacts of Sharding, Partitioning, Encoding, and
Sorting on Distributed Query Performance
Nga Tran
Staff Engineer, InfluxData
July 14, 2021
● InfluxData - Staff Engineer
● Tableau/Salesforce (2 years)
○ Sr. Manager of Automatic Statistics
● Vertica RDBMS (over a decade)
○ Engineer of Query Optimizer
○ Director of Engineering (R&D)
● ELCA (4 years)
Outline
● Non-distributed vs Distributed Databases
● Splitting Data to Gain Query Performance
○ Sharding, Partitioning, Encoding, and Sorting
● Impacts of different data setups on Query Performance
Distributed Database
Non-Distributed DB: 1-node cluster
● 1 machine
● Data is loaded & then queried on that node
Distributed DB: Cluster of many nodes
● Several machines shares the work
● Data is horizontally split between nodes
● Data is queried from all nodes
Node
Non-Distributed DB
Node 1 Node 2 Node n
N nodes, each plays the same role and talks to each other
Distributed DB
Row 1
Row 2
……..
Row a
Row a+1
Row a+2
………..
Row b
Row x+1
Row x+2
………..
Row n
Distributed Database
Non-Distributed DB: 1-node cluster
● 1 machine
● Data is loaded & then queried on that node
Distributed DB: Cluster of many nodes
● Several machines shares the work
● Data is horizontally split between nodes
● Data is queried from all nodes
→ How to split data to gain query performance?
Node
Non-Distributed DB
Node 1 Node 2 Node n
N nodes, each plays the same role and talks to each other
Distributed DB
Row 1
Row 2
……..
Row a
Row a+1
Row a+2
………..
Row b
Row x+1
Row x+2
………..
Row n
Splitting Data to Gain Query Performance
● Sharding
○ Horizontally split a table into N non-overlapping shards
■ → each node will (equally) share 1/n of the workload:
● Load 1/n data to each node
● Query: join & group-by on each node share 1/n workload
● Partitioning
○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning
out, local parallelism
● Encoding
○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by
Splitting Data to Gain Query Performance
● Sharding
○ Horizontally split a table into N non-overlapping shards
■ → each node will (equally) share 1/n of the workload:
● Load 1/n data to each node
● Query: join & group-by on each node share 1/n workload
● Partitioning
○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning
out, local parallelism
● Encoding
○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by
→ Let us dig into examples
Line_Item
o_okey o_date o_pri
1 2021.05.01 2
2 2021.05.01 1
3 2021.05.02 1
4 2021.05.02 3
5 2021.05.02 1
Examples: Two tables Order & Line_Item
Order
l_okey l_name l_price l_shipdate
1 desk 100 2021.05.07
1 chair 50 2021.05.03
1 monitor 130 2021.05.03
1 mouse 10 2021.05.07
2 pot 20 2021.05.01
2 pan 25 2021.05.04
3 shirt 30 2021.05.10
4 bike 120 2021.05.04
4 helmet 30 2021.05.10
5 kayak 200 2021.05.05
5 lifevest 20 2021.05.02
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2)
o_okey o_date o_pri
1 2021.05.01 2
3 2021.05.01 1
5 2021.05.02 1
l_okey l_name l_price l_shipdate
1 desk 100 2021.05.07
1 chair 50 2021.05.03
1 monitor 130 2021.05.03
1 mouse 10 2021.05.07
3 shirt 30 2021.05.2
5 kayak 200 2021.05.07
5 lifevest 20 2021.05.02
o_okey o_date o_pri
2 2021.05.01 1
4 2021.05.02 3
l_okey l_name l_price l_shipdate
2 pot 20 2021.05.01
2 pan 25 2021.05.04
4 bike 120 2021.05.04
4 helmet 30 2021.05.10
Examples: 2-node cluster
Node 1 Node 2
Order Line_Item Line_Item
Order
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2)
Partitioned : Order: (o_date) & Line_Item: (l_shipdate)
o_okey o_date o_pri
1 2021.05.01 2
3 2021.05.01 1
5 2021.05.02 1
l_okey l_name l_price l_shipdate
3 shirt 30 2021.05.2
5 lifevest 20 2021.05.02
1 chair 50 2021.05.03
1 monitor 130 2021.05.03
1 desk 100 2021.05.07
1 mouse 10 2021.05.07
5 kayak 200 2021.05.07
o_okey o_date o_pri
2 2021.05.01 1
4 2021.05.02 3
l_okey l_name l_price l_shipdate
2 pot 20 2021.05.01
2 pan 25 2021.05.04
4 bike 120 2021.05.04
4 helmet 30 2021.05.10
Examples: 2-node cluster
Node 1 Node 2
Order Line_Item Line_Item
Order
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2)
Partitioned : Order: (o_date) & Line_Item: (l_shipdate)
Encoded & Sorted : Order: (o_okey) & Line_Item: RLE(l_okey)
o_okey o_date o_pri
1 2021.05.01 2
3 2021.05.01 1
5 2021.05.02 1
l_okey l_name l_price l_shipdate
(3,1) shirt 30 2021.05.2
(5,1) lifevest 20 2021.05.02
(1, 2) chair 50 2021.05.03
monitor 130 2021.05.03
(1,2) desk 100 2021.05.07
mouse 10 2021.05.07
(5,1) kayak 200 2021.05.07
o_okey o_date o_pri
2 2021.05.01 1
4 2021.05.02 3
l_okey l_name l_price l_shipdate
(2,1) pot 20 2021.05.01
(2,1) pan 25 2021.05.04
(4,1) bike 120 2021.05.04
(4,1) helmet 30 2021.05.10
Examples: 2-node cluster
Node 1 Node 2
Order Line_Item Line_Item
Order
Impacts of the setups on query performance
Examples: Query
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and
l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date,o_pri
ORDER BY
revenue desc, o_date;
Examples: Query - Do the shards help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and
l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date,o_pri
ORDER BY
revenue desc, o_date;
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2)
o_okey o_date o_pri
1 2021.05.01 2
3 2021.05.01 1
5 2021.05.02 1
l_okey l_name l_price l_shipdate
1 desk 100 2021.05.07
1 chair 50 2021.05.03
1 monitor 130 2021.05.03
1 mouse 10 2021.05.07
3 shirt 30 2021.05.2
5 kayak 200 2021.05.07
5 lifevest 20 2021.05.02
o_okey o_date o_pri
2 2021.05.01 1
4 2021.05.02 3
l_okey l_name l_price l_shipdate
2 pot 20 2021.05.01
2 pan 25 2021.05.04
4 bike 120 2021.05.04
4 helmet 30 2021.05.10
Back to Shard setup
Node 1 Node 2
Order Line_Item Line_Item
Order
Examples: Query - Do the shards help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date, o_pri
ORDER BY
revenue desc, o_date;
YES
● Join: l_okey = o_key
○ → all odd keys in node 1 and even keys in node 2
○ → Node 1 and node 2 join data on their local node. No need to shuffle data between nodes before
joining.
● Group By: l_okey, o_date, o_pri
○ → Similarly, same group-by keys are in the same nodes. Each node can aggregate data without the
need to reshuffle data
Examples: Query - Do the shards help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_key, o_date, o_pri
ORDER BY
revenue desc, o_date;
What if Order not sharded on o_okey & Line_item not sharded on l_okey?
Examples: Query - Do the shards help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_key, o_date, o_pri
ORDER BY
revenue desc, o_date;
What if Order not sharded on o_okey & Line_item not sharded on l_okey?
● Join: l_okey = o_key
○ → Need to reshuffle data so same join keys land on the same nodes before joining. Many ways:
■ Reshard on the fly both Order on o_okey and Line_Item on l_okey
■ Broadcast small table (o_okey) to other nodes
● Group By: l_okey, o_date, o_pri
○ → If after the join the data is shared on l_okey, nothing is needed. Otherwise, either:
■ Reshard data on l_okey to 2 nodes
■ Send everything to one node to do the final group-by
Examples: Query - Do the shards help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_key, o_date, o_pri
ORDER BY
revenue desc, o_date;
What if Order not sharded on o_okey & Line_item not sharded on l_okey?
● → Not sharded on join keys will lead to extra on-the-fly reshard or broadcast cost
● → Not already (re-)sharded on group-by keys before the group-by operator will lead to either
○ Reshard or
○ The final node has to do all the group-by work
Examples: Query - Do the partitions help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date,o_pri
ORDER BY
revenue desc, o_date;
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2)
Partitioned : Order: (o_date) & Line_Item: (l_shipdate)
o_okey o_date o_pri
1 2021.05.01 2
3 2021.05.01 1
5 2021.05.02 1
l_okey l_name l_price l_shipdate
3 shirt 30 2021.05.2
5 lifevest 20 2021.05.02
1 chair 50 2021.05.03
1 monitor 130 2021.05.03
1 desk 100 2021.05.07
1 mouse 10 2021.05.07
5 kayak 200 2021.05.07
o_okey o_date o_pri
2 2021.05.01 1
4 2021.05.02 3
l_okey l_name l_price l_shipdate
2 pot 20 2021.05.01
2 pan 25 2021.05.04
4 bike 120 2021.05.04
4 helmet 30 2021.05.10
Back to Partition Setup
Node 1 Node 2
Order Line_Item Line_Item
Order
Examples: Query - Do the partitions help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date,o_pri
ORDER BY
revenue desc, o_date;
Yes
● Filter: o_date < 2021.05.02 and l_shipdate > 2021.05.03
○ → Prune partitions not in the filter ranges
Examples: Query - Do the partitions help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date,o_pri
ORDER BY
revenue desc, o_date;
What if Order is not partitioned on o_date and Line_Item not partitioned on l_shipdate?
Examples: Query - Do the partitions help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date,o_pri
ORDER BY
revenue desc, o_date;
What if Order is not partitioned on o_date and Line_Item not partitioned on l_shipdate?
● → nothing to prune early, we have to scan all column data and apply the filter ranges
Examples: Query - Do the encoding & sorting help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date,o_pri
ORDER BY
revenue desc, o_date;
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2)
Partitioned : Order: (o_date) & Line_Item: (l_shipdate)
Encoded & Sorted : Order: (o_okey) & Line_Item: RLE(l_okey)
o_okey o_date o_pri
1 2021.05.01 2
3 2021.05.01 1
5 2021.05.02 1
l_okey l_name l_price l_shipdate
(3,1) shirt 30 2021.05.2
(5,1) lifevest 20 2021.05.02
(1, 2) chair 50 2021.05.03
monitor 130 2021.05.03
(1,2) desk 100 2021.05.07
mouse 10 2021.05.07
(5,1) kayak 200 2021.05.07
o_okey o_date o_pri
2 2021.05.01 1
4 2021.05.02 3
l_okey l_name l_price l_shipdate
(2,1) pot 20 2021.05.01
(2,1) pan 25 2021.05.04
(4,1) bike 120 2021.05.04
(4,1) helmet 30 2021.05.10
Back to Encoding and Sorting Setup
Node 1 Node 2
Order Line_Item Line_Item
Order
Examples: Query - Do the encoding & sorting help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date,o_pri
ORDER BY
revenue desc, o_date;
Yes
● Join: l_okey = o_key
○ → use fast & more memory efficient merge join because data already sorted on the join keys
○ → l_okey can be kept in RLE during join
● Group By: l_okey, o_date,o_pri
○ → Group-by key is sorted and no need doing hash groupby, simply group data as we get new batches until we reach
higher value
Examples: Query - Do the encoding & sorting help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date,o_pri
ORDER BY
revenue desc, o_date;
What if Order is not sorted on o_okey and Line_Item is not RLE on l_okey?
Examples: Query - Do the encoding & sorting help?
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
customer, orders, lineitem
WHERE
l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03
GROUP BY
l_okey, o_date,o_pri
ORDER BY
revenue desc, o_date;
What if Order is not sorted on o_okey and Line_Item is not RLE on l_okey?
● → use hash join instead (usually slower and requires more memory than merge join)
● → use hash-group-by method (similarly, usually slower and requires more memory than pipe-lined group-by)
● → If there are only a few line items per order, the RLE won’t save much space
Database Designer:
● Topic for another talk
● Startup: Ottertune https://ottertune.com
○ Database Optimization on Autopilot
How to design sharding, partitioning, encoding, and sorting
for a combination of queries?
So what we have demonstrated today?
● Sharding
○ Horizontally split a table into N non-overlapping shards
■ → each node will (equally) share 1/n of the workload:
● Load 1/n data to each node
● Query: join & group-by on each node share 1/n workload
● Partitioning
○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning
out, local parallelism
● Encoding
○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by
→ Can you think of examples for the cases we have not covered?
Thank you

Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query Performance

  • 1.
    Impacts of Sharding,Partitioning, Encoding, and Sorting on Distributed Query Performance Nga Tran Staff Engineer, InfluxData July 14, 2021
  • 2.
    ● InfluxData -Staff Engineer ● Tableau/Salesforce (2 years) ○ Sr. Manager of Automatic Statistics ● Vertica RDBMS (over a decade) ○ Engineer of Query Optimizer ○ Director of Engineering (R&D) ● ELCA (4 years)
  • 3.
    Outline ● Non-distributed vsDistributed Databases ● Splitting Data to Gain Query Performance ○ Sharding, Partitioning, Encoding, and Sorting ● Impacts of different data setups on Query Performance
  • 4.
    Distributed Database Non-Distributed DB:1-node cluster ● 1 machine ● Data is loaded & then queried on that node Distributed DB: Cluster of many nodes ● Several machines shares the work ● Data is horizontally split between nodes ● Data is queried from all nodes Node Non-Distributed DB Node 1 Node 2 Node n N nodes, each plays the same role and talks to each other Distributed DB Row 1 Row 2 …….. Row a Row a+1 Row a+2 ……….. Row b Row x+1 Row x+2 ……….. Row n
  • 5.
    Distributed Database Non-Distributed DB:1-node cluster ● 1 machine ● Data is loaded & then queried on that node Distributed DB: Cluster of many nodes ● Several machines shares the work ● Data is horizontally split between nodes ● Data is queried from all nodes → How to split data to gain query performance? Node Non-Distributed DB Node 1 Node 2 Node n N nodes, each plays the same role and talks to each other Distributed DB Row 1 Row 2 …….. Row a Row a+1 Row a+2 ……….. Row b Row x+1 Row x+2 ……….. Row n
  • 6.
    Splitting Data toGain Query Performance ● Sharding ○ Horizontally split a table into N non-overlapping shards ■ → each node will (equally) share 1/n of the workload: ● Load 1/n data to each node ● Query: join & group-by on each node share 1/n workload ● Partitioning ○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning out, local parallelism ● Encoding ○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by
  • 7.
    Splitting Data toGain Query Performance ● Sharding ○ Horizontally split a table into N non-overlapping shards ■ → each node will (equally) share 1/n of the workload: ● Load 1/n data to each node ● Query: join & group-by on each node share 1/n workload ● Partitioning ○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning out, local parallelism ● Encoding ○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by → Let us dig into examples
  • 8.
    Line_Item o_okey o_date o_pri 12021.05.01 2 2 2021.05.01 1 3 2021.05.02 1 4 2021.05.02 3 5 2021.05.02 1 Examples: Two tables Order & Line_Item Order l_okey l_name l_price l_shipdate 1 desk 100 2021.05.07 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 mouse 10 2021.05.07 2 pot 20 2021.05.01 2 pan 25 2021.05.04 3 shirt 30 2021.05.10 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 5 kayak 200 2021.05.05 5 lifevest 20 2021.05.02
  • 9.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 1 desk 100 2021.05.07 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 mouse 10 2021.05.07 3 shirt 30 2021.05.2 5 kayak 200 2021.05.07 5 lifevest 20 2021.05.02 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Examples: 2-node cluster Node 1 Node 2 Order Line_Item Line_Item Order
  • 10.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 3 shirt 30 2021.05.2 5 lifevest 20 2021.05.02 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 desk 100 2021.05.07 1 mouse 10 2021.05.07 5 kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Examples: 2-node cluster Node 1 Node 2 Order Line_Item Line_Item Order
  • 11.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) Encoded & Sorted : Order: (o_okey) & Line_Item: RLE(l_okey) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate (3,1) shirt 30 2021.05.2 (5,1) lifevest 20 2021.05.02 (1, 2) chair 50 2021.05.03 monitor 130 2021.05.03 (1,2) desk 100 2021.05.07 mouse 10 2021.05.07 (5,1) kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate (2,1) pot 20 2021.05.01 (2,1) pan 25 2021.05.04 (4,1) bike 120 2021.05.04 (4,1) helmet 30 2021.05.10 Examples: 2-node cluster Node 1 Node 2 Order Line_Item Line_Item Order
  • 12.
    Impacts of thesetups on query performance
  • 13.
    Examples: Query SELECT l_okey, sum(l_price)as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
  • 14.
    Examples: Query -Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
  • 15.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 1 desk 100 2021.05.07 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 mouse 10 2021.05.07 3 shirt 30 2021.05.2 5 kayak 200 2021.05.07 5 lifevest 20 2021.05.02 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Back to Shard setup Node 1 Node 2 Order Line_Item Line_Item Order
  • 16.
    Examples: Query -Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date, o_pri ORDER BY revenue desc, o_date; YES ● Join: l_okey = o_key ○ → all odd keys in node 1 and even keys in node 2 ○ → Node 1 and node 2 join data on their local node. No need to shuffle data between nodes before joining. ● Group By: l_okey, o_date, o_pri ○ → Similarly, same group-by keys are in the same nodes. Each node can aggregate data without the need to reshuffle data
  • 17.
    Examples: Query -Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_key, o_date, o_pri ORDER BY revenue desc, o_date; What if Order not sharded on o_okey & Line_item not sharded on l_okey?
  • 18.
    Examples: Query -Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_key, o_date, o_pri ORDER BY revenue desc, o_date; What if Order not sharded on o_okey & Line_item not sharded on l_okey? ● Join: l_okey = o_key ○ → Need to reshuffle data so same join keys land on the same nodes before joining. Many ways: ■ Reshard on the fly both Order on o_okey and Line_Item on l_okey ■ Broadcast small table (o_okey) to other nodes ● Group By: l_okey, o_date, o_pri ○ → If after the join the data is shared on l_okey, nothing is needed. Otherwise, either: ■ Reshard data on l_okey to 2 nodes ■ Send everything to one node to do the final group-by
  • 19.
    Examples: Query -Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_key, o_date, o_pri ORDER BY revenue desc, o_date; What if Order not sharded on o_okey & Line_item not sharded on l_okey? ● → Not sharded on join keys will lead to extra on-the-fly reshard or broadcast cost ● → Not already (re-)sharded on group-by keys before the group-by operator will lead to either ○ Reshard or ○ The final node has to do all the group-by work
  • 20.
    Examples: Query -Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
  • 21.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 3 shirt 30 2021.05.2 5 lifevest 20 2021.05.02 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 desk 100 2021.05.07 1 mouse 10 2021.05.07 5 kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Back to Partition Setup Node 1 Node 2 Order Line_Item Line_Item Order
  • 22.
    Examples: Query -Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; Yes ● Filter: o_date < 2021.05.02 and l_shipdate > 2021.05.03 ○ → Prune partitions not in the filter ranges
  • 23.
    Examples: Query -Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not partitioned on o_date and Line_Item not partitioned on l_shipdate?
  • 24.
    Examples: Query -Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not partitioned on o_date and Line_Item not partitioned on l_shipdate? ● → nothing to prune early, we have to scan all column data and apply the filter ranges
  • 25.
    Examples: Query -Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
  • 26.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) Encoded & Sorted : Order: (o_okey) & Line_Item: RLE(l_okey) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate (3,1) shirt 30 2021.05.2 (5,1) lifevest 20 2021.05.02 (1, 2) chair 50 2021.05.03 monitor 130 2021.05.03 (1,2) desk 100 2021.05.07 mouse 10 2021.05.07 (5,1) kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate (2,1) pot 20 2021.05.01 (2,1) pan 25 2021.05.04 (4,1) bike 120 2021.05.04 (4,1) helmet 30 2021.05.10 Back to Encoding and Sorting Setup Node 1 Node 2 Order Line_Item Line_Item Order
  • 27.
    Examples: Query -Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; Yes ● Join: l_okey = o_key ○ → use fast & more memory efficient merge join because data already sorted on the join keys ○ → l_okey can be kept in RLE during join ● Group By: l_okey, o_date,o_pri ○ → Group-by key is sorted and no need doing hash groupby, simply group data as we get new batches until we reach higher value
  • 28.
    Examples: Query -Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not sorted on o_okey and Line_Item is not RLE on l_okey?
  • 29.
    Examples: Query -Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not sorted on o_okey and Line_Item is not RLE on l_okey? ● → use hash join instead (usually slower and requires more memory than merge join) ● → use hash-group-by method (similarly, usually slower and requires more memory than pipe-lined group-by) ● → If there are only a few line items per order, the RLE won’t save much space
  • 30.
    Database Designer: ● Topicfor another talk ● Startup: Ottertune https://ottertune.com ○ Database Optimization on Autopilot How to design sharding, partitioning, encoding, and sorting for a combination of queries?
  • 31.
    So what wehave demonstrated today? ● Sharding ○ Horizontally split a table into N non-overlapping shards ■ → each node will (equally) share 1/n of the workload: ● Load 1/n data to each node ● Query: join & group-by on each node share 1/n workload ● Partitioning ○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning out, local parallelism ● Encoding ○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by → Can you think of examples for the cases we have not covered?
  • 32.