Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query Performance

Nga Tran
Staff Engineer, InfluxData
July 14, 2021

● InfluxData - Staff Engineer
● Tableau/Salesforce (2 years)
○ Sr. Manager of Automatic Statistics
● Vertica RDBMS (over a decade)
○ Engineer of Query Optimizer
○ Director of Engineering (R&D)
● ELCA (4 years)

Outline
● Non-distributed vs Distributed Databases
● Splitting Data to Gain Query Performance
○ Sharding, Partitioning, Encoding, and Sorting
● Impacts of different data setups on Query Performance

Distributed Database
Non-Distributed DB: 1-node cluster
● 1 machine
● Data is loaded & then queried on that node
Distributed DB: Cluster of many nodes
● Several machines share the work
● Data is horizontally split between nodes
● Data is queried from all nodes
[Figure: a Non-Distributed DB is a single node. A Distributed DB has N nodes (Node 1, Node 2, ..., Node n); each plays the same role and talks to the others. Rows are split horizontally: Row 1 to Row a on Node 1, Row a+1 to Row b on Node 2, ..., Row x+1 to Row n on Node n.]

→ How to split data to gain query performance?

Splitting Data to Gain Query Performance
● Sharding
○ Horizontally split a table into n non-overlapping shards, one per node
■ → each node takes (roughly) 1/n of the workload:
● Load: 1/n of the data goes to each node
● Query: each node does 1/n of the join & group-by work
● Partitioning
○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning out, and local parallelism
● Encoding
○ Each column is encoded (sorted & compressed) to further help with join, filtering, group-by, and order-by

→ Let us dig into examples

Examples: Two tables Order & Line_Item

Order
o_okey  o_date      o_pri
1       2021.05.01  2
2       2021.05.01  1
3       2021.05.02  1
4       2021.05.02  3
5       2021.05.02  1

Line_Item
l_okey  l_name    l_price  l_shipdate
1       desk      100      2021.05.07
1       chair     50       2021.05.03
1       monitor   130      2021.05.03
1       mouse     10       2021.05.07
2       pot       20       2021.05.01
2       pan       25       2021.05.04
3       shirt     30       2021.05.10
4       bike      120      2021.05.04
4       helmet    30       2021.05.10
5       kayak     200      2021.05.05
5       lifevest  20       2021.05.02

Examples: 2-node cluster
Sharded: Order on (o_okey % 2) & Line_Item on (l_okey % 2)

Node 1
Order
o_okey  o_date      o_pri
1       2021.05.01  2
3       2021.05.01  1
5       2021.05.02  1
Line_Item
l_okey  l_name    l_price  l_shipdate
1       desk      100      2021.05.07
1       chair     50       2021.05.03
1       monitor   130      2021.05.03
1       mouse     10       2021.05.07
3       shirt     30       2021.05.02
5       kayak     200      2021.05.07
5       lifevest  20       2021.05.02

Node 2
Order
o_okey  o_date      o_pri
2       2021.05.01  1
4       2021.05.02  3
Line_Item
l_okey  l_name    l_price  l_shipdate
2       pot       20       2021.05.01
2       pan       25       2021.05.04
4       bike      120      2021.05.04
4       helmet    30       2021.05.10
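The modulo sharding on this slide can be sketched in a few lines of Python. This is only an illustration: the function name shard and the tuple layout are assumptions, and the rows are the ones shown in the 2-node setup above.

```python
# Rows as shown in the 2-node setup: (o_okey, o_date, o_pri)
ORDERS = [
    (1, "2021.05.01", 2),
    (2, "2021.05.01", 1),
    (3, "2021.05.01", 1),
    (4, "2021.05.02", 3),
    (5, "2021.05.02", 1),
]
# (l_okey, l_name, l_price, l_shipdate)
LINE_ITEMS = [
    (1, "desk", 100, "2021.05.07"),
    (1, "chair", 50, "2021.05.03"),
    (1, "monitor", 130, "2021.05.03"),
    (1, "mouse", 10, "2021.05.07"),
    (2, "pot", 20, "2021.05.01"),
    (2, "pan", 25, "2021.05.04"),
    (3, "shirt", 30, "2021.05.02"),
    (4, "bike", 120, "2021.05.04"),
    (4, "helmet", 30, "2021.05.10"),
    (5, "kayak", 200, "2021.05.07"),
    (5, "lifevest", 20, "2021.05.02"),
]


def shard(rows, num_nodes, key_index=0):
    """Place each row in shard (key % num_nodes); shards never overlap."""
    shards = {n: [] for n in range(num_nodes)}
    for row in rows:
        shards[row[key_index] % num_nodes].append(row)
    return shards


order_shards = shard(ORDERS, 2)
item_shards = shard(LINE_ITEMS, 2)

# Shard 1 holds the odd keys ("Node 1" on the slide), shard 0 the even keys ("Node 2").
print([row[0] for row in order_shards[1]])  # [1, 3, 5]
print([row[0] for row in order_shards[0]])  # [2, 4]
```

Because both tables use the same function on the same key, an order and all of its line items always end up on the same node.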

Partitioned: Order on (o_date) & Line_Item on (l_shipdate)
(rows within each shard are grouped by the partition key)

Node 1
Order
o_okey  o_date      o_pri
1       2021.05.01  2
3       2021.05.01  1
5       2021.05.02  1
Line_Item
l_okey  l_name    l_price  l_shipdate
3       shirt     30       2021.05.02
5       lifevest  20       2021.05.02
1       chair     50       2021.05.03
1       monitor   130      2021.05.03
1       desk      100      2021.05.07
1       mouse     10       2021.05.07
5       kayak     200      2021.05.07

Node 2
Order
o_okey  o_date      o_pri
2       2021.05.01  1
4       2021.05.02  3
Line_Item
l_okey  l_name    l_price  l_shipdate
2       pot       20       2021.05.01
2       pan       25       2021.05.04
4       bike      120      2021.05.04
4       helmet    30       2021.05.10
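As a rough sketch of what partitioning a shard means, the snippet below groups Node 1's Line_Item rows by l_shipdate. The helper name partition_by is hypothetical; a real engine would write each partition to its own file or block.

```python
from collections import defaultdict

# Node 1's Line_Item shard: (l_okey, l_name, l_price, l_shipdate)
NODE1_ITEMS = [
    (1, "desk", 100, "2021.05.07"),
    (1, "chair", 50, "2021.05.03"),
    (1, "monitor", 130, "2021.05.03"),
    (1, "mouse", 10, "2021.05.07"),
    (3, "shirt", 30, "2021.05.02"),
    (5, "kayak", 200, "2021.05.07"),
    (5, "lifevest", 20, "2021.05.02"),
]


def partition_by(rows, key_index):
    """Group the rows of one shard by partition key, ordered by that key."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key_index]].append(row)
    return dict(sorted(parts.items()))


partitions = partition_by(NODE1_ITEMS, key_index=3)
for ship_date, rows in partitions.items():
    print(ship_date, [name for _, name, _, _ in rows])
```

Each partition now covers exactly one ship date, which is what later lets a date filter skip whole partitions.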

Encoded & Sorted: Order sorted on (o_okey) & Line_Item with RLE(l_okey)

Node 1
Order
o_okey  o_date      o_pri
1       2021.05.01  2
3       2021.05.01  1
5       2021.05.02  1
Line_Item
RLE(l_okey)  l_name    l_price  l_shipdate
(3,1)        shirt     30       2021.05.02
(5,1)        lifevest  20       2021.05.02
(1,2)        chair     50       2021.05.03
             monitor   130      2021.05.03
(1,2)        desk      100      2021.05.07
             mouse     10       2021.05.07
(5,1)        kayak     200      2021.05.07

Node 2
Order
o_okey  o_date      o_pri
2       2021.05.01  1
4       2021.05.02  3
Line_Item
RLE(l_okey)  l_name    l_price  l_shipdate
(2,1)        pot       20       2021.05.01
(2,1)        pan       25       2021.05.04
(4,1)        bike      120      2021.05.04
(4,1)        helmet    30       2021.05.10
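The (value, run length) pairs in the RLE column can be reproduced with a minimal run-length encoder. The function names are illustrative, and the input is the l_okey column of Node 1's 2021.05.07 partition, already sorted on l_okey.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal neighbours into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


column = [1, 1, 5]  # desk, mouse, kayak
encoded = rle_encode(column)
print(encoded)  # [(1, 2), (5, 1)]
```

Sorting matters here: RLE only merges adjacent equal values, so an unsorted key column would produce many short runs.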

Impacts of the setups on query performance

Examples: Query
SELECT
l_okey, sum(l_price) as revenue, o_date, o_pri
FROM
orders, lineitem
WHERE
l_okey = o_okey and o_date < '2021.05.02' and
l_shipdate > '2021.05.03'
GROUP BY
l_okey, o_date, o_pri
ORDER BY
revenue desc, o_date;
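To see what the query returns on a single node, here is a small sketch using Python's built-in sqlite3 module. It assumes the tables are named orders and lineitem (standing for Order and Line_Item), joins on o_okey, loads the rows from the two-table example, and stores dates as zero-padded strings so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (o_okey INTEGER, o_date TEXT, o_pri INTEGER);
    CREATE TABLE lineitem (l_okey INTEGER, l_name TEXT, l_price INTEGER, l_shipdate TEXT);
    """
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2021.05.01", 2), (2, "2021.05.01", 1), (3, "2021.05.02", 1),
        (4, "2021.05.02", 3), (5, "2021.05.02", 1),
    ],
)
conn.executemany(
    "INSERT INTO lineitem VALUES (?, ?, ?, ?)",
    [
        (1, "desk", 100, "2021.05.07"), (1, "chair", 50, "2021.05.03"),
        (1, "monitor", 130, "2021.05.03"), (1, "mouse", 10, "2021.05.07"),
        (2, "pot", 20, "2021.05.01"), (2, "pan", 25, "2021.05.04"),
        (3, "shirt", 30, "2021.05.10"), (4, "bike", 120, "2021.05.04"),
        (4, "helmet", 30, "2021.05.10"), (5, "kayak", 200, "2021.05.05"),
        (5, "lifevest", 20, "2021.05.02"),
    ],
)

QUERY = """
SELECT l_okey, sum(l_price) AS revenue, o_date, o_pri
FROM orders, lineitem
WHERE l_okey = o_okey AND o_date < '2021.05.02' AND l_shipdate > '2021.05.03'
GROUP BY l_okey, o_date, o_pri
ORDER BY revenue DESC, o_date
"""
result = conn.execute(QUERY).fetchall()
print(result)  # [(1, 110, '2021.05.01', 2), (2, 25, '2021.05.01', 1)]
```

Only orders 1 and 2 pass the o_date filter; order 1 keeps desk and mouse (110) and order 2 keeps pan (25).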

Examples: Query - Do the shards help?
SELECT ... FROM ...
WHERE l_okey = o_okey and o_date < '2021.05.02' and l_shipdate > '2021.05.03'
GROUP BY ... ORDER BY ...

Back to Shard setup

Yes, the shards help:
● Join: l_okey = o_okey
○ → All odd keys are on node 1 and all even keys are on node 2
○ → Node 1 and node 2 each join their own local data. No need to shuffle data between nodes before joining.
● Group By: l_okey, o_date, o_pri
○ → Similarly, rows with the same group-by key are on the same node, so each node can aggregate its data without reshuffling
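The co-located plan described above can be sketched as follows. Each node filters, joins, and aggregates only its own rows, and the coordinator just concatenates and sorts the partial results. The NODES layout and the function name are assumptions for illustration; the rows come from the sharded setup.

```python
from collections import defaultdict

# Per-node data from the sharded setup (odd keys on node 1, even keys on node 2).
NODES = {
    "node1": {
        "orders": [(1, "2021.05.01", 2), (3, "2021.05.01", 1), (5, "2021.05.02", 1)],
        "items": [
            (1, "desk", 100, "2021.05.07"), (1, "chair", 50, "2021.05.03"),
            (1, "monitor", 130, "2021.05.03"), (1, "mouse", 10, "2021.05.07"),
            (3, "shirt", 30, "2021.05.02"), (5, "kayak", 200, "2021.05.07"),
            (5, "lifevest", 20, "2021.05.02"),
        ],
    },
    "node2": {
        "orders": [(2, "2021.05.01", 1), (4, "2021.05.02", 3)],
        "items": [
            (2, "pot", 20, "2021.05.01"), (2, "pan", 25, "2021.05.04"),
            (4, "bike", 120, "2021.05.04"), (4, "helmet", 30, "2021.05.10"),
        ],
    },
}


def local_join_and_group(orders, items):
    """Filter, join and group using only the rows stored on one node."""
    qualifying = {okey: (date, pri) for okey, date, pri in orders if date < "2021.05.02"}
    revenue = defaultdict(int)
    for okey, _name, price, ship_date in items:
        if ship_date > "2021.05.03" and okey in qualifying:
            revenue[(okey, *qualifying[okey])] += price
    return dict(revenue)


partials = [local_join_and_group(n["orders"], n["items"]) for n in NODES.values()]

# No group key lives on two nodes, so the partial results are simply concatenated.
rows = [(k[0], rev, k[1], k[2]) for part in partials for k, rev in part.items()]
rows.sort(key=lambda r: (-r[1], r[2]))  # ORDER BY revenue desc, o_date
print(rows)  # [(1, 110, '2021.05.01', 2), (2, 25, '2021.05.01', 1)]
```

No row crosses the network before the final, already-aggregated results are collected.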

What if Order is not sharded on o_okey and Line_Item is not sharded on l_okey?
● Join: l_okey = o_okey
○ → Need to reshuffle data so that equal join keys land on the same node before joining. Several options:
■ Reshard both tables on the fly: Order on o_okey and Line_Item on l_okey
■ Broadcast the small table (Order) to the other nodes
● Group By: l_okey, o_date, o_pri
○ → If the data is sharded on l_okey after the join, nothing more is needed. Otherwise, either:
■ Reshard the data on l_okey across the 2 nodes
■ Send everything to one node to do the final group-by
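The two reshuffle options can be compared with a toy cost model that counts rows crossing the network. Everything here is hypothetical: the initial placement (orders 1 to 3 on node 0, orders 4 and 5 on node 1) is invented to represent a table that is not sharded on o_okey, and reshard and broadcast are illustrative helpers, not any engine's API.

```python
def reshard(node_rows, num_nodes, key_index=0):
    """Reshard on the fly: route every row to node (key % num_nodes).

    Returns the new placement and the number of rows that changed node.
    """
    placement = [[] for _ in range(num_nodes)]
    moved = 0
    for source, rows in enumerate(node_rows):
        for row in rows:
            target = row[key_index] % num_nodes
            placement[target].append(row)
            if target != source:
                moved += 1
    return placement, moved


def broadcast(node_rows):
    """Broadcast: every node ends up with a full copy of the table.

    Returns the new placement and the number of row copies sent.
    """
    everything = [row for rows in node_rows for row in rows]
    copies_sent = sum(len(everything) - len(rows) for rows in node_rows)
    return [list(everything) for _ in node_rows], copies_sent


# Hypothetical starting point: Order placed without regard to o_okey.
ORDERS_BY_NODE = [
    [(1, "2021.05.01", 2), (2, "2021.05.01", 1), (3, "2021.05.01", 1)],
    [(4, "2021.05.02", 3), (5, "2021.05.02", 1)],
]

resharded, rows_moved = reshard(ORDERS_BY_NODE, num_nodes=2)
replicated, copies_sent = broadcast(ORDERS_BY_NODE)
print(rows_moved, copies_sent)  # 3 5
```

Broadcasting only the small table avoids moving the large one at all, which is why it tends to pay off when one side of the join is much smaller than the other.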

● → Not being sharded on the join keys leads to extra on-the-fly reshard or broadcast cost
● → Not being (re-)sharded on the group-by keys before the group-by operator leads to either
○ A reshard, or
○ One final node doing all the group-by work

Examples: Query - Do the partitions help?

Back to Partition Setup

Yes, the partitions help:
● Filter: o_date < 2021.05.02 and l_shipdate > 2021.05.03
○ → Prune the partitions that fall outside the filter ranges
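Partition pruning can be illustrated on Node 1's Line_Item partitions. The sketch assumes each partition is keyed by its single l_shipdate value (as zero-padded strings), so the filter can be decided from the key alone without reading any rows. The name prune is illustrative.

```python
# Node 1's Line_Item partitions, keyed by l_shipdate: (l_okey, l_name, l_price)
PARTITIONS = {
    "2021.05.02": [(3, "shirt", 30), (5, "lifevest", 20)],
    "2021.05.03": [(1, "chair", 50), (1, "monitor", 130)],
    "2021.05.07": [(1, "desk", 100), (1, "mouse", 10), (5, "kayak", 200)],
}


def prune(partitions, min_exclusive):
    """Keep only the partitions whose key satisfies l_shipdate > min_exclusive."""
    return {key: rows for key, rows in partitions.items() if key > min_exclusive}


kept = prune(PARTITIONS, "2021.05.03")
rows_scanned = sum(len(rows) for rows in kept.values())
rows_total = sum(len(rows) for rows in PARTITIONS.values())
print(list(kept), rows_scanned, rows_total)  # ['2021.05.07'] 3 7
```

On this node the filter l_shipdate > 2021.05.03 reads 3 rows instead of 7.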


What if Order is not partitioned on o_date and Line_Item is not partitioned on l_shipdate?
● → Nothing can be pruned early; we have to scan all the column data and apply the filter ranges

Examples: Query - Do the encoding & sorting help?

Back to Encoding and Sorting Setup

Yes, encoding and sorting help:
● Join: l_okey = o_okey
○ → Use the fast and more memory-efficient merge join because the data is already sorted on the join keys
○ → l_okey can be kept in RLE during the join
● Group By: l_okey, o_date, o_pri
○ → The group-by key is sorted, so no hash group-by is needed: simply group rows as new batches arrive until a higher key value appears
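A minimal sketch of the merge join and the pipelined group-by on sorted input follows. It makes two simplifying assumptions: Node 1's Line_Item rows form a single run sorted on l_okey (ignoring partition boundaries), and the date filters are left out so that only the join and group-by mechanics are visible. The function names are illustrative.

```python
from itertools import groupby

# Node 1, both inputs sorted on the join key.
ORDERS = [(1, "2021.05.01", 2), (3, "2021.05.01", 1), (5, "2021.05.02", 1)]
ITEMS = [  # (l_okey, l_name, l_price)
    (1, "desk", 100), (1, "chair", 50), (1, "monitor", 130), (1, "mouse", 10),
    (3, "shirt", 30), (5, "kayak", 200), (5, "lifevest", 20),
]


def merge_join(orders, items):
    """Join two inputs sorted on the key; o_okey is unique in orders.

    One cursor per input is enough, so no hash table is built.
    """
    i = 0
    for l_okey, _name, price in items:
        while i < len(orders) and orders[i][0] < l_okey:
            i += 1
        if i < len(orders) and orders[i][0] == l_okey:
            yield l_okey, price, orders[i][1], orders[i][2]


def streaming_group_by(joined):
    """Input is sorted on l_okey, so a group is finished once a higher key arrives."""
    for key, rows in groupby(joined, key=lambda r: (r[0], r[2], r[3])):
        yield key[0], sum(r[1] for r in rows), key[1], key[2]


result = list(streaming_group_by(merge_join(ORDERS, ITEMS)))
print(result)
```

Both steps hold only the current key's state in memory, which is the memory advantage over their hash-based counterparts.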


What if Order is not sorted on o_okey and Line_Item is not RLE on l_okey?
● → Use a hash join instead (usually slower and requires more memory than a merge join)
● → Use a hash group-by (similarly, usually slower and requires more memory than a pipelined group-by)
● → If there are only a few line items per order, RLE won't save much space
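For contrast, here is a sketch of the hash-based fallback on unsorted input. The row order is arbitrary, the date filters are omitted for brevity, and the function names are illustrative. The point is that the whole build side and every group must stay in memory until the input is exhausted.

```python
from collections import defaultdict

# Node 1's rows in no particular key order.
ORDERS = [(5, "2021.05.02", 1), (1, "2021.05.01", 2), (3, "2021.05.01", 1)]
ITEMS = [  # (l_okey, l_name, l_price)
    (3, "shirt", 30), (5, "lifevest", 20), (1, "chair", 50), (1, "monitor", 130),
    (1, "desk", 100), (1, "mouse", 10), (5, "kayak", 200),
]


def hash_join(orders, items):
    """Build a hash table on orders, then probe it with each line item."""
    build = {order[0]: order for order in orders}  # whole build side in memory
    for l_okey, _name, price in items:
        match = build.get(l_okey)
        if match is not None:
            yield l_okey, price, match[1], match[2]


def hash_group_by(joined):
    """Every group stays in the hash table until the last input row is seen."""
    totals = defaultdict(int)
    for okey, price, o_date, o_pri in joined:
        totals[(okey, o_date, o_pri)] += price
    return [(k[0], total, k[1], k[2]) for k, total in totals.items()]


result = sorted(hash_group_by(hash_join(ORDERS, ITEMS)))
print(result)
```

The answer matches what a merge join would give on the same rows; only the memory footprint and the ability to emit groups early differ.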

How to design sharding, partitioning, encoding, and sorting for a combination of queries?
Database Designer:
● Topic for another talk
● Startup: Ottertune https://ottertune.com
○ Database Optimization on Autopilot

So what have we demonstrated today?
● Sharding
● Partitioning
● Encoding
→ Can you think of examples for the cases we have not covered?
