From: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. |
Date: | 2019-07-04 12:06:38 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
The new version of the patch is attached.
This version is even simpler than the previous one,
thanks to the recent btree design changes and all the feedback I received.
I consider it ready for review and testing.
[feature overview]
This patch implements the deduplication of btree non-pivot tuples on
leaf pages
in a manner similar to GIN index "posting lists".
Non-pivot posting tuple has following format:
t_tid | t_info | key values | posting_list[]
Where t_tid and t_info fields are used to store meta info
about tuple's posting list.
posting list is an array of ItemPointerData.
Currently, compression is applied to all indexes except system indexes,
unique
indexes, and indexes with included columns.
On insertion, compression applied not to each tuple, but to the page before
split. If the target page is full, we try to compress it.
[benchmark results]
idx ON tbl(c1);
index contains 10000000 integer values
i - number of distinct values in the index.
So i=1 means that all rows have the same key,
and i=10000000 means that all keys are different.
i / old size (MB) / new size (MB)
1 215 88
1000 215 90
100000 215 71
10000000 214 214
For more, see the attached diagram with test results.
[future work]
Many things can be improved in this feature.
Personally, I'd prefer to keep this patch as small as possible
and work on other improvements after a basic part is committed.
Though, I understand that some of these can be considered essential
for this patch to be approved.
1. Implement a split of the posting tuples on a page split.
2. Implement microvacuum of posting tuples.
3. Add a flag into pg_index, which allows enabling/disabling compression
for a particular index.
4. Implement posting list compression.
--
Anastasia Lubennikova
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
btree_compression_pg12_v1.patch | text/x-patch | 54.8 KB |
![]() |
image/png | 16.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | James Coleman | 2019-07-04 13:25:08 | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Previous Message | Alex | 2019-07-04 12:04:22 | run os command in pg_regress? |