feat(storage): improve optimize and recluster #11850
tests/sqllogictests/suites/base/09_fuse_engine/09_0008_fuse_optimize_table
I wonder about the mechanism: does the purge still need to store candidates in memory and delete all candidates in an old segment?
It will select and purge the oldest snapshots.
IMO purge does not need to scan the world; it only needs to tail the older snapshot, segment, and block files. If I understand correctly, the block files have some kind of time ordering: if a block file's creation time is earlier than that of the earliest active snapshot and the block is not included in this snapshot, then it can be safely purged.
This can be used to purge orphan blocks in vacuum table. However, we cannot guarantee the accuracy of the file's creation time, and the creation time of a block is generated earlier than the creation time of the snapshot.
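To make the rule discussed above concrete, here is a minimal sketch of the purge predicate: a block is a candidate only if it was created before the earliest active snapshot and is not referenced by that snapshot. This is an illustration, not Databend's implementation. `BlockFile`, `purgeable_blocks`, and the `safety_margin` parameter are hypothetical names; the margin is an assumed way to allow for inaccurate file creation times.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BlockFile:
    # Hypothetical stand-in for a block file listed from storage.
    path: str
    created_at: datetime


def purgeable_blocks(blocks, earliest_snapshot_time, referenced_paths,
                     safety_margin=timedelta(hours=1)):
    """Return blocks that the time-ordering rule would allow purging.

    A block qualifies only if it is older than the earliest active snapshot
    (minus an assumed safety margin, since creation times may be inaccurate)
    and is not referenced by that snapshot.
    """
    cutoff = earliest_snapshot_time - safety_margin
    return [
        b for b in blocks
        if b.created_at < cutoff and b.path not in referenced_paths
    ]
```

With a one-hour margin, a block created 30 minutes before the earliest active snapshot is kept even if it is unreferenced, which trades some leftover orphan files for protection against clock skew.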
I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/
Summary
The LIMIT option sets the maximum number of segments to recluster. Databend selects the newest segments. The default segment count limit is max_threads * 4.
Add a memory usage limit for recluster.
Sort the block metas by cluster_statistics during compact.
After optimize compact, run recluster.
Optimize recluster: serialize blocks in parallel during recluster.
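As an illustration of the LIMIT behavior described in the first item, the sketch below picks the newest segments up to a limit and falls back to max_threads * 4 when no limit is given. It is not the code in this PR. `Segment`, `created_seq`, `default_segment_limit`, and `select_segments` are hypothetical names, and ordering segments by a sequence number is an assumption made only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    # Hypothetical segment descriptor; created_seq is an assumed
    # monotonically increasing number where larger means newer.
    location: str
    created_seq: int


def default_segment_limit(max_threads):
    # Default stated in the summary: max_threads * 4.
    return max_threads * 4


def select_segments(segments, max_threads, limit=None):
    """Pick the newest segments to recluster, capped by LIMIT."""
    n = default_segment_limit(max_threads) if limit is None else limit
    newest_first = sorted(segments, key=lambda s: s.created_seq, reverse=True)
    return newest_first[:n]
```

For example, with 20 segments and max_threads set to 2, the default cap is 8, so only the 8 newest segments are selected; passing limit=3 narrows that to the 3 newest.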
Before optimize:
After:
Closes #11799