-
Notifications
You must be signed in to change notification settings - Fork 2
Comparing changes
Open a pull request
base repository: postgresql-cfbot/postgresql
base: cf/4874~1
head repository: postgresql-cfbot/postgresql
compare: cf/4874
- 10 commits
- 16 files changed
- 4 contributors
Commits on Mar 13, 2025
-
Improve buffer manager API for backend pin limits.
Previously the support functions assumed that the caller needed one pin to make progress, and could optionally use some more. Add a couple more functions for callers that want to know: * what the maximum possible number could be irrespective of currently held pins, for space planning purposes, called the "soft pin limit" * how many additional pins they could acquire right now, without the special case allowing one pin, for users that already hold pins and could make progress even if zero extra pins are available These APIs are better suited to read_stream.c, which will be improved in a follow-up patch. Also compute MaxProportionalPins up front, to avoid performing division whenever we check the balance. Reviewed-by: Andres Freund <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
Configuration menu - View commit details
-
Copy full SHA for 5031ac8 - Browse repository at this point
Copy the full SHA 5031ac8View commit details -
Respect pin limits accurately in read_stream.c.
To avoid pinning too much of the buffer pool at once, we previously used LimitAdditionalBuffers(). The coding was naive, and only considered the available buffers at stream construction time. This commit checks at the time of use with new buffer manager APIs. The result might change dynamically due to pins acquired outside this stream by the same backend. No extra CPU cycles are added to the all-buffered fast-path code, but the I/O-starting path now considers the up-to-date remaining buffer limit when making look-ahead decisions. In practice it was very difficult to exceed the limit in v17, so no back-patch, but changes due to land soon make it easy. Per code review from Andres, in the course of testing his AIO patches. Reviewed-by: Andres Freund <[email protected]> Reported-by: Andres Freund <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
Configuration menu - View commit details
-
Copy full SHA for 76915db - Browse repository at this point
Copy the full SHA 76915dbView commit details -
Improve read stream advice for large random chunks.
read_stream.c tries not to issue advice when it thinks the kernel's readahead should be active, ie when using buffered I/O and reading sequential blocks. It previously gave up a little too easily: it should issue advice until it has started running sequential pread() calls, not just when it's planning to. The simpler strategy worked for random chunks of size <= io_combine_limit and entirely sequential streams, but so not well when reading random chunks > io_combine limit. For example, a 256kB chunk of sequential data would benefit from only one fadvise(), but (assuming io_combine_limit=128kB) could suffer an I/O stall for the second half of it. Keep issuing advice until the pread() calls catch up with the start of the region we're currently issuing advice for, if ever. In practice, if there are any jumps in the lookahead window, we'll never stop issuing advice, and if the whole lookahead window becomes sequential we'll finally stop issuing advice. Discovered by Tomas Vondra's regression testing of many data clustering patterns using Melanie Plageman's streaming Bitmap Heap Scan patch, with analysis of the I/O stall-producing pattern from Andres Freund. Reviewed-by: Andres Freund <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com Discussion: https://postgr.es/m/CA%2BhUKGJ3HSWciQCz8ekP1Zn7N213RfA4nbuotQawfpq23%2Bw-5Q%40mail.gmail.com
Configuration menu - View commit details
-
Copy full SHA for 6658d78 - Browse repository at this point
Copy the full SHA 6658d78View commit details -
Look ahead more when sequential in read_stream.c.
Previously, sequential reads would cause the look-ahead distance to fall back to io_combine_limit, on the basis that kernel read-ahead should start helping. It also meant that we'd have to ramp the distance back up when a sequential region was followed by a burst of random jumps, with little hope of avoiding a stall, which is not a good trade-off and is incompatible with AIO plans (you have to look ahead if you have to start real I/O). Simplify the algorithm: now only cache hits make the look-ahead distance drop off, and cache misses still make it grow rapidly. Random vs sequential heuristics are no longer taken into consideration while making that decision. Reviewed-by: Andres Freund <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
Configuration menu - View commit details
-
Copy full SHA for 977125f - Browse repository at this point
Copy the full SHA 977125fView commit details -
Support buffer forwarding in read_stream.c.
In preparation for a following change to the buffer manager, teach read stream to keep track of buffers that were "forwarded" from one call to StartReadBuffers() to the next. Since StartReadBuffers() buffers argument will become an in/out argument, we need to initialize the buffer queue entries with InvalidBuffer. We don't want to do that up front, because we try to keep stream initialization cheap and code that uses the fast path stays in one single buffer queue element. Satisfy both goals by initializing the queue incrementally on the first cycle. Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
Configuration menu - View commit details
-
Copy full SHA for 5d04d96 - Browse repository at this point
Copy the full SHA 5d04d96View commit details -
Support buffer forwarding in StartReadBuffers().
Sometimes we have to perform a short read because we hit a cached block that ends a contiguous run of blocks requiring I/O. We don't want StartReadBuffers() to have to start more than one I/O, so we stop there. We also don't want to have to unpin the cached block (and repin it later), so previously we'd silently pretend the hit was part of the I/O, and just leave it out of the read from disk. Now, we'll "forward" it to the next call. We still write it to the buffers[] array for the caller to pass back to us later, but it's not included in *nblocks. This policy means that we no longer mix hits and misses in a single operation's results, so we avoid the requirement to call WaitReadBuffers(), which might stall, before the caller can make use of the hits. The caller will get the hit in the next call instead, and know that it doesn't have to wait. That's important for later work on out-of-order read streams that minimize I/O stalls. This also makes life easier for proposed work on true AIO, which occasionally needs to split a large I/O after pinning all the buffers, while the current coding only ever forwards a single bookending hit. This API is natural for read_stream.c: it just leaves forwarded buffers where they are in its circular queue, where the next call will pick them up and continue, minimizing pin churn. If we ever think of a good reason to disable this feature, i.e. for other users of StartReadBuffers() that don't want to deal with forwarded buffers, then we could add a flag for that. For now read_steam.c is the only user. Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
Configuration menu - View commit details
-
Copy full SHA for 0ba3a9c - Browse repository at this point
Copy the full SHA 0ba3a9cView commit details -
Separate TBM[Shared|Private]Iterator and TBMIterateResult
Remove the TBMIterateResult member from the TBMPrivateIterator and TBMSharedIterator and make tbm_[shared|private_]iterate() take a TBMIterateResult as a parameter. This allows tidbitmap API users to manage multiple TBMIterateResults per scan. This is required for bitmap heap scan to use the read stream API, with which there may be multiple I/Os in flight at once, each one with a TBMIterateResult. Reviewed-by: Tomas Vondra <[email protected]> Discussion: https://postgr.es/m/d4bb26c9-fe07-439e-ac53-c0e244387e01%40vondra.me
Configuration menu - View commit details
-
Copy full SHA for f850d35 - Browse repository at this point
Copy the full SHA f850d35View commit details -
BitmapHeapScan uses the read stream API
Make Bitmap Heap Scan use the read stream API instead of invoking ReadBuffer() for each block indicated by the bitmap. The read stream API handles prefetching, so remove all of the explicit prefetching from bitmap heap scan code. Now, heap table AM implements a read stream callback which uses the bitmap iterator to return the next required block to the read stream code. Reviewed-by: Tomas Vondra <[email protected]> Discussion: https://postgr.es/m/d4bb26c9-fe07-439e-ac53-c0e244387e01%40vondra.me
Configuration menu - View commit details
-
Copy full SHA for 1455c2e - Browse repository at this point
Copy the full SHA 1455c2eView commit details -
Remove table AM callback scan_bitmap_next_block
After pushing the bitmap iterator into table-AM specific code (as part of making bitmap heap scan use the read stream API), scan_bitmap_next_block() no longer returns the current block number. Since scan_bitmap_next_block() isn't returning any relevant information to bitmap table scan code, it makes more sense to get rid of it. Now, bitmap table scan code only calls table_scan_bitmap_next_tuple(), and the heap AM implementation of scan_bitmap_next_block() is a local helper in heapam_handler.c. Reviewed-by: Tomas Vondra <[email protected]> Discussion: https://postgr.es/m/d4bb26c9-fe07-439e-ac53-c0e244387e01%40vondra.me ci-os-only:
Configuration menu - View commit details
-
Copy full SHA for 08cba16 - Browse repository at this point
Copy the full SHA 08cba16View commit details -
[CF 4874] v36 - BitmapHeapScan table AM violation removal (and use st…
…reaming read API) This branch was automatically generated by a robot using patches from an email thread registered at: https://commitfest.postgresql.org/patch/4874 The branch will be overwritten each time a new patch version is posted to the thread, and also periodically to check for bitrot caused by changes on the master branch. Patch(es): https://www.postgresql.org/message-id/CAAKRu_b8nJua_W_C1i46a-XaMzJJnYF+8BpET0zyoREXEvrmhQ@mail.gmail.com Author(s): Melanie Plageman
Commitfest Bot committedMar 13, 2025 Configuration menu - View commit details
-
Copy full SHA for 942e8e2 - Browse repository at this point
Copy the full SHA 942e8e2View commit details
This comparison is taking too long to generate.
Unfortunately it looks like we can’t render this comparison for you right now. It might be too big, or there might be something weird with your repository.
You can try running this command locally to see the comparison on your machine:
git diff cf/4874~1...cf/4874