Skip to content
Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also or learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also . Learn more about diff comparisons here.
base repository: postgresql-cfbot/postgresql
Failed to load repositories. Confirm that selected base ref is valid, then try again.
Loading
base: cf/4874~1
Choose a base ref
...
head repository: postgresql-cfbot/postgresql
Failed to load repositories. Confirm that selected head ref is valid, then try again.
Loading
compare: cf/4874
Choose a head ref
  • 10 commits
  • 16 files changed
  • 4 contributors

Commits on Mar 13, 2025

  1. Improve buffer manager API for backend pin limits.

    Previously the support functions assumed that the caller needed one pin
    to make progress, and could optionally use some more.  Add a couple more
    functions for callers that want to know:
    
    * what the maximum possible number could be irrespective of currently
      held pins, for space planning purposes, called the "soft pin limit"
    
    * how many additional pins they could acquire right now, without the
      special case allowing one pin, for users that already hold pins and
      could make progress even if zero extra pins are available
    
    These APIs are better suited to read_stream.c, which will be improved in
    a follow-up patch.  Also compute MaxProportionalPins up front, to avoid
    performing division whenever we check the balance.
    
    Reviewed-by: Andres Freund <[email protected]>
    Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
    macdice authored and Commitfest Bot committed Mar 13, 2025
    Configuration menu
    Copy the full SHA
    5031ac8 View commit details
    Browse the repository at this point in the history
  2. Respect pin limits accurately in read_stream.c.

    To avoid pinning too much of the buffer pool at once, we previously used
    LimitAdditionalBuffers().  The coding was naive, and only considered the
    available buffers at stream construction time.
    
    This commit checks at the time of use with new buffer manager APIs.  The
    result might change dynamically due to pins acquired outside this stream
    by the same backend.  No extra CPU cycles are added to the all-buffered
    fast-path code, but the I/O-starting path now considers the up-to-date
    remaining buffer limit when making look-ahead decisions.
    
    In practice it was very difficult to exceed the limit in v17, so no
    back-patch, but changes due to land soon make it easy.
    
    Per code review from Andres, in the course of testing his AIO patches.
    
    Reviewed-by: Andres Freund <[email protected]>
    Reported-by: Andres Freund <[email protected]>
    Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
    macdice authored and Commitfest Bot committed Mar 13, 2025
    Configuration menu
    Copy the full SHA
    76915db View commit details
    Browse the repository at this point in the history
  3. Improve read stream advice for large random chunks.

    read_stream.c tries not to issue advice when it thinks the kernel's
    readahead should be active, ie when using buffered I/O and reading
    sequential blocks.  It previously gave up a little too easily: it should
    issue advice until it has started running sequential pread() calls, not
    just when it's planning to.  The simpler strategy worked for random
    chunks of size <= io_combine_limit and entirely sequential streams, but
    so not well when reading random chunks > io_combine limit.  For example,
    a 256kB chunk of sequential data would benefit from only one fadvise(),
    but (assuming io_combine_limit=128kB) could suffer an I/O stall for the
    second half of it.
    
    Keep issuing advice until the pread() calls catch up with the start of
    the region we're currently issuing advice for, if ever.  In practice, if
    there are any jumps in the lookahead window, we'll never stop issuing
    advice, and if the whole lookahead window becomes sequential we'll
    finally stop issuing advice.
    
    Discovered by Tomas Vondra's regression testing of many data clustering
    patterns using Melanie Plageman's streaming Bitmap Heap Scan patch, with
    analysis of the I/O stall-producing pattern from Andres Freund.
    
    Reviewed-by: Andres Freund <[email protected]>
    Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
    Discussion: https://postgr.es/m/CA%2BhUKGJ3HSWciQCz8ekP1Zn7N213RfA4nbuotQawfpq23%2Bw-5Q%40mail.gmail.com
    macdice authored and Commitfest Bot committed Mar 13, 2025
    Configuration menu
    Copy the full SHA
    6658d78 View commit details
    Browse the repository at this point in the history
  4. Look ahead more when sequential in read_stream.c.

    Previously, sequential reads would cause the look-ahead distance to
    fall back to io_combine_limit, on the basis that kernel read-ahead
    should start helping.  It also meant that we'd have to ramp the distance
    back up when a sequential region was followed by a burst of random
    jumps, with little hope of avoiding a stall, which is not a good
    trade-off and is incompatible with AIO plans (you have to look ahead if
    you have to start real I/O).
    
    Simplify the algorithm: now only cache hits make the look-ahead distance
    drop off, and cache misses still make it grow rapidly.  Random vs
    sequential heuristics are no longer taken into consideration while
    making that decision.
    
    Reviewed-by: Andres Freund <[email protected]>
    Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
    macdice authored and Commitfest Bot committed Mar 13, 2025
    Configuration menu
    Copy the full SHA
    977125f View commit details
    Browse the repository at this point in the history
  5. Support buffer forwarding in read_stream.c.

    In preparation for a following change to the buffer manager, teach read
    stream to keep track of buffers that were "forwarded" from one call to
    StartReadBuffers() to the next.
    
    Since StartReadBuffers() buffers argument will become an in/out
    argument, we need to initialize the buffer queue entries with
    InvalidBuffer.  We don't want to do that up front, because we try to
    keep stream initialization cheap and code that uses the fast path stays
    in one single buffer queue element.  Satisfy both goals by initializing
    the queue incrementally on the first cycle.
    
    Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
    macdice authored and Commitfest Bot committed Mar 13, 2025
    Configuration menu
    Copy the full SHA
    5d04d96 View commit details
    Browse the repository at this point in the history
  6. Support buffer forwarding in StartReadBuffers().

    Sometimes we have to perform a short read because we hit a cached block
    that ends a contiguous run of blocks requiring I/O.  We don't want
    StartReadBuffers() to have to start more than one I/O, so we stop there.
    We also don't want to have to unpin the cached block (and repin it
    later), so previously we'd silently pretend the hit was part of the I/O,
    and just leave it out of the read from disk.  Now, we'll "forward" it to
    the next call.  We still write it to the buffers[] array for the caller
    to pass back to us later, but it's not included in *nblocks.
    
    This policy means that we no longer mix hits and misses in a single
    operation's results, so we avoid the requirement to call
    WaitReadBuffers(), which might stall, before the caller can make use of
    the hits.  The caller will get the hit in the next call instead, and
    know that it doesn't have to wait.  That's important for later work on
    out-of-order read streams that minimize I/O stalls.
    
    This also makes life easier for proposed work on true AIO, which
    occasionally needs to split a large I/O after pinning all the buffers,
    while the current coding only ever forwards a single bookending hit.
    
    This API is natural for read_stream.c: it just leaves forwarded buffers
    where they are in its circular queue, where the next call will pick them
    up and continue, minimizing pin churn.
    
    If we ever think of a good reason to disable this feature, i.e. for
    other users of StartReadBuffers() that don't want to deal with forwarded
    buffers, then we could add a flag for that.  For now read_steam.c is the
    only user.
    
    Discussion: https://postgr.es/m/CA%2BhUKGK_%3D4CVmMHvsHjOVrK6t4F%3DLBpFzsrr3R%2BaJYN8kcTfWg%40mail.gmail.com
    macdice authored and Commitfest Bot committed Mar 13, 2025
    Configuration menu
    Copy the full SHA
    0ba3a9c View commit details
    Browse the repository at this point in the history
  7. Separate TBM[Shared|Private]Iterator and TBMIterateResult

    Remove the TBMIterateResult member from the TBMPrivateIterator and
    TBMSharedIterator and make tbm_[shared|private_]iterate() take a
    TBMIterateResult as a parameter.
    
    This allows tidbitmap API users to manage multiple TBMIterateResults per
    scan. This is required for bitmap heap scan to use the read stream API,
    with which there may be multiple I/Os in flight at once, each one with a
    TBMIterateResult.
    
    Reviewed-by: Tomas Vondra <[email protected]>
    Discussion: https://postgr.es/m/d4bb26c9-fe07-439e-ac53-c0e244387e01%40vondra.me
    melanieplageman authored and Commitfest Bot committed Mar 13, 2025
    Configuration menu
    Copy the full SHA
    f850d35 View commit details
    Browse the repository at this point in the history
  8. BitmapHeapScan uses the read stream API

    Make Bitmap Heap Scan use the read stream API instead of invoking
    ReadBuffer() for each block indicated by the bitmap.
    
    The read stream API handles prefetching, so remove all of the explicit
    prefetching from bitmap heap scan code.
    
    Now, heap table AM implements a read stream callback which uses the
    bitmap iterator to return the next required block to the read stream
    code.
    
    Reviewed-by: Tomas Vondra <[email protected]>
    Discussion: https://postgr.es/m/d4bb26c9-fe07-439e-ac53-c0e244387e01%40vondra.me
    melanieplageman authored and Commitfest Bot committed Mar 13, 2025
    Configuration menu
    Copy the full SHA
    1455c2e View commit details
    Browse the repository at this point in the history
  9. Remove table AM callback scan_bitmap_next_block

    After pushing the bitmap iterator into table-AM specific code (as part
    of making bitmap heap scan use the read stream API),
    scan_bitmap_next_block() no longer returns the current block number.
    Since scan_bitmap_next_block() isn't returning any relevant information
    to bitmap table scan code, it makes more sense to get rid of it.
    
    Now, bitmap table scan code only calls table_scan_bitmap_next_tuple(),
    and the heap AM implementation of scan_bitmap_next_block() is a local
    helper in heapam_handler.c.
    
    Reviewed-by: Tomas Vondra <[email protected]>
    Discussion: https://postgr.es/m/d4bb26c9-fe07-439e-ac53-c0e244387e01%40vondra.me
    
    ci-os-only:
    melanieplageman authored and Commitfest Bot committed Mar 13, 2025
    Configuration menu
    Copy the full SHA
    08cba16 View commit details
    Browse the repository at this point in the history
  10. [CF 4874] v36 - BitmapHeapScan table AM violation removal (and use st…

    …reaming read API)
    
    This branch was automatically generated by a robot using patches from an
    email thread registered at:
    
    https://commitfest.postgresql.org/patch/4874
    
    The branch will be overwritten each time a new patch version is posted to
    the thread, and also periodically to check for bitrot caused by changes
    on the master branch.
    
    Patch(es): https://www.postgresql.org/message-id/CAAKRu_b8nJua_W_C1i46a-XaMzJJnYF+8BpET0zyoREXEvrmhQ@mail.gmail.com
    Author(s): Melanie Plageman
    Commitfest Bot committed Mar 13, 2025
    Configuration menu
    Copy the full SHA
    942e8e2 View commit details
    Browse the repository at this point in the history
Loading