Comparing changes

base repository: postgresql-cfbot/postgresql
base: cf/4538~1
head repository: postgresql-cfbot/postgresql
compare: cf/4538
  • 4 commits
  • 5 files changed
  • 3 contributors

Commits on Apr 3, 2025

  1. Skip second WriteToc() for custom-format dumps without data.

    Presently, "pg_dump --format=custom" calls WriteToc() twice.  The
    second call is intended to update the data offset information,
    which allegedly makes parallel pg_restore significantly faster.
    However, if we aren't dumping any data, this step accomplishes
    nothing and can be skipped.  This is a preparatory optimization for
    follow-up commits that will move the queries for attribute
    statistics to WriteToc()/_printTocEntry() to save memory.
    
    Reviewed-by: Jeff Davis <[email protected]>
    Discussion: https://postgr.es/m/Z9c1rbzZegYQTOQE%40nathan
    nathan-bossart authored and Commitfest Bot committed Apr 3, 2025
    Commit 07b47f1
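
The control flow described in commit 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchArchive` and the `sketch_*` functions are hypothetical stand-ins, not pg_dump's actual archive structures or functions, and the real code does considerably more work in each step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the archive state; not pg_dump's real struct. */
typedef struct SketchArchive
{
    bool        dumps_data;     /* does this dump include table data? */
    int         toc_writes;     /* how many times the TOC was written */
} SketchArchive;

/* Placeholder for writing the table of contents. */
static void
sketch_write_toc(SketchArchive *ar)
{
    ar->toc_writes++;
}

/* Placeholder for writing the data blocks and learning their offsets. */
static void
sketch_write_data(SketchArchive *ar)
{
    (void) ar;
}

/*
 * Write a custom-format archive.  The second TOC pass exists only to
 * record data offsets, so it is skipped when no data is dumped.
 * Returns the number of TOC passes performed.
 */
static int
sketch_close_custom_archive(SketchArchive *ar)
{
    sketch_write_toc(ar);       /* first pass: offsets not yet known */

    if (ar->dumps_data)
    {
        sketch_write_data(ar);
        sketch_write_toc(ar);   /* second pass: fill in data offsets */
    }

    return ar->toc_writes;
}
```

With this sketch, an archive that includes data gets two TOC passes and one without data gets a single pass.
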
  2. pg_dump: Reduce memory usage of dumps with statistics.

    Right now, pg_dump stores all generated commands for statistics in
    memory.  These commands can be quite large and therefore can
    significantly increase pg_dump's memory footprint.  To fix, wait
    until we are about to write out the commands before generating
    them, and be sure to free the commands after writing.  This is
    implemented via a new defnDumper callback that works much like the
    dataDumper one but is specially designed for TOC entries.
    
    Custom dumps that include data might write the TOC twice (to update
    data offset information), which would ordinarily cause pg_dump to
    run the attribute statistics queries twice.  However, as a hack, we
    save the length of the written-out entry in the first pass, and we
    skip over it in the second.  While there is no known technical
    problem with executing the queries multiple times and rewriting the
    results, it's expensive and feels risky, so it seems prudent to
    avoid it.
    
    As an exception, we _do_ execute the queries twice for the tar
    format.  This format does a second pass through the TOC to generate
    the restore.sql file, which isn't used by pg_restore, so different
    results won't corrupt the output (it'll just be different).  We
    could alternatively save the definition in memory the first time it
    is generated, but that defeats the purpose of this change.  In any
    case, past discussion indicates that the tar format might be a
    candidate for deprecation, so it doesn't seem worth trying too much
    harder.
    
    Author: Corey Huinker <[email protected]>
    Co-authored-by: Nathan Bossart <[email protected]>
    Reviewed-by: Jeff Davis <[email protected]>
    Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
    2 people authored and Commitfest Bot committed Apr 3, 2025
    Commit 6a824d9
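
The "generate late, free early, skip on the second pass" idea from commit 2 might look something like the sketch below. Everything here is an assumption made for illustration: `SketchTocEntry`, `SketchOutput`, `sketch_stats_defn`, and `sketch_print_entry` are hypothetical names, the output is a small in-memory buffer rather than an archive file, and the generated text is a placeholder rather than a real statistics command.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct SketchTocEntry SketchTocEntry;

/* Callback that builds an entry's definition on demand (hypothetical). */
typedef char *(*SketchDefnDumper) (SketchTocEntry *te);

struct SketchTocEntry
{
    const char *tag;
    SketchDefnDumper defn_dumper;
    size_t      defn_len;           /* 0 until the entry has been written */
    int         times_generated;    /* bookkeeping for the example only */
};

/* Tiny in-memory stand-in for the output file. */
typedef struct SketchOutput
{
    char        buf[256];
    size_t      pos;
} SketchOutput;

/* Example callback: stands in for running the statistics queries. */
static char *
sketch_stats_defn(SketchTocEntry *te)
{
    const char *prefix = "-- statistics for ";
    size_t      size = strlen(prefix) + strlen(te->tag) + 2;   /* '\n' + NUL */
    char       *defn = malloc(size);

    if (defn == NULL)
        return NULL;
    snprintf(defn, size, "%s%s\n", prefix, te->tag);
    te->times_generated++;
    return defn;
}

/* Returns bytes written or skipped, or -1 on failure. */
static long
sketch_print_entry(SketchTocEntry *te, SketchOutput *out)
{
    char       *defn;
    size_t      len;

    if (te->defn_len > 0)
    {
        /* Second pass: skip over what was already written. */
        if (out->pos + te->defn_len > sizeof(out->buf))
            return -1;
        out->pos += te->defn_len;
        return (long) te->defn_len;
    }

    /* First pass: generate, write, remember the length, then free. */
    defn = te->defn_dumper(te);
    if (defn == NULL)
        return -1;
    len = strlen(defn);
    if (out->pos + len > sizeof(out->buf))
    {
        free(defn);
        return -1;
    }
    memcpy(out->buf + out->pos, defn, len);
    out->pos += len;
    te->defn_len = len;
    free(defn);
    return (long) len;
}
```

Calling `sketch_print_entry` twice on the same entry, with the output position rewound in between, invokes the callback only once; the second call just advances the position by the saved length.
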
  3. pg_dump: Retrieve attribute statistics in batches.

    Currently, pg_dump gathers attribute statistics with a query per
    relation, which can cause pg_dump to take significantly longer,
    especially when there are many tables.  This commit improves
    matters by gathering attribute statistics for 64 relations at a
    time.  Some simple testing showed this was the ideal batch size,
    but performance may vary depending on workload.  This change
    increases the memory usage of pg_dump a bit, but that isn't
    expected to be too egregious and is arguably well worth the
    trade-off.
    
    Our lookahead code for determining the next batch of relations for
    which to gather attribute statistics is simple: we walk the TOC
    sequentially looking for eligible entries.  However, the assumption
    that we will dump all such entries in this order doesn't hold up
    for dump formats that use RestoreArchive().  This is because
    RestoreArchive() does multiple passes through the TOC and
    selectively dumps certain entries each time.  This is particularly
    troublesome for index stats and a subset of matview stats; both are
    in SECTION_POST_DATA, but matview stats that depend on matview data
    are dumped in RESTORE_PASS_POST_ACL, while all other statistics
    data is dumped in RESTORE_PASS_MAIN.  To deal with this, this
    commit moves all statistics data entries in SECTION_POST_DATA to
    RESTORE_PASS_POST_ACL, which ensures that we always dump statistics
    data entries in TOC order.  One convenient side effect of this
    change is that we can revert a decent chunk of commit a0a4601.
    
    Author: Corey Huinker <[email protected]>
    Co-authored-by: Nathan Bossart <[email protected]>
    Reviewed-by: Jeff Davis <[email protected]>
    Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
    2 people authored and Commitfest Bot committed Apr 3, 2025
    Commit 2ac93a8
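
The sequential lookahead described in commit 3 can be illustrated with a small sketch. The batch size of 64 comes from the commit message; the rest is assumed for the example: `SketchEntry` and `sketch_next_batch` are hypothetical, and "eligible" is reduced to a single boolean flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BATCH_SIZE 64

/* Hypothetical, heavily simplified TOC entry. */
typedef struct SketchEntry
{
    bool        is_stats;   /* eligible for attribute-statistics retrieval? */
    int         rel_id;     /* identifier of the relation it describes */
} SketchEntry;

/*
 * Walk the TOC sequentially from index "start", collecting the relation
 * ids of up to SKETCH_BATCH_SIZE eligible entries into "batch".  The index
 * at which the next walk should resume is stored in *next.  Returns the
 * number of ids collected.
 */
static size_t
sketch_next_batch(const SketchEntry *toc, size_t ntoc, size_t start,
                  int *batch, size_t *next)
{
    size_t      n = 0;
    size_t      i = start;

    while (i < ntoc && n < SKETCH_BATCH_SIZE)
    {
        if (toc[i].is_stats)
            batch[n++] = toc[i].rel_id;
        i++;
    }

    *next = i;
    return n;
}
```

This only works if entries are later dumped in the same order in which the walk encountered them, which is the ordering assumption the commit message says breaks down under the multiple passes of RestoreArchive().
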
  4. [CF 4538] Statistics Import and Export

    This branch was automatically generated by a robot using patches from an
    email thread registered at:
    
    https://commitfest.postgresql.org/patch/4538
    
    The branch will be overwritten each time a new patch version is posted to
    the thread, and also periodically to check for bitrot caused by changes
    on the master branch.
    
    Patch(es): https://www.postgresql.org/message-id/Z-3x2AnPCP331JA3@nathan
    Author(s): Corey Huinker
    Commitfest Bot committed Apr 3, 2025
    Commit dd0ff64