BUG #17821: Assertion failed in heap_update() due to heap pruning

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: exclusion(at)gmail(dot)com
Subject: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2023-03-02 13:00:01
Message-ID: [email protected]
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17821
Logged by: Alexander Lakhin
Email address: exclusion(at)gmail(dot)com
PostgreSQL version: 15.2
Operating system: Ubuntu 22.04
Description:

When performing multi-threaded tests, I've stumbled upon a failed
assertion with the following stack trace:

Core was generated by `postgres: postgres regress003 127.0.0.1(35610) GRANT
'.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140650166969280) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140650166969280) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140650166969280) at
./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140650166969280, signo=signo(at)entry=6) at
./nptl/pthread_kill.c:89
#3 0x00007febad28e476 in __GI_raise (sig=sig(at)entry=6) at
../sysdeps/posix/raise.c:26
#4 0x00007febad2747f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x0000556725c44981 in ExceptionalCondition
(conditionName=conditionName(at)entry=0x556725dc6b00 "ItemIdIsNormal(lp)",
fileName=fileName(at)entry=0x556725dc48e0 "heapam.c",
lineNumber=lineNumber(at)entry=3024) at assert.c:66
#6 0x00005567249f3aa7 in heap_update
(relation=relation(at)entry=0x7feb61671cd8, otid=otid(at)entry=0x61900002c3fc,
newtup=newtup(at)entry=0x61900002c3f8, cid=<optimized out>,
crosscheck=crosscheck(at)entry=0x0, wait=wait(at)entry=true,
tmfd=<optimized out>, lockmode=<optimized out>) at heapam.c:3024
#7 0x00005567249f8d46 in simple_heap_update
(relation=relation(at)entry=0x7feb61671cd8, otid=otid(at)entry=0x61900002c3fc,
tup=tup(at)entry=0x61900002c3f8) at heapam.c:3969
#8 0x0000556724c367ba in CatalogTupleUpdate
(heapRel=heapRel(at)entry=0x7feb61671cd8, otid=0x61900002c3fc,
tup=0x61900002c3f8) at indexing.c:309
#9 0x0000556724bfc30c in ExecGrant_Relation
(istmt=istmt(at)entry=0x7ffdeb8ca1e0) at aclchk.c:2041
#10 0x0000556724bffee3 in ExecGrantStmt_oids
(istmt=istmt(at)entry=0x7ffdeb8ca1e0) at aclchk.c:598
#11 0x0000556724c00ab0 in ExecuteGrantStmt (stmt=stmt(at)entry=0x625000005df0)
at aclchk.c:583
#12 0x00005567257ccac7 in ProcessUtilitySlow
(pstate=pstate(at)entry=0x61900002c298, pstmt=pstmt(at)entry=0x625000005ec0,
queryString=queryString(at)entry=0x625000005218 "GRANT SELECT ON pg_proc TO
CURRENT_USER;",
context=context(at)entry=PROCESS_UTILITY_TOPLEVEL, params=params(at)entry=0x0,
queryEnv=queryEnv(at)entry=0x0, dest=<optimized out>, qc=<optimized out>) at
utility.c:1802
#13 0x00005567257c92a5 in standard_ProcessUtility
(pstmt=pstmt(at)entry=0x625000005ec0,
queryString=queryString(at)entry=0x625000005218 "GRANT SELECT ON pg_proc TO
CURRENT_USER;", readOnlyTree=readOnlyTree(at)entry=false,
context=context(at)entry=PROCESS_UTILITY_TOPLEVEL, params=params(at)entry=0x0,
queryEnv=queryEnv(at)entry=0x0, dest=<optimized out>, qc=<optimized out>) at
utility.c:973
#14 0x00007feba7e91d97 in pgss_ProcessUtility (pstmt=<optimized out>,
queryString=<optimized out>, readOnlyTree=<optimized out>,
context=<optimized out>, params=<optimized out>, queryEnv=<optimized out>,
dest=<optimized out>,
qc=<optimized out>) at pg_stat_statements.c:1143
#15 0x00005567257c9906 in ProcessUtility (pstmt=pstmt(at)entry=0x625000005ec0,
queryString=<optimized out>, readOnlyTree=<optimized out>,
context=context(at)entry=PROCESS_UTILITY_TOPLEVEL, params=<optimized out>,
queryEnv=<optimized out>,
dest=0x6250000066f0, qc=0x7ffdeb8cae50) at utility.c:526
#16 0x00005567257c2488 in PortalRunUtility
(portal=portal(at)entry=0x62500002fa18, pstmt=pstmt(at)entry=0x625000005ec0,
isTopLevel=isTopLevel(at)entry=true,
setHoldSnapshot=setHoldSnapshot(at)entry=false, dest=dest(at)entry=0x6250000066f0,

qc=qc(at)entry=0x7ffdeb8cae50) at pquery.c:1158
#17 0x00005567257c3010 in PortalRunMulti
(portal=portal(at)entry=0x62500002fa18, isTopLevel=isTopLevel(at)entry=true,
setHoldSnapshot=setHoldSnapshot(at)entry=false, dest=dest(at)entry=0x6250000066f0,
altdest=altdest(at)entry=0x6250000066f0,
qc=qc(at)entry=0x7ffdeb8cae50) at pquery.c:1315
#18 0x00005567257c3d82 in PortalRun (portal=portal(at)entry=0x62500002fa18,
count=count(at)entry=9223372036854775807, isTopLevel=isTopLevel(at)entry=true,
run_once=run_once(at)entry=true, dest=dest(at)entry=0x6250000066f0,
altdest=altdest(at)entry=0x6250000066f0, qc=<optimized out>) at
pquery.c:791
#19 0x00005567257b7f59 in exec_simple_query
(query_string=query_string(at)entry=0x625000005218 "GRANT SELECT ON pg_proc TO
CURRENT_USER;") at postgres.c:1240
#20 0x00005567257bcead in PostgresMain (dbname=dbname(at)entry=0x62900001b268
"regress003", username=username(at)entry=0x62900001b248 "postgres") at
postgres.c:4572
#21 0x000055672555f72b in BackendRun (port=port(at)entry=0x614000230640) at
postmaster.c:4461
#22 0x0000556725567fc7 in BackendStartup (port=port(at)entry=0x614000230640) at
postmaster.c:4189
#23 0x00005567255685c4 in ServerLoop () at postmaster.c:1779
#24 0x000055672556acbb in PostmasterMain (argc=argc(at)entry=3,
argv=argv(at)entry=0x603000000370) at postmaster.c:1463
#25 0x00005567251c4499 in main (argc=3, argv=0x603000000370) at main.c:200

After a series of experiments, I came to the following simple reproduction:
numclients=2
for ((i=1;i<=300;i++)); do
  echo "iteration $i"
  for ((j=1;j<=numclients;j++)); do
    for ((k=1;k<=300;k++)); do echo "GRANT ALL ON pg_class TO public;"; done | psql >psql-$j.log 2>&1 &
  done
  wait
  grep 'TRAP:' server.log && { res=0; break; }
  psql -c "VACUUM FULL pg_class" >psql-vacuum.log
done
pg_ctl -w -t 10 -D "$PGDB" stop

...
iteration 3
TRAP: failed Assert("ItemIdIsNormal(lp)"), File: "heapam.c", Line: 3024,
PID: 1817582
waiting for server to shut down............. failed

To ease the reproduction I did:
sed 's/minfree = Max(minfree, BLCKSZ \/ 10)/minfree = Max(minfree, BLCKSZ \/ 1)/' -i src/backend/access/heap/pruneheap.c
(I believe this doesn't change the behavior of page pruning too much.)
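
For reference, the fragment this sed touches looks roughly like the following (a sketch of heap_page_prune_opt() in pruneheap.c; the exact surrounding code varies a bit between branches):

    minfree = RelationGetTargetPageFreeSpace(relation, HEAP_DEFAULT_FILLFACTOR);
    minfree = Max(minfree, BLCKSZ / 10);    /* the patched line; BLCKSZ / 1 makes pruning fire on almost any read */

    if (PageIsFull(page) || PageGetHeapFreeSpace(page) < minfree)
    {
        /* opportunistically take the cleanup lock and prune the page */
        if (!ConditionalLockBufferForCleanup(buffer))
            return;
        ...
    }
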
With some debugging messages added, I see:

2023-03-02 15:22:13.027 MSK|[unknown]|[unknown]|640094f5.1ba5ab|LOG:
connection received: host=[local]
2023-03-02 15:22:13.027 MSK|law|test1|640094f5.1ba5ab|LOG: connection
authorized: user=law database=test1 application_name=psql
2023-03-02 15:22:13.027 MSK|[unknown]|[unknown]|640094f5.1ba5ac|LOG:
connection received: host=[local]
2023-03-02 15:22:13.028 MSK|law|test1|640094f5.1ba5ac|LOG: connection
authorized: user=law database=test1 application_name=psql
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|LOG: statement: GRANT
ALL ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|LOG:
ExecuteGrantStmt(): before objectNamesToOids()
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|LOG:
ExecuteGrantStmt(): after objectNamesToOids()
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|LOG:
ExecGrant_Relation() after SearchSysCache1();
ItemPointerGetOffsetNumber(&tuple->t_self): 24
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|LOG:
ExecGrant_Relationi(): before CatalogTupleUpdate();
ItemPointerGetOffsetNumber(&newtuple->t_self): 24
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|LOG: heap_update():
after ReadBuffer(); buffer: 2012, ItemPointerGetOffsetNumber(otid): 24
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|LOG: heap_update():
after PageGetItemId(); ItemPointerGetOffsetNumber(otid): 24, lp->lp_flags:
1
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ab|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|LOG: statement: GRANT
ALL ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|LOG:
ExecuteGrantStmt(): before objectNamesToOids()
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|LOG:
heap_page_prune_execute() before ItemIdSetUnused(); buffer: 2012, off: 24
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|LOG:
ExecuteGrantStmt(): after objectNamesToOids()
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|LOG:
ExecGrant_Relation() after SearchSysCache1();
ItemPointerGetOffsetNumber(&tuple->t_self): 24
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|LOG:
ExecGrant_Relationi(): before CatalogTupleUpdate();
ItemPointerGetOffsetNumber(&newtuple->t_self): 24
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|LOG: heap_update():
after ReadBuffer(); buffer: 2012, ItemPointerGetOffsetNumber(otid): 24
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|STATEMENT: GRANT ALL
ON pg_class TO public;
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|LOG: heap_update():
after PageGetItemId(); ItemPointerGetOffsetNumber(otid): 24, lp->lp_flags:
0
2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|STATEMENT: GRANT ALL
ON pg_class TO public;
TRAP: failed Assert("ItemIdIsNormal(lp)"), File: "heapam.c", Line: 3025,
PID: 1811884

So it looks like session 1ba5ac made line pointer 24 unused while
pruning in the guts of objectNamesToOids(), then got the same
offset (24) back from SearchSysCache1(), and finally was surprised
inside heap_update() to see an unused line pointer.
This raises two questions for me.
1) Was it legal to prune that line pointer (if ExecGrant did not
perform heap_modify_tuple() in this case, would an unused line pointer
be left behind)?
2) Should page pruning be reflected in the syscache somehow?
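
For orientation, the failing check is roughly this fragment of heap_update() in heapam.c (a sketch; locking and visibility-map details omitted):

    block = ItemPointerGetBlockNumber(otid);
    buffer = ReadBuffer(relation, block);
    page = BufferGetPage(buffer);
    ...
    LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

    lp = PageGetItemId(page, ItemPointerGetOffsetNumber(otid));
    Assert(ItemIdIsNormal(lp));         /* fires when pruning has set the slot to LP_UNUSED */

    oldtup.t_data = (HeapTupleHeader) PageGetItem(page, lp);
    oldtup.t_len = ItemIdGetLength(lp);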


From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2023-03-04 06:00:00
Message-ID: [email protected]
Lists: pgsql-bugs

02.03.2023 16:00, PG Bug reporting form wrote:
> #5 0x0000556725c44981 in ExceptionalCondition
> (conditionName=conditionName(at)entry=0x556725dc6b00 "ItemIdIsNormal(lp)",
> fileName=fileName(at)entry=0x556725dc48e0 "heapam.c",
> lineNumber=lineNumber(at)entry=3024) at assert.c:66
> ...
> 2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|LOG:
> ExecuteGrantStmt(): before objectNamesToOids()
> 2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|STATEMENT: GRANT ALL
> ON pg_class TO public;
> 2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|LOG:
> heap_page_prune_execute() before ItemIdSetUnused(); buffer: 2012, off: 24
> 2023-03-02 15:22:13.031 MSK|law|test1|640094f5.1ba5ac|STATEMENT: GRANT ALL
> ON pg_class TO public;
> ...
>
> So it looks like session 1ba5ac made line pointer 24 unused while
> pruning in the guts of objectNamesToOids(), then got the same
> offset (24) back from SearchSysCache1(), and finally was surprised
> inside heap_update() to see an unused line pointer.
> This raises two questions for me.
> 1) Was it legal to prune that line pointer (if ExecGrant did not
> perform heap_modify_tuple() in this case, would an unused line pointer
> be left behind)?
> 2) Should page pruning be reflected in the syscache somehow?

Further investigation gave me a positive answer to the first question.
With more debug logging added, I see that a redirection happened when
that pruning executed. Dead items were found in a tuple chain starting
from position 22, a redirection 22 -> 25 was performed, and so item 24
became unused.
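
The mechanics are roughly these loops in heap_page_prune_execute() (a sketch; the real array handling is slightly different):

    /* turn the root of each pruned chain into a redirect, e.g. 22 -> 25 */
    for (int i = 0; i < nredirected; i++)
    {
        OffsetNumber fromoff = redirected[2 * i];
        OffsetNumber tooff   = redirected[2 * i + 1];
        ItemId       fromlp  = PageGetItemId(page, fromoff);

        ItemIdSetRedirect(fromlp, tooff);
    }

    /* reclaim the heap-only tuples of the chain, e.g. item 24 */
    for (int i = 0; i < nunused; i++)
    {
        ItemId      lp = PageGetItemId(page, nowunused[i]);

        ItemIdSetUnused(lp);
    }
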
A fragment of the detailed log is attached, and you can also look at
a dump of the modified page at the moment of the Assert.

SELECT * from heap_page_item_attrs(pg_read_binary_file('page.bin'),
1259::oid, true);
shows:
...
 21 |   3600 |        1 |    225 |    680 |      0 |        0 | (10,21) |          33 |      11011 |     32 | 1111111111111111111111111111111000000000 |       | ...
 22 |     25 |        2 |      0 |        |        |          |         |             |            |        |                                           |       |
 23 |      0 |        3 |      0 |        |        |          |         |             |            |        |                                           |       |
 24 |      0 |        0 |      0 |        |        |          |         |             |            |        |                                           |       |
 25 |   3368 |        1 |    225 | 222501 |      0 |        0 | (10,25) |       32801 |      10499 |     32 | 1111111111111111111111111111111000000000 |       | ...
(25 rows)

So we have only the second question left.

Best regards,
Alexander

Attachment Content-Type Size
server.log text/x-log 4.7 KB
page.bin application/octet-stream 8.0 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org, Alexander Lakhin <exclusion(at)gmail(dot)com>
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2023-03-04 16:48:32
Message-ID: [email protected]
Lists: pgsql-bugs

Hi

On March 3, 2023 10:00:00 PM PST, Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
>02.03.2023 16:00, PG Bug reporting form wrote:
>So we have only the second question left.

My first suspicion would be that we aren't holding a lock in the right moment, to enforce cache invalidation processing.

Greetings,

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2023-10-07 08:00:00
Message-ID: [email protected]
Lists: pgsql-bugs

Hi,

04.03.2023 19:48, Andres Freund wrote:
> My first suspicion would be that we aren't holding a lock in the right moment, to enforce cache invalidation processing.
>

I've encountered another case of that assertion failure (this time with pg_statistic):
numclients=10
for ((c=1;c<=numclients;c++)); do
  createdb db$c
done

for ((i=1;i<=50;i++)); do
  echo "iteration $i"

  for ((c=1;c<=numclients;c++)); do
    echo "
CREATE TABLE t(i int4, t1 text[], t2 text[]);
INSERT INTO t SELECT g,  ARRAY[repeat('x', g)], ARRAY[repeat('x', g)] FROM generate_series(1, 200) g;
ANALYZE t;
    " | psql -d db$c >/dev/null
  done

  for ((c=1;c<=numclients;c++)); do
    echo "
ANALYZE t;
    " | psql -d db$c >/dev/null &

    echo "
SELECT * FROM t WHERE i = 1;
ANALYZE t;
    " | psql -d db$c >/dev/null &
  done
  wait
  grep 'TRAP:' server.log && { break; }

  for ((c=1;c<=numclients;c++)); do
    echo "
DROP TABLE t; VACUUM pg_statistic;
    " | psql -d db$c >/dev/null
  done
done

...
iteration 13
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost
TRAP: failed Assert("ItemIdIsNormal(lp)"), File: "heapam.c", Line: 3074, PID: 89697

Reproduced on REL_12_STABLE (starting from 5e0928005) .. master.
(Modifying pruneheap.c as in [1] is not needed for this case.)

Here we have the following sequence of events:
Session 1
ANALYZE t; // needed just to produce material for pruning
    do_analyze_rel() -> update_attstats() -> CatalogTupleUpdateWithInfo() -> heap_update(ctid(18,16));
PrepareToInvalidateCacheTuple(hashvalue: {hash1})
                Session 2:
                SELECT * FROM t WHERE i = 1;
                    SearchCatCacheInternal(hashValue: {hash1})
                        SearchCatCacheMiss(hashValue: {hash1})
    ProcessInvalidationMessages()
    CatCacheInvalidate(hashValue: {hash1})
    CatCacheInvalidate(hashValue: {hash2})
                            CatCacheInvalidate(hashValue: {hash1})
                        SearchCatCacheMiss returns tid(18,16)

                ANALYZE t;
                    do_analyze_rel() -> update_attstats()
                        attno=0 SearchSysCache3() -> SearchCatCacheInternal(hashValue: {hash2}) -> SearchCatCacheMiss() -> heap_page_prune(block: 18)
                        attno=1 SearchSysCache3() -> SearchCatCacheInternal(hashValue: {hash1}) -> oldtup = tid(18,16)
                        CatalogTupleUpdateWithInfo() -> heap_update(tid(18, 16))

(A server.log with additional tracing messages for one of the failed runs
is attached.)

In other words, a backend can get a tuple id from the catcache and then prune
the underlying page while searching for another (missing) entry in the cache
(any other read of the page that triggers pruning will do as well).
After that, an attempt to update that tuple might trigger the Assert().
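
A hypothetical sketch of that vulnerable pattern, loosely modeled on update_attstats() (variable names simplified, not the literal code):

    /* catcache copy: neither a buffer pin nor a snapshot protects the underlying page */
    oldtup = SearchSysCache3(STATRELATTINH,
                             ObjectIdGetDatum(relid),
                             Int16GetDatum(attnum),
                             BoolGetDatum(inh));

    /*
     * ... a later cache miss rereads the same heap page and may prune it,
     * leaving the cached t_self pointing at an LP_UNUSED slot ...
     */

    newtup = heap_modify_tuple(oldtup, RelationGetDescr(sd), values, nulls, replaces);
    CatalogTupleUpdate(sd, &newtup->t_self, newtup);    /* heap_update() then trips the Assert */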

[1] https://www.postgresql.org/message-id/17821-dd8c334263399284%40postgresql.org

Best regards,
Alexander

Attachment Content-Type Size
server-db6.log text/x-log 23.9 KB

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2024-04-09 12:00:00
Message-ID: [email protected]
Lists: pgsql-bugs

Hello,

07.10.2023 11:00, Alexander Lakhin wrote:
>
> I've encountered another case of that assertion failure (this time with pg_statistic):

That second case is no longer reproducible after ad98fb142, but the first
one still is.

I also discovered another one, with pg_subscription:
echo "
CREATE ROLE su SUPERUSER;
CREATE SUBSCRIPTION sub CONNECTION 'dbname=nonexisting'
  PUBLICATION pub WITH (connect = false);
" | psql

for ((i=1;i<=20;i++)); do
  echo "iteration $i"

  for ((k=1;k<=100;k++)); do
    for ((j=1;j<=200;j++)); do echo "
ALTER SUBSCRIPTION sub SET (synchronous_commit = false);
    "; done | psql >/dev/null 2>&1 &

    for ((j=1;j<=200;j++)); do echo "
ALTER SUBSCRIPTION sub OWNER TO su;
ALTER SUBSCRIPTION sub SET (slot_name = NONE);
    "; done | psql >/dev/null 2>&1 &
    wait
  done

  grep 'TRAP:' server.log && break;
done

This script fails for me as below:
...
iteration 5
TRAP: failed Assert("ItemIdIsNormal(lp)"), File: "heapam.c", Line: 3296, PID: 388093

(sed 's/minfree = Max(minfree, BLCKSZ \/ 10)/minfree = Max(minfree, BLCKSZ \/1)/'
-i src/backend/access/heap/pruneheap.c
is highly recommended for reproducing)

Melanie and Heikki, as you have been working on pruning over the recent
months, perhaps this defect could be of interest to you...

Best regards,
Alexander


From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2024-04-09 17:41:51
Message-ID: CAAKRu_bWz0KhTKW3SzetvhZL3gpsnyekB2WFCn-MO4aCx=37=g@mail.gmail.com
Lists: pgsql-bugs

On Tue, Apr 9, 2024 at 8:00 AM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
>
> Hello,
>
> 07.10.2023 11:00, Alexander Lakhin wrote:
> >
> > I've encountered another case of that assertion failure (this time with pg_statistic):
>
> That second case is no longer reproducible after ad98fb142, but the first
> one still is.
>
> I also discovered another one, with pg_subscription:
> echo "
> CREATE ROLE su SUPERUSER;
> CREATE SUBSCRIPTION sub CONNECTION 'dbname=nonexisting'
> PUBLICATION pub WITH (connect = false);
> " | psql
>
> for ((i=1;i<=20;i++)); do
> echo "iteration $i"
>
> for ((k=1;k<=100;k++)); do
> for ((j=1;j<=200;j++)); do echo "
> ALTER SUBSCRIPTION sub SET (synchronous_commit = false);
> "; done | psql >/dev/null 2>&1 &
>
> for ((j=1;j<=200;j++)); do echo "
> ALTER SUBSCRIPTION sub OWNER TO su;
> ALTER SUBSCRIPTION sub SET (slot_name = NONE);
> "; done | psql >/dev/null 2>&1 &
> wait
> done
>
> grep 'TRAP:' server.log && break;
> done
>
> This script fails for me as below:
> ...
> iteration 5
> TRAP: failed Assert("ItemIdIsNormal(lp)"), File: "heapam.c", Line: 3296, PID: 388093
>
> (sed 's/minfree = Max(minfree, BLCKSZ \/ 10)/minfree = Max(minfree, BLCKSZ \/1)/'
> -i src/backend/access/heap/pruneheap.c
> is highly recommended for reproducing)
>
> Melanie and Heikki, as you have been working on pruning over the recent
> months, perhaps this defect could be of interest to you...

Hi Alexander,

Just to confirm, the original bug was filed against 15, but I see
ad98fb142 was committed on master as well as backpatched. You are able
to reproduce your pg_subscription example on master? I tried your
repro (including the modification to minfree) and wasn't able to
reproduce on master on my machine. This could totally be my fault, but
I just wanted to confirm you were able to repro on master.

- Melanie


From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2024-04-09 19:00:00
Message-ID: [email protected]
Lists: pgsql-bugs

Hi Melanie,

09.04.2024 20:41, Melanie Plageman wrote:
> Just to confirm, the original bug was filed against 15, but I see
> ad98fb142 was committed on master as well as backpatched. You are able
> to reproduce your pg_subscription example on master? I tried your
> repro (including the modification to minfree) and wasn't able to
> reproduce on master on my machine. This could totally be my fault, but
> I just wanted to confirm you were able to repro on master.
>

Yes, I reproduced both cases on master (27074bce0).
Please look at the self-contained script that works for me:
.../repro-17821-SUBSCRIPTION.sh
...
CREATE ROLE
WARNING:  subscription was created, but is not connected
HINT:  To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the
subscription.
CREATE SUBSCRIPTION
iteration 1
iteration 2
iteration 3
TRAP: failed Assert("ItemIdIsNormal(lp)"), File: "heapam.c", Line: 3296, PID: 1069025
TRAP: failed Assert("ItemIdIsNormal(lp)"), File: "heapam.c", Line: 3296, PID: 1069621

Best regards,
Alexander

Attachment Content-Type Size
repro-17821-SUBSCRIPTION.sh.txt text/plain 1.3 KB

From: Noah Misch <noah(at)leadboat(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2024-11-13 00:26:57
Message-ID: [email protected]
Lists: pgsql-bugs

On Tue, Apr 09, 2024 at 10:00:00PM +0300, Alexander Lakhin wrote:
> 09.04.2024 20:41, Melanie Plageman wrote:
> > Just to confirm, the original bug was filed against 15, but I see
> > ad98fb142 was committed on master as well as backpatched. You are able
> > to reproduce your pg_subscription example on master? I tried your
> > repro (including the modification to minfree) and wasn't able to
> > reproduce on master on my machine. This could totally be my fault, but
> > I just wanted to confirm you were able to repro on master.
>
> Yes, I reproduced both cases on master (27074bce0).

This looked a lot like the former LP_UNUSED race conditions that inplace.spec
demonstrated. I forked inplace.spec to use regular MVCC updates (attached).
That reproduces the symptoms you describe. This happens because syscache is a
way to get a tuple without a pin on the tuple's page and without satisfying
any snapshot. Either a pin or a snapshot would stop the pruning. That's how
regular UPDATE statements avoid this.

Some options for fixing it:

1. Make the ItemIdIsNormal() check a runtime check that causes heap_update()
to return TM_Deleted. It could Assert(RelationSupportsSysCache(rel)), too.

2. Like the previous, but introduce a TM_Pruned. If pruning hadn't happened,
heap_update() would have returned TM_Updated or TM_Deleted. We don't know
which one heap_update() would have returned, because pruning erased the
t_ctid that would have driven that decision.

3. Before using a syscache tuple for a heap_update(), pin its buffer and
compare the syscache xmin to the buffer tuple xmin. Normally, they'll
match. If they don't match, ERROR or re-fetch the latest tuple as though
the syscache lookup had experienced a cache miss.

4. Take a self-exclusive heavyweight lock before we use the syscache to fetch
the old tuple for a heap_update(). That's how we avoid this if the only
catalog-modifying commands are ALTER TABLE.

I think this defect can't corrupt catalogs or other data. It's merely a
source of transient failure. While one of the test permutations does update a
pg_class tuple of a different rel, the update then fails with a primary key
violation. All misdirected updates will fail that way, because we don't
update the primary key columns of syscache relations. With assertions
disabled, I think heap_update() looks for an old tuple at lp_off=0, hence
treating the page header as a tuple header. Bytes of pd_lsn become the
putative xmin. The attached test gets "ERROR: attempted to update invisible
tuple" error from that, but other PageHeader bit patterns might reach "could
not access status of transaction", unique constraint violations, etc.

I recommend (2) or (1), since they're self-contained. (3) or (4) potentially
give a better user experience for this rare failure, but that's code churn at
every DDL heap_update() site. How do you see it?
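
For concreteness, a rough sketch of what (1) might look like in heap_update() (the exact TM_FailureData contents are an assumption on my part):

    lp = PageGetItemId(page, ItemPointerGetOffsetNumber(otid));

    /* runtime check instead of Assert(ItemIdIsNormal(lp)) */
    if (!ItemIdIsNormal(lp))
    {
        /* the syscache-supplied otid was pruned away under us */
        UnlockReleaseBuffer(buffer);
        tmfd->ctid = *otid;
        tmfd->xmax = InvalidTransactionId;
        tmfd->cmax = InvalidCommandId;
        return TM_Deleted;
    }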

Thanks,
nm

Attachment Content-Type Size
repro-ItemIdIsNormal-v1.patch text/plain 10.1 KB

From: Noah Misch <noah(at)leadboat(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2025-01-11 21:44:54
Message-ID: [email protected]
Lists: pgsql-bugs

On Tue, Nov 12, 2024 at 04:26:57PM -0800, Noah Misch wrote:
> This looked a lot like the former LP_UNUSED race conditions that inplace.spec
> demonstrated. I forked inplace.spec to use regular MVCC updates (attached).
> That reproduces the symptoms you describe. This happens because syscache is a
> way to get a tuple without a pin on the tuple's page and without satisfying
> any snapshot. Either a pin or a snapshot would stop the pruning. That's how
> regular UPDATE statements avoid this.
>
> Some options for fixing it:
>
> 1. Make the ItemIdIsNormal() check a runtime check that causes heap_update()
> to return TM_Deleted. It could Assert(RelationSupportsSysCache(rel)), too.
>
> 2. Like the previous, but introduce a TM_Pruned. If pruning hadn't happened,
> heap_update() would have returned TM_Updated or TM_Deleted. We don't know
> which one heap_update() would have returned, because pruning erased the
> t_ctid that would have driven that decision.
>
> 3. Before using a syscache tuple for a heap_update(), pin its buffer and
> compare the syscache xmin to the buffer tuple xmin. Normally, they'll
> match. If they don't match, ERROR or re-fetch the latest tuple as though
> the syscache lookup had experienced a cache miss.
>
> 4. Take a self-exclusive heavyweight lock before we use the syscache to fetch
> the old tuple for a heap_update(). That's how we avoid this if the only
> catalog-modifying commands are ALTER TABLE.
>
> I think this defect can't corrupt catalogs or other data. It's merely a
> source of transient failure.

I got that wrong ...

> While one of the test permutations does update a
> pg_class tuple of a different rel, the update then fails with a primary key
> violation. All misdirected updates will fail that way, because we don't
> update the primary key columns of syscache relations. With assertions
> disabled, I think heap_update() looks for an old tuple at lp_off=0, hence
> treating the page header as a tuple header.

It does.

> Bytes of pd_lsn become the
> putative xmin. The attached test gets "ERROR: attempted to update invisible
> tuple" error from that, but other PageHeader bit patterns might reach "could
> not access status of transaction", unique constraint violations, etc.

A UNIQUE constraint violation is too late to avoid page header corruption. I
modified the test case to reach page header corruption. Attached. Key parts:

  lsn   | checksum | flags | lower | upper | special | pagesize | version | prune_xid
--------+----------+-------+-------+-------+---------+----------+---------+-----------
 1/6AF8 |        0 |     0 |     0 |    11 |      38 |     8192 |       4 |         0
(1 row)

ERROR: invalid page in block 11 of relation base/16384/16462

A UNIQUE violation implies heap_update() succeeded, and heap_update() success
implies changes to the old tuple header. Since the test catches us wrongly
treating the page header as a tuple header, overwrites ensue as follows:

Page header field | Overwritten as though it were this heap tuple header field
------------------------------------------------------------------------------
pd_lsn.xrecoff | xmax
pd_lower | ctid.ip_blkid.bi_hi
pd_upper | ctid.ip_blkid.bi_lo
pd_special | ctid.ip_posid

t_ctid=(11,38) caused those values of "lower", "upper", and "special". When
the page is next read from disk, it gets "ERROR: invalid page".
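
Spelling out the aliasing (hypothetical illustration; byte offsets assume the standard 24-byte page header and 4-byte TransactionId):

    PageHeader      phdr = (PageHeader) page;
    HeapTupleHeader htup = (HeapTupleHeader) page;  /* what heap_update() dereferences when lp_off = 0 */

    /* page+0..3    phdr->pd_lsn (xlogid half)    read as t_xmin                        */
    /* page+4..7    phdr->pd_lsn (xrecoff half)   overwritten as t_xmax                 */
    /* page+12..13  phdr->pd_lower                overwritten as t_ctid.ip_blkid.bi_hi  */
    /* page+14..15  phdr->pd_upper                overwritten as t_ctid.ip_blkid.bi_lo  */
    /* page+16..17  phdr->pd_special              overwritten as t_ctid.ip_posid        */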

> I recommend (2) or (1), since they're self-contained. (3) or (4) potentially
> give a better user experience for this rare failure, but that's code churn at
> every DDL heap_update() site. How do you see it?

I plan to write (1).

Attachment Content-Type Size
repro-ItemIdIsNormal-v2.patch text/plain 15.7 KB

From: Noah Misch <noah(at)leadboat(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2025-01-23 02:13:49
Message-ID: [email protected]
Lists: pgsql-bugs

On Sat, Jan 11, 2025 at 01:44:54PM -0800, Noah Misch wrote:
> On Tue, Nov 12, 2024 at 04:26:57PM -0800, Noah Misch wrote:
> > 1. Make the ItemIdIsNormal() check a runtime check that causes heap_update()
> > to return TM_Deleted. It could Assert(RelationSupportsSysCache(rel)), too.

> A UNIQUE constraint violation is too late to avoid page header corruption.

> the page is next read from disk, it gets "ERROR: invalid page".

> I plan to write (1).

Attached. Outside of comments and tests, it's a simple 20-line patch. Since
only two weeks will remain before release freeze, in the absence of review, I
would push this after 2025-01-25T16:00+0000. Unfortunately, while this
prevents the page header corruption, there's a preexisting loss of DDL changes
that it doesn't fix. I'll start a new thread about that, but I've included a
preview in Appendix 1 below. Would anyone else like to work on the problem in
Appendix 1? If so, let me know and feel free to start that thread yourself.
I may not get to it in 2025.

The new code sets TM_FailureData.xmax to invalid, which TM_Deleted may have
never done before. I see no problem at sites using that xmax:

src/backend/executor/nodeModifyTable.c:3137: if (TransactionIdIsCurrentTransactionId(context->tmfd.xmax))
src/backend/executor/nodeModifyTable.c:3307: if (TransactionIdIsCurrentTransactionId(context->tmfd.xmax))
src/backend/access/heap/heapam_handler.c:395: priorXmax = tmfd->xmax;
src/backend/access/heap/heapam_handler.c:488: tmfd->xmax = priorXmax;

The first two will rightly compute
TransactionIdIsCurrentTransactionId()==false. The second two are dealing with
tmfd from heap_lock_tuple(), not from heap_update(). Those grep matches are
from v18, but v13 and PGXN don't have additional challenges.

The test was sensitive at first to debug_parallel_query, debug_discard_caches,
autovacuum, and -DCATCACHE_FORCE_RELEASE. It would have avoided an
alternative output to have a GUC controlling -DCATCACHE_FORCE_RELEASE, like we
have debug_discard_caches to control CLOBBER_CACHE_ALWAYS. I'll not be
shocked if the buildfarm finds another way to make the test unstable. We may
end up disabling autovacuum for src/test/modules/injection_points, since
autovacuum doesn't mix with non-local injection points that autovacuum can
reach. For now, I've aimed to preserve compatibility with autovacuum=on.

==== Appendix 1: the unfixed loss of DDL changes

Unfortunately, I found a related case the patch doesn't fix. The known-to-me
fixes for this other case are more invasive than the fix for $SUBJECT, so I
plan to move forward with the $SUBJECT fix and bring the following to a new
thread. The third & last permutation of the $SUBJECT patch test case covers
the unfixed case. In brief, suppose the oldtup for a catalog update is a
syscache entry with xmin=1234 ctid=(11,38). We call CatalogTupleUpdate(rel,
'(11,38)', newtup). Meanwhile, the tuple has been updated twice with a VACUUM
between the two updates, so (11,38) now contains an xmin=5678 tuple for the
same primary key. CatalogTupleUpdate() has no way to know that its caller
used a stale oldtup, so the effect is to discard the two intervening updates.

To fix that, I lean toward the following:

- Make heap_modify_tuple() copy xmin from the oldtup.
- Make simple_heap_update() assert that newtup contains a valid xmin.
- In heap_update(), if newtup contains a valid xmin and otid now points to
tuple with a different xmin, fail with TM_Deleted. (Invalid xmin is also
okay for heap_update() callers that don't route through
simple_heap_update(), e.g. ExecUpdate(). ExecUpdate() has a snapshot
preventing the loss.)

Catalog updates not based on heap_copytuple() or heap_modify_tuple() would
need changes. They'll keep working in non-assert builds if they pass invalid
xmin. They'll break if they pass garbage bytes in xmin.
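
A hypothetical sketch of that xmin cross-check inside heap_update() (simplified; buffer handling elided):

    /* newtup->t_data carries the xmin of the oldtup the caller looked at */
    if (TransactionIdIsValid(HeapTupleHeaderGetXmin(newtup->t_data)) &&
        !TransactionIdEquals(HeapTupleHeaderGetXmin(newtup->t_data),
                             HeapTupleHeaderGetXmin(oldtup.t_data)))
    {
        /* otid no longer holds the tuple the caller based newtup on */
        UnlockReleaseBuffer(buffer);
        return TM_Deleted;
    }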

Better ideas? Other ways our lack of snapshot protecting oldtup might break?

Two of my upthread alternatives would also fix this, but they both entail the
churn of changing every CatalogTupleUpdate() caller (6 PGXN modules and ~200
postgresql.git calls):

> > 3. Before using a syscache tuple for a heap_update(), pin its buffer and
> > compare the syscache xmin to the buffer tuple xmin. Normally, they'll
> > match. If they don't match, ERROR or re-fetch the latest tuple as though
> > the syscache lookup had experienced a cache miss.

> > 4. Take a self-exclusive heavyweight lock before we use the syscache to fetch
> > the old tuple for a heap_update(). That's how we avoid this if the only
> > catalog-modifying commands are ALTER TABLE.

==== Appendix 2: UNIQUE violation helps less than theorized

The repro-ItemIdIsNormal-v2.patch test case (see previous message of thread)
didn't get the UNIQUE violation I expected. I opted to find out why not,
because I thought at the time we'll be relying long-term on the UNIQUE
violation when a different live tuple has replaced the ctid we're updating.
The problem update makes index entries for its new tuple, but this code stops
us from reporting the violation:

* Otherwise we have a definite conflict. But before
* complaining, look to see if the tuple we want to insert
* is itself now committed dead --- if so, don't complain.
* This is a waste of time in normal scenarios but we must
* do it to support CREATE INDEX CONCURRENTLY.
...
if (table_index_fetch_tuple_check(heapRel, &htid,
SnapshotSelf, NULL))

Since the page header corruption sets pd_lower=0, PageGetMaxOffsetNumber()
concludes the page has no tuples, and that table_index_fetch_tuple_check()
finds nothing.

Attachment Content-Type Size
update-unused-lp-10-no-installcheck-v1.patch text/plain 1.2 KB
update-unused-lp-40-xid2fxid-v1.patch text/plain 10.1 KB
update-unused-lp-50-fix-v1.patch text/plain 24.8 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2025-03-03 22:07:10
Message-ID: sv3taq4e6ea4qckimien3nxp3sz4b6cw6sfcy4nhwl52zpur4g@h6i6tohxmizu
Lists: pgsql-bugs

Hi,

On 2025-01-22 18:13:49 -0800, Noah Misch wrote:
> Attached. Outside of comments and tests, it's a simple 20-line patch. Since
> only two weeks will remain before release freeze, in the absence of review, I
> would push this after 2025-01-25T16:00+0000.

I just fixed skink, the valgrind animal, so it runs not just the main
regression tests but all tests with a valgrind-ified postgres. Unfortunately,
the next run triggered a failure in the test added in this thread:

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skink&dt=2025-03-03%2019%3A52%3A38&stg=injection_points-check

diff -U3 /home/bf/bf-build/skink-master/HEAD/pgsql/src/test/modules/injection_points/expected/syscache-update-pruned.out /home/bf/bf-build/skink-master/HEAD/pgsql.build/testrun/injection_points/isolation/results/syscache-update-pruned.out
--- /home/bf/bf-build/skink-master/HEAD/pgsql/src/test/modules/injection_points/expected/syscache-update-pruned.out 2025-01-25 19:30:50.005386842 +0000
+++ /home/bf/bf-build/skink-master/HEAD/pgsql.build/testrun/injection_points/isolation/results/syscache-update-pruned.out 2025-03-03 21:08:02.025314915 +0000
@@ -75,6 +75,7 @@
SELECT FROM injection_points_wakeup('heap_update-before-pin');
<waiting ...>
step grant1: <... completed>
+ERROR: tuple concurrently deleted
step wakegrant4: <... completed>
step inspect4:
SELECT relhastriggers, relhassubclass FROM pg_class
@@ -82,6 +83,6 @@

relhastriggers|relhassubclass
--------------+--------------
-f |f
+t |t
(1 row)

Greetings,

Andres Freund


From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2025-03-04 03:34:42
Message-ID: [email protected]
Lists: pgsql-bugs

On Mon, Mar 03, 2025 at 05:07:10PM -0500, Andres Freund wrote:
> On 2025-01-22 18:13:49 -0800, Noah Misch wrote:
> > Attached. Outside of comments and tests, it's a simple 20-line patch. Since
> > only two weeks will remain before release freeze, in the absence of review, I
> > would push this after 2025-01-25T16:00+0000.
>
> I just fixed skink, the valgrind animal, so it runs not just the main
> regression tests but all tests with a valgrind-ified postgres.

Thanks.

> Unfortunately,
> the next run triggered a failure in the test added in this thread:
>
> https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skink&dt=2025-03-03%2019%3A52%3A38&stg=injection_points-check
>
> diff -U3 /home/bf/bf-build/skink-master/HEAD/pgsql/src/test/modules/injection_points/expected/syscache-update-pruned.out /home/bf/bf-build/skink-master/HEAD/pgsql.build/testrun/injection_points/isolation/results/syscache-update-pruned.out
> --- /home/bf/bf-build/skink-master/HEAD/pgsql/src/test/modules/injection_points/expected/syscache-update-pruned.out 2025-01-25 19:30:50.005386842 +0000
> +++ /home/bf/bf-build/skink-master/HEAD/pgsql.build/testrun/injection_points/isolation/results/syscache-update-pruned.out 2025-03-03 21:08:02.025314915 +0000
> @@ -75,6 +75,7 @@
> SELECT FROM injection_points_wakeup('heap_update-before-pin');
> <waiting ...>
> step grant1: <... completed>
> +ERROR: tuple concurrently deleted
> step wakegrant4: <... completed>
> step inspect4:
> SELECT relhastriggers, relhassubclass FROM pg_class
> @@ -82,6 +83,6 @@
>
> relhastriggers|relhassubclass
> --------------+--------------
> -f |f
> +t |t
> (1 row)

That isolation permutation tests an unfixed bug. Here, it's giving a result
as though the bug were fixed. The suite passed the next two skink runs. I'd
tend to defer debugging exactly what went wrong until the project to fix the
bug under test. I could delete the permutation, or I could leave it awhile to
see how high-probability this failure is. I'm inclined to leave it until it
gets four failures, then delete the permutation.


From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Date: 2025-04-02 17:04:36
Message-ID: [email protected]
Lists: pgsql-bugs

On Mon, Mar 03, 2025 at 07:34:42PM -0800, Noah Misch wrote:
> On Mon, Mar 03, 2025 at 05:07:10PM -0500, Andres Freund wrote:
> > I just fixed skink, the valgrind animal, so it runs not just the main
> > regression tests but all tests with a valgrind-ified postgres.
>
> Thanks.
>
> > Unfortunately,
> > the next run triggered a failure in the test added in this thread:
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skink&dt=2025-03-03%2019%3A52%3A38&stg=injection_points-check
> >
> > diff -U3 /home/bf/bf-build/skink-master/HEAD/pgsql/src/test/modules/injection_points/expected/syscache-update-pruned.out /home/bf/bf-build/skink-master/HEAD/pgsql.build/testrun/injection_points/isolation/results/syscache-update-pruned.out
> > --- /home/bf/bf-build/skink-master/HEAD/pgsql/src/test/modules/injection_points/expected/syscache-update-pruned.out 2025-01-25 19:30:50.005386842 +0000
> > +++ /home/bf/bf-build/skink-master/HEAD/pgsql.build/testrun/injection_points/isolation/results/syscache-update-pruned.out 2025-03-03 21:08:02.025314915 +0000
> > @@ -75,6 +75,7 @@
> > SELECT FROM injection_points_wakeup('heap_update-before-pin');
> > <waiting ...>
> > step grant1: <... completed>
> > +ERROR: tuple concurrently deleted
> > step wakegrant4: <... completed>
> > step inspect4:
> > SELECT relhastriggers, relhassubclass FROM pg_class
> > @@ -82,6 +83,6 @@
> >
> > relhastriggers|relhassubclass
> > --------------+--------------
> > -f |f
> > +t |t
> > (1 row)
>
> That isolation permutation tests an unfixed bug. Here, it's giving a result
> as though the bug were fixed. The suite passed the next two skink runs. I'd
> tend to defer debugging exactly what went wrong until the project to fix the
> bug under test. I could delete the permutation, or I could leave it awhile to
> see how high-probability this failure is. I'm inclined to leave it until it
> gets four failures, then delete the permutation.

Alexander Lakhin shared a recipe that reproduced it. I found permutation 2
also failed under that recipe, so removing permutation 3 got less attractive.

These tests need a concept of "wait until VACUUM definitely will prune a
particular tuple". My removable_cutoff() header comment made an argument for
having achieved that, but it had two gaps. The attached patch passes 1000
Valgrind iterations of the spec. Without the patch, ~3% of iterations failed.

Commit c2dc1a7 added DISABLE_PAGE_SKIPPING to stop buffer pins thwarting other
VACUUM tests. However, I suspect FREEZE is necessary and sufficient in all
these tests, since we need to coax this code to insist on the cleanup lock:

    /*
     * If we didn't get the cleanup lock, we can still collect LP_DEAD
     * items in the dead_items area for later vacuuming, count live and
     * recently dead tuples for vacuum logging, and determine if this
     * block could later be truncated. If we encounter any xid/mxids that
     * require advancing the relfrozenxid/relminxid, we'll have to wait
     * for a cleanup lock and call lazy_scan_prune().
     */
    if (!got_cleanup_lock &&
        !lazy_scan_noprune(vacrel, buf, blkno, page, &has_lpdead_items))
    {
        /*
         * lazy_scan_noprune could not do all required processing. Wait
         * for a cleanup lock, and call lazy_scan_prune in the usual way.
         */
        Assert(vacrel->aggressive);
        LockBuffer(buf, BUFFER_LOCK_UNLOCK);
        LockBufferForCleanup(buf);
        got_cleanup_lock = true;
    }

FREEZE causes the VACUUM to "require advancing the relfrozenxid/relminxid",
hence the effectiveness of FREEZE for this purpose. The code has moved
around, but I think the policy was the same at the time of that commit
c2dc1a7. Per the doc for DISABLE_PAGE_SKIPPING, it's for working around a
corrupt vm, not for pin contention. The spec still passes 1000 Valgrind
iterations if I remove DISABLE_PAGE_SKIPPING.

Attachment Content-Type Size
update-unused-lp-60-dpageskip-v1.patch text/plain 9.2 KB