Lists: | pgsql-hackers |
---|
From: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
---|---|
To: | "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Too rigorous assert in reorderbuffer.c |
Date: | 2019-01-31 06:21:59 |
Message-ID: | 874l9p8hyw.fsf@ars-thinkpad |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
My colleague Alexander Lakhin has noticed an assertion failure in
reorderbuffer.c:1330. Here is a simple snippet reproducing it:
SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');
create table t(k int);
begin;
savepoint a;
alter table t alter column k type text;
rollback to savepoint a;
alter table t alter column k type bigint;
commit;
SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
It is indeed too opinionated since cmax of a tuple is not stable; it can
be rewritten if subxact who tried to delete it later aborts (analogy
also holds for xmax). Attached patch removes it. While here, I had also
considered worthwhile to add a test involving DDL in aborted subxact as
it is interesting anyway and wasn't covered before.
Attachment | Content-Type | Size |
---|---|---|
0001-Remove-assertion-in-reorderbuffer-that-cmax-is-stabl.patch | text/x-diff | 3.8 KB |
From: | Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru> |
---|---|
To: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-02-04 16:15:14 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
On 31.01.2019 9:21, Arseny Sher wrote:
> My colleague Alexander Lakhin has noticed an assertion failure in
> reorderbuffer.c:1330. Here is a simple snippet reproducing it:
>
> SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');
>
> create table t(k int);
> begin;
> savepoint a;
> alter table t alter column k type text;
> rollback to savepoint a;
> alter table t alter column k type bigint;
> commit;
>
> SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
I just want to add, that I have accidentally discovered the same issue
during the testing of the Tomas's large transactions streaming patch
[1], and had to remove this assert to get things working. I thought that
it was somehow related to the streaming mode and did not test the same
query alone.
Regards
--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company
From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-02-05 21:11:19 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2019-Jan-31, Arseny Sher wrote:
> My colleague Alexander Lakhin has noticed an assertion failure in
> reorderbuffer.c:1330. Here is a simple snippet reproducing it:
>
> SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');
>
> create table t(k int);
> begin;
> savepoint a;
> alter table t alter column k type text;
> rollback to savepoint a;
> alter table t alter column k type bigint;
> commit;
>
> SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
Hmm, the new test introduced by your patch fails in early branches (at
least 9.4): the transaction is decoded like this:
! data
! -----------------------------------------------------
BEGIN
table public.tr_sub_ddl: INSERT: data[integer]:42
+ table public.pg_temp_16445: INSERT: data[bigint]:42
table public.tr_sub_ddl: INSERT: data[bigint]:43
COMMIT
! (5 rows)
note the additional pg_temp_XYZ row in the middle. This is caused by
the rewrite in ALTER TABLE. Peter E fixed that in Pg11 in commit
325f2ec55; I don't think there's much to do in the backbranches other
than hide the pesky record to avoid it breaking the test.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-02-06 09:21:27 |
Message-ID: | 8736p18e7c.fsf@ars-thinkpad |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> note the additional pg_temp_XYZ row in the middle. This is caused by
> the rewrite in ALTER TABLE. Peter E fixed that in Pg11 in commit
> 325f2ec55; I don't think there's much to do in the backbranches other
> than hide the pesky record to avoid it breaking the test.
Oh, I see. Let's just remove the first insertion then, as in attached.
I've tested it on master and on 9.4.
--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
0001-Remove-assertion-in-reorderbuffer-that-cmax-is-stabl.patch | text/x-diff | 3.7 KB |
From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-02-06 16:30:50 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2019-Feb-06, Arseny Sher wrote:
>
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>
> > note the additional pg_temp_XYZ row in the middle. This is caused by
> > the rewrite in ALTER TABLE. Peter E fixed that in Pg11 in commit
> > 325f2ec55; I don't think there's much to do in the backbranches other
> > than hide the pesky record to avoid it breaking the test.
>
> Oh, I see. Let's just remove the first insertion then, as in attached.
> I've tested it on master and on 9.4.
Ah, okay. Does the test still fail when run without the code fix?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-02-07 08:50:29 |
Message-ID: | 871s4k7zje.fsf@ars-thinkpad |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> On 2019-Feb-06, Arseny Sher wrote:
>
>>
>> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>>
>> > note the additional pg_temp_XYZ row in the middle. This is caused by
>> > the rewrite in ALTER TABLE. Peter E fixed that in Pg11 in commit
>> > 325f2ec55; I don't think there's much to do in the backbranches other
>> > than hide the pesky record to avoid it breaking the test.
>>
>> Oh, I see. Let's just remove the first insertion then, as in attached.
>> I've tested it on master and on 9.4.
>
> Ah, okay. Does the test still fail when run without the code fix?
Yes. The problem here is overriding cmax of catalog (pg_attribute in the
test) tuples, so it fails without any data at all.
--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-02-11 18:50:03 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2019-Feb-07, Arseny Sher wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > Ah, okay. Does the test still fail when run without the code fix?
>
> Yes. The problem here is overriding cmax of catalog (pg_attribute in the
> test) tuples, so it fails without any data at all.
Makes sense.
I thought the blanket removal of the assert() was excessive, and we can
relax it instead; what do you think of the attached?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Remove-assertion-in-reorderbuffer-that-cmax-is-stabl.patch | text/x-diff | 3.0 KB |
From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-02-12 13:53:43 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2019-Feb-11, Alvaro Herrera wrote:
> On 2019-Feb-07, Arseny Sher wrote:
>
> > Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>
> > > Ah, okay. Does the test still fail when run without the code fix?
> >
> > Yes. The problem here is overriding cmax of catalog (pg_attribute in the
> > test) tuples, so it fails without any data at all.
>
> Makes sense.
>
> I thought the blanket removal of the assert() was excessive, and we can
> relax it instead; what do you think of the attached?
More precisely, my question was: with this change, does the code still
work correctly in your non-toy case?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-02-12 16:10:55 |
Message-ID: | 87tvh9xa0g.fsf@ars-thinkpad |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> I thought the blanket removal of the assert() was excessive, and we can
>> relax it instead; what do you think of the attached?
>
> More precisely, my question was: with this change, does the code still
> work correctly in your non-toy case?
Yes, it works. I thought for a moment that some obscure cases where cmax
on a single tuple is not strictly monotonic might exist, but looks like
they don't. So your change is ok for me, reshaping assert is better than
removing. make check is also good on all supported branches.
--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-02-12 22:00:22 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2019-Feb-12, Arseny Sher wrote:
>
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>
> >> I thought the blanket removal of the assert() was excessive, and we can
> >> relax it instead; what do you think of the attached?
> >
> > More precisely, my question was: with this change, does the code still
> > work correctly in your non-toy case?
>
> Yes, it works. I thought for a moment that some obscure cases where cmax
> on a single tuple is not strictly monotonic might exist, but looks like
> they don't. So your change is ok for me, reshaping assert is better than
> removing. make check is also good on all supported branches.
Thanks for checking! I also run it on all branches, everything passes.
Pushed now.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-02-16 06:02:29 |
Message-ID: | 87r2c8xocq.fsf@ars-thinkpad |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Thanks for checking! I also run it on all branches, everything passes.
> Pushed now.
I'm sorry to bother you with this again, but due to new test our
internal buildfarm revealed that ajacent assert on cmin is also lie. You
see, we can't assume cmin is stable because the same key (relnode, tid)
might refer to completely different tuples if tuple was inserted by
aborted subxact, immeditaly reclaimed and then space occupied by another
one. Fix is attached.
Technically this might mean a user-facing bug, because we only pick the
first cmin which means we might get visibility wrong, allowing to see
some version too early (i.e real cmin of tuple is y, but decoding thinks
it is x, and x < y). However, I couldn't quickly make up an example
where this would actually lead to bad consequences. I tried to create
such extra visible row in pg_attribute, but that's ok because loop in
RelationBuildTupleDesc spins exactly natts times and ignores what is
left unscanned. It is also ok with pg_class, because apparently
ScanPgRelation also fishes out the (right) first tuple and doesn't check
for duplicates appearing later in the scan. Maybe I just haven't tried
hard enough though.
Attached 'aborted_subxact_test.patch' is an illustration of such wrong
cmin visibility on pg_attribute. It triggers assertion failure, but
otherwise ok (no user-facing issues), as I said earlier, so I am
disinclined to include it in the fix.
--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
pick_latest_cmin_in_reorderbuffer.patch | text/x-diff | 2.1 KB |
aborted_subxact_test.patch | text/x-diff | 2.5 KB |
From: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lakhin <a(dot)lakhin(at)postgrespro(dot)ru> |
Subject: | Re: Too rigorous assert in reorderbuffer.c |
Date: | 2019-12-19 16:30:10 |
Message-ID: | 87lfr8wa5p.fsf@ars-thinkpad |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> writes:
> I'm sorry to bother you with this again, but due to new test our
> internal buildfarm revealed that ajacent assert on cmin is also lie. You
> see, we can't assume cmin is stable because the same key (relnode, tid)
> might refer to completely different tuples if tuple was inserted by
> aborted subxact, immeditaly reclaimed and then space occupied by another
> one. Fix is attached.
>
> Technically this might mean a user-facing bug, because we only pick the
> first cmin which means we might get visibility wrong, allowing to see
> some version too early (i.e real cmin of tuple is y, but decoding thinks
> it is x, and x < y). However, I couldn't quickly make up an example
> where this would actually lead to bad consequences. I tried to create
> such extra visible row in pg_attribute, but that's ok because loop in
> RelationBuildTupleDesc spins exactly natts times and ignores what is
> left unscanned. It is also ok with pg_class, because apparently
> ScanPgRelation also fishes out the (right) first tuple and doesn't check
> for duplicates appearing later in the scan. Maybe I just haven't tried
> hard enough though.
This issue still exists, it would be nice to fix it...
--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company