Cross-type index comparison support in contrib/btree_gin

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Cross-type index comparison support in contrib/btree_gin
Date: 2025-02-02 01:44:12
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

We've had multiple requests for $SUBJECT over the years
([1][2][3][4][5], and I'm sure my archive search missed some).
I finally decided to look into what it'd take to make that happen.
It's not as bad as I feared, and attached is a draft patch.

The thing that makes this sticky is that GIN itself doesn't support
any such thing as cross-type comparisons: all the Datums that it deals
with directly have to be of the same type as the stored index keys.
However, for the cases that btree_gin deals with, we can make use of
the "partial match" feature because all the entries we need to find
will be consecutive in the index. And it turns out that the
comparePartial() method is only ever applied to compare the original
query value with an index entry, which means that internally to
comparePartial() we can apply the proper cross-type comparison
operator. Our GIN index documentation about comparePartial() doesn't
quite say that in so many words, but btree_gin was already relying on
it --- in a very confusing and ill-explained way, if you ask me, but
it was relying on it. (The 0001 patch below is mainly concerned with
making that reliance simpler and clearer.)

The other thing that has to be dealt with is that cross-type or not,
we need to somehow create a Datum of the index key type to perform
the initial index descent with. But I realized that this isn't
that tough after all. Aside from boring change-of-representation
work, there are these special cases:

* Query value is out of range for the index type. We can simply
clamp it to the index type's range, so that GIN descends to one
end of the index or the other and then searches normally. GIN
might falsely think that the endmost entry(s) of the index equal
the search datum, but it doesn't matter too much what GIN thinks
because comparePartial can filter away the false matches by
applying the correct comparison with the original query value.

* Query value falls between possible values of the index type
(possible in float8->float4 or timestamp->date cases, for example).
We can just use our usual conversion rules, though. The critical
observation here is that it does not matter whether the conversion
rounds to the next lower or next higher possible value. If we are
searching for equality, neither of those values will pass the
cross-type comparison so it doesn't matter. If we are searching for
inequality, for example "indcol <= value", then only index entries
strictly less than the query value can match. Rounding down clearly
doesn't hurt, while rounding up at worst makes the search include
some index entries just larger than the query value, which will be
correctly rejected by the cross-type comparison.

So basically all I had to do was write a bunch of non-error-throwing
conversion routines and set up some boilerplate infrastructure.
Patch series attached --- it's rather long, but a lot of it is
new test cases.

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/58782480-ab75-4416-a177-ccf91be288a9%40app.fastmail.com
[2] https://www.postgresql.org/message-id/flat/17079-c5edf57c47debc2c%40postgresql.org
[3] https://www.postgresql.org/message-id/flat/20170207150420.1409.58748%40wrigleys.postgresql.org
[4] https://www.postgresql.org/message-id/flat/20160415185902.22924.77993%40wrigleys.postgresql.org
[5] https://www.postgresql.org/message-id/flat/VisenaEmail.42.91df4628bdf7755c.1537e96e852%40tc7-visena

Attachment Content-Type Size
v1-0001-Preliminary-refactoring.patch text/x-diff 6.5 KB
v1-0002-Add-cross-type-comparisons-for-integer-types.patch text/x-diff 35.4 KB
v1-0003-Add-cross-type-comparisons-for-float-types.patch text/x-diff 11.6 KB
v1-0004-Add-cross-type-comparisons-for-string-types.patch text/x-diff 8.3 KB
v1-0005-Add-cross-type-comparisons-for-datetime-types.patch text/x-diff 35.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Cross-type index comparison support in contrib/btree_gin
Date: 2025-02-02 16:47:13
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> We've had multiple requests for $SUBJECT over the years
> ([1][2][3][4][5], and I'm sure my archive search missed some).
> I finally decided to look into what it'd take to make that happen.

I forgot to mention a couple of questions for review:

> ... it turns out that the
> comparePartial() method is only ever applied to compare the original
> query value with an index entry, which means that internally to
> comparePartial() we can apply the proper cross-type comparison
> operator. Our GIN index documentation about comparePartial() doesn't
> quite say that in so many words, but btree_gin was already relying on
> it --- in a very confusing and ill-explained way, if you ask me, but
> it was relying on it.

Should we adjust the documentation of comparePartial() to promise
explicitly that partial_key is the same datum returned by extractQuery?
By my reading, it kind of implies that, but it's not quite black and
white.

> So basically all I had to do was write a bunch of non-error-throwing
> conversion routines and set up some boilerplate infrastructure.

In the 0005 patch, I relied on date2timestamp_opt_overflow and
its siblings where available. But some of the conversions such
as timestamptz-to-timestamp don't have one of those, so I was
forced to copy-and-paste some fairly low-level code. Would it
make sense to refactor the related core routines to expose
xxx2yyy_opt_overflow interfaces, extending what 5bc450629 and
52ad1e659 did? I wouldn't think this worth doing just for
btree_gin's benefit, but if btree_gin needs it maybe some other
extensions could use it too.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Cross-type index comparison support in contrib/btree_gin
Date: 2025-02-07 22:41:32
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> I forgot to mention a couple of questions for review:

> Should we adjust the documentation of comparePartial() to promise
> explicitly that partial_key is the same datum returned by extractQuery?
> By my reading, it kind of implies that, but it's not quite black and
> white.

> In the 0005 patch, I relied on date2timestamp_opt_overflow and
> its siblings where available. But some of the conversions such
> as timestamptz-to-timestamp don't have one of those, so I was
> forced to copy-and-paste some fairly low-level code. Would it
> make sense to refactor the related core routines to expose
> xxx2yyy_opt_overflow interfaces, extending what 5bc450629 and
> 52ad1e659 did?

After further review it seems like both of those things would be
improvements, so here's a v2 that does it like that. This also
adds a PG_USED_FOR_ASSERTS_ONLY marker whose lack was pointed
out by the cfbot; no other meaningful changes.

regards, tom lane

Attachment Content-Type Size
v2-0001-Break-out-xxx2yyy_opt_overflow-APIs-for-more-date.patch text/x-diff 9.4 KB
v2-0002-Preliminary-refactoring.patch text/x-diff 7.6 KB
v2-0003-Add-cross-type-comparisons-for-integer-types.patch text/x-diff 35.6 KB
v2-0004-Add-cross-type-comparisons-for-float-types.patch text/x-diff 11.6 KB
v2-0005-Add-cross-type-comparisons-for-string-types.patch text/x-diff 8.3 KB
v2-0006-Add-cross-type-comparisons-for-datetime-types.patch text/x-diff 33.9 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Cross-type index comparison support in contrib/btree_gin
Date: 2025-03-28 01:22:19
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

v3 needed to rebase over 55527368b. (I guess "git am" cannot
tolerate any fuzz at all?) No changes of any significance
whatsoever.

regards, tom lane

Attachment Content-Type Size
v3-0001-Break-out-xxx2yyy_opt_overflow-APIs-for-more-date.patch text/x-diff 9.4 KB
v3-0002-Preliminary-refactoring.patch text/x-diff 7.6 KB
v3-0003-Add-cross-type-comparisons-for-integer-types.patch text/x-diff 35.6 KB
v3-0004-Add-cross-type-comparisons-for-float-types.patch text/x-diff 11.6 KB
v3-0005-Add-cross-type-comparisons-for-string-types.patch text/x-diff 8.4 KB
v3-0006-Add-cross-type-comparisons-for-datetime-types.patch text/x-diff 33.9 KB