Quick Links

Re: SQLFunctionCache and generic plans

Lists:	pgsql-hackers

From:	Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
To:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	SQLFunctionCache and generic plans
Date:	2023-02-07 14:55:34
Message-ID:	8216639.NyiUUSuA9g@aivenlaptop
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello,

It has been brought to my attention that SQL functions always use generic
plans.

Take this function for example:

create or replace function test_plpgsql(p1 oid) returns text as $$
BEGIN
RETURN (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1);
END;
$$ language plpgsql;

As expected, the PlanCache takes care of generating parameter specific plans,
and correctly prunes the redundant OR depending on wether we call the function
with a NULL value or not:

ro=# select test_plpgsql(NULL);
LOG: duration: 0.030 ms plan:
Query Text: (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT
1)
Result (cost=0.04..0.05 rows=1 width=64)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..0.04 rows=1 width=64)
-> Seq Scan on pg_class (cost=0.00..18.12 rows=412 width=64)
LOG: duration: 0.662 ms plan:
Query Text: select test_plpgsql(NULL);
Result (cost=0.00..0.26 rows=1 width=32)

ro=# select test_plpgsql(1);
LOG: duration: 0.075 ms plan:
Query Text: (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT
1)
Result (cost=8.29..8.30 rows=1 width=64)
InitPlan 1 (returns $0)
-> Limit (cost=0.27..8.29 rows=1 width=64)
-> Index Scan using pg_class_oid_index on pg_class
(cost=0.27..8.29 rows=1 width=64)
Index Cond: (oid = '1'::oid)
LOG: duration: 0.675 ms plan:
Query Text: select test_plpgsql(1);
Result (cost=0.00..0.26 rows=1 width=32)

But writing the same function in SQL:
create or replace function test_sql(p1 oid) returns text as $$
SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1
$$ language sql;

we end up with a generic plan:

ro=# select test_sql(1);
LOG: duration: 0.287 ms plan:
Query Text: SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1
Query Parameters: $1 = '1'
Limit (cost=0.00..6.39 rows=1 width=32)
-> Seq Scan on pg_class (cost=0.00..19.16 rows=3 width=32)
Filter: ((oid = $1) OR ($1 IS NULL))

This is due to the fact that SQL functions are planned once for the whole
query using a specific SQLFunctionCache instead of using the whole PlanCache
machinery.

The following comment can be found in functions.c, about the SQLFunctionCache:

* Note that currently this has only the lifespan of the calling query.
* Someday we should rewrite this code to use plancache.c to save parse/plan
* results for longer than that.

I would be interested in working on this, primarily to avoid this problem of
having generic query plans for SQL functions but maybe having a longer lived
cache as well would be nice to have.

Is there any reason not too, or pitfalls we would like to avoid ?

Best regards,

--
Ronan Dunklau

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: SQLFunctionCache and generic plans
Date:	2023-02-07 15:29:57
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> writes:
> The following comment can be found in functions.c, about the SQLFunctionCache:

> * Note that currently this has only the lifespan of the calling query.
> * Someday we should rewrite this code to use plancache.c to save parse/plan
> * results for longer than that.

> I would be interested in working on this, primarily to avoid this problem of
> having generic query plans for SQL functions but maybe having a longer lived
> cache as well would be nice to have.
> Is there any reason not too, or pitfalls we would like to avoid ?

AFAIR it's just lack of round tuits. There would probably be some
semantic side-effects, though if you pay attention you could likely
make things better while you are at it. The existing behavior of
parsing and planning all the statements at once is not very desirable
--- for instance, it doesn't work to do
CREATE TABLE foo AS ...;
SELECT * FROM foo;
I think if we're going to nuke this code and start over, we should
try to make that sort of case work.

regards, tom lane

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc:	Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2024-09-03 07:33:23
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane писал(а) 2023-02-07 18:29:
> Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> writes:
>> The following comment can be found in functions.c, about the
>> SQLFunctionCache:
>
>> * Note that currently this has only the lifespan of the calling
>> query.
>> * Someday we should rewrite this code to use plancache.c to save
>> parse/plan
>> * results for longer than that.
>
>> I would be interested in working on this, primarily to avoid this
>> problem of
>> having generic query plans for SQL functions but maybe having a longer
>> lived
>> cache as well would be nice to have.
>> Is there any reason not too, or pitfalls we would like to avoid ?
>
> AFAIR it's just lack of round tuits. There would probably be some
> semantic side-effects, though if you pay attention you could likely
> make things better while you are at it. The existing behavior of
> parsing and planning all the statements at once is not very desirable
> --- for instance, it doesn't work to do
> CREATE TABLE foo AS ...;
> SELECT * FROM foo;
> I think if we're going to nuke this code and start over, we should
> try to make that sort of case work.
>
> regards, tom lane

Hi.

I've tried to make SQL functions use CachedPlan machinery. The main goal
was to allow SQL functions to use custom plans
(the work was started from question - why sql function is so slow
compared to plpgsql one). It turned out that
plpgsql function used custom plan and eliminated scan of all irrelevant
sections, but
exec-time pruning didn't cope with pruning when ScalarArrayOpExpr,
filtering data using int[] parameter.

In current prototype there are two restrictions. The first one is that
CachecPlan has lifetime of a query - it's not
saved for future use, as we don't have something like plpgsql hashtable
for long live function storage. Second -
SQL language functions in sql_body form (with stored queryTree_list) are
handled in the old way, as we currently lack
tools to make cached plans from query trees.

Currently this change solves the issue of inefficient plans for queries
over partitioned tables. For example, function like

CREATE OR REPLACE FUNCTION public.test_get_records(ids integer[])
RETURNS SETOF test
LANGUAGE sql
AS $function$
select *
from test
where id = any (ids)
$function$;

for hash-distributed table test can perform pruning in plan time and
can have plan like

Append (cost=0.00..51.88 rows=26 width=36)
-> Seq Scan on test_0 test_1 (cost=0.00..25.88 rows=13
width=36)
Filter: (id = ANY ('{1,2}'::integer[]))
-> Seq Scan on test_2 (cost=0.00..25.88 rows=13 width=36)
Filter: (id = ANY ('{1,2}'::integer[]))

instead of

Append (cost=0.00..155.54 rows=248 width=36)
-> Seq Scan on test_0 test_1 (cost=0.00..38.58 rows=62
width=36)
Filter: (id = ANY ($1))
-> Seq Scan on test_1 test_2 (cost=0.00..38.58 rows=62
width=36)
Filter: (id = ANY ($1))
-> Seq Scan on test_2 test_3 (cost=0.00..38.58 rows=62
width=36)
Filter: (id = ANY ($1))
-> Seq Scan on test_3 test_4 (cost=0.00..38.58 rows=62
width=36)
Filter: (id = ANY ($1))

This patch definitely requires more work, and I share it to get some
early feedback.

What should we do with "pre-parsed" SQL functions (when prosrc is
empty)? How should we create cached plans when we don't have raw
parsetrees?
Currently we can create cached plans without raw parsetrees, but this
means that plan revalidation doesn't work, choose_custom_plan()
always returns false and we get generic plan. Perhaps, we need some form
of GetCachedPlan(), which ignores raw_parse_tree?
In this case how could we possibly cache plans for session lifetime
(like plpgsql language does) if we can't use cached revalidation
machinery?
I hope to get some hints to move further.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment	Content-Type	Size
0001-Use-custom-plan-machinery-for-SQL-functions.patch	text/x-diff	18.4 KB

From:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2024-09-20 12:06:27
Message-ID:	CAPpHfdue9RVJKo33jKwrCeEXec0BTTH7CTxVcyXGMbfMExaGLA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi, Alexander!

On Tue, Sep 3, 2024 at 10:33 AM Alexander Pyhalov
<a(dot)pyhalov(at)postgrespro(dot)ru> wrote:
> Tom Lane писал(а) 2023-02-07 18:29:
> > Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> writes:
> >> The following comment can be found in functions.c, about the
> >> SQLFunctionCache:
> >
> >> * Note that currently this has only the lifespan of the calling
> >> query.
> >> * Someday we should rewrite this code to use plancache.c to save
> >> parse/plan
> >> * results for longer than that.
> >
> >> I would be interested in working on this, primarily to avoid this
> >> problem of
> >> having generic query plans for SQL functions but maybe having a longer
> >> lived
> >> cache as well would be nice to have.
> >> Is there any reason not too, or pitfalls we would like to avoid ?
> >
> > AFAIR it's just lack of round tuits. There would probably be some
> > semantic side-effects, though if you pay attention you could likely
> > make things better while you are at it. The existing behavior of
> > parsing and planning all the statements at once is not very desirable
> > --- for instance, it doesn't work to do
> > CREATE TABLE foo AS ...;
> > SELECT * FROM foo;
> > I think if we're going to nuke this code and start over, we should
> > try to make that sort of case work.
> >
> > regards, tom lane
>
> Hi.
>
> I've tried to make SQL functions use CachedPlan machinery. The main goal
> was to allow SQL functions to use custom plans
> (the work was started from question - why sql function is so slow
> compared to plpgsql one). It turned out that
> plpgsql function used custom plan and eliminated scan of all irrelevant
> sections, but
> exec-time pruning didn't cope with pruning when ScalarArrayOpExpr,
> filtering data using int[] parameter.
>
> In current prototype there are two restrictions. The first one is that
> CachecPlan has lifetime of a query - it's not
> saved for future use, as we don't have something like plpgsql hashtable
> for long live function storage. Second -
> SQL language functions in sql_body form (with stored queryTree_list) are
> handled in the old way, as we currently lack
> tools to make cached plans from query trees.
>
> Currently this change solves the issue of inefficient plans for queries
> over partitioned tables. For example, function like
>
> CREATE OR REPLACE FUNCTION public.test_get_records(ids integer[])
> RETURNS SETOF test
> LANGUAGE sql
> AS $function$
> select *
> from test
> where id = any (ids)
> $function$;
>
> for hash-distributed table test can perform pruning in plan time and
> can have plan like
>
> Append (cost=0.00..51.88 rows=26 width=36)
> -> Seq Scan on test_0 test_1 (cost=0.00..25.88 rows=13
> width=36)
> Filter: (id = ANY ('{1,2}'::integer[]))
> -> Seq Scan on test_2 (cost=0.00..25.88 rows=13 width=36)
> Filter: (id = ANY ('{1,2}'::integer[]))
>
> instead of
>
> Append (cost=0.00..155.54 rows=248 width=36)
> -> Seq Scan on test_0 test_1 (cost=0.00..38.58 rows=62
> width=36)
> Filter: (id = ANY ($1))
> -> Seq Scan on test_1 test_2 (cost=0.00..38.58 rows=62
> width=36)
> Filter: (id = ANY ($1))
> -> Seq Scan on test_2 test_3 (cost=0.00..38.58 rows=62
> width=36)
> Filter: (id = ANY ($1))
> -> Seq Scan on test_3 test_4 (cost=0.00..38.58 rows=62
> width=36)
> Filter: (id = ANY ($1))
>
> This patch definitely requires more work, and I share it to get some
> early feedback.
>
> What should we do with "pre-parsed" SQL functions (when prosrc is
> empty)? How should we create cached plans when we don't have raw
> parsetrees?
> Currently we can create cached plans without raw parsetrees, but this
> means that plan revalidation doesn't work, choose_custom_plan()
> always returns false and we get generic plan. Perhaps, we need some form
> of GetCachedPlan(), which ignores raw_parse_tree?

I don't think you need a new form of GetCachedPlan(). Instead, it
seems that StmtPlanRequiresRevalidation() should be revised. As I got
from comments and the d8b2fcc9d4 commit message, the primary goal was
to skip revalidation of utility statements. Skipping revalidation was
a positive side effect, as long as we didn't support custom plans for
them anyway. But as you're going to change this,
StmtPlanRequiresRevalidation() needs to be revised.

I also think it's not necessary to implement long-lived plan cache in
the initial patch. The work could be split into two patches. The
first could implement query lifetime plan cache. This is beneficial
already by itself as you've shown by example. The second could
implement long-lived plan cache.

I appreciate your work in this direction. I hope you got the feedback
to go ahead and work on remaining issues.

------
Regards,
Alexander Korotkov
Supabase

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2024-10-01 15:12:40
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi.

>> What should we do with "pre-parsed" SQL functions (when prosrc is
>> empty)? How should we create cached plans when we don't have raw
>> parsetrees?
>> Currently we can create cached plans without raw parsetrees, but this
>> means that plan revalidation doesn't work, choose_custom_plan()
>> always returns false and we get generic plan. Perhaps, we need some
>> form
>> of GetCachedPlan(), which ignores raw_parse_tree?
>
> I don't think you need a new form of GetCachedPlan(). Instead, it
> seems that StmtPlanRequiresRevalidation() should be revised. As I got
> from comments and the d8b2fcc9d4 commit message, the primary goal was
> to skip revalidation of utility statements. Skipping revalidation was
> a positive side effect, as long as we didn't support custom plans for
> them anyway. But as you're going to change this,
> StmtPlanRequiresRevalidation() needs to be revised.
>

Thanks for feedback.

I've modifed StmtPlanRequiresRevalidation() so that it looks on queries
command type.
Not sure if it's enough or I have to recreate something more similar to
stmt_requires_parse_analysis()
logic. I was wondering if this can lead to triggering plan revalidation
in RevalidateCachedQuery().
I suppose not - as we plan in executor (so shouldn't catch userid change
or see some changes in
related objects. Revalidation would kill our plan, destroying
resultDesc.

Also while looking at this, fixed processing of instead of rules (which
would lead to NULL execution_state).
--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment	Content-Type	Size
v2-0001-Use-custom-plan-machinery-for-SQL-functions.patch	text/x-diff	25.5 KB

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2024-12-31 15:39:45
Message-ID:	CAFj8pRCoR9B+Ap70TQqWUa0PeeEk94D6VW20YbWN2Pm4b6-JMQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

út 31. 12. 2024 v 16:36 odesílatel Alexander Pyhalov <
a(dot)pyhalov(at)postgrespro(dot)ru> napsal:

> Hi.
>
> >> What should we do with "pre-parsed" SQL functions (when prosrc is
> >> empty)? How should we create cached plans when we don't have raw
> >> parsetrees?
> >> Currently we can create cached plans without raw parsetrees, but this
> >> means that plan revalidation doesn't work, choose_custom_plan()
> >> always returns false and we get generic plan. Perhaps, we need some
> >> form
> >> of GetCachedPlan(), which ignores raw_parse_tree?
> >
> > I don't think you need a new form of GetCachedPlan(). Instead, it
> > seems that StmtPlanRequiresRevalidation() should be revised. As I got
> > from comments and the d8b2fcc9d4 commit message, the primary goal was
> > to skip revalidation of utility statements. Skipping revalidation was
> > a positive side effect, as long as we didn't support custom plans for
> > them anyway. But as you're going to change this,
> > StmtPlanRequiresRevalidation() needs to be revised.
> >
>
> Thanks for feedback.
>
> I've modifed StmtPlanRequiresRevalidation() so that it looks on queries
> command type.
> Not sure if it's enough or I have to recreate something more similar to
> stmt_requires_parse_analysis()
> logic. I was wondering if this can lead to triggering plan revalidation
> in RevalidateCachedQuery().
> I suppose not - as we plan in executor (so shouldn't catch userid change
> or see some changes in
> related objects. Revalidation would kill our plan, destroying
> resultDesc.
>
> Also while looking at this, fixed processing of instead of rules (which
> would lead to NULL execution_state).
> --
>

there are lot of fails found by tester

Please, can you check it?

regards

Pavel

> Best regards,
> Alexander Pyhalov,
> Postgres Professional

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-01-17 07:15:34
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Pavel Stehule писал(а) 2024-12-31 18:39:
> Hi
>
> út 31. 12. 2024 v 16:36 odesílatel Alexander Pyhalov
> <a(dot)pyhalov(at)postgrespro(dot)ru> napsal:
>
>> Hi.
>>
>>>> What should we do with "pre-parsed" SQL functions (when prosrc is
>>>> empty)? How should we create cached plans when we don't have raw
>>>> parsetrees?
>>>> Currently we can create cached plans without raw parsetrees, but
>> this
>>>> means that plan revalidation doesn't work, choose_custom_plan()
>>>> always returns false and we get generic plan. Perhaps, we need
>> some
>>>> form
>>>> of GetCachedPlan(), which ignores raw_parse_tree?
>>>
>>> I don't think you need a new form of GetCachedPlan(). Instead, it
>>> seems that StmtPlanRequiresRevalidation() should be revised. As I
>> got
>>> from comments and the d8b2fcc9d4 commit message, the primary goal
>> was
>>> to skip revalidation of utility statements. Skipping revalidation
>> was
>>> a positive side effect, as long as we didn't support custom plans
>> for
>>> them anyway. But as you're going to change this,
>>> StmtPlanRequiresRevalidation() needs to be revised.
>>>
>>
>> Thanks for feedback.
>>
>> I've modifed StmtPlanRequiresRevalidation() so that it looks on
>> queries
>> command type.
>> Not sure if it's enough or I have to recreate something more similar
>> to
>> stmt_requires_parse_analysis()
>> logic. I was wondering if this can lead to triggering plan
>> revalidation
>> in RevalidateCachedQuery().
>> I suppose not - as we plan in executor (so shouldn't catch userid
>> change
>> or see some changes in
>> related objects. Revalidation would kill our plan, destroying
>> resultDesc.
>>
>> Also while looking at this, fixed processing of instead of rules
>> (which
>> would lead to NULL execution_state).
>> --
>
> there are lot of fails found by tester
>
> Please, can you check it?

Hi. Sorry for late response - we had holidays here and later there was
some urgent work from 2024 :)

Do you speak about failures on check-world?
It seems

commit a8ccf4e93a7eeaae66007bbf78cf9183ceb1b371
Author: Richard Guo <rguo(at)postgresql(dot)org>
Date: Tue Nov 26 09:25:18 2024 +0900

Reordering DISTINCT keys to match input path's pathkeys

added new GUC enable_distinct_reordering and this caused test failures.
I've rebased patch on master. Tests pass here.
--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment	Content-Type	Size
v3-0001-Use-custom-plan-machinery-for-SQL-functions.patch	text/x-diff	25.5 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-01-17 18:27:36
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> writes:
> I've rebased patch on master. Tests pass here.

The cfbot still doesn't like it; my guess is that you built without
--with-libxml and so didn't notice the effects on xml.out.

I've looked through the patch briefly and have a few thoughts:

* You cannot use plancache.c like this:

plansource = CreateCachedPlan(NULL, fcache->src, CreateCommandTag((Node *)parsetree));

CreateCachedPlan supposes that raw_parse_tree == NULL means an empty
query. Now aside from the fact that you didn't update the comment
documenting that, this won't work, because RevalidateCachedQuery
also knows that raw_parse_tree == NULL means an empty query. If an
invalidation happens while the function is running, revalidation
will produce an empty plan list, causing failure. (It seems like
a regression test case exercising such invalidation is called for.)

I think the only way you can really make this work is to have two
kinds of cached plans: the current kind where the starting point
is a RawStmt, and a new kind where the starting point is an
analyzed-but-not-rewritten Query tree (or maybe a list of those,
but if you can hold it to one tree things will be simpler).
I'd be inclined to mechanize that by adding a field named something
like "Query *analyzed_parse_tree" to CachedPlanSource and inventing a
new creation function that parallels CreateCachedPlan but takes an
analyzed Query instead of a RawStmt. Then you'll need to fix
RevalidateCachedQuery so that if analyzed_parse_tree isn't NULL then
it ignores raw_parse_tree and proceeds straight to rewriting the
already-analyzed Query. You'll need to check all the other places
that touch CachedPlanSource.raw_parse_tree, but I think all the other
required updates are pretty obvious. (Do not omit updating the
comments in plancache.h and plancache.c that describe all this.)

* I'm not very convinced by the new logic in
StmtPlanRequiresRevalidation. Is it really the case that "check for
CMD_UTILITY" is sufficient? Even if the coding is okay, the comment
needs work to provide a better explanation why it's okay.

* I'd say lose the enable_sql_func_custom_plans GUC. We don't have
anything comparable for plpgsql and there's been (relatively) little
complaint about that. Aside from adding clutter to the patch, it
contorts the logic in functions.c because it forces you to support
two code paths.

* The regression output changes like

-CONTEXT: SQL function "regexp_match" statement 1
+CONTEXT: SQL function "regexp_match" during startup

seem fairly unfortunate. Aside from making maintenance of the patch
painful, the changed messages provide strictly less information to the
user. I would try hard to eliminate that behavioral change. I think
you could do so, or at least greatly reduce the number of diffs, by
having init_sql_fcache perform only raw parsing (producing a list of
RawStmts, or extracting a list of Queries from prosqlbody) and
delaying creation of a plan for a particular statement until you first
reach execution of that statement. This would be a better way all
around because it'd also fix the anomalies I mentioned upthread for
cases where a DDL statement earlier in the function should affect
parse analysis of a later statement.

* release_plans seems kind of wrong: isn't it giving up most of
the benefit of doing this? plpgsql doesn't seem to release plans
unless it's forced to revalidate. Why should SQL functions
behave differently? Also, I'm pretty sure it's wrong that
you have
+ ReleaseCachedPlan(cplan, NULL);
with no ResourceOwner tracking the reference. That probably
means that the code leaks cached plans on error. For the
current patch iteration with the function's data structure only
meant to live as long as the query, it should be sufficient to
use CurrentResourceOwner to manage these cplan references.

* The patch is hard to read in functions.c because there's a mix of
code-motion and actual logic changes that are touching the same
places. Perhaps it'd be worth splitting it into a series of two
patches: the first one just does code motion such as pushing existing
logic into a new subroutine, and then the second one has the logic
changes. Or maybe that wouldn't help much, but consider giving it a
try.

regards, tom lane

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-01-29 14:35:52
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane писал(а) 2025-01-17 21:27:
> Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> writes:
>> I've rebased patch on master. Tests pass here.
>
> The cfbot still doesn't like it; my guess is that you built without
> --with-libxml and so didn't notice the effects on xml.out.

Hi. Thank you for review.

I've updated patch.

> I've looked through the patch briefly and have a few thoughts:
>
> * You cannot use plancache.c like this:
>
> plansource = CreateCachedPlan(NULL, fcache->src,
> CreateCommandTag((Node *)parsetree));
>
> CreateCachedPlan supposes that raw_parse_tree == NULL means an empty
> query. Now aside from the fact that you didn't update the comment
> documenting that, this won't work, because RevalidateCachedQuery
> also knows that raw_parse_tree == NULL means an empty query. If an
> invalidation happens while the function is running, revalidation
> will produce an empty plan list, causing failure. (It seems like
> a regression test case exercising such invalidation is called for.)

Yes, yes... I've spent some time, trying to reproduce it or to prove
that
plan revalidation is impossible (as I considered). I was wrong, plan
revaldiation is possible, and I've added some test cases. First of all,
we don't save our plans to CacheMemoryContext, so usual invalidation
callbacks don't touch our cached plans. However, I've managed
to reproduce it by changing row_security GUC in SQL function. So,
after the first round of creating plan and running them, on second call
we can see that plansource->rewriteRowSecurity mismatches row_security
and invalidate the plan. I've added Query to plansource, and rewrite it
in RevalidateCachedQuery(), as you suggested.

Interesting side effect - earlier if row_security was altered in sql
function,
this didn't have impact on further function execution (generic plan was
already built).
Now, if revalidation for generic plan is triggered, query ends up with
"ERROR: query would be affected by row-level security policy for
table".

>
> * I'm not very convinced by the new logic in
> StmtPlanRequiresRevalidation. Is it really the case that "check for
> CMD_UTILITY" is sufficient? Even if the coding is okay, the comment
> needs work to provide a better explanation why it's okay.

Updated logic to match stmt_requires_parse_analysis().

>
> * I'd say lose the enable_sql_func_custom_plans GUC. We don't have
> anything comparable for plpgsql and there's been (relatively) little
> complaint about that. Aside from adding clutter to the patch, it
> contorts the logic in functions.c because it forces you to support
> two code paths.

Removed it.

>
> * The regression output changes like
>
> -CONTEXT: SQL function "regexp_match" statement 1
> +CONTEXT: SQL function "regexp_match" during startup
>
> seem fairly unfortunate. Aside from making maintenance of the patch
> painful, the changed messages provide strictly less information to the
> user. I would try hard to eliminate that behavioral change. I think
> you could do so, or at least greatly reduce the number of diffs, by
> having init_sql_fcache perform only raw parsing (producing a list of
> RawStmts, or extracting a list of Queries from prosqlbody) and
> delaying creation of a plan for a particular statement until you first
> reach execution of that statement. This would be a better way all
> around because it'd also fix the anomalies I mentioned upthread for
> cases where a DDL statement earlier in the function should affect
> parse analysis of a later statement.

Yes, I've seen this remark. However, was a bit frightened to touch
all guts of functions.c processing. So, instead I've just recorded
a statement number of the currently-planning query and use it in
error messages. If you insist, I can try rewriting current
first-plan-all-then-run approach, but without good tests for
sql functions this looks error-prone.

>
> * release_plans seems kind of wrong: isn't it giving up most of
> the benefit of doing this? plpgsql doesn't seem to release plans
> unless it's forced to revalidate. Why should SQL functions
> behave differently?

The issue is that we potentially create custom plan for each set
of sql function arguments. And in queries like
SELECT f(field) FROM T1
this can consume much memory. The good thing here is that we don't care
much about it if execution switches to generic plans -
after some rows are selected and some custom plans are built.

Of course, there are bad cases. For example, we've seen that
some simple functions like

create function test(n int) returns text as $$
select string_agg(i::text, '') FROM generate_series(1,n) i
$$
language sql;

always use custom plan for small n numbers - it's the same, but cheaper
(as we can't predict correct row number without knowing function
arguments):

Aggregate (cost=0.15..0.16 rows=1 width=32)
-> Function Scan on generate_series i (cost=0.00..0.08 rows=8
width=4)

versus

Aggregate (cost=17.50..17.52 rows=1 width=32)
-> Function Scan on generate_series i (cost=0.00..10.00
rows=1000 width=4)

So if it appears somewhere deep in data generation scenarios, planning
time
can sufficiently rise (earlier generic plan was always used).

Turning back to release_plans(), it's needed, because there's
possibility
that for each set of arguments we create new custom plan. It doesn't
affect
generic plan, as plansource itself holds reference to it.

> Also, I'm pretty sure it's wrong that
> you have
> + ReleaseCachedPlan(cplan, NULL);
> with no ResourceOwner tracking the reference. That probably
> means that the code leaks cached plans on error. For the
> current patch iteration with the function's data structure only
> meant to live as long as the query, it should be sufficient to
> use CurrentResourceOwner to manage these cplan references.

Cached plans are not stored in long-lived context for now. So we can't
use CurrentResourceOwner,
ReleaseCachedPlan(cplan, CurrentResourceOwner) will die on
"Assert(plan->is_saved)".

>
> * The patch is hard to read in functions.c because there's a mix of
> code-motion and actual logic changes that are touching the same
> places. Perhaps it'd be worth splitting it into a series of two
> patches: the first one just does code motion such as pushing existing
> logic into a new subroutine, and then the second one has the logic
> changes. Or maybe that wouldn't help much, but consider giving it a
> try.

Moved splitting of check_planned_stmt() into separate patch.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment	Content-Type	Size
0002-Use-custom-plan-machinery-for-SQL-function.patch	text/x-diff	23.1 KB
0001-Split-out-SQL-functions-checks-from-init_execution_s.patch	text/x-diff	3.1 KB

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-01-30 08:50:35
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alexander Pyhalov писал(а) 2025-01-29 17:35:
> Tom Lane писал(а) 2025-01-17 21:27:
>> Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> writes:
>>> I've rebased patch on master. Tests pass here.
>>
>> The cfbot still doesn't like it; my guess is that you built without
>> --with-libxml and so didn't notice the effects on xml.out.
>
> Hi. Thank you for review.
>
> I've updated patch.

Sorry, missed one local patch to fix memory bloat during replaning. Also
fixed a compiler warning.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment	Content-Type	Size
v5-0002-Use-custom-plan-machinery-for-SQL-function.patch	text/x-diff	23.9 KB
v5-0001-Split-out-SQL-functions-checks-from-init_execution_s.patch	text/x-diff	3.1 KB

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-03 12:39:57
Message-ID:	CAFj8pRD2Fa84HoH_R_S5aKngyA6a+Au7U6NyidfBMrwEjFAc9g@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

čt 30. 1. 2025 v 9:50 odesílatel Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
napsal:

> Alexander Pyhalov писал(а) 2025-01-29 17:35:
> > Tom Lane писал(а) 2025-01-17 21:27:
> >> Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> writes:
> >>> I've rebased patch on master. Tests pass here.
> >>
> >> The cfbot still doesn't like it; my guess is that you built without
> >> --with-libxml and so didn't notice the effects on xml.out.
> >
> > Hi. Thank you for review.
> >
> > I've updated patch.
>
> Sorry, missed one local patch to fix memory bloat during replaning. Also
> fixed a compiler warning.
>

Did you do some performance checks?

I tried some worst case

CREATE OR REPLACE FUNCTION fx(int)
RETURNS int AS $$
SELECT $1 + $1
$$ LANGUAGE SQL IMMUTABLE;

CREATE OR REPLACE FUNCTION fx2(int)
RETURNS int AS $$
SELECT $1 * 2
$$ LANGUAGE SQL IMMUTABLE;

do $$
begin
for i in 1..1000000 loop
perform fx(i); -- or fx2
end loop;
end;
$$;
DO

The patched version reduces the difference between execution fx and fx2,
but patched version is about 10% slower than unpatched.

The overhead of plan cache looks significant for simple cases (and a lot of
SQL functions are very simple).

Regards

Pavel

>
> --
> Best regards,
> Alexander Pyhalov,
> Postgres Professional

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-03 16:00:48
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> Did you do some performance checks?

This is a good question to ask ...

> I tried some worst case

> CREATE OR REPLACE FUNCTION fx(int)
> RETURNS int AS $$
> SELECT $1 + $1
> $$ LANGUAGE SQL IMMUTABLE;

... but I don't think tests like this will give helpful answers.
That function is simple enough to be inlined:

regression=# explain verbose select fx(f1) from int4_tbl;
QUERY PLAN
---------------------------------------------------------------
Seq Scan on public.int4_tbl (cost=0.00..1.06 rows=5 width=4)
Output: (f1 + f1)
(2 rows)

So functions.c shouldn't have any involvement at all in the
actually-executed PERFORM expression, and whatever difference
you measured must have been noise. (If the effect *is* real,
we'd better find out why.)

You need to test with a non-inline-able function. Looking
at the inlining conditions in inline_function(), one simple
hack is to make the function return SETOF. That'll only
exercise the returns-set path in functions.c though, so it'd
be advisable to check other inline-blocking conditions too.

regards, tom lane

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-03 16:32:00
Message-ID:	CAFj8pRC_4gEfmDoB5KbacNDdt3e6Tnf1z1BFK=MVp3+2kcDVqw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

po 3. 2. 2025 v 17:00 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:

> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> > Did you do some performance checks?
>
> This is a good question to ask ...
>
> > I tried some worst case
>
> > CREATE OR REPLACE FUNCTION fx(int)
> > RETURNS int AS $$
> > SELECT $1 + $1
> > $$ LANGUAGE SQL IMMUTABLE;
>
> ... but I don't think tests like this will give helpful answers.
> That function is simple enough to be inlined:
>
> regression=# explain verbose select fx(f1) from int4_tbl;
> QUERY PLAN
> ---------------------------------------------------------------
> Seq Scan on public.int4_tbl (cost=0.00..1.06 rows=5 width=4)
> Output: (f1 + f1)
> (2 rows)
>
> So functions.c shouldn't have any involvement at all in the
> actually-executed PERFORM expression, and whatever difference
> you measured must have been noise. (If the effect *is* real,
> we'd better find out why.)
>
> You need to test with a non-inline-able function. Looking
> at the inlining conditions in inline_function(), one simple
> hack is to make the function return SETOF. That'll only
> exercise the returns-set path in functions.c though, so it'd
> be advisable to check other inline-blocking conditions too.
>

I am sorry. I was wrong - I tested inlining on different case

(2025-02-03 17:24:25) postgres=# explain analyze verbose select fx(i) from
generate_series(1,10) g(i);
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN
│
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Function Scan on pg_catalog.generate_series g (cost=0.00..0.13 rows=10
width=4) (actual time=0.016..0.018 rows=10 loops=1) │
│ Output: (i + i)
│
│ Function Call: generate_series(1, 10)
│
│ Planning:
│
│ Buffers: shared hit=11
│
│ Planning Time: 0.190 ms
│
│ Execution Time: 0.066 ms
│
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(7 rows)

(2025-02-03 17:25:06) postgres=# explain analyze verbose select
fx((random()*100)::int) from generate_series(1,10) g(i);
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN
│
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Function Scan on pg_catalog.generate_series g (cost=0.00..2.68 rows=10
width=4) (actual time=0.104..0.169 rows=10 loops=1) │
│ Output: fx(((random() * '100'::double precision))::integer)
│
│ Function Call: generate_series(1, 10)
│
│ Planning Time: 0.054 ms
│
│ Execution Time: 0.182 ms
│
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(5 rows)

I read https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions, and I
don't remember the rule `if an actual argument to the function call is a
volatile expression, then it must not be referenced in the body more than
once` well, so I didn't apply this rule correctly. I'll recheck this test.

Regards

Pavel

> regards, tom lane
>

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-03 17:55:48
Message-ID:	CAFj8pRA7MFTfUPwDDWtghaUvAjbrMHFUPbHyLSX3h8+FDw4WFg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I did multiple benchmarking, and still looks so the proposed patch doesn't
help and has significant overhead

testcase:

create or replace function fx(int) returns int as $$ select $1 + $1; $$
language sql immutable;
create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;

I tested

do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int); -- or fx2
end loop;
end;
$$;

Results (master, patched):
fx: 17067 ms, 22165 ms
fx2: 2234 ms, 2311 ms

the execution of dynamic sql

2025-02-03 18:47:33) postgres=# do $$
begin
for i in 1..1000000 loop
execute 'select $1 + $1' using (random()*100)::int;
end loop;
end;
$$;
DO
Time: 13412.990 ms (00:13.413)

In the profiler I see a significant overhead of the parser, so it looks
like there is some more (overhead), but plan cache is not used.

Please, can somebody recheck my tests?

Regards

Pavel

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-03 19:58:07
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> I did multiple benchmarking, and still looks so the proposed patch doesn't
> help and has significant overhead

Yeah, your fx() test case is clearly worse. For me,

HEAD:

regression=# do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int); -- or fx2
end loop;
end;
$$;
DO
Time: 5229.184 ms (00:05.229)

PATCH:

regression=# do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int); -- or fx2
end loop;
end;
$$;
DO
Time: 6934.413 ms (00:06.934)

Adding some debug printout shows me that BuildCachedPlan is called to
construct a custom plan on every single execution, which is presumably
because the patch doesn't make any attempt to carry plancache state
across successive executions of the same query. If we were saving that
state it would have soon switched to a generic plan and then won big.

So, even though I thought we could leave that for later, it seems like
maybe we have to have it before we'll have a committable patch.

There might be some residual inefficiency in there though. In the
unpatched code we'd be calling pg_parse_query and pg_plan_query once
per execution. You'd think that would cost more than BuildCachedPlan,
which can skip the raw-parsing part. Even more interesting, the
patch gets slower yet if we use a new-style SQL function:

regression=# create or replace function fx3 (int) returns int immutable
regression-# begin atomic select $1 + $1; end;
CREATE FUNCTION
Time: 0.813 ms
regression=# do $$
begin
for i in 1..1000000 loop
perform fx3((random()*100)::int); -- or fx2
end loop;
end;
$$;
DO
Time: 8007.062 ms (00:08.007)

That makes no sense either, because with a new-style SQL
function we should be skipping parse analysis as well.

But wait: HEAD takes
Time: 6632.709 ms (00:06.633)
to do the same thing. So somehow the new-style SQL function
stuff is very materially slower in this use-case, with or
without this patch. I do not understand why.

Definitely some performance investigation needs to be done here.
Even without cross-query plan caching, I don't see why the
patch isn't better than it is. It ought to be at least
competitive with the unpatched code.

(I've not read the v5 patch yet, so I have no theories.)

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-03 22:08:45
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I wrote:
> But wait: HEAD takes
> Time: 6632.709 ms (00:06.633)
> to do the same thing. So somehow the new-style SQL function
> stuff is very materially slower in this use-case, with or
> without this patch. I do not understand why.

"perf" tells me that in the fx3 test, a full third of the runtime
is spent inside stringToNode(), with about three-quarters of that
going into pg_strtok(). This makes no sense to me: we'll be reading
the prosqlbody of fx3(), sure, but that's not enormously long (about
1200 bytes). And pg_strtok() doesn't look remarkably slow. There's
no way this should be taking more time than raw parsing + parse
analysis, even for such a trivial query as "select $1 + $1".

There's been some talk of getting rid of our existing nodetree
storage format in favor of something more efficient. Maybe we
should put a higher priority on getting that done. But anyway,
that seems orthogonal to the current patch.

> Even without cross-query plan caching, I don't see why the
> patch isn't better than it is. It ought to be at least
> competitive with the unpatched code.

This remains true.

regards, tom lane

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-14 15:11:47
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi, folks.

I've looked through performance and found that most performance issues
was caused by CachedPlanSource machinery itself. At least, a lot of it.
And so decided to go along and implement plan cache for sql functions.
I'm not sure that it's clean enough, but at least it seems to be
working. While working at it I've found issues in
RevalidateCachedQuery() and fixed them. What have changed:
- now plans are released after execution;
- revalidation now takes locks on analyzed_tree;
- query tree is copied prior to analyzing in RevalidateCachedQuery();
- queryTree_list in SQLFunctionCache is no necessary and so has gone.

Now sql functions plans are actually saved. The most of it is a
simplified version of plpgsql plan cache. Perhaps, I've missed
something.
We have some cases when we don't save CachedPlanSource for future use.
One case is for trigger functions (pl_comp.c has some logic to handle
them specially, and I didn't want to introduce it so far). Another (more
interesting) issue is invalidation. SQL functions have a feature of
rewriting query when targetlist doesn't match function call context. I
haven't thought this through carefully during last patch version, but
luckily got some tests, which showed this behavior. When compiled with
RANDOMIZE_ALLOCATED_MEMORY, the following test case dumped core (because
after invalidation executor got "name" fields, but expected text):

create table t (nspname text, tname text);

CREATE OR REPLACE FUNCTION get_basic_attributes_from_pg_tables(
_schemaname name, _tablename name)
RETURNS TABLE(tname text, tablespace text, owner text)
LANGUAGE sql
AS $function$
SELECT
schemaname || '.' || tablename AS "full name",
tablespace AS "tablespace",
tableowner AS "tableowner"
FROM pg_tables
WHERE pg_tables.schemaname = _schemaname AND pg_tables.tablename =
_tablename
ORDER BY 1;
$function$;

create or replace function trg_func() RETURNS TRIGGER
AS
$$
declare
t record;
begin
FOR t IN (SELECT * FROM
get_basic_attributes_from_pg_tables(new.nspname, new.tname)) LOOP
RAISE WARNING '"% % %"', t.owner, t.tablespace, t.tname;
END LOOP;
RETURN NEW;
END
$$ LANGUAGE PLPGSQL;

create trigger t_ins_t AFTER INSERT ON t FOR EACH ROW EXECUTE FUNCTION
trg_func();

set debug_discard_caches to 1;
insert into t values('pg_catalog', 'pg_class');

It's harder to achieve this without permanent cache (harder to trigger
revalidation), but it's still possible.

What happened here is that during revalidation query plan was rebuilt,
but modifications to query tree, made by check_sql_fn_retval() , were
lost.

To fix this issue:
1) We avoid caching modified plans (and check_sql_fn_retval() now
reports if it touched a query tree);
2) For non-cached plans we still need a hack (callback) into
CachedPlanSource to rebuild query tree if invalidation happens. This
callback rebuilds query tree, using check_sql_fn_retval(). We sure that
callback parameters, specifying actual function return type, should not
be recalculated, as such plans can appear only during one function
execution and are not reused.
3) To prove that result type was not changed between plans execution, we
build plans with fixed_result = true.
4) When we get saved plan, prior to using it, we check that result tlist
matches the one built while planning function execution. Otherwise, we
recreate CachedPlanSource.

Well, it appeared more complicated than I've expected, but now it seems
simple SQL functions have much better performance.

create or replace function fx(int) returns int as $$ select $1 + $1; $$
language sql immutable;
create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;
create or replace function fx3 (int) returns int immutable begin atomic
select $1 + $1; end;
create or replace function fx4(int) returns numeric as $$ select $1 +
$1; $$ language sql immutable;

-- sql function
do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int);
end loop;
end;
$$;
Time: 3008.869 ms (00:03.009)

-- dynamic SQL
do
$$ begin
for i in 1..1000000 loop
execute 'select $1 + $1' using (random()*100)::int;
end loop;
end;
$$;
Time: 4915.295 ms (00:04.915)do $$

-- pre-parsed function
begin
for i in 1..1000000 loop
perform fx3((random()*100)::int);
end loop;
end;
$$;
Time: 2992.166 ms (00:02.992)

-- no plan caching due to need in fixing target list:
do $$
begin
for i in 1..1000000 loop
perform fx4((random()*100)::int);
end loop;
end;
$$;
Time: 11020.820 ms (00:11.021)

--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment	Content-Type	Size
v6-0004-Handle-SQL-functions-which-are-modified-between-rewr.patch	text/x-diff	14.3 KB
v6-0003-Introduce-SQL-functions-plan-cache.patch	text/x-diff	23.6 KB
v6-0002-Use-custom-plan-machinery-for-SQL-function.patch	text/x-diff	28.2 KB
v6-0001-Split-out-SQL-functions-checks-from-init_execution_s.patch	text/x-diff	3.1 KB

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-26 19:34:42
Message-ID:	CAFj8pRD-JcaGCktoWqGTt7TEb1zgENajVQwuhbHFfSMRH7+XPw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I can confirm 60% speedup for execution of function fx and fx3 - both
functions are very primitive, so for real code the benefit can be higher

Unfortunately, there is about 5% slowdown for inlined code, and for just
plpgsql code too.

I tested fx4

create or replace function fx4(int) returns int immutable as $$ begin
return $1 + $1; end $$ language plpgsql;

and fx2

create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;

and execution of patched code is about 5% slower. It is strange so this
patch has a negative impact on plpgsql execution.

Regards

Pavel

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-27 12:25:03
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Pavel Stehule писал(а) 2025-02-26 22:34:
> hI
>
> I can confirm 60% speedup for execution of function fx and fx3 - both
> functions are very primitive, so for real code the benefit can be
> higher
>
> Unfortunately, there is about 5% slowdown for inlined code, and for
> just plpgsql code too.
>
> I tested fx4
>
> create or replace function fx4(int) returns int immutable as $$ begin
> return $1 + $1; end $$ language plpgsql;
>
> and fx2
>
> create or replace function fx2(int) returns int as $$ select 2 * $1;
> $$
> language sql immutable;
>
> and execution of patched code is about 5% slower. It is strange so
> this patch has a negative impact on plpgsql execution.
>
> Regards
>
> Pavel

Hi. I've tried to reproduce slowdown and couldn't.

create or replace function fx4(int) returns int immutable as $$ begin
return $1 + $1; end $$ language plpgsql;

do $$
begin
for i in 1..5000000 loop
perform fx4((random()*100)::int); -- or fx2
end loop;
end;
$$;

OLD results:
Time: 8268.614 ms (00:08.269)
Time: 8178.836 ms (00:08.179)
Time: 8306.694 ms (00:08.307)

New (patched) results:
Time: 7743.945 ms (00:07.744)
Time: 7803.109 ms (00:07.803)
Time: 7736.735 ms (00:07.737)

Not sure why new is faster (perhaps, some noise?) - looking at perf
flamegraphs I don't see something evident.

create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;
do $$
begin
for i in 1..5000000 loop
perform fx2((random()*100)::int); -- or fx2
end loop;
end;
$$;

OLD results:
Time: 5346.471 ms (00:05.346)
Time: 5359.222 ms (00:05.359)
Time: 5316.747 ms (00:05.317)

New (patched) results:
Time: 5188.363 ms (00:05.188)
Time: 5225.322 ms (00:05.225)
Time: 5203.667 ms (00:05.204)

--
Best regards,
Alexander Pyhalov,
Postgres Professional

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-27 12:43:50
Message-ID:	CAFj8pRDKJktnr_5JLQTQi4Wh2uKU-cnzc-ojwwRqKfuQBDpuSw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

čt 27. 2. 2025 v 13:25 odesílatel Alexander Pyhalov <
a(dot)pyhalov(at)postgrespro(dot)ru> napsal:

> Pavel Stehule писал(а) 2025-02-26 22:34:
> > hI
> >
> > I can confirm 60% speedup for execution of function fx and fx3 - both
> > functions are very primitive, so for real code the benefit can be
> > higher
> >
> > Unfortunately, there is about 5% slowdown for inlined code, and for
> > just plpgsql code too.
> >
> > I tested fx4
> >
> > create or replace function fx4(int) returns int immutable as $$ begin
> > return $1 + $1; end $$ language plpgsql;
> >
> > and fx2
> >
> > create or replace function fx2(int) returns int as $$ select 2 * $1;
> > $$
> > language sql immutable;
> >
> > and execution of patched code is about 5% slower. It is strange so
> > this patch has a negative impact on plpgsql execution.
> >
> > Regards
> >
> > Pavel
>
> Hi. I've tried to reproduce slowdown and couldn't.
>
> create or replace function fx4(int) returns int immutable as $$ begin
> return $1 + $1; end $$ language plpgsql;
>
> do $$
> begin
> for i in 1..5000000 loop
> perform fx4((random()*100)::int); -- or fx2
> end loop;
> end;
> $$;
>
> OLD results:
> Time: 8268.614 ms (00:08.269)
> Time: 8178.836 ms (00:08.179)
> Time: 8306.694 ms (00:08.307)
>
> New (patched) results:
> Time: 7743.945 ms (00:07.744)
> Time: 7803.109 ms (00:07.803)
> Time: 7736.735 ms (00:07.737)
>
> Not sure why new is faster (perhaps, some noise?) - looking at perf
> flamegraphs I don't see something evident.
>
> create or replace function fx2(int) returns int as $$ select 2 * $1; $$
> language sql immutable;
> do $$
> begin
> for i in 1..5000000 loop
> perform fx2((random()*100)::int); -- or fx2
> end loop;
> end;
> $$;
>
> OLD results:
> Time: 5346.471 ms (00:05.346)
> Time: 5359.222 ms (00:05.359)
> Time: 5316.747 ms (00:05.317)
>
> New (patched) results:
> Time: 5188.363 ms (00:05.188)
> Time: 5225.322 ms (00:05.225)
> Time: 5203.667 ms (00:05.204)
>

I'll try to get profiles.

Regards

Pavel

>
> --
> Best regards,
> Alexander Pyhalov,
> Postgres Professional
>

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-27 19:52:40
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> čt 27. 2. 2025 v 13:25 odesílatel Alexander Pyhalov <
> a(dot)pyhalov(at)postgrespro(dot)ru> napsal:
>>> Unfortunately, there is about 5% slowdown for inlined code, and for
>>> just plpgsql code too.

>> Hi. I've tried to reproduce slowdown and couldn't.

> I'll try to get profiles.

I tried to reproduce this too. What I got on my usual development
workstation (RHEL8/gcc 8.5.0 on x86_64) was:

fx2 example: v6 patch about 2.4% slower than HEAD
fx4 example: v6 patch about 7.3% slower than HEAD

I was quite concerned after that result, but then I tried it on
another machine (macOS/clang 16.0.0 on Apple M1) and got:

fx2 example: v6 patch about 0.2% slower than HEAD
fx4 example: v6 patch about 0.7% faster than HEAD

(These are average-of-three-runs tests on --disable-cassert
builds; I trust you guys were not doing performance tests on
assert-enabled builds?)

So taken together, our results are all over the map, anywhere
from 7% speedup to 7% slowdown. My usual rule of thumb is that
you can see up to 2% variation in this kind of microbenchmark even
when "nothing has changed", just due to random build details like
whether critical loops cross a cacheline or not. 7% is pretty
well above that threshold, but maybe it's just random build
variation anyway.

Furthermore, since neither example involves functions.c at all
(fx2 would be inlined, and fx4 isn't SQL-language), it's hard
to see how the patch would directly affect either example unless
it were adding overhead to plancache.c. And I don't see any
changes there that would amount to meaningful overhead for the
existing use-case with a raw parse tree.

So right at the moment I'm inclined to write this off as
measurement noise. Perhaps it'd be worth checking a few
more platforms, though.

regards, tom lane

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-27 20:01:50
Message-ID:	CAFj8pRCGGsEhomePYc_xQTw-kRBqVxgAA0EkQPePFaBOH6AOFw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

čt 27. 2. 2025 v 20:52 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:

> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> > čt 27. 2. 2025 v 13:25 odesílatel Alexander Pyhalov <
> > a(dot)pyhalov(at)postgrespro(dot)ru> napsal:
> >>> Unfortunately, there is about 5% slowdown for inlined code, and for
> >>> just plpgsql code too.
>
> >> Hi. I've tried to reproduce slowdown and couldn't.
>
> > I'll try to get profiles.
>
> I tried to reproduce this too. What I got on my usual development
> workstation (RHEL8/gcc 8.5.0 on x86_64) was:
>
> fx2 example: v6 patch about 2.4% slower than HEAD
> fx4 example: v6 patch about 7.3% slower than HEAD
>
> I was quite concerned after that result, but then I tried it on
> another machine (macOS/clang 16.0.0 on Apple M1) and got:
>
> fx2 example: v6 patch about 0.2% slower than HEAD
> fx4 example: v6 patch about 0.7% faster than HEAD
>
> (These are average-of-three-runs tests on --disable-cassert
> builds; I trust you guys were not doing performance tests on
> assert-enabled builds?)
>
> So taken together, our results are all over the map, anywhere
> from 7% speedup to 7% slowdown. My usual rule of thumb is that
>

Where do you see 7% speedup? Few lines up you wrote 0.7% faster.

you can see up to 2% variation in this kind of microbenchmark even
> when "nothing has changed", just due to random build details like
> whether critical loops cross a cacheline or not. 7% is pretty
> well above that threshold, but maybe it's just random build
> variation anyway.
>
> Furthermore, since neither example involves functions.c at all
> (fx2 would be inlined, and fx4 isn't SQL-language), it's hard
> to see how the patch would directly affect either example unless
> it were adding overhead to plancache.c. And I don't see any
> changes there that would amount to meaningful overhead for the
> existing use-case with a raw parse tree.
>
> So right at the moment I'm inclined to write this off as
> measurement noise. Perhaps it'd be worth checking a few
> more platforms, though.
>
> regards, tom lane
>

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-27 20:40:57
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> writes:
> Now sql functions plans are actually saved. The most of it is a
> simplified version of plpgsql plan cache. Perhaps, I've missed
> something.

A couple of thoughts about v6:

* I don't think it's okay to just summarily do this:

/* It's stale; unlink and delete */
fcinfo->flinfo->fn_extra = NULL;
MemoryContextDelete(fcache->fcontext);
fcache = NULL;

when fmgr_sql sees that the cache is stale. If we're
doing a nested call of a recursive SQL function, this'd be
cutting the legs out from under the outer recursion level.
plpgsql goes to some lengths to do reference-counting of
function cache entries, and I think you need the same here.

* I don't like much of anything about 0004. It's messy and it
gives up all the benefit of plan caching in some pretty-common
cases (anywhere where the user was sloppy about what data type
is being returned). I wonder if we couldn't solve that with
more finesse by applying check_sql_fn_retval() to the query tree
after parse analysis, but before we hand it to plancache.c?
This is different from what happens now because we'd be applying
it before not after query rewrite, but I don't think that
query rewrite ever changes the targetlist results. Another
point is that the resultTargetList output might change subtly,
but I don't think we care there either: I believe we only examine
that output for its result data types and resjunk markers.
(This is nonobvious and inadequately documented, but for sure we
couldn't try to execute that tlist --- it's never passed through
the planner.)

* One diff that caught my eye was

- if (!ActiveSnapshotSet() &&
- plansource->raw_parse_tree &&
- analyze_requires_snapshot(plansource->raw_parse_tree))
+ if (!ActiveSnapshotSet() && StmtPlanRequiresRevalidation(plansource))

Because StmtPlanRequiresRevalidation uses
stmt_requires_parse_analysis, this is basically throwing away the
distinction between stmt_requires_parse_analysis and
analyze_requires_snapshot. I do not think that's okay, for the
reasons explained in analyze_requires_snapshot. We should expend the
additional notational overhead to keep those things separate.

* I'm also not thrilled by teaching StmtPlanRequiresRevalidation
how to do something equivalent to stmt_requires_parse_analysis for
Query trees. The entire point of the current division of labor is
for it *not* to know that, but keep the knowledge of the properties
of different statement types in parser/analyze.c. So it looks to me
like we need to add a function to parser/analyze.c that does this.
Not quite sure what to call it though.
querytree_requires_parse_analysis() would be a weird name, because
if it's a Query tree then it's already been through parse analysis.
Maybe querytree_requires_revalidation()?

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-27 20:45:33
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> čt 27. 2. 2025 v 20:52 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:
>> So taken together, our results are all over the map, anywhere
>> from 7% speedup to 7% slowdown. My usual rule of thumb is that

> Where do you see 7% speedup? Few lines up you wrote 0.7% faster.

Alexander got that on the fx4 case, according to his response a
few messages ago [1]. It'd be good if someone else could reproduce
that, because right now we have two "it's slower" results versus
only one "it's faster".

regards, tom lane

[1] https://www.postgresql.org/message-id/e5724d1ba8398c7ff20ead1de73b4db4%40postgrespro.ru

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-02-28 06:29:49
Message-ID:	CAFj8pRCTwLOa28fEJMOQOwCbNEouprsXuucZzGyddbNNwq8YPA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

čt 27. 2. 2025 v 21:45 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:

> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> > čt 27. 2. 2025 v 20:52 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:
> >> So taken together, our results are all over the map, anywhere
> >> from 7% speedup to 7% slowdown. My usual rule of thumb is that
>
> > Where do you see 7% speedup? Few lines up you wrote 0.7% faster.
>
> Alexander got that on the fx4 case, according to his response a
> few messages ago [1]. It'd be good if someone else could reproduce
> that, because right now we have two "it's slower" results versus
> only one "it's faster".
>

here is a profile from master

6.98% postgres postgres [.] hash_bytes
6.30% postgres postgres [.] palloc0
3.57% postgres postgres [.] SearchCatCacheInternal
3.29% postgres postgres [.] AllocSetAlloc
2.65% postgres plpgsql.so [.] exec_stmts
2.55% postgres postgres [.] expression_tree_walker_impl
2.34% postgres postgres [.] _SPI_execute_plan
2.13% postgres postgres [.] CheckExprStillValid
2.02% postgres postgres [.] fmgr_info_cxt_security
1.89% postgres postgres [.] ExecInitFunc
1.51% postgres postgres [.] ExecInterpExpr
1.48% postgres postgres [.] ResourceOwnerForget
1.44% postgres postgres [.] AllocSetReset
1.35% postgres postgres [.] MemoryContextCreate
1.30% postgres plpgsql.so [.] plpgsql_exec_function
1.29% postgres libc.so.6 [.] __memcmp_sse2
1.24% postgres postgres [.] MemoryContextDelete
1.13% postgres postgres [.] check_stack_depth
1.11% postgres postgres [.] AllocSetContextCreateInternal
1.09% postgres postgres [.] resolve_polymorphic_argtypes
1.08% postgres postgres [.] hash_search_with_hash_value
1.07% postgres postgres [.] standard_ExecutorStart
1.07% postgres postgres [.] ExprEvalPushStep
1.04% postgres postgres [.] ExecInitExprRec
0.95% postgres plpgsql.so [.] plpgsql_estate_setup
0.91% postgres postgres [.] ExecReadyInterpretedExp

and from patched

7.08% postgres postgres [.] hash_bytes
6.25% postgres postgres [.] palloc0
3.52% postgres postgres [.] SearchCatCacheInternal
3.30% postgres postgres [.] AllocSetAlloc
2.39% postgres postgres [.] expression_tree_walker_impl
2.37% postgres plpgsql.so [.] exec_stmts
2.15% postgres postgres [.] _SPI_execute_plan
2.10% postgres postgres [.] CheckExprStillValid
1.94% postgres postgres [.] fmgr_info_cxt_security
1.71% postgres postgres [.] ExecInitFunc
1.41% postgres postgres [.] AllocSetReset
1.40% postgres postgres [.] ExecInterpExpr
1.38% postgres postgres [.] ExprEvalPushStep
1.34% postgres postgres [.] ResourceOwnerForget
1.31% postgres postgres [.] MemoryContextDelete
1.24% postgres libc.so.6 [.] __memcmp_sse2
1.21% postgres postgres [.] MemoryContextCreate
1.18% postgres postgres [.] AllocSetContextCreateInternal
1.17% postgres postgres [.] hash_search_with_hash_value
1.13% postgres postgres [.] resolve_polymorphic_argtypes
1.13% postgres plpgsql.so [.] plpgsql_exec_function
1.03% postgres postgres [.] standard_ExecutorStart
0.98% postgres postgres [.] ExecInitExprRec
0.96% postgres postgres [.] check_stack_depth

looks so there is only one significant differences

ExprEvalPushStep 1.07 x 1.38%

Regards

Pavel

compiled without assertions on gcc 15 with 02

vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
stepping : 7
microcode : 0x2f
cpu MHz : 2691.102
cache size : 6144 KB

> regards, tom lane
>
> [1]
> https://www.postgresql.org/message-id/e5724d1ba8398c7ff20ead1de73b4db4%40postgrespro.ru
>

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-01 08:18:13
Message-ID:	CAFj8pRA8jDcHLi6gcpJpMTfyX9chEfoUydU1taxpAL08+=VAjw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

pá 28. 2. 2025 v 7:29 odesílatel Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
napsal:

> Hi
>
> čt 27. 2. 2025 v 21:45 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:
>
>> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> > čt 27. 2. 2025 v 20:52 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:
>> >> So taken together, our results are all over the map, anywhere
>> >> from 7% speedup to 7% slowdown. My usual rule of thumb is that
>>
>> > Where do you see 7% speedup? Few lines up you wrote 0.7% faster.
>>
>> Alexander got that on the fx4 case, according to his response a
>> few messages ago [1]. It'd be good if someone else could reproduce
>> that, because right now we have two "it's slower" results versus
>> only one "it's faster".
>>
>
> ok
>
> here is a profile from master
>
> 6.98% postgres postgres [.] hash_bytes
> 6.30% postgres postgres [.] palloc0
> 3.57% postgres postgres [.] SearchCatCacheInternal
> 3.29% postgres postgres [.] AllocSetAlloc
> 2.65% postgres plpgsql.so [.] exec_stmts
> 2.55% postgres postgres [.] expression_tree_walker_impl
> 2.34% postgres postgres [.] _SPI_execute_plan
> 2.13% postgres postgres [.] CheckExprStillValid
> 2.02% postgres postgres [.] fmgr_info_cxt_security
> 1.89% postgres postgres [.] ExecInitFunc
> 1.51% postgres postgres [.] ExecInterpExpr
> 1.48% postgres postgres [.] ResourceOwnerForget
> 1.44% postgres postgres [.] AllocSetReset
> 1.35% postgres postgres [.] MemoryContextCreate
> 1.30% postgres plpgsql.so [.] plpgsql_exec_function
> 1.29% postgres libc.so.6 [.] __memcmp_sse2
> 1.24% postgres postgres [.] MemoryContextDelete
> 1.13% postgres postgres [.] check_stack_depth
> 1.11% postgres postgres [.] AllocSetContextCreateInternal
> 1.09% postgres postgres [.] resolve_polymorphic_argtypes
> 1.08% postgres postgres [.] hash_search_with_hash_value
> 1.07% postgres postgres [.] standard_ExecutorStart
> 1.07% postgres postgres [.] ExprEvalPushStep
> 1.04% postgres postgres [.] ExecInitExprRec
> 0.95% postgres plpgsql.so [.] plpgsql_estate_setup
> 0.91% postgres postgres [.] ExecReadyInterpretedExp
>
> and from patched
>
> 7.08% postgres postgres [.] hash_bytes
> 6.25% postgres postgres [.] palloc0
> 3.52% postgres postgres [.] SearchCatCacheInternal
> 3.30% postgres postgres [.] AllocSetAlloc
> 2.39% postgres postgres [.] expression_tree_walker_impl
> 2.37% postgres plpgsql.so [.] exec_stmts
> 2.15% postgres postgres [.] _SPI_execute_plan
> 2.10% postgres postgres [.] CheckExprStillValid
> 1.94% postgres postgres [.] fmgr_info_cxt_security
> 1.71% postgres postgres [.] ExecInitFunc
> 1.41% postgres postgres [.] AllocSetReset
> 1.40% postgres postgres [.] ExecInterpExpr
> 1.38% postgres postgres [.] ExprEvalPushStep
> 1.34% postgres postgres [.] ResourceOwnerForget
> 1.31% postgres postgres [.] MemoryContextDelete
> 1.24% postgres libc.so.6 [.] __memcmp_sse2
> 1.21% postgres postgres [.] MemoryContextCreate
> 1.18% postgres postgres [.] AllocSetContextCreateInternal
> 1.17% postgres postgres [.] hash_search_with_hash_value
> 1.13% postgres postgres [.] resolve_polymorphic_argtypes
> 1.13% postgres plpgsql.so [.] plpgsql_exec_function
> 1.03% postgres postgres [.] standard_ExecutorStart
> 0.98% postgres postgres [.] ExecInitExprRec
> 0.96% postgres postgres [.] check_stack_depth
>
> looks so there is only one significant differences
>
> ExprEvalPushStep 1.07 x 1.38%
>
> Regards
>
> Pavel
>
> compiled without assertions on gcc 15 with 02
>
> vendor_id : GenuineIntel
> cpu family : 6
> model : 42
> model name : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
> stepping : 7
> microcode : 0x2f
> cpu MHz : 2691.102
> cache size : 6144 KB
>
>
I tested the patches on another notebook with more recent cpu

vendor_id : GenuineIntel
cpu family : 6
model : 154
model name : 12th Gen Intel(R) Core(TM) i7-12700H
stepping : 3
microcode : 0x436
cpu MHz : 400.000
cache size : 24576 KB

And the difference are smaller - about 3%

Regards

Pavel

>
>
>
>> regards, tom lane
>>
>> [1]
>> https://www.postgresql.org/message-id/e5724d1ba8398c7ff20ead1de73b4db4%40postgrespro.ru
>>
>

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-06 08:57:07
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi.

Tom Lane писал(а) 2025-02-27 23:40:
> Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> writes:
>> Now sql functions plans are actually saved. The most of it is a
>> simplified version of plpgsql plan cache. Perhaps, I've missed
>> something.
>
> A couple of thoughts about v6:
>
> * I don't think it's okay to just summarily do this:
>
> /* It's stale; unlink and delete */
> fcinfo->flinfo->fn_extra = NULL;
> MemoryContextDelete(fcache->fcontext);
> fcache = NULL;
>
> when fmgr_sql sees that the cache is stale. If we're
> doing a nested call of a recursive SQL function, this'd be
> cutting the legs out from under the outer recursion level.
> plpgsql goes to some lengths to do reference-counting of
> function cache entries, and I think you need the same here.
>

I've looked for original bug report 7881 (
https://www.postgresql.org/message-id/E1U5ytP-0006E3-KB%40wrigleys.postgresql.org
).
It's interesting, but it seems that plan cache is not affected by it as
when fcinfo xid mismatches we destroy fcinfo, not plan itself (cached
plan survives and still can be used).

I thought about another case - when recursive function is invalidated
during its execution. But I haven't found such case. If function is
modified during function execution in another backend, the original
backend uses old snapshot during function execution and still sees the
old function definition. Looked at the following case -

create or replace function f (int) returns setof int language sql as $$
select i from t where t.j = $1
union all
select f(i) from t where t.j = $1
$$;

and changed function definition to

create or replace function f (int) returns setof int language sql as $$
select i from t where t.j = $1 and i > 0
union all
select f(i) from t where t.j = $1
$$;

during execution of the function. As expected, backend still sees the
old definition and uses cached plan.

> * I don't like much of anything about 0004. It's messy and it
> gives up all the benefit of plan caching in some pretty-common
> cases (anywhere where the user was sloppy about what data type
> is being returned). I wonder if we couldn't solve that with
> more finesse by applying check_sql_fn_retval() to the query tree
> after parse analysis, but before we hand it to plancache.c?
> This is different from what happens now because we'd be applying
> it before not after query rewrite, but I don't think that
> query rewrite ever changes the targetlist results. Another
> point is that the resultTargetList output might change subtly,
> but I don't think we care there either: I believe we only examine
> that output for its result data types and resjunk markers.
> (This is nonobvious and inadequately documented, but for sure we
> couldn't try to execute that tlist --- it's never passed through
> the planner.)

I'm also not fond of the last patch. So tried to fix it in a way you've
suggested - we call check_sql_fn_retval() before creating cached plans.
It fixes issue with revalidation happening after modifying query
results.

One test now changes behavior. What's happening is that after moving
extension to another schema, cached plan is invalidated. But
revalidation
happens and rebuilds the plan. As we've saved analyzed parse tree, not
the raw one, we refer to public.dep_req2() not by name, but by oid. Oid
didn't change, so cached plan is rebuilt and used. Don't know, should we
fix it and if should, how.
>
> * One diff that caught my eye was
>
> - if (!ActiveSnapshotSet() &&
> - plansource->raw_parse_tree &&
> - analyze_requires_snapshot(plansource->raw_parse_tree))
> + if (!ActiveSnapshotSet() && StmtPlanRequiresRevalidation(plansource))
>
> Because StmtPlanRequiresRevalidation uses
> stmt_requires_parse_analysis, this is basically throwing away the
> distinction between stmt_requires_parse_analysis and
> analyze_requires_snapshot. I do not think that's okay, for the
> reasons explained in analyze_requires_snapshot. We should expend the
> additional notational overhead to keep those things separate.
>
> * I'm also not thrilled by teaching StmtPlanRequiresRevalidation
> how to do something equivalent to stmt_requires_parse_analysis for
> Query trees. The entire point of the current division of labor is
> for it *not* to know that, but keep the knowledge of the properties
> of different statement types in parser/analyze.c. So it looks to me
> like we need to add a function to parser/analyze.c that does this.
> Not quite sure what to call it though.
> querytree_requires_parse_analysis() would be a weird name, because
> if it's a Query tree then it's already been through parse analysis.
> Maybe querytree_requires_revalidation()?

I've created querytree_requires_revalidation() in analyze.c and used it
in both
StmtPlanRequiresRevalidation() and BuildingPlanRequiresSnapshot(). Both
are essentially the same,
but this allows to preserve the distinction between
stmt_requires_parse_analysis and
analyze_requires_snapshot.

I've checked old performance results:

create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;
create or replace function fx3 (int) returns int immutable begin atomic
select $1 + $1; end;
create or replace function fx4(int) returns numeric as $$ select $1 +
$1; $$ language sql immutable;

postgres=# do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int);
end loop;
end;
$$;
DO
Time: 2896.729 ms (00:02.897)

postgres=# do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int);
end loop;
end;
$$;
DO
Time: 2943.926 ms (00:02.944)

postgres=# do $$ begin
for i in 1..1000000 loop
perform fx3((random()*100)::int);
end loop;
end;
$$;
DO
Time: 2941.629 ms (00:02.942)

postgres=# do $$
begin
for i in 1..1000000 loop
perform fx4((random()*100)::int);
end loop;
end;
$$;
DO
Time: 2952.383 ms (00:02.952)

Here we see the only distinction - fx4() is now also fast, as we use
cached plan for it.
--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment	Content-Type	Size
0004-Handle-SQL-functions-which-result-type-is-adjuisted.patch	text/x-diff	6.7 KB
0003-Introduce-SQL-functions-plan-cache.patch	text/x-diff	23.6 KB
0002-Use-custom-plan-machinery-for-SQL-function.patch	text/x-diff	30.6 KB
0001-Split-out-SQL-functions-checks-from-init_execution_s.patch	text/x-diff	3.1 KB

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-10 06:58:43
Message-ID:	CAFj8pRDWDeF2cC+pCjLHJno7KnK5kdtjYN-f933RHS7UneArFw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

čt 6. 3. 2025 v 9:57 odesílatel Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
napsal:

> Hi.
>
> Tom Lane писал(а) 2025-02-27 23:40:
> > Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> writes:
> >> Now sql functions plans are actually saved. The most of it is a
> >> simplified version of plpgsql plan cache. Perhaps, I've missed
> >> something.
> >
> > A couple of thoughts about v6:
> >
> > * I don't think it's okay to just summarily do this:
> >
> > /* It's stale; unlink and delete */
> > fcinfo->flinfo->fn_extra = NULL;
> > MemoryContextDelete(fcache->fcontext);
> > fcache = NULL;
> >
> > when fmgr_sql sees that the cache is stale. If we're
> > doing a nested call of a recursive SQL function, this'd be
> > cutting the legs out from under the outer recursion level.
> > plpgsql goes to some lengths to do reference-counting of
> > function cache entries, and I think you need the same here.
> >
>
> I've looked for original bug report 7881 (
>
> https://www.postgresql.org/message-id/E1U5ytP-0006E3-KB%40wrigleys.postgresql.org
> ).
> It's interesting, but it seems that plan cache is not affected by it as
> when fcinfo xid mismatches we destroy fcinfo, not plan itself (cached
> plan survives and still can be used).
>
> I thought about another case - when recursive function is invalidated
> during its execution. But I haven't found such case. If function is
> modified during function execution in another backend, the original
> backend uses old snapshot during function execution and still sees the
> old function definition. Looked at the following case -
>
> create or replace function f (int) returns setof int language sql as $$
> select i from t where t.j = $1
> union all
> select f(i) from t where t.j = $1
> $$;
>
> and changed function definition to
>
> create or replace function f (int) returns setof int language sql as $$
> select i from t where t.j = $1 and i > 0
> union all
> select f(i) from t where t.j = $1
> $$;
>
> during execution of the function. As expected, backend still sees the
> old definition and uses cached plan.
>
> > * I don't like much of anything about 0004. It's messy and it
> > gives up all the benefit of plan caching in some pretty-common
> > cases (anywhere where the user was sloppy about what data type
> > is being returned). I wonder if we couldn't solve that with
> > more finesse by applying check_sql_fn_retval() to the query tree
> > after parse analysis, but before we hand it to plancache.c?
> > This is different from what happens now because we'd be applying
> > it before not after query rewrite, but I don't think that
> > query rewrite ever changes the targetlist results. Another
> > point is that the resultTargetList output might change subtly,
> > but I don't think we care there either: I believe we only examine
> > that output for its result data types and resjunk markers.
> > (This is nonobvious and inadequately documented, but for sure we
> > couldn't try to execute that tlist --- it's never passed through
> > the planner.)
>
> I'm also not fond of the last patch. So tried to fix it in a way you've
> suggested - we call check_sql_fn_retval() before creating cached plans.
> It fixes issue with revalidation happening after modifying query
> results.
>
> One test now changes behavior. What's happening is that after moving
> extension to another schema, cached plan is invalidated. But
> revalidation
> happens and rebuilds the plan. As we've saved analyzed parse tree, not
> the raw one, we refer to public.dep_req2() not by name, but by oid. Oid
> didn't change, so cached plan is rebuilt and used. Don't know, should we
> fix it and if should, how.
> >
> > * One diff that caught my eye was
> >
> > - if (!ActiveSnapshotSet() &&
> > - plansource->raw_parse_tree &&
> > - analyze_requires_snapshot(plansource->raw_parse_tree))
> > + if (!ActiveSnapshotSet() &&
> StmtPlanRequiresRevalidation(plansource))
> >
> > Because StmtPlanRequiresRevalidation uses
> > stmt_requires_parse_analysis, this is basically throwing away the
> > distinction between stmt_requires_parse_analysis and
> > analyze_requires_snapshot. I do not think that's okay, for the
> > reasons explained in analyze_requires_snapshot. We should expend the
> > additional notational overhead to keep those things separate.
> >
> > * I'm also not thrilled by teaching StmtPlanRequiresRevalidation
> > how to do something equivalent to stmt_requires_parse_analysis for
> > Query trees. The entire point of the current division of labor is
> > for it *not* to know that, but keep the knowledge of the properties
> > of different statement types in parser/analyze.c. So it looks to me
> > like we need to add a function to parser/analyze.c that does this.
> > Not quite sure what to call it though.
> > querytree_requires_parse_analysis() would be a weird name, because
> > if it's a Query tree then it's already been through parse analysis.
> > Maybe querytree_requires_revalidation()?
>
> I've created querytree_requires_revalidation() in analyze.c and used it
> in both
> StmtPlanRequiresRevalidation() and BuildingPlanRequiresSnapshot(). Both
> are essentially the same,
> but this allows to preserve the distinction between
> stmt_requires_parse_analysis and
> analyze_requires_snapshot.
>
> I've checked old performance results:
>
> create or replace function fx2(int) returns int as $$ select 2 * $1; $$
> language sql immutable;
> create or replace function fx3 (int) returns int immutable begin atomic
> select $1 + $1; end;
> create or replace function fx4(int) returns numeric as $$ select $1 +
> $1; $$ language sql immutable;
>
> postgres=# do $$
> begin
> for i in 1..1000000 loop
> perform fx((random()*100)::int);
> end loop;
> end;
> $$;
> DO
> Time: 2896.729 ms (00:02.897)
>
> postgres=# do $$
> begin
> for i in 1..1000000 loop
> perform fx((random()*100)::int);
> end loop;
> end;
> $$;
> DO
> Time: 2943.926 ms (00:02.944)
>
> postgres=# do $$ begin
> for i in 1..1000000 loop
> perform fx3((random()*100)::int);
> end loop;
> end;
> $$;
> DO
> Time: 2941.629 ms (00:02.942)
>
> postgres=# do $$
> begin
> for i in 1..1000000 loop
> perform fx4((random()*100)::int);
> end loop;
> end;
> $$;
> DO
> Time: 2952.383 ms (00:02.952)
>
> Here we see the only distinction - fx4() is now also fast, as we use
> cached plan for it.
>

I checked these tests with

vendor_id : GenuineIntel
cpu family : 6
model : 154
model name : 12th Gen Intel(R) Core(TM) i7-12700H
stepping : 3
microcode : 0x436
cpu MHz : 400.000
cache size : 24576 KB

CREATE OR REPLACE FUNCTION fx(int)
RETURNS int AS $$
SELECT $1 + $1
$$ LANGUAGE SQL IMMUTABLE;

CREATE OR REPLACE FUNCTION fx2(int)
RETURNS int AS $$
SELECT $1 * 2
$$ LANGUAGE SQL IMMUTABLE;

create or replace function fx3 (int) returns int immutable begin atomic
select $1 + $1; end;

create or replace function fx4(int) returns numeric as $$ select $1 +
$1; $$ language sql immutable;

create or replace function fx5(int) returns int
as $$
begin
return $1 + $1;
end
$$ language plpgsql immutable;

create or replace function fx6(int) returns int
as $$
begin
return $1 + $1;
end
$$ language plpgsql volatile;

postgres=# do $$
begin
for i in 1..10000000 loop
perform fx6((random()*100)::int); -- or fx2
end loop;
end;
$$;

My results are

master f1, fx2, fx3, fx4, fx5, fx6
36233, 7297,45693,40794, 11020,10897
19446, 7315,19777,20547, 11144,10954

Still I see a small slowdown in today's fast cases, but probably it will
not be extra important - on 10M operations it is about 50ms
so in real world there will be other factors more stronger. The speedup in
the slow cases is about 50%.

Regards

Pavel

> --
> Best regards,
> Alexander Pyhalov,
> Postgres Professional

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-13 16:59:14
Message-ID:	CAFj8pRD0diy5VDP=8B9au5sk_yRknCWqZx35VniBtP_OA4aJNw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I am checking last patches

Maybe interesting change is the change of error message context

QUERY: SELECT public.dep_req2() || ' req3b'.
-CONTEXT: SQL function "dep_req3b" during startup
+CONTEXT: SQL function "dep_req3b" statement 1

almost all SQL functions have just one statement, so showing the number of
the statement looks useless
(until now, I didn't see multiple statements SQL function) ,
we lost the time info "during startup". Maybe the error message can be
enhanced more like plpgsql,
instead of statement numbers, the lines or positions should be displayed.

The changing context text can be done in a separate patch - and in this
moment, we
can use old behaviour.

Regards

Pavel

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-13 18:29:20
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> Maybe interesting change is the change of error message context

> QUERY: SELECT public.dep_req2() || ' req3b'.
> -CONTEXT: SQL function "dep_req3b" during startup
> +CONTEXT: SQL function "dep_req3b" statement 1

I'm not hugely excited about that given that it's just happening in
one case. It might be useful to understand exactly why it's changing,
but I doubt it's something we need to "fix".

regards, tom lane

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-14 06:33:48
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane писал(а) 2025-03-13 21:29:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> Maybe interesting change is the change of error message context
>
>> QUERY: SELECT public.dep_req2() || ' req3b'.
>> -CONTEXT: SQL function "dep_req3b" during startup
>> +CONTEXT: SQL function "dep_req3b" statement 1
>
> I'm not hugely excited about that given that it's just happening in
> one case. It might be useful to understand exactly why it's changing,
> but I doubt it's something we need to "fix".
>
> regards, tom lane

Hi. What happens here that dep_req3b() was already cached and planned.
But when another extension test_ext_req_schema2 is moved to another
schema,
plan is invalidated. Note that function is already planned, and error
hppens not "during startup", but when we execute the first cached
plan. Now when we are executing the first execution_state,
init_execution_state()
is called. RevalidateCachedQuery() tries to rewrite query and gets an
error while
fcache->planning_stmt_number is 1. So, it correctly reports that it's
the first
statement.
Earlier this error was also called in init_execution_state() (in
pg_analyze_and_rewrite_withcb()),
but it was considered a startup error (as fcache->func_state) doesn't
exist
when error is thrown.
--
Best regards,
Alexander Pyhalov,
Postgres Professional

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-14 20:52:26
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I spent some time today going through the actual code in this patch.
I realized that there's no longer any point in 0001: the later patches
don't move or repeatedly-call that bit of code, so it can be left as-is.

What I think we could stand to split out, though, is the changes in
the plancache support. The new 0001 attached is just the plancache
and analyze.c changes. That could be committed separately, although
of course there's little point in pushing it till we're happy with
the rest.

In general, this patch series is paying far too little attention to
updating existing comments that it obsoletes or adding new ones
explaining what's going on. For example, the introductory comment
for struct SQLFunctionCache still says

* Note that currently this has only the lifespan of the calling query.
* Someday we should rewrite this code to use plancache.c to save parse/plan
* results for longer than that.

and I wonder how much of the para after that is still accurate either.
The new structs aren't adequately documented either IMO. We now have
about three different structs that have something to do with caches
by their names, but the reader is left to guess how they fit together.
Another example is that the header comment for init_execution_state
still describes an argument list it hasn't got anymore.

I tried to clean up the comment situation in the plancache in 0001,
but I've not done much of anything to functions.c.

I'm fairly confused why 0002 and 0003 are separate patches, and the
commit messages for them do nothing to clarify that. It seems like
you're expecting reviewers to review a very transitory state of
affairs in 0002, and it's not clear why. Maybe the code is fine
and you just need to explain the change sequence a bit more
in the commit messages. 0002 could stand to explain the point
of the new test cases, too, especially since one of them seems to
be demonstrating the fixing of a pre-existing bug.

Something is very wrong in 0004: it should not be breaking that
test case in test_extensions. It seems to me we should already
have the necessary infrastructure for that, in that the plan
ought to have a PlanInvalItem referencing public.dep_req2(),
and the ALTER SET SCHEMA that gets done on that function should
result in an invalidation. So it looks to me like that patch
has somehow rearranged things so we miss an invalidation.
I've not tried to figure out why. I'm also sad that 0004
doesn't appear to include any test cases showing it doing
something right: without that, why do it at all?

regards, tom lane

Attachment	Content-Type	Size
v8-0001-Support-cached-plans-that-work-from-a-parse-analy.patch	text/x-diff	17.1 KB
v8-0002-Use-custom-plan-machinery-for-SQL-function.patch	text/x-diff	19.6 KB
v8-0003-Introduce-SQL-functions-plan-cache.patch	text/x-diff	24.0 KB
v8-0004-Handle-SQL-functions-which-result-type-is-adjuist.patch	text/x-diff	6.7 KB

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-28 12:22:39
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane писал(а) 2025-03-14 23:52:
> I spent some time today going through the actual code in this patch.
> I realized that there's no longer any point in 0001: the later patches
> don't move or repeatedly-call that bit of code, so it can be left
> as-is.
>
> What I think we could stand to split out, though, is the changes in
> the plancache support. The new 0001 attached is just the plancache
> and analyze.c changes. That could be committed separately, although
> of course there's little point in pushing it till we're happy with
> the rest.
>

Hi.
Sorry that didn't reply immediately, was busy with another tasks.

> In general, this patch series is paying far too little attention to
> updating existing comments that it obsoletes or adding new ones
> explaining what's going on. For example, the introductory comment
> for struct SQLFunctionCache still says
>
> * Note that currently this has only the lifespan of the calling query.
> * Someday we should rewrite this code to use plancache.c to save
> parse/plan
> * results for longer than that.
>
> and I wonder how much of the para after that is still accurate either.
> The new structs aren't adequately documented either IMO. We now have
> about three different structs that have something to do with caches
> by their names, but the reader is left to guess how they fit together.
> Another example is that the header comment for init_execution_state
> still describes an argument list it hasn't got anymore.
>
> I tried to clean up the comment situation in the plancache in 0001,
> but I've not done much of anything to functions.c.

I've added some comments to functions.c. Modified comments you've
spotted out.

>
> I'm fairly confused why 0002 and 0003 are separate patches, and the
> commit messages for them do nothing to clarify that. It seems like
> you're expecting reviewers to review a very transitory state of
> affairs in 0002, and it's not clear why. Maybe the code is fine
> and you just need to explain the change sequence a bit more
> in the commit messages. 0002 could stand to explain the point
> of the new test cases, too, especially since one of them seems to
> be demonstrating the fixing of a pre-existing bug.

Also merged introducing plan cache to sql functions and session-level
plan cache support. Mostly they were separate for historic reasons.

>
> Something is very wrong in 0004: it should not be breaking that
> test case in test_extensions. It seems to me we should already
> have the necessary infrastructure for that, in that the plan
> ought to have a PlanInvalItem referencing public.dep_req2(),
> and the ALTER SET SCHEMA that gets done on that function should
> result in an invalidation. So it looks to me like that patch
> has somehow rearranged things so we miss an invalidation.
> I've not tried to figure out why.

Plan is invalidated in both cases (before and after the patch).
What happens here is that earlier when revalidation happened, we
couldn't find renamed function.
Now function in Query is identified by its oid, and it didn't change.
So, we still can find function by oid and rebuild cached plan.

> I'm also sad that 0004
> doesn't appear to include any test cases showing it doing
> something right: without that, why do it at all?

I've added sample, which is fixed by this patch. It can happen that
plan is adjusted and saved. Later it's invalidated and when revalidation
happens,
we miss modifications, added by check_sql_fn_retval(). Another
interesting issue
is that cached plan is checked for being valid before function starts
executing
(like earlier planning happened before function started executing). So,
once we
discard cached plans, plan for second query in function is not
invalidated immediately,
just on the second execution. And after rebuilding plan, it becomes
wrong.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment	Content-Type	Size
0003-Handle-SQL-functions-which-result-type-is-adjuisted.patch	text/x-diff	9.2 KB
0002-Use-cached-plans-machinery-for-SQL-function.patch	text/x-diff	43.2 KB
0001-Support-cached-plans-that-work-from-a-parse-analyzed.patch	text/x-diff	17.0 KB

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-29 07:24:41
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alexander Pyhalov писал(а) 2025-03-28 15:22:
> Tom Lane писал(а) 2025-03-14 23:52:
>> I spent some time today going through the actual code in this patch.
>> I realized that there's no longer any point in 0001: the later patches
>> don't move or repeatedly-call that bit of code, so it can be left
>> as-is.
>>
>> What I think we could stand to split out, though, is the changes in
>> the plancache support. The new 0001 attached is just the plancache
>> and analyze.c changes. That could be committed separately, although
>> of course there's little point in pushing it till we're happy with
>> the rest.
>>
>
> Hi.
> Sorry that didn't reply immediately, was busy with another tasks.
>
>> In general, this patch series is paying far too little attention to
>> updating existing comments that it obsoletes or adding new ones
>> explaining what's going on. For example, the introductory comment
>> for struct SQLFunctionCache still says
>>
>> * Note that currently this has only the lifespan of the calling
>> query.
>> * Someday we should rewrite this code to use plancache.c to save
>> parse/plan
>> * results for longer than that.
>>
>> and I wonder how much of the para after that is still accurate either.
>> The new structs aren't adequately documented either IMO. We now have
>> about three different structs that have something to do with caches
>> by their names, but the reader is left to guess how they fit together.
>> Another example is that the header comment for init_execution_state
>> still describes an argument list it hasn't got anymore.
>>
>> I tried to clean up the comment situation in the plancache in 0001,
>> but I've not done much of anything to functions.c.
>
> I've added some comments to functions.c. Modified comments you've
> spotted out.
>
>>
>> I'm fairly confused why 0002 and 0003 are separate patches, and the
>> commit messages for them do nothing to clarify that. It seems like
>> you're expecting reviewers to review a very transitory state of
>> affairs in 0002, and it's not clear why. Maybe the code is fine
>> and you just need to explain the change sequence a bit more
>> in the commit messages. 0002 could stand to explain the point
>> of the new test cases, too, especially since one of them seems to
>> be demonstrating the fixing of a pre-existing bug.
>
> Also merged introducing plan cache to sql functions and session-level
> plan cache support. Mostly they were separate for historic reasons.
>
>>
>> Something is very wrong in 0004: it should not be breaking that
>> test case in test_extensions. It seems to me we should already
>> have the necessary infrastructure for that, in that the plan
>> ought to have a PlanInvalItem referencing public.dep_req2(),
>> and the ALTER SET SCHEMA that gets done on that function should
>> result in an invalidation. So it looks to me like that patch
>> has somehow rearranged things so we miss an invalidation.
>> I've not tried to figure out why.
>
> Plan is invalidated in both cases (before and after the patch).
> What happens here is that earlier when revalidation happened, we
> couldn't find renamed function.
> Now function in Query is identified by its oid, and it didn't change.
> So, we still can find function by oid and rebuild cached plan.
>
>> I'm also sad that 0004
>> doesn't appear to include any test cases showing it doing
>> something right: without that, why do it at all?
>
> I've added sample, which is fixed by this patch. It can happen that
> plan is adjusted and saved. Later it's invalidated and when
> revalidation happens,
> we miss modifications, added by check_sql_fn_retval(). Another
> interesting issue
> is that cached plan is checked for being valid before function starts
> executing
> (like earlier planning happened before function started executing). So,
> once we
> discard cached plans, plan for second query in function is not
> invalidated immediately,
> just on the second execution. And after rebuilding plan, it becomes
> wrong.

After writing some comments, looking at it once again, I've found that
one assumption is wrong - function can be discarded from cache during
its execution.

For example,

create or replace function recurse(anyelement) returns record as
$$
begin
if ($1 > 0) then
if (mod($1, 2) = 0) then
execute format($query$
create or replace function sql_recurse(anyelement) returns
record as
$q$ select recurse($1); select ($1,2); $q$ language sql;
$query$);
end if;
return sql_recurse($1 - 1);
else
return row($1, 1::int);
end if;
end;
$$ language plpgsql;

create or replace function sql_recurse(anyelement) returns record as
$$ select recurse($1); select ($1,2); $$ language sql;

create table t1 (i int);
insert into t1 values(2),(3),(4);

select sql_recurse(i) from t1;

leads to dropping cached plans while they are needed. Will look how
better to handle it.

Also one interesting note is as we don't use raw_parse_tree, it seems we
don't need plansource->parserSetup and plansource->parserSetupArg. It
seems we can avoid caching complete parse info.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-29 14:08:49
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> writes:
> After writing some comments, looking at it once again, I've found that
> one assumption is wrong - function can be discarded from cache during
> its execution.

Yeah. You really need a use-count on the shared cache object.

I've been working on pulling out plpgsql's code that manages its
function cache into a new module that can be shared with functions.c.
That code is quite battle-tested and I don't see a good argument
for reinventing the logic. It's not fit to share yet, but I hope
to have something in a day or so.

> Also one interesting note is as we don't use raw_parse_tree, it seems we
> don't need plansource->parserSetup and plansource->parserSetupArg. It
> seems we can avoid caching complete parse info.

Well, you do need those when dealing with an old-style function (raw
parse trees).

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-30 16:10:17
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I spent some time reading and reworking this code, and have
arrived at a patch set that I'm pretty happy with. I'm not
sure it's quite committable but it's close:

0001: Same as in v8, extend plancache.c to allow caching
starting from a Query.

0002: Add a post-rewrite callback hook in plancache.c. There
is no chance of getting check_sql_fn_retval to work correctly
without that in the raw-parsetree case: we can't apply the
transformation on what goes into the plancache, and we have
to be able to re-apply it if plancache regenerates the plan
from the raw parse trees.

0003: As I mentioned yesterday, I think we should use the same
cache management logic that plpgsql does, and the best way for
that to happen is to share code. So this invents a new module
"funccache" that extracts the logic plpgsql was using. I did
have to add one feature that plpgsql doesn't have, to allow
part of the cache key to be the output rowtype. Otherwise
cases like this one from rangefuncs.sql won't work:

select * from array_to_set(array['one', 'two']) as t(f1 int,f2 text);
select * from array_to_set(array['one', 'two']) as t(f1 numeric(4,2),f2 text);

These have to have separate cached plans because check_sql_fn_retval
will modify the plans differently.

0004: Restructure check_sql_fn_retval so that we can use it in the
callback hook envisioned in 0002. There's an edge-case semantics
change as explained in the commit message; perhaps that will be
controversial?

0005: This extracts the RLS test case you had and commits it
with the old non-failing behavior, just so that we can see that
the new code does it differently. (I didn't adopt your test
from rules.sql, because AFAICS it works the same with or without
this patch set. What was the point of that one again?)

0006: The guts of the patch. I couldn't break this down any
further.

One big difference from what you had is that there is only one path
of control: we always use the plan cache. The hack you had to not
use it for triggers was only needed because you didn't include the
right cache key items to distinguish different trigger usages, but
the code coming from plpgsql has that right.

Also, the memory management is done a bit differently. The
"fcontext" memory context holding the SQLFunctionCache struct is
now discarded at the end of each execution of the SQL function,
which considerably alleviates worries about leaking memory there.
I invented a small "SQLFunctionLink" struct that is what fn_extra
points at, and it survives as long as the FmgrInfo does, so that's
what saves us from redoing hash key computations in most cases.

I also moved some code around -- notably, init_execution_state now
builds plans for only one CachedPlanState at a time, and we don't
try to construct the output JunkFilter until we plan the last
CachedPlanState. Because of this change, there's no longer a List
of execution_state sublists, but only a sublist matching the
current CachedPlan. We track where we are in the function using
an integer counter of the CachedPlanStates instead.

There's more stuff that could be done, but I feel that all of this
could be left for later:

* I really wanted to do what I mentioned upthread and change things
so we don't even parse the later queries until we've executed the
ones before that. However that seems to be a bit of a mess to
make happen, and the patch is large/complex enough already.

* The execution_state sublist business seems quite vestigial now:
we could probably drop it in favor of one set of those fields and
a counter. But that would involve a lot of notational churn,
much of it in code that doesn't need changes otherwise, and in
the end it would not buy much except removing a tiny amount of
transient space usage. Maybe some other day.

* There's some duplication of effort between cache key computation
and the callers, particularly that for SQL functions we end up
doing get_call_result_type() twice during the initial call.
This could probably be fixed with more refactoring, but it's not
really expensive enough to get too excited about.

I redid Pavel's tests from [1], and got these results in
non-assert builds:

master v10 patch
fx: 50077.251 ms 21221.104 ms
fx2: 8578.874 ms 8576.935 ms
fx3: 66331.186 ms 21173.215 ms
fx4: 56233.003 ms 22757.320 ms
fx5: 13248.177 ms 12370.693 ms
fx6: 13103.840 ms 12245.266 ms

We get substantial wins on all of fx, fx3, fx4. fx2 is the
case that gets inlined and never reaches functions.c, so the
lack of change there is expected. What I found odd is that
I saw a small speedup (~6%) on fx5 and fx6; those functions
are in plpgsql so they really shouldn't change either.
The only thing I can think of is that I made the hash key
computation a tiny bit faster by omitting unused argtypes[]
entries. That does avoid hashing several hundred bytes
typically, but it's hard to believe it'd amount to any
visible savings overall.

Anyway, PFA v10.

regards, tom lane

[1] https://www.postgresql.org/message-id/CAFj8pRDWDeF2cC%2BpCjLHJno7KnK5kdtjYN-f933RHS7UneArFw%40mail.gmail.com

Attachment	Content-Type	Size
v10-0001-Support-cached-plans-that-work-from-a-parse-anal.patch	text/x-diff	17.1 KB
v10-0002-Provide-a-post-rewrite-callback-hook-in-plancach.patch	text/x-diff	5.5 KB
v10-0003-Factor-out-plpgsql-s-management-of-its-function-.patch	text/x-diff	48.8 KB
v10-0004-Restructure-check_sql_fn_retval.patch	text/x-diff	13.2 KB
v10-0005-Add-a-test-case-showing-undesirable-RLS-behavior.patch	text/x-diff	4.8 KB
v10-0006-Change-SQL-language-functions-to-use-the-plan-ca.patch	text/x-diff	57.9 KB

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-30 18:53:02
Message-ID:	CAFj8pRCz4NRas1KNWnE5=8Lg16MNOYPb9OAU3SzDrS+WBy3BBQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> We get substantial wins on all of fx, fx3, fx4. fx2 is the
> case that gets inlined and never reaches functions.c, so the
> lack of change there is expected. What I found odd is that
> I saw a small speedup (~6%) on fx5 and fx6; those functions
> are in plpgsql so they really shouldn't change either.
> The only thing I can think of is that I made the hash key
> computation a tiny bit faster by omitting unused argtypes[]
> entries. That does avoid hashing several hundred bytes
> typically, but it's hard to believe it'd amount to any
> visible savings overall.
>
> Anyway, PFA v10.
>
>
I can confirm so all tests passed without problems

Regards

Pavel

> regards, tom lane
>
> [1]
> https://www.postgresql.org/message-id/CAFj8pRDWDeF2cC%2BpCjLHJno7KnK5kdtjYN-f933RHS7UneArFw%40mail.gmail.com
>
>

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-31 10:31:45
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi.

Tom Lane писал(а) 2025-03-30 19:10:
> I spent some time reading and reworking this code, and have
> arrived at a patch set that I'm pretty happy with. I'm not
> sure it's quite committable but it's close:
>

> 0005: This extracts the RLS test case you had and commits it
> with the old non-failing behavior, just so that we can see that
> the new code does it differently. (I didn't adopt your test
> from rules.sql, because AFAICS it works the same with or without
> this patch set. What was the point of that one again?)

The test was introduced after my error to handle case when
execution_state
is NULL in fcache->func_state list due to statement being completely
removed
by instead rule. After founding this issue, I've added a test to cover
it.
Not sure if it should be preserved.

>
> 0006: The guts of the patch. I couldn't break this down any
> further.
>
> One big difference from what you had is that there is only one path
> of control: we always use the plan cache. The hack you had to not
> use it for triggers was only needed because you didn't include the
> right cache key items to distinguish different trigger usages, but
> the code coming from plpgsql has that right.
>

Yes, now it looks much more consistent. Still going through it. Will
do some additional testing here. So far have checked all known corner
cases and found no issues.

> Also, the memory management is done a bit differently. The
> "fcontext" memory context holding the SQLFunctionCache struct is
> now discarded at the end of each execution of the SQL function,
> which considerably alleviates worries about leaking memory there.
> I invented a small "SQLFunctionLink" struct that is what fn_extra
> points at, and it survives as long as the FmgrInfo does, so that's
> what saves us from redoing hash key computations in most cases.

I've looked through it and made some tests, including ones which caused
me
to create separate context for planing. Was a bit worried that it has
gone,
but now, as fcache->fcontext is deleted in the end of function
execution,
I don't see leaks, which were the initial reason for introducing it.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-31 14:50:43
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> writes:
> I've looked through it and made some tests, including ones which
> caused me to create separate context for planing. Was a bit worried
> that it has gone, but now, as fcache->fcontext is deleted in the end
> of function execution, I don't see leaks, which were the initial
> reason for introducing it.

Yeah. As it's set up in v10, we do parsing work in the caller's
context (which is expected to be short-lived) when creating or
recreating the long-lived cache entry. However, planning work
(if needed) is done in the fcontext, since that will happen within
init_execution_state which is called after fmgr_sql has switched into
the fcontext. I thought about switching back to the caller's context
but decided that it wouldn't really be worth the trouble. For a
non-SRF there's no meaningful difference anyway. For a SRF, it'd mean
that planning cruft survives till end of execution of the SRF rather
than possibly going away when the first row is returned. But we
aren't looping: any given query within the SRF is planned only once
during an execution of the function. So there's no possibility of
indefinite accumulation of leakage.

If we wanted to improve that, my inclination would be to try to
not switch into the fcontext for the whole of fmgr_sql, but only
use it explicitly for allocations that need to survive. But I
don't think that'll save much, so I think any such change would
be best left for later. The patch is big enough already.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-03-31 23:29:05
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I wrote:
> There's more stuff that could be done, but I feel that all of this
> could be left for later:
> * I really wanted to do what I mentioned upthread and change things
> so we don't even parse the later queries until we've executed the
> ones before that. However that seems to be a bit of a mess to
> make happen, and the patch is large/complex enough already.

I felt more regret about that decision after reading the discussion
at [1] and realizing that we already moved those goalposts part way.
For example, historically this fails:

set check_function_bodies to off;

create or replace function create_and_insert() returns void
language sql as $$
create table t1 (f1 int);
insert into t1 values (1.2);
$$;

select create_and_insert();

You get "ERROR: relation "t1" does not exist" because we try to
parse-analyze the INSERT before the CREATE has been executed.
That's still true with patch v10. However, consider this:

create table t1 (f1 int);

create or replace function alter_and_insert() returns void
language sql as $$
alter table t1 alter column f1 type numeric;
insert into t1 values (1.2);
$$;

select alter_and_insert();

Historically that fails with

ERROR: table row type and query-specified row type do not match
DETAIL: Table has type numeric at ordinal position 1, but query expects integer.
CONTEXT: SQL function "alter_and_insert" statement 2

because we built a plan for the INSERT before executing the ALTER.
However, with v10 it works! The INSERT is parse-analyzed against the
old table definition, but that's close enough that we can successfully
make a CachedPlanSource. Then when we come to execute the INSERT,
the plancache notices that what it has is already invalidated,
so it re-does everything and the correct data type is used.

So that left me quite unsatisfied. It's not great to modify edge-case
semantics a little bit and then come back and modify them some more in
the next release; better for it to all happen at once. Besides which,
it'd be quite difficult to document v10's behavior in a fashion that
makes any sense to users. I'd abandoned the idea of delaying parse
analysis over the weekend because time was running out and I had
enough things to worry about, but I resolved to take another look.
And it turns out not to be very hard at all to get there from where
we were in v10: we already did 90% of the restructuring needed to
do the processing incrementally. The main missing thing is we need
someplace to stash the raw parse trees or Queries we got from pg_proc
until we are ready to do the analyze-rewrite-and-make-a-cached-plan
business for them. I had already had the idea of making a second
context inside the SQLFunctionHashEntry to keep those trees in
until we don't need them anymore (which we don't once the last
CachedPlanSource is made). So that part wasn't hard. I then
discovered that the error reporting needed some tweaking, which
wasn't hard either.

So attached is v11. 0001-0006 are identical to v10, and then
0007 is the delta. (I'd plan to squash that in final commit,
but I thought it'd be easier to review this way.)

Compared to v10, this means that parse analysis work as well as
planning work is done in fcontext, so there's a little bit more
temporary leakage during the first run of a lazy-evaluation SRF.
I still think that this is not going to amount to anything worth
worrying about, but maybe it slightly raises the interest level
of not doing everything in fcontext.

Another point is that the order of the subroutines in functions.c
is starting to feel rather random. I left them like this to
minimize the amount of pure code motion involved in 0007, but
I'm tempted to rearrange them into something closer to
order-of-execution before final commit.

Anyway, I feel pretty good about this patch now and am quite content
to stop here for PG 18.

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/4115257.1743441667%40sss.pgh.pa.us#ae47bbce3b234245c530ad06eb819d8c

Attachment	Content-Type	Size
v11-0001-Support-cached-plans-that-work-from-a-parse-anal.patch	text/x-diff	17.1 KB
v11-0002-Provide-a-post-rewrite-callback-hook-in-plancach.patch	text/x-diff	5.5 KB
v11-0003-Factor-out-plpgsql-s-management-of-its-function-.patch	text/x-diff	48.8 KB
v11-0004-Restructure-check_sql_fn_retval.patch	text/x-diff	13.2 KB
v11-0005-Add-a-test-case-showing-undesirable-RLS-behavior.patch	text/x-diff	4.8 KB
v11-0006-Change-SQL-language-functions-to-use-the-plan-ca.patch	text/x-diff	57.9 KB
v11-0007-Delay-parse-analysis-and-rewrite-until-we-re-rea.patch	text/x-diff	23.6 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-04-02 18:09:16
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I wrote:
> Anyway, I feel pretty good about this patch now and am quite content
> to stop here for PG 18.

Since feature freeze is fast approaching, I did a tiny bit more
cosmetic work on this patchset and then pushed it. (There's still
plenty of time for adjustments if you have further comments.)

Thanks for working on this! This is something I've wanted to see
done ever since we invented plancache.

regards, tom lane

From:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-04-03 05:38:18
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane писал(а) 2025-04-02 21:09:
> I wrote:
>> Anyway, I feel pretty good about this patch now and am quite content
>> to stop here for PG 18.
>
> Since feature freeze is fast approaching, I did a tiny bit more
> cosmetic work on this patchset and then pushed it. (There's still
> plenty of time for adjustments if you have further comments.)
>
> Thanks for working on this! This is something I've wanted to see
> done ever since we invented plancache.
>
> regards, tom lane

Hi. I've looked through the latest patch series, but didn't have much
time to examine them thoroughly.
Haven't found any issue so far. From cosmetic things I've noticed - I
would set func->pcontext to NULL in sql_delete_callback() to avoid
dangling pointer if we error out early (but it seems only to be a matter
of taste).
--
Best regards,
Alexander Pyhalov,
Postgres Professional

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-04-03 19:00:00
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello Tom,

02.04.2025 21:09, Tom Lane wrote:
> Since feature freeze is fast approaching, I did a tiny bit more
> cosmetic work on this patchset and then pushed it. (There's still
> plenty of time for adjustments if you have further comments.)

I've discovered that starting from 0dca5d68d, the following query:
CREATE FUNCTION f(x anyelement) RETURNS anyarray AS '' LANGUAGE SQL;
SELECT f(0);

triggers:
TRAP: failed Assert("fcache->func->rettype == VOIDOID"), File: "functions.c", Line: 1737, PID: 3784779

On 0dca5d68d~1, it raises:
ERROR: return type mismatch in function declared to return integer[]
DETAIL: Function's final statement must be SELECT or INSERT/UPDATE/DELETE/MERGE RETURNING.
CONTEXT: SQL function "f" during startup

Best regards,
Alexander Lakhin
Neon (https://neon.tech)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-04-03 19:13:50
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alexander Lakhin <exclusion(at)gmail(dot)com> writes:
> I've discovered that starting from 0dca5d68d, the following query:
> CREATE FUNCTION f(x anyelement) RETURNS anyarray AS '' LANGUAGE SQL;
> SELECT f(0);

> triggers:
> TRAP: failed Assert("fcache->func->rettype == VOIDOID"), File: "functions.c", Line: 1737, PID: 3784779

Drat. I thought I'd tested the empty-function-body case, but
evidently that was a few changes too far back. Will fix,
thanks for catching it.

regards, tom lane

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-04-04 19:00:01
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello Tom,

03.04.2025 22:13, Tom Lane wrpte:
> Drat. I thought I'd tested the empty-function-body case, but
> evidently that was a few changes too far back. Will fix,
> thanks for catching it.

I've stumbled upon another defect introduced with 0dca5d68d:
CREATE FUNCTION f(VARIADIC ANYARRAY) RETURNS ANYELEMENT AS $$ SELECT x FROM generate_series(1,1) g(i) $$ LANGUAGE SQL
IMMUTABLE;
SELECT f(1);
SELECT f(1);

fails under Valgrind with:
2025-04-04 18:31:13.771 UTC [242811] LOG: statement: SELECT f(1);
==00:00:00:19.324 242811== Invalid read of size 4
==00:00:00:19.324 242811==    at 0x48D610: copyObjectImpl (copyfuncs.c:187)
==00:00:00:19.324 242811==    by 0x490194: _copyAlias (copyfuncs.funcs.c:48)
==00:00:00:19.324 242811==    by 0x48D636: copyObjectImpl (copyfuncs.switch.c:19)
==00:00:00:19.324 242811==    by 0x491CBF: _copyRangeFunction (copyfuncs.funcs.c:1279)
==00:00:00:19.324 242811==    by 0x48DC9C: copyObjectImpl (copyfuncs.switch.c:271)
==00:00:00:19.324 242811==    by 0x4A295C: list_copy_deep (list.c:1652)
==00:00:00:19.324 242811==    by 0x4900FA: copyObjectImpl (copyfuncs.c:192)
==00:00:00:19.324 242811==    by 0x4931E4: _copySelectStmt (copyfuncs.funcs.c:2109)
==00:00:00:19.324 242811==    by 0x48E00C: copyObjectImpl (copyfuncs.switch.c:436)
==00:00:00:19.324 242811==    by 0x492F85: _copyRawStmt (copyfuncs.funcs.c:2026)
==00:00:00:19.324 242811==    by 0x48DFBC: copyObjectImpl (copyfuncs.switch.c:421)
==00:00:00:19.324 242811==    by 0x76DCCE: CreateCachedPlan (plancache.c:213)
==00:00:00:19.324 242811== Address 0x11b478d0 is 4,032 bytes inside a block of size 8,192 alloc'd
==00:00:00:19.324 242811==    at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==00:00:00:19.324 242811==    by 0x7AA15D: AllocSetContextCreateInternal (aset.c:444)
==00:00:00:19.324 242811==    by 0x43CF56: init_sql_fcache (functions.c:616)
==00:00:00:19.324 242811==    by 0x43CF56: fmgr_sql (functions.c:1484)
==00:00:00:19.324 242811==    by 0x423E13: ExecInterpExpr (execExprInterp.c:926)
==00:00:00:19.324 242811==    by 0x41EB54: ExecInterpExprStillValid (execExprInterp.c:2299)
...

Best regards,
Alexander Lakhin
Neon (https://neon.tech)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-04-04 21:47:34
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alexander Lakhin <exclusion(at)gmail(dot)com> writes:
> I've stumbled upon another defect introduced with 0dca5d68d:
> CREATE FUNCTION f(VARIADIC ANYARRAY) RETURNS ANYELEMENT AS $$ SELECT x FROM generate_series(1,1) g(i) $$ LANGUAGE SQL
> IMMUTABLE;
> SELECT f(1);
> SELECT f(1);

Hmm, I see

regression=# CREATE FUNCTION f(VARIADIC ANYARRAY) RETURNS ANYELEMENT AS $$ SELECT x FROM generate_series(1,1) g(i) $$ LANGUAGE SQL IMMUTABLE;
CREATE FUNCTION
regression=# SELECT f(1);
ERROR: column "x" does not exist
LINE 1: SELECT x FROM generate_series(1,1) g(i)
^
QUERY: SELECT x FROM generate_series(1,1) g(i)
CONTEXT: SQL function "f" statement 1
regression=# SELECT f(1);
ERROR: unrecognized node type: 2139062143
CONTEXT: SQL function "f" statement 1

Did you intend the typo? The "unrecognized node type" does indicate
a problem, but your message doesn't seem to indicate that you're
expecting a syntax error.

regards, tom lane

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-04-04 22:00:01
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

05.04.2025 00:47, Tom Lane wrote:
> Alexander Lakhin<exclusion(at)gmail(dot)com> writes:
>> I've stumbled upon another defect introduced with 0dca5d68d:
>> CREATE FUNCTION f(VARIADIC ANYARRAY) RETURNS ANYELEMENT AS $$ SELECT x FROM generate_series(1,1) g(i) $$ LANGUAGE SQL
>> IMMUTABLE;
>> SELECT f(1);
>> SELECT f(1);
> Hmm, I see
>
> regression=# CREATE FUNCTION f(VARIADIC ANYARRAY) RETURNS ANYELEMENT AS $$ SELECT x FROM generate_series(1,1) g(i) $$ LANGUAGE SQL IMMUTABLE;
> CREATE FUNCTION
> regression=# SELECT f(1);
> ERROR: column "x" does not exist
> LINE 1: SELECT x FROM generate_series(1,1) g(i)
> ^
> QUERY: SELECT x FROM generate_series(1,1) g(i)
> CONTEXT: SQL function "f" statement 1
> regression=# SELECT f(1);
> ERROR: unrecognized node type: 2139062143
> CONTEXT: SQL function "f" statement 1
>
> Did you intend the typo? The "unrecognized node type" does indicate
> a problem, but your message doesn't seem to indicate that you're
> expecting a syntax error.
>

Yes, the typo is intended. With Valgrind, I get the "column does not exist"
error on the first call and the Valgrind complaint on the second one.

Best regards,
Alexander Lakhin
Neon (https://neon.tech)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-04-09 06:19:22
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

In the department of no-good-deed-goes-unpunished ...

I noticed that avocet and trilobite (two of our CLOBBER_CACHE_ALWAYS
animals) have started to fail on the deadlock-parallel isolation
test, with symptoms that look like they're timing out. Poking at
it here with a somewhat faster machine (a Mac M4), I see that under
debug_discard_caches=1 that test went from

ok 27 - deadlock-parallel 26362 ms

immediately before commit 0dca5d68d to

ok 27 - deadlock-parallel 267869 ms

immediately after. This is evidently because the test involves a lot
of evaluations of an intentionally-not-inline-able SQL function.
Previously we cached the plans for that function for the life of the
outer query, but now they're getting clobbered and rebuilt each time.

This is not wrong; the wrong behavior was hanging onto a
potentially-obsolete plan. But it's pretty unfortunate for the
runtime of this test under debug_discard_caches=1.

The simplest fix is to force that test to use debug_discard_caches=0,
but I don't love that answer. Anybody have a better idea?

It looks like the runtime of the infinite_recurse test on these
animals has taken a hit too for the same reason, and there may be
other places.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc:	Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject:	Re: SQLFunctionCache and generic plans
Date:	2025-04-09 21:01:56
Message-ID:	[email protected]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I wrote:
> I noticed that avocet and trilobite (two of our CLOBBER_CACHE_ALWAYS
> animals) have started to fail on the deadlock-parallel isolation
> test, with symptoms that look like they're timing out.
> ...
> The simplest fix is to force that test to use debug_discard_caches=0,
> but I don't love that answer. Anybody have a better idea?

I thought of a better way: we can bypass the need to use a non-inlined
SQL function by declaring internal-language aliases for the
auxiliary-lock functions that are falsely marked parallel-safe.
It's a little bit ugly (see [1]) but I think it beats not running
this test under debug_discard_caches at all.

regards, tom lane

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=837cc73af29cd9c63515a6f2a36f54dd703a3a3f