Lists: | pgsql-hackers |
---|
From: | Joe Wildish <joe-postgresql(dot)com(at)elusive(dot)cx> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Implementing SQL ASSERTION |
Date: | 2015-04-30 22:36:32 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi all,
I’m wondering if there are other people out there working on implementing SQL ASSERTION functionality?
I’ve recently spent a bit of time looking to implement the execution models described in “Applied Mathematics for Database Professionals” by Toon Koppelaars and Lex de Haan. I’ve gotten as far as execution model 3 and am now looking at deriving polarity of involved tables to do EM4 (described in some detail in “Deriving Production Rules for Constraint Maintenance”, Ceri & Widom, VLDB Conference 1990, p555-577). EM5 & EM6 look rather more difficult but I’m intending to try and implement those, too.
If there are other people working on this stuff it would be great to collaborate.
Regards.
-Joe
From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)com(at)elusive(dot)cx> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2015-05-01 18:51:21 |
Message-ID: | CA+TgmoaHqgEY99vfOYAYJ-TsVJn411N8FBxwWstP4_2BixNTzQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, Apr 30, 2015 at 6:36 PM, Joe Wildish
<joe-postgresql(dot)com(at)elusive(dot)cx> wrote:
> I’m wondering if there are other people out there working on implementing SQL ASSERTION functionality?
>
> I’ve recently spent a bit of time looking to implement the execution models described in “Applied Mathematics for Database Professionals” by Toon Koppelaars and Lex de Haan. I’ve gotten as far as execution model 3 and am now looking at deriving polarity of involved tables to do EM4 (described in some detail in “Deriving Production Rules for Constraint Maintenance”, Ceri & Widom, VLDB Conference 1990, p555-577). EM5 & EM6 look rather more difficult but I’m intending to try and implement those, too.
>
> If there are other people working on this stuff it would be great to collaborate.
I don't know of anyone working on this. It sounds very difficult.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: | Joe Wildish <joe-postgresql(dot)com(at)elusive(dot)cx> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2015-05-02 21:42:24 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
> On 1 May 2015, at 19:51, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Apr 30, 2015 at 6:36 PM, Joe Wildish
> <joe-postgresql(dot)com(at)elusive(dot)cx> wrote:
>> I’m wondering if there are other people out there working on implementing SQL ASSERTION functionality?
>>
>> I’ve recently spent a bit of time looking to implement the execution models described in “Applied Mathematics for Database Professionals” by Toon Koppelaars and Lex de Haan. I’ve gotten as far as execution model 3 and am now looking at deriving polarity of involved tables to do EM4 (described in some detail in “Deriving Production Rules for Constraint Maintenance”, Ceri & Widom, VLDB Conference 1990, p555-577). EM5 & EM6 look rather more difficult but I’m intending to try and implement those, too.
>>
>> If there are other people working on this stuff it would be great to collaborate.
>
> I don't know of anyone working on this. It sounds very difficult.
The book I mention details a series of execution models, where each successive model aims to validate the assertion in a more efficient manner than the last. This is achieved by performing static analysis of the assertion's expression to determine under what circumstances the assertion need be (re)checked. Briefly:
EM1: after all DML statements;
EM2: only after DML statements involving tables mentioned in the assertion expression;
EM3: only after DML statements involving the columns mentioned in the assertion expression;
EM4: only after DML statements involving the columns, plus if the statement has a “polarity” that may affect the assertion expression.
“Polarity" here means that one is able to (statically) determine if only INSERTS and not DELETES can affect an expression or vice-versa.
EMs 5 and 6 are further enhancements that make use of querying the “transition effect” data of what actually changed in a statement, to determine if the assertion expression need be validated. I’ve not done as much reading around this topic yet so am concentrating on EMs 1-4.
I agree it is a difficult problem but there are a fair number of published academic papers relating to this topic. The AM4DP book draws a lot of this research together and presents the executions models.
I may start writing up on a blog of where I get to, and then post further to this list, if there is interest.
Regards.
-Joe
From: | David Fetter <david(at)fetter(dot)org> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)com(at)elusive(dot)cx> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2015-05-03 01:42:26 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Sat, May 02, 2015 at 10:42:24PM +0100, Joe Wildish wrote:
>
> > On 1 May 2015, at 19:51, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Thu, Apr 30, 2015 at 6:36 PM, Joe Wildish
> > <joe-postgresql(dot)com(at)elusive(dot)cx> wrote:
> >> I’m wondering if there are other people out there working on implementing SQL ASSERTION functionality?
> >>
> >> I’ve recently spent a bit of time looking to implement the execution models described in “Applied Mathematics for Database Professionals” by Toon Koppelaars and Lex de Haan. I’ve gotten as far as execution model 3 and am now looking at deriving polarity of involved tables to do EM4 (described in some detail in “Deriving Production Rules for Constraint Maintenance”, Ceri & Widom, VLDB Conference 1990, p555-577). EM5 & EM6 look rather more difficult but I’m intending to try and implement those, too.
> >>
> >> If there are other people working on this stuff it would be great to collaborate.
> >
> > I don't know of anyone working on this. It sounds very difficult.
>
> The book I mention details a series of execution models, where each successive model aims to validate the assertion in a more efficient manner than the last. This is achieved by performing static analysis of the assertion's expression to determine under what circumstances the assertion need be (re)checked. Briefly:
>
> EM1: after all DML statements;
> EM2: only after DML statements involving tables mentioned in the assertion expression;
> EM3: only after DML statements involving the columns mentioned in the assertion expression;
> EM4: only after DML statements involving the columns, plus if the statement has a “polarity” that may affect the assertion expression.
>
> “Polarity" here means that one is able to (statically) determine if only INSERTS and not DELETES can affect an expression or vice-versa.
>
> EMs 5 and 6 are further enhancements that make use of querying the “transition effect” data of what actually changed in a statement, to determine if the assertion expression need be validated. I’ve not done as much reading around this topic yet so am concentrating on EMs 1-4.
>
> I agree it is a difficult problem but there are a fair number of published academic papers relating to this topic. The AM4DP book draws a lot of this research together and presents the executions models.
>
> I may start writing up on a blog of where I get to, and then post further to this list, if there is interest.
I suspect that you would get a lot further with a PoC patch including
the needed documentation. Remember to include how this would work at
all the transaction isolation levels and combinations of same that we
support. Recall also to include the lock strength needed. Just about
anything can be done with a database-wide lock :)
Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
From: | Joe Wildish <joe-postgresql(dot)com(at)elusive(dot)cx> |
---|---|
To: | David Fetter <david(at)fetter(dot)org> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2015-05-03 09:24:46 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
> On 3 May 2015, at 02:42, David Fetter <david(at)fetter(dot)org> wrote:
>
> On Sat, May 02, 2015 at 10:42:24PM +0100, Joe Wildish wrote:
>>
>> I may start writing up on a blog of where I get to, and then post further to this list, if there is interest.
>
> I suspect that you would get a lot further with a PoC patch including
> the needed documentation. Remember to include how this would work at
> all the transaction isolation levels and combinations of same that we
> support. Recall also to include the lock strength needed. Just about
> anything can be done with a database-wide lock :)
Thanks David. I’m obviously new here so I not that familiar with how one starts contributing.
Once I get to a decent level with the EM4 PoC I will post the details to this list. The general idea is that upon assertion creation, the expression is analysed to determine when it needs to be validated — corresponding internal "after statement” triggers are then created. There will definitely need to be some serialisation take place on the basis of when an assertion has been validated, but I’ve not got that far yet. I’ll be sure to include the details when I post though.
Regards.
-Joe
From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)com(at)elusive(dot)cx>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2015-05-06 02:00:11 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 4/30/15 6:36 PM, Joe Wildish wrote:
> I’m wondering if there are other people out there working on implementing SQL ASSERTION functionality?
I was the last one, probably:
<http://www.postgresql.org/message-id/[email protected]>.
I intend to pick up that work sometime, but feel free to review the
thread for a start. The main question was how to manage transaction
isolation.
From: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-01-14 23:33:08 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hackers,
Attached is a WIP patch for SQL assertion. I am posting it for anyone who might be interested in seeing it, for comments/feedback, and to see if others are keen to collaborate on taking it further. It is not near production-ready (see thoughts on that below).
The patch builds on the work posted by Peter back in 2013. I've taken his code and updated it to conform to some general changes made to the codebase since then. The bulk of the new work I have done is around when an assertion needs to be checked. Essentially it is an implementation of the algorithm described by Ceri & Widom in "Deriving Production Rules for Constraint Maintenance” — http://infolab.stanford.edu/pub/papers/constraint-maintenance.ps
The general idea is to traverse the expression tree and derive the set of potentially invalidating operations. These operations are used to determine when the constraint trigger fires and causes a re-check. The detail is in the paper but some examples are:
* insertion into the subject of an exists cannot be invalidating;
* deletion from the subject of a not exists cannot be invalidating;
* update of columns in the target list of an exists cannot be invalidating;
* certain combinations of aggregates with comparison operations cannot be invalidating.
As an example of the last point, the expression "CHECK (10 > (SELECT COUNT(*) FROM t))" cannot be invalidated by a delete or an update but can be invalidated by an insert.
I have implemented most of the optimisations mentioned in the paper. There are one or two that I am unsure about, specifically how to deal with set-operations that are the subject of an exists. According to the paper, these are optimisable when they're the subject of an exists, but I think it is only applicable for union and not intersect or except, so I have skipped that particular optimisation for the time being.
The algorithm works under the assumption that when a recheck occurs the previous check result was true (the research report by Ceri & Widom does acknolwedge this assumption). However, unfortunately the SQL specification requires that both true and unknown be valid results for an assertion's check expression. This doesn't play too well with the algorithm so for the time being I have disallowed null. I think the solution here may be that when a null result for a check occurs, the assertion is changed to trigger on all operations against the involved tables; once it returns to true, the triggers can be returned to fire only on the derived invalidating operations. More thought required though. (note: having just written this paragraph, I've realised I can't right now think of a concrete example illustrating the point, so it may be that I'm wrong on this).
The paper does mention a set of optimisations that I have not yet attempted to implement. These are essentially the technique of evaluating the expression against the deltas of a change rather than the full tables. Clearly there is a large overlap with incremental maintainence of views and actually the two authors of the paper have a similiarly named paper called "Deriving Production Rules for Incremental View Maintanence". Although I have yet to finish reviewing all the literature on the subject, I suspect that realistically for this to make it into production, we'd need some implementation of these techniques to make the performance palatable.
Cheers,
-Joe
Attachment | Content-Type | Size |
---|---|---|
0001-SQL-assertion-WIP.patch | application/octet-stream | 151.2 KB |
From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-01-15 12:10:36 |
Message-ID: | alpine.DEB.2.20.1801151023320.3539@lancre |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hello Joe,
Just a reaction to the example, which is maybe addressed in the patch
which I have not investigated.
> * certain combinations of aggregates with comparison operations cannot
> be invalidating.
>
> As an example of the last point, the expression "CHECK (10 > (SELECT
> COUNT(*) FROM t))" cannot be invalidated by a delete or an update but
> can be invalidated by an insert.
I'm wondering about the effect of MVVC on this: if the check is performed
when the INSERT is done, concurrent inserting transactions would count the
current status which would be ok, but on commit all concurrent inserts
would be there and the count could not be ok anymore?
Maybe if the check was deferred, but this is not currently possible with
pg (eg the select can simply be put in a function), and I there might be
race conditions. ISTM that such a check would imply non trivial locking to
be okay, it is not just a matter of deciding whether to invoke the check
or not.
--
Fabien.
From: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
---|---|
To: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-01-15 12:50:52 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi Fabien,
>> * certain combinations of aggregates with comparison operations cannot be invalidating.
>>
>> As an example of the last point, the expression "CHECK (10 > (SELECT COUNT(*) FROM t))" cannot be invalidated by a delete or an update but can be invalidated by an insert.
>
> I'm wondering about the effect of MVVC on this: if the check is performed when the INSERT is done, concurrent inserting transactions would count the current status which would be ok, but on commit all concurrent inserts would be there and the count could not be ok anymore?
Yes, there was quite a bit of discussion in the original thread about concurrency. See here:
https://www.postgresql.org/message-id/flat/1384486216(dot)5008(dot)17(dot)camel%40vanquo(dot)pezone(dot)net#1384486216(dot)5008(dot)17(dot)camel(at)vanquo(dot)pezone(dot)net <https://www.postgresql.org/message-id/flat/1384486216(dot)5008(dot)17(dot)camel(at)vanquo(dot)pezone(dot)net#1384486216(dot)5008(dot)17(dot)[email protected]>
The patch doesn’t attempt to address concurrency (beyond the obvious benefit of reducing the circumstances under which the assertion is checked). I am working under the assumption that we will find some acceptable way for that to be resolved :-) And at the moment, working in serialisable mode addresses this issue. I think that is suggested in the thread actually (essentially, if you want to use assertions, you require that transactions be performed at serialisable isolation level).
> Maybe if the check was deferred, but this is not currently possible with pg (eg the select can simply be put in a function), and I there might be race conditions. ISTM that such a check would imply non trivial locking to be okay, it is not just a matter of deciding whether to invoke the check or not.
I traverse into SQL functions so that the analysis can capture invalidating operations from the expression inside the function. Only internal and SQL functions are considered legal. Other languages are rejected.
-Joe
From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-01-15 14:40:57 |
Message-ID: | alpine.DEB.2.20.1801151357080.11126@lancre |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
>> I'm wondering about the effect of MVVC on this: if the check is
>> performed when the INSERT is done, concurrent inserting transactions
>> would count the current status which would be ok, but on commit all
>> concurrent inserts would be there and the count could not be ok
>> anymore?
> The patch doesn’t attempt to address concurrency (beyond the obvious
> benefit of reducing the circumstances under which the assertion is
> checked). I am working under the assumption that we will find some
> acceptable way for that to be resolved :-) And at the moment, working in
> serialisable mode addresses this issue. I think that is suggested in the
> thread actually (essentially, if you want to use assertions, you require
> that transactions be performed at serialisable isolation level).
Thanks for the pointers. The "serializable" isolation level restriction
sounds reasonnable.
--
Fabien.
From: | David Fetter <david(at)fetter(dot)org> |
---|---|
To: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
Cc: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-01-15 16:35:20 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Mon, Jan 15, 2018 at 03:40:57PM +0100, Fabien COELHO wrote:
>
> >>I'm wondering about the effect of MVVC on this: if the check is
> >>performed when the INSERT is done, concurrent inserting transactions
> >>would count the current status which would be ok, but on commit all
> >>concurrent inserts would be there and the count could not be ok anymore?
>
> >The patch doesn’t attempt to address concurrency (beyond the obvious
> >benefit of reducing the circumstances under which the assertion is
> >checked). I am working under the assumption that we will find some
> >acceptable way for that to be resolved :-) And at the moment, working in
> >serialisable mode addresses this issue. I think that is suggested in the
> >thread actually (essentially, if you want to use assertions, you require
> >that transactions be performed at serialisable isolation level).
>
> Thanks for the pointers. The "serializable" isolation level restriction
> sounds reasonnable.
It sounds reasonable enough that I'd like to make a couple of Modest
Proposals™, to wit:
- We follow the SQL standard and make SERIALIZABLE the default
transaction isolation level, and
- We disallow writes at isolation levels other than SERIALIZABLE when
any ASSERTION could be in play.
That latter could range in implementation from crashingly unsubtle to
very precise.
Crashingly Unsubtle:
Disallow writes at any isolation level other than SERIALIZABLE.
Very Precise:
Disallow writes at any other isolation level when the ASSERTION
could come into play using the same machinery that enforces the
ASSERTION in the first place.
What say?
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
From: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
---|---|
To: | David Fetter <david(at)fetter(dot)org> |
Cc: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-01-15 21:14:02 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi David,
> On 15 Jan 2018, at 16:35, David Fetter <david(at)fetter(dot)org> wrote:
>
> It sounds reasonable enough that I'd like to make a couple of Modest
> Proposals™, to wit:
>
> - We follow the SQL standard and make SERIALIZABLE the default
> transaction isolation level, and
>
> - We disallow writes at isolation levels other than SERIALIZABLE when
> any ASSERTION could be in play.
Certainly it would be easy to put a test into the assertion check function to require the isolation level be serialisable. I didn’t realise that that was also the default level as per the standard. That need not necessarily be changed, of course; it would be obvious to the user that it was a requirement as the creation of an assertion would fail without it, as would any subsequent attempts to modify the involved tables.
-Joe
From: | David Fetter <david(at)fetter(dot)org> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-03-07 15:10:44 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Mon, Jan 15, 2018 at 09:14:02PM +0000, Joe Wildish wrote:
> Hi David,
>
> > On 15 Jan 2018, at 16:35, David Fetter <david(at)fetter(dot)org> wrote:
> >
> > It sounds reasonable enough that I'd like to make a couple of Modest
> > Proposals™, to wit:
> >
> > - We follow the SQL standard and make SERIALIZABLE the default
> > transaction isolation level, and
> >
> > - We disallow writes at isolation levels other than SERIALIZABLE when
> > any ASSERTION could be in play.
>
> Certainly it would be easy to put a test into the assertion check
> function to require the isolation level be serialisable. I didn’t
> realise that that was also the default level as per the standard.
> That need not necessarily be changed, of course; it would be obvious
> to the user that it was a requirement as the creation of an
> assertion would fail without it, as would any subsequent attempts to
> modify the involved tables.
This patch no longer applies. Any chance of a rebase?
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
From: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
---|---|
To: | David Fetter <david(at)fetter(dot)org> |
Cc: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-03-08 21:11:58 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi David,
>
> This patch no longer applies. Any chance of a rebase?
>
Of course. I’ll look at it this weekend,
Cheers,
-Joe
From: | David Fetter <david(at)fetter(dot)org> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-03-08 21:12:40 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, Mar 08, 2018 at 09:11:58PM +0000, Joe Wildish wrote:
> Hi David,
>
> >
> > This patch no longer applies. Any chance of a rebase?
>
> Of course. I’ll look at it this weekend,
Much appreciate it!
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | David Fetter <david(at)fetter(dot)org> |
Cc: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-03-09 17:37:23 |
Message-ID: | CA+TgmoZNCL9oB+mnYhZr_U7DEcQSsas-tWiZu6ocn=1=PLjdPA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Mon, Jan 15, 2018 at 11:35 AM, David Fetter <david(at)fetter(dot)org> wrote:
> - We follow the SQL standard and make SERIALIZABLE the default
> transaction isolation level, and
The consequences of such a decision would include:
- pgbench -S would run up to 10x slower, at least if these old
benchmark results are still valid:
- pgbench without -S would fail outright, because it doesn't have
provision to retry failed transactions.
https://commitfest.postgresql.org/16/1419/
- Many user applications would probably also experience similar difficulties.
- Parallel query would no longer work by default, unless this patch
gets committed:
https://commitfest.postgresql.org/17/1004/
I think a good deal of work to improve the performance of serializable
would need to be done before we could even think about making it the
default -- and even then, the fact that it really requires the
application to be retry-capable seems like a pretty major obstacle.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | David Fetter <david(at)fetter(dot)org>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-03-09 20:11:25 |
Message-ID: | CAEepm=1_eYb91Ts4wR1Y43BoDN-o1rR-Mm1_QkQ5+DU6QsxsjQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Sat, Mar 10, 2018 at 6:37 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Jan 15, 2018 at 11:35 AM, David Fetter <david(at)fetter(dot)org> wrote:
>> - We follow the SQL standard and make SERIALIZABLE the default
>> transaction isolation level, and
>
> The consequences of such a decision would include:
>
> - pgbench -S would run up to 10x slower, at least if these old
> benchmark results are still valid:
>
> https://www.postgresql.org/message-id/CA+TgmoZog1wFbyrqzJUkiLSXw5sDUjJGUeY0c2BqSG-tciSB7w@mail.gmail.com
>
> - pgbench without -S would fail outright, because it doesn't have
> provision to retry failed transactions.
>
> https://commitfest.postgresql.org/16/1419/
>
> - Many user applications would probably also experience similar difficulties.
>
> - Parallel query would no longer work by default, unless this patch
> gets committed:
>
> https://commitfest.postgresql.org/17/1004/
>
> I think a good deal of work to improve the performance of serializable
> would need to be done before we could even think about making it the
> default -- and even then, the fact that it really requires the
> application to be retry-capable seems like a pretty major obstacle.
Also:
- It's not available on hot standbys. Experimental patches have been
developed based on the read only safe snapshot concept, but some
tricky problems remain unsolved.
- Performance is terrible (conflicts are maximised) if you use any
index type except btree, unless some of these get committed:
https://commitfest.postgresql.org/17/1172/
https://commitfest.postgresql.org/17/1183/
https://commitfest.postgresql.org/17/1466/
--
Thomas Munro
http://www.enterprisedb.com
From: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, David Fetter <david(at)fetter(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-03-18 12:29:50 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
>
>>
>> This patch no longer applies. Any chance of a rebase?
>>
>
Attached is a rebased version of this patch. It takes into account the ACL checking changes and a few other minor amendments.
Cheers,
-Joe
Attachment | Content-Type | Size |
---|---|---|
0001-SQL-ASSERTION-prototype.patch | application/octet-stream | 150.8 KB |
From: | David Fetter <david(at)fetter(dot)org> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-03-18 19:36:53 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Sun, Mar 18, 2018 at 12:29:50PM +0000, Joe Wildish wrote:
> >
> >>
> >> This patch no longer applies. Any chance of a rebase?
> >>
> >
>
>
> Attached is a rebased version of this patch. It takes into account the ACL checking changes and a few other minor amendments.
Thanks!
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
From: | David Fetter <david(at)fetter(dot)org> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-03-28 15:13:59 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Sun, Mar 18, 2018 at 12:29:50PM +0000, Joe Wildish wrote:
> >
> >>
> >> This patch no longer applies. Any chance of a rebase?
>
> Attached is a rebased version of this patch. It takes into account
> the ACL checking changes and a few other minor amendments.
Sorry to bother you again, but this now doesn't compile atop master.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
From: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
---|---|
To: | David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-04-29 18:18:00 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 28 Mar 2018, at 16:13, David Fetter <david(at)fetter(dot)org> wrote:
>
> Sorry to bother you again, but this now doesn't compile atop master.
Attached is a rebased patch for the prototype.
Cheers,
-Joe
Attachment | Content-Type | Size |
---|---|---|
0001-SQL-ASSERTION-prototype.patch | application/octet-stream | 151.0 KB |
unknown_filename | text/plain | 3 bytes |
From: | David Fetter <david(at)fetter(dot)org> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-04-29 19:36:59 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Sun, Apr 29, 2018 at 07:18:00PM +0100, Joe Wildish wrote:
> On 28 Mar 2018, at 16:13, David Fetter <david(at)fetter(dot)org> wrote:
> >
> > Sorry to bother you again, but this now doesn't compile atop master.
>
> Attached is a rebased patch for the prototype.
Thanks!
This is great timing for the 12 cycle :)
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-09-24 14:06:02 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 29/04/2018 20:18, Joe Wildish wrote:
> On 28 Mar 2018, at 16:13, David Fetter <david(at)fetter(dot)org> wrote:
>>
>> Sorry to bother you again, but this now doesn't compile atop master.
>
> Attached is a rebased patch for the prototype.
I took a look at this.
This has been lying around for a few months, so it will need to be
rebased again. I applied this patch on top of
68e7e973d22274a089ce95200b3782f514f6d2f8, which was the HEAD around the
time this patch was created, and it applies cleanly there.
Please check you patch for whitespace errors:
warning: squelched 13 whitespace errors
warning: 18 lines add whitespace errors.
Also, reduce the amount of useless whitespace changes in the patch.
There are some compiler warnings:
constraint.c: In function 'CreateAssertion':
constraint.c:1211:2: error: ISO C90 forbids mixed declarations and code
[-Werror=declaration-after-statement]
constraint.c: In function 'oppositeDmlOp':
constraint.c:458:1: error: control reaches end of non-void function
[-Werror=return-type]
The version check in psql's describeAssertions() needs to be updated.
Also, you should use formatPGVersionNumber() to cope with two-part and
one-part version numbers.
All this new code in constraint.c that checks the assertion expression
needs more comments and documentation.
Stuff like this isn't going to work:
static int
funcMaskForFuncOid(Oid funcOid)
{
char *name = get_func_name(funcOid);
if (name == NULL)
return OTHER_FUNC;
else if (strncmp(name, "min", strlen("min")) == 0)
return MIN_AGG_FUNC;
else if (strncmp(name, "max", strlen("max")) == 0)
return MAX_AGG_FUNC;
You can assume from the name of a function what it's going to do.
Solving this properly might be hard.
The regression test crashes for me around
frame #4: 0x000000010d3a4cdc postgres`castNodeImpl(type=T_SubLink,
ptr=0x00007ff27006d230) at nodes.h:582
frame #5: 0x000000010d3a61c6
postgres`visitSubLink(node=0x00007ff270034040, info=0x00007ffee2a23930)
at constraint.c:843
This ought to be reproducible for you if you build with assertions.
My feeling is that if we want to move forward on this topic, we need to
solve the concurrency question first. All these optimizations for when
we don't need to check the assertion are cool, but they are just
optimizations that we can apply later on, once we have solved the
critical problems.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-09-24 23:04:12 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi Peter,
> On 24 Sep 2018, at 15:06, Peter Eisentraut
> <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>
> On 29/04/2018 20:18, Joe Wildish wrote:
>>
>> Attached is a rebased patch for the prototype.
>
> I took a look at this.
Thank you for reviewing.
> This has been lying around for a few months, so it will need to be
> rebased again.
>
> 8< - - - snipped for brevity - - - 8<
>
> All this new code in constraint.c that checks the assertion expression
> needs more comments and documentation.
All agreed. I’ll give the patch some TLC and get a new version that
addresses the above.
> Stuff like this isn't going to work:
>
> static int
> funcMaskForFuncOid(Oid funcOid)
> {
> char *name = get_func_name(funcOid);
>
> if (name == NULL)
> return OTHER_FUNC;
> else if (strncmp(name, "min", strlen("min")) == 0)
> return MIN_AGG_FUNC;
> else if (strncmp(name, "max", strlen("max")) == 0)
> return MAX_AGG_FUNC;
>
> You can assume from the name of a function what it's going to do.
> Solving this properly might be hard.
Agreed. My assumption was that we would record in the data dictionary the
behaviour (or “polarity") of each aggregate function with respect to the
various operators. Column in pg_aggregate? I don’t know how we’d record it
exactly. A bitmask would be a possibility. Also, I don’t know what we’d do
with custom aggregate functions (or indeed custom operators). Allowing end
users to determine the value would potentially lead to assertion checks
being incorrectly skipped. Maybe we’d say that custom aggregates always
have a neutral polarity and are therefore not subject to this
optimisation.
> This ought to be reproducible for you if you build with assertions.
Yes. I shall correct this when I do the aforementioned rebase and
application of TLC.
> My feeling is that if we want to move forward on this topic, we need to
> solve the concurrency question first. All these optimizations for when
> we don't need to check the assertion are cool, but they are just
> optimizations that we can apply later on, once we have solved the
> critical problems.
I obviously agree that the concurrency issue needs solving. But I don’t
see that at all as a separate matter from the algos. Far from being merely
optimisations, the research indicates we can go a lot further toward
reducing the need for rechecks and, therefore, reducing the chance of
concurrency conflicts from occurring in the first place. This is true
regardless of whatever mechanism we use to enforce correct behaviour under
concurrent modifications -- e.g. a lock on the ASSERTION object itself,
enforced use of SERIALIZABLE, etc.
By way of example (lifted directly from the AM4DP book):
CREATE TABLE employee (
id INTEGER PRIMARY KEY,
dept INTEGER NOT NULL,
job TEXT NOT NULL
);
CREATE ASSERTION department_managers_need_administrators CHECK
(NOT EXISTS
(SELECT dept
FROM employee a
WHERE EXISTS (SELECT * FROM employee b
WHERE a.dept = b.dept
AND b.job IN ('Manager', 'Senior Manager'))
AND NOT EXISTS (SELECT * FROM employee b
WHERE a.dept = b.dept
AND b.job = 'Administrator')));
The current implementation derives "DELETE(employee), INSERT(employee) and
UPDATE(employee.dept, employee.job)" as the set of invalidating operations
and triggers accordingly. However, in this case, we can supplement the
triggers by having them inspect the transition tables to see if the actual
data from the triggering DML statement could in fact affect the truth of
the expression: specifically, only do the recheck on DELETE of an
"Administrator", INSERT of a "Manager" or "Senior Manager", or UPDATE when
the new job is a "Manager" or "Senior Manager" or the old job was an
"Administrator".
Now, if this is a company with 10,000 employees, and would therefore
presumably only require a handful of managers, right? ;-), then the
potential for a concurrency conflict is massively reduced when compared to
rechecking every time the employee table is touched.
(This optimisation has some caveats and is reliant upon being able to
derive the key of an expression from the underlying base tables plus some
stuff about functional dependencies. I have started work on it but sadly
not had time to progress it in recent months).
Having said all that: there are obviously going to be some expressions
that cannot be proven to have no potential for invalidating the assertion
truth. I guess this is the prime concern from a concurrency PoV? Example:
CREATE TABLE t (
b BOOLEAN NOT NULL,
n INTEGER NOT NULL,
PRIMARY KEY (b, n)
);
CREATE ASSERTION sum_per_b_less_than_10 CHECK
(NOT EXISTS
(SELECT FROM (SELECT b, SUM(n)
FROM t
GROUP BY b) AS v(b, sum_n)
WHERE sum_n > 10));
Invalidating operations are "INSERT(t) and UPDATE(t.b, t.n)". I guess the
interesting case, from a concurrency perspective, is how do we avoid an
INSERT WHERE b IS TRUE from blocking an INSERT WHERE B IS FALSE? I don’t
have an answer to that unfortunately. Although my understanding was that
SSI could help in these sorts of cases, but I really haven't read or
looked into the detail (yet). Thoughts?
-Joe
From: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-09-25 00:51:28 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
>>>>> "Joe" == Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> writes:
Joe> Agreed. My assumption was that we would record in the data
Joe> dictionary the behaviour (or “polarity") of each aggregate
Joe> function with respect to the various operators. Column in
Joe> pg_aggregate? I don’t know how we’d record it exactly.
I haven't looked at the background of this, but if what you want to know
is whether the aggregate function has the semantics of min() or max()
(and if so, which) then the place to look is pg_aggregate.aggsortop.
(For a given aggregate foo(x), the presence of an operator oid in
aggsortop means something like "foo(x) is equivalent to (select x from
... order by x using OP limit 1)", and the planner will replace the
aggregate by the applicable subquery if it thinks it'd be faster.)
As for operators, you can only make assumptions about their meaning if
the operator is a member of some opfamily that assigns it some
semantics. For example, the planner can assume that WHERE x=y AND x=1
implies that y=1 (assuming x and y are of appropriate types) not because
it assumes that "=" is the name of a transitive operator, but because
the operators actually selected for (x=1) and (x=y) are both "equality"
members of the same btree operator family. Likewise proving that (a>2)
implies (a>1) requires knowing that > is a btree comparison op.
--
Andrew (irc:RhodiumToad)
From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-09-26 11:36:05 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 25/09/2018 01:04, Joe Wildish wrote:
> Having said all that: there are obviously going to be some expressions
> that cannot be proven to have no potential for invalidating the assertion
> truth. I guess this is the prime concern from a concurrency PoV?
Before we spend more time on this, I think we need to have at least a
plan for that. Perhaps we could should disallow cases that we can't
handle otherwise. But even that would need some analysis of which
practical cases we can and cannot handle, how we could extend support in
the future, etc.
In the meantime, I have committed parts of your gram.y changes that seem
to come up every time someone dusts off an assertions patch. Keep that
in mind when you rebase.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | David Fetter <david(at)fetter(dot)org> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-09-26 18:47:30 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Tue, Sep 25, 2018 at 12:04:12AM +0100, Joe Wildish wrote:
> Hi Peter,
>
> > My feeling is that if we want to move forward on this topic, we need to
> > solve the concurrency question first. All these optimizations for when
> > we don't need to check the assertion are cool, but they are just
> > optimizations that we can apply later on, once we have solved the
> > critical problems.
>
> Having said all that: there are obviously going to be some expressions
> that cannot be proven to have no potential for invalidating the assertion
> truth. I guess this is the prime concern from a concurrency PoV? Example:
>
> CREATE TABLE t (
> b BOOLEAN NOT NULL,
> n INTEGER NOT NULL,
> PRIMARY KEY (b, n)
> );
>
> CREATE ASSERTION sum_per_b_less_than_10 CHECK
> (NOT EXISTS
> (SELECT FROM (SELECT b, SUM(n)
> FROM t
> GROUP BY b) AS v(b, sum_n)
> WHERE sum_n > 10));
>
> Invalidating operations are "INSERT(t) and UPDATE(t.b, t.n)".
So would DELETE(t), assuming n can be negative.
Is there some interesting and fairly easily documented subset of
ASSERTIONs that wouldn't have the "can't prove" property?
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
From: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-09-29 16:35:18 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 26 Sep 2018, at 12:36, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>
> On 25/09/2018 01:04, Joe Wildish wrote:
>> Having said all that: there are obviously going to be some expressions
>> that cannot be proven to have no potential for invalidating the assertion
>> truth. I guess this is the prime concern from a concurrency PoV?
>
> Before we spend more time on this, I think we need to have at least a
> plan for that.
Having thought about this some more: the answer could lie in using predicate
locks, and enforcing that the transaction be SERIALIZABLE whenever an ASSERTION
is triggered.
To make use of the predicate locks we'd do a transformation on the ASSERTION
expression. I believe that there is derivation, similar to the one mentioned
up-thread re: "managers and administrators", that would essentially push
predicates into the expression on the basis of the changed data. The semantics
of the expression would remain unchanged, but it would mean that when the
expression is rechecked, the minimal set of data is read and would therefore not
conflict with other DML statements that had triggered the same ASSERTION but had
modified unrelated data. Example:
CREATE TABLE t
(n INTEGER NOT NULL,
m INTEGER NOT NULL,
k INTEGER NOT NULL,
PRIMARY KEY (n, m));
CREATE ASSERTION sum_k_at_most_10 CHECK
(NOT EXISTS
(SELECT * FROM
(SELECT n, sum(k)
FROM t
GROUP BY n)
AS r(n, ks)
WHERE ks > 10));
On an INSERT/DELETE/UPDATE of "t", we would transform the inner-most expression
of the ASSERTION to have a predicate of "WHERE n = NEW.n". In my experiments I
can see that doing so allows concurrent transactions to COMMIT that have
modified unrelated segments of "t" (assuming the planner uses Index Scan). The
efficacy of this would be dictated by the granularity of the SIREAD locks; my
understanding is that this can be as low as tuple-level in the case where Index
Scans are used (and this is borne out in my experiments - ie. you don't want a
SeqScan).
> Perhaps we could should disallow cases that we can't
> handle otherwise. But even that would need some analysis of which
> practical cases we can and cannot handle, how we could extend support in
> the future, etc.
The optimisation I mentioned up-thread, plus the one hypothesised here, both
rely on being able to derive the key of an expression from the underlying base
tables/other expressions. We could perhaps disallow ASSERTIONS that don't have
such properties?
Beyond that I think it starts to get difficult (impossible?) to know which
expressions are likely to be costly on the basis of static analysis. It could be
legitimate to have an ASSERTION defined over what turns out to be a small subset
of a very large table, for example.
-Joe
From: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
---|---|
To: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-09-29 16:50:58 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi Andrew,
On 25 Sep 2018, at 01:51, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
> I haven't looked at the background of this, but if what you want to know
> is whether the aggregate function has the semantics of min() or max()
> (and if so, which) then the place to look is pg_aggregate.aggsortop.
Thanks for the pointer. I've had a quick look at pg_aggregate, and back
at my code, but I think there is more to it than just the sorting property.
Specifically we need to know about the aggregate function when combined with
connectors <, <=, < ANY, <= ANY, < ALL and <= ALL (and their equivalents
with ">" and ">="). Also, it looks like COUNT and SUM don't have a sortop
(the other aggregates I've catered for do though).
When I come to do the rework of the patch I'll take a more in-depth look
though, and see if this can be utilised.
> As for operators, you can only make assumptions about their meaning if
> the operator is a member of some opfamily that assigns it some
> semantics.
I had clocked the BT semantics stuff when doing the PoC patch. I have used
the "get_op_btree_interpretation" function for determining operator meaning.
-Joe
From: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
---|---|
To: | David Fetter <david(at)fetter(dot)org> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-09-29 17:07:41 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi David,
> On 26 Sep 2018, at 19:47, David Fetter <david(at)fetter(dot)org> wrote:
>
>> Invalidating operations are "INSERT(t) and UPDATE(t.b, t.n)".
>
> So would DELETE(t), assuming n can be negative.
Oops, right you are. Bug in my implementation :-)
> Is there some interesting and fairly easily documented subset of
> ASSERTIONs that wouldn't have the "can't prove" property?
We can certainly know at the time the ASSERTION is created if we
can use the transition table optimisation, as that relies upon
the expression being written in such a way that a key can be
derived for each expression.
We could warn or disallow the creation on that basis. Ceri & Widom
mention this actually in their papers, and their view is that most
real-world use cases do indeed allow themselves to be optimised
using the transition tables.
-Joe
From: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
---|---|
To: | Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-09-29 23:07:29 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
>>>>> "Joe" == Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx> writes:
>> I haven't looked at the background of this, but if what you want to
>> know is whether the aggregate function has the semantics of min() or
>> max() (and if so, which) then the place to look is
>> pg_aggregate.aggsortop.
Joe> Thanks for the pointer. I've had a quick look at pg_aggregate, and
Joe> back at my code, but I think there is more to it than just the
Joe> sorting property. Specifically we need to know about the aggregate
Joe> function when combined with connectors <, <=, < ANY, <= ANY, < ALL
Joe> and <= ALL (and their equivalents with ">" and ">=").
The presence of an aggsortop means "this aggregate function is
interchangeable with (select x from ... order by x using OP limit 1)",
with all of the semantic consequences that implies. Since OP must be the
"<" or ">" member of a btree index opclass, the semantics of its
relationships with other members of the same opfamily can be deduced
from that.
Joe> Also, it looks like COUNT and SUM don't have a sortop
Right, because those currently have no semantics that PG needs to know
about or describe.
--
Andrew (irc:RhodiumToad)
From: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
---|---|
To: | joe-postgresql(dot)org(at)elusive(dot)cx |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Fetter <david(at)fetter(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2018-11-29 15:54:14 |
Message-ID: | CA+q6zcUdBQE6Pim162Z+P92eOtNJZXP6LZXb2aphP7amnKAdpw@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
> On Tue, Sep 25, 2018 at 1:04 AM Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx>
> wrote:
>
> All agreed. I’ll give the patch some TLC and get a new version that
> addresses the above.
Hi,
Just a reminder, that the patch still needs to be rebased, could you please do
this? I'm moving the item to the next CF.
From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
Cc: | joe-postgresql(dot)org(at)elusive(dot)cx, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Fetter <david(at)fetter(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Implementing SQL ASSERTION |
Date: | 2019-01-31 11:22:11 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
On 2018-11-29 16:54:14 +0100, Dmitry Dolgov wrote:
> > On Tue, Sep 25, 2018 at 1:04 AM Joe Wildish <joe-postgresql(dot)org(at)elusive(dot)cx>
> > wrote:
> >
> > All agreed. I’ll give the patch some TLC and get a new version that
> > addresses the above.
>
> Hi,
>
> Just a reminder, that the patch still needs to be rebased, could you please do
> this? I'm moving the item to the next CF.
As nothing has happened, I'm marking this patch as returned with feedback.
Greetings,
Andres Freund