Cleaning up ERRCODE usage in our XML code

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Cleaning up ERRCODE usage in our XML code
Date: 2024-09-14 19:14:28
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I noticed while working on bug #18617 [1] that we are fairly slipshod
about which SQLSTATE to report when libxml2 returns an error. There
are places using ERRCODE_INTERNAL_ERROR for easily-triggered errors;
there are different places that use different ERRCODEs for exactly
the same condition; and there are places that use different ERRCODEs
for failures from xmlXPathCompile and xmlXPathCompiledEval. I found
that this last is problematic because some errors you might expect
to be reported during XPath compilation are not detected till
execution, notably namespace-related errors. That seems more like
a libxml2 implementation artifact than something we should expect to
be stable behavior, so I think we should avoid using different
ERRCODEs.

A lot of this can be blamed on there not being any especially on-point
SQLSTATE values back when this code was first written. I learned that
recent revisions of SQL have a whole new SQLSTATE class, class 10 =
"XQuery Error", so we have an opportunity to sync up with that as well
as be more self-consistent. The spec's subclass codes in this class
seem quite fine-grained. It might be an interesting exercise to try
to teach xml_errorHandler() to translate libxml2's error->code values
into fine-grained SQLSTATEs, but I've not attempted that; I'm not
sure whether there is a close mapping between what libxml2 reports
and the set of codes the SQL committee chose. What I've done in the
attached first-draft patch is just to select one relatively generic
code in class 10, 10608 = invalid_argument_for_xquery, and use that
where it seemed apropos.

This is pretty low-priority, so I'll stash it in the next CF.

regards, tom lane

[1] https://www.postgresql.org/message-id/356363.1726333674%40sss.pgh.pa.us

Attachment Content-Type Size
v1-clean-up-errcodes-for-xml.patch text/x-diff 8.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Cleaning up ERRCODE usage in our XML code
Date: 2024-09-20 19:00:42
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> [ v1-clean-up-errcodes-for-xml.patch ]

Per cfbot, rebased over d5622acb3. No functional changes.

regards, tom lane

Attachment Content-Type Size
v2-clean-up-errcodes-for-xml.patch text/x-diff 8.9 KB

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Cleaning up ERRCODE usage in our XML code
Date: 2024-09-23 11:19:17
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On 20 Sep 2024, at 21:00, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> I wrote:
>> [ v1-clean-up-errcodes-for-xml.patch ]
>
> Per cfbot, rebased over d5622acb3. No functional changes.

Looking over these I don't see anything mis-characterized so +1 on going ahead
with these. It would be neat if we end up translating xml2 errors into XQuery
Error SQLSTATEs but this is a clear improvement over what we have until then.

There is an ERRCODE_INTERNAL_ERROR in xml_out_internal() which seems a tad odd
given that any error would be known to be parsing related and b) are caused by
libxml and not internally. Not sure if it's worth bothering with but with the
other ones improved it stood out.

--
Daniel Gustafsson


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Cleaning up ERRCODE usage in our XML code
Date: 2024-09-23 17:17:02
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Daniel Gustafsson <daniel(at)yesql(dot)se> writes:
>> On 20 Sep 2024, at 21:00, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Per cfbot, rebased over d5622acb3. No functional changes.

> Looking over these I don't see anything mis-characterized so +1 on going ahead
> with these. It would be neat if we end up translating xml2 errors into XQuery
> Error SQLSTATEs but this is a clear improvement over what we have until then.

Thanks for looking at it!

> There is an ERRCODE_INTERNAL_ERROR in xml_out_internal() which seems a tad odd
> given that any error would be known to be parsing related and b) are caused by
> libxml and not internally. Not sure if it's worth bothering with but with the
> other ones improved it stood out.

Yeah, I looked at that but wasn't sure what to do with it. We should
have validated the decl header when the XML value was created, so if
we get here then either the value got corrupted on-disk or in-transit,
or something forgot to do that validation, or libxml has changed its
mind since then about what's a valid decl. At least some of those
cases are probably legitimately called INTERNAL_ERROR. I thought for
awhile about ERRCODE_DATA_CORRUPTED, but I'm not convinced that's a
better fit.

regards, tom lane


From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Cleaning up ERRCODE usage in our XML code
Date: 2024-09-24 06:48:06
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On 23 Sep 2024, at 19:17, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Daniel Gustafsson <daniel(at)yesql(dot)se> writes:
>
>> There is an ERRCODE_INTERNAL_ERROR in xml_out_internal() which seems a tad odd
>> given that any error would be known to be parsing related and b) are caused by
>> libxml and not internally. Not sure if it's worth bothering with but with the
>> other ones improved it stood out.
>
> Yeah, I looked at that but wasn't sure what to do with it. We should
> have validated the decl header when the XML value was created, so if
> we get here then either the value got corrupted on-disk or in-transit,
> or something forgot to do that validation, or libxml has changed its
> mind since then about what's a valid decl. At least some of those
> cases are probably legitimately called INTERNAL_ERROR. I thought for
> awhile about ERRCODE_DATA_CORRUPTED, but I'm not convinced that's a
> better fit.

I agree that it might not be an obvious better fit, but also not an obvious
worse fit. It will make it easier to filter on during fleet analysis so I
would be inclined to change it, but the main value of the patch are other hunks
so feel free to ignore.

--
Daniel Gustafsson


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Cleaning up ERRCODE usage in our XML code
Date: 2024-09-24 17:01:04
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Daniel Gustafsson <daniel(at)yesql(dot)se> writes:
> On 23 Sep 2024, at 19:17, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Yeah, I looked at that but wasn't sure what to do with it. We should
>> have validated the decl header when the XML value was created, so if
>> we get here then either the value got corrupted on-disk or in-transit,
>> or something forgot to do that validation, or libxml has changed its
>> mind since then about what's a valid decl. At least some of those
>> cases are probably legitimately called INTERNAL_ERROR. I thought for
>> awhile about ERRCODE_DATA_CORRUPTED, but I'm not convinced that's a
>> better fit.

> I agree that it might not be an obvious better fit, but also not an obvious
> worse fit. It will make it easier to filter on during fleet analysis so I
> would be inclined to change it, but the main value of the patch are other hunks
> so feel free to ignore.

Fair enough. Pushed with ERRCODE_DATA_CORRUPTED used there.
Thanks again for reviewing.

regards, tom lane


From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Cleaning up ERRCODE usage in our XML code
Date: 2025-04-09 11:29:32
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 14.09.24 21:14, Tom Lane wrote:
> +Section: Class 10 - XQuery Error
> +
> +# recent SQL versions define quite a few codes in this class, but for now
> +# we are only using this generic one
> +10608 E ERRCODE_INVALID_ARGUMENT_FOR_XQUERY invalid_argument_for_xquery

Could you share what SQL standard document version you got this from? I
don't see this particular code in any version I have.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Cleaning up ERRCODE usage in our XML code
Date: 2025-04-09 14:03:17
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter(at)eisentraut(dot)org> writes:
> On 14.09.24 21:14, Tom Lane wrote:
>> +Section: Class 10 - XQuery Error
>> +
>> +# recent SQL versions define quite a few codes in this class, but for now
>> +# we are only using this generic one
>> +10608 E ERRCODE_INVALID_ARGUMENT_FOR_XQUERY invalid_argument_for_xquery

> Could you share what SQL standard document version you got this from? I
> don't see this particular code in any version I have.

I think I stole it from DB2:

10608 An error was encountered in the argument of an XQuery function or operator.

https://www.ibm.com/docs/en/db2-for-zos/12.0.0?topic=codes-sqlstate-values-common-error#db2z_sqlstatevalues__classcode10

If you feel motivated to replace that with some finer-grained codes,
it's OK by me.

regards, tom lane


From: Chapman Flack <jcflack(at)acm(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Cleaning up ERRCODE usage in our XML code
Date: 2025-04-09 14:43:22
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04/09/25 10:03, Tom Lane wrote:
> I think I stole it from DB2:
>
> 10608 An error was encountered in the argument of an XQuery function or operator.

Interesting. ISO does seem to reserve a class 10 for XQuery errors, but
not to define any subclass codes within it.

Every other SQL/XML error you can get is in class 22, except for
"unmappable XML name" and "invalid XML character" in class 0N.

Regards,
-Chap