Обсуждение: PostgreSQL vs SQL/XML Standards

Поиск
Список
Период
Сортировка

PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
Inspired by the wiki page on PostgreSQL vs SQL Standard in general,
I have made another wiki page specifically about $subject. I hope
this was not presumptuous, and invite review / comment. I have not
linked to it from any other page yet.

https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards

-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2018-Oct-24, Chapman Flack wrote:

> Inspired by the wiki page on PostgreSQL vs SQL Standard in general,
> I have made another wiki page specifically about $subject. I hope
> this was not presumptuous, and invite review / comment. I have not
> linked to it from any other page yet.
> 
> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards

Wow, that's ... overwhelming.  (I do wonder if we should stop relying on
libxml2 and instead look for something supporting XQuery).

Would you review Markus Winand patch here?
https://postgr.es/m/8BDB0627-2105-4564-AA76-7849F028B96E@winand.at
I think doing that would probably point out a couple of ways in which
our XMLTABLE implementation is non-conformant, and then fixes it :-)
I've been unsure as to applying it to all branches since 10 or just to
master.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PostgreSQL vs SQL/XML Standards

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2018-Oct-24, Chapman Flack wrote:
>> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards

> Wow, that's ... overwhelming.  (I do wonder if we should stop relying on
> libxml2 and instead look for something supporting XQuery).

I think getting out from under libxml2's idiosyncrasies and security
lapses would be great, but is there a plausible alternative out there?

            regards, tom lane


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 10/25/18 10:39 AM, Tom Lane wrote:
> I think getting out from under libxml2's idiosyncrasies and security
> lapses would be great, but is there a plausible alternative out there?

Depends on whether anything in [1] sounds plausible.

-Chap


[1]:
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Possible_ways_forward


Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2018-Oct-25, Chapman Flack wrote:

> On 10/25/18 10:39 AM, Tom Lane wrote:
> > I think getting out from under libxml2's idiosyncrasies and security
> > lapses would be great, but is there a plausible alternative out there?
> 
> Depends on whether anything in [1] sounds plausible.
> 
> [1]:
> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Possible_ways_forward

Heh, I didn't notice this part of the document.  Integrating a C runtime
of a Java library sounds nightmarish -- I wouldn't even think about
that.

XQilla seems to depend on Xerces, and seems to have died in 2011.

Zorba appears to have been taken propietary, from the looks of its last
commits.

Maybe the best way forward is to implement all the JSON functionality
and remove the SQL/XML bits.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


čt 25. 10. 2018 v 17:09 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2018-Oct-25, Chapman Flack wrote:

> On 10/25/18 10:39 AM, Tom Lane wrote:
> > I think getting out from under libxml2's idiosyncrasies and security
> > lapses would be great, but is there a plausible alternative out there?
>
> Depends on whether anything in [1] sounds plausible.
>
> [1]:
> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Possible_ways_forward

Heh, I didn't notice this part of the document.  Integrating a C runtime
of a Java library sounds nightmarish -- I wouldn't even think about
that.

XQilla seems to depend on Xerces, and seems to have died in 2011.

Zorba appears to have been taken propietary, from the looks of its last
commits.

Maybe the best way forward is to implement all the JSON functionality
and remove the SQL/XML bits.

It can be bigger compatibility break in Postgres history. SQL/XML functions are widely used.

Regards

Pavel


--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

От
Andreas Karlsson
Дата:
On 10/25/2018 03:53 PM, Chapman Flack wrote:
> On 10/25/18 10:39 AM, Tom Lane wrote:
>> I think getting out from under libxml2's idiosyncrasies and security
>> lapses would be great, but is there a plausible alternative out there?
> 
> Depends on whether anything in [1] sounds plausible.

The libraries we depend on should really either be available in the 
package repositories of the major Linux distribution or be something we 
can put in our own repository and maintain without too much pain. So 
using Saxon/C does not seem like a realistic option.

Andreas


Re: PostgreSQL vs SQL/XML Standards

От
Andrew Dunstan
Дата:

On 10/25/2018 11:23 AM, Andreas Karlsson wrote:
> On 10/25/2018 03:53 PM, Chapman Flack wrote:
>> On 10/25/18 10:39 AM, Tom Lane wrote:
>>> I think getting out from under libxml2's idiosyncrasies and security
>>> lapses would be great, but is there a plausible alternative out there?
>>
>> Depends on whether anything in [1] sounds plausible.
>
> The libraries we depend on should really either be available in the 
> package repositories of the major Linux distribution or be something 
> we can put in our own repository and maintain without too much pain. 
> So using Saxon/C does not seem like a realistic option.
>


Yeah, very good point. xqilla/xerces-C appears to be widely available 
(Centos and ubuntu, at least).

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 10/25/18 11:08 AM, Alvaro Herrera wrote:

> XQilla seems to depend on Xerces, and seems to have died in 2011.

¿Eh? The latest release, 2.3.4 [1], is dated 2018-07-03.

It looks like the latest development has been happening on the
xquilla_2_3 branch. Sometimes project "activity" statistics rely
exclusively on the "master" branch ("xqilla" branch in this case),
and are deceptive if the project isn't being developed exclusively
by coding on master and backpatching to others.

I've noticed I'm facing the same thing in PL/Java ... plenty of
development lately, but on the REL1_5_STABLE branch. GitHub's
project statistics (and also Open Hub's) are just looking at master
and saying the project's been dead for two years. Now that 1.5.1 is
released, as soon as I get some of REL1_5_STABLE merged /up/ into
master, the statistics will probably magically show it's been alive
all along.

> Zorba appears to have been taken propietary, from the looks of its last
> commits.

It does seem harder to see what's going on there, but the commits
with "copyright changed" as the message turn out to be changing
only the copyright holder in the Apache 2.0 license from
"The FLWOR Foundation" to "zorba.io". But Matthias Brantner
participated with interest in the 2010 thread here where Zorba
was brought up before[2], so he may know something.

> Integrating a C runtime of a Java library sounds nightmarish --
> I wouldn't even think about that.

Or whether or not nightmarish, certainly duplicative of something
we can kinda already do.

In a way, some of the pressure is off, because if you need
a true XMLQUERY or XMLTABLE, you can get them with the Saxon-in-PL/Java
implementation, and right now in any supported PG version. You just
have to spell them funny, doing without the sugary syntax built into
the parser. They're missing some of the automatic casts from
the standard at the moment, which isn't really a loss of function,
as explicit casts can be added to any query needing them ... a temporary
annoyance until the rest of that example's in place.

But a roadmap that could lead to eventual availability of one of the
C/C++ implementations would be nice too.

-Chap


[1]: https://sourceforge.net/projects/xqilla/files/
[2]:
https://www.postgresql.org/message-id/7DDDB18E-041F-4238-B91D-3277EB1CE5BC%4028msec.com


Re: PostgreSQL vs SQL/XML Standards

От
Jesper Pedersen
Дата:
On 10/25/18 2:33 PM, Andrew Dunstan wrote:
> 
> Yeah, very good point. xqilla/xerces-C appears to be widely available 
> (Centos and ubuntu, at least).
> 

xqilla/xerces-c are in the Fedora/RHEL repo too.

Best regards,
  Jesper


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:
Hi


But a roadmap that could lead to eventual availability of one of the
C/C++ implementations would be nice too.

Somebody should to do some work and write patch :/. Although libxml2 is after feature freeze - it is code widely used. The change of XML support should be safe, because there can be lot of work.

I am thinking so I can fix some issues related to XMLTABLE. Please, send me more examples and test cases.

Regards

Pavel

Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2018-Oct-25, Pavel Stehule wrote:

> I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> more examples and test cases.

Please see Markus Winand's patch that I referenced upthread.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 10/25/18 11:15, Pavel Stehule wrote:
> čt 25. 10. 2018 v 17:09 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com>
> napsal:
>> Maybe the best way forward is to implement all the JSON functionality
>> and remove the SQL/XML bits.
> 
> It can be bigger compatibility break in Postgres history. SQL/XML functions
> are widely used.

It seems to me that evolution to the 2006+ standard version could be done
mostly non-disruptively (provided an agreeable library can be found).

I think Tom's suggestion[1] to just make XML OPTION CONTENT mean what
it means in 2006+ would be an easy change to make immediately, and would
not disrupt anybody ... it would only make some things succeed that now
fail, and it would match what our documentation already says. It would
make our XML type equivalent to 2006+ XML(CONTENT(ANY)).

Beyond that, further steps toward 2006+ could largely avoid disruption.

If we implement the typmod'ed XML types, surely the parser would simply
treat untypmod'ed 'XML' as meaning XML(CONTENT(ANY)). (The standard does
allow for the typmod to be missing, and leaves it "implementation-defined
whether SEQUENCE, CONTENT(ANY), or CONTENT(UNTYPED) is implicit", so
that's all by the book.)

The existing functions xpath and xpath_exists can be kept unchanged,
as their names are distinct from anything in the standard. A library
that supports XQuery is likely also to support XPath in "1.0 compatibility
mode", so those functions could keep their semantics.

The current xmlvalidate() has the wrong semantics and return type, but it
also does nothing but ereport unimplemented, so no current uses would be
hurt by redefining it.

XMLTABLE would be the headache. Using the standard name for something
that ain't the standard function has not left any painless way that the
standard function could be added. OTOH, it has only been in the wild
since 10, so renaming it to something else (xpath_table?) will probably
be more painless if done soon than it ever would be later.

On 10/25/18 11:23, Andreas Karlsson wrote:
> The libraries we depend on should really either be available in the
> package repositories of the major Linux distribution or be something
> we can put in our own repository and maintain without too much pain.
> So using Saxon/C does not seem like a realistic option.

That makes good sense. The approach I proposed in [2] would be to
target the XQC API as an integration point. If there is one library
that might be most acceptable (it seems xqilla is in several repositories),
it could become a preferred or supported choice, but others could be
available if an administrator wanted to separately obtain them, perhaps
because of better performance on a particular workload, or avoidance of
some bug that a given workload turns up.

-Chap

[1]: https://www.postgresql.org/message-id/22271.1540458133%40sss.pgh.pa.us
[2]:
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#One_proposal


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


XMLTABLE would be the headache. Using the standard name for something
that ain't the standard function has not left any painless way that the
standard function could be added. OTOH, it has only been in the wild
since 10, so renaming it to something else (xpath_table?) will probably
be more painless if done soon than it ever would be later.


I don't share your opinion. XMLTABLE implements subset of standard. More it is well compatible with Oracle (in this subset).

If we have library with XPath 2.0 or higher, we can continue with it.

Regards

Pavel

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 10/25/18 09:56, Alvaro Herrera wrote:
> Would you review Markus Winand patch here?
> https://postgr.es/m/8BDB0627-2105-4564-AA76-7849F028B96E@winand.at
> I think doing that would probably point out a couple of ways in which
> our XMLTABLE implementation is non-conformant, and then fixes it :-)
> I've been unsure as to applying it to all branches since 10 or just to
> master.

Well, modulo the key observation that it simply *is not* conformant
until it accepts XML Query expressions and uses the XPath 2.0 type system
and data model and the SQL/XML 2006+ casting rules ...

... and there is no ISO standard that says anything about how an XPath 1.0-
based quasi-XMLTABLE-ish function ought to behave, so it's hard to say
that anything this function does is right or wrong, per ISO ...

I think all of the changes in these patches do make it a more useful
quasi-XMLTABLE-ish function, as the pre-patch behaviors were less useful
(if not outright bewildering). And they produce output that better matches
what the XQuery-based ISO rules produce (for the subset of queries that
mean the same thing as XQuery and as XPath 1.0).

I also looked at the (not yet applied?)
XML-XPath-comments-processing-instructions-array-ind patch and I think
it, too, makes the behavior more useful. I did not actually take the
time to build a PostgreSQL with the patch, but I took the two added
regression queries, syntax-desugared them[1] and called the Saxon XQuery-
based "xmltable" example with them, and got the same expected results:

SELECT xmltable.* FROM
(SELECT '<root><element>a1a<!-- aaaa -->a2a<?aaaaa pi?> <!--z-->
bbbb<x>xxx</x>cccc</element></root>'::xml AS ".") AS p,
"xmltable"('/root', PASSING => p, COLUMNS => ARRAY[
'string(element)', 'string(element/comment()[2])',
'string(element/processing-instruction())', 'string(element/text()[1])',
'serialize(element)'])
AS (element text, cmnt text, pi text, t1 text, x text);

       element        | cmnt | pi | t1  |
     x

----------------------+------+----+-----+---------------------------------------------------------------------------------
 a1aa2a   bbbbxxxcccc | z    | pi | a1a | <element>a1a<!-- aaaa
-->a2a<?aaaaa pi?> <!--z-->  bbbb<x>xxx</x>cccc</element>
(1 row)

SELECT xmltable.* FROM
(SELECT '<root><element>a1a<!-- aaaa -->a2a<?aaaaa pi?> <!--z-->
bbbb<x>xxx</x>cccc</element></root>'::xml AS ".") AS p,
"xmltable"('/root', PASSING => p, COLUMNS => ARRAY[
'string(element/text())', 'string(element/comment()[2])',
'string(element/processing-instruction())', 'string(element/text()[1])',
'serialize(element)'])
AS (element text, cmnt text, pi text, t1 text, x text);

ERROR:  java.sql.SQLException: A sequence of more than one item is not
allowed as the first argument of fn:string() (text("a1a"), text("a2a"))


Agreement. Agreement is good. :)

So I think they are worth applying. I can't bring myself to a strong
opinion on whether they are or aren't worth backpatching; if it were
the function described by the standard, they'd be bugs and they would
be, but does making a non-standard function behave slightly more like
the standard function that it isn't count as a bug fix or an enhancement?

My overall feeling, at least in directing my own effort, is that I'd rather
spend time toward getting the real XQuery-based semantics in place somehow,
and ongoing enhancements to the XPath-1.0-based stuff feel more like
pouring treasure down a hole.

But these enhancements seem like good ones, and if there's interest in
patching a couple more, the "unexpected XPath object type {2,3}" in [2]
might be good candidates. That would be backpatchable, as the current
behavior clearly isn't useful.

One other thing: I think the commit message on the context-item patch
is really somewhat misleading: "According to the SQL standard, the context
of XMLTABLE's XPath row_expression is the document node of the XML input
document..." Really the standard says nothing of the sort. It is a
limitation of XPath 1.0 that the input even has to be a document at all.
In the real XMLTABLE, you could be passing a context item that is a
document node (of either 'document' or 'content' flavor), or a one-item
'sequence' holding an atomic type like a number or date, or even a naked
XML node like a PI or comment or attribute node you're more used to seeing
only inside a document. The real rule is just that the context item is
exactly the thing you passed (or a copy of it, when the rules say so).
It collapses in the XPath 1.0 case to having to be a document node, simply
because that's the only thing you can pass in.

What was wrong with the pre-patch code was that it wasn't just using
the 'doc' it was given, but actually calling xmlDocGetRootElement() on it
and setting the CI to one of its children. The patch just directly
assigns doc to xpathctx->node, which I would call correct, not because
it's a document node, but because it's the thing that was passed.

-Chap




[1] You might notice in addition to desugaring the XMLTABLE syntax, I
wrapped the text-returning column paths in string(), and wrapped the
xml-returning one in serialize() and changed its result column to text.
Those changes are just to work around the parts of the Saxon example that
aren't implemented yet, as explained in [3].

[2] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#XMLTABLE

[3] https://tada.github.io/pljava/examples/saxon.html#Using_the_Saxon_examples


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 10/25/18 23:16, Pavel Stehule wrote:
>> XMLTABLE would be the headache. Using the standard name for something
>> that ain't the standard function has not left any painless way that the
>> standard function could be added. OTOH, it has only been in the wild
>> since 10, so renaming it to something else (xpath_table?) will probably
>> be more painless if done soon than it ever would be later.
>>
> I don't share your opinion. XMLTABLE implements subset of standard. More it
> is well compatible with Oracle (in this subset).
> 
> If we have library with XPath 2.0 or higher, we can continue with it.

The difficulty here is that the expression language required by the standard
is XQuery, and an XPath expression (whether 1.0 or 2.0+) can always be
parsed as an XQuery expression. (So, /syntactically/, yes, "subset".)

For XPath 2.0, that is no problem, because an XPath 2.0 expression and
the identically-spelled XQuery expression /mean the same thing/.

For XPath 1.0, it is very definitely a problem, because an XPath 1.0
expression and the identically-spelled XQuery expression /do not mean
the same thing/. Some of the important semantic differences are in [1].

So, if a future PostgreSQL version has an XMLTABLE function that accepts
XQuery, as the standard requires, and existing users upgrade and they have
XMLTABLE query expressions written as XPath 1.0, those queries will be
accepted and parsed, but they will not mean the same thing. The function
will not be able to tell when it is being called with XQuery semantics
intended, vs. when it is being called with XPath 1.0 semantics intended.

Now, perhaps there is a nicer way than renaming the function. It could
work like overloading. Create two trivial domains over text, say xpath1
and xquery, and have two XMLTABLE functions with different first parameter
types. Then if you called with the expression '"cat" < "dog"'::xquery
you would get the correct result 't', and with '"cat" < "dog"'::xpath1
you would get the (also correct) result 'f'.

(It would not be exactly overloading, because of the special sugared
syntax known to the parser, but it could look like overloading, and be
intuitive to the user.)

If you have convenient access to Oracle to check compatibility, could you
compare this query?

SELECT * FROM XMLTABLE('.'
PASSING '<sale hatsize="7" customer="alice" taxable="false"/>'
COLUMNS
a boolean PATH 'string("cat" < "dog")',
b boolean PATH 'string("cat" > "dog")',
c boolean PATH 'string(sale/@taxable = false())');

(I suspect in Oracle it would also work without the string() wrappings,
but just to make it easy, I think this way it will work in both Oracle
and PG—that is, not error, though results may differ.)

-Chap


[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


pá 26. 10. 2018 v 6:25 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 10/25/18 23:16, Pavel Stehule wrote:
>> XMLTABLE would be the headache. Using the standard name for something
>> that ain't the standard function has not left any painless way that the
>> standard function could be added. OTOH, it has only been in the wild
>> since 10, so renaming it to something else (xpath_table?) will probably
>> be more painless if done soon than it ever would be later.
>>
> I don't share your opinion. XMLTABLE implements subset of standard. More it
> is well compatible with Oracle (in this subset).
>
> If we have library with XPath 2.0 or higher, we can continue with it.

The difficulty here is that the expression language required by the standard
is XQuery, and an XPath expression (whether 1.0 or 2.0+) can always be
parsed as an XQuery expression. (So, /syntactically/, yes, "subset".)

For XPath 2.0, that is no problem, because an XPath 2.0 expression and
the identically-spelled XQuery expression /mean the same thing/.

For XPath 1.0, it is very definitely a problem, because an XPath 1.0
expression and the identically-spelled XQuery expression /do not mean
the same thing/. Some of the important semantic differences are in [1].

So, if a future PostgreSQL version has an XMLTABLE function that accepts
XQuery, as the standard requires, and existing users upgrade and they have
XMLTABLE query expressions written as XPath 1.0, those queries will be
accepted and parsed, but they will not mean the same thing. The function
will not be able to tell when it is being called with XQuery semantics
intended, vs. when it is being called with XPath 1.0 semantics intended.

If we have a library with XQuery, then we can change the behave. But world accepting of XQuery is not wide, unfortunately.

When I wrote and tested XMLTABLE, I found only few examples of where XQuery was used. So first there should be any library with XQUery implementation that can be used in Postgres. This library should be fast, well tested without memory leaks. Elsewhere discussion is premature. I am not terrible happy from libxml2 design, documentation, manuals - and I will not against we can migrate to some better. On second hand - libxml2 code is working - and it is widely used. It can be big mistake if we use Java library and if we create dependency on Java. After, there are only few libs that doesn't significantly better than libxml2.



Now, perhaps there is a nicer way than renaming the function. It could
work like overloading. Create two trivial domains over text, say xpath1
and xquery, and have two XMLTABLE functions with different first parameter
types. Then if you called with the expression '"cat" < "dog"'::xquery
you would get the correct result 't', and with '"cat" < "dog"'::xpath1
you would get the (also correct) result 'f'.

Probably it can works, but needs more work on Postgres infrastructure. If you overload functions like this, then type should be used every time.


(It would not be exactly overloading, because of the special sugared
syntax known to the parser, but it could look like overloading, and be
intuitive to the user.)

If you have convenient access to Oracle to check compatibility, could you
compare this query?

SELECT * FROM XMLTABLE('.'
PASSING '<sale hatsize="7" customer="alice" taxable="false"/>'
COLUMNS
a boolean PATH 'string("cat" < "dog")',
b boolean PATH 'string("cat" > "dog")',
c boolean PATH 'string(sale/@taxable = false())');

(I suspect in Oracle it would also work without the string() wrappings,
but just to make it easy, I think this way it will work in both Oracle
and PG—that is, not error, though results may differ.)

I will test it.

Regards

Pavel


-Chap


[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:
Hi

čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2018-Oct-25, Pavel Stehule wrote:

> I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> more examples and test cases.

Please see Markus Winand's patch that I referenced upthread.

here is a fix of some XMLTABLE mentioned issues.

Regards

Pavel

 

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:

(It would not be exactly overloading, because of the special sugared
syntax known to the parser, but it could look like overloading, and be
intuitive to the user.)

If you have convenient access to Oracle to check compatibility, could you
compare this query?

SELECT * FROM XMLTABLE('.'
PASSING '<sale hatsize="7" customer="alice" taxable="false"/>'
COLUMNS
a boolean PATH 'string("cat" < "dog")',
b boolean PATH 'string("cat" > "dog")',
c boolean PATH 'string(sale/@taxable = false())');

(I suspect in Oracle it would also work without the string() wrappings,
but just to make it easy, I think this way it will work in both Oracle
and PG—that is, not error, though results may differ.)



I have a access to too old 11.2 Oracle.  There I had to modify query because there is not boolean type. I replaced bool by int, but I got a error
ORA-19224:XPTY-004 .. expected node()*, got xs:string - it doesn't work with/without string() wrappings.

Regards

Pavel Stehule



Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


po 29. 10. 2018 v 11:05 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:

(It would not be exactly overloading, because of the special sugared
syntax known to the parser, but it could look like overloading, and be
intuitive to the user.)

If you have convenient access to Oracle to check compatibility, could you
compare this query?

SELECT * FROM XMLTABLE('.'
PASSING '<sale hatsize="7" customer="alice" taxable="false"/>'
COLUMNS
a boolean PATH 'string("cat" < "dog")',
b boolean PATH 'string("cat" > "dog")',
c boolean PATH 'string(sale/@taxable = false())');

(I suspect in Oracle it would also work without the string() wrappings,
but just to make it easy, I think this way it will work in both Oracle
and PG—that is, not error, though results may differ.)



I have a access to too old 11.2 Oracle.  There I had to modify query because there is not boolean type. I replaced bool by int, but I got a error
ORA-19224:XPTY-004 .. expected node()*, got xs:string - it doesn't work with/without string() wrappings.

The problem is in last line - the expression  "sale/@taxable = false()" is not valid on Oracle. Using string() wrapping is a issue, because it returns "true", "false", but Oracle int doesn't accept it.

Pavel


Regards

Pavel Stehule



Re: PostgreSQL vs SQL/XML Standards

От
Thomas Kellerer
Дата:
>> I have a access to too old 11.2 Oracle.  There I had to modify query
>> because there is not boolean type. I replaced bool by int, but I got a
>> error
>> ORA-19224:XPTY-004 .. expected node()*, got xs:string - it doesn't work
>> with/without string() wrappings.
>>
>The problem is in last line - the expression  "sale/@taxable = false()" is
>not valid on Oracle. Using string() wrapping is a issue, because it returns
">true", "false", but Oracle int doesn't accept it.

That line seems to be valid - but you need to pass an XMLTYPE value, not a
VARCHAR

https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=21cdf890a26e97fa8667b2d6a960bd33

As far as I can tell inside XQuery Oracle does support boolean, but not as a
return type 





--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2018-Oct-25, Pavel Stehule wrote:

> I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> more examples and test cases.

Please see Markus Winand's patch that I referenced upthread.

here is a fix of some XMLTABLE mentioned issues.

this update allows cast boolean to numeric types from XPath expressions

Regards

Pavel


Regards

Pavel

 

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 10/29/18 6:40 AM, Thomas Kellerer wrote:
> That line seems to be valid - but you need to pass an XMLTYPE value,
> not a VARCHAR
> 
> https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=21cdf890a26e97fa8667b2d6a960bd33

Oh, of course! Thank you. I had forgotten pass the context item
as explicitly an XML value.

That illustrates that, in a proper XMLTABLE, you can pass things
that are not XML values. When a varchar is passed in, the context
item has type xs:string. The third PATH tried to follow a node path
against a non-node context item, and correctly reported the error.

And thanks for the dbfiddle pointer. I can now confirm (in both 11g.2
and 18c):

SELECT *
FROM XMLTABLE('.'
  PASSING xmltype('<sale hatsize="7" customer="alice" taxable="false"/>')
  COLUMNS
    a varchar(10) PATH '"cat" < "dog"',
    b varchar(10) PATH '"cat" > "dog"',
    c varchar(10) PATH 'sale/@taxable = false()'
);

A     B     C
true     false     true

Or as numbers (There's just no SQL boolean type in Oracle, even 18g!):

SELECT *
FROM XMLTABLE('.'
  PASSING xmltype('<sale hatsize="7" customer="alice" taxable="false"/>')
  COLUMNS
    a NUMBER PATH '"cat" < "dog"',
    b NUMBER PATH '"cat" > "dog"',
    c NUMBER PATH 'sale/@taxable = false()'
);

A     B     C
1     0     1


I removed the string() wrappings, which were only to allow the
same query to work in PG, but Pavel's proposed patches will make
them unnecessary.

Note, however, that the first proposed patch will work for the
first query (varchar results) and fail for the second (number
results). The second patch will work for the second query, but
produce the wrong strings ("1" or "0" instead of "true" or "false")
for the first. A proper XMLTABLE needs to apply the appropriate
conversion determined by the SQL type of the output column.

I believe a patch to do that correctly is possible; xml.c has
access to the type oids for the output columns, after all.

The fact that PG will return false|false|false or 0|0|0 instead
of true|false|true or 1|0|1 cannot be fixed by a patch. That is
the consequence of evaluating in XPath 1.0 (in XPath 2.0, which is
a subset of XQuery, the results would be correct).

On the same lines, we can take my original example where I forgot
to type the context item as XML, and make that work in Oracle too:

SELECT *
FROM XMLTABLE('.'
  PASSING '<sale hatsize="7" customer="alice" taxable="false"/>'
  COLUMNS
    a varchar(10) PATH 'substring-after(., "taxable=")'
);

A
"false"/>

A proper XMLTABLE is happy to be passed an atomic value, such as
a string, as the context item or any named parameter, and apply
type-appropriate operators and functions to it. XPath 1.0 blocks
that for PG.

-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:


po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2018-Oct-25, Pavel Stehule wrote:

> I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> more examples and test cases.

Please see Markus Winand's patch that I referenced upthread.

here is a fix of some XMLTABLE mentioned issues.

this update allows cast boolean to numeric types from XPath expressions

Attached patch solves some cast issues mentioned by Chap. It solves issue reported by Markus. I didn't use Markus's code, but it was inspiration for me. I found native solution from libxml2.

Regards

Pavel
 

Regards

Pavel


Regards

Pavel

 

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Markus Winand
Дата:
> On 2018-11-6, at 15:23 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
>
>
> po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
> Hi
>
> čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
> On 2018-Oct-25, Pavel Stehule wrote:
>
> > I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> > more examples and test cases.
>
> Please see Markus Winand's patch that I referenced upthread.
>
> here is a fix of some XMLTABLE mentioned issues.
>
> this update allows cast boolean to numeric types from XPath expressions
>
> Attached patch solves some cast issues mentioned by Chap. It solves issue reported by Markus. I didn't use Markus's
code,but it was inspiration for me. I found native solution from libxml2. 
>
> Regards
>
> Pavel

Better than my patch.

But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed — in one way or the other (see below).

# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x
XMLPATH 'node()'); 
                    x
-----------------------------------------
 prec1arg&ent1<n2>&deep</n2>post
(1 row)

Output is not the original XML.

I dug a little further and found another case that doesn’t looks right even with my change to xml_xmlnodetoxmltype
applied:

# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x
XMLPATH '/'); 
             x
---------------------------
 pre&ent1&deeppost
(1 row)

Oracle gives in both cases XML.

To fix that I included XML_DOCUMENT_NODE in the list of nodes that use xmlNodeDump. Now I wonder if that logic should
bereversed to use the xmlXPathCastNodeToString branch in a few selected cases but default to the branch xmlNodeDump for
allother cases? 

I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE. Regression tests are happy with that approach
butI don’t think that proves a lot. 

-markus

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f7..7c1f884 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 {
        xmltype    *result;

-       if (cur->type == XML_ELEMENT_NODE)
+       if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
        {
                xmlBufferPtr buf;
                xmlNodePtr      cur_copy;




Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


čt 8. 11. 2018 v 15:18 odesílatel Markus Winand <markus.winand@winand.at> napsal:

> On 2018-11-6, at 15:23 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
>
>
> po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
> Hi
>
> čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
> On 2018-Oct-25, Pavel Stehule wrote:
>
> > I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> > more examples and test cases.
>
> Please see Markus Winand's patch that I referenced upthread.
>
> here is a fix of some XMLTABLE mentioned issues.
>
> this update allows cast boolean to numeric types from XPath expressions
>
> Attached patch solves some cast issues mentioned by Chap. It solves issue reported by Markus. I didn't use Markus's code, but it was inspiration for me. I found native solution from libxml2.
>
> Regards
>
> Pavel

Better than my patch.

But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed — in one way or the other (see below).

# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&amp;deep</n2>post</e>' COLUMNS x XML PATH 'node()');
                    x
-----------------------------------------
 prec1arg&amp;ent1<n2>&amp;deep</n2>post
(1 row)

Output is not the original XML.

I dug a little further and found another case that doesn’t looks right even with my change to xml_xmlnodetoxmltype applied:

# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&amp;deep</n2>post</e>' COLUMNS x XML PATH '/');
             x
---------------------------
 pre&amp;ent1&amp;deeppost
(1 row)

Oracle gives in both cases XML.

To fix that I included XML_DOCUMENT_NODE in the list of nodes that use xmlNodeDump. Now I wonder if that logic should be reversed to use the xmlXPathCastNodeToString branch in a few selected cases but default to the branch xmlNodeDump for all other cases?

I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE. Regression tests are happy with that approach but I don’t think that proves a lot.

-markus

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f7..7c1f884 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 {
        xmltype    *result;

-       if (cur->type == XML_ELEMENT_NODE)
+       if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
        {
                xmlBufferPtr buf;
                xmlNodePtr      cur_copy;


I used your patch and append regress tests. I checked the result against Oracle.

Regards

Pavel
Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Markus Winand
Дата:

On 2018-11-9, at 05:07 , Pavel Stehule <pavel.stehule@gmail.com> wrote:



čt 8. 11. 2018 v 15:18 odesílatel Markus Winand <markus.winand@winand.at> napsal:

> On 2018-11-6, at 15:23 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
>
>
> po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
> Hi
>
> čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
> On 2018-Oct-25, Pavel Stehule wrote:
>
> > I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> > more examples and test cases.
>
> Please see Markus Winand's patch that I referenced upthread.
>
> here is a fix of some XMLTABLE mentioned issues.
>
> this update allows cast boolean to numeric types from XPath expressions
>
> Attached patch solves some cast issues mentioned by Chap. It solves issue reported by Markus. I didn't use Markus's code, but it was inspiration for me. I found native solution from libxml2.
>
> Regards
>
> Pavel

Better than my patch.

But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed — in one way or the other (see below).

# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&amp;deep</n2>post</e>' COLUMNS x XML PATH 'node()');
                    x
-----------------------------------------
 prec1arg&amp;ent1<n2>&amp;deep</n2>post
(1 row)

Output is not the original XML.

I dug a little further and found another case that doesn’t looks right even with my change to xml_xmlnodetoxmltype applied:

# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&amp;deep</n2>post</e>' COLUMNS x XML PATH '/');
             x
---------------------------
 pre&amp;ent1&amp;deeppost
(1 row)

Oracle gives in both cases XML.

To fix that I included XML_DOCUMENT_NODE in the list of nodes that use xmlNodeDump. Now I wonder if that logic should be reversed to use the xmlXPathCastNodeToString branch in a few selected cases but default to the branch xmlNodeDump for all other cases?

I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE. Regression tests are happy with that approach but I don’t think that proves a lot.

-markus

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f7..7c1f884 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 {
        xmltype    *result;

-       if (cur->type == XML_ELEMENT_NODE)
+       if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
        {
                xmlBufferPtr buf;
                xmlNodePtr      cur_copy;


I used your patch and append regress tests. I checked the result against Oracle.

Regards

Pavel

Fine from my side.

-markus

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


po 12. 11. 2018 v 6:58 odesílatel Markus Winand <markus.winand@winand.at> napsal:

On 2018-11-9, at 05:07 , Pavel Stehule <pavel.stehule@gmail.com> wrote:



čt 8. 11. 2018 v 15:18 odesílatel Markus Winand <markus.winand@winand.at> napsal:

> On 2018-11-6, at 15:23 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
>
>
> po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
> Hi
>
> čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
> On 2018-Oct-25, Pavel Stehule wrote:
>
> > I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> > more examples and test cases.
>
> Please see Markus Winand's patch that I referenced upthread.
>
> here is a fix of some XMLTABLE mentioned issues.
>
> this update allows cast boolean to numeric types from XPath expressions
>
> Attached patch solves some cast issues mentioned by Chap. It solves issue reported by Markus. I didn't use Markus's code, but it was inspiration for me. I found native solution from libxml2.
>
> Regards
>
> Pavel

Better than my patch.

But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed — in one way or the other (see below).

# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&amp;deep</n2>post</e>' COLUMNS x XML PATH 'node()');
                    x
-----------------------------------------
 prec1arg&amp;ent1<n2>&amp;deep</n2>post
(1 row)

Output is not the original XML.

I dug a little further and found another case that doesn’t looks right even with my change to xml_xmlnodetoxmltype applied:

# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&amp;deep</n2>post</e>' COLUMNS x XML PATH '/');
             x
---------------------------
 pre&amp;ent1&amp;deeppost
(1 row)

Oracle gives in both cases XML.

To fix that I included XML_DOCUMENT_NODE in the list of nodes that use xmlNodeDump. Now I wonder if that logic should be reversed to use the xmlXPathCastNodeToString branch in a few selected cases but default to the branch xmlNodeDump for all other cases?

I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE. Regression tests are happy with that approach but I don’t think that proves a lot.

-markus

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f7..7c1f884 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 {
        xmltype    *result;

-       if (cur->type == XML_ELEMENT_NODE)
+       if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
        {
                xmlBufferPtr buf;
                xmlNodePtr      cur_copy;


I used your patch and append regress tests. I checked the result against Oracle.

Regards

Pavel

Fine from my side.

super

Thank you for investigation and tests, examples.

It is assigned to January commitfest.

Regards

Pavel


-markus

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 11/12/18 04:58, Pavel Stehule wrote:
> It is assigned to January commitfest.

When I build this patch against master (4203842), it builds, but breaks
make check. The diffs seem only incidental (loss of some error context
output), but there are no make check errors building 4203842 unmodified,
so the patch has introduced them. Example:

*** ../postgresql/src/test/regress/expected/xml.out    Sat Dec 29 19:58:41 2018
--- src/test/regress/results/xml.out   Sat Dec 29 19:59:11 2018
***************
*** 9,16 ****
  LINE 1: INSERT INTO xmltest VALUES (3, '<wrong');
                                         ^
  DETAIL:  line 1: Couldn't find end of Start Tag wrong line 1
- <wrong
-       ^

=== ABOUT THE DOCUMENTATION ===

While this patch fixes bugs in the embedding of XPath 1.0 into PostgreSQL,
and that is good, I still think there is an elephant in the room, namely
that XPath 1.0 isn't the right language to embed. Clearly, that can't be
fixed as long as we rely on libxml, but I think our documentation could
(should) do a better job of warning users of the surprises that result.

My ideas for improving the documentation are not yet so finished as to
offer a patch, but here are some thoughts:

Of the 13 mentions in func.sgml of XPath (not counting its appearances
in function and parameter names), only one specifically says XPath 1.0,
and it is buried down at functions-xml-processing, not somewhere common
to all of them, like the introduction to functions-xml.

Really, all of our functions that refer to "XPath" mean XPath 1.0,
and because that version was essentially thrown away to produce the 2.0
and later versions, and that happened a dozen years ago, the documentation
needs to make that clear. Users who have learned XQuery, or XPath in the
last dozen years, need to know this is something very different, to avoid
walking into the traps.

I think:

1) Every reference to "XPath" or an "XPath expression" (that is the first
   in the description of a function or parameter) should be spelled out as
   "XPath 1.0".

2) There should be a section in one central place (clearly applying to
   all of the XML functions), perhaps in the functions-xml introduction,
   or a <footnote> to a paragraph there, or in an appendix (new section
   under SQL Conformance?) that summarizes the gulf between XPath 1.0
   and the later, XQuery-compatible versions. There should be cross-
   reference links to this section from the "XPath 1.0" mentions in
   individual function descriptions. The section could be a condensation
   of [1] (and perhaps contain a link to [1], if links out to the wiki
   are allowed), and link to the W3C compatibility notes at [2] and [3].

3) Currently, the description of XMLEXISTS notes that "the SQL standard
   specifies ... the construct to take an XQuery expression ... but
   PostgreSQL currently only supports XPath, which is a subset of XQuery".
   The description of XMLTABLE does not have such a note, and needs one.
   In both cases, the note should probably be nearer the top of the
   description (for XMLEXISTS, it is currently at the bottom, after the
   examples).

4) That note currently says "only supports XPath, which is a subset
   of XQuery". That would be a fair claim if we meant XPath 2.0
   or later, but not XPath 1.0 (which was not a subset of any version
   of XQuery, and has fundamental incompatibilities with it). The note,
   for both XMLEXISTS and XMLTABLE, should simply say "only supports
   XPath 1.0", with a link to the section summarizing the incompatibilities.

Those are my thoughts on how a patch to the documentation could be
organized. Do they seem reasonable?

-Chap


[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
[2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility
[3]
https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


ne 30. 12. 2018 v 3:46 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 11/12/18 04:58, Pavel Stehule wrote:
> It is assigned to January commitfest.

When I build this patch against master (4203842), it builds, but breaks
make check. The diffs seem only incidental (loss of some error context
output), but there are no make check errors building 4203842 unmodified,
so the patch has introduced them. Example:

*** ../postgresql/src/test/regress/expected/xml.out    Sat Dec 29 xml19:58:41 2018
--- src/test/regress/results/xml.out   Sat Dec 29 19:59:11 2018
***************
*** 9,16 ****
  LINE 1: INSERT INTO xmltest VALUES (3, '<wrong');
                                         ^
  DETAIL:  line 1: Couldn't find end of Start Tag wrong line 1
- <wrong
-       ^

Unfortunately, there is a different releases of libxml2 with different error reporting and it is hard (impossible) to prepare for all variants. :-/

I prepare xml.out for my FC29 (fresh libxml2) and for no support xml. Other I prepare by patching - and this error (in context) is expected.


=== ABOUT THE DOCUMENTATION ===

While this patch fixes bugs in the embedding of XPath 1.0 into PostgreSQL,
and that is good, I still think there is an elephant in the room, namely
that XPath 1.0 isn't the right language to embed. Clearly, that can't be
fixed as long as we rely on libxml, but I think our documentation could
(should) do a better job of warning users of the surprises that result.

My ideas for improving the documentation are not yet so finished as to
offer a patch, but here are some thoughts:

Of the 13 mentions in func.sgml of XPath (not counting its appearances
in function and parameter names), only one specifically says XPath 1.0,
and it is buried down at functions-xml-processing, not somewhere common
to all of them, like the introduction to functions-xml.

Really, all of our functions that refer to "XPath" mean XPath 1.0,
and because that version was essentially thrown away to produce the 2.0
and later versions, and that happened a dozen years ago, the documentation
needs to make that clear. Users who have learned XQuery, or XPath in the
last dozen years, need to know this is something very different, to avoid
walking into the traps.

I think:

1) Every reference to "XPath" or an "XPath expression" (that is the first
   in the description of a function or parameter) should be spelled out as
   "XPath 1.0".

2) There should be a section in one central place (clearly applying to
   all of the XML functions), perhaps in the functions-xml introduction,
   or a <footnote> to a paragraph there, or in an appendix (new section
   under SQL Conformance?) that summarizes the gulf between XPath 1.0
   and the later, XQuery-compatible versions. There should be cross-
   reference links to this section from the "XPath 1.0" mentions in
   individual function descriptions. The section could be a condensation
   of [1] (and perhaps contain a link to [1], if links out to the wiki
   are allowed), and link to the W3C compatibility notes at [2] and [3].

3) Currently, the description of XMLEXISTS notes that "the SQL standard
   specifies ... the construct to take an XQuery expression ... but
   PostgreSQL currently only supports XPath, which is a subset of XQuery".
   The description of XMLTABLE does not have such a note, and needs one.
   In both cases, the note should probably be nearer the top of the
   description (for XMLEXISTS, it is currently at the bottom, after the
   examples).

4) That note currently says "only supports XPath, which is a subset
   of XQuery". That would be a fair claim if we meant XPath 2.0
   or later, but not XPath 1.0 (which was not a subset of any version
   of XQuery, and has fundamental incompatibilities with it). The note,
   for both XMLEXISTS and XMLTABLE, should simply say "only supports
   XPath 1.0", with a link to the section summarizing the incompatibilities.

Those are my thoughts on how a patch to the documentation could be
organized. Do they seem reasonable?

I agree with more stronger detail description about difference between XPath, XPath2 and XQuery.

Some like "The XPath 1.0 is ancestor of XPath 2.0, that is part of XQuery. ..."

I don't think so link to wiki with any proposals is good idea. Documentation should to describe current state, not what can be at some future.

Regards

Pavel


-Chap


[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
[2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility
[3]
https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 12/30/18 03:23, Pavel Stehule wrote:
> I agree with more stronger detail description about difference between
> XPath, XPath2 and XQuery.
> 
> Some like "The XPath 1.0 is ancestor of XPath 2.0, that is part of XQuery.
> ..."

I'll be happy to work on some wording to help detail the differences.
I don't think it can be condensed to a single sentence, or even if it
could, it would not serve users well; what's helpful is to know
the specifics of what changed between XPath 1.0 and the 2.0-3.0-3.1 lineage
that is now XPath, because that's what helps to understand what the
challenges will be when porting queries from other systems, how to recognize
when a query won't be possible to port at all, etc.

So I think it needs to be a short section. I am still undecided what's
the best place in the manual for the section to go:

- Added to the introductory material under functions-xml

- A long <footnote> attached to a short remark in that introductory matter
  (is there any precedent? I see there are <footnote>s already in the
  manual, but I haven't checked how long they get)

- A new appendix (or new section in the SQL Conformance appendix)
  linked from a short statement in the functions-xml introductory matter.

> I don't think so link to wiki with any proposals is good idea.

That makes sense. I was thinking only of linking directly to the specific
section on XPath 1.0 compatibility issues, not to the wiki page as a whole.
I see in grepping through the *.sgml that there are not many links to the
wiki.postgresql.org, and the ones that exist are only in release notes,
Further Information, The Source Code Repository, and sepgsql.

So precedent seems to stand against using a link to the wiki section.
So, the material /in/ that wiki section [1] (but omitting the two paras
about the DOCUMENT/CONTENT node-wrapping hack) is more or less what needs
to go into the new section in the manual.

I assume it's ok to directly include links [2] and [3] out to the w3.org
compatibility notes ... those are stable URLs to reference material.

=== BY REF and BY VALUE ===

How difficult would it be to make the grammar constructs that currently
accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF
and BY VALUE? The documentation ought to state that, while both forms are
accepted and ignored, PostgreSQL's actual behavior corresponds to what the
standard calls BY VALUE (using a string serialization as the XML datatype's
internal form makes the BY REF semantics impossible). So it's when people
try to port queries that explicitly say BY REF that they need to know there
may be trouble ahead.

-Chap


>> [1] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
>> [2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility
>> [3] https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


ne 30. 12. 2018 v 20:06 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 12/30/18 03:23, Pavel Stehule wrote:
> I agree with more stronger detail description about difference between
> XPath, XPath2 and XQuery.
>
> Some like "The XPath 1.0 is ancestor of XPath 2.0, that is part of XQuery.
> ..."

I'll be happy to work on some wording to help detail the differences.
I don't think it can be condensed to a single sentence, or even if it
could, it would not serve users well; what's helpful is to know
the specifics of what changed between XPath 1.0 and the 2.0-3.0-3.1 lineage
that is now XPath, because that's what helps to understand what the
challenges will be when porting queries from other systems, how to recognize
when a query won't be possible to port at all, etc.

So I think it needs to be a short section. I am still undecided what's
the best place in the manual for the section to go:


+1
 
- Added to the introductory material under functions-xml

- A long <footnote> attached to a short remark in that introductory matter
  (is there any precedent? I see there are <footnote>s already in the
  manual, but I haven't checked how long they get)

- A new appendix (or new section in the SQL Conformance appendix)
  linked from a short statement in the functions-xml introductory matter.

> I don't think so link to wiki with any proposals is good idea.

That makes sense. I was thinking only of linking directly to the specific
section on XPath 1.0 compatibility issues, not to the wiki page as a whole.
I see in grepping through the *.sgml that there are not many links to the
wiki.postgresql.org, and the ones that exist are only in release notes,
Further Information, The Source Code Repository, and sepgsql.

So precedent seems to stand against using a link to the wiki section.
So, the material /in/ that wiki section [1] (but omitting the two paras
about the DOCUMENT/CONTENT node-wrapping hack) is more or less what needs
to go into the new section in the manual.

I assume it's ok to directly include links [2] and [3] out to the w3.org
compatibility notes ... those are stable URLs to reference material.

=== BY REF and BY VALUE ===

How difficult would it be to make the grammar constructs that currently
accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF
and BY VALUE? The documentation ought to state that, while both forms are
accepted and ignored, PostgreSQL's actual behavior corresponds to what the
standard calls BY VALUE (using a string serialization as the XML datatype's
internal form makes the BY REF semantics impossible). So it's when people
try to port queries that explicitly say BY REF that they need to know there
may be trouble ahead.

This is difficult question - a implementation is probably very easy, but it is hard to accept to possible break compatibility due syntactic sugar.

This is not probably related to just XPath/XQuery question - but it is related to different design of XML datatype (based on PostgreSQL TOAST) against ANSI/SQL (Oracle - clob).

So this is complicated topic and my opinion is better to don't touch it because we can't to fix it, change it - and I am not sure so ANSI/SQL is significantly better than PostgreSQL implementation.
 

-Chap


>> [1] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
>> [2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility
>> [3] https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 12/30/18 15:18, Pavel Stehule wrote:
> ne 30. 12. 2018 v 20:06 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:
>> How difficult would it be to make the grammar constructs that currently
>> accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF
>> and BY VALUE?
> 
> This is difficult question - a implementation is probably very easy, but it
> is hard to accept to possible break compatibility due syntactic sugar.
> 
> This is not probably related to just XPath/XQuery question - but it is
> related to different design of XML datatype (based on PostgreSQL TOAST)
> against ANSI/SQL (Oracle - clob).
> 
> So this is complicated topic and my opinion is better to don't touch it
> because we can't to fix it, change it - and I am not sure so ANSI/SQL is
> significantly better than PostgreSQL implementation.

I am not sure I understand your point. It appears that Oracle (18c),
just like PostgreSQL, really only supports BY VALUE semantics. Here is
an Oracle fiddle that shows it:

https://dbfiddle.uk/?rdbms=oracle_18&fiddle=0cb353da0d94c6d5c2659222a1e419fd

When the same element is passed via two parameters, an 'is' test (node
identity equality) of the two parameters returns false, indicating that
Oracle has used BY VALUE semantics, not BY REF.

Oracle uses BY VALUE when BY VALUE is explicitly requested, and also when
no passing method is specified (i.e., BY VALUE is the default). Oracle also
uses BY VALUE when BY REF is explicitly requested, which seems rather rude,
but that must be the behavior PostgreSQL is imitating with the choice to
accept and ignore BY REF.

But the PostgreSQL situation is a little more strange. PG uses BY VALUE
semantics as the default when no passing method is specified. PG also uses
BY VALUE semantics when BY REF is explicitly requested, which is rude,
just like Oracle. But why should an explicit specification of BY VALUE
(which is, after all, the semantics we're going to use anyway!) produce
this?

ERROR:  syntax error at or near "value"

To me, that doesn't seem like least astonishment.

I am not seeing what would be complicated about removing that astonishment
by simply allowing the grammar productions to also consume BY VALUE and
ignore it.

-Chap


doc: where best to add ~ 400 words to the manual?

От
Chapman Flack
Дата:
I would like to add material to the manual, detailing the differences
between the XPath 1.0 language supported in PG, and the XQuery/XPath 2.0+
expected by the standard.

A good preview of what that would look like is the "related to version
of XPath" wiki section at [1].

That wiki section (minus the two inside-baseball paragraphs about
commit 3963574 and its reversion) comes to a bit under 400 words.

So that's roughly the amount of material I'm thinking to add.
Where should it go?

I have thought of:

1. A new sect2 near the top of the "functions-xml" sect1. This makes sense
because the material applies to all the functions below that involve XPath.
But that seems like the wrong place for 400 words of gory details.

2. A short remark at the top of functions-xml, linked to a footnote with
the gory details. But that would be a long footnote. The longest currently
is a footnote with 101 words in the start tutorial.

3. A sect2 at the very end of functions-xml, linked to from a short remark
at the top.

4. A new appendix, or a new conformance section in the features.sgml
appendix, linked to from a short remark at the top of functions-xml.

Or is there an even better idea I have not thought of?

-Chap


[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


po 31. 12. 2018 v 3:15 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 12/30/18 15:18, Pavel Stehule wrote:
> ne 30. 12. 2018 v 20:06 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:
>> How difficult would it be to make the grammar constructs that currently
>> accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF
>> and BY VALUE?
>
> This is difficult question - a implementation is probably very easy, but it
> is hard to accept to possible break compatibility due syntactic sugar.
>
> This is not probably related to just XPath/XQuery question - but it is
> related to different design of XML datatype (based on PostgreSQL TOAST)
> against ANSI/SQL (Oracle - clob).
>
> So this is complicated topic and my opinion is better to don't touch it
> because we can't to fix it, change it - and I am not sure so ANSI/SQL is
> significantly better than PostgreSQL implementation.

I am not sure I understand your point. It appears that Oracle (18c),
just like PostgreSQL, really only supports BY VALUE semantics. Here is
an Oracle fiddle that shows it:

https://dbfiddle.uk/?rdbms=oracle_18&fiddle=0cb353da0d94c6d5c2659222a1e419fd

When the same element is passed via two parameters, an 'is' test (node
identity equality) of the two parameters returns false, indicating that
Oracle has used BY VALUE semantics, not BY REF.

Oracle uses BY VALUE when BY VALUE is explicitly requested, and also when
no passing method is specified (i.e., BY VALUE is the default). Oracle also
uses BY VALUE when BY REF is explicitly requested, which seems rather rude,
but that must be the behavior PostgreSQL is imitating with the choice to
accept and ignore BY REF.

But the PostgreSQL situation is a little more strange. PG uses BY VALUE
semantics as the default when no passing method is specified. PG also uses
BY VALUE semantics when BY REF is explicitly requested, which is rude,
just like Oracle. But why should an explicit specification of BY VALUE
(which is, after all, the semantics we're going to use anyway!) produce
this?

ERROR:  syntax error at or near "value"

To me, that doesn't seem like least astonishment.

I am not seeing what would be complicated about removing that astonishment
by simply allowing the grammar productions to also consume BY VALUE and
ignore it.

ok - I am not against implementation of ignored BY VALUE. But I don't like a idea to disable BY REF.

Regards

Pavel

-Chap

Re: PostgreSQL vs SQL/XML Standards

От
Arthur Zakirov
Дата:
Hello Pavel,

On 09.11.2018 07:07, Pavel Stehule wrote:
> I used your patch and append regress tests. I checked the result against 
> Oracle.

I checked the patch with Chap cases. The patch fixes handling of 
boolean, number types which mentioned in the wiki.

I have a few comments related to the code and the documentation. I 
attached the patch, which fixes it.

There is an example in the documentation:

SELECT xmltable.*
   FROM xmlelements, XMLTABLE('/root' PASSING data COLUMNS element text);
        element
----------------------
    Hello2a2   bbbCC

With the patch XMLTABLE returns different result now.

copy_and_safe_free_xmlchar() function should be hid by #ifdef 
USE_LIBXML, otherwise I get an error if I build the Postgres without 
--with-libxml.

There is a comment within XmlTableGetValue(). I changed it, mainly I 
used Markus patch from the related thread mentioned by Alvaro.

Please see the changes in the patch.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:
Hi

čt 10. 1. 2019 v 14:00 odesílatel Arthur Zakirov <a.zakirov@postgrespro.ru> napsal:
Hello Pavel,

On 09.11.2018 07:07, Pavel Stehule wrote:
> I used your patch and append regress tests. I checked the result against
> Oracle.

I checked the patch with Chap cases. The patch fixes handling of
boolean, number types which mentioned in the wiki.

I have a few comments related to the code and the documentation. I
attached the patch, which fixes it.

There is an example in the documentation:

SELECT xmltable.*
   FROM xmlelements, XMLTABLE('/root' PASSING data COLUMNS element text);
        element
----------------------
    Hello2a2   bbbCC

With the patch XMLTABLE returns different result now.

copy_and_safe_free_xmlchar() function should be hid by #ifdef
USE_LIBXML, otherwise I get an error if I build the Postgres without
--with-libxml.

There is a comment within XmlTableGetValue(). I changed it, mainly I
used Markus patch from the related thread mentioned by Alvaro.

Please see the changes in the patch.

I merged your changes, and fixed regress tests.

Thank you for patch

Regards

Pavel



--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Вложения

Re: doc: where best to add ~ 400 words to the manual?

От
Chapman Flack
Дата:
On 12/30/18 22:23, Chapman Flack wrote:
> I would like to add material to the manual, detailing the differences
> between the XPath 1.0 language supported in PG, and the XQuery/XPath 2.0+
> expected by the standard.
> ...
> I have thought of:
> ...
> 3. A sect2 at the very end of functions-xml, linked to from a short remark
> at the top.

(3) seems to be supported by (the only) precedent, the "Limits and
Compatibility" sect3 within functions-posix-regexp.

So, absent objection, I'll work on a Limits and Compatibility sect2
within functions-xml.

-Chap


Re: doc: where best to add ~ 400 words to the manual?

От
Chapman Flack
Дата:
On 01/10/19 23:48, Chapman Flack wrote:
> On 12/30/18 22:23, Chapman Flack wrote:
> (3) seems to be supported by (the only) precedent, the "Limits and
> Compatibility" sect3 within functions-posix-regexp.
> 
> So, absent objection, I'll work on a Limits and Compatibility sect2
> within functions-xml.

One fly in the ointment: tables of contents seem to list sect1 and sect2
elements, but not sect3. So the "Limits and Compatibility" sect3 under
functions-posix-regexp does not clutter the overall "Functions and
Operators" ToC, but a "Limits and Compatibility" sect2 under
functions-xml would be visible there. Tucking it as a sect3 within an
existing sect2 would prevent that, but it is common information that
pertains to more than one of them, so that doesn't seem quite right.

Or is it ok to have a "Limits and Compatibility" visible in the ToC
under XML Functions?

-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 12/30/18 03:23, Pavel Stehule wrote:
> Unfortunately, there is a different releases of libxml2 with different
> error reporting and it is hard (impossible) to prepare for all variants. :-/
>
> I prepare xml.out for my FC29 (fresh libxml2) and for no support xml.
> Other I prepare by patching - and this error (in context) is expected.

It turns out that the variant was already accounted for in the xml_2.out
variant result file, it just needed to have the new results added.

Done in xmltable-xpath-result-processing-bugfix-4.patch attached.


On 12/31/18 01:03, Pavel Stehule wrote:
> po 31. 12. 2018 v 3:15 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:
>> But the PostgreSQL situation is a little more strange. PG uses BY VALUE
>> semantics as the default when no passing method is specified. PG also uses
>> BY VALUE semantics when BY REF is explicitly requested, which is rude,
>> just like Oracle. But why should an explicit specification of BY VALUE
>> (which is, after all, the semantics we're going to use anyway!) produce
>> this?
>>
>> ERROR:  syntax error at or near "value"
>>
>> To me, that doesn't seem like least astonishment.
>>
>> I am not seeing what would be complicated about removing that astonishment
>> by simply allowing the grammar productions to also consume BY VALUE and
>> ignore it.
> 
> ok - I am not against implementation of ignored BY VALUE. But I don't like
> a idea to disable BY REF.

Done in attached xmltable-xmlexists-passing-mechanisms-1.patch along with
some corresponding documentation adjustments.

I am still working on more extensive documentation, but it seemed best
to include the changes related to BY REF / BY VALUE in the same patch
with the grammar change.

-Chap

Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


ne 13. 1. 2019 v 0:39 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 12/30/18 03:23, Pavel Stehule wrote:
> Unfortunately, there is a different releases of libxml2 with different
> error reporting and it is hard (impossible) to prepare for all variants. :-/
>
> I prepare xml.out for my FC29 (fresh libxml2) and for no support xml.
> Other I prepare by patching - and this error (in context) is expected.

It turns out that the variant was already accounted for in the xml_2.out
variant result file, it just needed to have the new results added.

Done in xmltable-xpath-result-processing-bugfix-4.patch attached.


On 12/31/18 01:03, Pavel Stehule wrote:
> po 31. 12. 2018 v 3:15 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:
>> But the PostgreSQL situation is a little more strange. PG uses BY VALUE
>> semantics as the default when no passing method is specified. PG also uses
>> BY VALUE semantics when BY REF is explicitly requested, which is rude,
>> just like Oracle. But why should an explicit specification of BY VALUE
>> (which is, after all, the semantics we're going to use anyway!) produce
>> this?
>>
>> ERROR:  syntax error at or near "value"
>>
>> To me, that doesn't seem like least astonishment.
>>
>> I am not seeing what would be complicated about removing that astonishment
>> by simply allowing the grammar productions to also consume BY VALUE and
>> ignore it.
>
> ok - I am not against implementation of ignored BY VALUE. But I don't like
> a idea to disable BY REF.

Done in attached xmltable-xmlexists-passing-mechanisms-1.patch along with
some corresponding documentation adjustments.

I am still working on more extensive documentation, but it seemed best
to include the changes related to BY REF / BY VALUE in the same patch
with the grammar change.

looks well, thank you for patch

Pavel


-Chap

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
There is a bug that remains, in the
else if (xpathobj->type == XPATH_STRING)
case.

As it is now, it simply passes the string value of the result
into the output column's type-input function, regardless of the
output column type.

If the output column type is xml, this will attempt to parse the
string as xml. The result should simply be xml content consisting of
a text node representing the string (as by XMLTEXT()). If it contains
XML metacharacters, they should be escaped.

For a non-xml output column, the string should be used directly,
as it is now.

# select * from xmltable('.' passing xmlelement(name a)
columns a text path '"<foo/>"', b xml path '"<foo/>"');
   a    |   b
--------+--------
 <foo/> | <foo/>

Oracle fiddle for comparison:

https://dbfiddle.uk/?rdbms=oracle_18&fiddle=26f91a9e55a6908c2bcf848b464ca381

A     B
<foo/>     <foo/>

-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:
Hi

po 14. 1. 2019 v 2:41 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
There is a bug that remains, in the
else if (xpathobj->type == XPATH_STRING)
case.

As it is now, it simply passes the string value of the result
into the output column's type-input function, regardless of the
output column type.

If the output column type is xml, this will attempt to parse the
string as xml. The result should simply be xml content consisting of
a text node representing the string (as by XMLTEXT()). If it contains
XML metacharacters, they should be escaped.

For a non-xml output column, the string should be used directly,
as it is now.

# select * from xmltable('.' passing xmlelement(name a)
columns a text path '"<foo/>"', b xml path '"<foo/>"');
   a    |   b
--------+--------
 <foo/> | <foo/>

Oracle fiddle for comparison:

https://dbfiddle.uk/?rdbms=oracle_18&fiddle=26f91a9e55a6908c2bcf848b464ca381

A       B
<foo/>  &lt;foo/&gt;

should be fixed in last patch

Regards

Pavel


-Chap
Вложения

House style for DocBook documentation?

От
Chapman Flack
Дата:
Hi,

Is there, somewhere, a written-up "house style" for what DocBook 4.2
elements to use for which types of content in the manual?

In func.sgml I'm seeing what looks like a variety of examples, and
I'm not sure which ones to try to follow.

Thanks,
-Chap


Re: House style for DocBook documentation?

От
Alvaro Herrera
Дата:
Hi

On 2019-Jan-18, Chapman Flack wrote:

> Is there, somewhere, a written-up "house style" for what DocBook 4.2
> elements to use for which types of content in the manual?
> 
> In func.sgml I'm seeing what looks like a variety of examples, and
> I'm not sure which ones to try to follow.

I don't think we do.  I'd suggest to come up with something and then see
if it makes sense to patch the docs to apply it regularly.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: House style for DocBook documentation?

От
Chapman Flack
Дата:
Hi,

On 01/19/19 08:35, Alvaro Herrera wrote:
>> Is there, somewhere, a written-up "house style" for what DocBook 4.2
>> elements to use for which types of content in the manual?
>> ...
> I don't think we do.  I'd suggest to come up with something and then see
> if it makes sense to patch the docs to apply it regularly.

I think my ambition at the moment is just to complete a particular
addition to func.sgml and do so as consistently as I can manage with
what's there now.

I have noticed a couple of things:

- 'SQL' is often marked up as <acronym>SQL</acronym>, but far from always.

- no such markup is applied to 'JSON' or 'XML' at all, at least
  not in func.sgml.

- there is a README.links with this guideline:

  o  Do not use text with <ulink> so the URL appears in printed output

  but a grep -r in doc/src/sgml turns up 112 uses that observe the
  guideline, and 147 that supply link text.

(thinks to self half-seriously about an XSL transform for generating
printed output that could preserve link-texted links, add raised numbers,
and produce a numbered URLs section at the back)

-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
Working slowly through the documentation, I came upon:

 For XMLTABLE:

  - The xmltable function produces a table based on the given XML value,
    an XPath filter to extract rows, and an optional set of column
    definitions.                            ^^^^^^^^
    ...
    The mandatory COLUMNS clause specifies the list of columns ...
        ^^^^^^^^^
    if the COLUMNS clause is omitted, the rows in the result set contain
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    a single column of type xml containing the data matched by
    row_expression.

   This documentation seems undecided on whether the COLUMNS clause
   is mandatory or optional.

   It is mandatory in the SQL standard. It's mandatory in our grammar.
   We give a syntax_error if it's omitted.

   Is some of the documentation left over from an earlier contemplated
   design of having the clause be optional?

   Oracle does seem to allow the clause to be omitted, and produces a
   single xml column, as described. Was there an earlier plan to imitate
   Oracle's nonstandard behavior on that point? (Hardly seems worth the
   effort, as porting an Oracle query depending on it would simply entail
   adding COLUMNS COLUMN_VALUE XML PATH '.' and then it's portable and
   standard.)

 - It is possible for a default_expression to reference the value of
   output columns that appear prior to it in the column list, so the
   default of one column may be based on the value of another column.

  Is there an example that clearly shows this to work? If I write a
  default_expression referring to a prior column in /xmltable's own/
  column list, I get an undefined_column error. I can successfully refer
  to a column of /an earlier FROM item in the SELECT/, but I am not sure
  that demonstrates the behavior claimed here.

  There is what looks like an example among the regression tests
  (the one with DEFAULT ascii(_path) - 54), but that seems only to
  demonstrate xmltable getting invoked four times (as documented for
  LATERAL), not a single xmltable invocation producing multiple rows
  with recomputed defaults.

  If it's any comfort, I haven't gotten Oracle's xmltable to recognize
  earlier columns in its own column list either.

 - Unlike regular PostgreSQL functions, column_expression and
   default_expression are not evaluated to a simple value before calling
   the function. column_expression is normally evaluated exactly once
   per input row, and default_expression is evaluated each time a default
   is needed for a field.

  I've already covered the question about default_expression, but what
  this passage says about column_expression seems, at least, ambiguously
  worded, too:

  It goes without saying that /the XPath evaluator/ evaluates the
  column_expression exactly once per input row. In the standard, that's
  the only per-row evaluation happening; the column_expression SQL value
  only gets compiled to an XPath expression once at the start. (In fact,
  in the standard, it can't even be an arbitrary SQL expression, only a
  string literal. Oracle enforces that too.)

  It seems that our implementation is meant to extend the standard and
  actually allow the column_expression to vary per-row, and go through
  the XPath expression compiler each time. The regression test with
   COLUMNS a int PATH '' || lower(_path) || 'c'
  seems to be intended to confirm that behavior. But again, I think
  it is only confirming that LATERAL results in xmltable being called
  four consecutive times, with a different PATH in each call. It does
  not seem to demonstrate a single xmltable call doing anything special
  with recompiling a column path.

Am I overlooking something?

Regards,
-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


ne 20. 1. 2019 v 5:37 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
Working slowly through the documentation, I came upon:

 For XMLTABLE:

  - The xmltable function produces a table based on the given XML value,
    an XPath filter to extract rows, and an optional set of column
    definitions.                            ^^^^^^^^
    ...
    The mandatory COLUMNS clause specifies the list of columns ...
        ^^^^^^^^^
    if the COLUMNS clause is omitted, the rows in the result set contain
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    a single column of type xml containing the data matched by
    row_expression.

   This documentation seems undecided on whether the COLUMNS clause
   is mandatory or optional.

   It is mandatory in the SQL standard. It's mandatory in our grammar.
   We give a syntax_error if it's omitted.

   Is some of the documentation left over from an earlier contemplated
   design of having the clause be optional?

   Oracle does seem to allow the clause to be omitted, and produces a
   single xml column, as described. Was there an earlier plan to imitate
   Oracle's nonstandard behavior on that point? (Hardly seems worth the
   effort, as porting an Oracle query depending on it would simply entail
   adding COLUMNS COLUMN_VALUE XML PATH '.' and then it's portable and
   standard.)
 
If I remember, described functionality was implemented in early patches, but was removed to simplify code. To now, there was not a request to do it.

Unfortunately, the documentation was not fixed.
 

 - It is possible for a default_expression to reference the value of
   output columns that appear prior to it in the column list, so the
   default of one column may be based on the value of another column.

  Is there an example that clearly shows this to work? If I write a
  default_expression referring to a prior column in /xmltable's own/
  column list, I get an undefined_column error. I can successfully refer
  to a column of /an earlier FROM item in the SELECT/, but I am not sure
  that demonstrates the behavior claimed here.

  There is what looks like an example among the regression tests
  (the one with DEFAULT ascii(_path) - 54), but that seems only to
  demonstrate xmltable getting invoked four times (as documented for
  LATERAL), not a single xmltable invocation producing multiple rows
  with recomputed defaults.

  If it's any comfort, I haven't gotten Oracle's xmltable to recognize
  earlier columns in its own column list either.

 - Unlike regular PostgreSQL functions, column_expression and
   default_expression are not evaluated to a simple value before calling
   the function. column_expression is normally evaluated exactly once
   per input row, and default_expression is evaluated each time a default
   is needed for a field.

  I've already covered the question about default_expression, but what
  this passage says about column_expression seems, at least, ambiguously
  worded, too:

  It goes without saying that /the XPath evaluator/ evaluates the
  column_expression exactly once per input row. In the standard, that's
  the only per-row evaluation happening; the column_expression SQL value
  only gets compiled to an XPath expression once at the start. (In fact,
  in the standard, it can't even be an arbitrary SQL expression, only a
  string literal. Oracle enforces that too.)

column expressions are evaluated once per row, but XPath  expression is compiled per row too, if I remember well. We designed it more tolerant as we expected possibility to store XPath expression in one column and data in second column.

Regards

Pavel
 

  It seems that our implementation is meant to extend the standard and
  actually allow the column_expression to vary per-row, and go through
  the XPath expression compiler each time. The regression test with
   COLUMNS a int PATH '' || lower(_path) || 'c'
  seems to be intended to confirm that behavior. But again, I think
  it is only confirming that LATERAL results in xmltable being called
  four consecutive times, with a different PATH in each call. It does
  not seem to demonstrate a single xmltable call doing anything special
  with recompiling a column path.

Am I overlooking something?

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 01/19/19 23:49, Pavel Stehule wrote:

> If I remember, described functionality was implemented in early patches,
> but was removed to simplify code. To now, there was not a request to do it.
> 
> Unfortunately, the documentation was not fixed.

I'll do that, as I'm working in there anyway. :)

> column expressions are evaluated once per row, but XPath  expression is
> compiled per row too, if I remember well.

I looked for evidence of that in the code, but did not find it; the
compilation appears to happen in XmlTableSetColumnFilter, which is
called from tfuncInitialize.

I can't guarantee I didn't miss something, though.

> We designed it more tolerant as
> we expected possibility to store XPath expression in one column and data in
> second column.

Perhaps if I could see an example showing the functionality... The nearest
I could find in the regression tests was the test with
 COLUMNS a int PATH '' || lower(_path) || 'c'
but, again, I think that test only demonstrates how LATERAL works, not
any behavior specific to xmltable.

Regards,
-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


ne 20. 1. 2019 v 6:06 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 01/19/19 23:49, Pavel Stehule wrote:

> If I remember, described functionality was implemented in early patches,
> but was removed to simplify code. To now, there was not a request to do it.
>
> Unfortunately, the documentation was not fixed.

I'll do that, as I'm working in there anyway. :)

> column expressions are evaluated once per row, but XPath  expression is
> compiled per row too, if I remember well.

I looked for evidence of that in the code, but did not find it; the
compilation appears to happen in XmlTableSetColumnFilter, which is
called from tfuncInitialize.

it is called per input row.


I can't guarantee I didn't miss something, though.

> We designed it more tolerant as
> we expected possibility to store XPath expression in one column and data in
> second column.

Perhaps if I could see an example showing the functionality... The nearest
I could find in the regression tests was the test with
 COLUMNS a int PATH '' || lower(_path) || 'c'
but, again, I think that test only demonstrates how LATERAL works, not
any behavior specific to xmltable.

the main reason for this was not to support support some specific patterns - just used design doesn't requires stronger changes in executor or internal multi call caching of some internal data. Probably you can find discussion related to this topic in mailing list archive.

sure - using any expression in PATH clause should to demonstrate this functionality.



Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
Hi,

On 01/20/19 00:26, Pavel Stehule wrote:
>>> column expressions are evaluated once per row, but XPath  expression is
>>> compiled per row too, if I remember well.
>>
>> I looked for evidence of that in the code, but did not find it; the
>> compilation appears to happen in XmlTableSetColumnFilter, which is
>> called from tfuncInitialize.
> 
> it is called per input row.

I could be misunderstanding what you mean by 'input row' here.

If I write a simple xmltable query where the row_expression produces
six rows:

SELECT *
FROM
  xmltable('pg_am/row'
    PASSING table_to_xml('pg_am', true, false, '')
    COLUMNS amname text PATH 'amname');

six rows are produced, though a breakpoint set on XmlTableSetColumnFilter
fires only once, from tfuncInitialize at the start of xmltable's execution.


By contrast, in the regression test example with PATH ''||lower(_path)||'c',
four rows are produced and the breakpoint fires four times.

However ... that isn't because one call to xmltable is producing four rows
and recomputing the column_expression each time.

It's because that xmltable is the RHS of a LATERAL, and the LHS of the
LATERAL is producing four tuples with different values of columns the RHS
depends on, so the RHS (xmltable) is being called four different times,
producing one row each time, still with XmlTableSetColumnFilter being called
only during initialization.


> sure - using any expression in PATH clause should to demonstrate this
> functionality.

Well, there seem to be two distinct features touched on in the docs:

1. The column_expression is allowed to be an expression, not restricted
   to a string literal as it is in the standard (and Oracle).

2. Not only is it an expression, but it's an expression whose evaluation
   is deferred and can happen more than once in the same xmltable call.

The example in the regression tests certainly demonstrates (1). Without (1),
it would be a syntax error.

I think the same example would produce the same output even with feature (2)
absent. It's LATERAL doing the magic there. So I am doubtful that it
demonstrates (2).

I put some effort last night into trying to even construct any query that
would demonstrate (2), and I came up short, but that could be my lack of
imagination. (Somewhere in that effort I happened to notice that xmltable
doesn't seem to be parsed successfully inside a ROWS FROM (...) construct,
which might be another issue for another time....)

So, if you have a way to build a query that demonstrates (2), my aha! moment
might then arrive.

Regards,
-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


ne 20. 1. 2019 v 17:06 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
Hi,

On 01/20/19 00:26, Pavel Stehule wrote:
>>> column expressions are evaluated once per row, but XPath  expression is
>>> compiled per row too, if I remember well.
>>
>> I looked for evidence of that in the code, but did not find it; the
>> compilation appears to happen in XmlTableSetColumnFilter, which is
>> called from tfuncInitialize.
>
> it is called per input row.

I could be misunderstanding what you mean by 'input row' here.

input row mean a row of processed relation. The xml value from this row can be transformed to 0..M output rows.

The column filter expressions are evaluated once per input rows, default expressions are evaluated when it is necessary - possibly once for any output row
 

If I write a simple xmltable query where the row_expression produces
six rows:

SELECT *
FROM
  xmltable('pg_am/row'
    PASSING table_to_xml('pg_am', true, false, '')
    COLUMNS amname text PATH 'amname');

six rows are produced, though a breakpoint set on XmlTableSetColumnFilter
fires only once, from tfuncInitialize at the start of xmltable's execution.


By contrast, in the regression test example with PATH ''||lower(_path)||'c',
four rows are produced and the breakpoint fires four times.

it is expected - the input relation has four lines - the function was 4x initialized. In this case 1 call of xmltable produces 1 row.
 

However ... that isn't because one call to xmltable is producing four rows
and recomputing the column_expression each time.

It's because that xmltable is the RHS of a LATERAL, and the LHS of the
LATERAL is producing four tuples with different values of columns the RHS
depends on, so the RHS (xmltable) is being called four different times,
producing one row each time, still with XmlTableSetColumnFilter being called
only during initialization.


> sure - using any expression in PATH clause should to demonstrate this
> functionality.

Well, there seem to be two distinct features touched on in the docs:

1. The column_expression is allowed to be an expression, not restricted
   to a string literal as it is in the standard (and Oracle).

2. Not only is it an expression, but it's an expression whose evaluation
   is deferred and can happen more than once in the same xmltable call.

yes, it is any expressions used there are evaluated per 1 call. If input relation has N rows, then xmltable is initialized N times - by different words. xmltable is evaluated independently for any processed XML document.



The example in the regression tests certainly demonstrates (1). Without (1),
it would be a syntax error.

I think the same example would produce the same output even with feature (2)
absent. It's LATERAL doing the magic there. So I am doubtful that it
demonstrates (2).

LATERAL is necessary, because  XMLTABLE can be used only in FROM clause, and in this case XMLTABLE has mutable parameters.


I put some effort last night into trying to even construct any query that
would demonstrate (2), and I came up short, but that could be my lack of
imagination. (Somewhere in that effort I happened to notice that xmltable
doesn't seem to be parsed successfully inside a ROWS FROM (...) construct,
which might be another issue for another time....)

So, if you have a way to build a query that demonstrates (2), my aha! moment
might then arrive.

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 01/20/19 11:55, Pavel Stehule wrote:
> input row mean a row of processed relation. The xml value from this row can
> be transformed to 0..M output rows.
> 
> The column filter expressions are evaluated once per input rows, default
> expressions are evaluated when it is necessary - possibly once for any
> output row
> ...
> it is expected - the input relation has four lines - the function was 4x
> initialized. In this case 1 call of xmltable produces 1 row.

Good ... I think we're converging on a shared understanding.

I am just used to speaking of xmltable as a simple function that executes
one row_expression one time against one supplied input, and generates 0..M
output rows from it. If there are multiple rows in a relation that you
want to apply xmltable to, that's a simple matter of calling xmltable
multiple times (which is just what SQL is doing when the xmltable is on
the RHS of an explicit or implied LATERAL).

The upshot seems to be that there is nothing necessarily special about
how xmltable treats its column_expressions: it compiles them once upon
entry to the function, as one would naïvely expect. (Or, if there is
anything more special about how the column_expression is being handled,
it seems not to be necessary, as the naïve behavior would be adequate.)

Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL
functions," is more likely to perplex readers than to enlighten them.
What it says about column_expression does not seem to lead to any useful
difference from the behavior if it were "just like regular PostgreSQL
functions".

The part about usefully using volatile functions in default_expression
remains important to mention.

The statement in an earlier paragraph that "It is possible for a
default_expression to reference the value of output columns that appear
prior to it in the column list" still may need some rework, because it
does not seem possible to refer to prior columns /within xmltable's own
column list/ (though that could be useful, and I think it is intended
in the standard). Doesn't seem to work in Oracle either....

While it does seem possible to refer to columns supplied by
/earlier FROM items in the containing SELECT/, that simply results in
multiple calls of xmltable, just as in the column_expression case.

>> I think the same example would produce the same output even with feature
>> (2)
>> absent. It's LATERAL doing the magic there. So I am doubtful that it
>> demonstrates (2).
> 
> LATERAL is necessary, because  XMLTABLE can be used only in FROM clause,
> and in this case XMLTABLE has mutable parameters.

For what it's worth, if I repeat the query with the word LATERAL removed,
it works just the same. I think that's simply because the LATERAL behavior
is implied for a function-call FROM item, so the explicit word isn't needed.
The main thing is, evaluation proceeds in the way described under LATERAL
in the ref page for SELECT.

Regards,
-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL
functions," is more likely to perplex readers than to enlighten them.
What it says about column_expression does not seem to lead to any useful
difference from the behavior if it were "just like regular PostgreSQL
functions".

regular Postgres' functions has evaluated all arguments before own execution. I think so this note is related much more to expressions used as defaults.


The part about usefully using volatile functions in default_expression
remains important to mention.

The statement in an earlier paragraph that "It is possible for a
default_expression to reference the value of output columns that appear
prior to it in the column list" still may need some rework, because it
does not seem possible to refer to prior columns /within xmltable's own
column list/ (though that could be useful, and I think it is intended
in the standard). Doesn't seem to work in Oracle either....

While it does seem possible to refer to columns supplied by
/earlier FROM items in the containing SELECT/, that simply results in
multiple calls of xmltable, just as in the column_expression case.

>> I think the same example would produce the same output even with feature
>> (2)
>> absent. It's LATERAL doing the magic there. So I am doubtful that it
>> demonstrates (2).
>
> LATERAL is necessary, because  XMLTABLE can be used only in FROM clause,
> and in this case XMLTABLE has mutable parameters.

For what it's worth, if I repeat the query with the word LATERAL removed,
it works just the same. I think that's simply because the LATERAL behavior
is implied for a function-call FROM item, so the explicit word isn't needed.
The main thing is, evaluation proceeds in the way described under LATERAL
in the ref page for SELECT.

In this case the LATERAL keyword is optional - with or without this keyword it is lateral join.

Regards

Pavel
 

Regards,
-Chap

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 01/20/19 12:48, Pavel Stehule wrote:
>>
>> Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL
>> functions," is more likely to perplex readers than to enlighten them.
>> What it says about column_expression does not seem to lead to any useful
>> difference from the behavior if it were "just like regular PostgreSQL
>> functions".
> 
> regular Postgres' functions has evaluated all arguments before own
> execution. I think so this note is related much more to expressions used as
> defaults.

Sure, but again, is there an example, or can one easily be constructed,
that shows the default expressions working in such a way?

I am not able to use a default expression to refer to an earlier
column in the column list of the xmltable call.

I am able to use a default expression to refer to a column of an earlier
FROM item in the enclosing SELECT. But such a query ends up having LATERAL
form (whether or not the word LATERAL is used), and re-executes xmltable
whenever the referenced column value changes. In that case, whether the
default argument is evaluated at function entry or later doesn't seem
to matter: the function is re-executed, so evaluating the new default
at the time of entry is sufficient.

So, I have still not been able to construct a query that requires the
deferred evaluation behavior. But perhaps there is a way I haven't
thought of.

Regards,
-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 01/20/19 17:13, Chapman Flack wrote:
> On 01/20/19 12:48, Pavel Stehule wrote:
>> regular Postgres' functions has evaluated all arguments before own
>> execution. I think so this note is related much more to expressions used as
>> defaults.
> 
> Sure, but again, is there an example, or can one easily be constructed,
> that shows the default expressions working in such a way?

To make my question more concrete, here is the current regression test
query that uses an SQL expression for a default:

SELECT xmltable.*
FROM
  xmltest2,
  xmltable(('/d/r/' || lower(_path) || 'c')
    PASSING x
    COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
 a
----
 11
 12
 13
 14
(4 rows)


Here is the same query desugared into a call of the PL/Java "xmltable":

SELECT xmltable.*
FROM
  xmltest2,
  LATERAL (SELECT x AS ".",   ascii(_path) - 54 AS "A_DFT") AS p,
  "xmltable"(('/d/r/' || lower(_path) || 'c'),
    PASSING => p,
    COLUMNS => ARRAY[
      'let $a := x return xs:int(if (exists($a)) then $a else $A_DFT)'
    ]
  ) AS (a int);
 a
----
 11
 12
 13
 14
(4 rows)


So the PL/Java version works and produces the same results. And yet
it certainly is a "regular PostgreSQL function" made with CREATE FUNCTION,
no special treatment of arguments, all evaluated before entry in the usual
way.

So it appears that this example does not depend on any special treatment
of the default_expression.

Is there an example that can be constructed that would depend on the
special treatment (in which case, the PL/Java implementation would be
unable to produce the same result)?

Regards,
-Chap



... the xs:int(...) cast above is needed for now, just because the PL/Java
implementation does not yet include the standard's algorithm to find
the right cast automatically.


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 01/20/19 20:07, Chapman Flack wrote:

> So it appears that this example does not depend on any special treatment
> of the default_expression.
> 
> Is there an example that can be constructed that would depend on the
> special treatment (in which case, the PL/Java implementation would be
> unable to produce the same result)?

I see that I briefly forgot one difference that does matter, namely the
timing of a call to a volatile function like nextval. That detail certainly
needs to remain in the docs.

I am left wondering if that might be the only effect of the deferred
argument evaluation that really does matter.

Regards,
-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


ne 20. 1. 2019 23:13 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 01/20/19 12:48, Pavel Stehule wrote:
>>
>> Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL
>> functions," is more likely to perplex readers than to enlighten them.
>> What it says about column_expression does not seem to lead to any useful
>> difference from the behavior if it were "just like regular PostgreSQL
>> functions".
>
> regular Postgres' functions has evaluated all arguments before own
> execution. I think so this note is related much more to expressions used as
> defaults.

Sure, but again, is there an example, or can one easily be constructed,
that shows the default expressions working in such a way?

I am not able to use a default expression to refer to an earlier
column in the column list of the xmltable call.


probably you can see the effect if you use some volatile function .. random(), nextval(),

I think so notice in documentation was not a motivation to use it. It was explanation of implementation and warnings against side effect.



I am able to use a default expression to refer to a column of an earlier
FROM item in the enclosing SELECT. But such a query ends up having LATERAL
form (whether or not the word LATERAL is used), and re-executes xmltable
whenever the referenced column value changes. In that case, whether the
default argument is evaluated at function entry or later doesn't seem
to matter: the function is re-executed, so evaluating the new default
at the time of entry is sufficient.

it has sense only for volatile functions. it was not often. On second hand deferred evaluation shoul not be a problem, and more, it is evaluated only when it is necessary, what is more sensible for me.


So, I have still not been able to construct a query that requires the
deferred evaluation behavior. But perhaps there is a way I haven't
thought of.

Regards,
-Chap

Re: House style for DocBook documentation?

От
Alvaro Herrera
Дата:
On 2019-Jan-19, Chapman Flack wrote:

> I have noticed a couple of things:
> 
> - 'SQL' is often marked up as <acronym>SQL</acronym>, but far from always.
>
> - no such markup is applied to 'JSON' or 'XML' at all, at least
>   not in func.sgml.

I think these inconsistencies just stem from lack of a strong reviewer
hand on the topic.  Let's change that?

> - there is a README.links with this guideline:
> 
>   o  Do not use text with <ulink> so the URL appears in printed output
> 
>   but a grep -r in doc/src/sgml turns up 112 uses that observe the
>   guideline, and 147 that supply link text.

I think the README.links was written in the SGML times; now that we're
in XML it may need to be reconsidered.

> (thinks to self half-seriously about an XSL transform for generating
> printed output that could preserve link-texted links, add raised numbers,
> and produce a numbered URLs section at the back)

Well, if you have the time and inclination, and you think such changes
are improvements, feel free to propose them.  Do keep in mind we have a
number of outputs that would be good to keep consistent.

Thanks,

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: House style for DocBook documentation?

От
Chapman Flack
Дата:
On 01/21/19 09:12, Alvaro Herrera wrote:

>> (thinks to self half-seriously about an XSL transform for generating
>> printed output that could preserve link-texted links, add raised numbers,
>> and produce a numbered URLs section at the back)
> 
> Well, if you have the time and inclination, and you think such changes
> are improvements, feel free to propose them.  Do keep in mind we have a
> number of outputs that would be good to keep consistent.

Well, "consistent" what I was half-thinking about, FSVO "consistent".

For me, if I'm looking at an online document, I would prefer to see
the descriptive text of the link, rather than a long jaggy URL. If I
want to see the URL, I can hover over it, and if I want to go there,
I can click it.

But the point's well taken that in /printed output/, that's of no use.
Which is, in a sense, an inconsistency: in one format, you can follow the
links, while in another, you're out of luck.

Maybe a simpler transform for printed output, rather than collecting
all URLs into one section at the back, would just be to follow any
<ulink> that has link text with a <footnote> containing the same ulink
without the link text, so it shows the URL, and that would be right at
the bottom of the same 'page'.

That'd be an introductory XSL exercise....

In practice, applying such a transform "for printed output" would
probably mean applying it when generating PDF output, which of course
can also be viewed online (and probably most often is, these days).

So preserving the original ulink run into the text is still of use,
in a PDF viewer that can follow links, but adding the footnote ensures
that an actual hard copy is still usable.

I wouldn't think it important to apply the same treatment when making HTML.
So maybe the right value of "consistent" is the one that comes from
listing out the various output formats and specifying the right
transformation to be applied to each one.

(Now wonders if there's a way to do the same transform into HTML while
styling the footnote and marker to hide behind @media print....)

Regards,
-Chap


Re: House style for DocBook documentation?

От
"Joshua D. Drake"
Дата:
On 1/21/19 8:46 AM, Chapman Flack wrote:
> On 01/21/19 09:12, Alvaro Herrera wrote:
>
>>> (thinks to self half-seriously about an XSL transform for generating
>>> printed output that could preserve link-texted links, add raised numbers,
>>> and produce a numbered URLs section at the back)
>> Well, if you have the time and inclination, and you think such changes
>> are improvements, feel free to propose them.  Do keep in mind we have a
>> number of outputs that would be good to keep consistent.
> Well, "consistent" what I was half-thinking about, FSVO "consistent".
>
> For me, if I'm looking at an online document, I would prefer to see
> the descriptive text of the link, rather than a long jaggy URL. If I
> want to see the URL, I can hover over it, and if I want to go there,
> I can click it.
>
> But the point's well taken that in /printed output/, that's of no use.
> Which is, in a sense, an inconsistency: in one format, you can follow the
> links, while in another, you're out of luck.
>
> Maybe a simpler transform for printed output, rather than collecting
> all URLs into one section at the back, would just be to follow any
> <ulink> that has link text with a <footnote> containing the same ulink
> without the link text, so it shows the URL, and that would be right at
> the bottom of the same 'page'.
>
> That'd be an introductory XSL exercise....
>
> In practice, applying such a transform "for printed output" would
> probably mean applying it when generating PDF output, which of course
> can also be viewed online (and probably most often is, these days).

I don't think that is a good idea. PDFs have had the ability to embed 
hyperlinks under descriptive text for years. If we are going to expand 
links for printed output, we should have a specific build / modification 
to the make file for printed output. I also wonder if we are trying to 
solve the 1% problem here. Who is really going to "print" our docs? If 
they do print the docs, they are likely not going to "type in" a long 
URL. They are going to go online (or to the PDF) and click the link that 
they saw within the printed page.

JD

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn and Network: https://postgresconf.org
***  A fault and talent of mine is to tell it exactly how it is.  ***



Re: House style for DocBook documentation?

От
Alvaro Herrera
Дата:
On 2019-Jan-21, Chapman Flack wrote:

> But the point's well taken that in /printed output/, that's of no use.
> Which is, in a sense, an inconsistency: in one format, you can follow the
> links, while in another, you're out of luck.
> 
> Maybe a simpler transform for printed output, rather than collecting
> all URLs into one section at the back, would just be to follow any
> <ulink> that has link text with a <footnote> containing the same ulink
> without the link text, so it shows the URL, and that would be right at
> the bottom of the same 'page'.

Of course, the text would also be clickable, right?  I think putting the
URL in a footnote is good in that case; it works both on screen and on
paper, which should alleviate JD's concerns.

> I wouldn't think it important to apply the same treatment when making HTML.

Right, only PDF.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: House style for DocBook documentation?

От
Chapman Flack
Дата:
On 01/21/19 12:07, Joshua D. Drake wrote:
> Who is really going to "print" our docs? If they do print the
> docs, they are likely not going to "type in" a long URL.

QR code in footnote (ducks and runs).


Re: House style for DocBook documentation?

От
"Joshua D. Drake"
Дата:
On 1/21/19 10:01 AM, Alvaro Herrera wrote:
> On 2019-Jan-21, Chapman Flack wrote:
>
>> But the point's well taken that in /printed output/, that's of no use.
>> Which is, in a sense, an inconsistency: in one format, you can follow the
>> links, while in another, you're out of luck.
>>
>> Maybe a simpler transform for printed output, rather than collecting
>> all URLs into one section at the back, would just be to follow any
>> <ulink> that has link text with a <footnote> containing the same ulink
>> without the link text, so it shows the URL, and that would be right at
>> the bottom of the same 'page'.
> Of course, the text would also be clickable, right?  I think putting the
> URL in a footnote is good in that case; it works both on screen and on
> paper, which should alleviate JD's concerns.

Yeah I could see that. I thought about that but was wondering if it was 
possible to auto cite?

JD


>
>> I wouldn't think it important to apply the same treatment when making HTML.
> Right, only PDF.
>

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn and Network: https://postgresconf.org
***  A fault and talent of mine is to tell it exactly how it is.  ***



Re: House style for DocBook documentation?

От
"Joshua D. Drake"
Дата:
On 1/21/19 10:11 AM, Chapman Flack wrote:
> On 01/21/19 12:07, Joshua D. Drake wrote:
>> Who is really going to "print" our docs? If they do print the
>> docs, they are likely not going to "type in" a long URL.
> QR code in footnote (ducks and runs).

Funny, although I know why you said that. I don't think it is that bad 
of an idea. Everyone uses QR Codes now. Have a QR code that 
automatically opens a page that lists all the expanded links based on 
the page it is placed?

That is pretty cool but I am not sure if that is something that would 
happen in this round.

JD


>

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn and Network: https://postgresconf.org
***  A fault and talent of mine is to tell it exactly how it is.  ***



Re: House style for DocBook documentation?

От
Alvaro Herrera
Дата:
On 2019-Jan-21, Joshua D. Drake wrote:

> On 1/21/19 10:01 AM, Alvaro Herrera wrote:
> > On 2019-Jan-21, Chapman Flack wrote:
> > 
> > > But the point's well taken that in /printed output/, that's of no use.
> > > Which is, in a sense, an inconsistency: in one format, you can follow the
> > > links, while in another, you're out of luck.
> > > 
> > > Maybe a simpler transform for printed output, rather than collecting
> > > all URLs into one section at the back, would just be to follow any
> > > <ulink> that has link text with a <footnote> containing the same ulink
> > > without the link text, so it shows the URL, and that would be right at
> > > the bottom of the same 'page'.
> > Of course, the text would also be clickable, right?  I think putting the
> > URL in a footnote is good in that case; it works both on screen and on
> > paper, which should alleviate JD's concerns.
> 
> Yeah I could see that. I thought about that but was wondering if it was
> possible to auto cite?

Sorry, what do you mean with auto cite?  Put all links at the end of the
section/chapter/book?  That seems less usable to me (you have to find
out the page where the links appear, and make sure to print the right
pages)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: House style for DocBook documentation?

От
Chapman Flack
Дата:
On 01/21/19 13:14, Joshua D. Drake wrote:
>> Of course, the text would also be clickable, right?  I think putting the
>> URL in a footnote is good in that case; it works both on screen and on
>> paper, which should alleviate JD's concerns.
> 
> Yeah I could see that. I thought about that but was wondering if it was

It looks like the easiest way to integrate such a behavior into the
current Makefile would be not as a separate DocBook->DocBook transform
in advance, but simply by editing the existing stylesheet-fo.xsl that
produces the FO input for generating the PDF.

That would mean learning some FO, which I've wanted to do for a while,
but haven't yet, so it stops looking like something I might experiment
with this afternoon. OTOH, it could mean more flexibility in how the
presentation should look.

I note in passing that the google result [1] is nonempty....

-Chap

[1] https://www.google.com/search?q=xsl-fo+qr+code


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
Hi,

In two places in the XMLTABLE implementation (XmlTableFetchRow, iterating
through a nodeset returned by the row_expression, and XmlTableGetValue,
going through a nodeset returned by the column_expression), the iteration
proceeds in index order through xpathobj->nodesetval->nodeTab.

The same happens in the older xml_xpathobjtoxmlarray, producing the array
result of the xpath() function.

In <libxml/xpath.h> one finds this unsettling comment:

 xmlNodePtr *nodeTab; /* array of nodes in no particular order */


So far, no matter what oddball XPath expressions I think up, nodeTab
does seem to end up in document order. It would be comforting to document
that, but the comment suggests it might not be guaranteed.

Are you aware of any statement I might have missed in the libxml docs,
where they commit to XPath evaluation returning a nodeset where nodeTab
has a predictable order? If there is a statement to that effect somewhere,
it might be worth citing in a comment in xml.c.

Regards,
-Chap



Here is a fairly oddball query, generating an XML document with about
3k names in descending order, partitioning those elements into subsets
having the same second character of the name, and unioning those back
together. You'd sort of expect that to produce a result nodeTab in goofy
order if anything could, but no, the results seem verifiably in descending
order every time, suggesting the result nodeTab is indeed being put
in document order before the XPath evaluator returns. If only they would
document that.

WITH nodeset_order_check(name, prev_name) AS (
 SELECT
  x, lag(x) OVER (ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
  FROM
   (SELECT
     string_agg(DISTINCT
      'x/n[substring(.,2,1) = "' || substring(proname from 2 for 1) || '"]',
      '|')
     FROM pg_proc) AS rowx(rowx),
   (SELECT
     XMLELEMENT(NAME x,
                XMLAGG(XMLELEMENT(NAME n, proname) ORDER BY proname DESC))
     FROM pg_proc) AS src(src),
   XMLTABLE(rowx PASSING src COLUMNS x text PATH '.')
)
SELECT every(prev_name >= name IS NOT FALSE) FROM nodeset_order_check;
 every
-------
 t
(1 row)


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
Hi,

On 01/21/19 01:33, Pavel Stehule wrote:
> ne 20. 1. 2019 23:13 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:

>> form (whether or not the word LATERAL is used), and re-executes xmltable
>> whenever the referenced column value changes. In that case, whether the
>> default argument is evaluated at function entry or later doesn't seem
>> to matter: the function is re-executed, so evaluating the new default
>> at the time of entry is sufficient.
> 
> it has sense only for volatile functions. it was not often. On second hand
> deferred evaluation shoul not be a problem, and more, it is evaluated only
> when it is necessary, what is more sensible for me.

That makes sense. I trimmed the language about input rows and referring to
earlier columns, and put more emphasis on the usefulness of evaluating
volatile functions only when needed.

I am:

- re-attaching xmltable-xpath-result-processing-bugfix-5.patch unchanged
  (just so CF app does not lose track)

- re-attaching xmltable-xmlexists-passing-mechanisms-1.patch unchanged

- attaching for the first time xml-functions-type-docfix-1.patch

The doc patch is made to go on top of the passing-mechanisms patch
(there were some doc changes in passing-mechanisms, changed again in
the new patch to be links to the added Limits and Compatibility section).

I hope the patched docs describe accurately what we have at this point.

Regards,
-Chap

Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


pá 25. 1. 2019 v 5:46 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
Hi,

On 01/21/19 01:33, Pavel Stehule wrote:
> ne 20. 1. 2019 23:13 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:

>> form (whether or not the word LATERAL is used), and re-executes xmltable
>> whenever the referenced column value changes. In that case, whether the
>> default argument is evaluated at function entry or later doesn't seem
>> to matter: the function is re-executed, so evaluating the new default
>> at the time of entry is sufficient.
>
> it has sense only for volatile functions. it was not often. On second hand
> deferred evaluation shoul not be a problem, and more, it is evaluated only
> when it is necessary, what is more sensible for me.

That makes sense. I trimmed the language about input rows and referring to
earlier columns, and put more emphasis on the usefulness of evaluating
volatile functions only when needed.

I am:

- re-attaching xmltable-xpath-result-processing-bugfix-5.patch unchanged
  (just so CF app does not lose track)

- re-attaching xmltable-xmlexists-passing-mechanisms-1.patch unchanged

- attaching for the first time xml-functions-type-docfix-1.patch

The doc patch is made to go on top of the passing-mechanisms patch
(there were some doc changes in passing-mechanisms, changed again in
the new patch to be links to the added Limits and Compatibility section).

I hope the patched docs describe accurately what we have at this point.

looks well

Pavel


Regards,
-Chap

Re: House style for DocBook documentation?

От
Peter Eisentraut
Дата:
On 19/01/2019 21:24, Chapman Flack wrote:
> (thinks to self half-seriously about an XSL transform for generating
> printed output that could preserve link-texted links, add raised numbers,
> and produce a numbered URLs section at the back)

External links already create footnotes in the PDF output.  Is that
different from what you are saying?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: House style for DocBook documentation?

От
Chapman Flack
Дата:
On 01/25/19 09:01, Peter Eisentraut wrote:
> On 19/01/2019 21:24, Chapman Flack wrote:
>> (thinks to self half-seriously about an XSL transform for generating
>> printed output that could preserve link-texted links, add raised numbers,
>> and produce a numbered URLs section at the back)
> 
> External links already create footnotes in the PDF output.  Is that
> different from what you are saying?

That might, only, indicate that I was just thinking aloud in email and
had not gone and checked in PDF output to see how the links were handled.

Yes, it could very possibly indicate that.

If they are already processed that way, does that mean the

o  Do not use text with <ulink> so the URL appears in printed output

in README.links should be considered obsolete, and removed even, and
doc authors should feel free to put link text in <ulink> without
hesitation?

Regards,
-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 01/25/19 00:45, Pavel Stehule wrote:
> pá 25. 1. 2019 v 5:46 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:
>> I am:
>> - re-attaching xmltable-xpath-result-processing-bugfix-5.patch unchanged
>>   (just so CF app does not lose track)
>> - re-attaching xmltable-xmlexists-passing-mechanisms-1.patch unchanged
>> - attaching for the first time xml-functions-type-docfix-1.patch
>>
>> The doc patch is made to go on top of the passing-mechanisms patch

Realized xmltable-xmlexists-passing-mechanisms-1.patch didn't add
a regression test.  Here attaching (or re-attaching):

- xmltable-xpath-result-processing-bugfix-5.patch - unchanged
- xmltable-xmlexists-passing-mechanisms-2.patch - now with test
- xml-functions-type-docfix-1.patch - unchanged

I'll venture a review opinion that all of this applies, builds, and passes
check-world on top of 18c0da8, and that, of the issues I had identified at
the start of this thread, these changes resolve the ones they set out to
resolve.

But the second two patches are my own work, so another reviewer is needed.
The passing-mechanisms patch is tiny while the docfix patch is not, so
there's an opening for a reviewer with an interest in documentation. :)

There is still nothing in this patch set to address [1], though that
also seems worth doing, perhaps in another patch, and probably not
difficult, perhaps needing only a regex.

And of course we're still saddled with all the unfixable limits
of XPath 1.0; this patch set is fixing a few peripheral fixable things
around that.

-Chap


[1] https://www.postgresql.org/message-id/5BD1C44B.6040300%40anastigmatix.net

Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested

As the reporter of the issues raised in this email thread, I've reviewed the first patch
and contributed the second and third.

WHAT THE PATCHES DO:

xmltable-xpath-result-processing-bugfix-5.patch contains code changes correcting
a subset of the issues that were raised in this email thread.

xmltable-xmlexists-passing-mechanisms-2.patch adjusts the grammar to allow the XML
parameter passing mechanism BY VALUE as well as BY REF. Both are ignored, but
formerly BY VALUE was a syntax error, which was unintuitive considering that BY VALUE
is the passing mechanism PostgreSQL implements (XML node identities are not preserved).

xml-functions-type-docfix-1.patch conforms the documentation to reflect the changes in
this patch set and the limitations identified in this thread.

WHAT I HAVE REVIEWED:

I have applied all three patches over 18c0da8 and confirmed that installcheck-world passes
and that the code changes resolve the issues they set out to resolve.

I've made no entry for "spec compliant" because the question is moot; the spec is written
in terms of the XQuery language, types, and concepts, and these facilities in PG are
implemented on XPath 1.0, which doesn't have those. But the changes in this patch set
do make the PG behaviors more, well, closely analogous to the way the spec compliant
functions would behave.

WHAT I HAVE NOT REVIEWED:

The passing-mechanisms and docfix patches are my own work, so there should be another
reviewer who is not me. I've looked closely at the technical, SQL/XML behavior aspects already,
but a reviewer with an eye for documentation would be welcome.

I'll venture my opinion that this is ready-for-committer to the extent of my own review, but will
leave the status at needs-review for a not-me reviewer to update.

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:
Hi

so 26. 1. 2019 v 4:25 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested

maybe more simple will be work with original commitfest subscriptions. I can do review of xmltable-xmlexists-passing-mechanisms-2.patch. Documentation review will be harder - I am not a native speaker and I have not a necessary knowledges of XQuery (probably only you have this knowledge).
 

As the reporter of the issues raised in this email thread, I've reviewed the first patch
and contributed the second and third.

WHAT THE PATCHES DO:

xmltable-xpath-result-processing-bugfix-5.patch contains code changes correcting
a subset of the issues that were raised in this email thread.

xmltable-xmlexists-passing-mechanisms-2.patch adjusts the grammar to allow the XML
parameter passing mechanism BY VALUE as well as BY REF. Both are ignored, but
formerly BY VALUE was a syntax error, which was unintuitive considering that BY VALUE
is the passing mechanism PostgreSQL implements (XML node identities are not preserved).

xml-functions-type-docfix-1.patch conforms the documentation to reflect the changes in
this patch set and the limitations identified in this thread.

WHAT I HAVE REVIEWED:

I have applied all three patches over 18c0da8 and confirmed that installcheck-world passes
and that the code changes resolve the issues they set out to resolve.

I've made no entry for "spec compliant" because the question is moot; the spec is written
in terms of the XQuery language, types, and concepts, and these facilities in PG are
implemented on XPath 1.0, which doesn't have those. But the changes in this patch set
do make the PG behaviors more, well, closely analogous to the way the spec compliant
functions would behave.

WHAT I HAVE NOT REVIEWED:

The passing-mechanisms and docfix patches are my own work, so there should be another
reviewer who is not me. I've looked closely at the technical, SQL/XML behavior aspects already,
but a reviewer with an eye for documentation would be welcome.

I'll venture my opinion that this is ready-for-committer to the extent of my own review, but will
leave the status at needs-review for a not-me reviewer to update.

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 01/25/19 19:37, Chapman Flack wrote:
> There is still nothing in this patch set to address [1], though that
> also seems worth doing, perhaps in another patch, and probably not
> difficult, perhaps needing only a regex.

Heck, it could be even simpler than that. If some XML input has a DTD,
the attempt to parse it as 'content' with libxml is sure to fail early
in the parse (because that's where the DTD is found). If libxml reports
enough error detail to show it failed because of a DTD—and it appears to:

DETAIL:  line 1: StartTag: invalid element name
<!DOCTYPE foo>

—then simply recognize that error and reparse as 'document' on the spot.
The early-failing first parse won't have cost much, and there is probably
next to nothing to gain by trying to be any more clever.

The one complication might that there seem to be versions of libxml that
report error detail differently (hence the variant regression test expected
files), so the code might have to recognize a couple different forms.

-Chap

> [1] https://www.postgresql.org/message-id/5BD1C44B.6040300%40anastigmatix.net



Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 01/25/19 22:53, Pavel Stehule wrote:

> Documentation review will be harder - I am not a native speaker and I have
> not a necessary knowledges of XQuery (probably only you have this
> knowledge).

The doc patch is enough that I think it would be ideal to somehow attract
a native speaker who has interest in documentation to review it.
Even if that is someone without much XQuery-specific knowledge—I'll happily
answer whatever questions they ask about that. :)

-Chap


Re: House style for DocBook documentation?

От
Peter Eisentraut
Дата:
On 25/01/2019 15:37, Chapman Flack wrote:
>> External links already create footnotes in the PDF output.  Is that
>> different from what you are saying?
> That might, only, indicate that I was just thinking aloud in email and
> had not gone and checked in PDF output to see how the links were handled.
> 
> Yes, it could very possibly indicate that.
> 
> If they are already processed that way, does that mean the
> 
> o  Do not use text with <ulink> so the URL appears in printed output
> 
> in README.links should be considered obsolete, and removed even, and
> doc authors should feel free to put link text in <ulink> without
> hesitation?

I think it's obsolete, yes.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


so 26. 1. 2019 v 1:38 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 01/25/19 00:45, Pavel Stehule wrote:
> pá 25. 1. 2019 v 5:46 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:
>> I am:
>> - re-attaching xmltable-xpath-result-processing-bugfix-5.patch unchanged
>>   (just so CF app does not lose track)
>> - re-attaching xmltable-xmlexists-passing-mechanisms-1.patch unchanged
>> - attaching for the first time xml-functions-type-docfix-1.patch
>>
>> The doc patch is made to go on top of the passing-mechanisms patch

Realized xmltable-xmlexists-passing-mechanisms-1.patch didn't add
a regression test.  Here attaching (or re-attaching):

- xmltable-xpath-result-processing-bugfix-5.patch - unchanged
- xmltable-xmlexists-passing-mechanisms-2.patch - now with test
- xml-functions-type-docfix-1.patch - unchanged

I'll venture a review opinion that all of this applies, builds, and passes
check-world on top of 18c0da8, and that, of the issues I had identified at
the start of this thread, these changes resolve the ones they set out to
resolve.

But the second two patches are my own work, so another reviewer is needed.
The passing-mechanisms patch is tiny while the docfix patch is not, so
there's an opening for a reviewer with an interest in documentation. :)

There is still nothing in this patch set to address [1], though that
also seems worth doing, perhaps in another patch, and probably not
difficult, perhaps needing only a regex.

And of course we're still saddled with all the unfixable limits
of XPath 1.0; this patch set is fixing a few peripheral fixable things
around that.


I am sending a review of these patches

xmltable-xpath-result-processing-bugfix-5.patch - I'll skip it - just all tests passed

xmltable-xmlexists-passing-mechanisms-2.patch - this patch introduce new PASSING mechanism BY VALUE - it is just syntactic sugar due compatibility with standard. It is unhappy so previous implementation was broken and introduced "BY REF" instead "BY VALUE", but this bug should be fixed 10 years ago. It change nothing, all tests passed and the documentation looks ok.

Last patch is documentation only patch - I am thinking so the difference and limits our implementation of XPath based functions are described well and correctly.

I'll mark this patch as ready for commiters.

Regards

Pavel
 
-Chap


[1] https://www.postgresql.org/message-id/5BD1C44B.6040300%40anastigmatix.net

Re: PostgreSQL vs SQL/XML Standards

От
Michael Paquier
Дата:
On Thu, Jan 31, 2019 at 04:26:31PM +0100, Pavel Stehule wrote:
> I'll mark this patch as ready for commiters.

For now I have moved the patch to the next CF, with the same status.
--
Michael

Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 02/01/19 20:20, Michael Paquier wrote:
> On Thu, Jan 31, 2019 at 04:26:31PM +0100, Pavel Stehule wrote:
>> I'll mark this patch as ready for commiters.
> 
> For now I have moved the patch to the next CF, with the same status.

I wonder whether, given the move to next CF, it makes sense to change
the title of the CF entry from "XMLTABLE" to, more generically, XML
improvements, and get one or two more small changes in:

- get XMLPARSE(CONTENT... (and cast-to-xml with XMLOPTION=content) to
  succeed even for content with DTDs, so that the content subtype really
  does fully include the document subtype, aligning it with the SQL:2006+
  standard. I think this would be a simple patch that I can deliver early
  this month, and Tom found reports where the current behavior already
  bites people in pg_restore. Its only effect would be to allow a currently-
  failing case to succeed (and stop biting people).

- get XMLEXISTS and XMLTABLE to allow passing of named parameters. I have
  less of a sense of how difficult that might be, but I see that the libxml
  xpath API does allow them. I don't know whether they were left out of the
  original development just arbitrarily, or whether some effort was made
  and ran into some problem.

  The value of supporting the named parameters, especially when the library
  limits us to XPath 1.0, is that the XPath 1.0 limitation that a value
  passed as the context item can only be an XML 'document' only applies to
  the context item, not to named parameters. So if somebody is trying to
  port an expression ...'foo(.)' PASSING bar... and bar is not in document
  form, there's a simple rewrite to ...'foo($a) PASSING bar AS a...
  whereas if we can't do named parameters, frustration ensues.

Again, the only effect of such a change would be to permit something that
currently isn't possible. I don't think I can promise to work on a patch
for the second issue, but they both seem worthwhile, and I'm happy to
work on the first.

It seems to me these changes and the doc patch to go with them are
closely enough related to fit in one CF entry that's still smaller and
simpler than many, and that shouldn't be too difficult to review for v12,
but I'll defer to more experienced voices.

Regards,
-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
Note: I sent an email last night with updated patches, which was not received because of a spamhaus reputation issue
formy email provider.
 

In working that out with my provider, at the moment I cannot send email at all, so I am using this comment to explain
whythe status went back to "needs review" with no new patches yet. I'll send them again when I can. 

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
[Resending to list so commitfest app will see it; the list blocked
this message the first time on a mail reputation issue. Sorry for
the duplication. I've removed the individual cc:s from this message.]


On 02/05/19 23:16, Chapman Flack wrote:
> I wonder whether, given the move to next CF, it makes sense to change
> the title of the CF entry from "XMLTABLE" to, more generically, XML
> improvements, and get one or two more small changes in:

Interpreting the crickets as approval, I have changed the title of the
CF entry, and the status back to Needs Review, with these patches
attached:

xmltable-xpath-result-processing-bugfix-6.patch
xmltable-xmlexists-passing-mechanisms-3.patch
xml-functions-type-docfix-2.patch
xml-content-2006-1.patch

That last one is new, and everything is rebased (onto 068503c).

xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
output for the no-libxml case that was left out of -5.

xml-functions-type-docfix-2.patch removes one more sentence I had meant
to remove[1] but forgotten to.

xml-content-2006-1.patch does this:

> - get XMLPARSE(CONTENT... (and cast-to-xml with XMLOPTION=content) to
>   succeed even for content with DTDs, so that the content subtype really
>   does fully include the document subtype, aligning it with the SQL:2006+
>   standard. I think this would be a simple patch that I can deliver early
>   this month, and Tom found reports where the current behavior already
>   bites people in pg_restore. Its only effect would be to allow a currently-
>   failing case to succeed (and stop biting people).

It works as suggested in [2], just by intercepting the error if a
parse-as-content trips over a DTD, and retrying as a parse-as-document.

While that has a certain hacky smell, it also has the advantage of
handling what's probably an uncommon edge case in a way that adds no
upfront cost. (Other, 'tidier' approaches could involve evaluating a
regex first to decide how to parse--I believe everything that's allowed
ahead of a DTD makes a regular language--but that would add cycles to
every parse.)

In xml.c one does find the following comment:

 * TODO maybe libxml2's xmlreader is better? (do not construct DOM,
 * yet do not use SAX - see xmlreader.c)

and yes, I think a complete rewrite of xml_parse along those lines would
probably be a substantial win (why construct an internal DOM just to confirm
that the input is parsable, then throw it away?). But that would be a more
involved rewrite that I'm not volunteering to do.

This patch is a quick way to get the desired behavior given the current
implementation.

-Chap


[1]
https://www.postgresql.org/message-id/5C4A94A5.8010402%40anastigmatix.net
[2]
https://www.postgresql.org/message-id/5C4BDBFF.6040905%40anastigmatix.net


Вложения

Re: House style for DocBook documentation?

От
Peter Eisentraut
Дата:
On 2019-01-26 09:31, Peter Eisentraut wrote:
> On 25/01/2019 15:37, Chapman Flack wrote:
>>> External links already create footnotes in the PDF output.  Is that
>>> different from what you are saying?
>> That might, only, indicate that I was just thinking aloud in email and
>> had not gone and checked in PDF output to see how the links were handled.
>>
>> Yes, it could very possibly indicate that.
>>
>> If they are already processed that way, does that mean the
>>
>> o  Do not use text with <ulink> so the URL appears in printed output
>>
>> in README.links should be considered obsolete, and removed even, and
>> doc authors should feel free to put link text in <ulink> without
>> hesitation?
> 
> I think it's obsolete, yes.

Committed that change.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: House style for DocBook documentation?

От
Ramanarayana
Дата:
Hi,
I have tested bug fixes provided by all the patches. They are working great. I found one minor issue 

 select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&amp;deep</n2>post</e>' COLUMNS x XML PATH '/');

The above query returns the xml. But there is an extra plus symbol at the end

<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&amp;deep</n2>post</e>+

Regards,
Ram

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 03/01/19 07:15, Ramanarayana wrote:
> Hi,
> I have tested bug fixes provided by all the patches. They are working
> great. I found one minor issue
> 
>  select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi
> arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XML PATH '/');
> 
> The above query returns the xml. But there is an extra plus symbol at the
> end
> 
> <e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>+

Hi,

Are you sure that isn't the + added by psql when displaying a value
that contains a newline?

What happens if you repeat the query but after the psql command

  \a

to leave the output unaligned?

Thanks!
-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Ramanarayana
Дата:
Hi,
Yes it is working fine with \a option in psql.

Cheers
Ram 4.0

Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
This CF entry shows Pavel and me as reviewers, but the included
patches were also produced by one or the other of us, so additional
review by someone who isn't us seems appropriate. :)

Would it make sense to remove one or both of us from the 'reviewers'
field in the app, to make it more obviously 'available' for reviewing?

One of the patches deals only with docs, so even someone who isn't
sure about reviewing XML functionality, but takes an interest in
documentation, would have a valuable role looking at that.

Regards,
-Chap


Re: Re: PostgreSQL vs SQL/XML Standards

От
David Steele
Дата:
On 3/6/19 6:16 PM, Chapman Flack wrote:
> This CF entry shows Pavel and me as reviewers, but the included
> patches were also produced by one or the other of us, so additional
> review by someone who isn't us seems appropriate. :)
> 
> Would it make sense to remove one or both of us from the 'reviewers'
> field in the app, to make it more obviously 'available' for reviewing?

I think it makes sense to remove both of you so I have done so.  I think 
it is assumed that co-authors will review each others work, and as you 
say it will make it harder for you to get additional reviewers.

Regards,
-- 
-David
david@pgmasters.net


Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2019-Feb-11, Chapman Flack wrote:

> Interpreting the crickets as approval, I have changed the title of the
> CF entry, and the status back to Needs Review, with these patches
> attached:

I just pushed this one to pg12 only:

> xmltable-xmlexists-passing-mechanisms-3.patch

I removed the words "first appears in SQL:2006" from the new doc
paragraph; we used to have similar phrases scattered here and there but
were removed years ago, on the grounds that only the most recent version
is relevant (I couldn't find the commit, though).  Mostly left your
patch alone otherwise, other than adding some additional mark-up to the
doc changes.

Thanks!

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 3/7/19 10:08 AM, Alvaro Herrera wrote:
> I just pushed this one to pg12 only: 
>> xmltable-xmlexists-passing-mechanisms-3.patch

Thanks!

> I removed the words "first appears in SQL:2006" from the new doc
> paragraph; we used to have similar phrases scattered here and there but
> were removed years ago, on the grounds that only the most recent version
> is relevant (I couldn't find the commit, though).

In case there are similar qualifiers in the other patches (especially
the -docfix one, there are several), while respecting that general
principle, I would offer that there might be an argument for bending
it some here ...

... one of the main things challenging intuition and expectation in the
SQL/XML part of the standard is that there were radical changes to the
'XML' data type and the underlying data model between :2003 and :2006,
and most of what's in PostgreSQL was implemented to :2003 and still
conforms to that. (Being therefore, already, a bit of an exception
to the "only the most recent version is relevant" principle.)

So there could be good reason here to call out the distinctions when
they matter here, even if in other areas of the doc the practice is
not to call them out.

-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2019-Feb-11, Chapman Flack wrote:

> xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
> output for the no-libxml case that was left out of -5.

Pushed this one, with some trivial changes: I renamed and relocated
Pavel's function to strdup-and-free and fused the new test cases to use
less queries for the same functionality.  Naturally I had to adjust the
expected files ... I tried to do my best but there's always a little
something that sneaks under one's nose.

Thank you three for your persistence on this!

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2019-Mar-07, Alvaro Herrera wrote:

> On 2019-Feb-11, Chapman Flack wrote:
> 
> > xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
> > output for the no-libxml case that was left out of -5.
> 
> Pushed this one, with some trivial changes: I renamed and relocated
> Pavel's function to strdup-and-free and fused the new test cases to use
> less queries for the same functionality.  Naturally I had to adjust the
> expected files ... I tried to do my best but there's always a little
> something that sneaks under one's nose.

So we now have a double-free bug here or something ...  Too tired right
now to do anything about it.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02

================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
[New LWP 20275]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `postgres: pgbuildfarm regression [local] SELECT                               '.
Program terminated with signal 11, Segmentation fault.
#0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#1  0x76e28380 in xmlFreeNode () from /usr/lib/arm-linux-gnueabihf/libxml2.so.2
#2  0x00481f94 in xml_xmlnodetoxmltype (cur=<optimized out>, xmlerrcxt=<optimized out>) at xml.c:3751
#3  0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404, typid=2124685192, typmod=0, isnull=0x7ea42117)
atxml.c:4540
 
#4  0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at nodeTableFuncscan.c:489
#5  tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at nodeTableFuncscan.c:318
#6  0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
#7  0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 <ExecScan+816>, accessMtd=0x26db1c <TableFuncRecheck>,
node=0x14a83e0)at execScan.c:93
 
#8  ExecScan (node=0x14a83e0, accessMtd=0x26db1c <TableFuncRecheck>, recheckMtd=0x23e640 <ExecScan+816>) at
execScan.c:143
#9  0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
#10 0x00235630 in ExecProcNode (node=0x14a83e0) at ../../../src/include/executor/executor.h:241
#11 ExecutePlan (execute_once=<optimized out>, dest=0x14b8d08, direction=<optimized out>, numberTuples=0,
sendTuples=<optimizedout>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14a83e0,
estate=0x14a82a0)at execMain.c:1643
 
#12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=<optimized out>, count=0, execute_once=true) at
execMain.c:362
#13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward=<optimized out>, count=0, dest=<optimized out>) at
pquery.c:929
#14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x14b8d08,
altdest=0x14b8d08,completionTag=0x7ea42414 "") at pquery.c:770
 
#15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at postgres.c:1215
#16 0x00393e8c in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=0x0, username=<optimized out>) at
postgres.c:4256
#17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
#18 BackendStartup (port=0x13967b8) at postmaster.c:4090
#19 ServerLoop () at postmaster.c:1703
#20 0x0031607c in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1376
#21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: House style for DocBook documentation?

От
Chapman Flack
Дата:
On 02/15/19 11:31, Peter Eisentraut wrote:
>> On 25/01/2019 15:37, Chapman Flack wrote:
>>> If they are already processed that way, does that mean the
>>>
>>> o  Do not use text with <ulink> so the URL appears in printed output
>>>
>>> in README.links should be considered obsolete, and removed even, and
>>> doc authors should feel free to put link text in <ulink> without
>>> hesitation?
> 
> Committed that change.

I only now noticed that probably this should also have been changed:

-o  If you want to supply text, use <link>, else <xref>
+o  If you want to supply text, use <link> or <ulink>, else <xref>

Regards,
-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:
Hi

pá 8. 3. 2019 v 3:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-07, Alvaro Herrera wrote:

> On 2019-Feb-11, Chapman Flack wrote:
>
> > xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
> > output for the no-libxml case that was left out of -5.
>
> Pushed this one, with some trivial changes: I renamed and relocated
> Pavel's function to strdup-and-free and fused the new test cases to use
> less queries for the same functionality.  Naturally I had to adjust the
> expected files ... I tried to do my best but there's always a little
> something that sneaks under one's nose.

So we now have a double-free bug here or something ...  Too tired right
now to do anything about it.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02

================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
[New LWP 20275]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `postgres: pgbuildfarm regression [local] SELECT                               '.
Program terminated with signal 11, Segmentation fault.
#0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#1  0x76e28380 in xmlFreeNode () from /usr/lib/arm-linux-gnueabihf/libxml2.so.2
#2  0x00481f94 in xml_xmlnodetoxmltype (cur=<optimized out>, xmlerrcxt=<optimized out>) at xml.c:3751
#3  0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404, typid=2124685192, typmod=0, isnull=0x7ea42117) at xml.c:4540
#4  0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at nodeTableFuncscan.c:489
#5  tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at nodeTableFuncscan.c:318
#6  0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
#7  0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 <ExecScan+816>, accessMtd=0x26db1c <TableFuncRecheck>, node=0x14a83e0) at execScan.c:93
#8  ExecScan (node=0x14a83e0, accessMtd=0x26db1c <TableFuncRecheck>, recheckMtd=0x23e640 <ExecScan+816>) at execScan.c:143
#9  0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
#10 0x00235630 in ExecProcNode (node=0x14a83e0) at ../../../src/include/executor/executor.h:241
#11 ExecutePlan (execute_once=<optimized out>, dest=0x14b8d08, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14a83e0, estate=0x14a82a0) at execMain.c:1643
#12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362
#13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward=<optimized out>, count=0, dest=<optimized out>) at pquery.c:929
#14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x14b8d08, altdest=0x14b8d08, completionTag=0x7ea42414 "") at pquery.c:770
#15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at postgres.c:1215
#16 0x00393e8c in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=0x0, username=<optimized out>) at postgres.c:4256
#17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
#18 BackendStartup (port=0x13967b8) at postmaster.c:4090
#19 ServerLoop () at postmaster.c:1703
#20 0x0031607c in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1376
#21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228

I have not any hypotheses what is reason - maybe we hit some libxml2 error in xmlCopyNode function - now it is called for larger set of node types.

What I see, a error handling in original code was not probably correct. Hard to say if attached patch fix it, but probably it is more correct than now.

Regards

Pavel
 

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


pá 8. 3. 2019 v 3:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-07, Alvaro Herrera wrote:

> On 2019-Feb-11, Chapman Flack wrote:
>
> > xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
> > output for the no-libxml case that was left out of -5.
>
> Pushed this one, with some trivial changes: I renamed and relocated
> Pavel's function to strdup-and-free and fused the new test cases to use
> less queries for the same functionality.  Naturally I had to adjust the
> expected files ... I tried to do my best but there's always a little
> something that sneaks under one's nose.

So we now have a double-free bug here or something ...  Too tired right
now to do anything about it.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02

================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
[New LWP 20275]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `postgres: pgbuildfarm regression [local] SELECT                               '.
Program terminated with signal 11, Segmentation fault.
#0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#1  0x76e28380 in xmlFreeNode () from /usr/lib/arm-linux-gnueabihf/libxml2.so.2
#2  0x00481f94 in xml_xmlnodetoxmltype (cur=<optimized out>, xmlerrcxt=<optimized out>) at xml.c:3751
#3  0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404, typid=2124685192, typmod=0, isnull=0x7ea42117) at xml.c:4540
#4  0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at nodeTableFuncscan.c:489
#5  tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at nodeTableFuncscan.c:318
#6  0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
#7  0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 <ExecScan+816>, accessMtd=0x26db1c <TableFuncRecheck>, node=0x14a83e0) at execScan.c:93
#8  ExecScan (node=0x14a83e0, accessMtd=0x26db1c <TableFuncRecheck>, recheckMtd=0x23e640 <ExecScan+816>) at execScan.c:143
#9  0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
#10 0x00235630 in ExecProcNode (node=0x14a83e0) at ../../../src/include/executor/executor.h:241
#11 ExecutePlan (execute_once=<optimized out>, dest=0x14b8d08, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14a83e0, estate=0x14a82a0) at execMain.c:1643
#12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362
#13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward=<optimized out>, count=0, dest=<optimized out>) at pquery.c:929
#14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x14b8d08, altdest=0x14b8d08, completionTag=0x7ea42414 "") at pquery.c:770
#15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at postgres.c:1215
#16 0x00393e8c in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=0x0, username=<optimized out>) at postgres.c:4256
#17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
#18 BackendStartup (port=0x13967b8) at postmaster.c:4090
#19 ServerLoop () at postmaster.c:1703
#20 0x0031607c in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1376
#21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228

I am able to emulate it on 32bit old scientific linux. So I hope I find fix

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


pá 8. 3. 2019 v 7:41 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

pá 8. 3. 2019 v 3:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-07, Alvaro Herrera wrote:

> On 2019-Feb-11, Chapman Flack wrote:
>
> > xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
> > output for the no-libxml case that was left out of -5.
>
> Pushed this one, with some trivial changes: I renamed and relocated
> Pavel's function to strdup-and-free and fused the new test cases to use
> less queries for the same functionality.  Naturally I had to adjust the
> expected files ... I tried to do my best but there's always a little
> something that sneaks under one's nose.

So we now have a double-free bug here or something ...  Too tired right
now to do anything about it.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02

================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
[New LWP 20275]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `postgres: pgbuildfarm regression [local] SELECT                               '.
Program terminated with signal 11, Segmentation fault.
#0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#0  0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#1  0x76e28380 in xmlFreeNode () from /usr/lib/arm-linux-gnueabihf/libxml2.so.2
#2  0x00481f94 in xml_xmlnodetoxmltype (cur=<optimized out>, xmlerrcxt=<optimized out>) at xml.c:3751
#3  0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404, typid=2124685192, typmod=0, isnull=0x7ea42117) at xml.c:4540
#4  0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at nodeTableFuncscan.c:489
#5  tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at nodeTableFuncscan.c:318
#6  0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
#7  0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 <ExecScan+816>, accessMtd=0x26db1c <TableFuncRecheck>, node=0x14a83e0) at execScan.c:93
#8  ExecScan (node=0x14a83e0, accessMtd=0x26db1c <TableFuncRecheck>, recheckMtd=0x23e640 <ExecScan+816>) at execScan.c:143
#9  0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
#10 0x00235630 in ExecProcNode (node=0x14a83e0) at ../../../src/include/executor/executor.h:241
#11 ExecutePlan (execute_once=<optimized out>, dest=0x14b8d08, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14a83e0, estate=0x14a82a0) at execMain.c:1643
#12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362
#13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward=<optimized out>, count=0, dest=<optimized out>) at pquery.c:929
#14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x14b8d08, altdest=0x14b8d08, completionTag=0x7ea42414 "") at pquery.c:770
#15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at postgres.c:1215
#16 0x00393e8c in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=0x0, username=<optimized out>) at postgres.c:4256
#17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
#18 BackendStartup (port=0x13967b8) at postmaster.c:4090
#19 ServerLoop () at postmaster.c:1703
#20 0x0031607c in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1376
#21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228


looks like error in xmlXPathCompiledEval function, that produce little bit broken result for XML_DOCUMENT_NODE type. I hadn't this problem with libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.

Currently I had not fresh 32 bit system to check it.

I found a workaround - in this case copy (and release xmlNode) is not necessary.

please, apply attached patch.
 
Pavel
 

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2019-Mar-08, Pavel Stehule wrote:

> looks like error in xmlXPathCompiledEval function, that produce little bit
> broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
> libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
> 
> Currently I had not fresh 32 bit system to check it.
> 
> I found a workaround - in this case copy (and release xmlNode) is not
> necessary.
> 
> please, apply attached patch.

Wow :-(  At this point I'm wondering if this should be put in back
branches as well ... I mean, distill part of commit 251cf2e27bec that
doesn't affect the behavior of text nodes, and put it on all branches
together with your fix?

Another thought: should we refuse to work on known-broken libxml2
versions?  Seems like this bug could affect other parts of code too -- I
see that xmlXPathCompiledEval() is called in file places (including two
in contrib/xml2).

Third thought: an alternative might be to create a wrapper for
xmlXPathCompiledEval that detects NULL content and fills in a pointer
that xmlFreeNode can free.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


pá 8. 3. 2019 v 13:20 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-08, Pavel Stehule wrote:

> looks like error in xmlXPathCompiledEval function, that produce little bit
> broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
> libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
>
> Currently I had not fresh 32 bit system to check it.
>
> I found a workaround - in this case copy (and release xmlNode) is not
> necessary.
>
> please, apply attached patch.

Wow :-(  At this point I'm wondering if this should be put in back
branches as well ... I mean, distill part of commit 251cf2e27bec that
doesn't affect the behavior of text nodes, and put it on all branches
together with your fix?

The problem is just for case result: XML_DOCUMENT_TYPE, target XML. For this case the previously used transformation doesn't work.

Is not problem to detect this situation. The content field has -1 instead 0.

Originally there was inverted logic, so xmlCopyNode and xmlFreeNode was not used for XML_DOCUMENT_TYPE, and then we didn't hit this bug.


Another thought: should we refuse to work on known-broken libxml2
versions?  Seems like this bug could affect other parts of code too -- I
see that xmlXPathCompiledEval() is called in file places (including two
in contrib/xml2).

Third thought: an alternative might be to create a wrapper for
xmlXPathCompiledEval that detects NULL content and fills in a pointer
that xmlFreeNode can free.

 

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: House style for DocBook documentation?

От
Peter Eisentraut
Дата:
On 2019-03-08 05:04, Chapman Flack wrote:
> I only now noticed that probably this should also have been changed:
> 
> -o  If you want to supply text, use <link>, else <xref>
> +o  If you want to supply text, use <link> or <ulink>, else <xref>

The choice of <link> vs <xref> is for internal links.  For external
links you have to use <ulink>.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2019-Mar-08, Pavel Stehule wrote:

> looks like error in xmlXPathCompiledEval function, that produce little bit
> broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
> libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
> 
> Currently I had not fresh 32 bit system to check it.
> 
> I found a workaround - in this case copy (and release xmlNode) is not
> necessary.

Hmm ... going over the libxml2 2.7.6 source, I noticed that
xmlFreeNodeList seem to get this right -- it uses xmlFreeDoc for
XML_DOCUMENT_NODE.  Maybe a sufficient answer is to change the
xmlFreeNode there to xmlFreeNodeList.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: House style for DocBook documentation?

От
Chapman Flack
Дата:
On 3/8/19 7:38 AM, Peter Eisentraut wrote:
> On 2019-03-08 05:04, Chapman Flack wrote:
>> -o  If you want to supply text, use <link>, else <xref>
>> +o  If you want to supply text, use <link> or <ulink>, else <xref>
> 
> The choice of <link> vs <xref> is for internal links.  For external
> links you have to use <ulink>.

Understood, but I was thinking of the unintended message possibly
taken by a fast reader who wants to create an external link but also
wants to supply text. "I want to supply text! Is ulink not an option?"

Perhaps:

o  For an internal link, use <link> if you will supply text, else <xref>
o  For an external link, use <ulink> with or without link text

-Chap


Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


pá 8. 3. 2019 v 15:31 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-08, Pavel Stehule wrote:

> looks like error in xmlXPathCompiledEval function, that produce little bit
> broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
> libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
>
> Currently I had not fresh 32 bit system to check it.
>
> I found a workaround - in this case copy (and release xmlNode) is not
> necessary.

Hmm ... going over the libxml2 2.7.6 source, I noticed that
xmlFreeNodeList seem to get this right -- it uses xmlFreeDoc for
XML_DOCUMENT_NODE.  Maybe a sufficient answer is to change the
xmlFreeNode there to xmlFreeNodeList.

It fixes current issue, but I afraid so these two routines are not replaceable. xmlFreeNodeList doesn't release xmlFreeDtd, XML_ATTRIBUTE_NODE is not checked.

You can see, from xmlNodeGetContent, XML_DOCUMENT_NODE type should to ignore content value, and newer returns this value.  Other interesting is xmlXPathOrderDocElems where content is used for counting, and probably from there is -1.

Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode

some like

if (cur_copy->type == XML_DOCUMENT_NODE)
  xmlFreeDoc((xmlDocPtr) cur_copy);
else
  xmlFreeNode(cur_copy);

This looks most correct fix for me. What do you think?

Pavel



--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
(I spent some time trying to reproduce the original bug, but was
interrupted for lunch before getting a useful installation.  I find it a
bit strange that it doesn't crash in x86_64, mind ...)

On 2019-Mar-08, Pavel Stehule wrote:

> It fixes current issue, but I afraid so these two routines are not
> replaceable. xmlFreeNodeList doesn't release xmlFreeDtd, XML_ATTRIBUTE_NODE
> is not checked.

:-(

> Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> 
> some like
> 
> if (cur_copy->type == XML_DOCUMENT_NODE)
>   xmlFreeDoc((xmlDocPtr) cur_copy);
> else
>   xmlFreeNode(cur_copy);
> 
> This looks most correct fix for me. What do you think?

Seems like that should work, yeah ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2019-Mar-08, Alvaro Herrera wrote:

> > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> > 
> > some like
> > 
> > if (cur_copy->type == XML_DOCUMENT_NODE)
> >   xmlFreeDoc((xmlDocPtr) cur_copy);
> > else
> >   xmlFreeNode(cur_copy);
> > 
> > This looks most correct fix for me. What do you think?
> 
> Seems like that should work, yeah ...

Something like this perhaps?  Less repetitive ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


pá 8. 3. 2019 v 19:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-08, Alvaro Herrera wrote:

> > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> >
> > some like
> >
> > if (cur_copy->type == XML_DOCUMENT_NODE)
> >   xmlFreeDoc((xmlDocPtr) cur_copy);
> > else
> >   xmlFreeNode(cur_copy);
> >
> > This looks most correct fix for me. What do you think?
>
> Seems like that should work, yeah ...

Something like this perhaps?  Less repetitive ...

+1

Pavel


--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:


pá 8. 3. 2019 v 19:48 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:


pá 8. 3. 2019 v 19:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-08, Alvaro Herrera wrote:

> > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> >
> > some like
> >
> > if (cur_copy->type == XML_DOCUMENT_NODE)
> >   xmlFreeDoc((xmlDocPtr) cur_copy);
> > else
> >   xmlFreeNode(cur_copy);
> >
> > This looks most correct fix for me. What do you think?
>
> Seems like that should work, yeah ...

Something like this perhaps?  Less repetitive ...

Thank you for commit.

the commit message is not correct. xmlCopyNodes does owns works well, but the node is broken already, and because we should to call xmlFreeNode, we have a problem.

regards

Pavel


+1

Pavel


--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

От
Pavel Stehule
Дата:
Hi

pá 8. 3. 2019 v 19:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-08, Alvaro Herrera wrote:

> > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> >
> > some like
> >
> > if (cur_copy->type == XML_DOCUMENT_NODE)
> >   xmlFreeDoc((xmlDocPtr) cur_copy);
> > else
> >   xmlFreeNode(cur_copy);
> >
> > This looks most correct fix for me. What do you think?
>
> Seems like that should work, yeah ...

Something like this perhaps?  Less repetitive ...

I looking to code

        void        (*nodefree) (xmlNodePtr) = NULL;
        volatile xmlBufferPtr buf = NULL;

should not be "nodefree" volatile too?

Pavel
 

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2019-Mar-14, Pavel Stehule wrote:

> I looking to code
> 
>         void        (*nodefree) (xmlNodePtr) = NULL;
>         volatile xmlBufferPtr buf = NULL;
> 
> should not be "nodefree" volatile too?

Ah, good question.  I remember I had it volatile and removed it for some
reason, though I don't remember why.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: House style for DocBook documentation?

От
Peter Eisentraut
Дата:
On 2019-03-08 15:38, Chapman Flack wrote:
> Perhaps:
> 
> o  For an internal link, use <link> if you will supply text, else <xref>
> o  For an external link, use <ulink> with or without link text

I have pushed an update to this effect.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PostgreSQL vs SQL/XML Standards

От
Alvaro Herrera
Дата:
On 2018-Oct-24, Chapman Flack wrote:

> Inspired by the wiki page on PostgreSQL vs SQL Standard in general,
> I have made another wiki page specifically about $subject. I hope
> this was not presumptuous, and invite review / comment. I have not
> linked to it from any other page yet.

In the SQL Standards session at the Unconference, I found out that the
committee produced this technical report in 2011:
https://www.iso.org/standard/41528.html "XQuery Regular Expression
Support in SQL"; it furthers our lack of support for XQuery in our
implementation SQL/XML.  That content is probably relevant for this
topic, even if we cannot do much about it.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PostgreSQL vs SQL/XML Standards

От
Chapman Flack
Дата:
On 05/31/19 16:04, Alvaro Herrera wrote:

> https://www.iso.org/standard/41528.html "XQuery Regular Expression
> Support in SQL"

Although I hadn't seen that particular document, I did see those in the
SQL spec and mention them in the wiki page [1].

I should point out that's also a conformance note in the new SQL/JSON
support [2], as the standard specifies XQuery regular expressions for
jsonpath's like_regex predicate, where we're applying POSIX ones
instead.

Doesn't change my thinking much ... I think we should support the
stuff. :) ... using a non-in-house-developed XQuery library to do it,
or better yet, a choice of non-in-house-developed XQuery libraries.

I still think the second idea I suggested on the wiki seems like the
most promising road map [3], though it'll involve a few preparatory
stops along the way.

-Chap



[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#XML_Query_regular_expressions

[2]
https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md#sqljson-conformance

[3]

https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Proposal_2:_desugaring_to_calls_on_normal_extension_functions