Обсуждение: PostgreSQL vs SQL/XML Standards
Inspired by the wiki page on PostgreSQL vs SQL Standard in general, I have made another wiki page specifically about $subject. I hope this was not presumptuous, and invite review / comment. I have not linked to it from any other page yet. https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards -Chap
On 2018-Oct-24, Chapman Flack wrote: > Inspired by the wiki page on PostgreSQL vs SQL Standard in general, > I have made another wiki page specifically about $subject. I hope > this was not presumptuous, and invite review / comment. I have not > linked to it from any other page yet. > > https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards Wow, that's ... overwhelming. (I do wonder if we should stop relying on libxml2 and instead look for something supporting XQuery). Would you review Markus Winand patch here? https://postgr.es/m/8BDB0627-2105-4564-AA76-7849F028B96E@winand.at I think doing that would probably point out a couple of ways in which our XMLTABLE implementation is non-conformant, and then fixes it :-) I've been unsure as to applying it to all branches since 10 or just to master. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2018-Oct-24, Chapman Flack wrote: >> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards > Wow, that's ... overwhelming. (I do wonder if we should stop relying on > libxml2 and instead look for something supporting XQuery). I think getting out from under libxml2's idiosyncrasies and security lapses would be great, but is there a plausible alternative out there? regards, tom lane
On 10/25/18 10:39 AM, Tom Lane wrote: > I think getting out from under libxml2's idiosyncrasies and security > lapses would be great, but is there a plausible alternative out there? Depends on whether anything in [1] sounds plausible. -Chap [1]: https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Possible_ways_forward
On 2018-Oct-25, Chapman Flack wrote: > On 10/25/18 10:39 AM, Tom Lane wrote: > > I think getting out from under libxml2's idiosyncrasies and security > > lapses would be great, but is there a plausible alternative out there? > > Depends on whether anything in [1] sounds plausible. > > [1]: > https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Possible_ways_forward Heh, I didn't notice this part of the document. Integrating a C runtime of a Java library sounds nightmarish -- I wouldn't even think about that. XQilla seems to depend on Xerces, and seems to have died in 2011. Zorba appears to have been taken propietary, from the looks of its last commits. Maybe the best way forward is to implement all the JSON functionality and remove the SQL/XML bits. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
čt 25. 10. 2018 v 17:09 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2018-Oct-25, Chapman Flack wrote:
> On 10/25/18 10:39 AM, Tom Lane wrote:
> > I think getting out from under libxml2's idiosyncrasies and security
> > lapses would be great, but is there a plausible alternative out there?
>
> Depends on whether anything in [1] sounds plausible.
>
> [1]:
> https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Possible_ways_forward
Heh, I didn't notice this part of the document. Integrating a C runtime
of a Java library sounds nightmarish -- I wouldn't even think about
that.
XQilla seems to depend on Xerces, and seems to have died in 2011.
Zorba appears to have been taken propietary, from the looks of its last
commits.
Maybe the best way forward is to implement all the JSON functionality
and remove the SQL/XML bits.
It can be bigger compatibility break in Postgres history. SQL/XML functions are widely used.
Regards
Pavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/25/2018 03:53 PM, Chapman Flack wrote: > On 10/25/18 10:39 AM, Tom Lane wrote: >> I think getting out from under libxml2's idiosyncrasies and security >> lapses would be great, but is there a plausible alternative out there? > > Depends on whether anything in [1] sounds plausible. The libraries we depend on should really either be available in the package repositories of the major Linux distribution or be something we can put in our own repository and maintain without too much pain. So using Saxon/C does not seem like a realistic option. Andreas
On 10/25/2018 11:23 AM, Andreas Karlsson wrote: > On 10/25/2018 03:53 PM, Chapman Flack wrote: >> On 10/25/18 10:39 AM, Tom Lane wrote: >>> I think getting out from under libxml2's idiosyncrasies and security >>> lapses would be great, but is there a plausible alternative out there? >> >> Depends on whether anything in [1] sounds plausible. > > The libraries we depend on should really either be available in the > package repositories of the major Linux distribution or be something > we can put in our own repository and maintain without too much pain. > So using Saxon/C does not seem like a realistic option. > Yeah, very good point. xqilla/xerces-C appears to be widely available (Centos and ubuntu, at least). cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/25/18 11:08 AM, Alvaro Herrera wrote: > XQilla seems to depend on Xerces, and seems to have died in 2011. ¿Eh? The latest release, 2.3.4 [1], is dated 2018-07-03. It looks like the latest development has been happening on the xquilla_2_3 branch. Sometimes project "activity" statistics rely exclusively on the "master" branch ("xqilla" branch in this case), and are deceptive if the project isn't being developed exclusively by coding on master and backpatching to others. I've noticed I'm facing the same thing in PL/Java ... plenty of development lately, but on the REL1_5_STABLE branch. GitHub's project statistics (and also Open Hub's) are just looking at master and saying the project's been dead for two years. Now that 1.5.1 is released, as soon as I get some of REL1_5_STABLE merged /up/ into master, the statistics will probably magically show it's been alive all along. > Zorba appears to have been taken propietary, from the looks of its last > commits. It does seem harder to see what's going on there, but the commits with "copyright changed" as the message turn out to be changing only the copyright holder in the Apache 2.0 license from "The FLWOR Foundation" to "zorba.io". But Matthias Brantner participated with interest in the 2010 thread here where Zorba was brought up before[2], so he may know something. > Integrating a C runtime of a Java library sounds nightmarish -- > I wouldn't even think about that. Or whether or not nightmarish, certainly duplicative of something we can kinda already do. In a way, some of the pressure is off, because if you need a true XMLQUERY or XMLTABLE, you can get them with the Saxon-in-PL/Java implementation, and right now in any supported PG version. You just have to spell them funny, doing without the sugary syntax built into the parser. They're missing some of the automatic casts from the standard at the moment, which isn't really a loss of function, as explicit casts can be added to any query needing them ... a temporary annoyance until the rest of that example's in place. But a roadmap that could lead to eventual availability of one of the C/C++ implementations would be nice too. -Chap [1]: https://sourceforge.net/projects/xqilla/files/ [2]: https://www.postgresql.org/message-id/7DDDB18E-041F-4238-B91D-3277EB1CE5BC%4028msec.com
On 10/25/18 2:33 PM, Andrew Dunstan wrote: > > Yeah, very good point. xqilla/xerces-C appears to be widely available > (Centos and ubuntu, at least). > xqilla/xerces-c are in the Fedora/RHEL repo too. Best regards, Jesper
Hi
But a roadmap that could lead to eventual availability of one of the
C/C++ implementations would be nice too.
I am thinking so I can fix some issues related to XMLTABLE. Please, send me more examples and test cases.
Regards
Pavel
On 2018-Oct-25, Pavel Stehule wrote: > I am thinking so I can fix some issues related to XMLTABLE. Please, send me > more examples and test cases. Please see Markus Winand's patch that I referenced upthread. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 10/25/18 11:15, Pavel Stehule wrote: > čt 25. 10. 2018 v 17:09 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> > napsal: >> Maybe the best way forward is to implement all the JSON functionality >> and remove the SQL/XML bits. > > It can be bigger compatibility break in Postgres history. SQL/XML functions > are widely used. It seems to me that evolution to the 2006+ standard version could be done mostly non-disruptively (provided an agreeable library can be found). I think Tom's suggestion[1] to just make XML OPTION CONTENT mean what it means in 2006+ would be an easy change to make immediately, and would not disrupt anybody ... it would only make some things succeed that now fail, and it would match what our documentation already says. It would make our XML type equivalent to 2006+ XML(CONTENT(ANY)). Beyond that, further steps toward 2006+ could largely avoid disruption. If we implement the typmod'ed XML types, surely the parser would simply treat untypmod'ed 'XML' as meaning XML(CONTENT(ANY)). (The standard does allow for the typmod to be missing, and leaves it "implementation-defined whether SEQUENCE, CONTENT(ANY), or CONTENT(UNTYPED) is implicit", so that's all by the book.) The existing functions xpath and xpath_exists can be kept unchanged, as their names are distinct from anything in the standard. A library that supports XQuery is likely also to support XPath in "1.0 compatibility mode", so those functions could keep their semantics. The current xmlvalidate() has the wrong semantics and return type, but it also does nothing but ereport unimplemented, so no current uses would be hurt by redefining it. XMLTABLE would be the headache. Using the standard name for something that ain't the standard function has not left any painless way that the standard function could be added. OTOH, it has only been in the wild since 10, so renaming it to something else (xpath_table?) will probably be more painless if done soon than it ever would be later. On 10/25/18 11:23, Andreas Karlsson wrote: > The libraries we depend on should really either be available in the > package repositories of the major Linux distribution or be something > we can put in our own repository and maintain without too much pain. > So using Saxon/C does not seem like a realistic option. That makes good sense. The approach I proposed in [2] would be to target the XQC API as an integration point. If there is one library that might be most acceptable (it seems xqilla is in several repositories), it could become a preferred or supported choice, but others could be available if an administrator wanted to separately obtain them, perhaps because of better performance on a particular workload, or avoidance of some bug that a given workload turns up. -Chap [1]: https://www.postgresql.org/message-id/22271.1540458133%40sss.pgh.pa.us [2]: https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#One_proposal
XMLTABLE would be the headache. Using the standard name for something
that ain't the standard function has not left any painless way that the
standard function could be added. OTOH, it has only been in the wild
since 10, so renaming it to something else (xpath_table?) will probably
be more painless if done soon than it ever would be later.
I don't share your opinion. XMLTABLE implements subset of standard. More it is well compatible with Oracle (in this subset).
If we have library with XPath 2.0 or higher, we can continue with it.
Regards
Pavel
On 10/25/18 09:56, Alvaro Herrera wrote: > Would you review Markus Winand patch here? > https://postgr.es/m/8BDB0627-2105-4564-AA76-7849F028B96E@winand.at > I think doing that would probably point out a couple of ways in which > our XMLTABLE implementation is non-conformant, and then fixes it :-) > I've been unsure as to applying it to all branches since 10 or just to > master. Well, modulo the key observation that it simply *is not* conformant until it accepts XML Query expressions and uses the XPath 2.0 type system and data model and the SQL/XML 2006+ casting rules ... ... and there is no ISO standard that says anything about how an XPath 1.0- based quasi-XMLTABLE-ish function ought to behave, so it's hard to say that anything this function does is right or wrong, per ISO ... I think all of the changes in these patches do make it a more useful quasi-XMLTABLE-ish function, as the pre-patch behaviors were less useful (if not outright bewildering). And they produce output that better matches what the XQuery-based ISO rules produce (for the subset of queries that mean the same thing as XQuery and as XPath 1.0). I also looked at the (not yet applied?) XML-XPath-comments-processing-instructions-array-ind patch and I think it, too, makes the behavior more useful. I did not actually take the time to build a PostgreSQL with the patch, but I took the two added regression queries, syntax-desugared them[1] and called the Saxon XQuery- based "xmltable" example with them, and got the same expected results: SELECT xmltable.* FROM (SELECT '<root><element>a1a<!-- aaaa -->a2a<?aaaaa pi?> <!--z--> bbbb<x>xxx</x>cccc</element></root>'::xml AS ".") AS p, "xmltable"('/root', PASSING => p, COLUMNS => ARRAY[ 'string(element)', 'string(element/comment()[2])', 'string(element/processing-instruction())', 'string(element/text()[1])', 'serialize(element)']) AS (element text, cmnt text, pi text, t1 text, x text); element | cmnt | pi | t1 | x ----------------------+------+----+-----+--------------------------------------------------------------------------------- a1aa2a bbbbxxxcccc | z | pi | a1a | <element>a1a<!-- aaaa -->a2a<?aaaaa pi?> <!--z--> bbbb<x>xxx</x>cccc</element> (1 row) SELECT xmltable.* FROM (SELECT '<root><element>a1a<!-- aaaa -->a2a<?aaaaa pi?> <!--z--> bbbb<x>xxx</x>cccc</element></root>'::xml AS ".") AS p, "xmltable"('/root', PASSING => p, COLUMNS => ARRAY[ 'string(element/text())', 'string(element/comment()[2])', 'string(element/processing-instruction())', 'string(element/text()[1])', 'serialize(element)']) AS (element text, cmnt text, pi text, t1 text, x text); ERROR: java.sql.SQLException: A sequence of more than one item is not allowed as the first argument of fn:string() (text("a1a"), text("a2a")) Agreement. Agreement is good. :) So I think they are worth applying. I can't bring myself to a strong opinion on whether they are or aren't worth backpatching; if it were the function described by the standard, they'd be bugs and they would be, but does making a non-standard function behave slightly more like the standard function that it isn't count as a bug fix or an enhancement? My overall feeling, at least in directing my own effort, is that I'd rather spend time toward getting the real XQuery-based semantics in place somehow, and ongoing enhancements to the XPath-1.0-based stuff feel more like pouring treasure down a hole. But these enhancements seem like good ones, and if there's interest in patching a couple more, the "unexpected XPath object type {2,3}" in [2] might be good candidates. That would be backpatchable, as the current behavior clearly isn't useful. One other thing: I think the commit message on the context-item patch is really somewhat misleading: "According to the SQL standard, the context of XMLTABLE's XPath row_expression is the document node of the XML input document..." Really the standard says nothing of the sort. It is a limitation of XPath 1.0 that the input even has to be a document at all. In the real XMLTABLE, you could be passing a context item that is a document node (of either 'document' or 'content' flavor), or a one-item 'sequence' holding an atomic type like a number or date, or even a naked XML node like a PI or comment or attribute node you're more used to seeing only inside a document. The real rule is just that the context item is exactly the thing you passed (or a copy of it, when the rules say so). It collapses in the XPath 1.0 case to having to be a document node, simply because that's the only thing you can pass in. What was wrong with the pre-patch code was that it wasn't just using the 'doc' it was given, but actually calling xmlDocGetRootElement() on it and setting the CI to one of its children. The patch just directly assigns doc to xpathctx->node, which I would call correct, not because it's a document node, but because it's the thing that was passed. -Chap [1] You might notice in addition to desugaring the XMLTABLE syntax, I wrapped the text-returning column paths in string(), and wrapped the xml-returning one in serialize() and changed its result column to text. Those changes are just to work around the parts of the Saxon example that aren't implemented yet, as explained in [3]. [2] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#XMLTABLE [3] https://tada.github.io/pljava/examples/saxon.html#Using_the_Saxon_examples
On 10/25/18 23:16, Pavel Stehule wrote: >> XMLTABLE would be the headache. Using the standard name for something >> that ain't the standard function has not left any painless way that the >> standard function could be added. OTOH, it has only been in the wild >> since 10, so renaming it to something else (xpath_table?) will probably >> be more painless if done soon than it ever would be later. >> > I don't share your opinion. XMLTABLE implements subset of standard. More it > is well compatible with Oracle (in this subset). > > If we have library with XPath 2.0 or higher, we can continue with it. The difficulty here is that the expression language required by the standard is XQuery, and an XPath expression (whether 1.0 or 2.0+) can always be parsed as an XQuery expression. (So, /syntactically/, yes, "subset".) For XPath 2.0, that is no problem, because an XPath 2.0 expression and the identically-spelled XQuery expression /mean the same thing/. For XPath 1.0, it is very definitely a problem, because an XPath 1.0 expression and the identically-spelled XQuery expression /do not mean the same thing/. Some of the important semantic differences are in [1]. So, if a future PostgreSQL version has an XMLTABLE function that accepts XQuery, as the standard requires, and existing users upgrade and they have XMLTABLE query expressions written as XPath 1.0, those queries will be accepted and parsed, but they will not mean the same thing. The function will not be able to tell when it is being called with XQuery semantics intended, vs. when it is being called with XPath 1.0 semantics intended. Now, perhaps there is a nicer way than renaming the function. It could work like overloading. Create two trivial domains over text, say xpath1 and xquery, and have two XMLTABLE functions with different first parameter types. Then if you called with the expression '"cat" < "dog"'::xquery you would get the correct result 't', and with '"cat" < "dog"'::xpath1 you would get the (also correct) result 'f'. (It would not be exactly overloading, because of the special sugared syntax known to the parser, but it could look like overloading, and be intuitive to the user.) If you have convenient access to Oracle to check compatibility, could you compare this query? SELECT * FROM XMLTABLE('.' PASSING '<sale hatsize="7" customer="alice" taxable="false"/>' COLUMNS a boolean PATH 'string("cat" < "dog")', b boolean PATH 'string("cat" > "dog")', c boolean PATH 'string(sale/@taxable = false())'); (I suspect in Oracle it would also work without the string() wrappings, but just to make it easy, I think this way it will work in both Oracle and PG—that is, not error, though results may differ.) -Chap [1] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
pá 26. 10. 2018 v 6:25 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 10/25/18 23:16, Pavel Stehule wrote:
>> XMLTABLE would be the headache. Using the standard name for something
>> that ain't the standard function has not left any painless way that the
>> standard function could be added. OTOH, it has only been in the wild
>> since 10, so renaming it to something else (xpath_table?) will probably
>> be more painless if done soon than it ever would be later.
>>
> I don't share your opinion. XMLTABLE implements subset of standard. More it
> is well compatible with Oracle (in this subset).
>
> If we have library with XPath 2.0 or higher, we can continue with it.
The difficulty here is that the expression language required by the standard
is XQuery, and an XPath expression (whether 1.0 or 2.0+) can always be
parsed as an XQuery expression. (So, /syntactically/, yes, "subset".)
For XPath 2.0, that is no problem, because an XPath 2.0 expression and
the identically-spelled XQuery expression /mean the same thing/.
For XPath 1.0, it is very definitely a problem, because an XPath 1.0
expression and the identically-spelled XQuery expression /do not mean
the same thing/. Some of the important semantic differences are in [1].
So, if a future PostgreSQL version has an XMLTABLE function that accepts
XQuery, as the standard requires, and existing users upgrade and they have
XMLTABLE query expressions written as XPath 1.0, those queries will be
accepted and parsed, but they will not mean the same thing. The function
will not be able to tell when it is being called with XQuery semantics
intended, vs. when it is being called with XPath 1.0 semantics intended.
If we have a library with XQuery, then we can change the behave. But world accepting of XQuery is not wide, unfortunately.
When I wrote and tested XMLTABLE, I found only few examples of where XQuery was used. So first there should be any library with XQUery implementation that can be used in Postgres. This library should be fast, well tested without memory leaks. Elsewhere discussion is premature. I am not terrible happy from libxml2 design, documentation, manuals - and I will not against we can migrate to some better. On second hand - libxml2 code is working - and it is widely used. It can be big mistake if we use Java library and if we create dependency on Java. After, there are only few libs that doesn't significantly better than libxml2.
Now, perhaps there is a nicer way than renaming the function. It could
work like overloading. Create two trivial domains over text, say xpath1
and xquery, and have two XMLTABLE functions with different first parameter
types. Then if you called with the expression '"cat" < "dog"'::xquery
you would get the correct result 't', and with '"cat" < "dog"'::xpath1
you would get the (also correct) result 'f'.
Probably it can works, but needs more work on Postgres infrastructure. If you overload functions like this, then type should be used every time.
(It would not be exactly overloading, because of the special sugared
syntax known to the parser, but it could look like overloading, and be
intuitive to the user.)
If you have convenient access to Oracle to check compatibility, could you
compare this query?
SELECT * FROM XMLTABLE('.'
PASSING '<sale hatsize="7" customer="alice" taxable="false"/>'
COLUMNS
a boolean PATH 'string("cat" < "dog")',
b boolean PATH 'string("cat" > "dog")',
c boolean PATH 'string(sale/@taxable = false())');
(I suspect in Oracle it would also work without the string() wrappings,
but just to make it easy, I think this way it will work in both Oracle
and PG—that is, not error, though results may differ.)
I will test it.
Regards
Pavel
-Chap
[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
Hi
čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2018-Oct-25, Pavel Stehule wrote:
> I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> more examples and test cases.
Please see Markus Winand's patch that I referenced upthread.
here is a fix of some XMLTABLE mentioned issues.
Regards
Pavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
(It would not be exactly overloading, because of the special sugared
syntax known to the parser, but it could look like overloading, and be
intuitive to the user.)
If you have convenient access to Oracle to check compatibility, could you
compare this query?
SELECT * FROM XMLTABLE('.'
PASSING '<sale hatsize="7" customer="alice" taxable="false"/>'
COLUMNS
a boolean PATH 'string("cat" < "dog")',
b boolean PATH 'string("cat" > "dog")',
c boolean PATH 'string(sale/@taxable = false())');
(I suspect in Oracle it would also work without the string() wrappings,
but just to make it easy, I think this way it will work in both Oracle
and PG—that is, not error, though results may differ.)
Regards
Pavel Stehule
po 29. 10. 2018 v 11:05 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
ORA-19224:XPTY-004 .. expected node()*, got xs:string - it doesn't work with/without string() wrappings.(It would not be exactly overloading, because of the special sugared
syntax known to the parser, but it could look like overloading, and be
intuitive to the user.)
If you have convenient access to Oracle to check compatibility, could you
compare this query?
SELECT * FROM XMLTABLE('.'
PASSING '<sale hatsize="7" customer="alice" taxable="false"/>'
COLUMNS
a boolean PATH 'string("cat" < "dog")',
b boolean PATH 'string("cat" > "dog")',
c boolean PATH 'string(sale/@taxable = false())');
(I suspect in Oracle it would also work without the string() wrappings,
but just to make it easy, I think this way it will work in both Oracle
and PG—that is, not error, though results may differ.)I have a access to too old 11.2 Oracle. There I had to modify query because there is not boolean type. I replaced bool by int, but I got a error
The problem is in last line - the expression "sale/@taxable = false()" is not valid on Oracle. Using string() wrapping is a issue, because it returns "true", "false", but Oracle int doesn't accept it.
Pavel
RegardsPavel Stehule
>> I have a access to too old 11.2 Oracle. There I had to modify query >> because there is not boolean type. I replaced bool by int, but I got a >> error >> ORA-19224:XPTY-004 .. expected node()*, got xs:string - it doesn't work >> with/without string() wrappings. >> >The problem is in last line - the expression "sale/@taxable = false()" is >not valid on Oracle. Using string() wrapping is a issue, because it returns ">true", "false", but Oracle int doesn't accept it. That line seems to be valid - but you need to pass an XMLTYPE value, not a VARCHAR https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=21cdf890a26e97fa8667b2d6a960bd33 As far as I can tell inside XQuery Oracle does support boolean, but not as a return type -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hičt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:On 2018-Oct-25, Pavel Stehule wrote:
> I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> more examples and test cases.
Please see Markus Winand's patch that I referenced upthread.here is a fix of some XMLTABLE mentioned issues.
this update allows cast boolean to numeric types from XPath expressions
Regards
Pavel
RegardsPavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On 10/29/18 6:40 AM, Thomas Kellerer wrote: > That line seems to be valid - but you need to pass an XMLTYPE value, > not a VARCHAR > > https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=21cdf890a26e97fa8667b2d6a960bd33 Oh, of course! Thank you. I had forgotten pass the context item as explicitly an XML value. That illustrates that, in a proper XMLTABLE, you can pass things that are not XML values. When a varchar is passed in, the context item has type xs:string. The third PATH tried to follow a node path against a non-node context item, and correctly reported the error. And thanks for the dbfiddle pointer. I can now confirm (in both 11g.2 and 18c): SELECT * FROM XMLTABLE('.' PASSING xmltype('<sale hatsize="7" customer="alice" taxable="false"/>') COLUMNS a varchar(10) PATH '"cat" < "dog"', b varchar(10) PATH '"cat" > "dog"', c varchar(10) PATH 'sale/@taxable = false()' ); A B C true false true Or as numbers (There's just no SQL boolean type in Oracle, even 18g!): SELECT * FROM XMLTABLE('.' PASSING xmltype('<sale hatsize="7" customer="alice" taxable="false"/>') COLUMNS a NUMBER PATH '"cat" < "dog"', b NUMBER PATH '"cat" > "dog"', c NUMBER PATH 'sale/@taxable = false()' ); A B C 1 0 1 I removed the string() wrappings, which were only to allow the same query to work in PG, but Pavel's proposed patches will make them unnecessary. Note, however, that the first proposed patch will work for the first query (varchar results) and fail for the second (number results). The second patch will work for the second query, but produce the wrong strings ("1" or "0" instead of "true" or "false") for the first. A proper XMLTABLE needs to apply the appropriate conversion determined by the SQL type of the output column. I believe a patch to do that correctly is possible; xml.c has access to the type oids for the output columns, after all. The fact that PG will return false|false|false or 0|0|0 instead of true|false|true or 1|0|1 cannot be fixed by a patch. That is the consequence of evaluating in XPath 1.0 (in XPath 2.0, which is a subset of XQuery, the results would be correct). On the same lines, we can take my original example where I forgot to type the context item as XML, and make that work in Oracle too: SELECT * FROM XMLTABLE('.' PASSING '<sale hatsize="7" customer="alice" taxable="false"/>' COLUMNS a varchar(10) PATH 'substring-after(., "taxable=")' ); A "false"/> A proper XMLTABLE is happy to be passed an atomic value, such as a string, as the context item or any named parameter, and apply type-appropriate operators and functions to it. XPath 1.0 blocks that for PG. -Chap
po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:Hičt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:On 2018-Oct-25, Pavel Stehule wrote:
> I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> more examples and test cases.
Please see Markus Winand's patch that I referenced upthread.here is a fix of some XMLTABLE mentioned issues.this update allows cast boolean to numeric types from XPath expressions
Attached patch solves some cast issues mentioned by Chap. It solves issue reported by Markus. I didn't use Markus's code, but it was inspiration for me. I found native solution from libxml2.
Regards
Pavel
RegardsPavelRegardsPavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
> On 2018-11-6, at 15:23 , Pavel Stehule <pavel.stehule@gmail.com> wrote: > > > > po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal: > > > po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal: > Hi > > čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal: > On 2018-Oct-25, Pavel Stehule wrote: > > > I am thinking so I can fix some issues related to XMLTABLE. Please, send me > > more examples and test cases. > > Please see Markus Winand's patch that I referenced upthread. > > here is a fix of some XMLTABLE mentioned issues. > > this update allows cast boolean to numeric types from XPath expressions > > Attached patch solves some cast issues mentioned by Chap. It solves issue reported by Markus. I didn't use Markus's code,but it was inspiration for me. I found native solution from libxml2. > > Regards > > Pavel Better than my patch. But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed — in one way or the other (see below). # select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XMLPATH 'node()'); x ----------------------------------------- prec1arg&ent1<n2>&deep</n2>post (1 row) Output is not the original XML. I dug a little further and found another case that doesn’t looks right even with my change to xml_xmlnodetoxmltype applied: # select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XMLPATH '/'); x --------------------------- pre&ent1&deeppost (1 row) Oracle gives in both cases XML. To fix that I included XML_DOCUMENT_NODE in the list of nodes that use xmlNodeDump. Now I wonder if that logic should bereversed to use the xmlXPathCastNodeToString branch in a few selected cases but default to the branch xmlNodeDump for allother cases? I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE. Regression tests are happy with that approach butI don’t think that proves a lot. -markus diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c index 37d85f7..7c1f884 100644 --- a/src/backend/utils/adt/xml.c +++ b/src/backend/utils/adt/xml.c @@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt) { xmltype *result; - if (cur->type == XML_ELEMENT_NODE) + if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE) { xmlBufferPtr buf; xmlNodePtr cur_copy;
čt 8. 11. 2018 v 15:18 odesílatel Markus Winand <markus.winand@winand.at> napsal:
> On 2018-11-6, at 15:23 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
>
>
> po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
> Hi
>
> čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
> On 2018-Oct-25, Pavel Stehule wrote:
>
> > I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> > more examples and test cases.
>
> Please see Markus Winand's patch that I referenced upthread.
>
> here is a fix of some XMLTABLE mentioned issues.
>
> this update allows cast boolean to numeric types from XPath expressions
>
> Attached patch solves some cast issues mentioned by Chap. It solves issue reported by Markus. I didn't use Markus's code, but it was inspiration for me. I found native solution from libxml2.
>
> Regards
>
> Pavel
Better than my patch.
But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed — in one way or the other (see below).
# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XML PATH 'node()');
x
-----------------------------------------
prec1arg&ent1<n2>&deep</n2>post
(1 row)
Output is not the original XML.
I dug a little further and found another case that doesn’t looks right even with my change to xml_xmlnodetoxmltype applied:
# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XML PATH '/');
x
---------------------------
pre&ent1&deeppost
(1 row)
Oracle gives in both cases XML.
To fix that I included XML_DOCUMENT_NODE in the list of nodes that use xmlNodeDump. Now I wonder if that logic should be reversed to use the xmlXPathCastNodeToString branch in a few selected cases but default to the branch xmlNodeDump for all other cases?
I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE. Regression tests are happy with that approach but I don’t think that proves a lot.
-markus
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f7..7c1f884 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
{
xmltype *result;
- if (cur->type == XML_ELEMENT_NODE)
+ if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
{
xmlBufferPtr buf;
xmlNodePtr cur_copy;
I used your patch and append regress tests. I checked the result against Oracle.
Regards
Pavel
Вложения
On 2018-11-9, at 05:07 , Pavel Stehule <pavel.stehule@gmail.com> wrote:čt 8. 11. 2018 v 15:18 odesílatel Markus Winand <markus.winand@winand.at> napsal:
> On 2018-11-6, at 15:23 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
>
>
> po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
> Hi
>
> čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
> On 2018-Oct-25, Pavel Stehule wrote:
>
> > I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> > more examples and test cases.
>
> Please see Markus Winand's patch that I referenced upthread.
>
> here is a fix of some XMLTABLE mentioned issues.
>
> this update allows cast boolean to numeric types from XPath expressions
>
> Attached patch solves some cast issues mentioned by Chap. It solves issue reported by Markus. I didn't use Markus's code, but it was inspiration for me. I found native solution from libxml2.
>
> Regards
>
> Pavel
Better than my patch.
But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed — in one way or the other (see below).
# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XML PATH 'node()');
x
-----------------------------------------
prec1arg&ent1<n2>&deep</n2>post
(1 row)
Output is not the original XML.
I dug a little further and found another case that doesn’t looks right even with my change to xml_xmlnodetoxmltype applied:
# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XML PATH '/');
x
---------------------------
pre&ent1&deeppost
(1 row)
Oracle gives in both cases XML.
To fix that I included XML_DOCUMENT_NODE in the list of nodes that use xmlNodeDump. Now I wonder if that logic should be reversed to use the xmlXPathCastNodeToString branch in a few selected cases but default to the branch xmlNodeDump for all other cases?
I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE. Regression tests are happy with that approach but I don’t think that proves a lot.
-markus
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f7..7c1f884 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
{
xmltype *result;
- if (cur->type == XML_ELEMENT_NODE)
+ if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
{
xmlBufferPtr buf;
xmlNodePtr cur_copy;I used your patch and append regress tests. I checked the result against Oracle.RegardsPavel
Fine from my side.
-markus
po 12. 11. 2018 v 6:58 odesílatel Markus Winand <markus.winand@winand.at> napsal:
On 2018-11-9, at 05:07 , Pavel Stehule <pavel.stehule@gmail.com> wrote:čt 8. 11. 2018 v 15:18 odesílatel Markus Winand <markus.winand@winand.at> napsal:
> On 2018-11-6, at 15:23 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> po 29. 10. 2018 v 11:45 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
>
>
> po 29. 10. 2018 v 10:11 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
> Hi
>
> čt 25. 10. 2018 v 21:47 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
> On 2018-Oct-25, Pavel Stehule wrote:
>
> > I am thinking so I can fix some issues related to XMLTABLE. Please, send me
> > more examples and test cases.
>
> Please see Markus Winand's patch that I referenced upthread.
>
> here is a fix of some XMLTABLE mentioned issues.
>
> this update allows cast boolean to numeric types from XPath expressions
>
> Attached patch solves some cast issues mentioned by Chap. It solves issue reported by Markus. I didn't use Markus's code, but it was inspiration for me. I found native solution from libxml2.
>
> Regards
>
> Pavel
Better than my patch.
But I think the chunk in xml_xmlnodetoxmltype of my patch is still needed — in one way or the other (see below).
# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XML PATH 'node()');
x
-----------------------------------------
prec1arg&ent1<n2>&deep</n2>post
(1 row)
Output is not the original XML.
I dug a little further and found another case that doesn’t looks right even with my change to xml_xmlnodetoxmltype applied:
# select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XML PATH '/');
x
---------------------------
pre&ent1&deeppost
(1 row)
Oracle gives in both cases XML.
To fix that I included XML_DOCUMENT_NODE in the list of nodes that use xmlNodeDump. Now I wonder if that logic should be reversed to use the xmlXPathCastNodeToString branch in a few selected cases but default to the branch xmlNodeDump for all other cases?
I guess those few cases might be XML_ATTRIBUTE_NODE and XML_TEXT_NODE. Regression tests are happy with that approach but I don’t think that proves a lot.
-markus
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 37d85f7..7c1f884 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3682,7 +3682,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
{
xmltype *result;
- if (cur->type == XML_ELEMENT_NODE)
+ if (cur->type != XML_ATTRIBUTE_NODE && cur->type != XML_TEXT_NODE)
{
xmlBufferPtr buf;
xmlNodePtr cur_copy;I used your patch and append regress tests. I checked the result against Oracle.RegardsPavelFine from my side.
super
Thank you for investigation and tests, examples.
It is assigned to January commitfest.
Regards
Pavel
-markus
On 11/12/18 04:58, Pavel Stehule wrote: > It is assigned to January commitfest. When I build this patch against master (4203842), it builds, but breaks make check. The diffs seem only incidental (loss of some error context output), but there are no make check errors building 4203842 unmodified, so the patch has introduced them. Example: *** ../postgresql/src/test/regress/expected/xml.out Sat Dec 29 19:58:41 2018 --- src/test/regress/results/xml.out Sat Dec 29 19:59:11 2018 *************** *** 9,16 **** LINE 1: INSERT INTO xmltest VALUES (3, '<wrong'); ^ DETAIL: line 1: Couldn't find end of Start Tag wrong line 1 - <wrong - ^ === ABOUT THE DOCUMENTATION === While this patch fixes bugs in the embedding of XPath 1.0 into PostgreSQL, and that is good, I still think there is an elephant in the room, namely that XPath 1.0 isn't the right language to embed. Clearly, that can't be fixed as long as we rely on libxml, but I think our documentation could (should) do a better job of warning users of the surprises that result. My ideas for improving the documentation are not yet so finished as to offer a patch, but here are some thoughts: Of the 13 mentions in func.sgml of XPath (not counting its appearances in function and parameter names), only one specifically says XPath 1.0, and it is buried down at functions-xml-processing, not somewhere common to all of them, like the introduction to functions-xml. Really, all of our functions that refer to "XPath" mean XPath 1.0, and because that version was essentially thrown away to produce the 2.0 and later versions, and that happened a dozen years ago, the documentation needs to make that clear. Users who have learned XQuery, or XPath in the last dozen years, need to know this is something very different, to avoid walking into the traps. I think: 1) Every reference to "XPath" or an "XPath expression" (that is the first in the description of a function or parameter) should be spelled out as "XPath 1.0". 2) There should be a section in one central place (clearly applying to all of the XML functions), perhaps in the functions-xml introduction, or a <footnote> to a paragraph there, or in an appendix (new section under SQL Conformance?) that summarizes the gulf between XPath 1.0 and the later, XQuery-compatible versions. There should be cross- reference links to this section from the "XPath 1.0" mentions in individual function descriptions. The section could be a condensation of [1] (and perhaps contain a link to [1], if links out to the wiki are allowed), and link to the W3C compatibility notes at [2] and [3]. 3) Currently, the description of XMLEXISTS notes that "the SQL standard specifies ... the construct to take an XQuery expression ... but PostgreSQL currently only supports XPath, which is a subset of XQuery". The description of XMLTABLE does not have such a note, and needs one. In both cases, the note should probably be nearer the top of the description (for XMLEXISTS, it is currently at the bottom, after the examples). 4) That note currently says "only supports XPath, which is a subset of XQuery". That would be a fair claim if we meant XPath 2.0 or later, but not XPath 1.0 (which was not a subset of any version of XQuery, and has fundamental incompatibilities with it). The note, for both XMLEXISTS and XMLTABLE, should simply say "only supports XPath 1.0", with a link to the section summarizing the incompatibilities. Those are my thoughts on how a patch to the documentation could be organized. Do they seem reasonable? -Chap [1] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath [2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility [3] https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility
ne 30. 12. 2018 v 3:46 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 11/12/18 04:58, Pavel Stehule wrote:
> It is assigned to January commitfest.
When I build this patch against master (4203842), it builds, but breaks
make check. The diffs seem only incidental (loss of some error context
output), but there are no make check errors building 4203842 unmodified,
so the patch has introduced them. Example:
*** ../postgresql/src/test/regress/expected/xml.out Sat Dec 29 xml19:58:41 2018
--- src/test/regress/results/xml.out Sat Dec 29 19:59:11 2018
***************
*** 9,16 ****
LINE 1: INSERT INTO xmltest VALUES (3, '<wrong');
^
DETAIL: line 1: Couldn't find end of Start Tag wrong line 1
- <wrong
- ^
Unfortunately, there is a different releases of libxml2 with different error reporting and it is hard (impossible) to prepare for all variants. :-/
I prepare xml.out for my FC29 (fresh libxml2) and for no support xml. Other I prepare by patching - and this error (in context) is expected.
=== ABOUT THE DOCUMENTATION ===
While this patch fixes bugs in the embedding of XPath 1.0 into PostgreSQL,
and that is good, I still think there is an elephant in the room, namely
that XPath 1.0 isn't the right language to embed. Clearly, that can't be
fixed as long as we rely on libxml, but I think our documentation could
(should) do a better job of warning users of the surprises that result.
My ideas for improving the documentation are not yet so finished as to
offer a patch, but here are some thoughts:
Of the 13 mentions in func.sgml of XPath (not counting its appearances
in function and parameter names), only one specifically says XPath 1.0,
and it is buried down at functions-xml-processing, not somewhere common
to all of them, like the introduction to functions-xml.
Really, all of our functions that refer to "XPath" mean XPath 1.0,
and because that version was essentially thrown away to produce the 2.0
and later versions, and that happened a dozen years ago, the documentation
needs to make that clear. Users who have learned XQuery, or XPath in the
last dozen years, need to know this is something very different, to avoid
walking into the traps.
I think:
1) Every reference to "XPath" or an "XPath expression" (that is the first
in the description of a function or parameter) should be spelled out as
"XPath 1.0".
2) There should be a section in one central place (clearly applying to
all of the XML functions), perhaps in the functions-xml introduction,
or a <footnote> to a paragraph there, or in an appendix (new section
under SQL Conformance?) that summarizes the gulf between XPath 1.0
and the later, XQuery-compatible versions. There should be cross-
reference links to this section from the "XPath 1.0" mentions in
individual function descriptions. The section could be a condensation
of [1] (and perhaps contain a link to [1], if links out to the wiki
are allowed), and link to the W3C compatibility notes at [2] and [3].
3) Currently, the description of XMLEXISTS notes that "the SQL standard
specifies ... the construct to take an XQuery expression ... but
PostgreSQL currently only supports XPath, which is a subset of XQuery".
The description of XMLTABLE does not have such a note, and needs one.
In both cases, the note should probably be nearer the top of the
description (for XMLEXISTS, it is currently at the bottom, after the
examples).
4) That note currently says "only supports XPath, which is a subset
of XQuery". That would be a fair claim if we meant XPath 2.0
or later, but not XPath 1.0 (which was not a subset of any version
of XQuery, and has fundamental incompatibilities with it). The note,
for both XMLEXISTS and XMLTABLE, should simply say "only supports
XPath 1.0", with a link to the section summarizing the incompatibilities.
Those are my thoughts on how a patch to the documentation could be
organized. Do they seem reasonable?
I agree with more stronger detail description about difference between XPath, XPath2 and XQuery.
Some like "The XPath 1.0 is ancestor of XPath 2.0, that is part of XQuery. ..."
I don't think so link to wiki with any proposals is good idea. Documentation should to describe current state, not what can be at some future.
Regards
Pavel
-Chap
[1]
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
[2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility
[3]
https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility
On 12/30/18 03:23, Pavel Stehule wrote: > I agree with more stronger detail description about difference between > XPath, XPath2 and XQuery. > > Some like "The XPath 1.0 is ancestor of XPath 2.0, that is part of XQuery. > ..." I'll be happy to work on some wording to help detail the differences. I don't think it can be condensed to a single sentence, or even if it could, it would not serve users well; what's helpful is to know the specifics of what changed between XPath 1.0 and the 2.0-3.0-3.1 lineage that is now XPath, because that's what helps to understand what the challenges will be when porting queries from other systems, how to recognize when a query won't be possible to port at all, etc. So I think it needs to be a short section. I am still undecided what's the best place in the manual for the section to go: - Added to the introductory material under functions-xml - A long <footnote> attached to a short remark in that introductory matter (is there any precedent? I see there are <footnote>s already in the manual, but I haven't checked how long they get) - A new appendix (or new section in the SQL Conformance appendix) linked from a short statement in the functions-xml introductory matter. > I don't think so link to wiki with any proposals is good idea. That makes sense. I was thinking only of linking directly to the specific section on XPath 1.0 compatibility issues, not to the wiki page as a whole. I see in grepping through the *.sgml that there are not many links to the wiki.postgresql.org, and the ones that exist are only in release notes, Further Information, The Source Code Repository, and sepgsql. So precedent seems to stand against using a link to the wiki section. So, the material /in/ that wiki section [1] (but omitting the two paras about the DOCUMENT/CONTENT node-wrapping hack) is more or less what needs to go into the new section in the manual. I assume it's ok to directly include links [2] and [3] out to the w3.org compatibility notes ... those are stable URLs to reference material. === BY REF and BY VALUE === How difficult would it be to make the grammar constructs that currently accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF and BY VALUE? The documentation ought to state that, while both forms are accepted and ignored, PostgreSQL's actual behavior corresponds to what the standard calls BY VALUE (using a string serialization as the XML datatype's internal form makes the BY REF semantics impossible). So it's when people try to port queries that explicitly say BY REF that they need to know there may be trouble ahead. -Chap >> [1] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath >> [2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility >> [3] https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility
ne 30. 12. 2018 v 20:06 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 12/30/18 03:23, Pavel Stehule wrote:
> I agree with more stronger detail description about difference between
> XPath, XPath2 and XQuery.
>
> Some like "The XPath 1.0 is ancestor of XPath 2.0, that is part of XQuery.
> ..."
I'll be happy to work on some wording to help detail the differences.
I don't think it can be condensed to a single sentence, or even if it
could, it would not serve users well; what's helpful is to know
the specifics of what changed between XPath 1.0 and the 2.0-3.0-3.1 lineage
that is now XPath, because that's what helps to understand what the
challenges will be when porting queries from other systems, how to recognize
when a query won't be possible to port at all, etc.
So I think it needs to be a short section. I am still undecided what's
the best place in the manual for the section to go:
+1
- Added to the introductory material under functions-xml
- A long <footnote> attached to a short remark in that introductory matter
(is there any precedent? I see there are <footnote>s already in the
manual, but I haven't checked how long they get)
- A new appendix (or new section in the SQL Conformance appendix)
linked from a short statement in the functions-xml introductory matter.
> I don't think so link to wiki with any proposals is good idea.
That makes sense. I was thinking only of linking directly to the specific
section on XPath 1.0 compatibility issues, not to the wiki page as a whole.
I see in grepping through the *.sgml that there are not many links to the
wiki.postgresql.org, and the ones that exist are only in release notes,
Further Information, The Source Code Repository, and sepgsql.
So precedent seems to stand against using a link to the wiki section.
So, the material /in/ that wiki section [1] (but omitting the two paras
about the DOCUMENT/CONTENT node-wrapping hack) is more or less what needs
to go into the new section in the manual.
I assume it's ok to directly include links [2] and [3] out to the w3.org
compatibility notes ... those are stable URLs to reference material.
=== BY REF and BY VALUE ===
How difficult would it be to make the grammar constructs that currently
accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF
and BY VALUE? The documentation ought to state that, while both forms are
accepted and ignored, PostgreSQL's actual behavior corresponds to what the
standard calls BY VALUE (using a string serialization as the XML datatype's
internal form makes the BY REF semantics impossible). So it's when people
try to port queries that explicitly say BY REF that they need to know there
may be trouble ahead.
This is difficult question - a implementation is probably very easy, but it is hard to accept to possible break compatibility due syntactic sugar.
This is not probably related to just XPath/XQuery question - but it is related to different design of XML datatype (based on PostgreSQL TOAST) against ANSI/SQL (Oracle - clob).
So this is complicated topic and my opinion is better to don't touch it because we can't to fix it, change it - and I am not sure so ANSI/SQL is significantly better than PostgreSQL implementation.
-Chap
>> [1] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
>> [2] https://www.w3.org/TR/xpath20/#id-backwards-compatibility
>> [3] https://www.w3.org/TR/2010/REC-xpath-functions-20101214/#xpath1-compatibility
On 12/30/18 15:18, Pavel Stehule wrote: > ne 30. 12. 2018 v 20:06 odesílatel Chapman Flack <chap@anastigmatix.net> > napsal: >> How difficult would it be to make the grammar constructs that currently >> accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF >> and BY VALUE? > > This is difficult question - a implementation is probably very easy, but it > is hard to accept to possible break compatibility due syntactic sugar. > > This is not probably related to just XPath/XQuery question - but it is > related to different design of XML datatype (based on PostgreSQL TOAST) > against ANSI/SQL (Oracle - clob). > > So this is complicated topic and my opinion is better to don't touch it > because we can't to fix it, change it - and I am not sure so ANSI/SQL is > significantly better than PostgreSQL implementation. I am not sure I understand your point. It appears that Oracle (18c), just like PostgreSQL, really only supports BY VALUE semantics. Here is an Oracle fiddle that shows it: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=0cb353da0d94c6d5c2659222a1e419fd When the same element is passed via two parameters, an 'is' test (node identity equality) of the two parameters returns false, indicating that Oracle has used BY VALUE semantics, not BY REF. Oracle uses BY VALUE when BY VALUE is explicitly requested, and also when no passing method is specified (i.e., BY VALUE is the default). Oracle also uses BY VALUE when BY REF is explicitly requested, which seems rather rude, but that must be the behavior PostgreSQL is imitating with the choice to accept and ignore BY REF. But the PostgreSQL situation is a little more strange. PG uses BY VALUE semantics as the default when no passing method is specified. PG also uses BY VALUE semantics when BY REF is explicitly requested, which is rude, just like Oracle. But why should an explicit specification of BY VALUE (which is, after all, the semantics we're going to use anyway!) produce this? ERROR: syntax error at or near "value" To me, that doesn't seem like least astonishment. I am not seeing what would be complicated about removing that astonishment by simply allowing the grammar productions to also consume BY VALUE and ignore it. -Chap
I would like to add material to the manual, detailing the differences between the XPath 1.0 language supported in PG, and the XQuery/XPath 2.0+ expected by the standard. A good preview of what that would look like is the "related to version of XPath" wiki section at [1]. That wiki section (minus the two inside-baseball paragraphs about commit 3963574 and its reversion) comes to a bit under 400 words. So that's roughly the amount of material I'm thinking to add. Where should it go? I have thought of: 1. A new sect2 near the top of the "functions-xml" sect1. This makes sense because the material applies to all the functions below that involve XPath. But that seems like the wrong place for 400 words of gory details. 2. A short remark at the top of functions-xml, linked to a footnote with the gory details. But that would be a long footnote. The longest currently is a footnote with 101 words in the start tutorial. 3. A sect2 at the very end of functions-xml, linked to from a short remark at the top. 4. A new appendix, or a new conformance section in the features.sgml appendix, linked to from a short remark at the top of functions-xml. Or is there an even better idea I have not thought of? -Chap [1] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Related_to_version_of_XPath
po 31. 12. 2018 v 3:15 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 12/30/18 15:18, Pavel Stehule wrote:
> ne 30. 12. 2018 v 20:06 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:
>> How difficult would it be to make the grammar constructs that currently
>> accept "BY REF" (only as noise and ignore it) accept and ignore both BY REF
>> and BY VALUE?
>
> This is difficult question - a implementation is probably very easy, but it
> is hard to accept to possible break compatibility due syntactic sugar.
>
> This is not probably related to just XPath/XQuery question - but it is
> related to different design of XML datatype (based on PostgreSQL TOAST)
> against ANSI/SQL (Oracle - clob).
>
> So this is complicated topic and my opinion is better to don't touch it
> because we can't to fix it, change it - and I am not sure so ANSI/SQL is
> significantly better than PostgreSQL implementation.
I am not sure I understand your point. It appears that Oracle (18c),
just like PostgreSQL, really only supports BY VALUE semantics. Here is
an Oracle fiddle that shows it:
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=0cb353da0d94c6d5c2659222a1e419fd
When the same element is passed via two parameters, an 'is' test (node
identity equality) of the two parameters returns false, indicating that
Oracle has used BY VALUE semantics, not BY REF.
Oracle uses BY VALUE when BY VALUE is explicitly requested, and also when
no passing method is specified (i.e., BY VALUE is the default). Oracle also
uses BY VALUE when BY REF is explicitly requested, which seems rather rude,
but that must be the behavior PostgreSQL is imitating with the choice to
accept and ignore BY REF.
But the PostgreSQL situation is a little more strange. PG uses BY VALUE
semantics as the default when no passing method is specified. PG also uses
BY VALUE semantics when BY REF is explicitly requested, which is rude,
just like Oracle. But why should an explicit specification of BY VALUE
(which is, after all, the semantics we're going to use anyway!) produce
this?
ERROR: syntax error at or near "value"
To me, that doesn't seem like least astonishment.
I am not seeing what would be complicated about removing that astonishment
by simply allowing the grammar productions to also consume BY VALUE and
ignore it.
ok - I am not against implementation of ignored BY VALUE. But I don't like a idea to disable BY REF.
Regards
Pavel
-Chap
Hello Pavel, On 09.11.2018 07:07, Pavel Stehule wrote: > I used your patch and append regress tests. I checked the result against > Oracle. I checked the patch with Chap cases. The patch fixes handling of boolean, number types which mentioned in the wiki. I have a few comments related to the code and the documentation. I attached the patch, which fixes it. There is an example in the documentation: SELECT xmltable.* FROM xmlelements, XMLTABLE('/root' PASSING data COLUMNS element text); element ---------------------- Hello2a2 bbbCC With the patch XMLTABLE returns different result now. copy_and_safe_free_xmlchar() function should be hid by #ifdef USE_LIBXML, otherwise I get an error if I build the Postgres without --with-libxml. There is a comment within XmlTableGetValue(). I changed it, mainly I used Markus patch from the related thread mentioned by Alvaro. Please see the changes in the patch. -- Arthur Zakirov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Вложения
Hi
čt 10. 1. 2019 v 14:00 odesílatel Arthur Zakirov <a.zakirov@postgrespro.ru> napsal:
Hello Pavel,
On 09.11.2018 07:07, Pavel Stehule wrote:
> I used your patch and append regress tests. I checked the result against
> Oracle.
I checked the patch with Chap cases. The patch fixes handling of
boolean, number types which mentioned in the wiki.
I have a few comments related to the code and the documentation. I
attached the patch, which fixes it.
There is an example in the documentation:
SELECT xmltable.*
FROM xmlelements, XMLTABLE('/root' PASSING data COLUMNS element text);
element
----------------------
Hello2a2 bbbCC
With the patch XMLTABLE returns different result now.
copy_and_safe_free_xmlchar() function should be hid by #ifdef
USE_LIBXML, otherwise I get an error if I build the Postgres without
--with-libxml.
There is a comment within XmlTableGetValue(). I changed it, mainly I
used Markus patch from the related thread mentioned by Alvaro.
Please see the changes in the patch.
I merged your changes, and fixed regress tests.
Thank you for patch
Regards
Pavel
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Вложения
On 12/30/18 22:23, Chapman Flack wrote: > I would like to add material to the manual, detailing the differences > between the XPath 1.0 language supported in PG, and the XQuery/XPath 2.0+ > expected by the standard. > ... > I have thought of: > ... > 3. A sect2 at the very end of functions-xml, linked to from a short remark > at the top. (3) seems to be supported by (the only) precedent, the "Limits and Compatibility" sect3 within functions-posix-regexp. So, absent objection, I'll work on a Limits and Compatibility sect2 within functions-xml. -Chap
On 01/10/19 23:48, Chapman Flack wrote: > On 12/30/18 22:23, Chapman Flack wrote: > (3) seems to be supported by (the only) precedent, the "Limits and > Compatibility" sect3 within functions-posix-regexp. > > So, absent objection, I'll work on a Limits and Compatibility sect2 > within functions-xml. One fly in the ointment: tables of contents seem to list sect1 and sect2 elements, but not sect3. So the "Limits and Compatibility" sect3 under functions-posix-regexp does not clutter the overall "Functions and Operators" ToC, but a "Limits and Compatibility" sect2 under functions-xml would be visible there. Tucking it as a sect3 within an existing sect2 would prevent that, but it is common information that pertains to more than one of them, so that doesn't seem quite right. Or is it ok to have a "Limits and Compatibility" visible in the ToC under XML Functions? -Chap
On 12/30/18 03:23, Pavel Stehule wrote: > Unfortunately, there is a different releases of libxml2 with different > error reporting and it is hard (impossible) to prepare for all variants. :-/ > > I prepare xml.out for my FC29 (fresh libxml2) and for no support xml. > Other I prepare by patching - and this error (in context) is expected. It turns out that the variant was already accounted for in the xml_2.out variant result file, it just needed to have the new results added. Done in xmltable-xpath-result-processing-bugfix-4.patch attached. On 12/31/18 01:03, Pavel Stehule wrote: > po 31. 12. 2018 v 3:15 odesílatel Chapman Flack <chap@anastigmatix.net> > napsal: >> But the PostgreSQL situation is a little more strange. PG uses BY VALUE >> semantics as the default when no passing method is specified. PG also uses >> BY VALUE semantics when BY REF is explicitly requested, which is rude, >> just like Oracle. But why should an explicit specification of BY VALUE >> (which is, after all, the semantics we're going to use anyway!) produce >> this? >> >> ERROR: syntax error at or near "value" >> >> To me, that doesn't seem like least astonishment. >> >> I am not seeing what would be complicated about removing that astonishment >> by simply allowing the grammar productions to also consume BY VALUE and >> ignore it. > > ok - I am not against implementation of ignored BY VALUE. But I don't like > a idea to disable BY REF. Done in attached xmltable-xmlexists-passing-mechanisms-1.patch along with some corresponding documentation adjustments. I am still working on more extensive documentation, but it seemed best to include the changes related to BY REF / BY VALUE in the same patch with the grammar change. -Chap
Вложения
ne 13. 1. 2019 v 0:39 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 12/30/18 03:23, Pavel Stehule wrote:
> Unfortunately, there is a different releases of libxml2 with different
> error reporting and it is hard (impossible) to prepare for all variants. :-/
>
> I prepare xml.out for my FC29 (fresh libxml2) and for no support xml.
> Other I prepare by patching - and this error (in context) is expected.
It turns out that the variant was already accounted for in the xml_2.out
variant result file, it just needed to have the new results added.
Done in xmltable-xpath-result-processing-bugfix-4.patch attached.
On 12/31/18 01:03, Pavel Stehule wrote:
> po 31. 12. 2018 v 3:15 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:
>> But the PostgreSQL situation is a little more strange. PG uses BY VALUE
>> semantics as the default when no passing method is specified. PG also uses
>> BY VALUE semantics when BY REF is explicitly requested, which is rude,
>> just like Oracle. But why should an explicit specification of BY VALUE
>> (which is, after all, the semantics we're going to use anyway!) produce
>> this?
>>
>> ERROR: syntax error at or near "value"
>>
>> To me, that doesn't seem like least astonishment.
>>
>> I am not seeing what would be complicated about removing that astonishment
>> by simply allowing the grammar productions to also consume BY VALUE and
>> ignore it.
>
> ok - I am not against implementation of ignored BY VALUE. But I don't like
> a idea to disable BY REF.
Done in attached xmltable-xmlexists-passing-mechanisms-1.patch along with
some corresponding documentation adjustments.
I am still working on more extensive documentation, but it seemed best
to include the changes related to BY REF / BY VALUE in the same patch
with the grammar change.
looks well, thank you for patch
Pavel
-Chap
There is a bug that remains, in the else if (xpathobj->type == XPATH_STRING) case. As it is now, it simply passes the string value of the result into the output column's type-input function, regardless of the output column type. If the output column type is xml, this will attempt to parse the string as xml. The result should simply be xml content consisting of a text node representing the string (as by XMLTEXT()). If it contains XML metacharacters, they should be escaped. For a non-xml output column, the string should be used directly, as it is now. # select * from xmltable('.' passing xmlelement(name a) columns a text path '"<foo/>"', b xml path '"<foo/>"'); a | b --------+-------- <foo/> | <foo/> Oracle fiddle for comparison: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=26f91a9e55a6908c2bcf848b464ca381 A B <foo/> <foo/> -Chap
Hi
po 14. 1. 2019 v 2:41 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
There is a bug that remains, in the
else if (xpathobj->type == XPATH_STRING)
case.
As it is now, it simply passes the string value of the result
into the output column's type-input function, regardless of the
output column type.
If the output column type is xml, this will attempt to parse the
string as xml. The result should simply be xml content consisting of
a text node representing the string (as by XMLTEXT()). If it contains
XML metacharacters, they should be escaped.
For a non-xml output column, the string should be used directly,
as it is now.
# select * from xmltable('.' passing xmlelement(name a)
columns a text path '"<foo/>"', b xml path '"<foo/>"');
a | b
--------+--------
<foo/> | <foo/>
Oracle fiddle for comparison:
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=26f91a9e55a6908c2bcf848b464ca381
A B
<foo/> <foo/>
should be fixed in last patch
Regards
Pavel
-Chap
Вложения
Hi, Is there, somewhere, a written-up "house style" for what DocBook 4.2 elements to use for which types of content in the manual? In func.sgml I'm seeing what looks like a variety of examples, and I'm not sure which ones to try to follow. Thanks, -Chap
Hi On 2019-Jan-18, Chapman Flack wrote: > Is there, somewhere, a written-up "house style" for what DocBook 4.2 > elements to use for which types of content in the manual? > > In func.sgml I'm seeing what looks like a variety of examples, and > I'm not sure which ones to try to follow. I don't think we do. I'd suggest to come up with something and then see if it makes sense to patch the docs to apply it regularly. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 01/19/19 08:35, Alvaro Herrera wrote: >> Is there, somewhere, a written-up "house style" for what DocBook 4.2 >> elements to use for which types of content in the manual? >> ... > I don't think we do. I'd suggest to come up with something and then see > if it makes sense to patch the docs to apply it regularly. I think my ambition at the moment is just to complete a particular addition to func.sgml and do so as consistently as I can manage with what's there now. I have noticed a couple of things: - 'SQL' is often marked up as <acronym>SQL</acronym>, but far from always. - no such markup is applied to 'JSON' or 'XML' at all, at least not in func.sgml. - there is a README.links with this guideline: o Do not use text with <ulink> so the URL appears in printed output but a grep -r in doc/src/sgml turns up 112 uses that observe the guideline, and 147 that supply link text. (thinks to self half-seriously about an XSL transform for generating printed output that could preserve link-texted links, add raised numbers, and produce a numbered URLs section at the back) -Chap
Working slowly through the documentation, I came upon: For XMLTABLE: - The xmltable function produces a table based on the given XML value, an XPath filter to extract rows, and an optional set of column definitions. ^^^^^^^^ ... The mandatory COLUMNS clause specifies the list of columns ... ^^^^^^^^^ if the COLUMNS clause is omitted, the rows in the result set contain ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ a single column of type xml containing the data matched by row_expression. This documentation seems undecided on whether the COLUMNS clause is mandatory or optional. It is mandatory in the SQL standard. It's mandatory in our grammar. We give a syntax_error if it's omitted. Is some of the documentation left over from an earlier contemplated design of having the clause be optional? Oracle does seem to allow the clause to be omitted, and produces a single xml column, as described. Was there an earlier plan to imitate Oracle's nonstandard behavior on that point? (Hardly seems worth the effort, as porting an Oracle query depending on it would simply entail adding COLUMNS COLUMN_VALUE XML PATH '.' and then it's portable and standard.) - It is possible for a default_expression to reference the value of output columns that appear prior to it in the column list, so the default of one column may be based on the value of another column. Is there an example that clearly shows this to work? If I write a default_expression referring to a prior column in /xmltable's own/ column list, I get an undefined_column error. I can successfully refer to a column of /an earlier FROM item in the SELECT/, but I am not sure that demonstrates the behavior claimed here. There is what looks like an example among the regression tests (the one with DEFAULT ascii(_path) - 54), but that seems only to demonstrate xmltable getting invoked four times (as documented for LATERAL), not a single xmltable invocation producing multiple rows with recomputed defaults. If it's any comfort, I haven't gotten Oracle's xmltable to recognize earlier columns in its own column list either. - Unlike regular PostgreSQL functions, column_expression and default_expression are not evaluated to a simple value before calling the function. column_expression is normally evaluated exactly once per input row, and default_expression is evaluated each time a default is needed for a field. I've already covered the question about default_expression, but what this passage says about column_expression seems, at least, ambiguously worded, too: It goes without saying that /the XPath evaluator/ evaluates the column_expression exactly once per input row. In the standard, that's the only per-row evaluation happening; the column_expression SQL value only gets compiled to an XPath expression once at the start. (In fact, in the standard, it can't even be an arbitrary SQL expression, only a string literal. Oracle enforces that too.) It seems that our implementation is meant to extend the standard and actually allow the column_expression to vary per-row, and go through the XPath expression compiler each time. The regression test with COLUMNS a int PATH '' || lower(_path) || 'c' seems to be intended to confirm that behavior. But again, I think it is only confirming that LATERAL results in xmltable being called four consecutive times, with a different PATH in each call. It does not seem to demonstrate a single xmltable call doing anything special with recompiling a column path. Am I overlooking something? Regards, -Chap
ne 20. 1. 2019 v 5:37 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
Working slowly through the documentation, I came upon:
For XMLTABLE:
- The xmltable function produces a table based on the given XML value,
an XPath filter to extract rows, and an optional set of column
definitions. ^^^^^^^^
...
The mandatory COLUMNS clause specifies the list of columns ...
^^^^^^^^^
if the COLUMNS clause is omitted, the rows in the result set contain
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
a single column of type xml containing the data matched by
row_expression.
This documentation seems undecided on whether the COLUMNS clause
is mandatory or optional.
It is mandatory in the SQL standard. It's mandatory in our grammar.
We give a syntax_error if it's omitted.
Is some of the documentation left over from an earlier contemplated
design of having the clause be optional?
Oracle does seem to allow the clause to be omitted, and produces a
single xml column, as described. Was there an earlier plan to imitate
Oracle's nonstandard behavior on that point? (Hardly seems worth the
effort, as porting an Oracle query depending on it would simply entail
adding COLUMNS COLUMN_VALUE XML PATH '.' and then it's portable and
standard.)
If I remember, described functionality was implemented in early patches, but was removed to simplify code. To now, there was not a request to do it.
Unfortunately, the documentation was not fixed.
- It is possible for a default_expression to reference the value of
output columns that appear prior to it in the column list, so the
default of one column may be based on the value of another column.
Is there an example that clearly shows this to work? If I write a
default_expression referring to a prior column in /xmltable's own/
column list, I get an undefined_column error. I can successfully refer
to a column of /an earlier FROM item in the SELECT/, but I am not sure
that demonstrates the behavior claimed here.
There is what looks like an example among the regression tests
(the one with DEFAULT ascii(_path) - 54), but that seems only to
demonstrate xmltable getting invoked four times (as documented for
LATERAL), not a single xmltable invocation producing multiple rows
with recomputed defaults.
If it's any comfort, I haven't gotten Oracle's xmltable to recognize
earlier columns in its own column list either.
- Unlike regular PostgreSQL functions, column_expression and
default_expression are not evaluated to a simple value before calling
the function. column_expression is normally evaluated exactly once
per input row, and default_expression is evaluated each time a default
is needed for a field.
I've already covered the question about default_expression, but what
this passage says about column_expression seems, at least, ambiguously
worded, too:
It goes without saying that /the XPath evaluator/ evaluates the
column_expression exactly once per input row. In the standard, that's
the only per-row evaluation happening; the column_expression SQL value
only gets compiled to an XPath expression once at the start. (In fact,
in the standard, it can't even be an arbitrary SQL expression, only a
string literal. Oracle enforces that too.)
column expressions are evaluated once per row, but XPath expression is compiled per row too, if I remember well. We designed it more tolerant as we expected possibility to store XPath expression in one column and data in second column.
Regards
Pavel
It seems that our implementation is meant to extend the standard and
actually allow the column_expression to vary per-row, and go through
the XPath expression compiler each time. The regression test with
COLUMNS a int PATH '' || lower(_path) || 'c'
seems to be intended to confirm that behavior. But again, I think
it is only confirming that LATERAL results in xmltable being called
four consecutive times, with a different PATH in each call. It does
not seem to demonstrate a single xmltable call doing anything special
with recompiling a column path.
Am I overlooking something?
Regards,
-Chap
On 01/19/19 23:49, Pavel Stehule wrote: > If I remember, described functionality was implemented in early patches, > but was removed to simplify code. To now, there was not a request to do it. > > Unfortunately, the documentation was not fixed. I'll do that, as I'm working in there anyway. :) > column expressions are evaluated once per row, but XPath expression is > compiled per row too, if I remember well. I looked for evidence of that in the code, but did not find it; the compilation appears to happen in XmlTableSetColumnFilter, which is called from tfuncInitialize. I can't guarantee I didn't miss something, though. > We designed it more tolerant as > we expected possibility to store XPath expression in one column and data in > second column. Perhaps if I could see an example showing the functionality... The nearest I could find in the regression tests was the test with COLUMNS a int PATH '' || lower(_path) || 'c' but, again, I think that test only demonstrates how LATERAL works, not any behavior specific to xmltable. Regards, -Chap
ne 20. 1. 2019 v 6:06 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 01/19/19 23:49, Pavel Stehule wrote:
> If I remember, described functionality was implemented in early patches,
> but was removed to simplify code. To now, there was not a request to do it.
>
> Unfortunately, the documentation was not fixed.
I'll do that, as I'm working in there anyway. :)
> column expressions are evaluated once per row, but XPath expression is
> compiled per row too, if I remember well.
I looked for evidence of that in the code, but did not find it; the
compilation appears to happen in XmlTableSetColumnFilter, which is
called from tfuncInitialize.
it is called per input row.
I can't guarantee I didn't miss something, though.
> We designed it more tolerant as
> we expected possibility to store XPath expression in one column and data in
> second column.
Perhaps if I could see an example showing the functionality... The nearest
I could find in the regression tests was the test with
COLUMNS a int PATH '' || lower(_path) || 'c'
but, again, I think that test only demonstrates how LATERAL works, not
any behavior specific to xmltable.
the main reason for this was not to support support some specific patterns - just used design doesn't requires stronger changes in executor or internal multi call caching of some internal data. Probably you can find discussion related to this topic in mailing list archive.
sure - using any expression in PATH clause should to demonstrate this functionality.
Regards,
-Chap
Hi, On 01/20/19 00:26, Pavel Stehule wrote: >>> column expressions are evaluated once per row, but XPath expression is >>> compiled per row too, if I remember well. >> >> I looked for evidence of that in the code, but did not find it; the >> compilation appears to happen in XmlTableSetColumnFilter, which is >> called from tfuncInitialize. > > it is called per input row. I could be misunderstanding what you mean by 'input row' here. If I write a simple xmltable query where the row_expression produces six rows: SELECT * FROM xmltable('pg_am/row' PASSING table_to_xml('pg_am', true, false, '') COLUMNS amname text PATH 'amname'); six rows are produced, though a breakpoint set on XmlTableSetColumnFilter fires only once, from tfuncInitialize at the start of xmltable's execution. By contrast, in the regression test example with PATH ''||lower(_path)||'c', four rows are produced and the breakpoint fires four times. However ... that isn't because one call to xmltable is producing four rows and recomputing the column_expression each time. It's because that xmltable is the RHS of a LATERAL, and the LHS of the LATERAL is producing four tuples with different values of columns the RHS depends on, so the RHS (xmltable) is being called four different times, producing one row each time, still with XmlTableSetColumnFilter being called only during initialization. > sure - using any expression in PATH clause should to demonstrate this > functionality. Well, there seem to be two distinct features touched on in the docs: 1. The column_expression is allowed to be an expression, not restricted to a string literal as it is in the standard (and Oracle). 2. Not only is it an expression, but it's an expression whose evaluation is deferred and can happen more than once in the same xmltable call. The example in the regression tests certainly demonstrates (1). Without (1), it would be a syntax error. I think the same example would produce the same output even with feature (2) absent. It's LATERAL doing the magic there. So I am doubtful that it demonstrates (2). I put some effort last night into trying to even construct any query that would demonstrate (2), and I came up short, but that could be my lack of imagination. (Somewhere in that effort I happened to notice that xmltable doesn't seem to be parsed successfully inside a ROWS FROM (...) construct, which might be another issue for another time....) So, if you have a way to build a query that demonstrates (2), my aha! moment might then arrive. Regards, -Chap
ne 20. 1. 2019 v 17:06 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
Hi,
On 01/20/19 00:26, Pavel Stehule wrote:
>>> column expressions are evaluated once per row, but XPath expression is
>>> compiled per row too, if I remember well.
>>
>> I looked for evidence of that in the code, but did not find it; the
>> compilation appears to happen in XmlTableSetColumnFilter, which is
>> called from tfuncInitialize.
>
> it is called per input row.
I could be misunderstanding what you mean by 'input row' here.
input row mean a row of processed relation. The xml value from this row can be transformed to 0..M output rows.
The column filter expressions are evaluated once per input rows, default expressions are evaluated when it is necessary - possibly once for any output row
If I write a simple xmltable query where the row_expression produces
six rows:
SELECT *
FROM
xmltable('pg_am/row'
PASSING table_to_xml('pg_am', true, false, '')
COLUMNS amname text PATH 'amname');
six rows are produced, though a breakpoint set on XmlTableSetColumnFilter
fires only once, from tfuncInitialize at the start of xmltable's execution.
By contrast, in the regression test example with PATH ''||lower(_path)||'c',
four rows are produced and the breakpoint fires four times.
it is expected - the input relation has four lines - the function was 4x initialized. In this case 1 call of xmltable produces 1 row.
However ... that isn't because one call to xmltable is producing four rows
and recomputing the column_expression each time.
It's because that xmltable is the RHS of a LATERAL, and the LHS of the
LATERAL is producing four tuples with different values of columns the RHS
depends on, so the RHS (xmltable) is being called four different times,
producing one row each time, still with XmlTableSetColumnFilter being called
only during initialization.
> sure - using any expression in PATH clause should to demonstrate this
> functionality.
Well, there seem to be two distinct features touched on in the docs:
1. The column_expression is allowed to be an expression, not restricted
to a string literal as it is in the standard (and Oracle).
2. Not only is it an expression, but it's an expression whose evaluation
is deferred and can happen more than once in the same xmltable call.
yes, it is any expressions used there are evaluated per 1 call. If input relation has N rows, then xmltable is initialized N times - by different words. xmltable is evaluated independently for any processed XML document.
The example in the regression tests certainly demonstrates (1). Without (1),
it would be a syntax error.
I think the same example would produce the same output even with feature (2)
absent. It's LATERAL doing the magic there. So I am doubtful that it
demonstrates (2).
LATERAL is necessary, because XMLTABLE can be used only in FROM clause, and in this case XMLTABLE has mutable parameters.
I put some effort last night into trying to even construct any query that
would demonstrate (2), and I came up short, but that could be my lack of
imagination. (Somewhere in that effort I happened to notice that xmltable
doesn't seem to be parsed successfully inside a ROWS FROM (...) construct,
which might be another issue for another time....)
So, if you have a way to build a query that demonstrates (2), my aha! moment
might then arrive.
Regards,
-Chap
On 01/20/19 11:55, Pavel Stehule wrote: > input row mean a row of processed relation. The xml value from this row can > be transformed to 0..M output rows. > > The column filter expressions are evaluated once per input rows, default > expressions are evaluated when it is necessary - possibly once for any > output row > ... > it is expected - the input relation has four lines - the function was 4x > initialized. In this case 1 call of xmltable produces 1 row. Good ... I think we're converging on a shared understanding. I am just used to speaking of xmltable as a simple function that executes one row_expression one time against one supplied input, and generates 0..M output rows from it. If there are multiple rows in a relation that you want to apply xmltable to, that's a simple matter of calling xmltable multiple times (which is just what SQL is doing when the xmltable is on the RHS of an explicit or implied LATERAL). The upshot seems to be that there is nothing necessarily special about how xmltable treats its column_expressions: it compiles them once upon entry to the function, as one would naïvely expect. (Or, if there is anything more special about how the column_expression is being handled, it seems not to be necessary, as the naïve behavior would be adequate.) Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL functions," is more likely to perplex readers than to enlighten them. What it says about column_expression does not seem to lead to any useful difference from the behavior if it were "just like regular PostgreSQL functions". The part about usefully using volatile functions in default_expression remains important to mention. The statement in an earlier paragraph that "It is possible for a default_expression to reference the value of output columns that appear prior to it in the column list" still may need some rework, because it does not seem possible to refer to prior columns /within xmltable's own column list/ (though that could be useful, and I think it is intended in the standard). Doesn't seem to work in Oracle either.... While it does seem possible to refer to columns supplied by /earlier FROM items in the containing SELECT/, that simply results in multiple calls of xmltable, just as in the column_expression case. >> I think the same example would produce the same output even with feature >> (2) >> absent. It's LATERAL doing the magic there. So I am doubtful that it >> demonstrates (2). > > LATERAL is necessary, because XMLTABLE can be used only in FROM clause, > and in this case XMLTABLE has mutable parameters. For what it's worth, if I repeat the query with the word LATERAL removed, it works just the same. I think that's simply because the LATERAL behavior is implied for a function-call FROM item, so the explicit word isn't needed. The main thing is, evaluation proceeds in the way described under LATERAL in the ref page for SELECT. Regards, -Chap
Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL
functions," is more likely to perplex readers than to enlighten them.
What it says about column_expression does not seem to lead to any useful
difference from the behavior if it were "just like regular PostgreSQL
functions".
regular Postgres' functions has evaluated all arguments before own execution. I think so this note is related much more to expressions used as defaults.
The part about usefully using volatile functions in default_expression
remains important to mention.
The statement in an earlier paragraph that "It is possible for a
default_expression to reference the value of output columns that appear
prior to it in the column list" still may need some rework, because it
does not seem possible to refer to prior columns /within xmltable's own
column list/ (though that could be useful, and I think it is intended
in the standard). Doesn't seem to work in Oracle either....
While it does seem possible to refer to columns supplied by
/earlier FROM items in the containing SELECT/, that simply results in
multiple calls of xmltable, just as in the column_expression case.
>> I think the same example would produce the same output even with feature
>> (2)
>> absent. It's LATERAL doing the magic there. So I am doubtful that it
>> demonstrates (2).
>
> LATERAL is necessary, because XMLTABLE can be used only in FROM clause,
> and in this case XMLTABLE has mutable parameters.
For what it's worth, if I repeat the query with the word LATERAL removed,
it works just the same. I think that's simply because the LATERAL behavior
is implied for a function-call FROM item, so the explicit word isn't needed.
The main thing is, evaluation proceeds in the way described under LATERAL
in the ref page for SELECT.
In this case the LATERAL keyword is optional - with or without this keyword it is lateral join.
Regards
Pavel
Regards,
-Chap
On 01/20/19 12:48, Pavel Stehule wrote: >> >> Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL >> functions," is more likely to perplex readers than to enlighten them. >> What it says about column_expression does not seem to lead to any useful >> difference from the behavior if it were "just like regular PostgreSQL >> functions". > > regular Postgres' functions has evaluated all arguments before own > execution. I think so this note is related much more to expressions used as > defaults. Sure, but again, is there an example, or can one easily be constructed, that shows the default expressions working in such a way? I am not able to use a default expression to refer to an earlier column in the column list of the xmltable call. I am able to use a default expression to refer to a column of an earlier FROM item in the enclosing SELECT. But such a query ends up having LATERAL form (whether or not the word LATERAL is used), and re-executes xmltable whenever the referenced column value changes. In that case, whether the default argument is evaluated at function entry or later doesn't seem to matter: the function is re-executed, so evaluating the new default at the time of entry is sufficient. So, I have still not been able to construct a query that requires the deferred evaluation behavior. But perhaps there is a way I haven't thought of. Regards, -Chap
On 01/20/19 17:13, Chapman Flack wrote: > On 01/20/19 12:48, Pavel Stehule wrote: >> regular Postgres' functions has evaluated all arguments before own >> execution. I think so this note is related much more to expressions used as >> defaults. > > Sure, but again, is there an example, or can one easily be constructed, > that shows the default expressions working in such a way? To make my question more concrete, here is the current regression test query that uses an SQL expression for a default: SELECT xmltable.* FROM xmltest2, xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54); a ---- 11 12 13 14 (4 rows) Here is the same query desugared into a call of the PL/Java "xmltable": SELECT xmltable.* FROM xmltest2, LATERAL (SELECT x AS ".", ascii(_path) - 54 AS "A_DFT") AS p, "xmltable"(('/d/r/' || lower(_path) || 'c'), PASSING => p, COLUMNS => ARRAY[ 'let $a := x return xs:int(if (exists($a)) then $a else $A_DFT)' ] ) AS (a int); a ---- 11 12 13 14 (4 rows) So the PL/Java version works and produces the same results. And yet it certainly is a "regular PostgreSQL function" made with CREATE FUNCTION, no special treatment of arguments, all evaluated before entry in the usual way. So it appears that this example does not depend on any special treatment of the default_expression. Is there an example that can be constructed that would depend on the special treatment (in which case, the PL/Java implementation would be unable to produce the same result)? Regards, -Chap ... the xs:int(...) cast above is needed for now, just because the PL/Java implementation does not yet include the standard's algorithm to find the right cast automatically.
On 01/20/19 20:07, Chapman Flack wrote: > So it appears that this example does not depend on any special treatment > of the default_expression. > > Is there an example that can be constructed that would depend on the > special treatment (in which case, the PL/Java implementation would be > unable to produce the same result)? I see that I briefly forgot one difference that does matter, namely the timing of a call to a volatile function like nextval. That detail certainly needs to remain in the docs. I am left wondering if that might be the only effect of the deferred argument evaluation that really does matter. Regards, -Chap
ne 20. 1. 2019 23:13 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 01/20/19 12:48, Pavel Stehule wrote:
>>
>> Accordingly, I think the paragraph beginning "Unlike regular PostgreSQL
>> functions," is more likely to perplex readers than to enlighten them.
>> What it says about column_expression does not seem to lead to any useful
>> difference from the behavior if it were "just like regular PostgreSQL
>> functions".
>
> regular Postgres' functions has evaluated all arguments before own
> execution. I think so this note is related much more to expressions used as
> defaults.
Sure, but again, is there an example, or can one easily be constructed,
that shows the default expressions working in such a way?
I am not able to use a default expression to refer to an earlier
column in the column list of the xmltable call.
probably you can see the effect if you use some volatile function .. random(), nextval(),
I think so notice in documentation was not a motivation to use it. It was explanation of implementation and warnings against side effect.
I am able to use a default expression to refer to a column of an earlier
FROM item in the enclosing SELECT. But such a query ends up having LATERAL
form (whether or not the word LATERAL is used), and re-executes xmltable
whenever the referenced column value changes. In that case, whether the
default argument is evaluated at function entry or later doesn't seem
to matter: the function is re-executed, so evaluating the new default
at the time of entry is sufficient.
it has sense only for volatile functions. it was not often. On second hand deferred evaluation shoul not be a problem, and more, it is evaluated only when it is necessary, what is more sensible for me.
So, I have still not been able to construct a query that requires the
deferred evaluation behavior. But perhaps there is a way I haven't
thought of.
Regards,
-Chap
On 2019-Jan-19, Chapman Flack wrote: > I have noticed a couple of things: > > - 'SQL' is often marked up as <acronym>SQL</acronym>, but far from always. > > - no such markup is applied to 'JSON' or 'XML' at all, at least > not in func.sgml. I think these inconsistencies just stem from lack of a strong reviewer hand on the topic. Let's change that? > - there is a README.links with this guideline: > > o Do not use text with <ulink> so the URL appears in printed output > > but a grep -r in doc/src/sgml turns up 112 uses that observe the > guideline, and 147 that supply link text. I think the README.links was written in the SGML times; now that we're in XML it may need to be reconsidered. > (thinks to self half-seriously about an XSL transform for generating > printed output that could preserve link-texted links, add raised numbers, > and produce a numbered URLs section at the back) Well, if you have the time and inclination, and you think such changes are improvements, feel free to propose them. Do keep in mind we have a number of outputs that would be good to keep consistent. Thanks, -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 01/21/19 09:12, Alvaro Herrera wrote: >> (thinks to self half-seriously about an XSL transform for generating >> printed output that could preserve link-texted links, add raised numbers, >> and produce a numbered URLs section at the back) > > Well, if you have the time and inclination, and you think such changes > are improvements, feel free to propose them. Do keep in mind we have a > number of outputs that would be good to keep consistent. Well, "consistent" what I was half-thinking about, FSVO "consistent". For me, if I'm looking at an online document, I would prefer to see the descriptive text of the link, rather than a long jaggy URL. If I want to see the URL, I can hover over it, and if I want to go there, I can click it. But the point's well taken that in /printed output/, that's of no use. Which is, in a sense, an inconsistency: in one format, you can follow the links, while in another, you're out of luck. Maybe a simpler transform for printed output, rather than collecting all URLs into one section at the back, would just be to follow any <ulink> that has link text with a <footnote> containing the same ulink without the link text, so it shows the URL, and that would be right at the bottom of the same 'page'. That'd be an introductory XSL exercise.... In practice, applying such a transform "for printed output" would probably mean applying it when generating PDF output, which of course can also be viewed online (and probably most often is, these days). So preserving the original ulink run into the text is still of use, in a PDF viewer that can follow links, but adding the footnote ensures that an actual hard copy is still usable. I wouldn't think it important to apply the same treatment when making HTML. So maybe the right value of "consistent" is the one that comes from listing out the various output formats and specifying the right transformation to be applied to each one. (Now wonders if there's a way to do the same transform into HTML while styling the footnote and marker to hide behind @media print....) Regards, -Chap
On 1/21/19 8:46 AM, Chapman Flack wrote: > On 01/21/19 09:12, Alvaro Herrera wrote: > >>> (thinks to self half-seriously about an XSL transform for generating >>> printed output that could preserve link-texted links, add raised numbers, >>> and produce a numbered URLs section at the back) >> Well, if you have the time and inclination, and you think such changes >> are improvements, feel free to propose them. Do keep in mind we have a >> number of outputs that would be good to keep consistent. > Well, "consistent" what I was half-thinking about, FSVO "consistent". > > For me, if I'm looking at an online document, I would prefer to see > the descriptive text of the link, rather than a long jaggy URL. If I > want to see the URL, I can hover over it, and if I want to go there, > I can click it. > > But the point's well taken that in /printed output/, that's of no use. > Which is, in a sense, an inconsistency: in one format, you can follow the > links, while in another, you're out of luck. > > Maybe a simpler transform for printed output, rather than collecting > all URLs into one section at the back, would just be to follow any > <ulink> that has link text with a <footnote> containing the same ulink > without the link text, so it shows the URL, and that would be right at > the bottom of the same 'page'. > > That'd be an introductory XSL exercise.... > > In practice, applying such a transform "for printed output" would > probably mean applying it when generating PDF output, which of course > can also be viewed online (and probably most often is, these days). I don't think that is a good idea. PDFs have had the ability to embed hyperlinks under descriptive text for years. If we are going to expand links for printed output, we should have a specific build / modification to the make file for printed output. I also wonder if we are trying to solve the 1% problem here. Who is really going to "print" our docs? If they do print the docs, they are likely not going to "type in" a long URL. They are going to go online (or to the PDF) and click the link that they saw within the printed page. JD -- Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc PostgreSQL centered full stack support, consulting and development. Advocate: @amplifypostgres || Learn and Network: https://postgresconf.org *** A fault and talent of mine is to tell it exactly how it is. ***
On 2019-Jan-21, Chapman Flack wrote: > But the point's well taken that in /printed output/, that's of no use. > Which is, in a sense, an inconsistency: in one format, you can follow the > links, while in another, you're out of luck. > > Maybe a simpler transform for printed output, rather than collecting > all URLs into one section at the back, would just be to follow any > <ulink> that has link text with a <footnote> containing the same ulink > without the link text, so it shows the URL, and that would be right at > the bottom of the same 'page'. Of course, the text would also be clickable, right? I think putting the URL in a footnote is good in that case; it works both on screen and on paper, which should alleviate JD's concerns. > I wouldn't think it important to apply the same treatment when making HTML. Right, only PDF. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 01/21/19 12:07, Joshua D. Drake wrote: > Who is really going to "print" our docs? If they do print the > docs, they are likely not going to "type in" a long URL. QR code in footnote (ducks and runs).
On 1/21/19 10:01 AM, Alvaro Herrera wrote: > On 2019-Jan-21, Chapman Flack wrote: > >> But the point's well taken that in /printed output/, that's of no use. >> Which is, in a sense, an inconsistency: in one format, you can follow the >> links, while in another, you're out of luck. >> >> Maybe a simpler transform for printed output, rather than collecting >> all URLs into one section at the back, would just be to follow any >> <ulink> that has link text with a <footnote> containing the same ulink >> without the link text, so it shows the URL, and that would be right at >> the bottom of the same 'page'. > Of course, the text would also be clickable, right? I think putting the > URL in a footnote is good in that case; it works both on screen and on > paper, which should alleviate JD's concerns. Yeah I could see that. I thought about that but was wondering if it was possible to auto cite? JD > >> I wouldn't think it important to apply the same treatment when making HTML. > Right, only PDF. > -- Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc PostgreSQL centered full stack support, consulting and development. Advocate: @amplifypostgres || Learn and Network: https://postgresconf.org *** A fault and talent of mine is to tell it exactly how it is. ***
On 1/21/19 10:11 AM, Chapman Flack wrote: > On 01/21/19 12:07, Joshua D. Drake wrote: >> Who is really going to "print" our docs? If they do print the >> docs, they are likely not going to "type in" a long URL. > QR code in footnote (ducks and runs). Funny, although I know why you said that. I don't think it is that bad of an idea. Everyone uses QR Codes now. Have a QR code that automatically opens a page that lists all the expanded links based on the page it is placed? That is pretty cool but I am not sure if that is something that would happen in this round. JD > -- Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc PostgreSQL centered full stack support, consulting and development. Advocate: @amplifypostgres || Learn and Network: https://postgresconf.org *** A fault and talent of mine is to tell it exactly how it is. ***
On 2019-Jan-21, Joshua D. Drake wrote: > On 1/21/19 10:01 AM, Alvaro Herrera wrote: > > On 2019-Jan-21, Chapman Flack wrote: > > > > > But the point's well taken that in /printed output/, that's of no use. > > > Which is, in a sense, an inconsistency: in one format, you can follow the > > > links, while in another, you're out of luck. > > > > > > Maybe a simpler transform for printed output, rather than collecting > > > all URLs into one section at the back, would just be to follow any > > > <ulink> that has link text with a <footnote> containing the same ulink > > > without the link text, so it shows the URL, and that would be right at > > > the bottom of the same 'page'. > > Of course, the text would also be clickable, right? I think putting the > > URL in a footnote is good in that case; it works both on screen and on > > paper, which should alleviate JD's concerns. > > Yeah I could see that. I thought about that but was wondering if it was > possible to auto cite? Sorry, what do you mean with auto cite? Put all links at the end of the section/chapter/book? That seems less usable to me (you have to find out the page where the links appear, and make sure to print the right pages) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 01/21/19 13:14, Joshua D. Drake wrote: >> Of course, the text would also be clickable, right? I think putting the >> URL in a footnote is good in that case; it works both on screen and on >> paper, which should alleviate JD's concerns. > > Yeah I could see that. I thought about that but was wondering if it was It looks like the easiest way to integrate such a behavior into the current Makefile would be not as a separate DocBook->DocBook transform in advance, but simply by editing the existing stylesheet-fo.xsl that produces the FO input for generating the PDF. That would mean learning some FO, which I've wanted to do for a while, but haven't yet, so it stops looking like something I might experiment with this afternoon. OTOH, it could mean more flexibility in how the presentation should look. I note in passing that the google result [1] is nonempty.... -Chap [1] https://www.google.com/search?q=xsl-fo+qr+code
Hi, In two places in the XMLTABLE implementation (XmlTableFetchRow, iterating through a nodeset returned by the row_expression, and XmlTableGetValue, going through a nodeset returned by the column_expression), the iteration proceeds in index order through xpathobj->nodesetval->nodeTab. The same happens in the older xml_xpathobjtoxmlarray, producing the array result of the xpath() function. In <libxml/xpath.h> one finds this unsettling comment: xmlNodePtr *nodeTab; /* array of nodes in no particular order */ So far, no matter what oddball XPath expressions I think up, nodeTab does seem to end up in document order. It would be comforting to document that, but the comment suggests it might not be guaranteed. Are you aware of any statement I might have missed in the libxml docs, where they commit to XPath evaluation returning a nodeset where nodeTab has a predictable order? If there is a statement to that effect somewhere, it might be worth citing in a comment in xml.c. Regards, -Chap Here is a fairly oddball query, generating an XML document with about 3k names in descending order, partitioning those elements into subsets having the same second character of the name, and unioning those back together. You'd sort of expect that to produce a result nodeTab in goofy order if anything could, but no, the results seem verifiably in descending order every time, suggesting the result nodeTab is indeed being put in document order before the XPath evaluator returns. If only they would document that. WITH nodeset_order_check(name, prev_name) AS ( SELECT x, lag(x) OVER (ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) FROM (SELECT string_agg(DISTINCT 'x/n[substring(.,2,1) = "' || substring(proname from 2 for 1) || '"]', '|') FROM pg_proc) AS rowx(rowx), (SELECT XMLELEMENT(NAME x, XMLAGG(XMLELEMENT(NAME n, proname) ORDER BY proname DESC)) FROM pg_proc) AS src(src), XMLTABLE(rowx PASSING src COLUMNS x text PATH '.') ) SELECT every(prev_name >= name IS NOT FALSE) FROM nodeset_order_check; every ------- t (1 row)
Hi, On 01/21/19 01:33, Pavel Stehule wrote: > ne 20. 1. 2019 23:13 odesílatel Chapman Flack <chap@anastigmatix.net> > napsal: >> form (whether or not the word LATERAL is used), and re-executes xmltable >> whenever the referenced column value changes. In that case, whether the >> default argument is evaluated at function entry or later doesn't seem >> to matter: the function is re-executed, so evaluating the new default >> at the time of entry is sufficient. > > it has sense only for volatile functions. it was not often. On second hand > deferred evaluation shoul not be a problem, and more, it is evaluated only > when it is necessary, what is more sensible for me. That makes sense. I trimmed the language about input rows and referring to earlier columns, and put more emphasis on the usefulness of evaluating volatile functions only when needed. I am: - re-attaching xmltable-xpath-result-processing-bugfix-5.patch unchanged (just so CF app does not lose track) - re-attaching xmltable-xmlexists-passing-mechanisms-1.patch unchanged - attaching for the first time xml-functions-type-docfix-1.patch The doc patch is made to go on top of the passing-mechanisms patch (there were some doc changes in passing-mechanisms, changed again in the new patch to be links to the added Limits and Compatibility section). I hope the patched docs describe accurately what we have at this point. Regards, -Chap
Вложения
pá 25. 1. 2019 v 5:46 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
Hi,
On 01/21/19 01:33, Pavel Stehule wrote:
> ne 20. 1. 2019 23:13 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:
>> form (whether or not the word LATERAL is used), and re-executes xmltable
>> whenever the referenced column value changes. In that case, whether the
>> default argument is evaluated at function entry or later doesn't seem
>> to matter: the function is re-executed, so evaluating the new default
>> at the time of entry is sufficient.
>
> it has sense only for volatile functions. it was not often. On second hand
> deferred evaluation shoul not be a problem, and more, it is evaluated only
> when it is necessary, what is more sensible for me.
That makes sense. I trimmed the language about input rows and referring to
earlier columns, and put more emphasis on the usefulness of evaluating
volatile functions only when needed.
I am:
- re-attaching xmltable-xpath-result-processing-bugfix-5.patch unchanged
(just so CF app does not lose track)
- re-attaching xmltable-xmlexists-passing-mechanisms-1.patch unchanged
- attaching for the first time xml-functions-type-docfix-1.patch
The doc patch is made to go on top of the passing-mechanisms patch
(there were some doc changes in passing-mechanisms, changed again in
the new patch to be links to the added Limits and Compatibility section).
I hope the patched docs describe accurately what we have at this point.
looks well
Pavel
Regards,
-Chap
On 19/01/2019 21:24, Chapman Flack wrote: > (thinks to self half-seriously about an XSL transform for generating > printed output that could preserve link-texted links, add raised numbers, > and produce a numbered URLs section at the back) External links already create footnotes in the PDF output. Is that different from what you are saying? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 01/25/19 09:01, Peter Eisentraut wrote: > On 19/01/2019 21:24, Chapman Flack wrote: >> (thinks to self half-seriously about an XSL transform for generating >> printed output that could preserve link-texted links, add raised numbers, >> and produce a numbered URLs section at the back) > > External links already create footnotes in the PDF output. Is that > different from what you are saying? That might, only, indicate that I was just thinking aloud in email and had not gone and checked in PDF output to see how the links were handled. Yes, it could very possibly indicate that. If they are already processed that way, does that mean the o Do not use text with <ulink> so the URL appears in printed output in README.links should be considered obsolete, and removed even, and doc authors should feel free to put link text in <ulink> without hesitation? Regards, -Chap
On 01/25/19 00:45, Pavel Stehule wrote: > pá 25. 1. 2019 v 5:46 odesílatel Chapman Flack <chap@anastigmatix.net> > napsal: >> I am: >> - re-attaching xmltable-xpath-result-processing-bugfix-5.patch unchanged >> (just so CF app does not lose track) >> - re-attaching xmltable-xmlexists-passing-mechanisms-1.patch unchanged >> - attaching for the first time xml-functions-type-docfix-1.patch >> >> The doc patch is made to go on top of the passing-mechanisms patch Realized xmltable-xmlexists-passing-mechanisms-1.patch didn't add a regression test. Here attaching (or re-attaching): - xmltable-xpath-result-processing-bugfix-5.patch - unchanged - xmltable-xmlexists-passing-mechanisms-2.patch - now with test - xml-functions-type-docfix-1.patch - unchanged I'll venture a review opinion that all of this applies, builds, and passes check-world on top of 18c0da8, and that, of the issues I had identified at the start of this thread, these changes resolve the ones they set out to resolve. But the second two patches are my own work, so another reviewer is needed. The passing-mechanisms patch is tiny while the docfix patch is not, so there's an opening for a reviewer with an interest in documentation. :) There is still nothing in this patch set to address [1], though that also seems worth doing, perhaps in another patch, and probably not difficult, perhaps needing only a regex. And of course we're still saddled with all the unfixable limits of XPath 1.0; this patch set is fixing a few peripheral fixable things around that. -Chap [1] https://www.postgresql.org/message-id/5BD1C44B.6040300%40anastigmatix.net
Вложения
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation: not tested As the reporter of the issues raised in this email thread, I've reviewed the first patch and contributed the second and third. WHAT THE PATCHES DO: xmltable-xpath-result-processing-bugfix-5.patch contains code changes correcting a subset of the issues that were raised in this email thread. xmltable-xmlexists-passing-mechanisms-2.patch adjusts the grammar to allow the XML parameter passing mechanism BY VALUE as well as BY REF. Both are ignored, but formerly BY VALUE was a syntax error, which was unintuitive considering that BY VALUE is the passing mechanism PostgreSQL implements (XML node identities are not preserved). xml-functions-type-docfix-1.patch conforms the documentation to reflect the changes in this patch set and the limitations identified in this thread. WHAT I HAVE REVIEWED: I have applied all three patches over 18c0da8 and confirmed that installcheck-world passes and that the code changes resolve the issues they set out to resolve. I've made no entry for "spec compliant" because the question is moot; the spec is written in terms of the XQuery language, types, and concepts, and these facilities in PG are implemented on XPath 1.0, which doesn't have those. But the changes in this patch set do make the PG behaviors more, well, closely analogous to the way the spec compliant functions would behave. WHAT I HAVE NOT REVIEWED: The passing-mechanisms and docfix patches are my own work, so there should be another reviewer who is not me. I've looked closely at the technical, SQL/XML behavior aspects already, but a reviewer with an eye for documentation would be welcome. I'll venture my opinion that this is ready-for-committer to the extent of my own review, but will leave the status at needs-review for a not-me reviewer to update.
Hi
so 26. 1. 2019 v 4:25 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested
maybe more simple will be work with original commitfest subscriptions. I can do review of xmltable-xmlexists-passing-mechanisms-2.patch. Documentation review will be harder - I am not a native speaker and I have not a necessary knowledges of XQuery (probably only you have this knowledge).
As the reporter of the issues raised in this email thread, I've reviewed the first patch
and contributed the second and third.
WHAT THE PATCHES DO:
xmltable-xpath-result-processing-bugfix-5.patch contains code changes correcting
a subset of the issues that were raised in this email thread.
xmltable-xmlexists-passing-mechanisms-2.patch adjusts the grammar to allow the XML
parameter passing mechanism BY VALUE as well as BY REF. Both are ignored, but
formerly BY VALUE was a syntax error, which was unintuitive considering that BY VALUE
is the passing mechanism PostgreSQL implements (XML node identities are not preserved).
xml-functions-type-docfix-1.patch conforms the documentation to reflect the changes in
this patch set and the limitations identified in this thread.
WHAT I HAVE REVIEWED:
I have applied all three patches over 18c0da8 and confirmed that installcheck-world passes
and that the code changes resolve the issues they set out to resolve.
I've made no entry for "spec compliant" because the question is moot; the spec is written
in terms of the XQuery language, types, and concepts, and these facilities in PG are
implemented on XPath 1.0, which doesn't have those. But the changes in this patch set
do make the PG behaviors more, well, closely analogous to the way the spec compliant
functions would behave.
WHAT I HAVE NOT REVIEWED:
The passing-mechanisms and docfix patches are my own work, so there should be another
reviewer who is not me. I've looked closely at the technical, SQL/XML behavior aspects already,
but a reviewer with an eye for documentation would be welcome.
I'll venture my opinion that this is ready-for-committer to the extent of my own review, but will
leave the status at needs-review for a not-me reviewer to update.
On 01/25/19 19:37, Chapman Flack wrote: > There is still nothing in this patch set to address [1], though that > also seems worth doing, perhaps in another patch, and probably not > difficult, perhaps needing only a regex. Heck, it could be even simpler than that. If some XML input has a DTD, the attempt to parse it as 'content' with libxml is sure to fail early in the parse (because that's where the DTD is found). If libxml reports enough error detail to show it failed because of a DTD—and it appears to: DETAIL: line 1: StartTag: invalid element name <!DOCTYPE foo> —then simply recognize that error and reparse as 'document' on the spot. The early-failing first parse won't have cost much, and there is probably next to nothing to gain by trying to be any more clever. The one complication might that there seem to be versions of libxml that report error detail differently (hence the variant regression test expected files), so the code might have to recognize a couple different forms. -Chap > [1] https://www.postgresql.org/message-id/5BD1C44B.6040300%40anastigmatix.net
On 01/25/19 22:53, Pavel Stehule wrote: > Documentation review will be harder - I am not a native speaker and I have > not a necessary knowledges of XQuery (probably only you have this > knowledge). The doc patch is enough that I think it would be ideal to somehow attract a native speaker who has interest in documentation to review it. Even if that is someone without much XQuery-specific knowledge—I'll happily answer whatever questions they ask about that. :) -Chap
On 25/01/2019 15:37, Chapman Flack wrote: >> External links already create footnotes in the PDF output. Is that >> different from what you are saying? > That might, only, indicate that I was just thinking aloud in email and > had not gone and checked in PDF output to see how the links were handled. > > Yes, it could very possibly indicate that. > > If they are already processed that way, does that mean the > > o Do not use text with <ulink> so the URL appears in printed output > > in README.links should be considered obsolete, and removed even, and > doc authors should feel free to put link text in <ulink> without > hesitation? I think it's obsolete, yes. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
so 26. 1. 2019 v 1:38 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 01/25/19 00:45, Pavel Stehule wrote:
> pá 25. 1. 2019 v 5:46 odesílatel Chapman Flack <chap@anastigmatix.net>
> napsal:
>> I am:
>> - re-attaching xmltable-xpath-result-processing-bugfix-5.patch unchanged
>> (just so CF app does not lose track)
>> - re-attaching xmltable-xmlexists-passing-mechanisms-1.patch unchanged
>> - attaching for the first time xml-functions-type-docfix-1.patch
>>
>> The doc patch is made to go on top of the passing-mechanisms patch
Realized xmltable-xmlexists-passing-mechanisms-1.patch didn't add
a regression test. Here attaching (or re-attaching):
- xmltable-xpath-result-processing-bugfix-5.patch - unchanged
- xmltable-xmlexists-passing-mechanisms-2.patch - now with test
- xml-functions-type-docfix-1.patch - unchanged
I'll venture a review opinion that all of this applies, builds, and passes
check-world on top of 18c0da8, and that, of the issues I had identified at
the start of this thread, these changes resolve the ones they set out to
resolve.
But the second two patches are my own work, so another reviewer is needed.
The passing-mechanisms patch is tiny while the docfix patch is not, so
there's an opening for a reviewer with an interest in documentation. :)
There is still nothing in this patch set to address [1], though that
also seems worth doing, perhaps in another patch, and probably not
difficult, perhaps needing only a regex.
And of course we're still saddled with all the unfixable limits
of XPath 1.0; this patch set is fixing a few peripheral fixable things
around that.
I am sending a review of these patches
xmltable-xpath-result-processing-bugfix-5.patch - I'll skip it - just all tests passed
xmltable-xmlexists-passing-mechanisms-2.patch - this patch introduce new PASSING mechanism BY VALUE - it is just syntactic sugar due compatibility with standard. It is unhappy so previous implementation was broken and introduced "BY REF" instead "BY VALUE", but this bug should be fixed 10 years ago. It change nothing, all tests passed and the documentation looks ok.
Last patch is documentation only patch - I am thinking so the difference and limits our implementation of XPath based functions are described well and correctly.
I'll mark this patch as ready for commiters.
Regards
Pavel
-Chap
[1] https://www.postgresql.org/message-id/5BD1C44B.6040300%40anastigmatix.net
On Thu, Jan 31, 2019 at 04:26:31PM +0100, Pavel Stehule wrote: > I'll mark this patch as ready for commiters. For now I have moved the patch to the next CF, with the same status. -- Michael
Вложения
On 02/01/19 20:20, Michael Paquier wrote: > On Thu, Jan 31, 2019 at 04:26:31PM +0100, Pavel Stehule wrote: >> I'll mark this patch as ready for commiters. > > For now I have moved the patch to the next CF, with the same status. I wonder whether, given the move to next CF, it makes sense to change the title of the CF entry from "XMLTABLE" to, more generically, XML improvements, and get one or two more small changes in: - get XMLPARSE(CONTENT... (and cast-to-xml with XMLOPTION=content) to succeed even for content with DTDs, so that the content subtype really does fully include the document subtype, aligning it with the SQL:2006+ standard. I think this would be a simple patch that I can deliver early this month, and Tom found reports where the current behavior already bites people in pg_restore. Its only effect would be to allow a currently- failing case to succeed (and stop biting people). - get XMLEXISTS and XMLTABLE to allow passing of named parameters. I have less of a sense of how difficult that might be, but I see that the libxml xpath API does allow them. I don't know whether they were left out of the original development just arbitrarily, or whether some effort was made and ran into some problem. The value of supporting the named parameters, especially when the library limits us to XPath 1.0, is that the XPath 1.0 limitation that a value passed as the context item can only be an XML 'document' only applies to the context item, not to named parameters. So if somebody is trying to port an expression ...'foo(.)' PASSING bar... and bar is not in document form, there's a simple rewrite to ...'foo($a) PASSING bar AS a... whereas if we can't do named parameters, frustration ensues. Again, the only effect of such a change would be to permit something that currently isn't possible. I don't think I can promise to work on a patch for the second issue, but they both seem worthwhile, and I'm happy to work on the first. It seems to me these changes and the doc patch to go with them are closely enough related to fit in one CF entry that's still smaller and simpler than many, and that shouldn't be too difficult to review for v12, but I'll defer to more experienced voices. Regards, -Chap
Note: I sent an email last night with updated patches, which was not received because of a spamhaus reputation issue formy email provider. In working that out with my provider, at the moment I cannot send email at all, so I am using this comment to explain whythe status went back to "needs review" with no new patches yet. I'll send them again when I can.
[Resending to list so commitfest app will see it; the list blocked this message the first time on a mail reputation issue. Sorry for the duplication. I've removed the individual cc:s from this message.] On 02/05/19 23:16, Chapman Flack wrote: > I wonder whether, given the move to next CF, it makes sense to change > the title of the CF entry from "XMLTABLE" to, more generically, XML > improvements, and get one or two more small changes in: Interpreting the crickets as approval, I have changed the title of the CF entry, and the status back to Needs Review, with these patches attached: xmltable-xpath-result-processing-bugfix-6.patch xmltable-xmlexists-passing-mechanisms-3.patch xml-functions-type-docfix-2.patch xml-content-2006-1.patch That last one is new, and everything is rebased (onto 068503c). xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected output for the no-libxml case that was left out of -5. xml-functions-type-docfix-2.patch removes one more sentence I had meant to remove[1] but forgotten to. xml-content-2006-1.patch does this: > - get XMLPARSE(CONTENT... (and cast-to-xml with XMLOPTION=content) to > succeed even for content with DTDs, so that the content subtype really > does fully include the document subtype, aligning it with the SQL:2006+ > standard. I think this would be a simple patch that I can deliver early > this month, and Tom found reports where the current behavior already > bites people in pg_restore. Its only effect would be to allow a currently- > failing case to succeed (and stop biting people). It works as suggested in [2], just by intercepting the error if a parse-as-content trips over a DTD, and retrying as a parse-as-document. While that has a certain hacky smell, it also has the advantage of handling what's probably an uncommon edge case in a way that adds no upfront cost. (Other, 'tidier' approaches could involve evaluating a regex first to decide how to parse--I believe everything that's allowed ahead of a DTD makes a regular language--but that would add cycles to every parse.) In xml.c one does find the following comment: * TODO maybe libxml2's xmlreader is better? (do not construct DOM, * yet do not use SAX - see xmlreader.c) and yes, I think a complete rewrite of xml_parse along those lines would probably be a substantial win (why construct an internal DOM just to confirm that the input is parsable, then throw it away?). But that would be a more involved rewrite that I'm not volunteering to do. This patch is a quick way to get the desired behavior given the current implementation. -Chap [1] https://www.postgresql.org/message-id/5C4A94A5.8010402%40anastigmatix.net [2] https://www.postgresql.org/message-id/5C4BDBFF.6040905%40anastigmatix.net
Вложения
On 2019-01-26 09:31, Peter Eisentraut wrote: > On 25/01/2019 15:37, Chapman Flack wrote: >>> External links already create footnotes in the PDF output. Is that >>> different from what you are saying? >> That might, only, indicate that I was just thinking aloud in email and >> had not gone and checked in PDF output to see how the links were handled. >> >> Yes, it could very possibly indicate that. >> >> If they are already processed that way, does that mean the >> >> o Do not use text with <ulink> so the URL appears in printed output >> >> in README.links should be considered obsolete, and removed even, and >> doc authors should feel free to put link text in <ulink> without >> hesitation? > > I think it's obsolete, yes. Committed that change. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
I have tested bug fixes provided by all the patches. They are working great. I found one minor issue
select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XML PATH '/');
The above query returns the xml. But there is an extra plus symbol at the end
<e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>+
Regards,
Ram
On 03/01/19 07:15, Ramanarayana wrote: > Hi, > I have tested bug fixes provided by all the patches. They are working > great. I found one minor issue > > select * from xmltable('*' PASSING '<e>pre<!--c1--><?pi > arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>' COLUMNS x XML PATH '/'); > > The above query returns the xml. But there is an extra plus symbol at the > end > > <e>pre<!--c1--><?pi arg?><![CDATA[&ent1]]><n2>&deep</n2>post</e>+ Hi, Are you sure that isn't the + added by psql when displaying a value that contains a newline? What happens if you repeat the query but after the psql command \a to leave the output unaligned? Thanks! -Chap
Hi,
Yes it is working fine with \a option in psql.
Cheers
Ram 4.0
Ram 4.0
This CF entry shows Pavel and me as reviewers, but the included patches were also produced by one or the other of us, so additional review by someone who isn't us seems appropriate. :) Would it make sense to remove one or both of us from the 'reviewers' field in the app, to make it more obviously 'available' for reviewing? One of the patches deals only with docs, so even someone who isn't sure about reviewing XML functionality, but takes an interest in documentation, would have a valuable role looking at that. Regards, -Chap
On 3/6/19 6:16 PM, Chapman Flack wrote: > This CF entry shows Pavel and me as reviewers, but the included > patches were also produced by one or the other of us, so additional > review by someone who isn't us seems appropriate. :) > > Would it make sense to remove one or both of us from the 'reviewers' > field in the app, to make it more obviously 'available' for reviewing? I think it makes sense to remove both of you so I have done so. I think it is assumed that co-authors will review each others work, and as you say it will make it harder for you to get additional reviewers. Regards, -- -David david@pgmasters.net
On 2019-Feb-11, Chapman Flack wrote: > Interpreting the crickets as approval, I have changed the title of the > CF entry, and the status back to Needs Review, with these patches > attached: I just pushed this one to pg12 only: > xmltable-xmlexists-passing-mechanisms-3.patch I removed the words "first appears in SQL:2006" from the new doc paragraph; we used to have similar phrases scattered here and there but were removed years ago, on the grounds that only the most recent version is relevant (I couldn't find the commit, though). Mostly left your patch alone otherwise, other than adding some additional mark-up to the doc changes. Thanks! -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 3/7/19 10:08 AM, Alvaro Herrera wrote: > I just pushed this one to pg12 only: >> xmltable-xmlexists-passing-mechanisms-3.patch Thanks! > I removed the words "first appears in SQL:2006" from the new doc > paragraph; we used to have similar phrases scattered here and there but > were removed years ago, on the grounds that only the most recent version > is relevant (I couldn't find the commit, though). In case there are similar qualifiers in the other patches (especially the -docfix one, there are several), while respecting that general principle, I would offer that there might be an argument for bending it some here ... ... one of the main things challenging intuition and expectation in the SQL/XML part of the standard is that there were radical changes to the 'XML' data type and the underlying data model between :2003 and :2006, and most of what's in PostgreSQL was implemented to :2003 and still conforms to that. (Being therefore, already, a bit of an exception to the "only the most recent version is relevant" principle.) So there could be good reason here to call out the distinctions when they matter here, even if in other areas of the doc the practice is not to call them out. -Chap
On 2019-Feb-11, Chapman Flack wrote: > xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected > output for the no-libxml case that was left out of -5. Pushed this one, with some trivial changes: I renamed and relocated Pavel's function to strdup-and-free and fused the new test cases to use less queries for the same functionality. Naturally I had to adjust the expected files ... I tried to do my best but there's always a little something that sneaks under one's nose. Thank you three for your persistence on this! -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Mar-07, Alvaro Herrera wrote: > On 2019-Feb-11, Chapman Flack wrote: > > > xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected > > output for the no-libxml case that was left out of -5. > > Pushed this one, with some trivial changes: I renamed and relocated > Pavel's function to strdup-and-free and fused the new test cases to use > less queries for the same functionality. Naturally I had to adjust the > expected files ... I tried to do my best but there's always a little > something that sneaks under one's nose. So we now have a double-free bug here or something ... Too tired right now to do anything about it. https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02 ================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ================== [New LWP 20275] warning: Can't read pathname for load map: Input/output error. [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1". Core was generated by `postgres: pgbuildfarm regression [local] SELECT '. Program terminated with signal 11, Segmentation fault. #0 0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6 #0 0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6 #1 0x76e28380 in xmlFreeNode () from /usr/lib/arm-linux-gnueabihf/libxml2.so.2 #2 0x00481f94 in xml_xmlnodetoxmltype (cur=<optimized out>, xmlerrcxt=<optimized out>) at xml.c:3751 #3 0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404, typid=2124685192, typmod=0, isnull=0x7ea42117) atxml.c:4540 #4 0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at nodeTableFuncscan.c:489 #5 tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at nodeTableFuncscan.c:318 #6 0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65 #7 0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 <ExecScan+816>, accessMtd=0x26db1c <TableFuncRecheck>, node=0x14a83e0)at execScan.c:93 #8 ExecScan (node=0x14a83e0, accessMtd=0x26db1c <TableFuncRecheck>, recheckMtd=0x23e640 <ExecScan+816>) at execScan.c:143 #9 0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445 #10 0x00235630 in ExecProcNode (node=0x14a83e0) at ../../../src/include/executor/executor.h:241 #11 ExecutePlan (execute_once=<optimized out>, dest=0x14b8d08, direction=<optimized out>, numberTuples=0, sendTuples=<optimizedout>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14a83e0, estate=0x14a82a0)at execMain.c:1643 #12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362 #13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward=<optimized out>, count=0, dest=<optimized out>) at pquery.c:929 #14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x14b8d08, altdest=0x14b8d08,completionTag=0x7ea42414 "") at pquery.c:770 #15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at postgres.c:1215 #16 0x00393e8c in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=0x0, username=<optimized out>) at postgres.c:4256 #17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399 #18 BackendStartup (port=0x13967b8) at postmaster.c:4090 #19 ServerLoop () at postmaster.c:1703 #20 0x0031607c in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1376 #21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228 -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 02/15/19 11:31, Peter Eisentraut wrote: >> On 25/01/2019 15:37, Chapman Flack wrote: >>> If they are already processed that way, does that mean the >>> >>> o Do not use text with <ulink> so the URL appears in printed output >>> >>> in README.links should be considered obsolete, and removed even, and >>> doc authors should feel free to put link text in <ulink> without >>> hesitation? > > Committed that change. I only now noticed that probably this should also have been changed: -o If you want to supply text, use <link>, else <xref> +o If you want to supply text, use <link> or <ulink>, else <xref> Regards, -Chap
Hi
pá 8. 3. 2019 v 3:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-07, Alvaro Herrera wrote:
> On 2019-Feb-11, Chapman Flack wrote:
>
> > xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
> > output for the no-libxml case that was left out of -5.
>
> Pushed this one, with some trivial changes: I renamed and relocated
> Pavel's function to strdup-and-free and fused the new test cases to use
> less queries for the same functionality. Naturally I had to adjust the
> expected files ... I tried to do my best but there's always a little
> something that sneaks under one's nose.
So we now have a double-free bug here or something ... Too tired right
now to do anything about it.
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02
================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
[New LWP 20275]
warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `postgres: pgbuildfarm regression [local] SELECT '.
Program terminated with signal 11, Segmentation fault.
#0 0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#0 0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#1 0x76e28380 in xmlFreeNode () from /usr/lib/arm-linux-gnueabihf/libxml2.so.2
#2 0x00481f94 in xml_xmlnodetoxmltype (cur=<optimized out>, xmlerrcxt=<optimized out>) at xml.c:3751
#3 0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404, typid=2124685192, typmod=0, isnull=0x7ea42117) at xml.c:4540
#4 0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at nodeTableFuncscan.c:489
#5 tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at nodeTableFuncscan.c:318
#6 0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
#7 0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 <ExecScan+816>, accessMtd=0x26db1c <TableFuncRecheck>, node=0x14a83e0) at execScan.c:93
#8 ExecScan (node=0x14a83e0, accessMtd=0x26db1c <TableFuncRecheck>, recheckMtd=0x23e640 <ExecScan+816>) at execScan.c:143
#9 0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
#10 0x00235630 in ExecProcNode (node=0x14a83e0) at ../../../src/include/executor/executor.h:241
#11 ExecutePlan (execute_once=<optimized out>, dest=0x14b8d08, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14a83e0, estate=0x14a82a0) at execMain.c:1643
#12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362
#13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward=<optimized out>, count=0, dest=<optimized out>) at pquery.c:929
#14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x14b8d08, altdest=0x14b8d08, completionTag=0x7ea42414 "") at pquery.c:770
#15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at postgres.c:1215
#16 0x00393e8c in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=0x0, username=<optimized out>) at postgres.c:4256
#17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
#18 BackendStartup (port=0x13967b8) at postmaster.c:4090
#19 ServerLoop () at postmaster.c:1703
#20 0x0031607c in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1376
#21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228
I have not any hypotheses what is reason - maybe we hit some libxml2 error in xmlCopyNode function - now it is called for larger set of node types.
What I see, a error handling in original code was not probably correct. Hard to say if attached patch fix it, but probably it is more correct than now.
Regards
Pavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
pá 8. 3. 2019 v 3:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-07, Alvaro Herrera wrote:
> On 2019-Feb-11, Chapman Flack wrote:
>
> > xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
> > output for the no-libxml case that was left out of -5.
>
> Pushed this one, with some trivial changes: I renamed and relocated
> Pavel's function to strdup-and-free and fused the new test cases to use
> less queries for the same functionality. Naturally I had to adjust the
> expected files ... I tried to do my best but there's always a little
> something that sneaks under one's nose.
So we now have a double-free bug here or something ... Too tired right
now to do anything about it.
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02
================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
[New LWP 20275]
warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `postgres: pgbuildfarm regression [local] SELECT '.
Program terminated with signal 11, Segmentation fault.
#0 0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#0 0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#1 0x76e28380 in xmlFreeNode () from /usr/lib/arm-linux-gnueabihf/libxml2.so.2
#2 0x00481f94 in xml_xmlnodetoxmltype (cur=<optimized out>, xmlerrcxt=<optimized out>) at xml.c:3751
#3 0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404, typid=2124685192, typmod=0, isnull=0x7ea42117) at xml.c:4540
#4 0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at nodeTableFuncscan.c:489
#5 tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at nodeTableFuncscan.c:318
#6 0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
#7 0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 <ExecScan+816>, accessMtd=0x26db1c <TableFuncRecheck>, node=0x14a83e0) at execScan.c:93
#8 ExecScan (node=0x14a83e0, accessMtd=0x26db1c <TableFuncRecheck>, recheckMtd=0x23e640 <ExecScan+816>) at execScan.c:143
#9 0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
#10 0x00235630 in ExecProcNode (node=0x14a83e0) at ../../../src/include/executor/executor.h:241
#11 ExecutePlan (execute_once=<optimized out>, dest=0x14b8d08, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14a83e0, estate=0x14a82a0) at execMain.c:1643
#12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362
#13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward=<optimized out>, count=0, dest=<optimized out>) at pquery.c:929
#14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x14b8d08, altdest=0x14b8d08, completionTag=0x7ea42414 "") at pquery.c:770
#15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at postgres.c:1215
#16 0x00393e8c in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=0x0, username=<optimized out>) at postgres.c:4256
#17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
#18 BackendStartup (port=0x13967b8) at postmaster.c:4090
#19 ServerLoop () at postmaster.c:1703
#20 0x0031607c in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1376
#21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228
I am able to emulate it on 32bit old scientific linux. So I hope I find fix
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pá 8. 3. 2019 v 7:41 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hipá 8. 3. 2019 v 3:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:On 2019-Mar-07, Alvaro Herrera wrote:
> On 2019-Feb-11, Chapman Flack wrote:
>
> > xmltable-xpath-result-processing-bugfix-6.patch includes a regress/expected
> > output for the no-libxml case that was left out of -5.
>
> Pushed this one, with some trivial changes: I renamed and relocated
> Pavel's function to strdup-and-free and fused the new test cases to use
> less queries for the same functionality. Naturally I had to adjust the
> expected files ... I tried to do my best but there's always a little
> something that sneaks under one's nose.
So we now have a double-free bug here or something ... Too tired right
now to do anything about it.
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2019-03-08%2002%3A00%3A02
================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
[New LWP 20275]
warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `postgres: pgbuildfarm regression [local] SELECT '.
Program terminated with signal 11, Segmentation fault.
#0 0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#0 0x76a32e04 in free () from /lib/arm-linux-gnueabihf/libc.so.6
#1 0x76e28380 in xmlFreeNode () from /usr/lib/arm-linux-gnueabihf/libxml2.so.2
#2 0x00481f94 in xml_xmlnodetoxmltype (cur=<optimized out>, xmlerrcxt=<optimized out>) at xml.c:3751
#3 0x004823dc in XmlTableGetValue (state=0x148c370, colnum=1994818404, typid=2124685192, typmod=0, isnull=0x7ea42117) at xml.c:4540
#4 0x0026df60 in tfuncLoadRows (econtext=0x2, tstate=0x14a8578) at nodeTableFuncscan.c:489
#5 tfuncFetchRows (tstate=0x14a8578, econtext=0x2) at nodeTableFuncscan.c:318
#6 0x0026e248 in TableFuncNext (node=0x14a83e0) at nodeTableFuncscan.c:65
#7 0x0023e640 in ExecScanFetch (recheckMtd=0x23e640 <ExecScan+816>, accessMtd=0x26db1c <TableFuncRecheck>, node=0x14a83e0) at execScan.c:93
#8 ExecScan (node=0x14a83e0, accessMtd=0x26db1c <TableFuncRecheck>, recheckMtd=0x23e640 <ExecScan+816>) at execScan.c:143
#9 0x0023c638 in ExecProcNodeFirst (node=0x14a83e0) at execProcnode.c:445
#10 0x00235630 in ExecProcNode (node=0x14a83e0) at ../../../src/include/executor/executor.h:241
#11 ExecutePlan (execute_once=<optimized out>, dest=0x14b8d08, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x14a83e0, estate=0x14a82a0) at execMain.c:1643
#12 standard_ExecutorRun (queryDesc=0x142c0e0, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362
#13 0x003955f4 in PortalRunSelect (portal=0x13e3f48, forward=<optimized out>, count=0, dest=<optimized out>) at pquery.c:929
#14 0x00396a0c in PortalRun (portal=0x0, count=0, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x14b8d08, altdest=0x14b8d08, completionTag=0x7ea42414 "") at pquery.c:770
#15 0x003923c8 in exec_simple_query (query_string=0x7ea42414 "") at postgres.c:1215
#16 0x00393e8c in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=0x0, username=<optimized out>) at postgres.c:4256
#17 0x000849a8 in BackendRun (port=0x13967b8) at postmaster.c:4399
#18 BackendStartup (port=0x13967b8) at postmaster.c:4090
#19 ServerLoop () at postmaster.c:1703
#20 0x0031607c in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1376
#21 0x000864d4 in main (argc=7301080, argv=0x8) at main.c:228
looks like error in xmlXPathCompiledEval function, that produce little bit broken result for XML_DOCUMENT_NODE type. I hadn't this problem with libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
Currently I had not fresh 32 bit system to check it.
I found a workaround - in this case copy (and release xmlNode) is not necessary.
please, apply attached patch.
Pavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On 2019-Mar-08, Pavel Stehule wrote: > looks like error in xmlXPathCompiledEval function, that produce little bit > broken result for XML_DOCUMENT_NODE type. I hadn't this problem with > libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit. > > Currently I had not fresh 32 bit system to check it. > > I found a workaround - in this case copy (and release xmlNode) is not > necessary. > > please, apply attached patch. Wow :-( At this point I'm wondering if this should be put in back branches as well ... I mean, distill part of commit 251cf2e27bec that doesn't affect the behavior of text nodes, and put it on all branches together with your fix? Another thought: should we refuse to work on known-broken libxml2 versions? Seems like this bug could affect other parts of code too -- I see that xmlXPathCompiledEval() is called in file places (including two in contrib/xml2). Third thought: an alternative might be to create a wrapper for xmlXPathCompiledEval that detects NULL content and fills in a pointer that xmlFreeNode can free. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pá 8. 3. 2019 v 13:20 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-08, Pavel Stehule wrote:
> looks like error in xmlXPathCompiledEval function, that produce little bit
> broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
> libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
>
> Currently I had not fresh 32 bit system to check it.
>
> I found a workaround - in this case copy (and release xmlNode) is not
> necessary.
>
> please, apply attached patch.
Wow :-( At this point I'm wondering if this should be put in back
branches as well ... I mean, distill part of commit 251cf2e27bec that
doesn't affect the behavior of text nodes, and put it on all branches
together with your fix?
The problem is just for case result: XML_DOCUMENT_TYPE, target XML. For this case the previously used transformation doesn't work.
Is not problem to detect this situation. The content field has -1 instead 0.
Originally there was inverted logic, so xmlCopyNode and xmlFreeNode was not used for XML_DOCUMENT_TYPE, and then we didn't hit this bug.
Another thought: should we refuse to work on known-broken libxml2
versions? Seems like this bug could affect other parts of code too -- I
see that xmlXPathCompiledEval() is called in file places (including two
in contrib/xml2).
Third thought: an alternative might be to create a wrapper for
xmlXPathCompiledEval that detects NULL content and fills in a pointer
that xmlFreeNode can free.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-03-08 05:04, Chapman Flack wrote: > I only now noticed that probably this should also have been changed: > > -o If you want to supply text, use <link>, else <xref> > +o If you want to supply text, use <link> or <ulink>, else <xref> The choice of <link> vs <xref> is for internal links. For external links you have to use <ulink>. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Mar-08, Pavel Stehule wrote: > looks like error in xmlXPathCompiledEval function, that produce little bit > broken result for XML_DOCUMENT_NODE type. I hadn't this problem with > libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit. > > Currently I had not fresh 32 bit system to check it. > > I found a workaround - in this case copy (and release xmlNode) is not > necessary. Hmm ... going over the libxml2 2.7.6 source, I noticed that xmlFreeNodeList seem to get this right -- it uses xmlFreeDoc for XML_DOCUMENT_NODE. Maybe a sufficient answer is to change the xmlFreeNode there to xmlFreeNodeList. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 3/8/19 7:38 AM, Peter Eisentraut wrote: > On 2019-03-08 05:04, Chapman Flack wrote: >> -o If you want to supply text, use <link>, else <xref> >> +o If you want to supply text, use <link> or <ulink>, else <xref> > > The choice of <link> vs <xref> is for internal links. For external > links you have to use <ulink>. Understood, but I was thinking of the unintended message possibly taken by a fast reader who wants to create an external link but also wants to supply text. "I want to supply text! Is ulink not an option?" Perhaps: o For an internal link, use <link> if you will supply text, else <xref> o For an external link, use <ulink> with or without link text -Chap
pá 8. 3. 2019 v 15:31 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-08, Pavel Stehule wrote:
> looks like error in xmlXPathCompiledEval function, that produce little bit
> broken result for XML_DOCUMENT_NODE type. I hadn't this problem with
> libxml2 2.7.6 64bit, but I seen this issue on same version on 32bit.
>
> Currently I had not fresh 32 bit system to check it.
>
> I found a workaround - in this case copy (and release xmlNode) is not
> necessary.
Hmm ... going over the libxml2 2.7.6 source, I noticed that
xmlFreeNodeList seem to get this right -- it uses xmlFreeDoc for
XML_DOCUMENT_NODE. Maybe a sufficient answer is to change the
xmlFreeNode there to xmlFreeNodeList.
It fixes current issue, but I afraid so these two routines are not replaceable. xmlFreeNodeList doesn't release xmlFreeDtd, XML_ATTRIBUTE_NODE is not checked.
You can see, from xmlNodeGetContent, XML_DOCUMENT_NODE type should to ignore content value, and newer returns this value. Other interesting is xmlXPathOrderDocElems where content is used for counting, and probably from there is -1.
Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
some like
if (cur_copy->type == XML_DOCUMENT_NODE)
xmlFreeDoc((xmlDocPtr) cur_copy);
else
xmlFreeNode(cur_copy);
This looks most correct fix for me. What do you think?
Pavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
(I spent some time trying to reproduce the original bug, but was interrupted for lunch before getting a useful installation. I find it a bit strange that it doesn't crash in x86_64, mind ...) On 2019-Mar-08, Pavel Stehule wrote: > It fixes current issue, but I afraid so these two routines are not > replaceable. xmlFreeNodeList doesn't release xmlFreeDtd, XML_ATTRIBUTE_NODE > is not checked. :-( > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode > > some like > > if (cur_copy->type == XML_DOCUMENT_NODE) > xmlFreeDoc((xmlDocPtr) cur_copy); > else > xmlFreeNode(cur_copy); > > This looks most correct fix for me. What do you think? Seems like that should work, yeah ... -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Mar-08, Alvaro Herrera wrote: > > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode > > > > some like > > > > if (cur_copy->type == XML_DOCUMENT_NODE) > > xmlFreeDoc((xmlDocPtr) cur_copy); > > else > > xmlFreeNode(cur_copy); > > > > This looks most correct fix for me. What do you think? > > Seems like that should work, yeah ... Something like this perhaps? Less repetitive ... -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
pá 8. 3. 2019 v 19:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-08, Alvaro Herrera wrote:
> > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> >
> > some like
> >
> > if (cur_copy->type == XML_DOCUMENT_NODE)
> > xmlFreeDoc((xmlDocPtr) cur_copy);
> > else
> > xmlFreeNode(cur_copy);
> >
> > This looks most correct fix for me. What do you think?
>
> Seems like that should work, yeah ...
Something like this perhaps? Less repetitive ...
+1
Pavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pá 8. 3. 2019 v 19:48 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
pá 8. 3. 2019 v 19:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:On 2019-Mar-08, Alvaro Herrera wrote:
> > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> >
> > some like
> >
> > if (cur_copy->type == XML_DOCUMENT_NODE)
> > xmlFreeDoc((xmlDocPtr) cur_copy);
> > else
> > xmlFreeNode(cur_copy);
> >
> > This looks most correct fix for me. What do you think?
>
> Seems like that should work, yeah ...
Something like this perhaps? Less repetitive ...
Thank you for commit.
the commit message is not correct. xmlCopyNodes does owns works well, but the node is broken already, and because we should to call xmlFreeNode, we have a problem.
regards
Pavel
+1Pavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi
pá 8. 3. 2019 v 19:44 odesílatel Alvaro Herrera <alvherre@2ndquadrant.com> napsal:
On 2019-Mar-08, Alvaro Herrera wrote:
> > Maybe we can call explicitly xmlFreeDoc instead xmlFreeNode
> >
> > some like
> >
> > if (cur_copy->type == XML_DOCUMENT_NODE)
> > xmlFreeDoc((xmlDocPtr) cur_copy);
> > else
> > xmlFreeNode(cur_copy);
> >
> > This looks most correct fix for me. What do you think?
>
> Seems like that should work, yeah ...
Something like this perhaps? Less repetitive ...
I looking to code
void (*nodefree) (xmlNodePtr) = NULL;
volatile xmlBufferPtr buf = NULL;
volatile xmlBufferPtr buf = NULL;
should not be "nodefree" volatile too?
Pavel
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Mar-14, Pavel Stehule wrote: > I looking to code > > void (*nodefree) (xmlNodePtr) = NULL; > volatile xmlBufferPtr buf = NULL; > > should not be "nodefree" volatile too? Ah, good question. I remember I had it volatile and removed it for some reason, though I don't remember why. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-03-08 15:38, Chapman Flack wrote: > Perhaps: > > o For an internal link, use <link> if you will supply text, else <xref> > o For an external link, use <ulink> with or without link text I have pushed an update to this effect. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-Oct-24, Chapman Flack wrote: > Inspired by the wiki page on PostgreSQL vs SQL Standard in general, > I have made another wiki page specifically about $subject. I hope > this was not presumptuous, and invite review / comment. I have not > linked to it from any other page yet. In the SQL Standards session at the Unconference, I found out that the committee produced this technical report in 2011: https://www.iso.org/standard/41528.html "XQuery Regular Expression Support in SQL"; it furthers our lack of support for XQuery in our implementation SQL/XML. That content is probably relevant for this topic, even if we cannot do much about it. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 05/31/19 16:04, Alvaro Herrera wrote: > https://www.iso.org/standard/41528.html "XQuery Regular Expression > Support in SQL" Although I hadn't seen that particular document, I did see those in the SQL spec and mention them in the wiki page [1]. I should point out that's also a conformance note in the new SQL/JSON support [2], as the standard specifies XQuery regular expressions for jsonpath's like_regex predicate, where we're applying POSIX ones instead. Doesn't change my thinking much ... I think we should support the stuff. :) ... using a non-in-house-developed XQuery library to do it, or better yet, a choice of non-in-house-developed XQuery libraries. I still think the second idea I suggested on the wiki seems like the most promising road map [3], though it'll involve a few preparatory stops along the way. -Chap [1] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#XML_Query_regular_expressions [2] https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md#sqljson-conformance [3] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Proposal_2:_desugaring_to_calls_on_normal_extension_functions