Re: patch: function xmltable

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: patch: function xmltable
Дата
Msg-id CAFj8pRA=BjeNe5+Ad9WgHZ4tW=5vVvLb2qb8Og54+ye1n++E8w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: patch: function xmltable  (Craig Ringer <craig@2ndquadrant.com>)
Ответы Re: patch: function xmltable  (Craig Ringer <craig@2ndquadrant.com>)
Список pgsql-hackers


2016-09-12 6:28 GMT+02:00 Craig Ringer <craig@2ndquadrant.com>:
> I'll take a closer read-through shortly.


Missing file. You omitted executor/tableexpr.h from the patch, so I
can't compile.

I've expanded and copy-edited the docs. Some of it is guesswork based
on the references you sent and a glance at the code. Please check my
changes carefully. I found a few surprises, like the fact that DEFAULT
isn't a normal literal, it's an xpath expression evaluated at the same
time as the rowexpression.

Updated patch attached as XMLTABLE-v3 includes the docs changes. Note
that it's missing tableexpr.h. For convenient review or to apply to
your working tree I also attach a diff of just my docs changes as
proposed-docs-changes.diff.


Docs:

- Can you send the sample data used to generate the example output?
I'd like to include at least a cut down part of it in the docs to make
it clear how the input correlates with output, and preferably put the
whole thing in an appendix.

it is in regress tests.
 

- How does it decide what subset of the document to iterate over?
That's presumably rowexpr, which is xpath in postgresql? (I added this
to docs).

- xmlnamespaces clause in docs needs an example for a non-default namespace.

- What effect does xmlnamespaces clause have? Does supplying it allow
you to reference qualified names in xpath? What happens if you don't
specify it for a document that has namespaces or don't define all the
namespaces? What if you reference an undefined namespace in xpath?
What about if an undefined namespace isn't referenced by xpath, but is
inside a node selected by an xpath expression?

All this is under libxml2 control - when you use undefined namespace, then libxml2 raises a error. The namespaces in document and in XPath queries are absolutely independent - the relation is a URI. When you use bad URI (referenced by name), then the result will be empty set. When you use undefined name, then you will get a error.
 

- What are the rules for converting the matched XML node into a value?
If the matched node is not a simple text node or lacks a text node as
its single child, what happens?

This process is described and controlled by "XML SQL mapping". The Postgres has minimalistic implementation without possibility of external control and without schema support. The my implementation is simple. When user doesn't specify result target like explicit using of text() function, then the text() function is used implicitly when target type is not XML. Then I dump result to string and I enforce related input functions for target types.

 

- What happens if the matched has multiple text node children? This
can happen if, for example, you have something like

    <matchedNode>
      some text <!-- comment splits up text node --> other text
    </matchedNode>

depends on target type - it is allowed in XML, and it is disallowed for other types. I though about support of a arrays - but the patch will be much more complex - there can be recursion - so I disallowed it. When the user have to solve this issue, then he can use nested XMLTABLE functions and nested function is working with XML type.

Just for record - This issue is solved in JSON_TABLE functions - it allows nested PATHS. But XMLTABLE doesn't allow it.
 

- Is there a way to get an attribute as a value? If so, an example
should show this because it's going to be a common need. Presumably
you want   node/@attrname ?

you can use reference to current node "." - so "./@attname" should to work - a example is in regress tests

 

- What happens if COLUMNS is not specified at all? It looks like it
returns a single column result set with the matched entries as 'xml'
type, so added to docs, please verify.

sure, that is it
 


- PASSING clause isn't really defined. You can specify one PASSING
entry as a literal/colref/expression, and it's the argument xml
document, right? The external docs you referred to say that PASSING
may have a BY VALUE keyword, alias its argument with AS, and may have
expressions, e.g.

    PASSING BY VALUE '<x/>' AS a, '<y/>' AS b

  Neither aliases nor multiple entries are supported by the code or
grammar. Should this be documented as a restriction? Do you know if
that's an extension by the other implementation or if it's SQL/XML
standard? (I've drafted a docs update to cover this in the updated
patch).

The ANSI allows to pass more documents - and then do complex queries with XQuery. Passing more than one document has not sense in libxml2 based implementation, so I didn't supported it. The referenced names can be implemented later - but it needs to changes in XPATH function too.
 


- What does BY REF mean? Should this just be mentioned with a "see
xmlexists(...)" since it seems to be compatibility noise? Is there a
corresponding BY VALUE or similar?

When the XML document is stored as serialized DOM, then by ref means link on this DOM. It has not sense in Postgres - because we store XML documents by value only.
 


- The parser definitions re-use xmlexists_argument . I don't mind
that, but it's worth noting here in case others do.

It is one clause - see SQL/XML doc PASSING <XML table argument passing mechanism>



- Why do the expression arguments take c_expr (anything allowed in
a_expr or b_expr), not b_expr (restricted expression) ?

I don't know - I expect the problems with parser - because PASSING is restricted keyword in ANSI/SQL and unreserved keyword in Postgres.
 


- Column definitions are underdocumented. The grammar says they can be
NOT NULL, for example, but I don't see that in any of the references
you mailed me nor in the docs. What behaviour is expected for a NOT
NULL column? I've documented my best guess (not checked closely
against the code, please verify).


yes - some other databases allows it - I am thinking so it is useful.
 
-




Test suggestions:

- Coverage of multiple text() node children of an element, where split
up by comment or similar

- Coverage of xpath that matches a node with child element nodes

I'll do it.
 



More to come. Please review my docs changes in the mean time. I'm
spending a lot more time on this than I expected so I might have to
get onto other things for a while too.



--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: patch: function xmltable
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: patch: function xmltable