XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated with wrong context

Поиск
Список
Период
Сортировка
От Markus Winand
Тема XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated with wrong context
Дата
Msg-id 0684A598-002C-42A2-AE12-F024A324EAE4@winand.at
обсуждение исходный текст
Ответы Re: XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated withwrong context  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated withwrong context  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated withwrong context  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
Hi!

I’ve found two issues with XML/XPath integration in PostgreSQL. Two patches are attached. I’ve just separated them
becausethe second one is an incompatible change. 

* Patch 0001: Accept TEXT and CDATA nodes in XMLTABLE’s column_expression.

> Column_expressions that match TEXT or CDATA nodes must return
> the contents of the node itself, not the content of the
> non-existing childs (i.e. the empty string).

The following query returns a wrong result in master:

SELECT *
  FROM (VALUES ('<xml>text</xml>'::xml)
             , ('<xml><![CDATA[<cdata>]]></xml>'::xml)
       ) d(x)
  CROSS JOIN LATERAL
        XMLTABLE('/xml'
                 PASSING x
                 COLUMNS "node()" TEXT PATH 'node()'
                ) t

               x                | node()
--------------------------------+--------
 <xml>text</xml>                |
 <xml><![CDATA[<cdata>]]></xml> |

(the “node()” column is empty)

The correct result can be obtained with patch 0001 applied:

               x                | node()
--------------------------------+---------
 <xml>text</xml>                | text
 <xml><![CDATA[<cdata>]]></xml> | <cdata>

The patch adopts existing tests to guard against future regressions.

* Patch 0002: Set correct context for XPath evaluation.

> According to the ISO standard, the context of XMLTABLE's XPath
> row_expression is the document node of the XML[1] - not the root node.
>
> The respective code is shared with non-ISO functions such as xpath
> and xpath_exists so that the change also affects these functions.

It’s an incompatible change - even the regression tests need to be adjusted because they
test for the “wrong” behaviour. The documentation, on the other hand, does neither
document the behaviour explicitly, no does it show any example that breaks due to this patch.

The alternatives to this patch are (1) documenting the current standard-breaking behaviour
and (2) fixing the context only for ISO functions. Personally, I don’t like either of them.

I’ve checked the libxml2 code to see what’s the proper way to set the context to the
document node. It’s indeed just “(xmlNodePtr)ctxt->doc”.

See: https://github.com/GNOME/libxml2/blob/7abec671473b837f99181442d59edd0cc2ee01d1/xpath.c#L14306

-markus

--
[1] SQL/XML:2011, 6.18 General Rule 1aii2.










Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Eren Başak
Дата:
Сообщение: Re: [HACKERS] Optional message to user when terminating/cancelling backend
Следующее
От: Masahiko Sawada
Дата:
Сообщение: pg_class.reltuples of brin indexes