Re: XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated withwrong context

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated withwrong context
Дата
Msg-id 20180621202730.i6esu5wchi475suk@alvherre.pgsql
обсуждение исходный текст
Ответ на Re: XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated with wrong context  (Markus Winand <markus.winand@winand.at>)
Ответы Re: XML/XPath issues: text/CDATA in XMLTABLE, XPath evaluated with wrong context  (Markus Winand <markus.winand@winand.at>)
Список pgsql-hackers
I have pushed the patch now (in your original form rather than my later
formulation) -- let's see how the buildfarm likes it.  There are (at
least) three issues remaining, as per below; Pavel, do you have any
insight on these?

First one is about array indexes not working sanely (I couldn't get this
to work in Oracle)

> > Also, array indexes behave funny.  First let's add more XML comments
> > inside that number, and query for the subscripts:
> > 
> > update xmldata set data = regexp_replace(data::text, '7<!--small country-->91',
'<!--ah-->7<!--oh-->9<!--uh-->1')::xml;
> > 
> > SELECT  xmltable.*
> >   FROM (SELECT data FROM xmldata) x,
> >        LATERAL XMLTABLE('/ROWS/ROW'
> >                         PASSING data
> >                         COLUMNS
> >                                  country_name text PATH 'COUNTRY_NAME/text()' NOT NULL,
> >                                  size_text float PATH 'SIZE/text()',
> >                                  size_text_1 float PATH 'SIZE/text()[1]',
> >                                  size_text_2 float PATH 'SIZE/text()[2]',
> >                                  "SIZE" float, size_xml xml PATH 'SIZE')
> > where size_text is not null;
> > 
> > country_name │ size_text │ size_text_1 │ size_text_2 │ size_text_3 │ SIZE │                       size_xml
             
 
> >
──────────────┼───────────┼─────────────┼─────────────┼─────────────┼──────┼───────────────────────────────────────────────────────
> > Singapore    │       791 │         791 │          91 │           1 │  791 │ <SIZE
unit="km"><!--ah-->7<!--oh-->9<!--uh-->1</SIZE>
> > (1 fila)

The second one is about (lack of!) processing instructions and comments:

> Also, node() matching comments or processing instructions
> seems to be broken too:
> 
> SELECT *
>  FROM (VALUES ('<xml><!--comment--></xml>'::xml)
>             , ('<xml><?pi content?></xml>'::xml)
>       ) d(x)
>  CROSS JOIN LATERAL
>        XMLTABLE('/xml'
>                 PASSING x
>                 COLUMNS "node()" TEXT PATH 'node()'
>                ) t
> 
>              x             | node()
> ---------------------------+--------
>  <xml><!--comment--></xml> |
>  <xml><?pi content?></xml> |
> (2 rows)
> 
> I can look into this, but it may take a while.

Compare the empty second columns with oracle behavior, which returns the
contents of the PI and the comment.  As a script for
http://rextester.com/l/oracle_online_compiler

create table xmltb (data xmltype) \\
insert into xmltb values ('<xml><!--the comment is here--></xml>') \\
insert into xmltb values ('<xml><?pi php_stuff(); do_stuff("hello"); ?></xml>') \\
SELECT *  FROM xmltb, XMLTABLE('/xml' PASSING data COLUMNS node varchar2(100) PATH 'node()') t \\
drop table xmltb \\


The third issue is the way we output comments when they're in a column
of type XML:

> > Note what happens if I change the type from text to xml in that
> > column:
> > 
> > SELECT *
> >  FROM (VALUES ('<xml>te<!-- ahoy -->xt</xml>'::xml)
> >             , ('<xml><![CDATA[some <!-- really --> weird <stuff>]]></xml>'::xml)
> >       ) d(x)
> >  CROSS JOIN LATERAL
> >        XMLTABLE('/xml'
> >                 PASSING x
> >                 COLUMNS "node()" xml PATH 'node()'
> >                ) t;
> > 
> >                             x                             │                     node()                     
> > ───────────────────────────────────────────────────────────┼────────────────────────────────────────────────
> > <xml>te<!-- ahoy -->xt</xml>                              │ te ahoy xt
> > <xml><![CDATA[some <!-- really --> weird <stuff>]]></xml> │ some <!-- really --> weird <stuff>
> > (2 filas)
> 
> The comment seems to be wrong.
> 
> I guess it’s fine if the CDATA gets transformed in to an equivalent
> string using the XML entities. Yet, it might be better avoiding it.


-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeff Davis
Дата:
Сообщение: Re: Spilling hashed SetOps and aggregates to disk
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Fast default stuff versus pg_upgrade