Re: [HACKERS] proposal - Default namespaces for XPath expressions(PostgreSQL 11)

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: [HACKERS] proposal - Default namespaces for XPath expressions(PostgreSQL 11)
Дата
Msg-id 20170925.202555.167747783.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)  (Pavel Stehule <pavel.stehule@gmail.com>)
Ответы Re: [HACKERS] proposal - Default namespaces for XPath expressions(PostgreSQL 11)  (Pavel Stehule <pavel.stehule@gmail.com>)
Список pgsql-hackers
Hello, this patch have been ignored for a long time since its proposal...

At Sat, 11 Mar 2017 20:44:31 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
<CAFj8pRB+WDyDcZyGmfRdJ0HOoXugeaL-KNFeK9YA5Z10JN9qfA@mail.gmail.com>
> Hi
> 
> This proposal is followup of implementation of XMLTABLE.
> 
> Lot of XML documents has assigned document namespace.
> 
> <rows xmlns="http://x.y"><row><a>10</a></row></rows>
> 
> For these XML document any search path must use schema "http://x.y". This
> is not too intuitive, and from XMLTABLE usage is not too user friendly,
> because the default column path (same like column name) cannot be used. A
> solution of this issue is default namespace - defined in SQL/XML.
> 
> example - related to previous xml
> 
> without default namespace:
> XMLTABLE(NAMESPACES('http://x.y' AS aux),
>                     '/aux:rows/aux:row' PASSING ...
>                     COLUMNS a int PATH 'aux:a')
> 
> with default namespace
> XMLTABLE(NAMESPACES(DEFAULT 'http://x.y'),
>                     '/rows/row' PASSING ...
>                     COLUMNS a int);
> 
> 
> Unfortunately the libxml2 doesn't support default namespaces in XPath
> expressions. Because the libxml2 functionality is frozen, there is not big
> chance for support in near future. A implementation is not too hard -
> although it requires simple XPath expressions state translator.
> 
> The databases with XMLTABLE implementation supports default namespace for
> XPath expressions.
> 
> The patch for initial implementation is attached.

The original message is a bit less informative for those who
wants to review this but are not accustomed (like me) to this
area. I try to augment this with a bit more information. (Perhaps)

An example of this issue can be as follows.

create table t1 (id int, doc xml);
insert into t1  values (1, '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'), (2, '<rows
xmlns="http://x.y"><row><a>20</a></row></rows>'),(3, '<rows xmlns="http://x.y"><row><a>30</a></row></rows>'), (4,
'<rowsxmlns="http://x.y"><row><a>40</a></row></rows>');
 
select x.* from t1, xmltable('/rows/row' passing t1.doc columns data int PATH 'a') as x;
|  data 
| ------
| (0 rows)
select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' as n), '/n:rows/n:row' passing t1.doc columns data int PATH
'n:a')as x;
 
|  data 
| ------
|    10
|    20
|    30
|    40
| (4 rows)

But, currently the follwing command fails with error.

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' passing t1.doc columns data int PATH 'a')
asx;
 
| ERROR:  DEFAULT namespace is not supported

This patch let PostgreSQL allow this syntax by transforming xpath
string when DEFAULT namespace is defined.

=======================
I have some review comments.

This patch still applies with shifts and works as expected.

1. Uniformity among simliar features
 As mentioned in the proposal, but it is lack of uniformity that the xpath transformer is applied only to xmltable and
notfor other xpath related functions.
 


2. XML comformance
  I'm not yet sure this works for the all extent of the  available syntax but at least fails for the following
expression.

(delete from t1;)
insert into t1   values  (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' passing t1.doc columns data int PATH
'a[1][@hoge]')as x;
 
>  data 
> ------
>      
> (1 row)

 The following expression works.

select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' as x), '/x:rows/x:row' passing t1.doc columns data int PATH
'x:a[1][@hoge]')as x;
 
>  data 
> ------
>    50
> (1 row)
 The w3c says as follows.
 https://www.w3.org/TR/xml-names/#defaulting > The namespace name for an unprefixed attribute name always has no value.

 
 We currently don't have a means to make sure that this works correctly for the whole extent. More regression test
helps?
 

3. The predefined prefix for default namespace
 The patch defines the name of the defaut namespace as "pgdefnamespace". If a default namespace is defined, a namespace
ismade with the name and with_default_ns is true. If a namespace with the name is defined, a namespace is made also
withthe same name but with_default_ns is false. This causes a confused behavior.
 

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.x', 'http://x.y' as pgdefnamespace), '/rows/row' passing
t1.doccolumns data int PATH 'a') as x;
 
|  data 
| ------
|    10
|    20
|    30
|    40
| (4 rows)
The reason for the design is the fact that xmlXPathRegisterNsdoesn't accept NULL or empty string as a namespace prefix
anditonly accepts a string consists of valid XML caharacters.
 
Even if we are to live with such restriction and such name ishardly used, a namespace prefix with the name should
berejected.


4. A mistake in the documentaion ?
The documentaion says about the XMLNAMESPACES clause as thefolows.

https://www.postgresql.org/docs/10/static/functions-xml.html
> The optional XMLNAMESPACES clause is a comma-separated list of
> namespaces. It specifies the XML namespaces used in the document
> and their aliases.

As far as looking into XmlTableSetNamespace, (and if I read the
documentation correctly) the defined namespaces are not applied
on documents, but expressions. This patch is not to blame for
this. So this will be another patch backbatchable to Pg10.

| The optional XMLNAMESPACES clause is a comma-separated list of
| namespaces. It specifies the XML namespaces used in the
| row_expression.



regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] Shaky coding for vacuuming partitioned relations
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: [HACKERS] logical replication and statistics