Re: XMLATTRIBUTES vs. values of type XML

From Florian Pflug
Subject Re: XMLATTRIBUTES vs. values of type XML
Date
Msg-id 22F5933F-3D17-4556-8920-0AFD074859EB@phlo.org
In reply to Re: XMLATTRIBUTES vs. values of type XML  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
On Aug11, 2011, at 09:16 , Peter Eisentraut wrote:
> On fre, 2011-07-29 at 11:37 +0200, Florian Pflug wrote:
>> On Jul28, 2011, at 22:51 , Peter Eisentraut wrote:
>>> On ons, 2011-07-27 at 23:21 +0200, Florian Pflug wrote:
>>>> On Jul27, 2011, at 23:08 , Peter Eisentraut wrote:
>>>>> Well, offhand I would expect that passing an XML value to XMLATTRIBUTES
>>>>> would behave as in
>>>>>
>>>>> SELECT XMLELEMENT(NAME "t", XMLATTRIBUTES(XMLSERIALIZE(content '&amp;'::XML AS text) AS "a"))
>>>>
>>>> With both 9.1 and 9.2 this query returns
>>>>
>>>>    xmlelement
>>>> --------------------
>>>> <t a="&amp;amp;"/>
>>>>
>>>> i.e. makes the value of "a" represent the *literal* string '&amp;', *not*
>>>> the literal string '&'. Just to be sure there's no misunderstanding here
>>>> - is this what you expect?
>>>
>>> Well, I expect it to fail.
>>
>> Now you've lost me. What exactly should fail under what circumstances?
>
> To me, the best solution still appears to be forbidding passing values
> of type xml to XMLATTRIBUTES, unless we find an obviously better
> solution that is not, "I came up with this custom escape function that I
> tweaked so that it appears to make sense".

Hm, OK, I see your point. However, if we simply raise an error in 9.2
and do nothing else, then we make it impossible to use the result of
an XPath expression as an XML attribute value. Not just inconvenient,
but impossible, so I don't think we can do that. We'd thus need to add a
function
 XMLUNESCAPE(XML) RETURNS TEXT

to restore that functionality. Defining sane behaviour for such a function,
however, seems no easier than defining sane behaviour for an XML attribute
value that is already of type XML. The core of the problem remains defining
the result of XMLUNESCAPE('<tag>content</tag>'), just as the core of the
XMLATTRIBUTES problem is defining
XMLELEMENT(... XMLATTRIBUTES('<tag>content</tag>' as a)).

Thinking about this further, it seems that we essentially have two distinct
classes of XML values. Some are essentially plain text, but might contain
entity references, while others are "real" XML fragments which contain at
least one tag. That suggests that a sensible behaviour for XMLUNESCAPE might
be to return a string with the entity references resolved in the former case,
and simply return an error in the latter.

To summarize, we'd have
 XMLUNESCAPE(''::XML) -> 'a'
 XMLUNESCAPE('a'::XML) -> 'a'
 XMLUNESCAPE('&lt;'::XML) -> '<'
 XMLUNESCAPE('<t/>'::XML) -> error
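
A minimal sketch of these semantics, written in Python rather than as
backend code, may make the two cases easier to see. The name xml_unescape
and the wrapper-element trick are assumptions made for illustration only;
nothing like this exists in PostgreSQL, and Python's ElementTree parser
stands in for libxml2 here.

```python
import xml.etree.ElementTree as ET


def xml_unescape(value: str) -> str:
    """Hypothetical model of XMLUNESCAPE(XML) RETURNS TEXT.

    Returns the text with entity references resolved, or raises
    ValueError if the value contains at least one tag.
    """
    # Wrap the content fragment so it parses as a single document.
    root = ET.fromstring("<wrapper>" + value + "</wrapper>")
    if len(root) > 0:
        # Child elements mean this is a "real" XML fragment.
        raise ValueError("XML value contains element nodes")
    # root.text is None for an empty fragment.
    return root.text or ""
```

With this definition, plain text passes through unchanged, '&lt;' becomes
'<', and '<t/>' raises an error. Malformed input (for example a bare '&')
is rejected by the parser itself before the tag check runs.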

To not break applications needlessly, I'd then be inclined to make
 XMLATTRIBUTES(xml_value as "a")

mean
 XMLATTRIBUTES(XMLUNESCAPE(xml_value) as "a")

i.e. throw an error if xml_value contains anything but plain text and
entity references. But I could probably also live with not doing that.
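
To illustrate what that rule would mean for attribute output, here is a
rough Python model. The function render_attribute and its is_xml flag are
hypothetical names used only for this sketch; the real XMLATTRIBUTES code
path works on typed datums, not on strings plus a flag.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def render_attribute(name: str, value: str, is_xml: bool) -> str:
    """Model of XMLATTRIBUTES(value AS name) under the proposed rule."""
    if is_xml:
        # Values of type XML are unescaped first; tags are an error.
        root = ET.fromstring("<wrapper>" + value + "</wrapper>")
        if len(root) > 0:
            raise ValueError("XML value contains element nodes")
        value = root.text or ""
    # Text is then escaped exactly once when written as an attribute.
    return name + "=" + quoteattr(value)


print(render_attribute("a", "&amp;", is_xml=True))   # a="&amp;"
print(render_attribute("a", "&amp;", is_xml=False))  # a="&amp;amp;"
```

The first call treats the input as XML and so avoids double escaping; the
second treats the same characters as plain text and escapes the ampersand
again, which matches what XMLSERIALIZE followed by XMLATTRIBUTES does today.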

>>> Unfortunately, in the latest SQL/XML standard the final
>>> answer is nested deep in three other standards, so I don't have an
>>> answer right now.  But there are plenty of standards in this area, so
>>> I'd hope that one of them can give us the right behavior, instead of us
>>> making something up.
>>
>> Which standards do you have in mind there? If you can point me to a place
>> where I can obtain them, I could check if there's something in them
>> which helps.
>
> In SQL/XML 2008, the actual behavior of XMLSERIALIZE is delegated to
> "XSLT 2.0 and XQuery 1.0 Serialization".  I'm not familiar with this
> latter standard, but it appears to have lots of options and parameters,
> one of which might help us.

I'll try to obtain a copy of that. Thanks.

best regards,
Florian Pflug


