Peter Eisentraut wrote:
> On Sun, 2010-03-21 at 13:07 -0400, Andrew Dunstan wrote:
>
>> Yeah, maybe. According to
>> <http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html> the only
>> legal child of an XML Document node that is not also a legal child of a
>> DocumentFragment node is a DocumentType node. So we could probably just
>> look for one of those in each argument node and strip it out. That
>> should be fairly lightweight in the common case where it's not present -
>> we'd just be searching for a fixed string. Removing it if found would be
>> more complex. We'd have to parse the node to remove it, since a string
>> that looks like a DocumentType node could legally appear inside a CDATA
>> section.
>>
>
> According to the SQL/XML standard, the document type declaration should
> apparently be stripped when doing a concatenation. (This makes sense
> because the result of a concatenation can never be valid according to a
> DTD.)
>
> But if we are not comfortable about being able to do that safely, I
> would be OK with just raising an error if a concatenation is attempted
> where one value contains a DTD. The impact in practice should be low.
>
Right. Can you find a way to detect a DTD using the libxml API? I haven't
managed to, and I'm pretty sure I can construct XML that defeats every
simple string search test I can think of, with either a false negative
or a false positive.
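
To make the false-positive case concrete, here is a minimal sketch in
Python. It is only an illustration, not libxml code and not a proposed
patch: it uses the expat parser from the Python standard library, the
function names are hypothetical, and it assumes each value is a
well-formed document (a content fragment with several top-level
elements would be rejected by the parser).

```python
import xml.parsers.expat


def has_doctype_naive(text):
    """Hypothetical string-search test: look for the fixed string only."""
    return "<!DOCTYPE" in text


def has_doctype_parsed(text):
    """Ask a real parser whether a document type declaration is present.

    Assumes `text` is a well-formed XML document; expat raises
    ExpatError otherwise.
    """
    found = []
    parser = xml.parsers.expat.ParserCreate()

    # expat calls this handler only for an actual DOCTYPE declaration,
    # never for look-alike text inside CDATA sections or comments.
    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.append(name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(text, True)
    return bool(found)


# A real declaration: both tests agree.
real = "<!DOCTYPE a><a/>"

# Look-alike text in a CDATA section and in a comment: the string
# search reports a DTD that is not there.
in_cdata = "<a><![CDATA[<!DOCTYPE a>]]></a>"
in_comment = "<!-- <!DOCTYPE a> --><a/>"

for sample in (real, in_cdata, in_comment):
    print(has_doctype_naive(sample), has_doctype_parsed(sample), sample)
```

The string search flags all three samples, while the parser flags only
the first. Tightening the search pattern to avoid this tends to trade
false positives for false negatives.
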
cheers
andrew