Re: Encoding problems in PostgreSQL with XML data

From Andrew Dunstan
Subject Re: Encoding problems in PostgreSQL with XML data
Date
Msg-id 3FFF129E.6020109@dunslane.net
In response to Re: Encoding problems in PostgreSQL with XML data  ("Merlin Moncure" <merlin.moncure@rcsonline.com>)
Responses Re: Encoding problems in PostgreSQL with XML data  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
Perhaps the document should be stored in canonical form. See 
http://www.w3.org/TR/xml-c14n

I think I agree with Rod's opinion elsewhere in this thread. I guess the 
"philosophical" question is this: If 2 XML documents with different 
encodings have the same canonical form, or perhaps produce the same DOM, 
are they equivalent? Merlin appears to want to say "no", and I think I 
want to say "yes".
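
A minimal sketch of that question, assuming Python's standard library as a stand-in for whatever the server would actually do. Note that xml.etree.ElementTree.canonicalize implements C14N 2.0, not the 1.0 recommendation linked above, and that the sample document and the helper name canonical_form are made up for illustration:

```python
import io
from xml.etree.ElementTree import canonicalize  # C14N 2.0, Python 3.8+

# Hypothetical sample document. The two serializations differ only in the
# declared encoding and in the bytes used to represent the non-ASCII letter.
TEMPLATE = '<?xml version="1.0" encoding="{enc}"?><name lang="fr">Zo\u00eb</name>'

utf8_doc = TEMPLATE.format(enc="UTF-8").encode("utf-8")
latin1_doc = TEMPLATE.format(enc="ISO-8859-1").encode("iso-8859-1")


def canonical_form(raw: bytes) -> str:
    """Parse raw XML bytes, honouring the declared encoding, and return the canonical text."""
    return canonicalize(from_file=io.BytesIO(raw))


same_bytes = utf8_doc == latin1_doc
same_canonical = canonical_form(utf8_doc) == canonical_form(latin1_doc)

print("byte-identical:", same_bytes)
print("same canonical form:", same_canonical)
```

The two inputs differ as byte strings, yet canonicalization drops the XML declaration and decodes the character data, so the canonical forms should compare equal. That is the sense in which I would call the documents equivalent.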

cheers

andrew

Merlin Moncure wrote:

>Peter Eisentraut wrote:
>
>>The central problem I have is this:  How do we deal with the fact that
>>an XML datum carries its own encoding information?
>
>Maybe I am misunderstanding your question, but IMO postgres should be
>treating xml documents as if they were binary data, unless the server
>takes on the role of a parser, in which case it should handle
>unspecified/unknown encodings just like a normal xml parser would (and
>this does *not* include changing the encoding!).
>
>According to me, an XML parser should not change one bit of a document,
>because that is not a 'parse', but a 'transformation'.
>
>>Rewriting the <?xml?> declaration seems like a workable solution, but it
>>would break the transparency of the client/server encoding conversion.
>>Also, some people might dislike that their documents are being changed
>>as they are stored.
>
>Right, your example begs the question: why does the server care what the
>encoding of the documents is (perhaps indexing)?  XML validation is a
>standardized operation which the server (or psql, I suppose) can
>subcontract out to another application.
>
>Just a side thought: what if the xml encoding type was built into the
>domain type itself?
>create domain xml_utf8 ...
>Which allows casting, etc. which is more natural than an implicit
>transformation.
>
>Regards,
>Merlin