Re: Encoding problems in PostgreSQL with XML data

From Hannu Krosing
Subject Re: Encoding problems in PostgreSQL with XML data
Date
Msg-id 1074165016.3206.27.camel@fuji.krosing.net
In reply to Re: Encoding problems in PostgreSQL with XML data  ("Merlin Moncure" <merlin.moncure@rcsonline.com>)
List pgsql-hackers
Merlin Moncure wrote on Wed, 14.01.2004 at 15:49:
> Hannu Krosing wrote:
> > I hope that real as-needed-column-by-column translation will be used
> > with bound argument queries.
> > 
> > It also seems possible to delegate the encoding changes to after the
> > query is parsed, but this will never work for EBCDIC and other funny
> > encodings (like rot13 ;).
> > 
> > for these we need to define the actual SQL statement encoding on-wire to
> > be always ASCII.
> 
> In that case, treat the XML document like a binary stream, using
> PQescapeBytea, etc. to encode if necessary pre-query.  Also, the XML
> domain should inherit from bytea, not varchar.

Why?

The repertoire of allowed characters in XML is even smaller than in varchar.

> The document should be stored bit for bit as was submitted.

Or in some pre-parsed form which allows restoration of the submitted form,
which could be more suitable for things like xpath queries or subtree extraction.

> If we can do that for bitmaps, why can't we do it for XML documents?  
> 
> OTOH, if we are transforming the document down to a more generic format
> (either canonical or otherwise), then the xml could be dealt with like
> text in the usual way.  Of course, then we are not really storing xml,
> more like 'meta' xml ;)

On the contrary! If there is a DTD, Schema or other structure definition
for the XML, then we know which whitespace is significant and can do
whatever we like with insignificant whitespace.
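
To make the whitespace point concrete, here is a minimal sketch in Python (3.8 or later) using the standard library's xml.etree.ElementTree.canonicalize. It is only an approximation: strip_text=True blindly strips leading and trailing whitespace of text nodes and does not consult any DTD or Schema, so it stands in for the "we know which whitespace is insignificant" case. The sample documents and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that differ only in indentation whitespace.
pretty = "<a>\n  <b>x</b>\n</a>"
compact = "<a><b>x</b></a>"

# Compared as raw strings, the two documents differ.
raw_equal = pretty == compact

# strip_text=True drops leading/trailing whitespace of text nodes.
# Assumption: all such whitespace is insignificant; no DTD is checked.
stripped_equal = (ET.canonicalize(pretty, strip_text=True)
                  == ET.canonicalize(compact, strip_text=True))

# Without stripping, the whitespace is kept and the documents still differ.
kept_equal = ET.canonicalize(pretty) == ET.canonicalize(compact)

print(raw_equal, stripped_equal, kept_equal)
```

A real implementation would have to decide per element whether whitespace matters, which is exactly the information the structure definition provides.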

It is also OK to store all XML in some Unicode encoding, as this is what
every XML document must be convertible to.
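
As a rough illustration of why the stored encoding need not match the submitted one, the sketch below (Python standard library; the sample document is invented) serializes the same content in ISO-8859-1 and in UTF-8. The byte streams differ, but a parser recovers the same Unicode text from both.

```python
import xml.etree.ElementTree as ET

# Hypothetical document body containing one non-ASCII character (e-acute).
body = "<d>\u00e9</d>"

# The same document serialized in two different encodings.
latin1_doc = ('<?xml version="1.0" encoding="iso-8859-1"?>' + body).encode("iso-8859-1")
utf8_doc = ('<?xml version="1.0" encoding="utf-8"?>' + body).encode("utf-8")

# The on-disk bytes are different ...
same_bytes = latin1_doc == utf8_doc

# ... but after parsing, both yield the same Unicode text content.
same_content = ET.fromstring(latin1_doc).text == ET.fromstring(utf8_doc).text

print(same_bytes, same_content)
```

Bit-for-bit storage would treat these as two different values, while storage in one Unicode encoding would treat them as the same document.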

It's the same as storing ints: you don't care whether you specified 1000 or
1e3 when doing the insert, as

hannu=# select 1000=1e3;
 ?column?
----------
 t
(1 row)

In the same way, the following should also be true:

select
'<d/>'::xml == '<?xml version="1.0" encoding="utf-8"?>\n<d/>\n'::xml
;
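
The equality proposed above can be tried outside the database. This is only a sketch of the idea, not of how an xml type would implement it: it assumes Python 3.8+ and uses the standard library's C14N 2.0 canonicalizer, and xml_equal is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def xml_equal(left: str, right: str) -> bool:
    """Compare two XML documents by their canonical (C14N 2.0) form."""
    return ET.canonicalize(left) == ET.canonicalize(right)

short_form = "<d/>"
long_form = '<?xml version="1.0" encoding="utf-8"?>\n<d/>\n'

# The XML declaration and the whitespace around the root element are not
# part of the canonical form, so the two spellings compare as equal.
print(xml_equal(short_form, long_form))

# Documents with different root elements still compare as different.
print(xml_equal("<d/>", "<e/>"))
```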

-----------
Hannu


