Re: [HACKERS] possible encoding issues with libxml2 functions

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Re: [HACKERS] possible encoding issues with libxml2 functions
Дата
Msg-id 20170317032327.GA1993326@tornado.leadboat.com
обсуждение исходный текст
Ответ на Re: [HACKERS] possible encoding issues with libxml2 functions  (Pavel Stehule <pavel.stehule@gmail.com>)
Ответы Re: possible encoding issues with libxml2 functions  (Pavel Stehule <pavel.stehule@gmail.com>)
Список pgsql-hackers
On Sun, Mar 12, 2017 at 10:26:33PM +0100, Pavel Stehule wrote:
> 2017-03-12 21:57 GMT+01:00 Noah Misch <noah@leadboat.com>:
> > On Sun, Mar 12, 2017 at 08:36:58PM +0100, Pavel Stehule wrote:
> > > 2017-03-12 0:56 GMT+01:00 Noah Misch <noah@leadboat.com>:
> > Please add a test case.
> 
> It needs a application - currently there is not possibility to import XML
> document via recv API :(

I think xml_in() can create every value that xml_recv() can create; xml_recv()
is just more convenient given diverse source encodings.  If you make your
application store the value into a table, does "pg_dump --inserts" emit code
that reproduces the same value?  If so, you can use that in your test case.
If not, please provide precise instructions (code, SQL commands) for
reproducing the bug manually.

> > Why not use xml_parse() instead of calling xmlCtxtReadMemory() directly?
> > The
> > answer is probably in the archives, because someone understood the problem
> > enough to document "Some XML-related functions may not work at all on
> > non-ASCII data when the server encoding is not UTF-8. This is known to be
> > an
> > issue for xpath() in particular."
> 
> 
> Probably there are two possible issues

Would you research in the archives to confirm?

> 1. what I touched - recv function does encoding to database encoding - but
> document encoding is not updated.

Using xml_parse() would fix that, right?

> 2. there are not possibility to encode from document encoding to database
> encoding.

Both xml_in() and xml_recv() require the value to be representable in the
database encoding, so I don't think this particular problem can remain by the
time we reach an xpath_internal() call.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: [HACKERS] Re: [COMMITTERS] pgsql: Remove objname/objargs split for referring toobjects
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: [HACKERS] logical replication launcher crash on buildfarm