[HACKERS] possible encoding issues with libxml2 functions

From Pavel Stehule
Subject [HACKERS] possible encoding issues with libxml2 functions
Date
Msg-id CAFj8pRC-dM=tT=QkGi+Achkm+gwPmjyOayGuUfXVumCxkDgYWg@mail.gmail.com
Responses Re: [HACKERS] possible encoding issues with libxml2 functions  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
Hi

Today I played with the xml_recv function and with the XML processing functions.

The xml_recv function converts the document from its declared encoding to the server encoding. But the XML declaration still holds the original encoding info, which is obsolete after the conversion. Sometimes we solve this issue by removing the declaration; see the xml_out function.

Sometimes we don't: a lot of functions convert directly from xmltype to xmlChar. A wrong encoding in the declaration can break the libxml2 parser with an error like this:

ERROR:  could not parse XML document
DETAIL:  input conversion failed due to input error, bytes 0x88 0x3C 0x2F 0x72
line 1: switching encoding: encoder error

This error is not frequent, but it is hard to track down, because there is a small but important difference between the printed XML and the XML that is actually used.
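
The mismatch can be illustrated outside PostgreSQL. The following is only a sketch under assumptions: Python's codecs stand in for libxml2's converter, the server encoding is assumed to be UTF-8, and the sample document and its windows-1250 declaration are hypothetical (the message does not say which document or encoding triggered the error).

```python
# Hypothetical document; the declaration names the client-side encoding.
text = '<?xml version="1.0" encoding="windows-1250"?><r>ň</r>'

# The client sends the document in the encoding its declaration names.
sent = text.encode("cp1250")

# The server converts the bytes to its own encoding (UTF-8 assumed here),
# but the declaration is left untouched and still says windows-1250.
stored = sent.decode("cp1250").encode("utf-8")

# A parser that trusts the declaration now decodes UTF-8 bytes as
# windows-1250 and can hit a byte that has no meaning in that encoding.
try:
    stored.decode("cp1250")
    failed = False
    offending = b""
except UnicodeDecodeError as exc:
    failed = True
    # The undecodable byte and the three bytes that follow it.
    offending = stored[exc.start:exc.start + 4]

print(failed, offending)
```

In this sketch the undecodable sequence starts with byte 0x88 followed by `</r`, the same pattern as in the DETAIL line above. With other declared encodings the decode may succeed silently and just produce wrong characters, which is one reason the problem is easy to miss.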

There are two possible fixes:

a) clean the declaration on input: the encoding info can be removed from the declaration

b) use xml_out_internal everywhere before the conversion to xmlChar. pg_xmlCharStrndup could be a good candidate.
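
What cleaning the declaration could look like, as a minimal sketch: the helper `strip_declared_encoding` is hypothetical and is not PostgreSQL code. It assumes the document is already in the server encoding and only drops the encoding pseudo-attribute, keeping version and standalone.

```python
import re

# Matches an XML declaration at the very start of the document:
# version is mandatory, encoding and standalone are optional, in that order.
_QUOTED = r'(?:"[^"]*"|\'[^\']*\')'
_XML_DECL = re.compile(
    r'^<\?xml'
    r'(\s+version\s*=\s*' + _QUOTED + r')'
    r'(\s+encoding\s*=\s*' + _QUOTED + r')?'
    r'(\s+standalone\s*=\s*' + _QUOTED + r')?'
    r'\s*\?>'
)


def strip_declared_encoding(doc: str) -> str:
    """Return doc with the encoding removed from its XML declaration.

    Documents without a declaration, or without an encoding in it,
    are returned unchanged.
    """
    m = _XML_DECL.match(doc)
    if m is None or m.group(2) is None:
        return doc
    version = m.group(1)
    standalone = m.group(3) or ""
    return "<?xml" + version + standalone + "?>" + doc[m.end():]


cleaned = strip_declared_encoding(
    '<?xml version="1.0" encoding="windows-1250"?><r>ň</r>'
)
print(cleaned)
```

The same cleanup could run either once on input, as in (a), or just before the data is handed to the parser, as in (b); the sketch does not take a position on which place is better.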

Comments?

Regards

Pavel
