Re: BUG #15420: Server crash. Segmentation fault when parsing xml file

From: Sergey Mirvoda
Subject: Re: BUG #15420: Server crash. Segmentation fault when parsing xml file
Msg-id: CALkWArjA5ApwXTnWWGMSmw6CFUaaTWHiL5gmJuMZXsMsb0tqeQ@mail.gmail.com
In reply to: Re: BUG #15420: Server crash. Segmentation fault when parsing xml file (Pavel Stehule <pavel.stehule@gmail.com>)
Responses: Re: BUG #15420: Server crash. Segmentation fault when parsing xmlfile (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-bugs


On Thu, 4 Oct 2018, 19:03 Pavel Stehule <pavel.stehule@gmail.com> wrote:


On Thu, 4 Oct 2018 at 13:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:


On Thu, 4 Oct 2018 at 13:43, Andrey Borodin <x4mmm@yandex-team.ru> wrote:


On 4 Oct 2018, at 16:38, Pavel Stehule <pavel.stehule@gmail.com> wrote:




Actually, we found this error on a very fresh installation of Ubuntu 16.04 and PostgreSQL 10.5.
After that we checked every configuration we have,
and only PostgreSQL 9.4 works as expected.

This issue is related to libxml2 limits - and it cannot work with modern libxml2 libraries.
Yes, the root cause is inside the libxml2 code.

Can we protect the postmaster from crashing on a libxml2 error? There are a bunch of PG_TRY blocks there, but they do not help.

Unfortunately, no. You cannot handle a crash. PostgreSQL doesn't start a separate process for libxml2 calls, so a fault there is fatal.
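To illustrate the point about crashes versus errors, here is a minimal Python sketch (not PostgreSQL code, and POSIX-only). It assumes a segfault can be simulated by a process sending SIGSEGV to itself. The `try/except` in the child stands in for PG_TRY, and the names `CHILD` and `run_isolated` are hypothetical. It shows that an in-process handler never sees the fault, while a parent that ran the risky code in a separate process survives and can inspect the exit status.

```python
import signal
import subprocess
import sys

# Child program. The try/except plays the role of PG_TRY: it only catches
# errors raised through the language's own error mechanism, and a SIGSEGV
# is delivered by the OS instead.
CHILD = """
import os, signal
try:
    os.kill(os.getpid(), signal.SIGSEGV)  # simulate a crash in native code
except Exception:
    print("caught")
"""

def run_isolated(source: str) -> subprocess.CompletedProcess:
    """Run Python source in a separate interpreter process and wait for it."""
    return subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )

result = run_isolated(CHILD)

# On POSIX, a negative return code means the child was killed by that signal.
if result.returncode == -signal.SIGSEGV:
    print("child crashed with SIGSEGV; parent is still running")
```

The parent keeps running only because the fault happened in another process. This is the isolation that, as noted above, PostgreSQL does not provide for libxml2 calls.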

I played with it, and it looks like a problem between libxml2 and your specific document (maybe too many multibyte chars... I don't know).

I imported a 200MB XML document with 1M items, so it makes no sense to limit XML size on the PostgreSQL side.

It looks like your XML document hits some corner case in libxml2 where parsing is extremely memory-expensive. From what I can see, there is a lot of long content inside attributes.

Regards

Pavel, thank you for your interest. 
It is definitely something inside this document. 

Actually, we loaded about 10k different documents like this one, about 10 GB of content, and the crash happens only on this one.

But every other parser we tried (.NET, Java, Python) handled it just fine.

For now we ended up with a custom PL/Python function for parsing XML, and it is slow as hell.
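The actual function is not shown in this thread. As a rough sketch of the kind of parsing such a workaround might do, the Python below streams through a document with the standard library's `xml.etree.ElementTree`, which is built on expat rather than libxml2. The function name `count_elements` and the choice to count tags are assumptions made for illustration, and the input is assumed to be UTF-8 compatible text. The body could be wrapped in a `plpython3u` function, but that wrapper is not shown here.

```python
import io
import xml.etree.ElementTree as ET

def count_elements(xml_text: str) -> dict:
    """Count elements per tag name while streaming through the document."""
    counts = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "end" events fire once an element, including its attributes and
    # children, has been fully parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] = counts.get(elem.tag, 0) + 1
        elem.clear()  # release the element's children, text and attributes
    return counts

sample = '<root><item name="a"/><item name="b"/></root>'
print(count_elements(sample))
```

Clearing each element after it is processed keeps memory use lower on large inputs. This says nothing about whether it would be faster than the function described above.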

This looks like a regression: pg 9.4 loads this document without any problem.
