[BUGS] BUG #14584: Segmentation fault importing large XML file

From
jorsol@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      14584
Logged by:          Jorge Solorzano
Email address:      jorsol@gmail.com
PostgreSQL version: 9.6.2
Operating system:   Ubuntu 16.04 4.8.0-39-generic
Description:

I'm trying to import a large XML file into a table. For other (smaller)
XML files, this works fine.

DROP TABLE IF EXISTS posts;
CREATE TABLE posts AS 
SELECT
    (xpath('@Id', x))[1]::text::integer "Id",
    (xpath('@PostTypeId', x))[1]::text::integer "PostTypeId",
    (xpath('@ParentId', x))[1]::text::integer "ParentId",
    (xpath('@AcceptedAnswerId', x))[1]::text::integer "AcceptedAnswerId",
    (xpath('@CreationDate', x))[1]::text::timestamp "CreationDate",
    (xpath('@Score', x))[1]::text::integer "Score",
    (xpath('@ViewCount', x))[1]::text::integer "ViewCount",
    (xpath('@Body', x))[1]::text "Body",
    (xpath('@OwnerUserId', x))[1]::text::integer "OwnerUserId",
    (xpath('@LastEditorUserId', x))[1]::text::integer "LastEditorUserId",
    (xpath('@LastEditDate', x))[1]::text::timestamp "LastEditDate",
    (xpath('@LastActivityDate', x))[1]::text::timestamp "LastActivityDate",
    (xpath('@Title', x))[1]::text "Title",
    (xpath('@Tags', x))[1]::text "Tags",
    (xpath('@AnswerCount', x))[1]::text::integer "AnswerCount",
    (xpath('@CommentCount', x))[1]::text::integer "CommentCount",
    (xpath('@FavoriteCount', x))[1]::text::integer "FavoriteCount",
    (xpath('@ClosedDate', x))[1]::text::timestamp "ClosedDate"
FROM
    unnest(xpath('/posts/row', xml_import('Posts.xml'))) x;

This is what I get from the core dump:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f3a0bcf9fb8 in ?? () from
/usr/lib/x86_64-linux-gnu/libxml2.so.2
(gdb) bt full
#0  0x00007f3a0bcf9fb8 in ?? () from
/usr/lib/x86_64-linux-gnu/libxml2.so.2
No symbol table info available.
#1  0x0000556dc268ba1c in xml_errorHandler (data=0x556dc2dba9b0,
error=<optimized out>) at
/build/postgresql-9.6-ZHxyhz/postgresql-9.6-9.6.2/build/../src/backend/utils/adt/xml.c:1661
        errFuncSaved = 0x7f3a0bcfa1b0 <xmlGenericErrorDefaultFunc>
        errCtxSaved = 0x0
        xmlerrcxt = 0x556dc2dba9b0
        ctxt = <optimized out>
        input = 0x556dc2dc44b0
        node = <optimized out>
        name = <optimized out>
        domain = <optimized out>
        level = <optimized out>
        errorBuf = 0x556dc2dbacc0
        __func__ = "xml_errorHandler"
#2  0x00007f3a0bcfbfa4 in __xmlRaiseError () from
/usr/lib/x86_64-linux-gnu/libxml2.so.2
No symbol table info available.
#3  0x00007f3a0bd00900 in ?? () from
/usr/lib/x86_64-linux-gnu/libxml2.so.2
No symbol table info available.
#4  0x00007f3a0bd02f14 in ?? () from
/usr/lib/x86_64-linux-gnu/libxml2.so.2
No symbol table info available.
#5  0x00007f3a0bd17338 in xmlParseContent () from
/usr/lib/x86_64-linux-gnu/libxml2.so.2
No symbol table info available.
#6  0x00007f3a0bd17c13 in xmlParseElement () from
/usr/lib/x86_64-linux-gnu/libxml2.so.2
No symbol table info available.
#7  0x00007f3a0bd1866a in xmlParseDocument () from
/usr/lib/x86_64-linux-gnu/libxml2.so.2
No symbol table info available.
#8  0x00007f3a0bd1ffd9 in xmlCtxtReadMemory () from
/usr/lib/x86_64-linux-gnu/libxml2.so.2
No symbol table info available.
#9  0x0000556dc2690a26 in xpath_internal (xpath_expr_text=xpath_expr_text@entry=0x556dc2d7dda0, data=data@entry=0x7f380fa77040, namespaces=namespaces@entry=0x556dc2d7df90,
    res_nitems=res_nitems@entry=0x0, astate=astate@entry=0x556dc2dda740) at /build/postgresql-9.6-ZHxyhz/postgresql-9.6-9.6.2/build/../src/backend/utils/adt/xml.c:3842
        save_exception_stack = 0x7fffea7f8000
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {10, -4817151912512283653,
93929908723104, 979473840, 93929908972784, 979473840, -4817151912059298821,
-1728720388456276997}, __mask_was_saved = 0, 
            __saved_mask = {__val = {0, 0, 0, 93929908723600,
11806714016570382336, 8388608, 139887287250280, 139887287250280,
93929908097440, 140737127610384, 93929901717430, 139887287250280, 
                93929908890240, 140737127610416, 93929901471218, 142}}}}
        xmlerrcxt = 0x556dc2dba9b0
        ctxt = 0x556dc2dc2e80
        doc = 0x0
        xpathctx = 0x0
        xpathcomp = 0x0
        xpathobj = 0x0
        datastr = 0x7f380fa77044 "<?xml version=\"1.0\"
encoding=\"utf-8\"?>\n<posts>\n  <row Id=\"1\" PostTypeId=\"1\"
AcceptedAnswerId=\"727273\" CreationDate=\"2009-07-15T06:27:46.723\"
Score=\"155\" ViewCount=\"92736\" Body=\"&lt;p&gt;A Vista virtua"...
        len = 979473840
        xpath_len = 10
        string = 0x7f38becc5040 "<?xml version=\"1.0\"
encoding=\"utf-8\"?>\n<posts>\n  <row Id=\"1\" PostTypeId=\"1\"
AcceptedAnswerId=\"727273\" CreationDate=\"2009-07-15T06:27:46.723\"
Score=\"155\" ViewCount=\"92736\" Body=\"&lt;p&gt;A Vista virtua"...
        xpath_expr = 0x556dc2dbacf0 "/posts/row"
        i = <optimized out>
        ndim = <optimized out>
        ns_names_uris = 0x0
        ns_names_uris_nulls = 0x0
        ns_count = 0
        __func__ = "xpath_internal"
#10 0x0000556dc26918a8 in xpath (fcinfo=<optimized out>) at
/build/postgresql-9.6-ZHxyhz/postgresql-9.6-9.6.2/build/../src/backend/utils/adt/xml.c:3950
        xpath_expr_text = 0x556dc2d7dda0
        data = 0x7f380fa77040
        namespaces = 0x556dc2d7df90
        astate = 0x556dc2dda740


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14584: Segmentation fault importing large XML file

From
Pavel Stehule
Date:
Hi

2017-03-08 20:02 GMT+01:00 <jorsol@gmail.com>:
> I'm trying to import a large XML file into a table, for others (smaller)
> xml, this works fine.


What does "large XML file" mean? Could you please provide a test case?

Regards

Pavel
 

Re: [BUGS] BUG #14584: Segmentation fault importing large XML file

From
Jorge Solórzano
Date:
Hi Pavel,

By large I mean big in size: 935M
Posts.xml: XML 1.0 document, UTF-8 Unicode text, with very long lines


I installed debug symbols for libxml2 if this helps:

#0  xmlParserPrintFileContextInternal (input=input@entry=0x55afc89ef4b0, channel=0x55afc636ca30 <appendStringInfo>, data=0x55afc89e5cc0) at ../../error.c:181
        cur = <optimized out>
        base = <optimized out>
        n = <optimized out>
        col = <optimized out>
        content = "\000\004\000\000\000\000\000\000\260\364\236ȯU\000\000\000\000\000\000\000\000\000\000\330\340\236ȯU\000\000`\203D*\375\177\000\000M@XƯU\000\000\300\\\236ȯU\000\000\260\364\236ȯU\000\000\"\000\000\000\000\000\000\000\332\370\023\340\241\177\000\000\240"
        ctnt = <optimized out>
#1  0x00007fa1e00a587a in xmlParserPrintFileContext__internal_alias (input=input@entry=0x55afc89ef4b0) at ../../error.c:231
No locals.
#2  0x000055afc6542a1c in xml_errorHandler (data=0x55afc89e59b0, error=<optimized out>) at /build/postgresql-9.6-ZHxyhz/postgresql-9.6-9.6.2/build/../src/backend/utils/adt/xml.c:1661
        errFuncSaved = 0x7fa1e00a41b0 <xmlGenericErrorDefaultFunc>
        errCtxSaved = 0x0
        xmlerrcxt = 0x55afc89e59b0
        ctxt = <optimized out>
        input = 0x55afc89ef4b0
        node = <optimized out>
        name = <optimized out>
        domain = <optimized out>
        level = <optimized out>
        errorBuf = 0x55afc89e5cc0
        __func__ = "xml_errorHandler"
#3  0x00007fa1e00a5fa4 in __xmlRaiseError (schannel=0x55afc6542920 <xml_errorHandler>, schannel@entry=0x0, channel=channel@entry=0x0, data=0x55afc89e59b0, data@entry=0x0, ctx=ctx@entry=0x55afc89ede80,
    nod=nod@entry=0x0, domain=domain@entry=1, code=1, level=XML_ERR_FATAL, file=0x0, line=838090, str1=0x7fa1e01cc24d "Huge input lookup", str2=0x0, str3=0x0, int1=0, col=4,
    msg=0x7ffd2a4485e0 "internal error: %s\n") at ../../error.c:604
        ctxt = <optimized out>
        node = 0x0
        str = 0x55b08f70e270 "internal error: Huge input lookup\n"
        input = <optimized out>
        to = 0x55afc89ee0d8
        baseptr = 0x0
#4  0x00007fa1e00aa900 in xmlFatalErr (ctxt=ctxt@entry=0x55afc89ede80, error=error@entry=XML_ERR_INTERNAL_ERROR, info=info@entry=0x7fa1e01cc24d "Huge input lookup") at ../../parser.c:546
        errmsg = <optimized out>
        errstr = "internal error: %s\n", '\000' <repeats 109 times>
#5  0x00007fa1e00acf14 in xmlGROW (ctxt=0x55afc89ede80) at ../../parser.c:2084
        curEnd = <optimized out>
        curBase = <optimized out>
#6  0x00007fa1e00c1338 in xmlParseContent__internal_alias (ctxt=0x55afc89ede80) at ../../parser.c:10101
        test = <optimized out>
        cons = 0
        cur = <optimized out>
#7  0x00007fa1e00c1c13 in xmlParseElement__internal_alias (ctxt=ctxt@entry=0x55afc89ede80) at ../../parser.c:10255
        name = 0x55afc89ef577 "posts"
        prefix = 0x0
        URI = 0x0
        node_info = {node = 0x0, begin_pos = 140333225866765, begin_line = 94213473493376, end_pos = 140333225866817, end_line = 94213473493376}
        line = 2
        tlen = 5
        ret = 0x55afc89efab0
        nsNr = 0
#8  0x00007fa1e00c266a in xmlParseDocument__internal_alias (ctxt=ctxt@entry=0x55afc89ede80) at ../../parser.c:10952
        start = "<?xm"
        enc = <optimized out>
#9  0x00007fa1e00c9fd9 in xmlDoRead (reuse=1, options=0, encoding=0x0, URL=0x0, ctxt=0x55afc89ede80) at ../../parser.c:15430
        ret = <optimized out>
#10 xmlCtxtReadMemory__internal_alias (ctxt=0x55afc89ede80,
    buffer=buffer@entry=0x7fa09306f040 "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<posts>\n  <row Id=\"1\" PostTypeId=\"1\" AcceptedAnswerId=\"727273\" CreationDate=\"2009-07-15T06:27:46.723\" Score=\"155\" ViewCount=\"92736\" Body=\"&lt;p&gt;A Vista virtua"..., size=size@entry=979473840, URL=URL@entry=0x0, encoding=encoding@entry=0x0, options=options@entry=0) at ../../parser.c:15719
        input = 0x55afc89ef410

Re: [BUGS] BUG #14584: Segmentation fault importing large XML file

From
Pavel Stehule
Date:


2017-03-08 20:17 GMT+01:00 Jorge Solórzano <jorsol@gmail.com>:
> Hi Pavel,
>
> By large I mean big in size: 935M
> Posts.xml: XML 1.0 document, UTF-8 Unicode text, with very long lines
>
> I installed debug symbols for libxml2 if this helps:


It looks like the libxml2 fatal error "internal error: Huge input lookup" is not handled well.

So you have hit a libxml2 limit, but this should not end in a segfault.

PostgreSQL probably doesn't use the huge_tree feature of libxml2.

Maybe it is a new bug, because https://www.postgresql.org/message-id/20140304155421.GM23803@rdorte.org reported correct behavior.

At 935M you are very close to PostgreSQL's maximum XML size of 1GB.

Regards

Pavel
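
To put the two sizes from this message side by side, here is a small arithmetic sketch in Python. The byte count is the `len` / `size` value shown in the backtraces; reading "1GB" as 2**30 bytes is an assumption, and the variable names are only illustrative.

```python
# Size of Posts.xml in bytes, as reported by `len` / `size` in the backtraces.
xml_bytes = 979_473_840

# Assumption: "1GB" in the message above means 2**30 bytes.
limit_bytes = 2**30

size_mib = xml_bytes / 2**20                      # size in MiB
fraction_of_limit = xml_bytes / limit_bytes       # share of the 1GB limit used
headroom_mib = (limit_bytes - xml_bytes) / 2**20  # room left below the limit

print(f"{size_mib:.1f} MiB, {fraction_of_limit:.1%} of the limit, "
      f"{headroom_mib:.1f} MiB headroom")
```

Under that assumption the file is about 934 MiB, roughly 91% of the limit, so the document itself still fits in an xml value; the failure comes from the libxml2 parser limit rather than from the PostgreSQL size cap.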

Re: [BUGS] BUG #14584: Segmentation fault importing large XML file

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2017-03-08 20:17 GMT+01:00 Jorge Solórzano <jorsol@gmail.com>:
>> I installed debug symbols for libxml2 if this helps:
>> #0  xmlParserPrintFileContextInternal (input=input@entry=0x55afc89ef4b0,
>> channel=0x55afc636ca30 <appendStringInfo>, data=0x55afc89e5cc0) at
>> ../../error.c:181

> It looks not well handled libxml2 fatal error "internal error: Huge input
> lookup"

Yeah.  The segfault is inside xmlParserPrintFileContext(), so really there
is at least one libxml2 bug here and possibly two.  Maybe it should have
been able to cope with such large input, or maybe not, but for sure its
printing functions should have been able to cope with the state at the
time the error was reported.

I don't think there's much we can do about it; you need to report this
to the libxml2 folk.

            regards, tom lane



Re: [BUGS] BUG #14584: Segmentation fault importing large XML file

From
Jorge Solórzano
Date:
Let's assume that the bug is in libxml2, should postgres crash or should it have some kind of protection against it?


Re: [BUGS] BUG #14584: Segmentation fault importing large XML file

From
Tom Lane
Date:
Jorge Solórzano <jorsol@gmail.com> writes:
> Let's assume that the bug is in libxml2, should postgres crash or should it
> have some kind of protection against it?

There's not much we can do when a library we depend on segfaults.  We have
no way to know what damage has been done, so letting the process crash is
pretty much the only safe option.

            regards, tom lane
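
Since the crash cannot be caught on the server side, one possible workaround (not proposed by anyone in this thread) is to avoid passing the whole document to the server as a single xml value and to stream the `<row>` elements on the client instead. The following is a minimal sketch in Python, assuming the `/posts/row` layout and attribute names visible in the backtrace; the function name `iter_rows` and the sample data are hypothetical. Python's `xml.etree.ElementTree` is built on expat rather than libxml2, so it would not go through the libxml2 code path shown above.

```python
import io
import xml.etree.ElementTree as ET


def iter_rows(source):
    """Yield the attributes of each <row> element as a dict, one at a time.

    `source` is a filename or a binary file object. Processed elements are
    dropped from the tree so memory use stays bounded for very large files.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of <posts>
    for event, elem in context:
        if event == "end" and elem.tag == "row":
            yield dict(elem.attrib)
            root.clear()  # discard rows that have already been handed out


# Hypothetical two-row sample in the same shape as Posts.xml.
sample = io.BytesIO(
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b"<posts>\n"
    b'  <row Id="1" PostTypeId="1" Score="155" />\n'
    b'  <row Id="2" PostTypeId="2" ParentId="1" Score="7" />\n'
    b"</posts>\n"
)
rows = list(iter_rows(sample))
print(rows)
```

Each dict could then be loaded in batches with ordinary INSERT or COPY statements. Attributes that are missing from a row (such as `ParentId` on the first sample row) are simply absent from its dict and would map to NULL.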

