Discussion: tsvector concatenation - backend crash

tsvector concatenation - backend crash

From
Jesper Krogh
Date:
Hi

The attached SQL files give (at least in my hands) a reliable backend crash
with this stack trace, reproduced on both 9.0.4 and HEAD. I'm sorry
I cannot provide a more trimmed-down set of vectors that reproduces the
bug, hence the "obfuscated" dataset. But even deleting single terms in
the vectors makes the bug go away.

*** glibc detected *** postgres: jk jk [local] SELECT: corrupted double-linked list: 0x0000000002279f80 ***
======= Backtrace: =========
/lib/libc.so.6(+0x775b6)[0x7fe4db4b25b6]
/lib/libc.so.6(+0x7aa25)[0x7fe4db4b5a25]
/lib/libc.so.6(cfree+0x73)[0x7fe4db4b8e83]
postgres: jk jk [local] SELECT[0x710de5]
postgres: jk jk [local] SELECT(MemoryContextReset+0x2a)[0x71119a]
postgres: jk jk [local] SELECT(ExecScan+0x4a)[0x57887a]
postgres: jk jk [local] SELECT(ExecProcNode+0x238)[0x571708]
postgres: jk jk [local] SELECT(standard_ExecutorRun+0xd2)[0x5705e2]
postgres: jk jk [local] SELECT[0x63c627]
postgres: jk jk [local] SELECT(PortalRun+0x248)[0x63d948]
postgres: jk jk [local] SELECT[0x639fdb]
postgres: jk jk [local] SELECT(PostgresMain+0x547)[0x63af97]
postgres: jk jk [local] SELECT[0x5fb959]
postgres: jk jk [local] SELECT(PostmasterMain+0xa97)[0x5fe137]
postgres: jk jk [local] SELECT(main+0x490)[0x59f4d0]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7fe4db459c4d]
postgres: jk jk [local] SELECT[0x45d569]
======= Memory map: ========
00400000-008d6000 r-xp 00000000 08:01 4071141 /tmp/pgsql/bin/postgres
00ad5000-00ad6000 r--p 004d5000 08:01 4071141 /tmp/pgsql/bin/postgres
00ad6000-00ae2000 rw-p 004d6000 08:01 4071141 /tmp/pgsql/bin/postgres
00ae2000-00b43000 rw-p 00000000 00:00 0
0215d000-0227e000 rw-p 00000000 00:00 0 [heap]
7fe4d4000000-7fe4d4021000 rw-p 00000000 00:00 0
7fe4d4021000-7fe4d8000000 ---p 00000000 00:00 0
7fe4d908f000-7fe4d90a5000 r-xp 00000000 08:01 4194383 /lib/libgcc_s.so.1
7fe4d90a5000-7fe4d92a4000 ---p 00016000 08:01 4194383 /lib/libgcc_s.so.1
7fe4d92a4000-7fe4d92a5000 r--p 00015000 08:01 4194383 /lib/libgcc_s.so.1
7fe4d92a5000-7fe4d92a6000 rw-p 00016000 08:01 4194383 /lib/libgcc_s.so.1
7fe4d92c1000-7fe4d9342000 rw-p 00000000 00:00 0
7fe4d9342000-7fe4db22e000 rw-s 00000000 00:04 8716337 /SYSV0052ea91 (deleted)
7fe4db22e000-7fe4db23a000 r-xp 00000000 08:01 4194415 /lib/libnss_files-2.11.1.so
7fe4db23a000-7fe4db439000 ---p 0000c000 08:01 4194415 /lib/libnss_files-2.11.1.so
7fe4db439000-7fe4db43a000 r--p 0000b000 08:01 4194415 /lib/libnss_files-2.11.1.so
7fe4db43a000-7fe4db43b000 rw-p 0000c000 08:01 4194415 /lib/libnss_files-2.11.1.so
7fe4db43b000-7fe4db5b5000 r-xp 00000000 08:01 4194349 /lib/libc-2.11.1.so
7fe4db5b5000-7fe4db7b4000 ---p 0017a000 08:01 4194349 /lib/libc-2.11.1.so
7fe4db7b4000-7fe4db7b8000 r--p 00179000 08:01 4194349 /lib/libc-2.11.1.so
7fe4db7b8000-7fe4db7b9000 rw-p 0017d000 08:01 4194349 /lib/libc-2.11.1.so
7fe4db7b9000-7fe4db7be000 rw-p 00000000 00:00 0
7fe4db7be000-7fe4db840000 r-xp 00000000 08:01 4194398 /lib/libm-2.11.1.so
7fe4db840000-7fe4dba3f000 ---p 00082000 08:01 4194398 /lib/libm-2.11.1.so
7fe4dba3f000-7fe4dba40000 r--p 00081000 08:01 4194398 /lib/libm-2.11.1.so
7fe4dba40000-7fe4dba41000 rw-p 00082000 08:01 4194398 /lib/libm-2.11.1.so
7fe4dba41000-7fe4dba43000 r-xp 00000000 08:01 4194363 /lib/libdl-2.11.1.so
7fe4dba43000-7fe4dbc43000 ---p 00002000 08:01 4194363 /lib/libdl-2.11.1.so
7fe4dbc43000-7fe4dbc44000 r--p 00002000 08:01 4194363 /lib/libdl-2.11.1.so
7fe4dbc44000-7fe4dbc45000 rw-p 00003000 08:01 4194363 /lib/libdl-2.11.1.so
7fe4dbc45000-7fe4dbc65000 r-xp 00000000 08:01 4194325 /lib/ld-2.11.1.so
7fe4dbc85000-7fe4dbce7000 rw-p 00000000 00:00 0
7fe4dbce7000-7fe4dbd26000 r--p 00000000 08:01 5512971 /usr/lib/locale/en_DK.utf8/LC_CTYPE
7fe4dbd26000-7fe4dbe44000 r--p 00000000 08:01 5512650 /usr/lib/locale/en_DK.utf8/LC_COLLATE
7fe4dbe44000-7fe4dbe47000 rw-p 00000000 00:00 0
7fe4dbe58000-7fe4dbe59000 r--p 00000000 08:01 5515083 /usr/lib/locale/en_DK.utf8/LC_TIME
7fe4dbe59000-7fe4dbe5a000 r--p 00000000 08:01 5515084 /usr/lib/locale/en_DK.utf8/LC_MONETARY
7fe4dbe5a000-7fe4dbe5b000 r--p 00000000 08:01 5640299 /usr/lib/locale/en_DK.utf8/LC_MESSAGES/SYS_LC_MESSAGES
7fe4dbe5b000-7fe4dbe62000 r--s 00000000 08:01 5511621 /usr/lib/gconv/gconv-modules.cache
7fe4dbe62000-7fe4dbe64000 rw-p 00000000 00:00 0
7fe4dbe64000-7fe4dbe65000 r--p 0001f000 08:01 4194325 /lib/ld-2.11.1.so
7fe4dbe65000-7fe4dbe66000 rw-p 00020000 08:01 4194325 /lib/ld-2.11.1.so
7fe4dbe66000-7fe4dbe67000 rw-p 00000000 00:00 0
7ffffaedd000-7ffffaf0d000 rw-p 00000000 00:00 0 [stack]
7ffffaf8b000-7ffffaf8c000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
LOG:  server process (PID 21514) was terminated by signal 6: Aborted

Thanks.
Jesper


Attachments

Re: tsvector concatenation - backend crash

From
Tom Lane
Date:
Jesper Krogh <jesper@krogh.cc> writes:
> The attached SQL files give (at least in my hands) a reliable backend crash
> with this stack trace, reproduced on both 9.0.4 and HEAD. I'm sorry
> I cannot provide a more trimmed-down set of vectors that reproduces the
> bug, hence the "obfuscated" dataset. But even deleting single terms in
> the vectors makes the bug go away.

Hm ... I can reproduce this on one of my usual machines, but not
another.  What platform are you on exactly?
        regards, tom lane


Re: tsvector concatenation - backend crash

From
Jesper Krogh
Date:
On 2011-08-26 05:28, Tom Lane wrote:
> Jesper Krogh <jesper@krogh.cc> writes:
>> The attached SQL files give (at least in my hands) a reliable backend crash
>> with this stack trace, reproduced on both 9.0.4 and HEAD. I'm sorry
>> I cannot provide a more trimmed-down set of vectors that reproduces the
>> bug, hence the "obfuscated" dataset. But even deleting single terms in
>> the vectors makes the bug go away.
> Hm ... I can reproduce this on one of my usual machines, but not
> another.  What platform are you on exactly?
64 bit Ubuntu Lucid (amd64).

-- 
Jesper


Re: tsvector concatenation - backend crash

From
jesper@krogh.cc
Date:
> Hi
>
> The attached SQL files give (at least in my hands) a reliable backend crash
> with this stack trace, reproduced on both 9.0.4 and HEAD. I'm sorry
> I cannot provide a more trimmed-down set of vectors that reproduces the
> bug, hence the "obfuscated" dataset. But even deleting single terms in
> the vectors makes the bug go away.

OK, I found 8.3.0 to be "good", so I ran a git bisect on it. It gave
me this commit:

e6dbcb72fafa4031c73cc914e829a6dec96ab6b6 is the first bad commit
commit e6dbcb72fafa4031c73cc914e829a6dec96ab6b6
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Fri May 16 16:31:02 2008 +0000

    Extend GIN to support partial-match searches, and extend tsquery to support
    prefix matching using this facility.

    Teodor Sigaev and Oleg Bartunov

:040000 040000 febf59ba02bcd4ce3863e880c6bbd989e0b7b1d2 5e96383e628dd27b5c68b0186af18f80fb7ef129 M    doc
:040000 040000 b920deca6f074b83dd5d2bd0446785a23019d11a 3f10e54cdeac63129f34865adcadf34ff74ff9a8 M    src
bisect run success

This means that the 8.3 releases are OK, but 8.4 and later have the problem.

The commit at least touches the same area, although the patch is over 3K
lines and my C skills are not "that good".

Attached is the git bisect script, just for the archives.

Jesper
Attachments

Re: tsvector concatenation - backend crash

From
Tom Lane
Date:
Jesper Krogh <jesper@krogh.cc> writes:
> On 2011-08-26 05:28, Tom Lane wrote:
>> Hm ... I can reproduce this on one of my usual machines, but not
>> another.  What platform are you on exactly?

> 64 bit Ubuntu Lucid (amd64).

Huh, weird ... because the platform it's not failing for me on is
Fedora 14 x86_64.  Which is annoying, because that machine has better
tools for looking for memory stomps than the 32-bit HP box where I
do see the problem.  Anyway, will see what I can find.
        regards, tom lane


Re: tsvector concatenation - backend crash

From
Tom Lane
Date:
jesper@krogh.cc writes:
>> The attached SQL files give (at least in my hands) a reliable backend crash
>> with this stack trace, reproduced on both 9.0.4 and HEAD. I'm sorry
>> I cannot provide a more trimmed-down set of vectors that reproduces the
>> bug, hence the "obfuscated" dataset. But even deleting single terms in
>> the vectors makes the bug go away.

I found it.  tsvector_concat does this to compute the worst-case output
size needed:

    /* conservative estimate of space needed */
    out = (TSVector) palloc0(VARSIZE(in1) + VARSIZE(in2));

Unfortunately, that's not really worst case: it could be that the output
will require more alignment padding bytes than the inputs did, if there
is a mix of lexemes with and without position data.  For example, if in1
contains one lexeme of odd length without position data, and in2
contains one lexeme of even length with position data (and no pad byte),
and in1's lexeme sorts before in2's, then we will need a pad byte in the
second lexeme where there was none before.

The core of the fix is to suppose that we might need a newly-added pad
byte for each lexeme:

    out = (TSVector) palloc0(VARSIZE(in1) + VARSIZE(in2) + i1 + i2);

which really is an overestimate but I don't feel a need to be tenser
about it.  What I actually committed is a bit longer because I added
some comments and some Asserts ...
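
[Editorial illustration, not part of the original message.] The padding
argument above can be checked with a small model. The sketch below is not
PostgreSQL code: the Lexeme struct and the functions short_align,
data_bytes, estimate_old and estimate_fixed are hypothetical names, and
the layout is an assumption taken from the description in this message
(lexemes stored back to back, position data starting on a 2-byte boundary
and consisting of a 16-bit count plus 16-bit positions). It models only
the lexeme data area, not the varlena header or the entry array, and it
assumes i1 and i2 are the lexeme counts of the two inputs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified lexeme descriptor; not PostgreSQL's WordEntry. */
typedef struct
{
    size_t len;   /* lexeme length in bytes */
    size_t npos;  /* number of positions; 0 means no position data */
} Lexeme;

/* Round an offset up to the next multiple of 2 (assumed alignment). */
static size_t
short_align(size_t off)
{
    return (off + 1) & ~(size_t) 1;
}

/* Bytes occupied by the data area when lexemes are stored in array order. */
static size_t
data_bytes(const Lexeme *lex, size_t n)
{
    size_t off = 0;

    assert(lex != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        off += lex[i].len;
        if (lex[i].npos > 0)
        {
            off = short_align(off);                      /* may add a pad byte */
            off += sizeof(uint16_t) * (1 + lex[i].npos); /* count + positions */
        }
    }
    return off;
}

/* Estimate in the style of the old code: sum of the two inputs' sizes. */
static size_t
estimate_old(size_t bytes1, size_t bytes2)
{
    return bytes1 + bytes2;
}

/* Estimate in the style of the fix: one possible new pad byte per lexeme. */
static size_t
estimate_fixed(size_t bytes1, size_t n1, size_t bytes2, size_t n2)
{
    return bytes1 + bytes2 + n1 + n2;
}
```

Under these assumptions, an in1 holding one 3-byte lexeme without
positions occupies 3 bytes and an in2 holding one 2-byte lexeme with one
position occupies 6 bytes, while the merged layout (in1's lexeme first)
occupies 10 bytes. estimate_old yields 9, one byte short for the data
area alone, and estimate_fixed yields 11.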

> OK, I found 8.3.0 to be "good", so I ran a git bisect on it. It gave
> me this commit:
> 
> e6dbcb72fafa4031c73cc914e829a6dec96ab6b6 is the first bad commit
> commit e6dbcb72fafa4031c73cc914e829a6dec96ab6b6
> Author: Tom Lane <tgl@sss.pgh.pa.us>
> Date:   Fri May 16 16:31:02 2008 +0000
> 
>     Extend GIN to support partial-match searches, and extend tsquery to support
>     prefix matching using this facility.

AFAICT this is a red herring: the bug exists all the way back to where
tsvector_concat was added, in 8.3.  I think the reason that your test
case happens to not crash before this commit is that it changed the sort
ordering rules for lexemes.  As you can see from my minimal example
above, we might need different numbers of pad bytes depending on how the
lexemes sort relative to each other.
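
[Editorial illustration, not part of the original message.] The dependence
on sort order can be seen in a small model. This is a sketch under assumed
layout rules, not PostgreSQL code: Lexeme and pad_bytes are hypothetical
names, and the model assumes lexemes are stored back to back with each
block of position data (a 16-bit count plus 16-bit positions) starting at
an even offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified lexeme descriptor; not PostgreSQL's WordEntry. */
typedef struct
{
    size_t len;   /* lexeme length in bytes */
    size_t npos;  /* number of positions; 0 means no position data */
} Lexeme;

/* Count the pad bytes needed when lexemes are stored in array order. */
static size_t
pad_bytes(const Lexeme *lex, size_t n)
{
    size_t off = 0;
    size_t pads = 0;

    assert(lex != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        off += lex[i].len;
        if (lex[i].npos > 0)
        {
            size_t pad = off & 1;   /* odd offset: one pad byte needed */

            pads += pad;
            off += pad + sizeof(uint16_t) * (1 + lex[i].npos);
        }
    }
    return pads;
}
```

With a 3-byte lexeme without positions and a 2-byte lexeme with one
position, pad_bytes returns 1 when the 3-byte lexeme sorts first and 0
when the 2-byte lexeme sorts first, so in this model a change in the
comparison rules alone can change the space required.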

Anyway, patch is committed; thanks for the report!
        regards, tom lane


Re: tsvector concatenation - backend crash

From
Jesper Krogh
Date:
On 2011-08-26 23:02, Tom Lane wrote:
> AFAICT this is a red herring: the bug exists all the way back to where
> tsvector_concat was added, in 8.3.  I think the reason that your test
> case happens to not crash before this commit is that it changed the sort
> ordering rules for lexemes.  As you can see from my minimal example
> above, we might need different numbers of pad bytes depending on how the
> lexemes sort relative to each other.
>
> Anyway, patch is committed; thanks for the report!
I've just confirmed the fix. Thanks for your prompt action.

-- 
Jesper