Re: WIP: store additional info in GIN index

From: Tomas Vondra
Subject: Re: WIP: store additional info in GIN index
Msg-id: 50BE7183.2080807@fuzzy.cz
In response to: Re: WIP: store additional info in GIN index (Alexander Korotkov <aekorotkov@gmail.com>)
Responses: Re: WIP: store additional info in GIN index
List: pgsql-hackers

On 4.12.2012 20:12, Alexander Korotkov wrote:
> Hi!
>
> On Sun, Dec 2, 2012 at 5:02 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
>
>     I've tried to apply the patch with the current HEAD, but I'm getting
>     segfaults whenever VACUUM runs (either called directly or from autovac
>     workers).
>
>     The patch applied cleanly against 9b3ac49e and needed a minor fix when
>     applied on HEAD (because of an assert added to ginRedoCreatePTree), but
>     that shouldn't be a problem.
>
>
> Thanks for testing! Patch is rebased with HEAD. The bug you reported was
> fixed.

Applies fine, but I get a segfault in dataPlaceToPage at gindatapage.c.
The whole backtrace is here: http://pastebin.com/YEPuWeuV

The messages written to the PostgreSQL log vary quite a bit. Usually they
look like this:

2012-12-04 22:31:08 CET 31839 LOG:  database system was not properly shut down; automatic recovery in progress
2012-12-04 22:31:08 CET 31839 LOG:  redo starts at 0/68A76E48
2012-12-04 22:31:08 CET 31839 LOG:  unexpected pageaddr 0/1BE64000 in log segment 000000010000000000000069, offset 15089664
2012-12-04 22:31:08 CET 31839 LOG:  redo done at 0/69E63638

but I've also seen this:

2012-12-04 22:20:29 CET 31709 LOG:  database system was not properly shut down; automatic recovery in progress
2012-12-04 22:20:29 CET 31709 LOG:  redo starts at 0/AEAFAF8
2012-12-04 22:20:29 CET 31709 LOG:  record with zero length at 0/C7D5698
2012-12-04 22:20:29 CET 31709 LOG:  redo done at 0/C7D55E


I wasn't able to prepare a simple testcase to reproduce this, so I've
attached two files from my "fun project" where I noticed it. It's a
simple DB + a bit of Python for indexing mbox archives inside Pg.

- create.sql - a database structure with a bunch of GIN indexes on
               tsvector columns on "messages" table

- load.py - script for parsing mbox archives / loading them into the
            "messages" table (warning: it's a bit messy)


Usage:

1) create the DB structure
$ createdb archives
$ psql archives < create.sql

2) fetch some archives (I consistently get a SIGSEGV after the first three)
$ wget http://archives.postgresql.org/pgsql-hackers/mbox/pgsql-hackers.1997-01.gz
$ wget http://archives.postgresql.org/pgsql-hackers/mbox/pgsql-hackers.1997-02.gz
$ wget http://archives.postgresql.org/pgsql-hackers/mbox/pgsql-hackers.1997-03.gz

3) gunzip and load them using the Python script
$ gunzip pgsql-hackers.*.gz
$ ./load.py --db archives pgsql-hackers.*

4) et voila - a SIGSEGV :-(


I suspect this might be related to the fact that the load.py script uses
savepoints quite heavily to handle UNIQUE_VIOLATION (duplicate messages).
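
To make the pattern concrete, here is a minimal sketch of savepoint-per-insert
duplicate handling. It is not the attached load.py: the two-column "messages"
table, the load_messages helper and the sample rows are hypothetical, and
sqlite3 is used only as a stand-in so the sketch runs without a server.
Against PostgreSQL the same SAVEPOINT / ROLLBACK TO SAVEPOINT / RELEASE
SAVEPOINT statements would be sent through the driver, and the exception
caught would be the driver's integrity error instead.

```python
import sqlite3


def load_messages(conn, messages):
    """Insert (message_id, body) pairs in one transaction, skipping duplicates.

    Hypothetical helper: each insert is wrapped in a savepoint so that a
    unique violation undoes only that insert, not the whole transaction.
    """
    inserted = 0
    skipped = 0
    cur = conn.cursor()
    cur.execute("BEGIN")
    for message_id, body in messages:
        cur.execute("SAVEPOINT msg")
        try:
            cur.execute(
                "INSERT INTO messages (message_id, body) VALUES (?, ?)",
                (message_id, body),
            )
        except sqlite3.IntegrityError:
            # Duplicate message: roll back to the savepoint and carry on.
            cur.execute("ROLLBACK TO SAVEPOINT msg")
            skipped += 1
        else:
            inserted += 1
        # The savepoint still exists after ROLLBACK TO, so release it either way.
        cur.execute("RELEASE SAVEPOINT msg")
    cur.execute("COMMIT")
    return inserted, skipped


# isolation_level=None stops the sqlite3 module from opening transactions
# implicitly, so the explicit BEGIN / COMMIT above are the only ones issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE messages (message_id TEXT PRIMARY KEY, body TEXT)")

sample = [("a@example", "first"), ("b@example", "second"), ("a@example", "dup")]
result = load_messages(conn, sample)
```

With many duplicates this creates and releases a large number of
subtransactions inside a single transaction, which is the access pattern I
suspect is involved.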


Tomas
