On 4.12.2012 20:12, Alexander Korotkov wrote:
> Hi!
>
> On Sun, Dec 2, 2012 at 5:02 AM, Tomas Vondra <tv@fuzzy.cz
> <mailto:tv@fuzzy.cz>> wrote:
>
> I've tried to apply the patch with the current HEAD, but I'm getting
> segfaults whenever VACUUM runs (either called directly or from autovac
> workers).
>
> The patch applied cleanly against 9b3ac49e and needed a minor fix when
> applied on HEAD (because of an assert added to ginRedoCreatePTree), but
> that shouldn't be a problem.
>
>
> Thanks for testing! Patch is rebased with HEAD. The bug you reported was
> fixed.
Applies fine, but I get a segfault in dataPlaceToPage at gindatapage.c.
The whole backtrace is here: http://pastebin.com/YEPuWeuV
The messages written to the PostgreSQL log vary quite a bit. Usually
they look like this:
2012-12-04 22:31:08 CET 31839 LOG: database system was not properly shut down; automatic recovery in progress
2012-12-04 22:31:08 CET 31839 LOG: redo starts at 0/68A76E48
2012-12-04 22:31:08 CET 31839 LOG: unexpected pageaddr 0/1BE64000 in log segment 000000010000000000000069, offset 15089664
2012-12-04 22:31:08 CET 31839 LOG: redo done at 0/69E63638
but I've also seen this:
2012-12-04 22:20:29 CET 31709 LOG: database system was not properly shut down; automatic recovery in progress
2012-12-04 22:20:29 CET 31709 LOG: redo starts at 0/AEAFAF8
2012-12-04 22:20:29 CET 31709 LOG: record with zero length at 0/C7D5698
2012-12-04 22:20:29 CET 31709 LOG: redo done at 0/C7D55E
I wasn't able to prepare a simple testcase to reproduce this, so I've
attached two files from my "fun project" where I noticed it. It's a
simple DB + a bit of Python for indexing mbox archives inside Pg.
- create.sql - a database structure with a bunch of GIN indexes on
tsvector columns on "messages" table
- load.py - script for parsing mbox archives / loading them into the
"messages" table (warning: it's a bit messy)
Usage:
1) create the DB structure
$ createdb archives
$ psql archives < create.sql
2) fetch some archives (I consistently get a SIGSEGV after the first three)
$ wget http://archives.postgresql.org/pgsql-hackers/mbox/pgsql-hackers.1997-01.gz
$ wget http://archives.postgresql.org/pgsql-hackers/mbox/pgsql-hackers.1997-02.gz
$ wget http://archives.postgresql.org/pgsql-hackers/mbox/pgsql-hackers.1997-03.gz
3) gunzip and load them using the python script
$ gunzip pgsql-hackers.*.gz
$ ./load.py --db archives pgsql-hackers.*
4) et voila - a SIGSEGV :-(
I suspect this might be related to the fact that the load.py script uses
savepoints quite heavily to handle UNIQUE_VIOLATION (duplicate messages).
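For anyone unfamiliar with the pattern, here is a minimal sketch of what "savepoint per row" loading looks like. This is not the actual load.py: the function name, table layout and sample rows are made up for illustration, and it uses the stdlib sqlite3 module only so that it runs without a server. Against PostgreSQL the same idea matters more, because a failed INSERT aborts the whole transaction unless you roll back to a savepoint.

```python
import sqlite3


def load_messages(conn, messages):
    """Insert rows one by one, skipping duplicates via a savepoint per row.

    Hypothetical helper, not taken from load.py.
    """
    inserted, skipped = 0, 0
    cur = conn.cursor()
    cur.execute("BEGIN")
    for message_id, body in messages:
        # One savepoint per row, so a duplicate only undoes that row.
        cur.execute("SAVEPOINT msg")
        try:
            cur.execute(
                "INSERT INTO messages (message_id, body) VALUES (?, ?)",
                (message_id, body),
            )
        except sqlite3.IntegrityError:
            # Unique violation: undo this row, keep the transaction alive.
            cur.execute("ROLLBACK TO SAVEPOINT msg")
            skipped += 1
        else:
            inserted += 1
        # The savepoint still exists after ROLLBACK TO, so release it
        # in both cases.
        cur.execute("RELEASE SAVEPOINT msg")
    cur.execute("COMMIT")
    return inserted, skipped


# isolation_level=None stops the sqlite3 module from issuing its own
# implicit BEGIN, so the explicit BEGIN/COMMIT above are the only ones.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE messages (message_id TEXT UNIQUE, body TEXT)")

# Made-up sample data: the third row duplicates the first message_id.
result = load_messages(
    conn,
    [("a@example", "one"), ("b@example", "two"), ("a@example", "dup")],
)
print(result)
```

With many duplicate messages in the archives this means a large number of savepoints and partial rollbacks within a single transaction, which is why I suspect it plays a role here.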
Tomas