Re: [HACKERS] SIGSEGV in BRIN autosummarize

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: [HACKERS] SIGSEGV in BRIN autosummarize
Дата
Msg-id 20171015015656.GC22678@telsasoft.com
обсуждение исходный текст
Ответ на Re: [HACKERS] SIGSEGV in BRIN autosummarize  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: [HACKERS] SIGSEGV in BRIN autosummarize  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: [HACKERS] SIGSEGV in BRIN autosummarize  (Justin Pryzby <pryzby@telsasoft.com>)
Список pgsql-hackers
On Fri, Oct 13, 2017 at 10:57:32PM -0500, Justin Pryzby wrote:
> > Also notice the vacuum process was interrupted, same as yesterday (think
> > goodness for full logs).  Our INSERT script is using python
> > multiprocessing.pool() with "maxtasksperchild=1", which I think means we load
> > one file and then exit the subprocess, and pool() creates a new subproc, which
> > starts a new PG session and transaction.  Which explains why autovacuum starts
> > processing the table only to be immediately interrupted.

On Sun, Oct 15, 2017 at 01:57:14AM +0200, Tomas Vondra wrote:
> I don't follow. Why does it explain that autovacuum gets canceled? I
> mean, merely opening a new connection/session should not cancel
> autovacuum. That requires a command that requires table-level lock
> conflicting with autovacuum (so e.g. explicit LOCK command, DDL, ...).

I was thinking that INSERT would do it, but I gather you're right about
autovacuum.  Let me get back to you about this..

> > Due to a .."behavioral deficiency" in the loader for those tables, the crashed
> > backend causes the loader to get stuck, so the tables should be untouched since
> > the crash, should it be desirable to inspect them.
> > 
> 
> It's a bit difficult to guess what went wrong from this backtrace. For
> me gdb typically prints a bunch of lines immediately before the frames,
> explaining what went wrong - not sure why it's missing here.

Do you mean this ?

...
Loaded symbols for /lib64/libnss_files-2.12.so
Core was generated by `postgres: autovacuum worker process   gtt             '.
Program terminated with signal 11, Segmentation fault.
#0  pfree (pointer=0x298c740) at mcxt.c:954
954             (*context->methods->free_p) (context, pointer);

> Perhaps some of those pointers are bogus, the memory was already pfree-d
> or something like that. You'll have to poke around and try dereferencing
> the pointers to find what works and what does not.
> 
> For example what do these gdb commands do in the #0 frame?
> 
> (gdb) p *(MemoryContext)context

(gdb) p *(MemoryContext)context
Cannot access memory at address 0x7474617261763a20

> (gdb) p *GetMemoryChunkContext(pointer)

(gdb) p *GetMemoryChunkContext(pointer)
No symbol "GetMemoryChunkContext" in current context.

I had to do this since it's apparently inlined/macro:
(gdb) p *(MemoryContext *) (((char *) pointer) - sizeof(void *))
$8 = (MemoryContext) 0x7474617261763a20

I uploaded the corefile:
http://telsasoft.com/tmp/coredump-postgres-autovacuum-brin-summarize.gz

Justin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Joe Conway
Дата:
Сообщение: Re: [HACKERS] pg_regress help output
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: [HACKERS] proposal - Default namespaces for XPath expressions(PostgreSQL 11)