Re: BUG #10533: 9.4 beta1 assertion failure in autovacuum process

From Heikki Linnakangas
Subject Re: BUG #10533: 9.4 beta1 assertion failure in autovacuum process
Msg-id 5390BA0F.2030103@vmware.com
In response to BUG #10533: 9.4 beta1 assertion failure in autovacuum process  (levertond@googlemail.com)
Responses Re: BUG #10533: 9.4 beta1 assertion failure in autovacuum process  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-bugs
On 06/05/2014 09:01 PM, levertond@googlemail.com wrote:
> The following bug has been logged on the website:
>
> Bug reference:      10533
> Logged by:          David Leverton
> Email address:      levertond@googlemail.com
> PostgreSQL version: 9.4beta1
> Operating system:   RHEL 5 x86_64
> Description:
>
> Our application's test suite triggers an assertion failure in an autovacuum
> process under 9.4 beta1.  I wasn't able to reduce it to a nice test case,
> but I hope the backtrace illustrates the problem:

Yes, it does, thanks for the report!

> #0  0x00000032bae30265 in raise () from /lib64/libc.so.6
> #1  0x00000032bae31d10 in abort () from /lib64/libc.so.6
> #2  0x000000000078b69d in ExceptionalCondition (conditionName=<value
> optimized out>, errorType=<value optimized out>,
>      fileName=<value optimized out>, lineNumber=<value optimized out>) at
> assert.c:54
> #3  0x00000000007ad6e2 in palloc (size=16) at mcxt.c:670
> #4  0x00000000004d3592 in GetMultiXactIdMembers (multi=75092,
> members=0x7fff915f9468, allow_old=0 '\000') at multixact.c:1242
> #5  0x0000000000495c9c in MultiXactIdGetUpdateXid (xmax=17061,
> t_infomask=<value optimized out>) at heapam.c:6059
> #6  0x00000000007ba93c in HeapTupleHeaderIsOnlyLocked (tuple=0x42a5) at
> tqual.c:1539
> #7  0x00000000007baf2c in HeapTupleSatisfiesVacuum (htup=<value optimized
> out>, OldestXmin=67407, buffer=347) at tqual.c:1174
> #8  0x00000000005a96eb in heap_page_is_all_visible (onerel=0x2b1b020f3f58,
> blkno=86, buffer=347, tupindex=339, vacrelstats=0x1cfe3148,
>      vmbuffer=0x7fff915fa65c) at vacuumlazy.c:1788
> #9  lazy_vacuum_page (onerel=0x2b1b020f3f58, blkno=86, buffer=347,
> tupindex=339, vacrelstats=0x1cfe3148, vmbuffer=0x7fff915fa65c)
>      at vacuumlazy.c:1220
> ...

MultiXactIdGetUpdateXid() calls GetMultiXactIdMembers(), which can fail
if you run out of memory. That's not cool if you're in a critical
section, as the error will be promoted to PANIC; the assertion checks
that you don't call palloc() while in a critical section, to catch that
kind of problem early. The potential for a problem is there in 9.3 as
well, but the assertion was only added to 9.4 fairly recently. That
function requires very little memory, so it's highly unlikely to fail
with OOM in practice, but in theory it could.
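
To illustrate the rule the assertion enforces, here is a minimal toy
model, not PostgreSQL's actual code. All of the toy_* names are
hypothetical stand-ins invented for this sketch: a counter tracks
critical-section nesting, and the allocator asserts that the counter is
zero, so an allocation that could fail is caught in an assert-enabled
build before it can ever turn into a PANIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a critical-section nesting counter. */
static int toy_crit_section_count = 0;

static void toy_start_crit_section(void)
{
    toy_crit_section_count++;
}

static void toy_end_crit_section(void)
{
    assert(toy_crit_section_count > 0);
    toy_crit_section_count--;
}

static bool toy_in_crit_section(void)
{
    return toy_crit_section_count > 0;
}

/*
 * Toy allocator. Allocation can fail, and a failure inside a critical
 * section would be escalated, so we assert that we are not in one.
 * With assertions enabled, a call made inside a critical section
 * aborts here, which is the kind of failure the backtrace shows.
 */
static void *toy_alloc(size_t size)
{
    assert(!toy_in_crit_section());
    return malloc(size);
}
```

Under this model the safe pattern is to do every toy_alloc() call before
toy_start_crit_section(), and to touch only preallocated memory until
toy_end_crit_section().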

I think we'll need a variant of GetMultiXactIdMembers() that only
returns the update XID, avoiding the palloc(). The straightforward fix
would be to copy-paste the contents of GetMultiXactIdMembers() into
MultiXactIdGetUpdateXid(), and instead of returning the members in an
array, only return the update XID. But it's a long and complicated
function, so copy-pasting is not a good option. I think it needs to be
refactored into some kind of a helper function that both
MultiXactIdGetUpdateXid() and GetMultiXactIdMembers() could call.
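
One possible shape for such a refactoring, sketched here with toy types
rather than the real multixact code: a shared helper walks the members
and hands each one to a callback, so one caller can pick out the update
XID without allocating while the other copies the members into an array.
Every name below (ToyMember, toy_walk_members and so on) is hypothetical,
and the in-memory array merely stands in for the real member lookup.
This is only one way to do it, not necessarily how the fix should look.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified member representation. */
typedef enum { TOY_LOCK_ONLY, TOY_UPDATE } ToyStatus;
typedef struct { uint32_t xid; ToyStatus status; } ToyMember;

/* Visitor callback; returning false stops the walk early. */
typedef bool (*ToyMemberVisitor)(const ToyMember *m, void *arg);

/*
 * Shared helper. In real code this is where the long lookup logic
 * would live; here it just walks an array. It never allocates.
 * Returns the number of members visited.
 */
static size_t toy_walk_members(const ToyMember *stored, size_t n,
                               ToyMemberVisitor visit, void *arg)
{
    size_t visited = 0;

    for (size_t i = 0; i < n; i++)
    {
        visited++;
        if (!visit(&stored[i], arg))
            break;
    }
    return visited;
}

/* Caller 1: find the updating member's XID, with no allocation. */
static bool toy_find_update(const ToyMember *m, void *arg)
{
    if (m->status == TOY_UPDATE)
    {
        *(uint32_t *) arg = m->xid;
        return false;           /* found it, stop walking */
    }
    return true;
}

static uint32_t toy_get_update_xid(const ToyMember *stored, size_t n)
{
    uint32_t xid = 0;           /* 0 means "no updater" in this toy */

    toy_walk_members(stored, n, toy_find_update, &xid);
    return xid;
}

/* Caller 2: copy all members into a freshly allocated array. */
typedef struct { ToyMember *out; size_t count; } ToyCollect;

static bool toy_collect(const ToyMember *m, void *arg)
{
    ToyCollect *c = arg;

    c->out[c->count++] = *m;
    return true;
}

static ToyMember *toy_get_members(const ToyMember *stored, size_t n,
                                  size_t *count)
{
    ToyCollect c = { NULL, 0 };

    *count = 0;
    if (n == 0)
        return NULL;
    c.out = malloc(n * sizeof(ToyMember));
    if (c.out == NULL)
        return NULL;            /* caller must handle allocation failure */
    toy_walk_members(stored, n, toy_collect, &c);
    *count = c.count;
    return c.out;
}
```

In this sketch only toy_get_members() can fail for lack of memory, so
toy_get_update_xid() would be the variant that is safe to call from
inside a critical section.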

- Heikki
