Re: BUG #13985: Segmentation fault on PREPARE TRANSACTION

From Andres Freund
Subject Re: BUG #13985: Segmentation fault on PREPARE TRANSACTION
Date
Msg-id 20160224215221.bbbduc5nelq7tf6s@alap3.anarazel.de
In reply to Re: BUG #13985: Segmentation fault on PREPARE TRANSACTION  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: BUG #13985: Segmentation fault on PREPARE TRANSACTION
List pgsql-bugs
On 2016-02-24 17:52:37 -0300, Alvaro Herrera wrote:
> chris.tessels@inergy.nl wrote:
>
> >     Core was generated by `postgres: mailinfo_ow mailinfo_ods 10.50.6.6(4188'.
> >     Program terminated with signal 11, Segmentation fault.
> >
> >     #0  MinimumActiveBackends (min=50) at procarray.c:2472
> >     2472            if (pgxact->xid == InvalidTransactionId)
>
> It's not surprising that you're not able to make this crash
> consistently, because it looks like the problem might be in concurrent
> modifications to the PGXACT array.  This routine, MinimumActiveBackends,
> walks the PGPROC array explicitly without locks.  There are comments
> indicating that this is safe, but evidently something has slipped in
> there.
>
> Apparently this code is trying to dereference an invalid pgxact, but
> it's not clear to me how this happens.  Those structs are allocated in
> advance, and they are referenced in the code via array indexes, so even
> if the pgxact doesn't actually hold data about a valid transaction,
> dereferencing the XID shouldn't cause a crash.

Well, that code is pretty, uh, questionable. E.g. for
        int            pgprocno = arrayP->pgprocnos[index];
        volatile PGPROC *proc = &allProcs[pgprocno];
        volatile PGXACT *pgxact = &allPgXact[pgprocno];
there's no guarantee that pgprocno is actually the same index for both
lookups and the following
        if (pgprocno == -1)
            continue;            /* do not count deleted entries */
check.  Since pgprocno isn't volatile, it's perfectly reasonable for a
compiler to reload it from arrayP->pgprocnos[index] on each use, or to
never keep it in a register at all.

I presume what happened here is that initially arrayP->pgprocnos[index]
was -1, but by the time if (pgprocno == -1) is reached, it changed to a
different value.
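
The interleaving described above can be modeled deterministically. This is a hand-written sketch, not PostgreSQL code: shared_pgprocnos, concurrent_writer and reloaded_reads_disagree are hypothetical names, and the second load is written out explicitly to stand in for a reload the compiler would be allowed to emit. It only compares the two loaded values and never performs the out-of-bounds access itself.

```c
#define SKETCH_NSLOTS 4

/*
 * Hypothetical stand-in for the shared arrayP->pgprocnos[] array;
 * slot 0 starts out as a deleted entry (-1).
 */
static int shared_pgprocnos[SKETCH_NSLOTS] = { -1, 1, 2, 3 };

/* Simulates another backend reusing slot `index` between two loads. */
static void
concurrent_writer(int index)
{
    shared_pgprocnos[index] = 2;
}

/*
 * Models code where every use of the local "pgprocno" is a fresh load
 * from shared memory.  The first load feeds the array lookup, the
 * second feeds the == -1 test.  Returns 1 when the two loads disagree.
 */
static int
reloaded_reads_disagree(int index, int *lookup_idx, int *checked_idx)
{
    *lookup_idx = shared_pgprocnos[index];  /* load used for &allPgXact[...] */
    concurrent_writer(index);               /* slot changes in between */
    *checked_idx = shared_pgprocnos[index]; /* reload used for the == -1 test */
    return *lookup_idx != *checked_idx;
}
```

Calling reloaded_reads_disagree(0, &a, &b) leaves a at -1 and b at 2. In the real loop that would mean the -1 test sees a live entry while the pointer was computed from the stale -1, so the later pgxact->xid read goes through &allPgXact[-1].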

It's also really crummy that we're doing the PGPROC/PGXACT lookups
before checking whether pgprocno is -1.


At the very least ISTM that we have to make pgprocno volatile (or use a
memory barrier - but we don't have sufficient support for those in the
older branches), and move the PGPROC/PGXACT lookups after the == -1
check.
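
A minimal sketch of that reordering, under stated assumptions: SketchPgXact and sketch_count_active are hypothetical, simplified stand-ins (the real MinimumActiveBackends also looks at PGPROC fields and compares the count against min), and this is not the patch that was eventually committed. It shows only the two changes proposed here: the index is copied once into a volatile local, and the array lookup happens after the -1 test.

```c
typedef unsigned int TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Simplified stand-in for PGXACT: only the field this loop reads. */
typedef struct SketchPgXact
{
    TransactionId xid;
} SketchPgXact;

/*
 * Counts entries that have a valid xid, skipping deleted slots.
 * pgprocnos plays the role of arrayP->pgprocnos, which other backends
 * may modify concurrently in the real code.
 */
static int
sketch_count_active(const int *pgprocnos, int numProcs,
                    const volatile SketchPgXact *allPgXact)
{
    int         count = 0;
    int         index;

    for (index = 0; index < numProcs; index++)
    {
        /*
         * The shared slot is copied once into a volatile local, so the
         * test and the lookup below both read this copy rather than
         * the shared array.
         */
        volatile int pgprocno = pgprocnos[index];
        const volatile SketchPgXact *pgxact;

        if (pgprocno == -1)
            continue;           /* do not count deleted entries */

        /* Lookup moved after the -1 check. */
        pgxact = &allPgXact[pgprocno];

        if (pgxact->xid == InvalidTransactionId)
            continue;           /* do not count if no XID assigned */

        count++;
    }

    return count;
}
```

With pgprocnos = {0, -1, 2} and xids {5, 7, 0}, only the first entry is counted: the -1 slot is skipped before any lookup, and slot 2 has no XID assigned.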

Andres
