Re: [HACKERS] shared memory based stat collector (was: Sharingrecord typmods between backends)

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: [HACKERS] shared memory based stat collector (was: Sharingrecord typmods between backends)
Дата
Msg-id 20170814171218.2z7r3x335fakcwa7@alvherre.pgsql
обсуждение исходный текст
Ответ на Re: [HACKERS] shared memory based stat collector (was: Sharing recordtypmods between backends)  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: [HACKERS] shared memory based stat collector (was: Sharing recordtypmods between backends)  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Robert Haas wrote:

> Actually, now that you mention it, I think it *is* broken already, or
> more to the point, if you configure that value, the autovacuum
> launcher hangs up, because of the AutovacuumWorkItem stuff that Alvaro
> added.  When I just tested it, the AV launcher somehow ended up
> waiting for AutovacuumLock and I had to SIGQUIT the server to shut it
> down.  That's actually not really entirely the fault of
> dynamic_shared_memory_type = none, though, because the code in
> autovacuum.c does this:
> 
>                 AutoVacuumDSA = dsa_create(AutovacuumLock->tranche);
>                 /* make sure it doesn't go away even if we do */
>                 dsa_pin(AutoVacuumDSA);
>                 dsa_pin_mapping(AutoVacuumDSA);
> 
> Now, that's actually really broken because if dsa_create() throws an
> error of any kind, you're going to have already assigned the value to
> AutoVacuumDSA, but you will not have pinned the DSA or the DSA
> mapping.  There's evidently some additional bug here because I'd sorta
> expect this code to just go into an infinite loop in this case,
> failing over and over trying to reattach the segment, but evidently
> something even worse happening - perhaps the ERROR isn't releasing
> AutovacuumLock.

Yeah, the problem that lwlocks aren't released is because the launcher
is not in a transaction at that point, so AbortCurrentTransaction()
doesn't release locks like it normally would.  The simplest fix (in the
attached 0001 patch) is to add a LWLockReleaseAll() call to the jmp
block, though I wonder if there should be some other cleanup functions
called from there, or whether perhaps it'd be a better strategy to have
the launcher run in a transaction at all times.

The other problem is that there's no attempt to handle a failed DSA
creation/attachment.  The second patch just adds a PG_TRY block that
sets a flag not to try the DSA calls again if the first one fails.  It
throws a single ERROR line, then autovacuum continues without workitem
support.

I intend to give these patches further thought before pushing anything,
will update this thread no later than tomorrow 19:00 UTC-0300.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: [HACKERS] shared memory based stat collector (was: Sharingrecord typmods between backends)
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] pl/perl extension fails on Windows