Re: dsa_allocate() faliure

From Justin Pryzby
Subject Re: dsa_allocate() faliure
Date
Msg-id 20190211000215.GU31721@telsasoft.com
In response to Re: dsa_allocate() faliure  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: dsa_allocate() faliure  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On Mon, Feb 11, 2019 at 09:45:07AM +1100, Thomas Munro wrote:
> Ouch.  Yeah, that'd do it and matches the evidence.  With this change,
> I couldn't reproduce the problem after 90 minutes with a test case
> that otherwise hits it within a couple of minutes.
...
> Note that this patch addresses the error "dsa_allocate could not find
> %zu free pages".  (The error "dsa_area could not attach to segment" is
> something else and apparently rarer.)

"could not attach" is the error reported early this morning while
stress-testing this patch with queued_alters queries in loops, so that's
consistent with your understanding.  And I guess it preceded getting stuck on
the lock; although I don't know how long passed between the first and the
second, I'm guessing not long and perhaps immediately, since the rest of the
processes were all stuck as in bug#15585 rather than ERRORing once every few
minutes.

I mentioned that "could not attach to segment" occurs in either the leader or
a parallel worker.  Most of the time it causes only an ERROR and doesn't wedge
all future parallel workers.  Maybe the bug#15585 "wedged" state only occurs
after some pattern of leader+worker failures (?)  I've just triggered bug#15585
again, but if there's a pattern, I don't see it.

Please let me know whether you're able to reproduce the "not attach" bug using
simultaneous loops around the queued_alters query; it's easy here.

Justin

