Re: Rework the way multixact truncations work

From Andres Freund
Subject Re: Rework the way multixact truncations work
Date
Msg-id 20150922175727.GA1573@awork2.anarazel.de
In reply to Re: Rework the way multixact truncations work  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Rework the way multixact truncations work  (Robert Haas <robertmhaas@gmail.com>)
Re: Rework the way multixact truncations work  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On 2015-09-22 13:38:58 -0400, Robert Haas wrote:
> Regarding 0003, I'm still very much not convinced that it's a good
> idea to apply this to 9.3 and 9.4.  This patch changes the way we do
> truncation in those older releases; instead of happening at a
> restartpoint, it happens when oldestMultiXid advances.

The primary reason for doing that is that doing it at restartpoints is
simply *wrong*. Restartpoints aren't scheduled in sync with replay -
which means that a restartpoint can (will actually) happen long long
after the checkpoint from the primary has replayed.  Which means that by
the time the restartpoint is performed it's actually not unlikely that
we've already filled all SLRU segments. Which is bad if we then fail
over/start up.

Aside from the more fundamental issue that restartpoints have to be
"asynchronous" with respect to the checkpoint record for performance
reasons, there's a bunch of additional reasons making this even more
likely to occur: Differing checkpoint segments on the standby and
pending actions (which we got rid of in 9.5+, but ...)

> I realize that you disagree and will probably commit this to those
> branches anyway. But I want it to be clear that I don't endorse that.

I don't plan to commit/backpatch this over your objection.

I do think it'd be the better approach, and I personally think that
we're much more likely to introduce bugs if we backpatch this in a
year. Which I think we'll end up having to. The longer people run on
these branches, the more issues we'll see.

> I wish more people were paying attention to these patches.

+many

> Other issues:
> - If SlruDeleteSegment fails in unlink(), shouldn't we at the very
> least log a message?  If that file is still there when we loop back
> around, it's going to cause a failure, I think.

The existing unlink() call doesn't log anything either; that's the only
reason I didn't add a message there. I'm fine with adding a (LOG or
WARNING?) message.
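
For illustration only, here is a minimal standalone sketch of what reporting
a failed unlink() could look like. It is not the patch and not PostgreSQL
code: the function name delete_segment_logged is hypothetical, fprintf to
stderr stands in for the server's ereport(LOG, ...) call, and treating an
already-missing file (ENOENT) as success is an assumption made for this
example.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code: remove one segment file and
 * report a failure instead of silently ignoring it.
 *
 * Returns 0 if the file is gone afterwards (removed, or already missing),
 * -1 if unlink() failed for any other reason.
 */
int
delete_segment_logged(const char *path)
{
    if (unlink(path) == 0)
        return 0;

    /* Assumption: a file that is already missing needs no message. */
    if (errno == ENOENT)
        return 0;

    /* Stand-in for ereport(LOG, ...) or ereport(WARNING, ...). */
    fprintf(stderr, "LOG:  could not remove file \"%s\": %s\n",
            path, strerror(errno));
    return -1;
}
```

With a message like this, a segment file that survives truncation is at
least visible in the log before the next pass runs into it.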

Greetings,

Andres Freund


