Re: Hard limit on WAL space used (because PANIC sucks)

From: Simon Riggs
Subject: Re: Hard limit on WAL space used (because PANIC sucks)
Msg-id: CA+U5nM+uJ-exh+xae5XXzm5snj8kUk4ZW9WFyzGrc4ZNHL024g@mail.gmail.com
In reply to: Re: Hard limit on WAL space used (because PANIC sucks)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Hard limit on WAL space used (because PANIC sucks)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List: pgsql-hackers

On 22 January 2014 01:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
>> How are we supposed to wait while e.g. holding ProcArrayLock? Aborting
>> transactions doesn't work either, that writes abort records which can
>> get significantly large.
>
> Yeah, that's an interesting point ;-).  We can't *either* commit or abort
> without emitting some WAL, possibly quite a bit of WAL.

Right, which is why we don't need to lock ProcArrayLock. As soon as we
try to write a commit or abort, it goes through the normal XLogInsert
route. As soon as wal_buffers fills, WALWriteLock will be held
continuously until we free some space.

Since ProcArrayLock isn't held, read-only users can continue.
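
To make the locking argument concrete, here is a minimal sketch in Python rather than PostgreSQL's C. The names wal_write_lock, writer_can_proceed and read_only_query are hypothetical stand-ins, not PostgreSQL APIs, and the sketch assumes that read-only queries emit no WAL at all, which is a simplification.

```python
import threading

# Hypothetical stand-in for WALWriteLock; this is not PostgreSQL's LWLock API.
wal_write_lock = threading.Lock()


def writer_can_proceed() -> bool:
    """Non-blocking probe: could a writer flush WAL right now?"""
    if wal_write_lock.acquire(blocking=False):
        wal_write_lock.release()
        return True
    return False


def read_only_query(table: dict, key: str):
    """Assumed to emit no WAL, so it never touches wal_write_lock."""
    return table.get(key)


table = {"a": 1}

# A backend hits the full disk and keeps the lock while it waits for space.
wal_write_lock.acquire()
blocked = not writer_can_proceed()          # other writers would queue here
read_result = read_only_query(table, "a")   # readers are unaffected

# Space is freed, the lock is released, and writers can resume.
wal_write_lock.release()
resumed = writer_can_proceed()
```

The probe uses a non-blocking acquire only so the example stays single-threaded; a real writer would simply block on the lock until it is released.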

As Jeff points out, the blocks being modified would stay locked until
space is freed up, which could make other users wait. The code
required to avoid that wait would be complex and not worth the
overhead.

Note that my proposal would not require aborting any in-flight
transactions; they would continue to completion as soon as space is
cleared.

My proposal again, so we can review how simple it was...

1. Allow a checkpoint to complete by updating the control file, rather
than writing WAL. The control file is already there and is fixed size,
so we can be more confident it will accept the update. We could add a
new checkpoint mode for that, or we could do that always for shutdown
checkpoints (my preferred option). EFFECT: Since a checkpoint can now
be called and complete without writing WAL, we are able to write dirty
buffers and then clean out WAL files to reduce space.

2. If we fill the disk when writing WAL, we do not PANIC; instead we
signal the checkpointer process to perform an immediate checkpoint and
then wait for its completion. EFFECT: Since we are holding
WALWriteLock, all other write users will soon wait for that lock,
either directly or indirectly.

Both of those points are relatively straightforward to implement and
this proposal minimises seldom-tested code paths.
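
The two steps above can be sketched as a toy simulation. This is Python, not PostgreSQL code: ToyCluster, DiskFull and every field name are hypothetical, sizes are arbitrary units, and the model assumes a checkpoint can recycle all retained WAL (in practice archiving or standbys may hold some back).

```python
class DiskFull(Exception):
    """Raised when the toy WAL volume has no room for another record."""


class ToyCluster:
    def __init__(self, wal_capacity: int):
        self.wal_capacity = wal_capacity  # room for WAL on the toy disk
        self.wal_used = 0                 # WAL currently retained on disk
        self.wal_position = 0             # total WAL ever written
        self.dirty_buffers = 0            # pages not yet flushed
        self.control_file_redo = 0        # checkpoint position in the control file

    def _append_wal(self, nbytes: int) -> None:
        if self.wal_used + nbytes > self.wal_capacity:
            raise DiskFull
        self.wal_used += nbytes
        self.wal_position += nbytes
        self.dirty_buffers += 1

    def checkpoint_via_control_file(self) -> None:
        """Step 1: finish a checkpoint without writing any WAL."""
        self.dirty_buffers = 0                       # write out dirty buffers
        self.control_file_redo = self.wal_position   # fixed-size update
        self.wal_used = 0                            # old WAL can be removed

    def insert(self, nbytes: int) -> int:
        """Step 2: on a full disk, checkpoint and retry instead of PANIC.

        Returns the number of checkpoints this insert triggered.
        """
        try:
            self._append_wal(nbytes)
            return 0
        except DiskFull:
            self.checkpoint_via_control_file()
            # Still raises DiskFull if one record exceeds the whole volume.
            self._append_wal(nbytes)
            return 1


cluster = ToyCluster(wal_capacity=100)
checkpoints = sum(cluster.insert(40) for _ in range(5))
```

The writer that hits the full disk is the one that waits for the checkpoint, so no transaction is aborted; it just resumes once space has been cleared.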

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


