Re: WAL Insertion Lock Improvements

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: WAL Insertion Lock Improvements
Дата
Msg-id 20230725224647.2g2i4dzi72yjffsm@awork3.anarazel.de
обсуждение исходный текст
Ответ на Re: WAL Insertion Lock Improvements  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-hackers
Hi,

On 2023-07-26 07:40:31 +0900, Michael Paquier wrote:
> On Tue, Jul 25, 2023 at 09:49:01AM -0700, Andres Freund wrote:
> > FWIW, I'm working on a patch that replaces WAL insert locks as a whole,
> > because they don't scale all that well.
>
> What were you looking at here?  Just wondering.

Here's what I had written offlist a few days ago:

> The basic idea is to have a ringbuffer of in-progress insertions. The
> acquisition of a position in the ringbuffer is done at the same time as
> advancing the reserved LSN, using a 64bit xadd. The trick that makes that
> possible is to use the high bits of the atomic for the position in the
> ringbuffer. The point of using the high bits is that they wrap around, without
> affecting the rest of the value.
>
> Of course, when using xadd, we can't keep the "prev" pointer in the
> atomic. That's where the ringbuffer comes into play. Whenever one inserter has
> determined the byte pos of its insertion, it updates the "prev byte pos" in
> the *next* ringbuffer entry.
>
> Of course that means that insertion N+1 needs to wait for N to set the prev
> position - but that happens very quickly. In my very hacky prototype the
> relevant path (which for now just spins) is reached very rarely, even when
> massively oversubscribed. While I've not implemented that, N+1 could actually
> do the first "iteration" in CopyXLogRecordToWAL() before it needs the prev
> position, the COMP_CRC32C() could happen "inside" the buffer.
>
>
> There's a fair bit of trickyness in turning that into something working, of
> course :).  Ensuring that the ring buffer of insertions doesn't wrap around is
> non-trivial. Nor is trivial to ensure that the "reduced" space LSN in the
> atomic can't overflow.
>
> I do wish MAX_BACKENDS were smaller...
>
>
> Until last night I thought all my schemes would continue to need something
> like the existing WAL insertion locks, to implement
> WALInsertLockAcquireExclusive().
>
> But I think I came up with an idea to do away with that (not even prototyped
> yet): Use one bit in the atomic that indicates that no new insertions are
> allowed. Whenever the xadd finds that old value actually was locked, it
> "aborts" the insertion, and instead waits for a condition variable (or
> something similar). Of course that's after modifying the atomic - to deal with
> that the "lock holder" reverts all modifications that have been made to the
> atomic when releasing the "lock", they weren't actually successful and all
> those backends will retry.
>
> Except that this doesn't quite suffice - XLogInsertRecord() needs to be able
> to "roll back", when it finds that we now need to log FPIs. I can't quite see
> how to make that work with what I describe above. The only idea I have so far
> is to just waste the space with a NOOP record - it should be pretty rare. At
> least if we updated RedoRecPtr eagerly (or just stopped this stupid business
> of having an outdated backend-local copy).
>
>
>
> My prototype shows this idea to be promising. It's a tad slower at low
> concurrency, but much better at high concurrency. I think most if not all of
> the low-end overhead isn't inherent, but comes from having both old and new
> infrastructure in place (as well as a lot of debugging cruft).


Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: WAL Insertion Lock Improvements
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Improve pg_stat_statements by making jumble handle savepoint names better