Re: ReadRecentBuffer() doesn't scale well

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: ReadRecentBuffer() doesn't scale well
Дата
Msg-id 20230629153948.rk7re27qz3qresdt@awork3.anarazel.de
обсуждение исходный текст
Ответ на Re: ReadRecentBuffer() doesn't scale well  (Thomas Munro <thomas.munro@gmail.com>)
Ответы Re: ReadRecentBuffer() doesn't scale well  (Thomas Munro <thomas.munro@gmail.com>)
Список pgsql-hackers
Hi,

On 2023-06-29 19:35:30 +1200, Thomas Munro wrote:
> I (re)discovered why I used the lock-then-pin approach.  In the
> comments I mentioned InvalidBuffer(), but the main problem is in its
> caller GetVictimBuffer() which has various sanity checks about
> reference counts that can occasionally fail if you have code randomly
> pinning any old buffer.

You're right. Specifically non-valid buffers are the issue.


> New idea: use the standard PinBuffer() function, but add a mode that
> doesn't pin invalid buffers (with caveat that you can perhaps get a
> false negative due to unlocked read, but never a false positive; see
> commit message).  Otherwise we'd have to duplicate all the same logic
> to use cmpxchg for ReadRecentBuffer(), or rethink the assumptions in
> that other code.

It might be worth using lock free code in more places before long, but I agree
with the solution here.


> As for the lack of usage bump in the back-branches, I think the
> options are: teach PinBuffer_Locked() to increment it optionally, or
> back-patch whatever we come up with for this.

Hm, or just leave it as is.


> For the root buffer optimisation, the obvious place for storage seems
> to be under rd_amcache.  It was originally invented for the cached
> metapage (commit d2896a9ed14) but could accommodate a new struct
> holding whatever we want.  Here is a patch to try that out.
> BTAMCacheData would also be a natural place to put future things
> including a copy of the root page itself, in later work on lock-free
> tricks.

I am wondering if we don't want something more generic than stashing this in
rd_amcache. Don't want to end up duplicating relevant code across the uses of
rd_amcache in every AM.


> @@ -663,38 +663,17 @@ ReadRecentBuffer(RelFileLocator rlocator, ForkNumber forkNum, BlockNumber blockN
>      else
>      {
>          bufHdr = GetBufferDescriptor(recent_buffer - 1);
> -        have_private_ref = GetPrivateRefCount(recent_buffer) > 0;
>  
> -        /*
> -         * Do we already have this buffer pinned with a private reference?  If
> -         * so, it must be valid and it is safe to check the tag without
> -         * locking.  If not, we have to lock the header first and then check.
> -         */
> -        if (have_private_ref)
> -            buf_state = pg_atomic_read_u32(&bufHdr->state);
> -        else
> -            buf_state = LockBufHdr(bufHdr);
> -
> -        if ((buf_state & BM_VALID) && BufferTagsEqual(&tag, &bufHdr->tag))
> +        /* Is it still valid and holding the right tag? */
> +        if (PinBuffer(bufHdr, NULL, true))

I do wonder if we should have an unlocked pre-check for a) the buffer being
valid and b) BufferTagsEqual() matching.  With such a pre-check the race for
increasing the usage count of the wrong buffer is quite small, without the
pre-check it doesn't seem that small anymore.

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: pgsql: Fix search_path to a safe value during maintenance operations.
Следующее
От: Jacob Champion
Дата:
Сообщение: Re: [PATCH] Honor PG_TEST_NOCLEAN for tempdirs