RE: [Patch] Optimize dropping of relation buffers using dlist

Поиск
Список
Период
Сортировка
От k.jamison@fujitsu.com
Тема RE: [Patch] Optimize dropping of relation buffers using dlist
Дата
Msg-id OSBPR01MB234172819B3D59F608FCE33DEF3E0@OSBPR01MB2341.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: [Patch] Optimize dropping of relation buffers using dlist  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Ответы RE: [Patch] Optimize dropping of relation buffers using dlist
Список pgsql-hackers
On Wednesday, September 16, 2020 5:32 PM, Kyotaro Horiguchi wrote:
> At Wed, 16 Sep 2020 10:05:32 +0530, Amit Kapila <amit.kapila16@gmail.com>
> wrote in
> > On Wed, Sep 16, 2020 at 9:02 AM Kyotaro Horiguchi
> > <horikyota.ntt@gmail.com> wrote:
> > >
> > > At Wed, 16 Sep 2020 08:33:06 +0530, Amit Kapila
> > > <amit.kapila16@gmail.com> wrote in
> > > > On Wed, Sep 16, 2020 at 7:46 AM Kyotaro Horiguchi
> > > > <horikyota.ntt@gmail.com> wrote:
> > > By the way I'm not sure that actually happens, but if one smgrextend
> > > call exnteded the relation by two or more blocks, the cache is
> > > invalidated and succeeding smgrnblocks returns lseek()'s result.
> > >
> >
> > Can you think of any such case? I think in recovery we use
> > XLogReadBufferExtended->ReadBufferWithoutRelcache for reading the
> page
> > which seems to be extending page-by-page but there could be some case
> > where that is not true. One idea is to run regressions and add an
> > Assert to see if we are extending more than a block during recovery.
>
> I agree with you. Actually XLogReadBufferExtended is the only point to read a
> page while recovery and seems calling ReadBufferWithoutRelcache page by
> page up to the target page. The only case I found where the cache is
> invalidated is ALTER TABLE SET TABLESPACE while wal_level=minimal and
> not during recovery. smgrextend is called without smgrnblocks called at the
> time.
>
> Considering that the behavior of lseek can be a problem only just after
> extending a file, an assertion in smgrextend seems to be enough. Although,
> I'm not confident on the diagnosis.
>
> --- a/src/backend/storage/smgr/smgr.c
> +++ b/src/backend/storage/smgr/smgr.c
> @@ -474,7 +474,14 @@ smgrextend(SMgrRelation reln, ForkNumber forknum,
> BlockNumber blocknum,
>      if (reln->smgr_cached_nblocks[forknum] == blocknum)
>          reln->smgr_cached_nblocks[forknum] = blocknum + 1;
>      else
> +    {
> +        /*
> +         * DropRelFileNodeBuffers relies on the behavior that
> nblocks cache
> +         * won't be invalidated by file extension while recoverying.
> +         */
> +        Assert(!InRecovery);
>          reln->smgr_cached_nblocks[forknum] =
> InvalidBlockNumber;
> +    }
>  }
>
> > > Don't
> > > we need to guarantee the cache to be valid while recovery?
> > >
> >
> > One possibility could be that we somehow detect that the value we are
> > using is cached one and if so then only do this optimization.
>
> I basically like this direction.  But I'm not sure the additional parameter for
> smgrnblocks is acceptable.
>
> But on the contrary, it might be a better design that DropRelFileNodeBuffers
> gives up the optimization when smgrnblocks(,,must_accurate = true) returns
> InvalidBlockNumber.
>

Thank you for your thoughtful reviews and discussions Horiguchi-san, Tsunakawa-san and Amit-san.
Apologies for my carelessness. I've addressed the bugs in the previous version.
1. Getting the total number of blocks for all the specified forks
2. Hashtable probing conditions

I added the suggestion of putting an assert on smgrextend for the XLogReadBufferExtended case,
and I thought that would be enough. I think modifying the smgrnblocks with the addition of new
parameter would complicate the source code because a number of functions call it.
So I thought that maybe putting BlockNumberIsValid(nblocks) in the condition would suffice.
Else, we do full scan of buffer pool.

+                       if ((nblocks / (uint32)NBuffers) < BUF_DROP_FULLSCAN_THRESHOLD &&
+                               BlockNumberIsValid(nblocks))

+                       else
+                       {
                //full scan

Attached is the v14 of the patch. It compiles and passes the tests.
Hoping for your continuous reviews and feedback. Thank you very much.

Regards,
Kirk Jamison


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexey Kondratov
Дата:
Сообщение: Re: Concurrency issue in pg_rewind
Следующее
От: Amit Langote
Дата:
Сообщение: Re: logical/relation.c header description