Re: Unhappy about API changes in the no-fsm-for-small-rels patch

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Unhappy about API changes in the no-fsm-for-small-rels patch
Дата
Msg-id CAA4eK1LE0f-EBoKdKWaShpL_fE6TBdfjjceQ1fEFrtL2ffExiw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Unhappy about API changes in the no-fsm-for-small-rels patch  (Andres Freund <andres@anarazel.de>)
Ответы Re: Unhappy about API changes in the no-fsm-for-small-rels patch  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Mon, Apr 22, 2019 at 10:34 PM Andres Freund <andres@anarazel.de> wrote:
> On 2019-04-22 18:49:44 +0530, Amit Kapila wrote:
> >  /*
> > @@ -1132,9 +1110,6 @@ fsm_local_set(Relation rel, BlockNumber cur_nblocks)
> >       BlockNumber blkno,
> >                               cached_target_block;
> >
> > -     /* The local map must not be set already. */
> > -     Assert(!FSM_LOCAL_MAP_EXISTS);
> > -
> >       /*
> >        * Starting at the current last block in the relation and working
> >        * backwards, mark alternating blocks as available.
> > @@ -1142,7 +1117,7 @@ fsm_local_set(Relation rel, BlockNumber cur_nblocks)
> >       blkno = cur_nblocks - 1;
> >       while (true)
> >       {
> > -             fsm_local_map.map[blkno] = FSM_LOCAL_AVAIL;
> > +             rel->fsm_local_map->map[blkno] = FSM_LOCAL_AVAIL;
> >               if (blkno >= 2)
> >                       blkno -= 2;
> >               else
> > @@ -1150,13 +1125,13 @@ fsm_local_set(Relation rel, BlockNumber cur_nblocks)
> >       }
> >
> >       /* Cache the number of blocks. */
> > -     fsm_local_map.nblocks = cur_nblocks;
> > +     rel->fsm_local_map->nblocks = cur_nblocks;
> >
> >       /* Set the status of the cached target block to 'unavailable'. */
> >       cached_target_block = RelationGetTargetBlock(rel);
> >       if (cached_target_block != InvalidBlockNumber &&
> >               cached_target_block < cur_nblocks)
> > -             fsm_local_map.map[cached_target_block] = FSM_LOCAL_NOT_AVAIL;
> > +             rel->fsm_local_map->map[cached_target_block] = FSM_LOCAL_NOT_AVAIL;
> >  }
>
> I think there shouldn't be any need for this anymore. After this change
> we shouldn't invalidate the map until there's no space on it - thereby
> addressing the cost overhead, and greatly reducing the likelihood that
> the local FSM can lead to increased bloat.
>

If we invalidate it only when there's no space on the page, then when
should we set it back to available, because if we don't do that, then
we might miss the space due to concurrent deletes.

>
> >  /*
> > @@ -1168,18 +1143,18 @@ fsm_local_set(Relation rel, BlockNumber cur_nblocks)
> >   * This function is used when there is no FSM.
> >   */
> >  static BlockNumber
> > -fsm_local_search(void)
> > +fsm_local_search(Relation rel)
> >  {
> >       BlockNumber target_block;
> >
> >       /* Local map must be set by now. */
> > -     Assert(FSM_LOCAL_MAP_EXISTS);
> > +     Assert(FSM_LOCAL_MAP_EXISTS(rel));
> >
> > -     target_block = fsm_local_map.nblocks;
> > +     target_block = rel->fsm_local_map->nblocks;
> >       do
> >       {
> >               target_block--;
> > -             if (fsm_local_map.map[target_block] == FSM_LOCAL_AVAIL)
> > +             if (rel->fsm_local_map->map[target_block] == FSM_LOCAL_AVAIL)
> >                       return target_block;
> >       } while (target_block > 0);
> >
> > @@ -1189,7 +1164,22 @@ fsm_local_search(void)
> >        * first, which would otherwise lead to the same conclusion again and
> >        * again.
> >        */
> > -     FSMClearLocalMap();
> > +     fsm_clear_local_map(rel);
>
> I'm not sure I like this. My inclination would be that we should be able
> to check the local fsm repeatedly even if there's no space in the
> in-memory representation - otherwise the use of the local FSM increases
> bloat.
>

Do you mean to say that we always check all the pages (say 4)
irrespective of their state in the local map?

I think we should first try to see in this new scheme (a) when to set
the map, (b) when to clear it, (c) how to use.  I have tried to
summarize my thoughts about it, let me know what do you think about
the same?

When to set the map.
At the beginning (the first time relation is used in the backend), the
map will be clear.  When the first time in the backend, we find that
FSM doesn't exist and the number of blocks is lesser than
HEAP_FSM_CREATION_THRESHOLD, we set the map for the total blocks that
exist at that time and mark all or alternate blocks as available.

Also, when we find that none of the blocks are available in the map
(basically they are marked invalid which means we have previously
checked that there is no space in them), we should get the number of
blocks and if they are less than the threshold, then add it to the
map.


When to clear the map?
Once we find out that the particular page doesn't have space, we can
mark the corresponding page in the map as invalid (or not available to
check).  After relation extension, we can check if the latest block is
greater than the threshold value, then we can clear the map.  At
truncate or some other similar times, when relcache entry is
invalidated, automatically the map will be cleared.

How to use the map?
Now, whenever we find the map exists, we can check the blocks that are
marked as available in it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: Regression test PANICs with master-standby setup on samemachine
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: Regression test PANICs with master-standby setup on samemachine