Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

From: Peter Geoghegan
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Msg-id: CAH2-WzkU5B7Rh4LvevfeW5E5tg3YtgO5GVGaU6EEXtxmM2Nshg@mail.gmail.com
In reply to: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers

On Thu, Dec 19, 2019 at 6:55 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I pushed this earlier today -- it became commit 9f83468b. Attached is
> v27, which fixes the bitrot against the master branch.

Attached is v28, which fixes bitrot from my recent commits to refactor
VACUUM-related code in nbtpage.c.

Other changes:

* A big overhaul of the nbtree README changes -- "posting list splits"
now becomes its own section.

I tried to get the general idea across about posting lists in this new
section without repeating myself too much. Posting list splits are
probably the most subtle part of the overall design of the patch.
On the one hand, posting list splits piggy-back on a standard atomic
action (insertion into a leaf page, or leaf page split). On the other
hand, they're a separate and independent step at the conceptual level.
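
To make the "separate step" idea concrete, here is a rough,
self-contained model of a posting list split. It is not code from the
patch. It assumes the approach in which an incoming heap TID that falls
inside an existing posting list's TID range is swapped into the list,
while the list's old maximum TID becomes the item that actually gets
inserted. DemoTid, demo_tid_cmp() and demo_posting_split() are
hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a heap TID (hypothetical, not ItemPointerData). */
typedef struct
{
    unsigned    block;
    unsigned    offset;
} DemoTid;

/* Order TIDs by block number, then by offset within the block. */
static int
demo_tid_cmp(DemoTid a, DemoTid b)
{
    if (a.block != b.block)
        return a.block < b.block ? -1 : 1;
    if (a.offset != b.offset)
        return a.offset < b.offset ? -1 : 1;
    return 0;
}

/*
 * Hypothetical helper modelling a posting list split.
 *
 * tids[0..ntids-1] is sorted ascending, and newtid lies strictly between
 * the first and last entries.  The list keeps its length: newtid is placed
 * in sorted position, and the old maximum TID is returned so the caller
 * can insert it as the "new item" immediately after the posting list.
 */
static DemoTid
demo_posting_split(DemoTid *tids, size_t ntids, DemoTid newtid)
{
    DemoTid     oldmax;
    size_t      pos = 0;

    assert(ntids >= 2);
    assert(demo_tid_cmp(tids[0], newtid) < 0);
    assert(demo_tid_cmp(newtid, tids[ntids - 1]) < 0);

    oldmax = tids[ntids - 1];

    /* Find the first existing TID that sorts after newtid. */
    while (demo_tid_cmp(tids[pos], newtid) < 0)
        pos++;

    /* Shift the tail right by one slot, overwriting the old maximum. */
    memmove(&tids[pos + 1], &tids[pos],
            (ntids - 1 - pos) * sizeof(DemoTid));
    tids[pos] = newtid;

    return oldmax;
}
```

In this model the page-level insertion still happens exactly once, with
the returned TID. The rewrite of the posting list is the extra,
conceptually independent step that rides along with it.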

Hopefully the general idea comes across as clearly as possible. Some
feedback on that would be good.

* PageIndexTupleOverwrite() is now used for VACUUM's "updates", and
has been taught to not unset an LP_DEAD bit that happens to already be
set.

As the comments added by my recent commit 4b25f5d0 now mention, it's
important that VACUUM not unset LP_DEAD bits accidentally. VACUUM will
falsely unset the BTP_HAS_GARBAGE page flag at times, which isn't
ideal. Even still, unsetting LP_DEAD bits themselves is much worse
(even though BTP_HAS_GARBAGE exists purely to hint that one or more
LP_DEAD bits are set on the page).
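
As an illustration only, the behavior described above can be modelled
with a toy line pointer. The struct and function names below are
hypothetical and do not match the real ItemIdData layout or the real
PageIndexTupleOverwrite() signature. The point is just that an
overwrite updates the tuple's location and length while leaving the
flags field alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, loosely modelled on LP_NORMAL and LP_DEAD. */
typedef enum
{
    DEMO_LP_NORMAL = 1,
    DEMO_LP_DEAD = 3
} DemoLpFlags;

/* Toy line pointer: where the tuple lives, how long it is, and its state. */
typedef struct
{
    unsigned    off;
    unsigned    len;
    DemoLpFlags flags;
} DemoItemId;

/*
 * Overwrite-style update: record the tuple's new offset and length, but
 * deliberately never assign to flags, so an already-set dead hint survives.
 */
static void
demo_overwrite_item(DemoItemId *lp, unsigned newoff, unsigned newlen)
{
    assert(lp != NULL);
    lp->off = newoff;
    lp->len = newlen;
}

/*
 * Contrasting update that resets the state to "normal" as a side effect.
 * This is the kind of behavior that would silently lose a dead hint.
 */
static void
demo_set_normal(DemoItemId *lp, unsigned newoff, unsigned newlen)
{
    assert(lp != NULL);
    lp->off = newoff;
    lp->len = newlen;
    lp->flags = DEMO_LP_NORMAL;
}
```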

Maybe we should go further here, and reconsider whether or not VACUUM
should *ever* unset BTP_HAS_GARBAGE. AFAICT, the only advantage of
nbtree VACUUM clearing it is that doing so might save a backend a
useless scan of the line pointer array to check for the LP_DEAD bits
directly. But the backend will have to split the page when that
happens anyway, which is a far greater cost. It's probably not even
noticeable, since we're already doing lots of stuff with the page when
it happens.
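
The trade-off can be sketched with a toy page, purely as an
illustration. DemoPage, DEMO_HAS_GARBAGE and both functions are
hypothetical stand-ins, not nbtree code. One check trusts the
page-level hint, the other ignores it and walks the line pointer array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_HAS_GARBAGE 0x01   /* hypothetical stand-in for BTP_HAS_GARBAGE */

/* Toy page: a flags word plus one "dead" marker per line pointer. */
typedef struct
{
    unsigned    flags;
    size_t      nitems;
    const bool *item_dead;
} DemoPage;

/* Cheap check: trust the page-level hint and never look at the items. */
static bool
demo_has_dead_by_hint(const DemoPage *page)
{
    assert(page != NULL);
    return (page->flags & DEMO_HAS_GARBAGE) != 0;
}

/* Direct check: ignore the hint and scan every line pointer. */
static bool
demo_has_dead_by_scan(const DemoPage *page)
{
    assert(page != NULL);
    for (size_t i = 0; i < page->nitems; i++)
    {
        if (page->item_dead[i])
            return true;
    }
    return false;
}
```

If the hint has been falsely cleared while a dead item remains, the
hint-based check misses reclaimable space that the scan would find.
The scan is linear in the number of line pointers, which is small next
to the cost of the page split that would otherwise follow.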

The BTP_HAS_GARBAGE hint probably mattered back when the "getting
tired" mechanism was used (i.e. prior to commit dd299df8). Insertions
sometimes had a choice to make about which page to use, so quickly
getting an idea about LP_DEAD bits made a certain amount of
sense...but that's not how it works anymore. (Granted, we still do it
that way with pg_upgrade'd indexes from before Postgres 12, but I
don't think that that needs to be given any weight now.)

Thoughts on this?
--
Peter Geoghegan
