Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

From: Peter Geoghegan
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date:
Msg-id: CAH2-Wz=Tr6mxMsKRmv_=9-05_O9QWqOzQ8GweRV2DXS6+Y38QQ@mail.gmail.com
In reply to: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
List: pgsql-hackers
On Fri, Jan 10, 2020 at 1:36 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Still, v29 doesn't resolve the following points you've raised, where I
> haven't reached a final opinion on what to do myself. These items are
> as follows (I'm quoting your modified patch file sent on January 8th
> here):

Still no progress on these items, but I am now posting v30. A new
version seems warranted, because I now want to revive a patch from a
couple of years back as part of the deduplication project -- it would
be good to get feedback on that sooner rather than later. This is a
patch that you [Heikki] are already familiar with -- the patch to
speed up compactify_tuples() [1]. Sokolov Yura is CC'd here, since he
is the original author.

The deduplication patch is much faster with this in place. For
example, with v30:

pg@regression:5432 [25216]=# create unlogged table foo(bar int4);
CREATE TABLE
pg@regression:5432 [25216]=# create index unlogged_foo_idx on foo(bar);
CREATE INDEX
pg@regression:5432 [25216]=# insert into foo select g from
generate_series(1, 1000000) g, generate_series(1,10) i;
INSERT 0 10000000
Time: 17842.455 ms (00:17.842)

If I revert the "Bucket sort for compactify_tuples" commit locally,
then the same insert statement takes 31.614 seconds! In other words,
the insert statement is made ~77% faster by that commit alone. The
improvement is stable and reproducible.

Clearly there is a big compactify_tuples() bottleneck that comes from
PageIndexMultiDelete(). The hot spot is quite visible with "perf top
-e branch-misses".

The compactify_tuples() patch stalled because it wasn't clear if it
was worth the trouble at the time. It was originally written to
address a much smaller PageRepairFragmentation() bottleneck in heap
pruning. ISTM that deduplication alone is a good enough reason to
commit this patch. I haven't really changed anything about the
2017/2018 patch -- I need to do more review of that. We probably don't
need the qsort() inlining stuff (the bucket sort thing is the real
win), but I included it in v30 all the same.
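
To sketch the general idea of the bucket sort (this is only a rough,
self-contained illustration, not the actual 2017/2018 patch -- the names
ItemEntry and bucket_sort_desc, and the bucket geometry, are made up for
the example): item offsets on a page are bounded by BLCKSZ, so instead
of a general-purpose qsort() we can distribute the entries into a small
number of fixed-width buckets, then run a single insertion sort pass
over the nearly-sorted result.

#include <stdint.h>
#include <string.h>

#define BLCKSZ          8192
#define N_BUCKETS       32          /* assumption: 256-byte-wide buckets */
#define BUCKET_SHIFT    8           /* BLCKSZ / N_BUCKETS == 1 << 8 */

typedef struct ItemEntry
{
    uint16_t    offset;             /* current offset of tuple on page */
    uint16_t    len;                /* tuple length */
} ItemEntry;

/* Sort entries by offset, descending, using one scratch array */
void
bucket_sort_desc(ItemEntry *items, int nitems, ItemEntry *scratch)
{
    int         counts[N_BUCKETS] = {0};
    int         starts[N_BUCKETS];
    int         pos = 0;

    /* count how many entries fall into each bucket */
    for (int i = 0; i < nitems; i++)
        counts[items[i].offset >> BUCKET_SHIFT]++;

    /* highest offsets must come first, so lay buckets out top-down */
    for (int b = N_BUCKETS - 1; b >= 0; b--)
    {
        starts[b] = pos;
        pos += counts[b];
    }

    /* distribute entries into their buckets */
    for (int i = 0; i < nitems; i++)
    {
        int         b = items[i].offset >> BUCKET_SHIFT;

        scratch[starts[b]++] = items[i];
    }

    /*
     * The array is now sorted at bucket granularity; a single insertion
     * sort pass cheaply fixes the remaining within-bucket disorder.
     */
    for (int i = 1; i < nitems; i++)
    {
        ItemEntry   key = scratch[i];
        int         j = i - 1;

        while (j >= 0 && scratch[j].offset < key.offset)
        {
            scratch[j + 1] = scratch[j];
            j--;
        }
        scratch[j + 1] = key;
    }

    memcpy(items, scratch, sizeof(ItemEntry) * nitems);
}

compactify_tuples() would then memmove each tuple towards the end of the
page in that order, much as it does today; only the sort strategy
changes.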

Other changes in v30:

* We now avoid extra _bt_compare() calls within _bt_check_unique() --
no need to call _bt_compare() once per TID (once per equal tuple is
quite enough).

This is a noticeable performance win, even though the change was
originally intended to make the logic in _bt_check_unique() clearer.
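
In simplified terms, the new control flow looks something like this
(an illustrative standalone sketch, not the real nbtinsert.c code --
DupTuple, compare_key, tid_is_live, and has_live_duplicate are all
stand-in names):

#include <stdbool.h>

typedef struct HeapTid
{
    unsigned    blkno;
    unsigned short offnum;
} HeapTid;

typedef struct DupTuple
{
    int         key;                /* single-attribute key, for simplicity */
    int         ntids;              /* > 1 means a posting list tuple */
    HeapTid     tids[8];
} DupTuple;

/* Stand-in for the per-tuple key comparison (think _bt_compare()) */
static int
compare_key(const DupTuple *tup, int insert_key)
{
    return (tup->key > insert_key) - (tup->key < insert_key);
}

/* Stand-in for the per-TID heap check; trivially "dead" here */
static bool
tid_is_live(const HeapTid *tid)
{
    (void) tid;
    return false;
}

/*
 * One key comparison per tuple; the inner loop over the tuple's TIDs
 * does no further comparisons.  The old arrangement effectively paid
 * for a comparison on every TID of a posting list tuple.
 */
bool
has_live_duplicate(const DupTuple *tuples, int ntuples, int insert_key)
{
    for (int i = 0; i < ntuples; i++)
    {
        if (compare_key(&tuples[i], insert_key) != 0)
            continue;               /* not equal: none of its TIDs matter */

        for (int j = 0; j < tuples[i].ntids; j++)
        {
            if (tid_is_live(&tuples[i].tids[j]))
                return true;        /* found a live conflicting duplicate */
        }
    }
    return false;
}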

* Reduced the limit on the size of a posting list tuple to 1/6 of a
page -- down from 1/3.

This seems like a good idea on the grounds that it keeps our options
open if we split a page full of duplicates due to UPDATEs rather than
INSERTs (i.e. we split a page full of duplicates that isn't also the
rightmost page among pages that store only those duplicates). A lower
limit is more conservative, and yet doesn't cost us that much space.
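
To put rough numbers on that (assuming the default 8KB BLCKSZ, and
ignoring the details of how page overhead enters into the calculation):
1/3 of a page is about 2730 bytes, while 1/6 is about 1365 bytes, so the
largest possible posting list tuple shrinks by roughly 1.3KB.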

* Refined nbtsort.c/CREATE INDEX to work sensibly with non-standard
fillfactor settings.

This last item is a minor bugfix, really.

[1] https://commitfest.postgresql.org/14/1138/
-- 
Peter Geoghegan

Attachments
