Re: GiST VACUUM

From: Andrey Borodin
Subject: Re: GiST VACUUM
Msg-id: 479336FE-B713-4DFB-9661-9BFCCA70F4EF@yandex-team.ru
In reply to: Re: GiST VACUUM (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses: Re: GiST VACUUM (Andrey Borodin <x4mmm@yandex-team.ru>)
List: pgsql-hackers

> On 19 July 2018, at 16:28, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Hmm. So, while we are scanning the right sibling, which was moved to lower-numbered block because of a concurrent split, the original page is split again? That's OK, we've already scanned all the tuples on the original page, before we recurse to deal with the right sibling. (The corresponding B-tree code also releases the lock on the original page when recursing)
Seems right.
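
To make the scenario concrete, here is a toy Python model of following a right-link to a lower-numbered block. It is only a sketch under stated assumptions, not the PostgreSQL code: `Page`, `split_during_vacuum` and `vacuum_page` are hypothetical names, and the flag stands in for whatever the real code uses to detect a split that happened after vacuum started.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    tuples: List[str]
    rightlink: Optional[int] = None
    # Hypothetical flag: the page was split after vacuum started.
    split_during_vacuum: bool = False


def vacuum_page(pages, orig_blkno, seen):
    """Scan one block, then chase right-links that point behind the scan."""
    blkno = orig_blkno
    while blkno is not None:
        page = pages[blkno]
        # All tuples on this page are processed before moving on, so a later
        # split of this page cannot hide anything from us.
        seen.extend(page.tuples)
        follow = (
            page.split_during_vacuum
            and page.rightlink is not None
            and page.rightlink < orig_blkno
        )
        blkno = page.rightlink if follow else None


# Block 1 is empty (think: a recycled page) when the scan passes it.
pages = [Page(["a"]), Page([]), Page(["b"]), Page(["c", "d"])]
seen = []
for blkno in (0, 1, 2):
    vacuum_page(pages, blkno, seen)

# Concurrent split of block 3: its right half lands in lower-numbered block 1.
pages[1] = Page(["d"])
pages[3] = Page(["c"], rightlink=1, split_during_vacuum=True)

vacuum_page(pages, 3, seen)
print(sorted(seen))
```

Without the right-link check, tuple "d" would be missed in this model, because the physical-order scan had already passed block 1.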

>
> I did some refactoring, to bring this closer to the B-tree code, for the sake of consistency. See attached patch. This also eliminates the 2nd pass by gistvacuumcleanup(), in case we did that in the bulkdelete-phase already.
Thanks!

>
> There was one crucial thing missing: in the outer loop, we must ensure that we scan all pages, even those that were added after the vacuum started.
Correct. There is quite neat logic behind the order of acquiring npages, comparing, and vacuuming the page. The notes in the FIXME look correct, except for the function names.
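
The ordering can be sketched with a toy Python model of the outer loop. This is an illustration in the style of btvacuumscan(), not the patch itself: `vacuum_scan`, `ToyIndex` and its methods are hypothetical names, and the real code reads the relation length while holding the relation extension lock.

```python
def vacuum_scan(get_nblocks, vacuum_page, first_blkno=0):
    """Visit every block once, including blocks added while the scan runs."""
    blkno = first_blkno
    while True:
        # Re-read the relation length on every outer iteration.
        num_pages = get_nblocks()
        # Quit only when nothing was added since the previous pass.
        if blkno >= num_pages:
            break
        while blkno < num_pages:
            vacuum_page(blkno)
            blkno += 1
    return blkno


class ToyIndex:
    """Hypothetical index that grows by two blocks while block 1 is vacuumed."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.visited = []

    def get_nblocks(self):
        return self.nblocks

    def vacuum_page(self, blkno):
        self.visited.append(blkno)
        if blkno == 1:
            self.nblocks += 2  # simulate concurrent page splits


idx = ToyIndex(4)
scanned = vacuum_scan(idx.get_nblocks, idx.vacuum_page)
print(idx.visited, scanned)
```

In this model the inner loop stops at the length read before the index grew, and the next outer iteration picks up the two new blocks.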

> There's a comment explaining that in btvacuumscan(). This version fixes that.
>
> I haven't done any testing on this. Do you have any test scripts you could share?
I just use simple tests that set up replication and do random inserts and vacuums. Not rocket science, just a mutated script:
# Assumes the cube extension is already installed in the postgres database.
for i in $(seq 1 12); do
    size=$((100 * 2**$i))
    # Case 1: delete every row, then vacuum twice.
    ./psql postgres -c "create table x as select cube(random()) c from generate_series(1,$size) y; create index on x using gist(c);"
    ./psql postgres -c "delete from x;"
    ./psql postgres -c "VACUUM x;"
    ./psql postgres -c "VACUUM x;"
    ./psql postgres -c "drop table x;"
    # Case 2: delete most rows, vacuum, refill, and compare index size before and after VACUUM FULL.
    ./psql postgres -c "create table x as select cube(random()) c from generate_series(1,$size) y; create index on x using gist(c);"
    ./psql postgres -c "delete from x where (c~>1)>0.1;"
    ./psql postgres -c "VACUUM x;"
    ./psql postgres -c "insert into x select cube(random()) c from generate_series(1,$size) y;"
    ./psql postgres -c "VACUUM x;"
    ./psql postgres -c "delete from x where (c~>1)>0.1;"
    ./psql postgres -c "select pg_size_pretty(pg_relation_size('x_c_idx'));"
    ./psql postgres -c "VACUUM FULL x;"
    ./psql postgres -c "select pg_size_pretty(pg_relation_size('x_c_idx'));"
    ./psql postgres -c "drop table x;"
done

> I think we need some repeatable tests for the concurrent split cases.
It is hard to trigger left splits until we delete pages. I'll try to hack gistNewBuffer() to cause something similar.

> Even if it involves gdb or some other hacks that we can't include in the regression test suite, we need something now, while we're hacking on this.
>
> One subtle point, that I think is OK, but gave me a pause, and probably deserves comment somewhere: A concurrent root split can turn a leaf page into one internal (root) page, and two new leaf pages. The new root page is placed in the same block as the old page, while both new leaf pages go to freshly allocated blocks. If that happens while vacuum is running, might we miss the new leaf pages? As the code stands, we don't do the "follow-right" dance on internal pages, so we would not recurse into the new leaf pages. At first, I thought that's a problem, but I think we can get away with it. The only scenario where a root split happens on a leaf page, is when the index has exactly one page, a single leaf page. Any subsequent root splits will split an internal page rather than a leaf page, and we're not bothered by those. In the case that a root split happens on a single-page index, we're OK, because we will always scan that page either before, or after the split. If we scan the single page before the split, we see all the leaf tuples on that page. If we scan the single page after the split, it means that we start the scan after the split, and we will see both leaf pages as we continue the scan.
Yes, only page 0 may change type, and page 0 cannot split to the left.


I'm working on triggering a left split during vacuum. I'll get back to you when it's done. Thanks!

Best regards, Andrey Borodin.
