Yet another fast GiST build

From Andrey Borodin
Subject Yet another fast GiST build
Date
Msg-id 1A36620E-CAD8-4267-9067-FB31385E7C0D@yandex-team.ru
Replies Re: Yet another fast GiST build  (Darafei "Komяpa" Praliaskouski <me@komzpa.net>)
Re: Yet another fast GiST build  (Heikki Linnakangas <hlinnaka@iki.fi>)
Re: Yet another fast GiST build  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
Hi!

In many cases a GiST index can be built quickly using Z-order sorting.
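
For readers unfamiliar with Z-order, here is a minimal sketch of how a 2D point can be mapped to a single sortable key by interleaving the bits of its quantized coordinates. It is written in Python for illustration only and is not taken from the patch; the names interleave_bits and z_order_key, the 16-bit quantization, and the assumption that coordinates lie in [0, 1] are all hypothetical.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
    return key


def z_order_key(px: float, py: float, bits: int = 16) -> int:
    """Map a point with coordinates in [0, 1] to a Z-order key (illustrative)."""
    scale = (1 << bits) - 1
    # Quantize each coordinate onto an integer grid, clamping out-of-range input.
    xi = min(max(int(px * scale), 0), scale)
    yi = min(max(int(py * scale), 0), scale)
    return interleave_bits(xi, yi, bits)


# Sorting by the key visits the four quadrants in a "Z" pattern.
points = [(0.9, 0.9), (0.1, 0.1), (0.9, 0.1), (0.1, 0.9)]
ordered = sorted(points, key=lambda p: z_order_key(*p))
```

Points that are close in the plane tend to receive nearby keys, which is why a plain sort on this key can produce reasonably compact index pages.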

I've looked into the proof of concept by Nikita Glukhov [0], and it looks very interesting.
So I've implemented yet another version of a B-tree-like GiST build.
Its main use case and benefits can be summarized with a small example:

postgres=# create table x as select point (random(),random()) from generate_series(1,3000000,1);
SELECT 3000000
Time: 5061,967 ms (00:05,062)
postgres=# create index ON x using gist (point ) with (fast_build_sort_function=gist_point_sortsupport);
CREATE INDEX
Time: 6140,227 ms (00:06,140)
postgres=# create index ON x using gist (point );
CREATE INDEX
Time: 32061,200 ms (00:32,061)

As you can see, the Z-order build is about five times faster here (6.1 s versus 32.1 s). Select performance is roughly
the same. Also, the index is significantly smaller.
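
To illustrate what a sorted, B-tree-like build means, below is a rough Python sketch, not the patch's code: entries are sorted once, packed into full pages from left to right, and each page's bounding box becomes an entry on the level above. Packing pages full, instead of growing them through inserts and splits, is one plausible reason for a smaller index. The names build_levels, bounding_box and z_key, the page_capacity parameter, and the 8-bit key are assumptions made for this example.

```python
import random
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounding_box(boxes: List[Box]) -> Box:
    """Smallest box covering every box in a non-empty list."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def z_key(p: Point, bits: int = 8) -> int:
    """Coarse Z-order key for a point in the unit square (illustrative only)."""
    top = (1 << bits) - 1
    xi, yi = (min(max(int(c * (1 << bits)), 0), top) for c in p)
    return sum(((xi >> i) & 1) << (2 * i) | ((yi >> i) & 1) << (2 * i + 1)
               for i in range(bits))


def build_levels(points: List[Point], key: Callable[[Point], int],
                 page_capacity: int) -> List[List[Box]]:
    """Return one list of page bounding boxes per tree level, leaves first."""
    if page_capacity < 2:
        raise ValueError("page_capacity must be at least 2")
    # Leaf entries are degenerate boxes, sorted once by the supplied key.
    entries: List[Box] = [(x, y, x, y) for x, y in sorted(points, key=key)]
    levels: List[List[Box]] = []
    while True:
        # Fill each page completely before starting the next one.
        pages = [bounding_box(entries[i:i + page_capacity])
                 for i in range(0, len(entries), page_capacity)]
        levels.append(pages)
        if len(pages) <= 1:
            return levels
        entries = pages  # page boxes become the entries of the level above


rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(1000)]
levels = build_levels(pts, z_key, page_capacity=100)
```

With 1000 points and 100 entries per page, this yields 10 leaf pages and a single root page; no page splits or reinsertion happen during the build.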

Nikita's PoC is faster because it uses a parallel build, but it intervenes in the B-tree code a lot (for reuse). This
patchset is GiST-isolated.
My biggest concern is that passing a function through a relation option seems a bit hacky. You can pass any function
matching the sort support signature there.
Embedding this function into the opclass makes no sense, though: it does not affect scans in any way.

In the current version, docs and tests are not implemented. I want to discuss the overall design first. Do we really
want yet another GiST build if it is 3-10 times faster?

Thanks!

Best regards, Andrey Borodin.

[0] https://github.com/postgres/postgres/compare/master...glukhovn:gist_btree_build

