Re: Better tracking of free space during SP-GiST index build

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Better tracking of free space during SP-GiST index build
Дата
Msg-id 9d10bdab-3040-4b3e-ac55-008e6d9c6ce3@2ndquadrant.com
обсуждение исходный текст
Ответ на Better tracking of free space during SP-GiST index build  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Better tracking of free space during SP-GiST index build  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers

On 08/25/2016 01:45 AM, Tom Lane wrote:
> Over in the thread about the SP-GiST inet opclass, I threatened to post
> a patch like this, and here it is.
>
> The basic idea is to track more than just the very latest page we've used
> in each of the page categories that SP-GiST works with.  I started with an
> arrangement that gave an equal number of cache slots to each category, but
> soon realized that that was dumb, because there are usually way more leaf
> pages than anything else.  So this version has a little table of how many
> slots to give to each category.  The constants could maybe use a bit more
> fiddling, if we have some more test data sets to try this on.
>
> On the IRRExplorer data set we discussed in the other thread, this reduces
> the index size from 132MB to 120MB.  Poking into that more closely with
> pg_filedump, the total free space within the index drops from 42MB to
> 28MB.  If you think those numbers don't add up, you're right --- this
> seems to result in more non-leaf tuples than before.  I'm not sure why;
> maybe more aggressive sucking up of free space results in more splits.
> (Maybe adjustment of the default spgist fillfactor would be in order
> to counteract that?)  But the index search time doesn't seem to be hurt,
> so perhaps there's nothing to worry about.
>
> As coded, this makes no attempt to preferentially select pages with the
> most or least free space.  I don't know if it'd be worth any cycles to
> do that.
>
> I'll put this in the commitfest queue.  It could use review from someone
> with the time and motivation to do performance testing/tuning.
>

I can do a bit of benchmarking on this, I guess - possibly next week, 
but I can't promise that 100%. I'm not a spgist-expert and I won't have 
time to dive into the code, so it'll be mostly blackbox testing. Any 
hints what would be worth/interesting to test?

ISTM it'd be interesting to test both index creation, maintenance and 
querying, while varying the fillfactor. What other parameters would be 
interesting to tweak, and what datasets might be useful?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Martín Marqués
Дата:
Сообщение: Re: pg_dump with tables created in schemas created by extensions
Следующее
От: Robert Haas
Дата:
Сообщение: increasing the default WAL segment size