Re: WIP: Fast GiST index build

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: WIP: Fast GiST index build
Дата
Msg-id CAPpHfdv7PjysfkGY_T-O4JmgTpdHM0QFqbPrwNmQ9S8Pn6gWmQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: WIP: Fast GiST index build  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Список pgsql-hackers
On Tue, Aug 16, 2011 at 4:04 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
I can see that that's equal to the formula given in the paper, log_B(M/4B), but I couldn't see any explanation for that formula in the paper. Your explanation makes sense, but where did it come from?
I didn't find it too. But it has to reservse memory for both sub-tree and active buffers. While we'are reserving memory for sub-tree in effective_cache_size and memory for last pages of buffers in maintenance_work_mem.
 
It seems a bit pessimistic: while it's true that the a subtree can't be larger than 2 * maxIndexTuplesPerPage ^ levelStep, you can put a tighter upper bound on it. The max size of a subtree of depth n can be calculated as the geometric series:

r^0 + r^1 + r^2 + ... + r^n = (1 - r^(n + 1)) / (1 - r)

where r = maxIndexTuplesPerPage. For r=2 those formulas are equal, but for a large r and small n (which is typical), the 2 * maxIndexTuplesPerPage^levelStep formula gives a value that's almost twice as large as the real max size of a subtree.
Thus, we can calculate:
levelstep = min(log_r(1 + effective_cache_size_in_pages*(r - 1)) - 1, log_r(maintenance_work_mem_in_pages - 1))
and get more precise result. But also we need at least very rough estimate of memory occupied by node buffers hash tab and path items.

------
With best regards,
Alexander Korotkov.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: walprotocol.h vs frontends
Следующее
От: Robert Haas
Дата:
Сообщение: Re: src/backend/storage/ipc/README