PATCH: Extending the HyperLogLog API a bit

From: Tomas Vondra
Subject: PATCH: Extending the HyperLogLog API a bit
Msg-id: 568594AD.5030101@2ndquadrant.com
Replies: Re: PATCH: Extending the HyperLogLog API a bit  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
         Re: PATCH: Extending the HyperLogLog API a bit  (Peter Geoghegan <pg@heroku.com>)
List: pgsql-hackers

Hi,

while working on the bloom filters for hashjoins, I've started using the 
HLL library committed as part of the sorting improvements for 9.5. I 
propose adding two more functions to the API, which I think are quite 
useful:

1) initHyperLogLogError(hyperLogLogState *cState, double error)

   Instead of specifying bwidth (essentially the number of bits used
   for addressing in the counter), this allows specifying the expected
   error rate for the counter, which is

      error_rate = 1.04 / sqrt(2^bwidth)

   So for 5% we get bwidth=9 (1.04 / sqrt(2^9) is about 4.6%), and so
   on. This makes the API a bit easier to use, because there are pretty
   much no comments about the meaning of bwidth, and the existing
   callers simply use 10 without details.
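
To make the mapping from error rate to bwidth concrete, here is a minimal standalone sketch (not the code from the patch). The helper name hll_bwidth_for_error and the 4..16 clamp are assumptions made for illustration only. It inverts the formula above: the counter needs 2^bwidth >= (1.04 / error)^2 registers. For example, a 5% target works out to bwidth=9 under this formula, while bwidth=5 corresponds to roughly 18%.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): return the smallest
 * bwidth whose expected error, 1.04 / sqrt(2^bwidth), does not exceed
 * the requested error. The 4..16 bounds are an assumption of this sketch.
 */
static uint8_t
hll_bwidth_for_error(double error)
{
    uint8_t bwidth = 4;
    double  needed;

    assert(error > 0.0);

    /* error >= 1.04 / sqrt(m)  is equivalent to  m >= (1.04 / error)^2 */
    needed = (1.04 / error) * (1.04 / error);

    /* Grow the register count until it is large enough, or we hit the cap. */
    while (bwidth < 16 && (double) ((uint64_t) 1 << bwidth) < needed)
        bwidth++;

    return bwidth;
}
```

With a helper like this, a caller could ask for "about 4% error" and get bwidth=10, which is the value the existing callers hard-code.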
 

2) freeHyperLogLog(hyperLogLogState *cState)

   I think it's a good idea to provide a function "undoing" what init
   does, i.e. freeing the internal memory etc. Currently that's trivial
   to do, but perhaps we'll make the structure more complicated in the
   future (albeit that might be unlikely).
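
The init/free pairing can be sketched as follows. This is a simplified standalone illustration, not the PostgreSQL structure or the patch itself: the sketch_hll_state type, its fields, and the use of calloc/free are all assumptions chosen so the example compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified counter state; not the PostgreSQL struct. */
typedef struct sketch_hll_state
{
    uint8_t  bwidth;        /* bits used for register addressing */
    size_t   nregisters;    /* 2^bwidth */
    uint8_t *registers;     /* one small counter per register */
} sketch_hll_state;

/* Allocate zeroed registers; returns 0 on success, -1 on failure. */
static int
sketch_hll_init(sketch_hll_state *state, uint8_t bwidth)
{
    assert(state != NULL);
    assert(bwidth >= 4 && bwidth <= 16);

    state->bwidth = bwidth;
    state->nregisters = (size_t) 1 << bwidth;
    state->registers = calloc(state->nregisters, sizeof(uint8_t));

    return state->registers != NULL ? 0 : -1;
}

/* Undo init: release the registers and leave the state safe to free again. */
static void
sketch_hll_free(sketch_hll_state *state)
{
    assert(state != NULL);

    free(state->registers);
    state->registers = NULL;
    state->nregisters = 0;
}
```

Keeping the cleanup behind a function means callers do not need to know which fields own memory, so the structure can change later without touching them.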
 

FWIW I've already used this in the patch marrying hash joins and bloom 
filters.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


