Re: HyperLogLog.c and pg_leftmost_one_pos32()

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: HyperLogLog.c and pg_leftmost_one_pos32()
Дата
Msg-id 1a112df8b6d82c2701c57f9457f872ecc20128b3.camel@j-davis.com
обсуждение исходный текст
Ответ на Re: HyperLogLog.c and pg_leftmost_one_pos32()  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: HyperLogLog.c and pg_leftmost_one_pos32()  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On Wed, 2020-07-29 at 17:32 -0700, Peter Geoghegan wrote:
> How did you test this? What kind of difference are we talking about?

Essentially:
  initHyperLogLog(&hll, 5)
  for i in 0 .. one billion
    addHyperLogLog(&hll, hash(i))
  estimateHyperLogLog

The numbers are the same regardless of bwidth.

Before my patch, it takes about 15.6s. After my patch, it takes about
6.6s, so it's more than a 2X speedup (including the hash calculation).

As a part of HashAgg, when I test the worst case, it goes from a 4-5%
penalty to ~1% (within noise).

> I think that you should change back the rhs() variable names to match
> HyperLogLog upstream (as well as the existing rhs() comments).

Done.

> > I think it's overall an improvement, but
> > addHyperLogLog() itself seemed to show up as a cost, so it can hurt
> > spilled-but-still-in-memory cases. I'd also like to backpatch this
> > to
> > 13 (as I already did for 9878b643), if that's acceptable.
> 
> I still wonder if it was ever necessary to add HLL to abbreviated
> keys. It only served to avoid some pretty narrow worse cases, at the
> expense of typical cases. Given that the existing users of HLL are
> pretty narrow, and given the importance of preserving the favorable
> performance characteristics of hash aggregate, I'm inclined to agree
> that it's worth backpatching to 13 now. Assuming it is a really
> measurable cost in practice.

Yes, the difference (at least in a tight loop, on my machine) is not
subtle. I went ahead and committed and backpatched.

Regards,
    Jeff Davis






В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: Missing CFI in hlCover()?
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: HyperLogLog.c and pg_leftmost_one_pos32()