Re: HyperLogLog.c and pg_leftmost_one_pos32()

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: HyperLogLog.c and pg_leftmost_one_pos32()
Дата
Msg-id CAH2-WzkGvDKVDo+0YvfvZ+1CE=iCi88DCOGFF3i1hTGGaxcKPw@mail.gmail.com
обсуждение исходный текст
Ответ на HyperLogLog.c and pg_leftmost_one_pos32()  (Jeff Davis <pgsql@j-davis.com>)
Ответы Re: HyperLogLog.c and pg_leftmost_one_pos32()  (Jeff Davis <pgsql@j-davis.com>)
Список pgsql-hackers
On Wed, Jul 29, 2020 at 10:08 AM Jeff Davis <pgsql@j-davis.com> wrote:
> Is there a reason that HyperLogLog doesn't use pg_leftmost_one_pos32()?

Yes: HyperLogLog predates pg_leftmost_one_pos32().

> I tried the following patch and some brief performance tests seem to
> show an improvement.

Makes sense.

How did you test this? What kind of difference are we talking about? I
ported this code from the upstream C++ as part of the original
abbreviated keys commit. Note that the cardinality of abbreviated keys
are displayed when you set "trace_sort = on".

> This came up because my recent commit 9878b643 uses HLL for estimating
> the cardinality of spill files, which solves a few annoyances with
> overpartitioning[1].

I think that you should change back the rhs() variable names to match
HyperLogLog upstream (as well as the existing rhs() comments).

> I think it's overall an improvement, but
> addHyperLogLog() itself seemed to show up as a cost, so it can hurt
> spilled-but-still-in-memory cases. I'd also like to backpatch this to
> 13 (as I already did for 9878b643), if that's acceptable.

I still wonder if it was ever necessary to add HLL to abbreviated
keys. It only served to avoid some pretty narrow worse cases, at the
expense of typical cases. Given that the existing users of HLL are
pretty narrow, and given the importance of preserving the favorable
performance characteristics of hash aggregate, I'm inclined to agree
that it's worth backpatching to 13 now. Assuming it is a really
measurable cost in practice.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: [PATCH] Initial progress reporting for COPY command
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: Default setting for enable_hashagg_disk