Re: [HACKERS] A design for amcheck heapam verification

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: [HACKERS] A design for amcheck heapam verification
Дата
Msg-id CAH2-Wzmsi_cPaF65AgD14z4KphTned+njSgptkWqoYiDL=qYxA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] A design for amcheck heapam verification  (Thomas Munro <thomas.munro@enterprisedb.com>)
Ответы Re: [HACKERS] A design for amcheck heapam verification  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
On Tue, Aug 29, 2017 at 7:22 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Indeed.  Thank you for working on this!  To summarise a couple of
> ideas that Peter and I discussed off-list a while back:  (1) While
> building the hash table for a hash join we could build a Bloom filter
> per future batch and keep it in memory, and then while reading from
> the outer plan we could skip writing tuples out to future batches if
> there is no chance they'll find a match when read back in later (works
> only for inner joins and only pays off in inverse proportion to the
> join's selectivity);

Right. Hash joins do tend to be very selective, though, so I think
that this could help rather a lot. With just 8 or 10 bits per element,
you can eliminate almost all the batch write-outs on the outer side.
No per-worker synchronization for BufFiles is needed when this
happens, either. It seems like that could be very important.

> To use this for anything involving parallelism where a Bloom filter
> must be shared we'd probably finish up having to create a shared
> version of bloom_init() that either uses caller-provided memory and
> avoids the internal pointer, or allocates DSA memory.  I suppose you
> could consider splitting your bloom_init() function up into
> bloom_estimate() and bloom_init(user_supplied_space, ...) now, and
> getting rid of the separate pointer to bitset (ie stick char
> bitset[FLEXIBLE_ARRAY_MEMBER] at the end of the struct)?

Makes sense. Not hard to add that.

> I was just observing that there is an opportunity for code reuse here.
> This function's definition would ideally be something like:
>
> double
> bloom_prop_bits_set(bloom_filter *filter)
> {
>     return popcount(filter->bitset, NBYTES(filter)) / (double) NBITS(filter);
> }
>
> This isn't an objection to the way you have it, since we don't have a
> popcount function yet!  We have several routines in the tree for
> counting bits, though not yet the complete set from Hacker's Delight.

Right. I'm also reminded of the lookup tables for the visibility/freeze map.

> Your patch brings us one step closer to that goal.  (The book says
> that this approach is good far sparse bitsets, but your comment says
> that we expect something near 50%.  That's irrelevant anyway since a
> future centralised popcount() implementation would do this in
> word-sized chunks with a hardware instruction or branch-free-per-word
> lookups in a table and not care at all about sparseness.)

I own a copy of Hacker's Delight (well, uh, Daniel Farina lent me his
copy about 2 years ago!). pop()/popcount() does seem like a clever
algorithm, that we should probably think about adopting in some cases,
but I should point at that the current caller to my
bloom_prop_bits_set() function is an elog() DEBUG1 call. This is not
at all performance critical.

> + * Test if bloom filter definitely lacks element.
>
> I think where "Bloom filter" appears in prose it should have a capital
> letter (person's name).

Agreed.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] segment size depending *_wal_size defaults (wasincreasing the default WAL segment size)
Следующее
От: Andres Freund
Дата:
Сообщение: Re: [HACKERS] segment size depending *_wal_size defaults (wasincreasing the default WAL segment size)