Re: Avoiding hash join batch explosions with extreme skew and weird stats

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Re: Avoiding hash join batch explosions with extreme skew and weird stats
Дата
Msg-id CAAKRu_Zs-RsQypoMUmMO3=xHoF08w8ShEGZL=iH4Uyyhv8YeHA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Avoiding hash join batch explosions with extreme skew and weird stats  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Avoiding hash join batch explosions with extreme skew and weirdstats  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers


On Tue, Jun 4, 2019 at 12:15 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Jun 4, 2019 at 3:09 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> One question I have is, how would the OR'd together bitmap be
> propagated to workers after the first chunk? That is, when there are
> no tuples left in the outer bunch, for a given inner chunk, would you
> load the bitmaps from each worker into memory, OR them together, and
> then write the updated bitmap back out so that each worker starts with
> the updated bitmap?

I was assuming we'd elect one participant to go read all the bitmaps,
OR them together, and generate all the required null-extended tuples,
sort of like the PHJ_BUILD_ALLOCATING, PHJ_GROW_BATCHES_ALLOCATING,
PHJ_GROW_BUCKETS_ALLOCATING, and/or PHJ_BATCH_ALLOCATING states only
involve one participant being active at a time. Now you could hope for
something better -- why not parallelize that work?  But on the other
hand, why not start simple and worry about that in some future patch
instead of right away? A committed patch that does something good is
better than an uncommitted patch that does something AWESOME.


What if you have a lot of tuples -- couldn't the bitmaps get pretty
big? And then you have to OR them all together and if you can't put
the whole bitmap from each worker into memory at once to do it, it
seems like it would be pretty slow. (I mean maybe not as slow as
reading the outer side 5 times when you only have 3 chunks on the
inner + all the extra writes from my unmatched tuple file idea, but still...)

--
Melanie Plageman

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Melanie Plageman
Дата:
Сообщение: Re: Avoiding hash join batch explosions with extreme skew and weird stats
Следующее
От: Melanie Plageman
Дата:
Сообщение: Re: Avoiding hash join batch explosions with extreme skew and weird stats