Re: Avoiding hash join batch explosions with extreme skew and weird stats
From | Tomas Vondra |
---|---|
Subject | Re: Avoiding hash join batch explosions with extreme skew and weird stats |
Date | |
Msg-id | 20190607140540.gx736kravrzna57o@development |
In reply to | Re: Avoiding hash join batch explosions with extreme skew and weird stats (Melanie Plageman <melanieplageman@gmail.com>) |
List | pgsql-hackers |
On Thu, Jun 06, 2019 at 04:37:19PM -0700, Melanie Plageman wrote:
>On Thu, May 16, 2019 at 3:22 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
>> Admittedly I don't have a patch, just a bunch of handwaving. One
>> reason I haven't attempted to write it is because although I know how
>> to do the non-parallel version using a BufFile full of match bits in
>> sync with the tuples for outer joins, I haven't figured out how to do
>> it for parallel-aware hash join, because then each loop over the outer
>> batch could see different tuples in each participant. You could use
>> the match bit in HashJoinTuple header, but then you'd have to write
>> all the tuples out again, which is more IO than I want to do. I'll
>> probably start another thread about that.
>>
>>
>Going back to the idea of using the match bit in the HashJoinTuple header
>and writing out all of the outer side for every chunk of the inner
>side, I was wondering if there was something we could do that was kind
>of like mmap'ing the outer side file to give the workers in parallel
>hashjoin the ability to update a match bit in the tuple in place and
>avoid writing the whole outer side out each time.
>

I think this was one of the things we discussed in Ottawa - we could pass
the index of the tuple (in the batch) along with the tuple, so that each
worker knows which bit to set.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
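
To make the "index of the tuple" idea a bit more concrete, here is a minimal standalone sketch, not taken from any patch. It assumes each outer tuple in a batch carries its position within that batch, and that the match bits live in a separate bitmap that all workers can reach. The names OuterMatchBitmap and match_bitmap_* are hypothetical, and it uses C11 atomics rather than PostgreSQL's own atomics or shared memory APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical: one match bit per outer tuple in a batch, kept outside the tuples. */
typedef struct OuterMatchBitmap
{
    size_t            ntuples;  /* number of outer tuples in the batch */
    _Atomic uint32_t *words;    /* ceil(ntuples / 32) words of match bits */
} OuterMatchBitmap;

static OuterMatchBitmap *
match_bitmap_create(size_t ntuples)
{
    size_t            nwords = (ntuples + 31) / 32;
    OuterMatchBitmap *bm = malloc(sizeof(OuterMatchBitmap));

    if (bm == NULL)
        return NULL;
    bm->ntuples = ntuples;
    bm->words = malloc((nwords > 0 ? nwords : 1) * sizeof(_Atomic uint32_t));
    if (bm->words == NULL)
    {
        free(bm);
        return NULL;
    }
    for (size_t i = 0; i < nwords; i++)
        atomic_init(&bm->words[i], 0);
    return bm;
}

/*
 * Called by whichever worker finds a match for the outer tuple at position
 * "index" in the batch. The atomic OR lets several workers set bits in the
 * same word concurrently, and setting the same bit twice is harmless.
 */
static void
match_bitmap_set(OuterMatchBitmap *bm, size_t index)
{
    assert(index < bm->ntuples);
    atomic_fetch_or(&bm->words[index / 32], UINT32_C(1) << (index % 32));
}

/* After all inner chunks are processed: was the tuple at "index" ever matched? */
static bool
match_bitmap_is_set(OuterMatchBitmap *bm, size_t index)
{
    assert(index < bm->ntuples);
    return (atomic_load(&bm->words[index / 32]) &
            (UINT32_C(1) << (index % 32))) != 0;
}

static void
match_bitmap_free(OuterMatchBitmap *bm)
{
    free(bm->words);
    free(bm);
}
```

With something along these lines the outer batch file would never need to be rewritten. Once every chunk of the inner side has been probed, a final pass over the outer batch could emit the NULL-extended rows for tuples whose bit is still unset.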