Re: pgsql: Add parallel-aware hash joins.
From | Thomas Munro
---|---
Subject | Re: pgsql: Add parallel-aware hash joins.
Date |
Msg-id | CAEepm=3SeFvsfnnOLSA3tLtBe-rtyL=c+vfzyPCsViBjk521qw@mail.gmail.com
In reply to | Re: pgsql: Add parallel-aware hash joins. (Tom Lane <tgl@sss.pgh.pa.us>)
Responses | Re: pgsql: Add parallel-aware hash joins. (Tom Lane <tgl@sss.pgh.pa.us>)
List | pgsql-committers
On Sun, Dec 31, 2017 at 11:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> You mentioned that prairiedog sees the problem about one time in
>> thirty.  Would you mind checking if it goes away with this patch
>> applied?
>
> I've run 55 cycles of "make installcheck" without seeing a failure
> with this patch installed.  That's not enough to be totally sure
> of course, but I think this probably fixes it.

Thanks!

> However ... I noticed that my other dinosaur gaur shows the other failure
> mode we see in the buildfarm, the "increased_batches = t" diff, and
> I can report that this patch does *not* help that.  The underlying
> EXPLAIN output goes from something like
>
> !   Buckets: 4096  Batches: 8  Memory Usage: 208kB
>
> to something like
>
> !   Buckets: 4096 (originally 4096)  Batches: 16 (originally 8)  Memory Usage: 176kB
>
> so again we have a case where the plan didn't change but the execution
> behavior did.  This isn't quite 100% reproducible on gaur/pademelon,
> but it fails more often than not seems like, so I can poke into it
> if you can say what info would be helpful.

Right.  That's apparently unrelated and is the last build-farm issue on
my list (so far).  I had noticed that certain BF animals are prone to
that particular failure, and they mostly have architectures that I don't
have, so a few things are probably just differently sized.  At first I
thought I'd tweak the tests so that the parameters were always stable,
and I got as far as installing Debian on qemu-system-ppc (it took a
looong time to compile PostgreSQL), but that seems a bit cheap and
flimsy... better to fix the size estimation error.  I assume that what
happens here is that the planner's size estimation code sometimes
disagrees with Parallel Hash's chunk-based memory accounting, even
though in this case we had perfect tuple count and tuple size
information.
In an earlier version of the patch set I refactored the planner to be
chunk-aware (even for parallel-oblivious hash join), but later in the
process I tried to simplify and shrink the patch set and avoid making
unnecessary changes to non-Parallel Hash code paths.  I think I'll need
to make the planner aware of the maximum amount of fragmentation
possible when parallel-aware (something like: up to one tuple's worth
at the end of each chunk, and up to one whole wasted chunk per
participating backend).  More soon.

--
Thomas Munro
http://www.enterprisedb.com