Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
| From | Tom Lane |
|---|---|
| Subject | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash |
| Date | |
| Msg-id | 8525.1576007382@sss.pgh.pa.us |
| In response to | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
| Responses | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash |
| List | pgsql-bugs |
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> As for the performance impact, I did this:
> create table dim (id int, val text);
> insert into dim select i, md5(i::text) from generate_series(1,1000000) s(i);
> create table fact (id int, val text);
> insert into fact select mod(i,1000000)+1, md5(i::text) from generate_series(1,25000000) s(i);
> set max_parallel_workers_per_gather = 0;
> select count(*) from fact join dim using (id);
> So a perfectly regular join between 1M and 25M table. On my machine,
> this takes ~8851ms on master and 8979ms with the patch (average of about
> 20 runs with minimal variability). That's ~1.4% regression, so a bit
> more than the 0.4% mentioned before. Not a huge difference though, and
> some of it might be due to different binary layout etc.
Hmm ... I replicated this experiment here, using my usual precautions
to get more-or-less-reproducible numbers [1]. I concur that the
patch seems to be slower, but only by around half a percent on the
median numbers, which is much less than the run-to-run variation.
So that would be fine --- except that in my first set of runs,
I forgot the "set max_parallel_workers_per_gather" step and hence
tested this same data set with a parallel hash join. And in that
scenario, I got a repeatable slowdown of around 7.5%, which is far
above the noise floor. So that's not good --- why does this change
make PHJ worse?
regards, tom lane
[1] https://www.postgresql.org/message-id/31686.1574722301%40sss.pgh.pa.us
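For anyone repeating the comparison, here is a minimal psql sketch of running the same join both ways and checking which join node the planner actually picked. It assumes the dim and fact tables from the quoted test case already exist and have been analyzed. Using RESET to return to the configured worker count (2 unless changed) is an assumption; the message does not say which value was in effect for the parallel runs.

```sql
-- Assumes the dim and fact tables from the quoted test case are loaded.
-- \timing is a psql meta-command; other clients need their own timer.
\timing on

-- Serial case, as in the quoted test: no parallel workers.
SET max_parallel_workers_per_gather = 0;
EXPLAIN (COSTS OFF) SELECT count(*) FROM fact JOIN dim USING (id);
SELECT count(*) FROM fact JOIN dim USING (id);

-- Parallel case: go back to the configured value (assumed here, not stated
-- in the message; the built-in default is 2).
RESET max_parallel_workers_per_gather;
EXPLAIN (COSTS OFF) SELECT count(*) FROM fact JOIN dim USING (id);
-- If the planner chose a parallel hash join, the plan above should contain
-- a "Parallel Hash Join" node; otherwise the timing below measures some
-- other plan shape.
SELECT count(*) FROM fact JOIN dim USING (id);
```

Repeating each SELECT several times and comparing medians, rather than single runs, keeps the comparison in line with the run-to-run variation discussed above.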