I see that partition-wise aggregate plan too uses parallel index, am I missing something?
You're right, I missed that, oops.
Q18 takes some 390 secs with patch and some 147 secs without it.
This looks strange. This patch set does not touch parallel or seq scan as such, so I am not sure why this is happening. The EXPLAIN output for all three of these queries shows a much higher execution time for the parallel/seq scans.
Yeah strange it is.
Off-list, I asked Rafia to give me access to the perf machine where she is running this benchmark, to see what is going wrong. Thanks, Rafia, for the details.
What I observed is that there are two source trees, one with HEAD and the other with HEAD+PWA, but their configuration switches differ. The HEAD+PWA tree has CFLAGS="-ggdb3 -O0" CXXFLAGS="-ggdb3 -O0" in addition to the switches used for the other tree, i.e. HEAD+PWA is built with debugging enabled and optimization disabled, which accounts for the slowness.
I have run EXPLAIN for these three queries on both source trees built with exactly the same configuration switches, and I don't see any slowness with the PWA patch set.
Thus, it would be good if you could re-run the benchmark with the same configuration switches on both source trees and share the results.
Thanks
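A quick way to check for this kind of mismatch before benchmarking is to diff the configure switches of the two builds. The following is only a minimal sketch: the function name and the two switch strings are made up for illustration (in practice the strings could be taken from the output of `pg_config --configure` for each installation), and the install prefix is ignored since it is expected to differ.

```python
import shlex


def diff_configure_flags(flags_a, flags_b, ignore_prefixes=("--prefix=",)):
    """Return the configure switches that appear in only one of two builds."""
    def parse(flags):
        # shlex.split keeps quoted switches such as 'CFLAGS=-ggdb3 -O0' intact
        return {
            token
            for token in shlex.split(flags)
            if not token.startswith(tuple(ignore_prefixes))
        }

    a, b = parse(flags_a), parse(flags_b)
    return {"only_a": sorted(a - b), "only_b": sorted(b - a)}


# Hypothetical switch strings, not the actual ones from the perf machine
head = "'--prefix=/opt/pg-head' '--enable-tap-tests'"
pwa = ("'--prefix=/opt/pg-pwa' '--enable-tap-tests' "
       "'CFLAGS=-ggdb3 -O0' 'CXXFLAGS=-ggdb3 -O0'")

result = diff_configure_flags(head, pwa)
print(result)
```

Any non-empty list in the result means the two builds are not directly comparable for timing purposes.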
However, do you see similar behaviour with the patches applied, comparing "enable_partition_wise_agg = on" and "enable_partition_wise_agg = off"?
I tried that for query 18: with the patch and enable_partition_wise_agg = off, the query completes in some 270 secs. You may find the EXPLAIN ANALYZE output for it in the attached file. I noticed that on HEAD the query plan had a parallel hash join, whereas with the patch and no partition-wise aggregation it uses nested loop joins. This might be the issue.
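To spot this sort of join-strategy change without reading two long plans side by side, one could tally the join node types in each text-format EXPLAIN output. This is a rough sketch only: the helper name and regular expression are my own, and the two plan fragments are invented shapes for illustration, not the actual Q18 plans from the attachment.

```python
import re
from collections import Counter

# Matches join nodes at the start of a plan line, e.g.
# "->  Parallel Hash Join", "Hash Left Join", "Nested Loop", "Merge Join".
JOIN_RE = re.compile(
    r"(?:->\s+)?((?:Parallel )?"
    r"(?:Nested Loop|Hash(?: \w+)* Join|Merge(?: \w+)* Join))"
)


def count_join_nodes(plan_text):
    """Count join node types in a text-format EXPLAIN plan."""
    counts = Counter()
    for line in plan_text.splitlines():
        match = JOIN_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented plan fragments (illustrative shapes only)
head_plan = """
Finalize GroupAggregate
  ->  Gather Merge
        ->  Parallel Hash Join
              ->  Parallel Seq Scan on lineitem
              ->  Parallel Hash
                    ->  Parallel Seq Scan on orders
"""

patched_plan = """
Finalize GroupAggregate
  ->  Nested Loop
        ->  Nested Loop
              ->  Seq Scan on orders
              ->  Index Scan on lineitem
"""

print(count_join_nodes(head_plan))
print(count_join_nodes(patched_plan))
```

Lines such as "Hash Cond: ..." or a bare "Parallel Hash" node are deliberately not counted, since they are not join nodes themselves.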
Also, do the rest of the queries perform better with partition-wise aggregates?
As far as this setting goes, no other query used partition-wise aggregation, so no.
BTW, just an FYI, this experiment is on scale factor 20.