On Sat, 2019-12-14 at 18:32 +0100, Tomas Vondra wrote:
> So I think we're not costing the batching properly / at all.
Thank you for all of the testing! I think the results are good: even
for cases where HashAgg is the wrong choice, it's not too bad. You're
right that costing is not done, and when it is, I think it will avoid
these bad choices most of the time.
> A couple more comments:
>
> 1) IMHO we should rename hashagg_mem_overflow to
> enable_hashagg_overflow
> or something like that. I think that describes the GUC purpose better
> (and it's more consistent with enable_hashagg_spill).
The other enable_* GUCs are all planner GUCs, so I named this one
differently to stand out as an executor GUC.
> 2) show_hashagg_info
>
> I think there's a missing space after ":" here:
>
> " Batches: %d Disk Usage:%ldkB",
>
> and maybe we should use just "Disk:" just like in we do for sort:
Done, thank you.
> 3) I'm not quite sure what to think about the JIT recompile we do for
> EEOP_AGG_INIT_TRANS_SPILLED etc. I'm no llvm/jit expert, but do we do
> that for some other existing cases?
Andres asked for that explicitly to avoid branches in the non-spilling
code path (or at least branches that are likely to be mispredicted).
Regards,
Jeff Davis