Hi,
On 08/20/2015 04:15 AM, Tomas Vondra wrote:
> Hello KaiGain-san,
>
> On 08/19/2015 03:19 PM, Kohei KaiGai wrote:
>> Unless we have no fail-safe mechanism when planner estimated too
>> large number of tuples than actual needs, a strange estimation will
>> consume massive amount of RAMs. It's a bad side effect.
>> My previous patch didn't pay attention to the scenario, so needs to
>> revise the patch.
>
> I agree we need to put a few more safeguards there (e.g. make sure we
> don't overflow INT when counting the buckets, which may happen with the
> amounts of work_mem we'll see in the wild soon).
>
> But I think we should not do any extensive changes to how we size the
> hashtable - that's not something we should do in a bugfix I think.
Attached are two alternative version of a backpatch. Both limit the
nbuckets so that it never exceeds MaxAllocSize - effectively 512MB due
to the doubling (so ~67M buckets on 64-bit architectures).
The only difference is that the 'alternative' patch limits max_pointers
+ /* ensure we don't exceed the maximum allocation size */
+ max_pointers = Min(max_pointers, MaxAllocSize / sizeof(void*));
so it affects both nbuckets and nbatches. That seems a bit more correct,
but I guess whoever gets this many batches would be grateful even for
the quick crash.
For master, I think the right patch is what KaiGai-san posted in June. I
don't think we should really try making it smarter about handling
overestimates at this point - that's 9.6 stuff IMNSHO.
I find it a bit awkward that we only have MemoryContextAllocHuge and
repalloc_huge, especially as nodeHash.c needs MemoryContextAllocHuge +
memset to zero the chunk.
So I think we should extend the memutils API by adding palloc0_huge (and
possibly palloc_huge, although that's not needed by nodeHash.c).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services