Re: Fix overflow of nbatch
От | Tomas Vondra |
---|---|
Тема | Re: Fix overflow of nbatch |
Дата | |
Msg-id | 3258ad4c-4fbf-4557-9757-3d286a69ae42@vondra.me обсуждение исходный текст |
Ответ на | Re: Fix overflow of nbatch (Melanie Plageman <melanieplageman@gmail.com>) |
Ответы |
Re: Fix overflow of nbatch
|
Список | pgsql-hackers |
On 10/9/25 16:16, Melanie Plageman wrote: > On Wed, Oct 8, 2025 at 4:46 PM Tomas Vondra <tomas@vondra.me> wrote: >> >> On 10/8/25 19:37, Melanie Plageman wrote: >> >>> Yes, this was wrong. I forgot an extra / sizeof(HashJoinTuple). I meant: >>> >>> hash_table_bytes / sizeof(HashJoinTuple) > MaxAllocSize / >>> sizeof(HashJoinTuple) / 2 >> >> But the hash table is not allocated as a single chunk of memory, so I >> think MaxAllocSize would be the wrong thing to use here, no? Hash table >> is a collection of tuples (allocated one by one), that's why >> get_hash_memory_limit() uses SIZE_MAX. MaxAllocSize matters for >> nbuckets, because that indeed is allocated as a contiguous array, ofc. >> >> Also, why bother with /sizeof(HashJoinTuple) on both sides? Without it >> we get >> >> hash_table_bytes > MaxAllocSize / 2 >> >> but again, that doesn't make much sense - the hash table can be larger, >> it's not a single palloc. > > It came from the earlier clamping of nbuckets: > > max_pointers = hash_table_bytes / sizeof(HashJoinTuple); > max_pointers = Min(max_pointers, MaxAllocSize / sizeof(HashJoinTuple)); > dbuckets = Min(dbuckets, max_pointers); > > I don't really get why this divides hash_table_bytes by > sizeof(HashJoinTuple) since, as you say, hash_table_bytes is supposed > to accommodate both the bucket_bytes and inner_rel_bytes. > I think that's simply to allocate enough buckets for the the expected number of tuples, not more. And cap it to MaxAllocSize. Some of this may be redundant, I suppose. >> And I think it should be just >> >> if (hash_table_bytes > SIZE_MAX / 2) >> break; >> >> as a protection against hash_table_bytes overflowing SIZE_MAX (which on >> 64-bits seems implausible, but could happen on 32-bit builds). > > That's roughly the check I ended up with -- except I used > space_allowed because it will be larger than hash_table_bytes if there > is a skew hashtable. I've updated the comment and such in attached v3. > Ah, thanks. I was just hacking on this too, but I'll switch to your v3. regards -- Tomas Vondra
В списке pgsql-hackers по дате отправления: