Re: Fix overflow of nbatch

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Fix overflow of nbatch
Дата
Msg-id ddd90505-d7cd-4329-a34a-7efdc4a387cf@vondra.me
обсуждение исходный текст
Ответ на Re: Fix overflow of nbatch  (Melanie Plageman <melanieplageman@gmail.com>)
Список pgsql-hackers
On 10/13/25 18:05, Melanie Plageman wrote:
> On Thu, Oct 9, 2025 at 7:36 PM Tomas Vondra <tomas@vondra.me> wrote:
>>
>> 1) A couple comments adjusted. It feels a bit too audacious to correct
>> comments written by native speaker, but it seems cleaner to me like this.
> 
> I attached a patch with a few more suggested adjustments (0003). The
> more substantive tweaks are:
> 
> I don't really like that this comment says it is about nbuckets
> overflowing MaxAllocSize because overflow means something specific and
> this sounds like we are saying the nbuckets variable will overflow
> MaxAllocSize but what we mean is that nbuckets worth of HashJoinTuples
> could overflow MaxAllocSize. You don't have to use my wording, but I'm
> not sure about this wording either.
> 
> /* Check that nbuckets wont't overflow MaxAllocSize */
> if (nbuckets > MaxAllocSize / sizeof(HashJoinTuple) / 2)
>         break;
> 

I think the wording is fine, it just needs to talk about "buckets" and
not "nbuckets". I did that in the attached v6.

> Also, upon re-reading the comment we wrote together:
> 
>         * With extremely low work_mem values, nbuckets may have been set
>         * higher than hash_table_bytes / sizeof(HashJoinTuple). We don't try
>         * to correct that here, we accept nbuckets to be oversized.
> 
> I'm not so sure it belongs above the nbuckets allocation check.
> Perhaps we should move it to where we are doubling nbuckets. Or even
> above where we clamp it to 1024.
> 

Yeah, I'm not happy with the place either. But mentioning this above the
clamp to 1024 doesn't seem great either.

> I'm actually wondering if we want this comment at all. Does it
> emphasize an edge case to a confusing point? I'm imagining myself
> studying it in the future and having no idea what it means.
> 

You may be right. I thought someone might read the code in the future,
possibly while investigating a case when the loop stopped too early. And
will be puzzled, not realizing nbuckets might be too high. But frankly,
that's super unlikely. It only applies to cases with extremely low
work_mem values, that's quite unlikely on machines doing massive joins.

I'll think about it in the morning, but I'll probably remove it.

> I've kept it in but moved it in 0003.
> 
>> 4) added the doubling of num_skew_mcvs, and also the overflow protection
>> for that
> 
> Do we really need to check if num_skew_mcvs will overflow? Shouldn't
> it always be smaller than nbuckets? Maybe it can be an assert.
> 

Good point. I wasn't sure if that's guaranteed, but after looking at the
skew_mcvs calculation again I think you're right. So +1 to assert.

>> You suggested this in another message:
>>
>>> We do need to recalculate num_skew_mcvs once at the end before
>>> returning.
>>
>> But I think the doubling has the same effect, right? I don't want to
>> redo the whole "if (useskew) { ... }" block at the end.
> 
> Yea, it would have to be some kind of helper or something. I worried
> just doubling num_skew_mcvs would drift significantly because of
> integer truncation -- perhaps even a meaningful amount. But it was
> just an intuition -- I didn't plug in any numbers and try.
> 

I think the integer truncation should not matter. AFAIK it could be off
by 1 on the first loop (due to rounding), and then the error gets
doubled on every loop. So with 8 loops we might be off by 127, right?
But with the 2% SKEW_HASH_MEM_PERCENT that difference is negligible
compared to the actual skew_mcvs value, I think.

All these formulas are rough guesses based on arbitrary constants (like
SKEW_HASH_MEM_PERCENT) anyway.


I'll give this a bit more testing and review tomorrow, and then I'll
push. I don't want to hold this back through pgconf.eu.


regards

-- 
Tomas Vondra

Вложения

В списке pgsql-hackers по дате отправления: