Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
Дата
Msg-id 67e5af9b-f509-7b9e-f7a1-e01112f6c18d@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop  (Andres Freund <andres@anarazel.de>)
Ответы Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop  (Andres Freund <andres@anarazel.de>)
Список pgsql-bugs

On 12/06/2017 09:46 PM, Andres Freund wrote:
> On 2017-12-06 21:38:42 +0100, Tomas Vondra wrote:
>> It's one thing when the hash table takes longer to lookup something or
> 
> longer aka "forever".
> 

Not necessarily.

The datasets I shared are somewhat extreme in the sense that there are
many contiguous sequences of hash values, but it only takes one such
sequence with at least SH_GROW_MAX_MOVE values to trigger the issue. So
the hash table may still be perfectly fine for most keys, and only
slightly slower for the keys in the sequence.

> 
>> when it consumes a bit more memory. Say, ~2x more than needed, give or
>> take. I'm perfectly fine with that, particularly when it's a worst-case
>> evil data set like this one.
> 
> I think the way to prevent that kind of attack is to add randomization.
> 

By randomization you mean universal hashing [1], or something else?

[1] https://en.wikipedia.org/wiki/Universal_hashing

In any case, randomization should make it much harder (or pretty much
impossible) to construct datasets that are guaranteed to cause failures
of this kind. No problem with that (although if someone hits the issue,
it will be more difficult to reproduce).

It would still be possible to hit it by chance, but that should be very
unlikely (OTOH the current code assumes the same thing about collisions
and sequences, and here we are.)

> 
>> FWIW I've constructed the data sets for two reasons - to convince myself
>> that my understanding of the simplehash code is correct, and to provide
>> a data set triggering the other growth condition in simplehash code. My
>> understanding is that if we stop growing the table after the load factor
>> drops below some threshold (as TL proposed earlier in this thread), it
>> should address both of these cases.
> 
> Yea, I'm not adverse to adding a few stopgaps that break in a less
> annoying manner. WAll I'm saying is that I don't think we need to be
> super concerned about this specific way of breaking things.
> 

OK, understood.


cheers

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В списке pgsql-bugs по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop
Следующее
От: Andres Freund
Дата:
Сообщение: Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in aninfinite loop