Re: Solving hash table overrun problems

From	Tom Lane
Subject	Re: Solving hash table overrun problems
Date	March 4, 2005, 19:12:48
Msg-id	27970.1109952763@sss.pgh.pa.us
In reply to	Re: Solving hash table overrun problems (Bruno Wolff III <bruno@wolff.to>)
List	pgsql-hackers

Bruno Wolff III <bruno@wolff.to> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Bruno Wolff III <bruno@wolff.to> writes:
>>> If K is way low this could be very slow.
>> 
>> How so?  You're not concerned about the time to do the division itself
>> are you?

> No, rather having lots of entries in the same hash buckets.

That won't happen because we are going to set K with an eye to the
maximum number of rows we intend to hold in memory (given work_mem).
With the addition of the dynamic batch splitting logic, that number
of rows is actually reasonably accurate.
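
As a rough illustration of sizing K from the memory budget, here is a minimal sketch. The function names, the per-tuple size estimate, and the target of about ten tuples per bucket are all hypothetical choices made for the example; they are not taken from the PostgreSQL source.

```python
def max_rows_in_memory(work_mem_bytes, est_tuple_bytes):
    """Rows that fit in the budget, given an assumed average tuple size."""
    return max(1, work_mem_bytes // est_tuple_bytes)


def choose_num_buckets(work_mem_bytes, est_tuple_bytes, tuples_per_bucket=10):
    """Pick K so a full in-memory batch averages at most
    tuples_per_bucket entries per bucket (hypothetical target)."""
    rows = max_rows_in_memory(work_mem_bytes, est_tuple_bytes)
    # Ceiling division: round K up so the average never exceeds the target.
    return max(1, -(-rows // tuples_per_bucket))


# Example: a 1 MB budget and an assumed 64 bytes per tuple.
rows = max_rows_in_memory(1024 * 1024, 64)
k = choose_num_buckets(1024 * 1024, 64)
avg_chain = rows / k  # average bucket occupancy when memory is full
```

Because K is derived from the row count that memory can hold, rather than from the total input size, the average chain length stays bounded as long as the in-memory row count really does stay near that limit.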

The only way this scheme can really lose badly is if there are large
numbers of tuples with exactly the same hash code, so that no matter how
much we increase N we can't split up the bucketload.  This is a risk for
*any* hashing scheme, however.  In practice we have to rely on the
planner to not choose hashing when there are only a few distinct values
for the key.
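
The failure case can be seen in a small sketch. It assumes the mapping discussed in this thread is bucket = hash mod K and batch = (hash div K) mod N; the helper names and the sample hash values are made up for the example.

```python
def bucket_and_batch(hash_code, k, n):
    """Assumed mapping: bucket = hash mod K, batch = (hash div K) mod N."""
    return hash_code % k, (hash_code // k) % n


def largest_batch(hash_codes, k, n):
    """Size of the most heavily loaded batch for a given N."""
    counts = {}
    for h in hash_codes:
        _, batch = bucket_and_batch(h, k, n)
        counts[batch] = counts.get(batch, 0) + 1
    return max(counts.values())


K = 16
distinct = list(range(16 * 64))   # 1024 different hash codes
duplicates = [12345] * 1024       # one hash code repeated 1024 times

# Doubling N halves the worst batch for distinct codes,
# but the all-duplicates input never shrinks.
sizes = [(n, largest_batch(distinct, K, n), largest_batch(duplicates, K, n))
         for n in (1, 2, 4, 8)]
```

Every tuple with the same hash code maps to the same (bucket, batch) pair for any N, so raising N only helps when the overloaded batch contains more than one distinct hash code.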

> I just noticed that it wasn't mentioned that an overflow could occur at this
> step.

It can't, because we aren't loading the outer tuples into the hash
table.  We are just considering them one at a time and probing for
matches.
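
A minimal sketch of why the probe step cannot overflow, using hypothetical helper names and plain Python containers rather than anything from the executor: the table is built from the inner side only, and the outer side is consumed as a stream.

```python
from collections import defaultdict


def build_hash_table(inner_rows, key):
    """Load only the inner relation into memory."""
    table = defaultdict(list)
    for row in inner_rows:
        table[key(row)].append(row)
    return table


def probe(table, outer_rows, key):
    """Yield joined pairs while holding one outer row at a time.
    Nothing from the outer side is ever stored in the table."""
    for outer in outer_rows:
        # .get() avoids creating empty entries for keys with no match.
        for inner in table.get(key(outer), ()):
            yield inner, outer


inner = [(1, "a"), (2, "b"), (2, "c")]
outer = iter([(2, "x"), (3, "y"), (1, "z")])  # consumed lazily

table = build_hash_table(inner, key=lambda r: r[0])
matches = list(probe(table, outer, key=lambda r: r[0]))
```

The table holds the same three inner rows before and after probing, regardless of how many outer rows pass through.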
        regards, tom lane
