Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Дата
Msg-id 5300.1545852065@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: reducing the footprint of ScanKeyword (was Re: Large writablevariables)  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2018-12-26 10:45:11 -0500, Robert Haas wrote:
>> I'm not sure that I understand quite what you have in mind for a
>> serialized non-perfect hashtable.  Are you thinking that we'd just
>> construct a simplehash and serialize it?

> I was basically thinking that we'd have the perl script implement a
> simple hash and put the keyword (pointers) into an array, handling
> conflicts with the simplest linear probing thinkable. As there's never a
> need for modifications, that ought to be fairly simple.

I think it was Knuth who said that when you use hashing, you are putting
a great deal of faith in the average case, because the worst case is
terrible.  The applicability of that to this problem is that if you hit
a bad case (say, a long collision chain affecting some common keywords)
you could end up with poor performance that affects a lot of people for
a long time.  And our keyword list is not so static that you could prove
once that the behavior is OK and then forget about it.

So I'm suspicious of proposals to use simplistic hashing here.

There might well be some value in Robert's idea of keying off the first
letter to get rid of the first few binary-search steps, not least because
those steps are particularly terrible from a cache-footprint perspective.
I'm not sold on doing anything significantly more invasive than that.

            regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: random() (was Re: New GUC to sample log queries)
Следующее
От: Tom Lane
Дата:
Сообщение: Re: random() (was Re: New GUC to sample log queries)