Re: WIP: dynahash replacement for buffer table

Поиск
Список
Период
Сортировка
От Ants Aasma
Тема Re: WIP: dynahash replacement for buffer table
Дата
Msg-id CA+CSw_vQVZnqsQE8X10fAZ486FCTj-h0Dkby=JJsgA578AmLEg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: WIP: dynahash replacement for buffer table  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: WIP: dynahash replacement for buffer table
Re: WIP: dynahash replacement for buffer table
Список pgsql-hackers
On Tue, Oct 14, 2014 at 6:19 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> With regard for using a hash table for the buffer mapping lock I'm
>> doubtful that any form of separate chaining is the right one. We
>> currently have a quite noticeable problem with the number of cache
>> misses in the buffer mapping hash (and also the heavyweight lock mgr) -
>> if we stay with hashes that's only going to be fundamentally lower than
>> dynahash if we change the type of hashing. I've had good, *preliminary*,
>> results using an open addressing + linear probing approach.
>
> I'm very skeptical of open addressing + linear probing.  Such hash
> tables tend to degrade over time, because you have to leave tombstones
> behind to ensure that future scans advance far enough.  There's really
> no way to recover without rebuilding the whole hash table, and
> eventually it degrades to linear search.  If we're spending too much
> time walking hash chains, I think the solution is to increase the
> number of buckets so that the chains get shorter.

What about cuckoo hashing? There was a recent paper on how to do fine
grained locking with cuckoo hashes. [1]

I'm imagining a bucketized cuckoo hash with 5 item buckets (5-way
associativity). This allows us to fit the bucket onto 2 regular sized
cache lines and have 8 bytes left over. Buckets would be protected by
seqlocks stored in the extra space. On the read side we would only
need 2 read barriers (basically free on x86), and we are guaranteed to
have an answer by fetching two pairs of cache lines. We can trade
memory bandwidth for latency by issuing prefetches (once we add
primitives for this). Alternatively we can trade insert speed for
lookup speed by using asymmetrically sized tables.

[1] https://www.cs.princeton.edu/~mfreed/docs/cuckoo-eurosys14.pdf

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: Maximum number of WAL files in the pg_xlog directory
Следующее
От: Ryan Johnson
Дата:
Сообщение: Re: WIP: dynahash replacement for buffer table