Re: WIP: dynahash replacement for buffer table

Поиск
Список
Период
Сортировка
От Ants Aasma
Тема Re: WIP: dynahash replacement for buffer table
Дата
Msg-id CA+CSw_uRPJY5WReBtrS3oiTX0CYqOPbJh1SUNgZdKrytA0OxMw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: WIP: dynahash replacement for buffer table  (Ants Aasma <ants@cybertec.at>)
Список pgsql-hackers
On Oct 15, 2014 7:32 PM, "Ants Aasma" <ants@cybertec.at> wrote:
> I'm imagining a bucketized cuckoo hash with 5 item buckets (5-way
> associativity). This allows us to fit the bucket onto 2 regular sized
> cache lines and have 8 bytes left over. Buckets would be protected by
> seqlocks stored in the extra space. On the read side we would only
> need 2 read barriers (basically free on x86), and we are guaranteed to
> have an answer by fetching two pairs of cache lines. We can trade
> memory bandwidth for latency by issuing prefetches (once we add
> primitives for this). Alternatively we can trade insert speed for
> lookup speed by using asymmetrically sized tables.

I was thinking about this. It came to me that we might not even need
BufferTag's to be present in the hash table. In the hash table itself
we store just a combination of 4 byte buffer tag hash-buffer id. This
way we can store 8 pairs in one cacheline. Lookup must account for the
fact that due to hash collisions we may find multiple matches one of
which may be the buffer we are looking for.  We can make condition
exceedingly unlikely by taking advantage of the fact that we have two
hashes anyway, in each table we use the lookup hash of the other table
for tagging. (32bit hash collision within 8 items). By having a
reasonably high load factor (75% should work) and 8 bytes per key we
even have a pretty good chance of fitting the hotter parts of the
buffer lookup table in CPU caches.

We use a separate array for the locks (spinlocks should be ok here)
for fine grained locking, this may be 1:1 with the buckets or we can
map many buckets to a single lock. Otherwise the scheme should work
mostly the same. Using this scheme we can get by on the lookup side
using 64 bit atomic reads with no barriers, we only need atomic ops
for pinning the found buffer.

I haven't had the chance to check out how second-chance hashing works
and if this scheme would be applicable to it.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ryan Johnson
Дата:
Сообщение: Re: WIP: dynahash replacement for buffer table
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: UPSERT wiki page, and SQL MERGE syntax