Re: tbm_lossify causes "unbalanced" hashtables / bitmaps

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: tbm_lossify causes "unbalanced" hashtables / bitmaps
Дата
Msg-id 30437.1474666813@sss.pgh.pa.us
обсуждение исходный текст
Ответ на tbm_lossify causes "unbalanced" hashtables / bitmaps  (Andres Freund <andres@anarazel.de>)
Ответы Re: tbm_lossify causes "unbalanced" hashtables / bitmaps  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> Because iteration (both in my implementation and dynahash) is
> essentially in bucket order, the front of the hashtable will be mostly
> empty, whereas later parts of the hashtable will be full (i.e. a lot of
> collisions). The reason for that is that we'll find all parts of the
> hashtable that are uncompressed and "early" in the hashspace, and
> they'll possibly hash to something later in the table.

Hm, yeah, I had supposed that this would hit a random part of the key
space, but you're right that over successive calls it's not good.

My idea of an appropriate fix would be to resume the scan at the same
point where the last scan stopped, and work circularly around the table
when necessary.  But I'm not sure there is any really good way to do that
in the dynahash infrastructure.  Maybe it'd work to keep the iteration
state around, but I don't remember how well that interacts with other
insertions/deletions.  What about in your implementation?

There's also the point mentioned in the existing comment, that it'd be
better to go after pages with more bits set first.  Not sure of an
inexpensive way to do that (ie, one that doesn't involve multiple
scans of the hashtable).  But your results suggest that maybe it'd
be worth making tbm_lossify slower in order to get better results.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: tbm_lossify causes "unbalanced" hashtables / bitmaps
Следующее
От: Andres Freund
Дата:
Сообщение: Re: tbm_lossify causes "unbalanced" hashtables / bitmaps