Re: performance of bitmap scans in nested loop joins

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: performance of bitmap scans in nested loop joins
Дата
Msg-id 21246.1116278184@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: performance of bitmap scans in nested loop joins  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
I wrote:
> ... Where we are losing is mostly on the actual manipulation
> of the bitmaps (particularly hash_seq_search which is done in
> tbm_begin_iterate; and it looks like memory allocation for the bitmap
> hashtables is nontrivial too).  I had already had a TODO item to look
> into speeding up hash_seq_search ... will see what I can find.

I got around to looking at this some more.  It seems that the basic
problem is that dynahash.c isn't optimized for very small hashtables.
The test case I'm looking at (which may be an oversimplification of
Sergey's problem) never has more than one entry in the hashtables it
creates for in-memory bitmaps, and so the hashtable overhead is pretty
significant.  Particularly the overhead of creating and deleting 'em.

The other uses of dynahash.c that are at all performance-critical seem
to deal with hashtables that are (a) fairly large and (b) long-lived.
So I'm thinking that hacking dynahash.c itself to improve this situation
would probably be counterproductive overall.

An idea that comes to mind is to allow tidbitmap.c to handle the cases
of zero or one page entry directly, storing the page entry in a simple
field of the TIDBitmap struct.  Only when the bitmap needs entries for
more than one heap page will we create a subsidiary hash table.  (Note
that we could store bits for more than one tuple, if they happen to be
on the same heap page.)

This would uglify the logic in tidbitmap.c a bit, but it doesn't seem
completely preposterous.  The main objection I can see is that it would
only optimize the zero-or-one-page cases, which might be insufficient.

Anyone have a better idea for speeding up operations on small bitmaps?
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Hannu Krosing
Дата:
Сообщение: Re: SQL Request Size
Следующее
От: Tom Lane
Дата:
Сообщение: Re: SQL Request Size