Re: Performance problem with low correlation data

Поиск
Список
Период
Сортировка
От Greg Stark
Тема Re: Performance problem with low correlation data
Дата
Msg-id 407d949e0907061123g5d2008e6l712d36ee4a28fce8@mail.gmail.com
обсуждение исходный текст
Ответ на Performance problem with low correlation data  (Scara Maccai <m_lists@yahoo.it>)
Список pgsql-general
On Mon, Jul 6, 2009 at 6:32 PM, Scara Maccai<m_lists@yahoo.it> wrote:


> The "best" way to read the table would still be a nested loop, but a loop on the
> "t" values, not on the ne_id values, since data for the same timestamp is "close".

But that would be a different query -- there's no restrictions on the
t values in this one.

> How can I tell to PG to use an algorithm such as:
>
> fetch the heap
>        for each quarter
>                for each id found where groupname='a group name'
>                        fetch all the indexes

Have you tried something using IN or EXISTS instead of a join? The
algorithm you describe doesn't work for the join because it has to
produce a record which includes the matching group columns. A bitmap
scan would return the various groups (I know in your case there's only
one but the optimizer can't be sure) mixed together.

That might work better in 8.4 than 8.3 as the IN and EXISTS handling
is improved.

Actually I wonder if doing a sequential scan with a hash join against
the group list wouldn't be a better option. That would scan more of
the heap but if they're really that spread out it might not make much
difference, and it would avoid having to do all the index scans. You
could experiment with turning enable_nestloop off and see how fast the
hash join plan is.

--
greg
http://mit.edu/~gsstark/resume.pdf

В списке pgsql-general по дате отправления:

Предыдущее
От: Dave Page
Дата:
Сообщение: Re: Windows installer for pg-8.4 confusion
Следующее
От: Michael Gould
Дата:
Сообщение: Added parameter for CREATE ROLE