Re: hash join vs nested loop join

Поиск
Список
Период
Сортировка
От Huan Ruan
Тема Re: hash join vs nested loop join
Дата
Msg-id CAD1stZtH9WbvHnKtcRVD_w5OYabXh9k18vUhygvYBe8yozPk6A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: hash join vs nested loop join  ("Kevin Grittner" <kgrittn@mail.com>)
Список pgsql-performance
On 21 December 2012 01:06, Kevin Grittner <kgrittn@mail.com> wrote:
Huan Ruan wrote:
> Kevin Grittner wrote:

>> Frankly, at 12 microseconds per matched pair of rows, I think
>> you're doing OK.
>
> This plan is the good one, I want the indexscan nested loop join and this
> is only achieved after making all these costing factors change. Before
> that, it was hash join and was very slow.
>
> However, I'm worried about the config changes being too 'extreme', i.e.
> both sequential I/O and random I/O have the same cost and being only 0.1.
> So, I was more wondering why I have to make such dramatic changes to
> convince the optimiser to use NL join instead of hash join. And also, I'm
> not sure what impact will these changes have on other queries yet. e.g.
> will a query that's fine with hash join now choose NL join and runs slower?

I understand the concern, but PostgreSQL doesn't yet have a knob to
turn for "cache hit ratio". You essentially need to build that into
the page costs. Since your cache hit ratio (between shared buffers
and the OS) is so high, the cost of page access relative to CPU
costs has declined and there isn't any effective difference between
sequential and random access. As the level of caching changes, you
may need to adjust. In one production environment where there was
significant caching, but far enough from 100% to matter, we tested
various configurations and found the fastest plans being chosen
with seq_page_cost = 0.3 and random_page_cost = 0.5. Tune to your
workload.


Thanks Kevin. I think I get some ideas now that I can try on the production server when we switch. 

В списке pgsql-performance по дате отправления:

Предыдущее
От: Charles Gomes
Дата:
Сообщение: Re: Performance on Bulk Insert to Partitioned Table
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: Performance on Bulk Insert to Partitioned Table