Re: Why hash join instead of nested loop?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Why hash join instead of nested loop?
Дата
Msg-id 18143.1123614059@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Why hash join instead of nested loop?  (Rhett Garber <rhettg@gmail.com>)
Ответы Re: Why hash join instead of nested loop?  (Rhett Garber <rhettg@gmail.com>)
Список pgsql-performance
Rhett Garber <rhettg@gmail.com> writes:
> Well that could be an issue, is this abnormally large:
> #shared_buffers = 1536          # min 16, at least max_connections*2, 8KB each
> shared_buffers = 206440
> #sort_mem = 131072              # min 64, size in KB
> sort_mem = 524288               # min 64, size in KB
> vacuum_mem = 131072             # min 1024, size in K

The vacuum_mem number is OK I think, but both of the others seem
unreasonably large.  Conventional wisdom about shared_buffers is that
the sweet spot is maybe 10000 or so buffers, rarely more than 50000.
(Particularly in pre-8.0 releases, there are code paths that grovel
through all the buffers linearly, so there is a significant cost to
making it too large.)  Don't worry about it being too small to make
effective use of RAM --- we rely on the kernel's disk cache to do that.

sort_mem is *per sort*, and so half a gig in a machine with only a
couple of gig is far too much except when you know you have only one
query running.  A couple dozen backends each trying to use half a gig
will drive you into the ground in no time.  Conventional wisdom here
is that the global setting should be conservatively small (perhaps
10Mb to 100Mb depending on how many concurrent backends you expect to
have), and then you can explicitly increase it locally with SET for
specific queries that need it.

In terms of the problem at hand, try the test case with a few different
values of sort_mem (use SET to adjust it, you don't need to touch the
config file) and see what happens.  I think the cost you're seeing is
just startup overhead to zero a hash table of a few hundred meg ...

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: Michael Fuhr
Дата:
Сообщение: Re: Table locking problems?
Следующее
От: Rhett Garber
Дата:
Сообщение: Re: Why hash join instead of nested loop?