Обсуждение: PG 9.x prefers slower Hash Joins?

Поиск
Список
Период
Сортировка

PG 9.x prefers slower Hash Joins?

От
Anthony Presley
Дата:
In relation to my previous thread (about SW RAID vs. HW RAID on a P400), I was able to narrow down the filesystem speed and in general, our new system (running PG 9.1) is about 3x - 5x faster on the IO.

In looking at the query plans in more depth, it appears that PG 9.0 and 9.1 are both preferring to do hash joins, which seem to have a "linear" time and are slower than PG 8.4 doing an index scan.

For example, on PG 9.x:
  http://explain.depesz.com/s/qji - This takes 307ms, all the time.  Doesn't matter if it's "cached", or fresh from a reboot.

Same query on PG 8.4:
  http://explain.depesz.com/s/8Pd - This can take 2-3s the first time, but then takes 42ms once it's cached.

Both of these servers have the same indexes, similar postgresql.conf, and almost identical data.  However, the old server is doing some different planning than the new server.  I've run analyze on both of these databases.  Some relevant PG parameters:

  max_connections = 150
  shared_buffers = 6400MB (have tried as high as 20GB)
  work_mem = 20MB (have tried as high as 100MB)
  effective_io_concurrency = 6
  fsync = on
  synchronous_commit = off
  wal_buffers = 16MB
  checkpoint_segments = 30  (have tried 200 when I was loading the db)
  random_page_cost = 2.5
  effective_cache_size = 10240MB  (have tried as high as 16GB)

If I disable the hashjoin, I get massive improvements on PG 9.x ... as fast (or faster) than our PG 8.4 instance.


--
Anthony Presley

Re: PG 9.x prefers slower Hash Joins?

От
Jeff Janes
Дата:
On Tue, Sep 13, 2011 at 4:56 AM, Anthony Presley <anthony@resolution.com> wrote:
> In relation to my previous thread (about SW RAID vs. HW RAID on a P400), I
> was able to narrow down the filesystem speed and in general, our new system
> (running PG 9.1) is about 3x - 5x faster on the IO.
> In looking at the query plans in more depth, it appears that PG 9.0 and 9.1
> are both preferring to do hash joins, which seem to have a "linear" time and
> are slower than PG 8.4 doing an index scan.
>
> For example, on PG 9.x:
>   http://explain.depesz.com/s/qji - This takes 307ms, all the time.  Doesn't
> matter if it's "cached", or fresh from a reboot.
> Same query on PG 8.4:
>   http://explain.depesz.com/s/8Pd - This can take 2-3s the first time, but
> then takes 42ms once it's cached.

Does executing this same query repeatedly with the same parameters
reflect real production use patterns of your system?


> Both of these servers have the same indexes, similar postgresql.conf, and
> almost identical data.  However, the old server is doing some different
> planning than the new server.  I've run analyze on both of these databases.

...
> If I disable the hashjoin, I get massive improvements on PG 9.x ... as fast
> (or faster) than our PG 8.4 instance.

Can you include buffers in your explain analyze?  Also, what is the
plan for the new server when hashjoin is disabled?

What if you lower random_page_cost to 1 (or to whatever value seq_page_cost is)?


Cheers,

Jeff