Re: Re: Query planner using hash join when merge join seems orders of magnitude faster

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Re: Query planner using hash join when merge join seems orders of magnitude faster
Дата
Msg-id 20318.1470092209@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Query planner using hash join when merge join seems orders of magnitude faster  (Branden Visser <mrvisser@gmail.com>)
Ответы Re: Re: Query planner using hash join when merge join seems orders of magnitude faster  (Branden Visser <mrvisser@gmail.com>)
Список pgsql-general
Branden Visser <mrvisser@gmail.com> writes:
> I just wanted to update that I've found evidence that fixing the
> planner row estimation may not actually influence it to use the more
> performant merge join instead of hash join. I have found instances
> where the row estimation is *overestimated* by a magnitude of 4x
> (estimates 2.4m rows) and still chooses hash join over merge join,
> where merge join is much faster (45s v.s. 12s).

I wonder why the merge join is faster exactly.  It doesn't usually have a
huge benefit unless the inputs are presorted already.  The one case I can
think of where it can win quite a lot is if the range of merge keys in one
input is such that we can skip reading most of the other input.  (Extreme
example: one input has keys 1..10, but the other input has keys 1..10000.
We only need to read the first 1% of the second input, assuming there's an
index on its key column so that we don't have to read the whole thing
anyway to sort it.)

The planner is aware of that effect, but I wonder if it's misestimating it
for some reason.  Anyway it would be worth looking closely at your EXPLAIN
ANALYZE results to determine whether early-stop is happening or not.  It'd
manifest as one join input node showing an actual number of rows returned
that's less than you'd expect.

            regards, tom lane


В списке pgsql-general по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: Force pg_hba.conf user with LDAP
Следующее
От: Venkata Balaji N
Дата:
Сообщение: Re: How to best archetect Multi-Tenant SaaS application using Postgres