Re: Outer Join performance in PostgreSQL

Поиск
Список
Период
Сортировка
От Michael Alan Dorman
Тема Re: Outer Join performance in PostgreSQL
Дата
Msg-id 87d5l9r1mm.fsf@hero.mallet-assembly.org
обсуждение исходный текст
Ответ на Outer Join performance in PostgreSQL  (Ashok Agrawal <Ashok.Agrawal@Sun.COM>)
Ответы Re: Outer Join performance in PostgreSQL  (Ashok Agrawal <Ashok.Agrawal@Sun.COM>)
Список pgsql-performance
Ashok Agrawal <Ashok.Agrawal@Sun.COM> writes:
> I noticed outer join is very very slow in postgresql as compared
> to Oracle.

I think the three things the people best able to help you are going to
ask for are 1) what version of PostgreSQL, 2) what are the tables, and
how many rows in each, and 3) output from 'explain analyze' rather
than just 'explain'.

That said, I'm willing to take an amateurish stab at it even without
that.

In fact, I don't think the outer joins are the issue at all.  I see
that you're forcing a right join from ecms_certificate_types to
ecms_cert_headers.  This seems to be causing postgresql to think it
must (unnecessarily) consider three quarters of a billion rows, which,
if I'm reading right, seems to be producing the majority of the
estimated cost:

>  Hash Join  (cost=1666049.74..18486619.37 rows=157735046 width=874)
>    Hash Cond: ("outer".certificate_type_id = "inner".certificate_type_id)
>    ->  Merge Right Join  (cost=1666048.13..11324159.05 rows=643816513 width=826)

In fact, looking at the fact that you're doing a COALESCE on a column
from b, it seems to me that doing a right join from ecms_cert_headers
to ecms_certificate_types is just wrong.  It seems to me that that
should be a left join as well.

With that in mind, I would rewrite the whole FROM clause as:

           FROM ecms_cert_headers a
LEFT OUTER JOIN ecms_certificate_types b
             ON (a.certificate_type_id = b.certificate_type_id)
LEFT OUTER JOIN taxpack_user c
             ON (a.created_by = c.emp_no)
LEFT OUTER JOIN taxpack_user d
             ON (a.updated_by = d.emp_no)
LEFT OUTER JOIN taxpack_user e
             ON (a.approved_by = e.emp_no)
          WHERE a.dln_code = '17319'

It seems to me that this more reflects the intent of the data that is
being retrieved.  I would also expect it to be a boatload faster.

Assuming I've understood the intent correctly, I would guess that the
difference is the result of the Oracle planner being able to eliminate
the right join or something.

Mike

В списке pgsql-performance по дате отправления:

Предыдущее
От: Ron Peacetree
Дата:
Сообщение: Re: Some help on buffers and other performance tricks
Следующее
От: Ron Peacetree
Дата:
Сообщение: Re: Sort performance on large tables