Re: Planner reluctant to start from subquery

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Planner reluctant to start from subquery
Дата
Msg-id 16717.1138899504@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Planner reluctant to start from subquery  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Список pgsql-performance
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes [offlist]:
> Attached is a pg_dump -c file with only the required rows (none of
> which contain confidential data), and 0.1% of the rows from the larger
> tables.  It does show the same pattern of costing and plan choice.

Thanks for the test case.  The first thing I found out was that HEAD
does generate the fast plan from the IN case, while 8.1 does not, and
after a bit of digging the reason became clear.  The initial state
that the planner starts from is essentially

    SELECT ... FROM ((C JOIN P) LEFT JOIN WPCT) IN-JOIN D

(IN-JOIN being a notation for the way the planner thinks about IN, which
is that it's a join with some special runtime behavior).  The problem
with this is that outer joins don't always commute with other joins,
and up through 8.1 we didn't have any code to analyze whether or not
re-ordering outer joins is safe.  So we never did it at all.  HEAD does
have such code, and so it is able to re-order the joins enough to
generate the fast plan, which is essentially

    SELECT ... FROM ((C IN-JOIN D) JOIN P) LEFT JOIN WPCT

This is why eliminating the OUTER JOIN improved things for you.  Your
manual rearrangement into a JOIN-with-GROUP-BY inside the OUTER JOIN
essentially duplicates the IN-JOIN rearrangement that HEAD is able to
do for itself.

BTW, the reason why getting rid of the OR improved matters is that:
(a) with the "WPCT"."profileName" IS NOT NULL part as a top-level WHERE
clause, the planner could prove that it could reduce the OUTER JOIN to
a JOIN (because no null-extended row would pass that qual), whereupon
it had join order flexibility again.
(b) with the "C"."caseType" = 'PA' AND "C"."isConfidential" = false
part as a top-level WHERE clause, there still wasn't any join order
flexibility, but this added restriction on C reduced the number of C
rows enough that there wasn't a performance problem anyway.

So it's all fairly clear now what is happening.  The good news is we
have this fixed for 8.2, the bad news is that that patch is much too
large to consider back-patching.

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Marc Morin"
Дата:
Сообщение: Re: partitioning and locking problems
Следующее
От: Alan Stange
Дата:
Сообщение: Re: Huge Data sets, simple queries