not exits slow compared to not in. (nested loops killing me)

Поиск
Список
Период
Сортировка
От mark
Тема not exits slow compared to not in. (nested loops killing me)
Дата
Msg-id BANLkTinjCKm3kC4GemetBmvOcwL5HBwqww@mail.gmail.com
обсуждение исходный текст
Ответы Re: not exits slow compared to not in. (nested loops killing me)  (Craig Ringer <craig@postnewspapers.com.au>)
Список pgsql-performance
Hi all,

I am trying to speed up a query on a DB I inherited and I am falling
flat on my face .

I changed a query from NOT IN to use NOT EXISTS and my query time went
from 19000ms to several hours (~50000000  ms). this shocked me so much
I pretty much had to post. This seems like a corner case of the
planner not knowing that the nested-loops are going to turn out badly
in this case. The planner choosing a 13hr nested loop here is
basically the reason I am posting.

I have played around with rewriting this query using some CTEs and a
left join but thus far my results are not encouraging.   Given what
little I know , it seems like a LEFT JOIN where right_table.col is
null gets the same performance and estimates as a NOT EXISTS. (and
still picks a nested loop in this case)

I can see where it all goes to hell time wise, turning off nested
loops seems to keep it from running for hours for this query, but not
something I am looking to do globally. The time is not really that
much better than just leaving it alone with a NOT IN.

two queries are at http://pgsql.privatepaste.com/a0b672bab0#

the "pretty" explain versions :

NOT IN (with large work mem - 1GB)
http://explain.depesz.com/s/ukj

NOT IN (with only 64MB for work_mem)
http://explain.depesz.com/s/wT0

NOT EXISTS (with 64MB of work_mem)
http://explain.depesz.com/s/EuX

NOT EXISTS (with nested loop off. and 64MB of work_mem)
http://explain.depesz.com/s/UXG

LEFT JOIN/CTE (with nested loop off and 1GB of work_mem)
http://explain.depesz.com/s/Hwm

table defs, with estimated row counts (which all 100% match exact row count)
http://pgsql.privatepaste.com/c2ff39b653

tried running an analyze across the whole database, no affect.

I haven't gotten creative with explicit join orders yet .

postgresql 9.0.2.

willing to try stuff for people as I can run things on a VM for days
and it is no big deal. I can't do that on production machines.

thoughts ? ideas ?


-Mark

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: poor performance when recreating constraints on large tables
Следующее
От: Josh Berkus
Дата:
Сообщение: 8.4/9.0 simple query performance regression