Обсуждение: Query 4-5 times slower after ANALYZE

Поиск
Список
Период
Сортировка

Query 4-5 times slower after ANALYZE

От
"Philippe Lang"
Дата:
Hi,

I'm using Postgresql 8.3.6 under Freebsd 7.1.

After a fresh restore of a customer dump (running version 8.2.7 at the
moment), a rather big query executes in about 30 seconds. As soon as I
run ANALYZE, it is instantly 4-5 times slower. I could check that
multiples times.

Here is the EXPLAIN ANALYZE before the ANALYZE:
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before.t
xt

And here the the EXPLAIN ANALYZE after the ANALYZE:
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.tx
t

Any idea what could be turned on/off in order not to have this slowdown
after the ANALYZE?

Best regards,

---------------
Philippe Lang
Attik Syste,

Re: Query 4-5 times slower after ANALYZE

От
Bill Moran
Дата:
In response to "Philippe Lang" <philippe.lang@attiksystem.ch>:
>
> I'm using Postgresql 8.3.6 under Freebsd 7.1.
>
> After a fresh restore of a customer dump (running version 8.2.7 at the
> moment), a rather big query executes in about 30 seconds. As soon as I
> run ANALYZE, it is instantly 4-5 times slower. I could check that
> multiples times.
>
> Here is the EXPLAIN ANALYZE before the ANALYZE:
> http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before.txt
>
> And here the the EXPLAIN ANALYZE after the ANALYZE:
> http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.txt
>
> Any idea what could be turned on/off in order not to have this slowdown
> after the ANALYZE?

I opened one of those links figuring I'd take a few minutes to see if I could
muster up some advice ... and just started laughing ... definitely not the
type of query that one can even understand in just a few minutes!

Anyway, the real reason I posted -- I doubt if anyone will be able to make
sense of a query plan that complex without the actual query, so you'll
probably want to post it as well.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Query 4-5 times slower after ANALYZE

От
Gregory Stark
Дата:
Bill Moran <wmoran@potentialtech.com> writes:

> I opened one of those links figuring I'd take a few minutes to see if I could
> muster up some advice ... and just started laughing ... definitely not the
> type of query that one can even understand in just a few minutes!

You might consider setting default_statistics_target to 100 and re-analyzing.
The estimates don't look too far off but like Bill I haven't analyzed it very
carefully.

One other things that stands out, using comparisons like
 ('now'::date - creation_date) >= <expression>

is going to make it hard to optimize. Better to use something like
 creation_date <= now() - <expression>

Both because of the now() instead of 'now'::date and because the latter is a
comparison that can be indexed instead of an expression which could use an
index on creation_date.


--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

Re: Query 4-5 times slower after ANALYZE

От
tv@fuzzy.cz
Дата:
> In response to "Philippe Lang" <philippe.lang@attiksystem.ch>:
>>
>> I'm using Postgresql 8.3.6 under Freebsd 7.1.
>>
>> After a fresh restore of a customer dump (running version 8.2.7 at the
>> moment), a rather big query executes in about 30 seconds. As soon as I
>> run ANALYZE, it is instantly 4-5 times slower. I could check that
>> multiples times.
>>
>> Here is the EXPLAIN ANALYZE before the ANALYZE:
>> http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before.txt
>>
>> And here the the EXPLAIN ANALYZE after the ANALYZE:
>> http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.txt
>>
>> Any idea what could be turned on/off in order not to have this slowdown
>> after the ANALYZE?
>
> I opened one of those links figuring I'd take a few minutes to see if I
> could
> muster up some advice ... and just started laughing ... definitely not the
> type of query that one can even understand in just a few minutes!
>
> Anyway, the real reason I posted -- I doubt if anyone will be able to make
> sense of a query plan that complex without the actual query, so you'll
> probably want to post it as well.

Yeah, I had the same problem ;-) Anyway the reason why this query is so
slow is scanning the "customers" table - before the analysis an index scan
is used (and it's really fast), while after the analyze a sequential scan
is used instead (it's scanned several times and it takes almost 30 seconds
every time).

The question is why is a sequential scan chosen instead of index scan -
the estimated row counts seem quite precise, so maybe there's something
wrong with the cost settings. Have you modified the _cost parameters? What
are the current values? See this

http://www.postgresql.org/docs/8.3/interactive/runtime-config-query.html

Try to increase the seq_page_cost a (you may do that for the session only)
and see if the performance improves. You may even disable the sequential
scan using enable_seqscan = off.

Another cause might be a low statistics target - try to increase it with

ALTER TABLE SET STATISTICS

but the funny thing is it's not necessarily the "customers" table ;-)

regards
Tomas


Re: Query 4-5 times slower after ANALYZE

От
Gregory Stark
Дата:
Oh, the other thing you could try experimenting with are these two parameters.

Your query has *way* more tables than the default values for these so you
would have to raise them substantially. Given that the query was running in
30s you may find that this increases the planning time by more time than it
saves in the query -- the reason these limits exist at all..

geqo_threshold
join_collapse_limit
from_collapse_limit

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

Re: Query 4-5 times slower after ANALYZE

От
Sebastian Pawłowski
Дата:
hi

it's look better with explain.depesz.com :) you can easily find that
in "before" it uses index scan and in "after" it uses seq


before:

http://explain.depesz.com/s/RC

after:

http://explain.depesz.com/s/nm

try as mentioned before to change costs

Best regards,

Sebastian Pawłowski


Wiadomość napisana w dniu 2009-03-18, o godz. 10:55, przez Philippe
Lang:

> Hi,
>
> I'm using Postgresql 8.3.6 under Freebsd 7.1.
>
> After a fresh restore of a customer dump (running version 8.2.7 at the
> moment), a rather big query executes in about 30 seconds. As soon as I
> run ANALYZE, it is instantly 4-5 times slower. I could check that
> multiples times.
>
> Here is the EXPLAIN ANALYZE before the ANALYZE:
> http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before.t
> xt
>
> And here the the EXPLAIN ANALYZE after the ANALYZE:
> http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.tx
> t
>
> Any idea what could be turned on/off in order not to have this
> slowdown
> after the ANALYZE?
>
> Best regards,
>
> ---------------
> Philippe Lang
> Attik Syste,
>
> -
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


Re: Query 4-5 times slower after ANALYZE

От
"Philippe Lang"
Дата:
pgsql-general-owner@postgresql.org wrote:
> In response to "Philippe Lang" <philippe.lang@attiksystem.ch>:
>>
>> I'm using Postgresql 8.3.6 under Freebsd 7.1.
>>
>> After a fresh restore of a customer dump (running version 8.2.7 at
>> the moment), a rather big query executes in about 30 seconds. As
>> soon as I run ANALYZE, it is instantly 4-5 times slower. I could
>> check that multiples times.
>>
>> Here is the EXPLAIN ANALYZE before the ANALYZE:
>>
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before
>> .txt
>>
>> And here the the EXPLAIN ANALYZE after the ANALYZE:
>>
http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.
>> txt
>>
>> Any idea what could be turned on/off in order not to have this
>> slowdown after the ANALYZE?
>
> I opened one of those links figuring I'd take a few minutes to see if
> I could muster up some advice ... and just started laughing ...
> definitely not the type of query that one can even understand in just
> a few minutes!
>
> Anyway, the real reason I posted -- I doubt if anyone will be able to
> make sense of a query plan that complex without the actual query, so
> you'll probably want to post it as well.

:) What? I thought you would read that like Neo was reading the
Matrix... :)

Bill, Gregory, Tomas and Sebastian, thanks for your answers. I have
tried everything your mentioned:

- increase seq_page_cost
- increase geqo_threshold
- increase join_collapse_limit
- increase from_collapse_limit

But it did not help (except disabling completely sequential scans), and
for a reason I think I understand better now: part of the query looks
like:

----------------
SELECT

c.id AS customer_id,
c.name AS customer_name,
d.id AS document_id,
d.number AS document_number,
d.vref AS document_vref,
dt.name AS type,
d.creation_date AS value_date

FROM documents AS d

LEFT JOIN payment_terms AS pt
ON d.payment_term_id = pt.id

INNER JOIN reminder_levels AS rl
ON d.reminder_level_id = rl.id

INNER JOIN document_types AS dt
ON d.document_type_id = dt.id

INNER JOIN projects AS p
ON d.project_id = p.id

INNER JOIN customers AS c
ON p.customer_id = c.id

WHERE d.reminder = 1

AND solde_po(CURRENT_DATE, c.id) > 0

AND d.creation_date <= CURRENT_DATE
----------------

The heavy part here is the "solde_po" call (at the end), which takes up
most CPU time. That's why scanning the customers table takes up so much
time. I imagine a small change in the way this table is scanned can have
enormous effects in the overall execution time, like when an sequential
scan is preferred over an index scan. Does that sound correct?

A small question here: solde_po is an SQL function (not PLPGSQL). Is it
"inlined" in the parent query before the whole query execution plan is
calculated? Or are they treated completely separately?

Philippe

P.S. Thanks for the link to "explain.depesz.com"! Great tool!







Re: Query 4-5 times slower after ANALYZE

От
Bill Moran
Дата:
In response to "Philippe Lang" <philippe.lang@attiksystem.ch>:

[snip]

> > Anyway, the real reason I posted -- I doubt if anyone will be able to
> > make sense of a query plan that complex without the actual query, so
> > you'll probably want to post it as well.
>
> :) What? I thought you would read that like Neo was reading the
> Matrix... :)

Yeah ... not quite like that, although it appears that there are some
on-list who do have that level of skill ...

[more snip ...]

> AND solde_po(CURRENT_DATE, c.id) > 0
>
> AND d.creation_date <= CURRENT_DATE
> ----------------
>
> The heavy part here is the "solde_po" call (at the end), which takes up
> most CPU time. That's why scanning the customers table takes up so much
> time. I imagine a small change in the way this table is scanned can have
> enormous effects in the overall execution time, like when an sequential
> scan is preferred over an index scan. Does that sound correct?

Quite likely.

> A small question here: solde_po is an SQL function (not PLPGSQL). Is it
> "inlined" in the parent query before the whole query execution plan is
> calculated? Or are they treated completely separately?

Is that function STABLE?  If it's VOLATILE, can it be rewritten to be
STABLE?  That might make a lot of difference if it's repeatedly called
with the same values.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Query 4-5 times slower after ANALYZE

От
Tom Lane
Дата:
"Philippe Lang" <philippe.lang@attiksystem.ch> writes:
> A small question here: solde_po is an SQL function (not PLPGSQL). Is it
> "inlined" in the parent query before the whole query execution plan is
> calculated?

You should be able to tell that by inspecting the filter conditions
in the ANALYZE output.  Do you see solde_po(), or its innards?

(Or in other words: it depends, and you didn't give us sufficient
info to say.)

            regards, tom lane