Discussion: Fwd: views much slower in 9.3 than 8.4


Fwd: views much slower in 9.3 than 8.4

From:
"Carson, Leonard"
Date:

There is only one server at this point.  The 8.4 machine was upgraded to 9.3 about a year ago, and we have no 8.4 backups, so it's difficult if not impossible to recreate the 8.4 environment AFAIK.  One of our developers pointed out the discrepancy in execution times.  I decomposed a slow view and found that it consists of a view calling a view calling a view (3 deep).  This is the EXPLAIN ANALYZE plan of the innermost view:

http://explain.depesz.com/s/IMg
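For illustration, the shape of that nesting as a minimal sketch (hypothetical view names; the real definitions come later in this thread).  PostgreSQL inlines all three levels into a single query tree at plan time, so the planner faces one large join problem:

    -- hypothetical names: three stacked views, each selecting from the next
    CREATE VIEW v_inner AS
      SELECT account_id, resource_id, initial_start_date
        FROM acct.allocations;
    CREATE VIEW v_middle AS
      SELECT * FROM v_inner WHERE resource_id IS NOT NULL;
    CREATE VIEW v_outer AS
      SELECT * FROM v_middle ORDER BY initial_start_date DESC;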
And as you requested:

SELECT version();

PostgreSQL 9.3.4

SELECT name, current_setting(name), source
FROM pg_settings
WHERE source NOT IN ('default', 'override');

autovacuum,on,configuration file
checkpoint_completion_target,0.9,configuration file
checkpoint_segments,16,configuration file
client_encoding,UTF8,session
client_min_messages,notice,configuration file
DateStyle,"ISO, MDY",configuration file
deadlock_timeout,5s,configuration file
default_text_search_config,pg_catalog.english,configuration file
effective_cache_size,8GB,configuration file
effective_io_concurrency,6,configuration file
lc_messages,en_US.UTF-8,configuration file
lc_monetary,en_US.UTF-8,configuration file
lc_numeric,en_US.UTF-8,configuration file
lc_time,en_US.UTF-8,configuration file
listen_addresses,*,configuration file
log_checkpoints,on,configuration file
log_connections,on,configuration file
log_destination,stderr,configuration file
log_directory,/dbms/postgresql/logs/dtfdev,configuration file
log_disconnections,on,configuration file
log_duration,off,configuration file
log_error_verbosity,verbose,configuration file
log_filename,postgresql-%a.log,configuration file
log_hostname,on,configuration file
log_line_prefix,"%t [%p]: [%l-1] db=%d,user=%u ",configuration file
log_lock_waits,on,configuration file
log_min_duration_statement,0,configuration file
log_min_error_statement,error,configuration file
log_min_messages,warning,configuration file
log_rotation_age,1d,configuration file
log_rotation_size,500MB,configuration file
log_statement,none,configuration file
log_temp_files,0,configuration file
log_timezone,US/Pacific,configuration file
log_truncate_on_rotation,on,configuration file
logging_collector,on,configuration file
maintenance_work_mem,256MB,configuration file
max_connections,200,configuration file
max_stack_depth,8MB,configuration file
port,2222,configuration file
random_page_cost,2,configuration file
search_path,"acct, ""$user"", public",session
shared_buffers,4GB,configuration file
ssl,on,configuration file
temp_buffers,16MB,configuration file
TimeZone,US/Pacific,configuration file
wal_level,minimal,configuration file
wal_sync_method,fdatasync,configuration file
work_mem,5MB,configuration file

The server has 24GB of RAM.

from postgresql.conf:
shared_buffers = 4GB
effective_cache_size = 8GB
work_mem = 5MB  (note: I increased work_mem to 500MB and repeated the experiment; no difference in execution time)
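A retest like that is typically done per-session, along these lines (a sketch; projv is the slow view named later in the thread):

    -- raise work_mem for this session only, re-run, then restore it;
    -- if sorts or hashes were spilling to disk this would change the timing
    SET work_mem = '500MB';
    EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM projv;
    RESET work_mem;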


On Mar 18, 2015, at 10:16 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

"Carson, Leonard" <lcarson@sdsc.edu> wrote:

While acknowledging that nested loops and sequential table scans
account for 85% of the execution time which suggests that a
better query may be needed, why would the same query run in
seconds on 8.x but take minutes on 9.x?

First, please show the output of this from both servers:

SELECT version();
SELECT name, current_setting(name), source
FROM pg_settings
WHERE source NOT IN ('default', 'override');

Then, for your newer server, please follow the steps outlined here:

https://wiki.postgresql.org/wiki/SlowQueryQuestions

My first guess would be that at some point your costing parameters
were tuned on the old system, but have not yet been tuned on the
new one.  Rather than blindly using the old settings for the new
server, it would be good to see the information requested on the
above-cited page to determine good settings for the new server.
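For example, the cost parameters most likely to need retuning can be pulled with a query along these lines (a sketch; these are all standard GUC names):

    -- the settings that shape the planner's cost model
    SELECT name, setting, source
      FROM pg_settings
     WHERE name IN ('seq_page_cost', 'random_page_cost',
                    'cpu_tuple_cost', 'cpu_index_tuple_cost',
                    'cpu_operator_cost', 'effective_cache_size');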

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Fwd: views much slower in 9.3 than 8.4

From:
Tom Lane
Date:
"Carson, Leonard" <lcarson@sdsc.edu> writes:
> There is only one server at this point.  The 8.4 machine was upgraded to 9.3 about a year ago, and we have no 8.4
> backups, so it's difficult if not impossible to recreate the 8.4 environment AFAIK.  One of our developers pointed out
> the discrepancy in execution times.  I decomposed a slow view and found that it consists of a view calling a view
> calling a view (3 deep).  This is the EXPLAIN ANALYZE plan of the innermost view:

> http://explain.depesz.com/s/IMg

You're probably going to need to show us the actual view definitions.

I'm suspicious that the underlying cause might have to do with recent
versions being warier about optimizing sub-selects containing volatile
functions than 8.4 was.  However, that theory doesn't seem to explain
the horribly bad join size estimates you're showing.

            regards, tom lane


Re: views much slower in 9.3 than 8.4

From:
"Carson, Leonard"
Date:
Here are the 3 views and some timing notes:
http://pgsql.privatepaste.com/decae31693#
thanks, lcarson

On Mar 18, 2015, at 3:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Carson, Leonard" <lcarson@sdsc.edu> writes:
There is only one server at this point.  The 8.4 machine was upgraded to 9.3 about a year ago, and we have no 8.4 backups, so it's difficult if not impossible to recreate the 8.4 environment AFAIK.  One of our developers pointed out the discrepancy in execution times.  I decomposed a slow view and found that it consists of a view calling a view calling a view (3 deep).  This is the EXPLAIN ANALYZE plan of the innermost view:

http://explain.depesz.com/s/IMg

You're probably going to need to show us the actual view definitions.

I'm suspicious that the underlying cause might have to do with recent
versions being warier about optimizing sub-selects containing volatile
functions than 8.4 was.  However, that theory doesn't seem to explain
the horribly bad join size estimates you're showing.

regards, tom lane

Re: views much slower in 9.3 than 8.4

From:
Tom Lane
Date:
"Carson, Leonard" <lcarson@sdsc.edu> writes:
> Here are the 3 views and some timing notes:
> http://pgsql.privatepaste.com/decae31693#

That doesn't really leave us any wiser than before, unfortunately.

It's clear that the root of the problem is the drastic underestimation of
the size of the rq/a join, but it's not clear why that's happening, nor
why 8.4 would not have fallen into the same trap.

Would it be possible to provide the data in the join columns involved in
that part of the query?  To wit
requests.account_id
requests.start_date
allocations.account_id
allocations.initial_start_date
allocations.resource_id
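One way to pull just those columns for sharing, as a sketch using psql's \copy (the acct schema name comes from the view definitions):

    \copy (SELECT account_id, start_date FROM acct.requests) TO 'requests_cols.csv' CSV HEADER
    \copy (SELECT account_id, initial_start_date, resource_id FROM acct.allocations) TO 'allocations_cols.csv' CSV HEADER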

            regards, tom lane


Re: views much slower in 9.3 than 8.4

From:
Tom Lane
Date:
I wrote:
> "Carson, Leonard" <lcarson@sdsc.edu> writes:
>> Here are the 3 views and some timing notes:
>> http://pgsql.privatepaste.com/decae31693#

> That doesn't really leave us any wiser than before, unfortunately.

> It's clear that the root of the problem is the drastic underestimation of
> the size of the rq/a join, but it's not clear why that's happening, nor
> why 8.4 would not have fallen into the same trap.

Leonard was kind enough to provide the problematic data off-list, and
here's what I find after some poking around: 8.4 is not, in fact, any
smarter than the more recent versions, it just happens to get lucky
on this particular query.  The core of the problem is this aspect of
the projv view:

SELECT ...
   FROM
    allocations a,
    ... other relations ...
  WHERE
    a.initial_start_date = (SELECT max(allocations.initial_start_date)
                            FROM allocations
                            WHERE allocations.account_id = a.account_id AND
                                  allocations.resource_id = a.resource_id)
    AND ... a bunch of other conditions ...

(There's a similar consider-only-the-latest-row construction for
accounts_history, which doubles the problem, but let's just look at this
one for now.)  Now there are two things that are bad about this: first,
the construct requires executing the MAX-subselect for every row of "a",
which is expensive and the planner knows it.  Second, it's very very hard
for the planner to guess right about the selectivity of this condition on
a.initial_start_date.  It ends up using DEFAULT_EQ_SEL which is 0.005,
but given Leonard's data set the actual selectivity is just about 1.0,
ie, there are no records that aren't the latest for their account_id cum
resource_id and thus no rows are eliminated by the condition anyway.
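That claim is easy to check directly; as a sketch against the tables above, count the rows that are not the latest for their (account_id, resource_id) pair:

    -- if this returns 0, the initial_start_date condition filters nothing,
    -- and the planner's DEFAULT_EQ_SEL guess of 0.005 is off by a factor of ~200
    SELECT count(*) AS non_latest
      FROM acct.allocations a
     WHERE a.initial_start_date < (SELECT max(b.initial_start_date)
                                     FROM acct.allocations b
                                    WHERE b.account_id = a.account_id
                                      AND b.resource_id = a.resource_id);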

So we have an expensive scan on "a" that is going to produce many more
rows than the planner thinks.  By the time we get done joining to
accounts_history, which has a similar problem, the planner is estimating
only one row out of the join (vs. nearly 24000 in reality), and it's
setting the total cost estimate at 148163 cost units.  This just totally
bollixes planning of the joins to the remaining half-dozen tables.
The single-row estimate is nearly fatal in itself, because it encourages
nestloop joining which is pretty inappropriate here.  But the other
problem is that the planner considers less-than-1% differences in cost
estimates to be "in the noise", which means that it's not going to
consider cost differences of less than 1480 units in the remaining join
steps to be significant.  This is how come we end up with the apparently
brain-dead decisions to use seqscans on some of the other tables such as
"pi" and "ac": comparing the seqscan to a potential inner indexscan, the
total cost of the join is "the same" according to the 1% rule, and then
the first tiebreaker is startup cost, and the indexscan has a higher
startup cost.

Now, 8.4 also had the 1% rule, but it had slightly different tiebreaking
procedures, which caused it to end up picking the inner indexscans over
the seqscans; and in this particular data set inner indexscans do far
better than repeated seqscans when the rq/a/ah join turns out to produce
24000 times more tuples than predicted.  But I can't persuade myself that
the tiebreaking changes amount to a bug.  (I did experiment with varying
the tiebreaking rules a bit, but I think that would just be moving the
pain around.)

Long-term planner fixes for this type of problem might include improving
the statistics enough that we could get better rowcount estimates.
(Cross-column stats would help, since a contributing factor is that some
of the joins are on two join columns that are pretty heavily correlated.)
Another thing we've discussed is making risk estimates, whereby we could
realize that the nestloop-plus-seqscan plans are going to be a lot worse
if our rowcount estimates are off at all.  But both of those things are
research projects.
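For reference, cross-column statistics of exactly this kind did eventually arrive as extended statistics in PostgreSQL 10, well after this thread; a sketch of what that looks like:

    -- functional-dependency and ndistinct stats across the correlated columns
    CREATE STATISTICS alloc_acct_res (dependencies, ndistinct)
        ON account_id, resource_id FROM acct.allocations;
    ANALYZE acct.allocations;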

What seems like a potential near-term fix for Leonard is to recast his
views to do the latest-row selection more intelligently.  I experimented
with redoing the projv view like this to eliminate the
subselects-in-WHERE:

SELECT ...
   FROM
    acct.requests rq,
    acct.fields_of_science fos,
    acct.accounts ac,
    acct.allocations a,
    (select account_id, resource_id, max(initial_start_date) AS initial_start_date
     FROM acct.allocations GROUP BY 1,2) a_latest,
    acct.transaction_types tt,
    acct.resources ar,
    acct.accounts_history ah,
    (select account_id, resource_id, max(activity_time) AS activity_time
     FROM acct.accounts_history GROUP BY 1,2) ah_latest,
    acct.allocation_states sx,
    acct.principal_investigators pi,
    acct.people p
  WHERE
    a.account_id = ac.account_id AND
    a.account_id = a_latest.account_id AND
    a.resource_id = a_latest.resource_id AND
    a.initial_start_date = a_latest.initial_start_date AND
    rq.account_id = a.account_id AND
    rq.start_date = a.initial_start_date AND
    ar.resource_id = a.resource_id AND
    a.allocation_type_id = tt.transaction_type_id AND
    ah.account_id = a.account_id AND
    ah.resource_id = a.resource_id AND
    ah.account_id = ah_latest.account_id AND
    ah.resource_id = ah_latest.resource_id AND
    ah.activity_time = ah_latest.activity_time AND
    sx.state_id = ah.state_id AND
    rq.primary_fos_id = fos.field_of_science_id AND
    pi.request_id = rq.request_id AND
    p.person_id = pi.person_id
  ;

That produces significantly better plans.  It doesn't look like the
rowcount estimates are better :-( ... but the total estimated cost
is now down in the range of 3000 or so cost units, which means that
the 1% rule doesn't keep us from adopting the inner indexscans.  And
this is fundamentally a better way to do latest-row selection, anyhow.
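For comparison, another common PostgreSQL idiom for latest-row selection (a sketch, not something benchmarked in this thread):

    -- DISTINCT ON keeps exactly one row per (account_id, resource_id) group,
    -- and the ORDER BY makes that row the one with the latest start date
    SELECT DISTINCT ON (account_id, resource_id) *
      FROM acct.allocations
     ORDER BY account_id, resource_id, initial_start_date DESC;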

(I guess another potential research project is to do this sort of
aggregated-subselect transformation automatically.  But don't hold
your breath for that to happen, either.)

            regards, tom lane


Re: views much slower in 9.3 than 8.4

From:
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> But the other problem is that the planner considers less-than-1%
> differences in cost estimates to be "in the noise", which means
> that it's not going to consider cost differences of less than
> 1480 units in the remaining join steps to be significant.  This
> is how come we end up with the apparently brain-dead decisions to
> use seqscans on some of the other tables such as "pi" and "ac":
> comparing the seqscan to a potential inner indexscan, the total
> cost of the join is "the same" according to the 1% rule,

The 1% rule itself might be something to add to the R&D list.  I
have seen it cause big problems in production, although the users
in that case had made a mistake which significantly contributed to
it being an issue.  They had used the enable_seqscan = off setting
for one query which they had been unable to wrestle into good
performance in other ways, but accidentally neglected to turn it
back on after that query.  Now, seqscans were rarely a good idea
with their permanent tables, but they had a couple queries which
used very small temporary tables with no indexes.  The planner chose the
seqscan in spite of the setting; but, when run with seqscans off,
that gave all candidate plans such a high cost that they all looked
"equal" and the tie-breaker logic picked a horrible one.  (The
faster plans did have lower cost, but not by enough to exceed the
1% threshold.)  Now, had they not made a questionable choice in
disabling seqscan in production, compounded by an error in not
turning it back on, they would not have had their main web
application slow to unusable levels at times -- but it seems to me
that it might be reasonable to have some absolute cost maximum
difference test that needs to be met in addition to the percentage
difference, as kind of a "safety" on this foot-gun.
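In session terms, the foot-gun looked roughly like this (a sketch):

    SET enable_seqscan = off;   -- meant for one stubborn query only
    -- ... run the one query that needed it ...
    RESET enable_seqscan;       -- the step that was accidentally skipped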

I'm not sold on this as being a good idea, and had not been
planning on raising it without further research; but since it plays
into this other scenario it seems worth mentioning as material for
potential R&D.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: views much slower in 9.3 than 8.4

From:
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> But the other problem is that the planner considers less-than-1%
>> differences in cost estimates to be "in the noise", which means
>> that it's not going to consider cost differences of less than
>> 1480 units in the remaining join steps to be significant.  This
>> is how come we end up with the apparently brain-dead decisions to
>> use seqscans on some of the other tables such as "pi" and "ac":
>> comparing the seqscan to a potential inner indexscan, the total
>> cost of the join is "the same" according to the 1% rule,

> The 1% rule itself might be something to add to the R&D list.

Perhaps.  But it does make for a significant difference in planner speed,
and I would argue that any case where it really hurts is by definition
a cost estimation failure somewhere else.

> [ disable_cost skews the behavior pretty badly ]

True.  Your example suggests that it might be nice to have something other
than a cost-delta way of discriminating against seqscans.  In principle
this consideration could be added to add_path(), although I'm pretty
hesitant to make that function even more complex/slower.

Perhaps another way would be to generate seqscan paths last (I think
they're first at the moment), and only generate them if we didn't find
any other path for the rel.

Nestloops for join rels have the same issue and would need to be handled
similarly, whatever solution we pick.

            regards, tom lane


Re: views much slower in 9.3 than 8.4

From:
Jim Nasby
Date:
On 3/30/15 10:52 AM, Tom Lane wrote:
> Kevin Grittner <kgrittn@ymail.com> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> But the other problem is that the planner considers less-than-1%
>>> differences in cost estimates to be "in the noise", which means
>>> that it's not going to consider cost differences of less than
>>> 1480 units in the remaining join steps to be significant.  This
>>> is how come we end up with the apparently brain-dead decisions to
>>> use seqscans on some of the other tables such as "pi" and "ac":
>>> comparing the seqscan to a potential inner indexscan, the total
>>> cost of the join is "the same" according to the 1% rule,
>> The 1% rule itself might be something to add to the R&D list.

> Perhaps.  But it does make for a significant difference in planner speed,
> and I would argue that any case where it really hurts is by definition
> a cost estimation failure somewhere else.

What I wish we had was some way to represent "confidence" in the
accuracy of a specific plan node, with the goal of avoiding plans that
cost out slightly cheaper but will blow up spectacularly if we guessed
wrong on something.  Nested loops are an example: if you miscalculate
either side by very much, you can end up with a real mess unless the
rowcounts were pretty trivial to begin with.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com