Discussion: Possible Performance Regression with Transitive Comparisons vs. Constants
Hey guys,

I ran into this while we were working on an upgrade project. We're
moving from 8.2 (don't ask) to 9.1, and started getting terrible
performance for some queries. I've managed to boil it down to a test case:

create temp table my_foo as
select a.id,
       '2012-01-01'::date + (random()*365)::int as created_dt
  from generate_series(1,5000) as a(id);

create temp table my_bar as
select b.id,
       (random()*4999)::int + 1 as aid,
       '2012-01-01'::date + (random()*365)::int as created_dt
  from generate_series(1,500000) as b(id);

analyze my_foo;
analyze my_bar;

create index idx_foo_id on my_foo (id);
create index idx_foo_const on my_foo (created_dt);
create index idx_bar_id on my_bar (id);
create index idx_bar_aid on my_bar (aid);
create index idx_bar_const on my_bar (created_dt);

Ok, simple enough, right? Now do this:

explain analyze
select b.*
  from my_foo a, my_bar b
 where a.created_dt = '2012-05-05'
   and b.created_dt between a.created_dt
                        and a.created_dt + interval '1 month';

explain analyze
select b.*
  from my_foo a, my_bar b
 where a.created_dt = '2012-05-05'
   and b.created_dt between '2012-05-05'
                        and '2012-05-05'::date + interval '1 month';

These do not produce the same query plan, which is itself odd. The other
thing is that query 1 is about 4-8x slower than query 2, but only when I
test it on PostgreSQL 9.1. When I test it on 8.2 (eww) they're about equal
in performance. I should note that the plan for both cases on 8.2 performs
better than query 1 on 9.1.

So I've got two questions:

1. Is it normal for trivially equal values to produce a non-optimal plan
   like this?
2. What on earth happened between 8.2 and 9.1 that made performance worse
   for this test case?

To head off any questions: I've tested this in multiple environments, and
it's always consistent. 9.1 performs worse than 8.2 here, so long as you
rely on PostgreSQL to derive the equivalence instead of substituting it
manually.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com
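One wrinkle for anyone reproducing this test case: the tables are populated
with random(), so each session builds different data. A minimal, optional
tweak (not part of the original recipe) pins the generator before the
CREATE TABLE statements so both servers under test see identical data, and
psql's \timing helps wash out cache effects:

-- Fix the sequence that subsequent random() calls produce in this session;
-- setseed() accepts any value between -1 and 1.
select setseed(0.5);

-- psql meta-command: report client-side elapsed time for each statement.
-- Run each EXPLAIN ANALYZE a few times so both queries hit a warm cache.
\timing on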
Shaun Thomas <sthomas@optionshouse.com> writes:
> I ran into this while we were working on an upgrade project. We're
> moving from 8.2 (don't ask) to 9.1, and started getting terrible
> performance for some queries. I've managed to boil it down to a test case:

9.1.what?

For me, 8.2.23 and 9.1.6 produce the same plan and just about the same
runtime for your query 1. For query 2, 9.1.6 prefers to stick in a
Materialize node, which cuts the runtime 30% or so --- but if I set
enable_material to off then I get the same plan and runtime as with 8.2.

Perhaps you should show the EXPLAIN ANALYZE outputs you're actually
getting, rather than assuming others will get the same thing.

			regards, tom lane

(PS: it does seem that HEAD has got some kind of issue here, because it's
picking a plain indexscan rather than a bitmap indexscan. I'll go look at
that. But I don't see that misbehavior in 9.1.)
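Replicating the enable_material comparison Tom describes takes one setting;
a sketch, assuming the my_foo/my_bar tables from the first message:

-- Planner settings are session-local, so this affects only the current
-- connection. With materialization disabled, 9.1 should fall back to the
-- same plan 8.2 produces for query 2.
set enable_material = off;

explain analyze
select b.*
  from my_foo a, my_bar b
 where a.created_dt = '2012-05-05'
   and b.created_dt between '2012-05-05'
                        and '2012-05-05'::date + interval '1 month';

-- Put the setting back to its default afterward.
reset enable_material;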
On 09/28/2012 03:35 PM, Tom Lane wrote:
> 9.1.what? For me, 8.2.23 and 9.1.6 produce the same plan and just
> about the same runtime for your query 1.

I withdraw that part of my question. I apparently didn't look closely
enough at the actual output. I was basing the version assumption on the
query speed on the new server, when it was probably due to cache effects.

The first part of the question stands, though... Why isn't the optimizer
substituting these values? a.created_dt should be exactly equivalent
to '2012-05-05', but it's clearly not being treated that way.

With the full substitutions, I'm seeing things like this:

http://explain.depesz.com/s/3T4

With the column names, it's this:

http://explain.depesz.com/s/Fq7

This is on 8.2, but the behavior is the same on 9.1. It goes from 130s to
23s simply by substituting the constant wherever the column name is
encountered. For reference, the queries follow. Slow:

select a.id, f.ezorder_id
  from reporting.account a
  join ezorder f on f.account_id = a.account_id
 where a.process_date = '2012-09-27'
   and f.date_created between a.process_date - interval '6 months'
                          and a.process_date
   and a.row_out is null;

And fast:

select a.id, f.ezorder_id
  from reporting.account a
  join ezorder f on f.account_id = a.account_id
 where a.process_date = '2012-09-27'
   and f.date_created between '2012-09-27'::date - interval '6 months'
                          and '2012-09-27'
   and a.row_out is null;

We discovered this during the upgrade, but it seems to apply equally to
both 8.2 and 9.1. I've been telling the devs all day to replace any of
these they find. I can't quite say why we never "noticed" this before, but
it got exposed pretty plainly today. If this were a compiler, I'd have
expected it to treat the values as equivalent, but that's clearly not
what's happening.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com
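Part of why the manual substitution wins is visible in plain EXPLAIN: when
both BETWEEN bounds are constants, the planner folds the date arithmetic at
plan time and can estimate how selective the range is over ezorder. A
sketch against the tables named above (the exact plan shape will vary):

-- The plan's index or filter condition will show an already-computed
-- timestamp constant (the result of '2012-09-27'::date - interval
-- '6 months'), because the expression is built from immutable operators
-- over constants and is folded during planning.
explain
select f.ezorder_id
  from ezorder f
 where f.date_created between '2012-09-27'::date - interval '6 months'
                          and '2012-09-27';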
Shaun Thomas <sthomas@optionshouse.com> writes:
> The first part of the question stands, though... Why isn't the optimizer
> substituting these values? a.created_dt should be exactly equivalent
> to '2012-05-05', but it's clearly not being treated that way.

No version of Postgres has ever substituted constants in the way you're
imagining, and I wouldn't hold my breath waiting for it to happen. The
reason is that "x = constant" only creates a requirement for x to be
btree-equal to the constant, and btree equality doesn't guarantee equality
for all purposes. In this example we'd have to assume that btree equality
guaranteed identical results from the date + interval addition operator.
While that happens to be true for this operator, the planner can't know
that.

A real-world example of the kind of case I'm worried about is that in
IEEE-spec float arithmetic, minus zero and plus zero compare equal --- but
there are functions that give different results for the two values.
Another is that the char(n) type's equality operator will say that 'foo'
and 'foo ' are equal, but those values are definitely distinguishable by
some operations, e.g. length().

There are some cases where the planner can effectively propagate
constants, but they rely on transitivity of btree equality operators. For
instance, if we have x = constant and x = y, with compatible equality
operators, we can deduce y = constant. But that doesn't imply that y *is*
the constant, just that it's btree-equal to it.

There have been some discussions of inventing a stronger notion of
equality than btree equality, so that we could know when it's safe to make
this type of substitution; but nothing's been done about that. Personally,
I think it's fairly rare that any real win would come from this type of
constant substitution, so it's very likely that adding it would just
create a net drag on performance (because of the added planner cycles
spent looking for substitution opportunities, which would happen in every
query whether it got any benefit or not).

Another point here is that at least for the one side of your BETWEEN
operator, b.created_dt >= a.created_dt, we could in fact combine that with
a.created_dt = '2012-05-05' to deduce b.created_dt >= '2012-05-05',
because we know from the btree opclass for dates that these = and >=
operators have compatible semantics. Again, though, it seems likely that
the cost of looking for such opportunities would outweigh the benefits. In
this particular example I don't think it'd do much good --- the reason the
planner isn't picking a plan similar to the "fast" one is that it doesn't
know that the BETWEEN with variable limits will select only a relatively
small part of the table. Providing a constant limit for just one side
wouldn't fix that.

			regards, tom lane
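Both halves of this explanation are easy to see from psql. First, a minimal
sketch of "btree-equal but not interchangeable," using the IEEE minus-zero
case Tom mentions:

-- Minus zero and plus zero satisfy the float8 equality operator, yet the
-- output function still distinguishes them, so the planner cannot treat
-- "compares equal to" as "substitutable for".
select '-0'::float8 = '0'::float8 as btree_equal,  -- returns t
       '-0'::float8::text as minus_zero,           -- returns '-0'
       '0'::float8::text  as plus_zero;            -- returns '0'

Second, the transitive propagation the planner *does* perform can be seen
against the my_foo/my_bar tables from the start of the thread:

-- From b.aid = a.id and a.id = 5, the planner deduces b.aid = 5 and can
-- use idx_bar_aid --- but it only treats b.aid as btree-equal to 5, not
-- as being the constant inside other expressions.
explain
select b.*
  from my_foo a
  join my_bar b on b.aid = a.id
 where a.id = 5;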