Обсуждение: BUG #11500: PRIMARY KEY index not being used
The following bug has been logged on the website: Bug reference: 11500 Logged by: Marko Tiikkaja Email address: marko@joh.to PostgreSQL version: 9.1.12 Operating system: Linux Description: Hi, We've been observing a performance problem where a PRIMARY KEY index is not being used. The problem looks like this: pg2=#* explain analyze select * from events where eventid = 132685185 and processed = 0; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------- Index Scan using index_events_processed on events (cost=0.00..7.73 rows=1 width=106) (actual time=31.808..31.808 rows=0 loops=1) Index Cond: (processed = 0) Filter: (eventid = 132685185) Total runtime: 31.852 ms (4 rows) pg2=#* explain analyze select * from events where eventid = 132685185 and processed+0 = 0; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Index Scan using events_pkey on events (cost=0.00..12.38 rows=1 width=106) (actual time=0.071..0.071 rows=0 loops=1) Index Cond: (eventid = 132685185) Filter: ((processed + 0) = 0) Total runtime: 0.109 ms (4 rows) I'm guessing that's happening because the index on processed is smaller (7GB, relpages=900880 vs 3.7GB, relpages=478225). The statistics say that there are no rows where processed=0 (and it's not far from the truth), but it's still a risky plan compared to the PK lookup. The index index_events_processed is an index on events(processed), which should probably be a partial index on WHERE processed = 0, but I thought I'd report this plan anyway. Any thoughts?
On Fri, Sep 26, 2014 at 11:02 AM, <marko@joh.to> wrote: > The statistics say that > there are no rows where processed=0 (and it's not far from the truth), but > it's still a risky plan compared to the PK lookup. > Any thoughts? PostgreSQL 9.0 introduced this optimization for greater/less operators: > When looking up statistics for greater/less-than comparisons, if the > comparison value is in the first or last histogram bucket, use an index > (if available) to fetch the current actual column minimum or maximum. > This greatly improves the accuracy of estimates for comparison values > near the ends of the data range, particularly if the range is constantly > changing due to addition of new data. Not sure whether it's a good idea a bad idea, but perhaps a solution is to expand this to equality lookups too? Does using "WHERE processed <= 0" work around the problem? (Assuming you don't have any negative numbers in this column). > The index > index_events_processed is an index on events(processed), which should > probably be a partial index on WHERE processed = 0, but I thought I'd > report this plan anyway. I guess you would still have this problem, unless your new index contains the eventid column. Regards, Marti
On 9/26/14 2:02 PM, Marti Raudsepp wrote: > On Fri, Sep 26, 2014 at 11:02 AM, <marko@joh.to> wrote: >> The statistics say that >> there are no rows where processed=0 (and it's not far from the truth), but >> it's still a risky plan compared to the PK lookup. > >> Any thoughts? > > PostgreSQL 9.0 introduced this optimization for greater/less operators: > >> When looking up statistics for greater/less-than comparisons, if the >> comparison value is in the first or last histogram bucket, use an index >> (if available) to fetch the current actual column minimum or maximum. >> This greatly improves the accuracy of estimates for comparison values >> near the ends of the data range, particularly if the range is constantly >> changing due to addition of new data. > > Not sure whether it's a good idea a bad idea, but perhaps a solution > is to expand this to equality lookups too? I'm not sure that's the right idea to be honest. The problem is that the planner is taking a risk by using an index which could contain (theoretically) any number of matching rows, instead of using the primary key which is guaranteed to only contain 0 or 1 rows. Sure, peeking into the index to see that there are indeed some processed=0 rows would probably discourage the planner from using it, but why bother? > Does using "WHERE processed <= 0" work around the problem? (Assuming > you don't have any negative numbers in this column). I unfortunately already dropped the problematic index, so I can't answer that. .marko
marko@joh.to writes: > PostgreSQL version: 9.1.12 > We've been observing a performance problem where a PRIMARY KEY index is not > being used. The problem looks like this: In 9.2 and again in 9.3, we significantly changed the planner's modeling of index access costs, with a view towards making better choices when there are multiple plausible indexes to use. If you can reproduce this misbehavior in 9.3 or later it would be worth looking into; but we're not going to change 9.1's estimation rules at this late date. People running on back branches tend to want plan stability in my experience. regards, tom lane