Обсуждение: bad plan with custom data types
First guys, thanks for the help getting my mapped data types together. They work great until I run a query on a table where one is an index column. The planner produces really bad plans (uses nested loop where merge join would work). I ran ANALYZE on each of the tables in the query before the test. First, the two tables are indexed by the same key (date, model, bucket). The types of model, bucket, and symbol are my custom types (they store as an integer and display as a string, mapped from a table). When I created the operators, I created the operator= to support merge, join is set to eqjoinsel. When I first run explain I see: explain select * from create_retail_bucket inner join execution using (date_, model_, bucket, symbol) where create_retail_bucket.date_ = '20061101'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------Merge Join (cost=1514021.91..1524365.32 rows=14 width=205) Merge Cond: (("outer".bucket = "inner".bucket) AND ("outer".symbol= "inner".symbol) AND ("outer".model_ = "inner".model_)) -> Sort (cost=1297329.34..1299646.21 rows=926747 width=167) Sort Key: create_retail_bucket.bucket, create_retail_bucket.symbol, create_retail_bucket.model_ -> Bitmap Heap Scan on create_retail_bucket (cost=6810.61..1159975.47 rows=926747 width=167) Recheck Cond: (date_ = '2006-11-01'::date) -> Bitmap Index Scan on create_retail_bucket_date_model_bucket_idx (cost=0.00..6810.61 rows=926747 width=0) Index Cond: (date_ = '2006-11-01'::date) -> Sort (cost=216692.56..217397.92rows=282143 width=54) Sort Key: execution.bucket, execution.symbol, execution.model_ -> Bitmap Heap Scan on execution (cost=2076.50..191150.06 rows=282143 width=54) Recheck Cond: ('2006-11-01'::date = date_) -> Bitmap Index Scan on execution_date_model_bucket_idx (cost=0.00..2076.50 rows=282143 width=0) Index Cond: ('2006-11-01'::date = date_) Then I disable, bitmap, nested and seqscan and get: Merge Join (cost=3435041.20..3445384.61 rows=14 width=205) Merge Cond: (("outer".bucket = "inner".bucket) AND ("outer".symbol= "inner".symbol) AND ("outer".model_ = "inner".model_)) -> Sort (cost=2861383.46..2863700.32 rows=926747 width=167) Sort Key: create_retail_bucket.bucket, create_retail_bucket.symbol, create_retail_bucket.model_ -> Index Scan using create_retail_bucket_date_model_bucket_idx on create_retail_bucket (cost=0.00..2724029.58 rows=926747 width=167) Index Cond: (date_ = '2006-11-01'::date) -> Sort (cost=573657.75..574363.11 rows=282143 width=54) Sort Key: execution.bucket, execution.symbol, execution.model_ -> Index Scan using execution_date_model_bucket_idxon execution(cost=0.00..548115.25 rows=282143 width=54) Index Cond: ('2006-11-01'::date= date_) I don't understand why it re-sorts the data even though the indexes are in the same order? Please help!! This is driving me up a wall.... Thanks, Greg
Greg Mitchell <gmitchell@atdesk.com> writes: > I don't understand why it re-sorts the data even though the indexes are in > the same order? I'm betting there's something wrong with your custom type definition, such that the planner is failing to make any connection between the index and the desired sort order. But you've not shown us any details. regards, tom lane
Below are the operator definitions for one type. The other types are exactly the same, just different argument types and procedures. I noticed when I select the subset of each table (by date) out into a temporary table and add the indices, joining on those to tables yields the expected plan (merge join with 2 index scans). I think the planner believes that the bitmap heap scan, sorted and then merged is a better plan. What I don't understand is why? It seems to me that it could do a btree lookup for the given date (or beginning of date range as needed), for each index, and walk the trees in-order merging. Run-time-wise, it takes 42 seconds to run this as a single query, vs 18 seconds to split the problem up with temp tables. CREATE OPERATOR < ( LEFTARG = bucket_t, RIGHTARG = bucket_t, COMMUTATOR = >, NEGATOR = >=, RESTRICT = scalarltsel, JOIN = scalarltjoinsel, PROCEDURE = bucket_t_lt ); CREATE OPERATOR <= ( LEFTARG = bucket_t, RIGHTARG = bucket_t, COMMUTATOR = >=, NEGATOR = >, RESTRICT = scalarltsel, JOIN = scalarltjoinsel, PROCEDURE = bucket_t_le ); CREATE OPERATOR = ( LEFTARG = bucket_t, RIGHTARG = bucket_t, COMMUTATOR = =, NEGATOR = <>, RESTRICT = eqsel, JOIN = eqjoinsel, HASHES, MERGES, PROCEDURE = bucket_t_eq ); CREATE OPERATOR >= ( LEFTARG = bucket_t, RIGHTARG = bucket_t, COMMUTATOR = <=, NEGATOR = <, RESTRICT = scalargtsel, JOIN = scalargtjoinsel, PROCEDURE = bucket_t_ge ); CREATE OPERATOR > ( LEFTARG = bucket_t, RIGHTARG = bucket_t, COMMUTATOR = <, NEGATOR = <=, RESTRICT = scalargtsel, JOIN = scalargtjoinsel, PROCEDURE = bucket_t_gt ); CREATE OPERATOR <> ( LEFTARG = bucket_t, RIGHTARG = bucket_t, COMMUTATOR = <>, NEGATOR = =, RESTRICT = neqsel, JOIN = neqjoinsel, PROCEDURE = bucket_t_ne ); Tom Lane wrote: > Greg Mitchell <gmitchell@atdesk.com> writes: >> I don't understand why it re-sorts the data even though the indexes are in >> the same order? > > I'm betting there's something wrong with your custom type definition, > such that the planner is failing to make any connection between the > index and the desired sort order. But you've not shown us any details. > > regards, tom lane
Greg Mitchell <gmitchell@atdesk.com> writes: > I don't understand why it re-sorts the data even though the indexes are in > the same order? What are the available indexes exactly? It looks to me from the names that the indexes probably *don't* match the sort order the merge is using. What I'm wondering is whether the planner should be expected to find a merge plan that adapts to the available indexes. In the light of morning I doubt this has anything to do with custom data types at all, but with the fact that the planner doesn't exhaustively search through every possible combination of mergejoin conditions. If you turn off enable_sort as well, does it find a sort-free merge plan? regards, tom lane
On Wed, Nov 22, 2006 at 08:20:40AM -0500, Greg Mitchell wrote: > Below are the operator definitions for one type. The other types are > exactly the same, just different argument types and procedures. Looks good, what about the operator class definition? Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
> What are the available indexes exactly? It looks to me from the names > that the indexes probably *don't* match the sort order the merge is > using. What I'm wondering is whether the planner should be expected to > find a merge plan that adapts to the available indexes. In the light > of morning I doubt this has anything to do with custom data types at > all, but with the fact that the planner doesn't exhaustively search > through every possible combination of mergejoin conditions. The indices are on (date, model, bucket) and I'm telling it to join on (date, model, bucket, symbol), where date is a constant. My expectation is that it would merge on (model, bucket, symbol) in-order, though the plan shows it having a merge condition (bucket, symbol, model). > If you turn off enable_sort as well, does it find a sort-free merge > plan? Yes, but not a very good one.... TDB=> explain select * from create_retail_bucket inner join execution using (date_, model_, bucket, symbol) where create_retail_bucket.date_ = '20061101'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- NestedLoop (cost=2427.93..2077606.20 rows=18 width=205) Join Filter: ("inner".symbol = "outer".symbol) -> Bitmap HeapScan on execution (cost=2427.93..219154.20 rows=323408 width=54) Recheck Cond: ('2006-11-01'::date = date_) -> Bitmap Index Scan on execution_date_model_bucket_idx (cost=0.00..2427.93 rows=323408 width=0) Index Cond: ('2006-11-01'::date = date_) -> Index Scan using create_retail_bucket_date_model_bucket_idxon create_retail_bucket (cost=0.00..5.73 rows=1 width=167) Index Cond: ((create_retail_bucket.date_ = '2006-11-01'::date) AND (create_retail_bucket.model_ = "outer".model_) AND (create_retail_bucket.bucket = "outer".bucket)) (8 rows)
Greg Mitchell <gmitchell@atdesk.com> writes: >> What are the available indexes exactly? > The indices are on (date, model, bucket) and I'm telling it to join on > (date, model, bucket, symbol), where date is a constant. My expectation > is that it would merge on (model, bucket, symbol) in-order, though the > plan shows it having a merge condition (bucket, symbol, model). Well, that expectation is certainly a fantasy, since the indexscan output wouldn't be sorted on symbol. It would be possible to use (date, model, bucket), or just (model, bucket), as the merge key and then apply symbol equality as a post-merge "Filter" condition. The problem here is that after constant propagation the date equality condition has been dropped as redundant, leaving (model, bucket) or (model) as the only merge keys that would work, and the planner is not capable of recognizing that the indexscan outputs can be treated as sorted that way rather than sorted with date as the major sort key. Since 8.1 there is some code in there that can make that kind of deduction with respect to simple indexscan plans, but I now realize it's in the wrong place to help with mergejoins :-(. We probably ought to rework things so that this consideration is understood by the general "pathkey" code rather than being a special hack in indxpath.c. Maybe I'm missing something, but offhand it seems like anytime we've included a constant in a pathkey equivalence set, we could decide that that pathkey is a no-op and consider a pathkey list including such a pathkey to be equal to an otherwise-identical pathkey list omitting the no-op pathkey. Too big a change to consider for 8.2 at this late date, unfortunately. I'll try to take a look at it for 8.3. For the archives: this can be reproduced in the regression database with set enable_hashjoin TO 0; set enable_nestloop TO 0; set enable_bitmapscan TO 0; explain select * from tenk1 a join tenk1 b using(thousand, tenthous) where a.thousand = 555; The sort steps in the resulting plan are redundant, but the planner fails to see it. 8.1 and 8.2 do understand they don't need a sort for explain select * from tenk1 where thousand = 555 order by tenthous; regards, tom lane