Обсуждение: Am I wasting my time with partitions?
It seems to me that postgresql doesn't use indexes when being asked for an ordered result sets from a partitioned table. I have an application where this is critical, but I was hoping to use partitions because of the ease of rotating out old rows. Simply put, I have a table called LineItems which I need to be able to page from and so I need to be able to ask for N rows ordered on a certain index (with possible constraints). eg. SELECT * FROM T ORDER BY col1,col2 LIMIT 10; This works fine and is quick on a single table: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> metadb=> \d lineitems Table "test2.lineitems" Column | Type | Modifiers --------------+--------------------------------+-----------lineitem_key | bigint | not nulltime | timestamp(6) without time zone | not nulldescription | text | not nullbarcode |text | not nullamount | bigint | not null Indexes: "lineitems_amount_index" btree (amount, lineitem_key) metadb=> explain select * from lineitems order by amount,lineitem_key limit 10; QUERY PLAN --------------------------------------------------------------------------------------------------------Limit (cost=0.00..0.74rows=10 width=49) -> Index Scan using lineitems_amount_index on lineitems (cost=0.00..39791.76 rows=535500 width=49) (2 rows)>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If I partition the table by creating a top level table L, and inherited tables L1, L2 and issue the same request it does sequential scans on all the tables and takes orders of magnitude longer (see below). In the example below I would have hoped that it would have used an index scan on each of the tables returning 10 rows each and then done a merge on them. Am I asking too much? Should I just use a single table and take the hits on deletes and vacuums? RegardsStuart >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> metadb=> \d L Table "test2.l" Column | Type | Modifiers --------------+--------------------------------+-----------lineitem_key | bigint | not nulltime | timestamp(6) without time zone | not nulldescription | text | not nullbarcode |text | not nullamount | bigint | not null Indexes: "l_amount_index" btree (amount, lineitem_key) metadb=> \d L1 Table "test2.l1" Column | Type | Modifiers --------------+--------------------------------+-----------lineitem_key | bigint | not nulltime | timestamp(6) without time zone | not nulldescription | text | not nullbarcode |text | not nullamount | bigint | not null Indexes: "l1_amount_index" btree (amount, lineitem_key) Inherits: l metadb=> \d L2 Table "test2.l2" Column | Type | Modifiers --------------+--------------------------------+-----------lineitem_key | bigint | not nulltime | timestamp(6) without time zone | not nulldescription | text | not nullbarcode |text | not nullamount | bigint | not null Indexes: "l2_amount_index" btree (amount, lineitem_key) Inherits: l metadb=> explain select * from l order by amount,lineitem_key limit 10; QUERY PLAN -------------------------------------------------------------------------------------Limit (cost=22207.70..22207.72 rows=10width=88) -> Sort (cost=22207.70..23548.09 rows=536156 width=88) Sort Key: test2.l.amount, test2.l.lineitem_key -> Result (cost=0.00..10621.56 rows=536156 width=88) -> Append (cost=0.00..10621.56rows=536156 width=88) -> Seq Scan on l (cost=0.00..16.90 rows=690 width=88) -> Seq Scan on l1 l (cost=0.00..4951.00 rows=250000 width=49) -> Seq Scan on l2 l (cost=0.00..5653.66 rows=285466 width=49) (8 rows) NB. Just addressing one of the inherited tables works fine. metadb=> explain select * from l1 order by amount,lineitem_key limit 10; QUERY PLAN ------------------------------------------------------------------------------------------Limit (cost=0.00..0.74 rows=10width=49) -> Index Scan using l1_amount_index on l1 (cost=0.00..18554.20 rows=250000 width=49) (2 rows)>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Stuart Brooks wrote: > It seems to me that postgresql doesn't use indexes when being asked for > an ordered result sets from a partitioned table. I have an application > where this is critical, but I was hoping to use partitions because of > the ease of rotating out old rows. > metadb=> explain select * from l order by amount,lineitem_key limit 10; > QUERY PLAN > -> Seq Scan on l (cost=0.00..16.90 rows=690 width=88) > -> Seq Scan on l1 l (cost=0.00..4951.00 > rows=250000 width=49) > -> Seq Scan on l2 l (cost=0.00..5653.66 > rows=285466 width=49) > NB. Just addressing one of the inherited tables works fine. > > metadb=> explain select * from l1 order by amount,lineitem_key limit 10; Well, you don't have an index it can use to find the smallest (amount,lineitem) across all of lX. If PG was smart enough to figure out that it only needed to check l1, then you do. Unfortunately it isn't. If you add the constraint you use to partition by, does that help you? -- Richard Huxton Archonet Ltd
> It seems to me that postgresql doesn't use indexes when being asked > for an ordered result sets from a partitioned table. I have an > application where this is critical, but I was hoping to use partitions > because of the ease of rotating out old rows. > >> metadb=> explain select * from l order by amount,lineitem_key limit 10; >> QUERY PLAN >> -> Seq Scan on l (cost=0.00..16.90 rows=690 >> width=88) >> -> Seq Scan on l1 l (cost=0.00..4951.00 >> rows=250000 width=49) >> -> Seq Scan on l2 l (cost=0.00..5653.66 >> rows=285466 width=49) > >> NB. Just addressing one of the inherited tables works fine. >> >> metadb=> explain select * from l1 order by amount,lineitem_key limit 10; > > Well, you don't have an index it can use to find the smallest > (amount,lineitem) across all of lX. If PG was smart enough to figure > out that it only needed to check l1, then you do. Unfortunately it isn't. > You're right, it can't determine which of the partitions will have the smallest value, but what it could do is pull the smallest value from each and compare. In the absence of the LIMIT there wouldn't be much which could be done, but the the LIMIT means it only actually needs to pull 10 rows from each partition. An alternative way of doing this would be: (SELECT * FROM L1 ORDER BY amount,lineitem_key LIMIT 10) UNION (SELECT * FROM L2 ORDER BY amount,lineitem_key LIMIT 10) ORDER BY amount,lineitem_key LIMIT 10; Unfortunately this means one can't just address the parent table, but it does essentially what I'd hoped postgres would do for me :) It would be quite a long query if there were 100 partitions! > If you add the constraint you use to partition by, does that help you? > I tried to strip the example down to its bare essentials but in this case I would be partitioning by lineitem_key and would obviously index and add a CONSTRAINT on that as well. I don't think it would help though, the query needs to merge from all tables. Thanks for the response,Stuart