Discussion: Size of IN list affects query plan
Hi,

I would like to know how the size of the IN list affects the query planner.

I have a query

select distinct on (event_id, tag_id) et.id,
       e.id as event_id, t.id as tag_id, t.name,
       t.user_id, t.shared, t.color,
       case
           when ea.id <> e.id then true
           else false
       end as inherited
from do_event e
join do_event ea on (ea.tree_id = e.tree_id and ea.lft <= e.lft and ea.rght >= e.rght)
join do_event_tags et on (et.event_id = ea.id)
join do_tag t on (t.id = et.tag_id)
where e.id in (LIST_OF_INTEGERS) and
      (t.user_id = 14 or t.shared)
order by event_id, tag_id, inherited;

and I suspect that the size of the list impacts the plan significantly.

If LIST_OF_INTEGERS has <=233 values, the query is really fast:

Unique  (cost=2351.85..2353.71 rows=249 width=33) (actual time=24.515..24.654 rows=163 loops=1)
  ->  Sort  (cost=2351.85..2352.47 rows=249 width=33) (actual time=24.513..24.549 rows=166 loops=1)
        Sort Key: e.id, t.id, (CASE WHEN (ea.id <> e.id) THEN true ELSE false END)
        Sort Method: quicksort  Memory: 37kB
        ->  Hash Join  (cost=2217.89..2341.94 rows=249 width=33) (actual time=18.987..24.329 rows=166 loops=1)
              Hash Cond: (et.event_id = ea.id)
              ->  Hash Join  (cost=4.73..119.62 rows=1612 width=29) (actual time=0.151..4.634 rows=2312 loops=1)
                    Hash Cond: (et.tag_id = t.id)
                    ->  Seq Scan on do_event_tags et  (cost=0.00..79.47 rows=5147 width=12) (actual time=0.006..1.531 rows=5147 loops=1)
                    ->  Hash  (cost=4.08..4.08 rows=52 width=21) (actual time=0.119..0.119 rows=49 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 3kB
                          ->  Seq Scan on do_tag t  (cost=0.00..4.08 rows=52 width=21) (actual time=0.019..0.097 rows=49 loops=1)
                                Filter: ((user_id = 14) OR shared)
              ->  Hash  (cost=2157.26..2157.26 rows=4472 width=8) (actual time=18.782..18.782 rows=270 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 11kB
                    ->  Nested Loop  (cost=428.35..2157.26 rows=4472 width=8) (actual time=0.597..18.595 rows=270 loops=1)
                          Join Filter: ((ea.lft <= e.lft) AND (ea.rght >= e.rght))
                          ->  Bitmap Heap Scan on do_event e  (cost=428.35..926.22 rows=232 width=16) (actual time=0.568..0.895 rows=233 loops=1)
                                Recheck Cond: (id = ANY ('{110364,110377,42337,1503,5490,106267,106607,108419,108836,108556,108744,108466,108467,106331,3717,105404,35179,3398,5675,5896,5888,5287,4679,4275,4042,1599,4041,3311,1588,1605,1607,1606,1604,1594,1850,110494,110041,107955,110373,110068,110114,109503,109925,108959,108964,109189,109598,109142,109304,109607,107902,106668,109121,109101,109056,4621,109031,2574,5092,1674,106452,108901,108849,108713,108783,108766,108386,108455,2560,108397,1538,2007,108000,108389,108336,108456,36796,28985,108003,108421,108399,4871,106884,6371,36026,108204,108022,107941,107967,107911,107928,47944,107010,106640,107037,106994,107011,55313,105862,106332,106498,5850,13369,106161,5859,28465,106385,106444,102751,106371,105131,2610,102753,4833,4936,4755,4699,105402,14087,4798,4942,36249,55513,75790,75789,4238,6370,5744,5745,5149,4731,42297,34841,31190,17339,31155,31242,17701,17642,31203,31218,31376,5856,5141,18154,27146,17590,17566,13692,4867,1842,6365,6354,5480,5481,4382,5893,6355,5907,5886,5826,5028,4665,5230,5482,5273,4181,5091,4869,4983,4968,4961,4905,4906,4036,1483,4284,4790,4348,4648,4655,4647,4656,3075,4596,2144,4274,4592,4506,4549,4595,4188,4548,4511,4333,4306,4291,4240,4268,4114,3665,3547,1563,2102,1514,3579,3607,3501,2834,2436,3069,1400,2359,3056,3173,2897,2837,2780,2137,1447,1280,421,412,2076,1200,1691,446,1444,399,374,444,419,449}'::integer[]))
                                ->  Bitmap Index Scan on do_event_pkey  (cost=0.00..428.29 rows=232 width=0) (actual time=0.538..0.538 rows=233 loops=1)
                                      Index Cond: (id = ANY ('{110364,110377,42337,1503,5490,106267,106607,108419,108836,108556,108744,108466,108467,106331,3717,105404,35179,3398,5675,5896,5888,5287,4679,4275,4042,1599,4041,3311,1588,1605,1607,1606,1604,1594,1850,110494,110041,107955,110373,110068,110114,109503,109925,108959,108964,109189,109598,109142,109304,109607,107902,106668,109121,109101,109056,4621,109031,2574,5092,1674,106452,108901,108849,108713,108783,108766,108386,108455,2560,108397,1538,2007,108000,108389,108336,108456,36796,28985,108003,108421,108399,4871,106884,6371,36026,108204,108022,107941,107967,107911,107928,47944,107010,106640,107037,106994,107011,55313,105862,106332,106498,5850,13369,106161,5859,28465,106385,106444,102751,106371,105131,2610,102753,4833,4936,4755,4699,105402,14087,4798,4942,36249,55513,75790,75789,4238,6370,5744,5745,5149,4731,42297,34841,31190,17339,31155,31242,17701,17642,31203,31218,31376,5856,5141,18154,27146,17590,17566,13692,4867,1842,6365,6354,5480,5481,4382,5893,6355,5907,5886,5826,5028,4665,5230,5482,5273,4181,5091,4869,4983,4968,4961,4905,4906,4036,1483,4284,4790,4348,4648,4655,4647,4656,3075,4596,2144,4274,4592,4506,4549,4595,4188,4548,4511,4333,4306,4291,4240,4268,4114,3665,3547,1563,2102,1514,3579,3607,3501,2834,2436,3069,1400,2359,3056,3173,2897,2837,2780,2137,1447,1280,421,412,2076,1200,1691,446,1444,399,374,444,419,449}'::integer[]))
                          ->  Index Scan using do_event_tree_id on do_event ea  (cost=0.00..5.29 rows=1 width=16) (actual time=0.005..0.040 rows=59 loops=233)
                                Index Cond: (tree_id = e.tree_id)
Total runtime: 24.853 ms
(24 rows)

If it has >=234 values, it is very slow:

Unique  (cost=2379.26..2381.14 rows=250 width=33) (actual time=3851.030..3851.175 rows=163 loops=1)
  ->  Sort  (cost=2379.26..2379.89 rows=250 width=33) (actual time=3851.027..3851.073 rows=166 loops=1)
        Sort Key: e.id, t.id, (CASE WHEN (ea.id <> e.id) THEN true ELSE false END)
        Sort Method: quicksort  Memory: 37kB
        ->  Nested Loop  (cost=139.77..2369.30 rows=250 width=33) (actual time=44.373..3850.597 rows=166 loops=1)
              Join Filter: ((ea.lft <= e.lft) AND (ea.rght >= e.rght))
              ->  Hash Join  (cost=139.77..1286.83 rows=1612 width=41) (actual time=10.784..41.107 rows=2312 loops=1)
                    Hash Cond: (ea.id = et.event_id)
                    ->  Seq Scan on do_event ea  (cost=0.00..840.97 rows=28997 width=16) (actual time=0.013..10.275 rows=28997 loops=1)
                    ->  Hash  (cost=119.62..119.62 rows=1612 width=29) (actual time=10.607..10.607 rows=2312 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 137kB
                          ->  Hash Join  (cost=4.73..119.62 rows=1612 width=29) (actual time=0.468..8.024 rows=2312 loops=1)
                                Hash Cond: (et.tag_id = t.id)
                                ->  Seq Scan on do_event_tags et  (cost=0.00..79.47 rows=5147 width=12) (actual time=0.007..2.578 rows=5147 loops=1)
                                ->  Hash  (cost=4.08..4.08 rows=52 width=21) (actual time=0.434..0.434 rows=49 loops=1)
                                      Buckets: 1024  Batches: 1  Memory Usage: 3kB
                                      ->  Seq Scan on do_tag t  (cost=0.00..4.08 rows=52 width=21) (actual time=0.030..0.405 rows=49 loops=1)
                                            Filter: ((user_id = 14) OR shared)
              ->  Index Scan using do_event_tree_id on do_event e  (cost=0.00..0.65 rows=1 width=16) (actual time=0.803..1.644 rows=1 loops=2312)
                    Index Cond: (tree_id = ea.tree_id)
                    Filter: (id = ANY ('{110364,110377,42337,1503,5490,106267,106607,108419,108836,108556,108744,108466,108467,106331,3717,105404,35179,3398,5675,5896,5888,5287,4679,4275,4042,1599,4041,3311,1588,1605,1607,1606,1604,1594,1850,110494,110041,107955,110373,110068,110114,109503,109925,108959,108964,109189,109598,109142,109304,109607,107902,106668,109121,109101,109056,4621,109031,2574,5092,1674,106452,108901,108849,108713,108783,108766,108386,108455,2560,108397,1538,2007,108000,108389,108336,108456,36796,28985,108003,108421,108399,4871,106884,6371,36026,108204,108022,107941,107967,107911,107928,47944,107010,106640,107037,106994,107011,55313,105862,106332,106498,5850,13369,106161,5859,28465,106385,106444,102751,106371,105131,2610,102753,4833,4936,4755,4699,105402,14087,4798,4942,36249,55513,75790,75789,4238,6370,5744,5745,5149,4731,42297,34841,31190,17339,31155,31242,17701,17642,31203,31218,31376,5856,5141,18154,27146,17590,17566,13692,4867,1842,6365,6354,5480,5481,4382,5893,6355,5907,5886,5826,5028,4665,5230,5482,5273,4181,5091,4869,4983,4968,4961,4905,4906,4036,1483,4284,4790,4348,4648,4655,4647,4656,3075,4596,2144,4274,4592,4506,4549,4595,4188,4548,4511,4333,4306,4291,4240,4268,4114,3665,3547,1563,2102,1514,3579,3607,3501,2834,2436,3069,1400,2359,3056,3173,2897,2837,2780,2137,1447,1280,421,412,2076,1200,1691,446,1444,399,374,444,419,449,1021}'::integer[]))
Total runtime: 3851.458 ms
(22 rows)

Increasing effective_cache_size or work_mem does not seem to have any impact.

Thanks for any hint,
John
On Fri, Nov 8, 2013 at 6:04 AM, Jan Walter <john@commontongue.com> wrote:
> Hi,
> I would like to know, how does the size of the IN list affect query planner.
> I have a query
>
> select distinct on (event_id, tag_id) et.id,
>        e.id as event_id, t.id as tag_id, t.name,
>        t.user_id, t.shared, t.color,
>        case
>            when ea.id <> e.id then true
>            else false
>        end as inherited
> from do_event e
> join do_event ea on (ea.tree_id = e.tree_id and ea.lft <= e.lft and ea.rght >= e.rght)
> join do_event_tags et on (et.event_id = ea.id)
> join do_tag t on (t.id = et.tag_id)
> where e.id in (LIST_OF_INTEGERS) and
>       (t.user_id = 14 or t.shared)
> order by event_id, tag_id, inherited;
Looking at your EXPLAIN ANALYZE plan I was immediately reminded of this article http://www.datadoghq.com/2013/08/100x-faster-postgres-performance-by-changing-1-line/, where changing the array to a VALUES() clause was a huge win for them.
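For reference, the rewrite described in that article would look roughly like this when applied to the query above (a sketch, untested against this schema; only the first few IDs from the list are shown):

```sql
-- Joining against a VALUES list instead of filtering with IN gives the
-- planner a real relation, with its own row estimate, to join against:
select distinct on (event_id, tag_id) et.id,
       e.id as event_id, t.id as tag_id, t.name,
       t.user_id, t.shared, t.color,
       case when ea.id <> e.id then true else false end as inherited
from (values (110364), (110377), (42337)) as ids(id)  -- rest of the list elided
join do_event e on (e.id = ids.id)
join do_event ea on (ea.tree_id = e.tree_id and ea.lft <= e.lft and ea.rght >= e.rght)
join do_event_tags et on (et.event_id = ea.id)
join do_tag t on (t.id = et.tag_id)
where t.user_id = 14 or t.shared
order by event_id, tag_id, inherited;
```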
Jan Walter <john@commontongue.com> writes:
> I would like to know, how does the size of the IN list affect query planner.

AFAICT, the reason the second plan is slow is the large number of checks of the IN list. The planner does account for the cost of that, but it's drastically underestimating that cost relative to the cost of I/O for the heap and index accesses. I suppose that your test case is fully cached in memory, which helps make the CPU costs more important than I/O costs. If you think this is representative of your real workload, then you need to decrease random_page_cost (and maybe seq_page_cost too) to make the cost estimates correspond better to that reality.

			regards, tom lane
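The cost parameters Tom mentions can be tried per-session before changing postgresql.conf (a sketch; the concrete values here are illustrative assumptions, not recommendations):

```sql
-- For a workload that is effectively fully cached, a random page read is
-- not much more expensive than a sequential one:
SET random_page_cost = 1.1;  -- default is 4.0
SET seq_page_cost = 1.0;     -- the default; lower both if everything is in RAM
-- Then re-run EXPLAIN ANALYZE on the query to see whether the plan changes.
```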
Thanks for your comments.

On 8.11.2013 15:31, Tom Lane wrote:
> AFAICT, the reason the second plan is slow is the large number of
> checks of the IN list. The planner does account for the cost of that,
> but it's drastically underestimating that cost relative to the cost of
> I/O for the heap and index accesses. I suppose that your test case is
> fully cached in memory, which helps make the CPU costs more important
> than I/O costs. If you think this is representative of your real
> workload, then you need to decrease random_page_cost (and maybe
> seq_page_cost too) to make the cost estimates correspond better to
> that reality.

I am not sure I understand it well - in the first case (the fast query), is the cache utilized in a better way? Lowering random_page_cost gives me fast query plans with big lists, as you expected. I also tested the slow query on different machines with (default) settings of seq_page_cost, and I am getting the fast query plans there too, so I am curious what else could affect that (same db, vacuum analyzed). Anyway, it raises the question of whether big (tens to hundreds of values) IN lists are bad practice, or just something that has to be used carefully. I have to admit I am surprised that this rather standard technique leads to such a wide range of performance.

On 8.11.2013 15:31, bricklen wrote:
> Looking at your EXPLAIN ANALYZE plan I was immediately reminded of
> this article
> http://www.datadoghq.com/2013/08/100x-faster-postgres-performance-by-changing-1-line/,
> where changing the array to a VALUES() clause was a huge win for them.

Yeah, I saw it before. Unfortunately that does not help significantly in my case.

Jan
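[Editor's note] A common alternative when an ID list grows into the hundreds is to load it into a temporary table and join against that, so the planner gets real statistics for the list. A hedged sketch against the schema above (tmp_event_ids is a hypothetical name; only the first few IDs are shown):

```sql
create temporary table tmp_event_ids (id integer primary key);
insert into tmp_event_ids values (110364), (110377), (42337);  -- rest of the list elided
analyze tmp_event_ids;  -- give the planner accurate row counts for the list

-- Then replace "where e.id in (LIST_OF_INTEGERS)" with a join:
--   ... from do_event e join tmp_event_ids ids on (ids.id = e.id) ...
```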