wrong join result set estimate

Поиск
Список
Период
Сортировка
От Evgeny Shishkin
Тема wrong join result set estimate
Дата
Msg-id 5C1F9E10-588B-478C-ADC7-84B2CBA504F4@gmail.com
обсуждение исходный текст
Список pgsql-performance
Hello,

i have a problem with relatively easy query.

EXPLAIN ANALYZE SELECT content.* FROM content JOIN blog ON blog.id = content.blog_id JOIN community_prop ON blog.id = community_prop.blog_id JOIN community ON community.id = community_prop.id WHERE community.id IN (33, 55, 61, 1741, 75, 90, 106, 180, 228, 232, 256, 310, 388, 404, 504, 534, 536, 666, 700, 768, 824, 832, 855, 873, 898, 962, 1003, 1008, 1027, 1051, 1201, 1258, 1269, 1339, 1355, 1360, 1383, 1390, 1430, 1505, 1506, 1530, 1566, 1578, 1616, 1678, 1701, 1713, 1723, 1821, 1842, 1880, 1882, 1894, 1973, 2039, 2069, 2106, 2130, 2204, 2226, 2236, 2238, 2263, 2272, 2310, 2317, 2327, 2353, 2360, 2401, 2402, 2409, 2419, 2425, 2426, 2438, 2440, 2452, 2467, 2494, 2514, 2559, 2581, 2653, 2677, 2679, 2683, 2686, 2694, 2729, 2732, 2739, 2779, 2785, 2795, 2821, 2831, 2839, 2862, 2864, 2866, 2882, 2890, 2905, 2947, 2962, 2964, 2978, 2981, 3006, 3016, 3037, 3039, 3055, 3060, 3076, 3112, 3124, 3135, 3138, 3186, 3213, 3222, 3225, 3269, 3273, 3288, 3291, 3329, 3363, 3375, 3376, 3397, 3415, 3491, 3500, 2296, 3547, 129, 1039, 8, 1053, 1441, 2372, 1974, 289, 2449, 2747, 2075, 57, 3550, 3069, 89, 1603, 1570, 54, 152, 1035, 1456, 506, 1387, 43, 1805, 1851, 1843, 2587, 1908, 1790, 2630, 901, 13, 529, 705, 81, 2668, 1086, 603, 1986, 2516, 2969, 2671, 568, 4636, 1115, 864, 381, 4516, 2608, 677, 88, 1825, 3220, 3284, 947, 1190, 2233, 4489, 3320, 2957, 4146, 1841, 25, 643, 4352, 14, 4261, 3876, 1311, 1342, 4057, 3974) ORDER BY content.time_create DESC LIMIT 10;


As you can see, planner estimates 115 rows, but there are 259554 of them.

This query shows root of the problem
EXPLAIN ANALYZE SELECT content.* FROM content JOIN blog ON blog.id = content.blog_id JOIN community_prop ON blog.id = community_prop.blog_id;
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=24498.17..137922.26 rows=2624 width=572) (actual time=36.028..1342.267 rows=408374 loops=1)
   Hash Cond: (content.blog_id = blog.id)
   ->  Seq Scan on content  (cost=0.00..102364.99 rows=1260899 width=572) (actual time=0.030..983.274 rows=1256128 loops=1)
   ->  Hash  (cost=24439.07..24439.07 rows=4728 width=8) (actual time=35.964..35.964 rows=4728 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 185kB
         ->  Nested Loop  (cost=0.00..24439.07 rows=4728 width=8) (actual time=0.064..33.092 rows=4728 loops=1)
               ->  Seq Scan on community_prop  (cost=0.00..463.28 rows=4728 width=4) (actual time=0.004..5.089 rows=4728 loops=1)
               ->  Index Scan using blog_pkey on blog  (cost=0.00..5.06 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=4728)
                     Index Cond: (id = community_prop.blog_id)
 Total runtime: 1361.354 ms

2624 vs 408374

Joining only content with blog: 1260211 vs 1256124.
Joining only blog with community_prop: 4728 vs 4728
Joining only content with community_prop: 78304 vs 408376

SHOW default_statistics_target ;
 default_statistics_target 
---------------------------
 500

I already altered stats on blog_id column 
ALTER TABLE content ALTER COLUMN blog_id SET STATISTICS 1000;

Tried setting 3000 and 10000 on all join columns - did not make a difference.
Tried setting n_distinct on content(blog_id) manually to different values from 10000 to 200000 (exact distinct is 90k, vacuum sets it to 76k) - did not change the estimate result set, only estimated index lookup.

Don't now what to do with this.

Ready to provide any additional information.
Thank you for your time.

В списке pgsql-performance по дате отправления:

Предыдущее
От: Thomas Kellerer
Дата:
Сообщение: Re: Postgres becoming slow, only full vacuum fixes it
Следующее
От: Matthias
Дата:
Сообщение: Re: [GENERAL] Inaccurate Explain Cost