using sequential scan instead of index for join with a union

Поиск
Список
Период
Сортировка
От Kent Tong
Тема using sequential scan instead of index for join with a union
Дата
Msg-id CAKs98dGxcPAyiN7Euw8wjiadVkNj92L+_txgpdVdEfRBDefk-A@mail.gmail.com
обсуждение исходный текст
Ответы Re: using sequential scan instead of index for join with a union
Список pgsql-performance
Hi,

I am joining the union of three tables with another table. Postgresql uses the index when only two tables are in the union. If I add one more table to the union, it switches to seq scan. Apparently it also uses the index when only one table is joined.

The SQL is:
select * from (
SELECT 'NEWS' datatype, n.id, mbct_id FROM news n
union all
SELECT 'SPEECH' datatype, s.id, mbct_id FROM speech s
union all
SELECT 'NOTICE' datatype, notice.id, mbct_id FROM notice 
) x join NBSMultiBroadcast y on x.mbct_id=y.id where y.zhtw_grp_bct between '2019-05-10' and '2019-05-17';

The estimated number of rows is not off against the actual number of rows, which is around 120. So, I don't really understand why PostgreSQL seems to believe it should use Seq Scan due to a relatively large number of rows are expected.

I am using v11.3:

PostgreSQL 11.3 (Ubuntu 11.3-1.pgdg16.04+1) on i686-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 32-bit

The output of explain analyze is:
 Hash Join  (cost=1153.01..6273.58 rows=134 width=1856) (actual time=46.937..50.557 rows=120 loops=1)
   Hash Cond: (n.mbct_id = y.id)
   ->  Append  (cost=0.00..5043.33 rows=29422 width=48) (actual time=0.015..42.237 rows=29422 loops=1)
         ->  Seq Scan on news n  (cost=0.00..4588.30 rows=27430 width=48) (actual time=0.015..35.902 rows=27430 loops=1)
         ->  Seq Scan on speech s  (cost=0.00..26.26 rows=226 width=48) (actual time=0.009..0.182 rows=226 loops=1)
         ->  Seq Scan on notice  (cost=0.00..281.66 rows=1766 width=48) (actual time=0.005..1.283 rows=1766 loops=1)
   ->  Hash  (cost=1151.24..1151.24 rows=142 width=1808) (actual time=2.466..2.466 rows=130 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 34kB
         ->  Index Scan using zhtw_grp_bct on nbsmultibroadcast y  (cost=0.29..1151.24 rows=142 width=1808) (actual time=2.279..2.396 rows=130 loops=1)
               Index Cond: ((zhtw_grp_bct >= '2019-05-10 00:00:00'::timestamp without time zone) AND (zhtw_grp_bct <= '2019-05-17 00:00:00'::timestamp without time zone))
 Planning Time: 0.749 ms
 Execution Time: 50.637 ms

The output of explain analyze for just two tables in the union is:
 Nested Loop  (cost=0.57..5863.96 rows=126 width=1856) (actual time=2.199..21.513 rows=103 loops=1)
   ->  Index Scan using zhtw_grp_bct on nbsmultibroadcast y  (cost=0.29..1151.24 rows=142 width=1808) (actual time=2.172..2.313 rows=130 loops=1)
         Index Cond: ((zhtw_grp_bct >= '2019-05-10 00:00:00'::timestamp without time zone) AND (zhtw_grp_bct <= '2019-05-17 00:00:00'::timestamp without time zone))
   ->  Append  (cost=0.29..33.17 rows=2 width=48) (actual time=0.035..0.146 rows=1 loops=130)
         ->  Index Scan using news_mbct_id_idx on news n  (cost=0.29..6.33 rows=1 width=48) (actual time=0.004..0.005 rows=1 loops=130)
               Index Cond: (mbct_id = y.id)
         ->  Seq Scan on speech s  (cost=0.00..26.82 rows=1 width=48) (actual time=0.139..0.139 rows=0 loops=130)
               Filter: (y.id = mbct_id)
               Rows Removed by Filter: 226
 Planning Time: 0.639 ms
 Execution Time: 21.604 m

The size of the three tables are 27430, 226 and 1766 respectively.

Many thanks for any help!
--
Kent Tong
IT author and consultant, child education coach

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Naik, Sameer"
Дата:
Сообщение: RE: Re: Re: Generic Plans for Prepared Statement are 158155 times slowerthan Custom Plans
Следующее
От: Sergei Kornilov
Дата:
Сообщение: Re: using sequential scan instead of index for join with a union