[HACKERS] convert EXSITS to inner join gotcha and bug

Поиск
Список
Период
Сортировка
От Teodor Sigaev
Тема [HACKERS] convert EXSITS to inner join gotcha and bug
Дата
Msg-id f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
обсуждение исходный текст
Ответы Re: [HACKERS] convert EXSITS to inner join gotcha and bug
Список pgsql-hackers
Hi!

Seems, there two issues:

1) Sometime conditions which for a first glance could be pushed down to scan are 
leaved as join quals. And it could be a ~30 times performance loss.

2) Number of query result depend on enabe_seqscan variable.

The query
explain analyze
SELECT
         *
FROM
         t1
         INNER JOIN t2 ON (
                 EXISTS (
                         SELECT
                                 true
                         FROM
                                 t3
                         WHERE
                                 t3.id1 = t1.id  AND
                                 t3.id2 = t2.id
                 )
         )
WHERE
         t1.name = '5c5fec6a41b8809972870abc154b3ecd'
;

produces following plan:
  Nested Loop  (cost=6.42..1928.71 rows=1 width=99) (actual time=71.415..148.922 
rows=162 loops=1)
    Join Filter: (t3.id1 = t1.id)
    Rows Removed by Join Filter: 70368
    ->  Index Only Scan using t1i2 on t1  (cost=0.28..8.30 rows=1 width=66) 
(actual time=0.100..0.103 rows=1 loops=1)
          Index Cond: (name = '5c5fec6a41b8809972870abc154b3ecd'::text)
          Heap Fetches: 1
    ->  Hash Join  (cost=6.14..1918.37 rows=163 width=66) (actual 
time=0.370..120.971 rows=70530 loops=1)
(1)      Hash Cond: (t3.id2 = t2.id)
(2)      ->  Seq Scan on t3  (cost=0.00..1576.30 rows=70530 width=66) (actual 
time=0.017..27.424 rows=70530 loops=1)
          ->  Hash  (cost=3.84..3.84 rows=184 width=33) (actual 
time=0.273..0.273 rows=184 loops=1)
                Buckets: 1024  Batches: 1  Memory Usage: 20kB
                ->  Seq Scan on t2  (cost=0.00..3.84 rows=184 width=33) (actual 
time=0.017..0.105 rows=184 loops=1)
  Planning time: 7.326 ms
  Execution time: 149.115 ms


Condition (1) is not pushed to scan (2) which seemsly could be safely moved. 
With seqscan = off condition is not pushed too but query returns only one row 
instead of 162. Scan on t3 returns ~70000 rows but only ~150 rows are really 
needed. I didn't found a combination of GUCs enable_* to push down that and it 
seems to me there is reason for that which I don't see or support is somehow missed.

If pair of (t3.id1, t3.id2) is unique (see dump, there is a unique index on 
them) the query could be directly rewrited to inner join and its plan is:
  Nested Loop  (cost=9.70..299.96 rows=25 width=66) (actual time=0.376..5.232 
rows=162 loops=1)
    ->  Nested Loop  (cost=9.43..292.77 rows=25 width=99) (actual 
time=0.316..0.645 rows=162 loops=1)
          ->  Index Only Scan using t1i2 on t1  (cost=0.28..8.30 rows=1 
width=66) (actual time=0.047..0.050 rows=1 loops=1)
                Index Cond: (name = '5c5fec6a41b8809972870abc154b3ecd'::text)
                Heap Fetches: 1
          ->  Bitmap Heap Scan on t3  (cost=9.15..283.53 rows=94 width=66) 
(actual time=0.257..0.426 rows=162 loops=1)
                Recheck Cond: (id1 = t1.id)
                Heap Blocks: exact=3
                ->  Bitmap Index Scan on t3i1  (cost=0.00..9.12 rows=94 width=0) 
(actual time=0.186..0.186 rows=162 loops=1)
                      Index Cond: (id1 = t1.id)
    ->  Index Only Scan using t2i1 on t2  (cost=0.27..0.29 rows=1 width=33) 
(actual time=0.024..0.024 rows=1 loops=162)
          Index Cond: (id = t3.id2)
          Heap Fetches: 162
  Planning time: 5.532 ms
  Execution time: 5.457 ms

Second plan is ~30 times faster. But with turned  off sequentual scan the first 
query is not work correctly, which points to some bug in planner, I suppose. 
Both 9.6 and 10devel are affected to addiction  of query result on seqscan variable.


Dump to reproduce (subset of real data but obfucated), queries are in attachment
http://sigaev.ru/misc/exists_to_nested.sql.gz
-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kang Yuzhe
Дата:
Сообщение: Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...
Следующее
От: Amit Langote
Дата:
Сообщение: Re: [HACKERS] Declarative partitioning - another take