[PERFORM] postgres_fdw and column casting shippability

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема [PERFORM] postgres_fdw and column casting shippability
Дата
Msg-id CAMkU=1wuSiR++ujs-O+eAATcOm7MkQGJ1SLBHbDVy7C8TBz8UA@mail.gmail.com
обсуждение исходный текст
Ответы Re: [PERFORM] postgres_fdw and column casting shippability  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
I need to do a join between two foreign tables using columns of different types.

select data from remote2 join remote1 on ((remote2.id)::bigint=remote1.id) where cutoff > 0.9999;

For demonstration purposes, I use a loop-back foreign server, set up in the attached sql file.

If I do the join directly on the "foreign" server by specifying the schemaname where the physical tables live, it uses a sensible join plan, using an index on cutoff column to get a handful of rows, then casting the id column and using in index on remote1.id to get each row there.

explain analyze select data from remote.remote2 join remote.remote1 on ((remote2.id)::bigint=remote1.id) where cutoff > 0.9999;

                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=5.56..1100.48 rows=100 width=8) (actual time=0.303..5.598 rows=119 loops=1)
   ->  Bitmap Heap Scan on remote2  (cost=5.13..334.85 rows=91 width=7) (actual time=0.112..0.899 rows=105 loops=1)
         Recheck Cond: (cutoff > '0.9999'::double precision)
         Heap Blocks: exact=105
         ->  Bitmap Index Scan on remote2_cutoff_idx  (cost=0.00..5.11 rows=91 width=0) (actual time=0.062..0.062 rows=105 loops=1)
               Index Cond: (cutoff > '0.9999'::double precision)
   ->  Index Scan using remote1_id_idx on remote1  (cost=0.43..8.40 rows=1 width=16) (actual time=0.038..0.041 rows=1 loops=105)
         Index Cond: (id = (remote2.id)::bigint)


But if I go through the foreign machinery, it doesn't use a good plan:

explain analyze select data from remote2 join remote1 on ((remote2.id)::bigint=remote1.id) where cutoff > 0.9999;

                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=537.81..76743.81 rows=455000 width=4) (actual time=75.019..4659.802 rows=119 loops=1)
   Hash Cond: (remote1.id = (remote2.id)::bigint)
   ->  Foreign Scan on remote1  (cost=100.00..35506.00 rows=1000000 width=16) (actual time=1.110..4143.655 rows=1000000 loops=1)
   ->  Hash  (cost=436.67..436.67 rows=91 width=7) (actual time=2.754..2.754 rows=105 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 13kB
         ->  Foreign Scan on remote2  (cost=105.13..436.67 rows=91 width=7) (actual time=1.567..2.646 rows=105 loops=1)
 Planning time: 29.629 ms
 Execution time: 4660.433 ms

I thought it would either push the entire join to the foreign side, or at least do a foreign index scan on remote2_cutoff_idx, then loop over each row and do a foreign index scans against remote1_id_idx.

I've tried versions 9.6.3 and 10dev, and neither do what I expected.  It doesn't seem to be a planning problem where it thinks the fast plan is slower, it just doesn't seem to consider the faster plans as being options at all.  Is there some setting to make it realize the cast is shippable?  Is any of the work being done on postgres_fdw for V11 working towards fixing this?

Cheers,

Jeff
Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: plukovic
Дата:
Сообщение: Re: [PERFORM] Speed differences between in executing the same query
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [PERFORM] postgres_fdw and column casting shippability