Обсуждение: [PERFORM] postgres_fdw and column casting shippability

Поиск
Список
Период
Сортировка

[PERFORM] postgres_fdw and column casting shippability

От
Jeff Janes
Дата:
I need to do a join between two foreign tables using columns of different types.

select data from remote2 join remote1 on ((remote2.id)::bigint=remote1.id) where cutoff > 0.9999;

For demonstration purposes, I use a loop-back foreign server, set up in the attached sql file.

If I do the join directly on the "foreign" server by specifying the schemaname where the physical tables live, it uses a sensible join plan, using an index on cutoff column to get a handful of rows, then casting the id column and using in index on remote1.id to get each row there.

explain analyze select data from remote.remote2 join remote.remote1 on ((remote2.id)::bigint=remote1.id) where cutoff > 0.9999;

                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=5.56..1100.48 rows=100 width=8) (actual time=0.303..5.598 rows=119 loops=1)
   ->  Bitmap Heap Scan on remote2  (cost=5.13..334.85 rows=91 width=7) (actual time=0.112..0.899 rows=105 loops=1)
         Recheck Cond: (cutoff > '0.9999'::double precision)
         Heap Blocks: exact=105
         ->  Bitmap Index Scan on remote2_cutoff_idx  (cost=0.00..5.11 rows=91 width=0) (actual time=0.062..0.062 rows=105 loops=1)
               Index Cond: (cutoff > '0.9999'::double precision)
   ->  Index Scan using remote1_id_idx on remote1  (cost=0.43..8.40 rows=1 width=16) (actual time=0.038..0.041 rows=1 loops=105)
         Index Cond: (id = (remote2.id)::bigint)


But if I go through the foreign machinery, it doesn't use a good plan:

explain analyze select data from remote2 join remote1 on ((remote2.id)::bigint=remote1.id) where cutoff > 0.9999;

                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=537.81..76743.81 rows=455000 width=4) (actual time=75.019..4659.802 rows=119 loops=1)
   Hash Cond: (remote1.id = (remote2.id)::bigint)
   ->  Foreign Scan on remote1  (cost=100.00..35506.00 rows=1000000 width=16) (actual time=1.110..4143.655 rows=1000000 loops=1)
   ->  Hash  (cost=436.67..436.67 rows=91 width=7) (actual time=2.754..2.754 rows=105 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 13kB
         ->  Foreign Scan on remote2  (cost=105.13..436.67 rows=91 width=7) (actual time=1.567..2.646 rows=105 loops=1)
 Planning time: 29.629 ms
 Execution time: 4660.433 ms

I thought it would either push the entire join to the foreign side, or at least do a foreign index scan on remote2_cutoff_idx, then loop over each row and do a foreign index scans against remote1_id_idx.

I've tried versions 9.6.3 and 10dev, and neither do what I expected.  It doesn't seem to be a planning problem where it thinks the fast plan is slower, it just doesn't seem to consider the faster plans as being options at all.  Is there some setting to make it realize the cast is shippable?  Is any of the work being done on postgres_fdw for V11 working towards fixing this?

Cheers,

Jeff
Вложения

Re: [PERFORM] postgres_fdw and column casting shippability

От
Tom Lane
Дата:
Jeff Janes <jeff.janes@gmail.com> writes:
> I've tried versions 9.6.3 and 10dev, and neither do what I expected.  It
> doesn't seem to be a planning problem where it thinks the fast plan is
> slower, it just doesn't seem to consider the faster plans as being options
> at all.  Is there some setting to make it realize the cast is shippable?

AFAICS, postgres_fdw doesn't have any knowledge of CoerceViaIO parse
nodes, so it's never going to consider this type of brute-force cast
as shippable.  Normal casts would presumably be shippable if the
underlying function is considered safe.

Looks like a round-tuit-shortage issue rather than anything fundamental.

            regards, tom lane


Re: [PERFORM] postgres_fdw and column casting shippability

От
Jeff Janes
Дата:
On Mon, May 15, 2017 at 3:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jeff Janes <jeff.janes@gmail.com> writes:
> I've tried versions 9.6.3 and 10dev, and neither do what I expected.  It
> doesn't seem to be a planning problem where it thinks the fast plan is
> slower, it just doesn't seem to consider the faster plans as being options
> at all.  Is there some setting to make it realize the cast is shippable?

AFAICS, postgres_fdw doesn't have any knowledge of CoerceViaIO parse
nodes, so it's never going to consider this type of brute-force cast
as shippable.  Normal casts would presumably be shippable if the
underlying function is considered safe.

So then, the secret is to write it like this:

explain analyze select data from remote2 join remote1 on (int8in(textout(remote2.id))  = remote1.id
   where cutoff > 0.9999;

This works to have the join pushed to the foreign side in 9.6, but not before that.

Thanks,

Jeff