Обсуждение: Function scan FDW pushdown

Поиск
Список
Период
Сортировка

Function scan FDW pushdown

От
Alexander Pyhalov
Дата:
Hi.

The attached patch allows pushing joins with function RTEs to PostgreSQL 
data sources.
This makes executing queries like this

create foreign table f_pgbench_accounts (aid int, bid int, abalance int, 
filler char(84)) SERVER local_srv OPTIONS (table_name 
'pgbench_accounts');
select * from f_pgbench_accounts join unnest(array[1,2,3]) ON unnest = 
aid;

more efficient.

with patch:

# explain analyze select * from f_pgbench_accounts join 
unnest(array[1,2,3,4,5,6]) ON unnest = aid;
                                            QUERY PLAN
------------------------------------------------------------------------------------------------
  Foreign Scan  (cost=100.00..116.95 rows=7 width=356) (actual 
time=2.282..2.287 rows=6 loops=1)
    Relations: (f_pgbench_accounts) INNER JOIN (FUNCTION RTE unnest)
  Planning Time: 0.487 ms
  Execution Time: 3.336 ms

without patch:

# explain analyze select * from f_pgbench_accounts join 
unnest(array[1,2,3,4,5,6]) ON unnest = aid;
                                                               QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------
  Hash Join  (cost=100.14..158.76 rows=7 width=356) (actual 
time=2.263..1268.607 rows=6 loops=1)
    Hash Cond: (f_pgbench_accounts.aid = unnest.unnest)
    ->  Foreign Scan on f_pgbench_accounts  (cost=100.00..157.74 rows=217 
width=352) (actual time=2.190..1205.938 rows=100000 loops=1)
    ->  Hash  (cost=0.06..0.06 rows=6 width=4) (actual time=0.041..0.043 
rows=6 loops=1)
          Buckets: 1024  Batches: 1  Memory Usage: 9kB
          ->  Function Scan on unnest  (cost=0.00..0.06 rows=6 width=4) 
(actual time=0.025..0.028 rows=6 loops=1)
  Planning Time: 0.389 ms
  Execution Time: 1269.627 ms

So far I don't know how to visualize actual function expression used in 
function RTE, as in postgresExplainForeignScan() es->rtable comes from  
queryDesc->plannedstmt->rtable, and rte->functions is already 0.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional
Вложения

Re: Function scan FDW pushdown

От
Ashutosh Bapat
Дата:
Hi Alexander,

On Thu, May 20, 2021 at 11:13 PM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
>
> Hi.
>
> The attached patch allows pushing joins with function RTEs to PostgreSQL
> data sources.
> This makes executing queries like this
>
> create foreign table f_pgbench_accounts (aid int, bid int, abalance int,
> filler char(84)) SERVER local_srv OPTIONS (table_name
> 'pgbench_accounts');
> select * from f_pgbench_accounts join unnest(array[1,2,3]) ON unnest =
> aid;
>

It will be good to provide some practical examples where this is useful.



> more efficient.
>
> with patch:
>
>
> So far I don't know how to visualize actual function expression used in
> function RTE, as in postgresExplainForeignScan() es->rtable comes from
> queryDesc->plannedstmt->rtable, and rte->functions is already 0.

The actual function expression will be part of the Remote SQL of
ForeignScan node so no need to visualize it separately.

The patch will have problems when there are multiple foreign tables
all on different servers or use different FDWs. In such a case the
function scan's RelOptInfo will get the fpinfo based on the first
foreign table the function scan is paired with during join planning.
But that may not be the best foreign table to join. We should be able
to plan all the possible joins. Current infra to add one fpinfo per
RelOptInfo won't help there. We need something better.

The patch targets only postgres FDW, how do you see this working with
other FDWs?

If we come up with the right approach we could use it for 1. pushing
down queries with IN () clause 2. joining a small local table with a
large foreign table by sending the local table rows down to the
foreign server.

-- 
Best Wishes,
Ashutosh Bapat



Re: Function scan FDW pushdown

От
Alexander Pyhalov
Дата:
Ashutosh Bapat писал 2021-06-15 16:15:
> Hi Alexander,

Hi.

The current version of the patch is based on asymetric partition-wise 
join.
Currently it is applied after 
v19-0001-Asymmetric-partitionwise-join.patch from
on 
https://www.postgresql.org/message-id/792d60f4-37bc-e6ad-68ca-c2af5cbb2d9b@postgrespro.ru 
.

>> So far I don't know how to visualize actual function expression used 
>> in
>> function RTE, as in postgresExplainForeignScan() es->rtable comes from
>> queryDesc->plannedstmt->rtable, and rte->functions is already 0.
> 
> The actual function expression will be part of the Remote SQL of
> ForeignScan node so no need to visualize it separately.

We still need to create tuple description for functions in 
get_tupdesc_for_join_scan_tuples(),
so I had to remove setting newrte->functions to NIL in 
add_rte_to_flat_rtable().
With rte->functions in place, there's no issues for explain.

> 
> The patch will have problems when there are multiple foreign tables
> all on different servers or use different FDWs. In such a case the
> function scan's RelOptInfo will get the fpinfo based on the first
> foreign table the function scan is paired with during join planning.
> But that may not be the best foreign table to join. We should be able
> to plan all the possible joins. Current infra to add one fpinfo per
> RelOptInfo won't help there. We need something better.

I suppose attached version of the patch is more mature.

> 
> The patch targets only postgres FDW, how do you see this working with
> other FDWs?

Not now. We introduce necessary APIs for other FDWs, but implementing 
TryShippableJoinPaths()
doesn't seem straightforward.

> 
> If we come up with the right approach we could use it for 1. pushing
> down queries with IN () clause 2. joining a small local table with a
> large foreign table by sending the local table rows down to the
> foreign server.


-- 
Best regards,
Alexander Pyhalov,
Postgres Professional
Вложения