Hi Konstantin!
09.11.17 20:14, Konstantin Knizhnik wrote:
> It is still far from ideal plan because each worker is working with
> all partitions, instead of spitting partitions between workers and
> calculate partial aggregates for each partition.
>
> But if we add FDW as a child of parent table, then parallel scan can
> not be used and we get the worst possible plan:
>
> postgres=# create foreign table derived_fdw() inherits(base) server
> pg_fdw options (table_name 'derived1');CREATE FOREIGN TABLE
> postgres=# explain select sum(x) from base;
> QUERY PLAN
> ----------------------------------------------------------------------------------
>
> Aggregate (cost=34055.07..34055.08 rows=1 width=8)
> -> Append (cost=0.00..29047.75 rows=2002926 width=4)
> -> Seq Scan on base (cost=0.00..0.00 rows=1 width=4)
> -> Seq Scan on derived1 (cost=0.00..14425.00 rows=1000000
> width=4)
> -> Seq Scan on derived2 (cost=0.00..14425.00 rows=1000000
> width=4)
> -> Foreign Scan on derived_fdw (cost=100.00..197.75
> rows=2925 width=4)
> (6 rows)
>
> So we sequentially pull all data to this node and compute aggregates
> locally.
> Ideal plan will calculate in parallel partial aggregates at all nodes
> and then combine partial results.
> It requires two changes:
> 1. Replace Aggregate->Append with
> Finalize_Aggregate->Append->Partial_Aggregate
> 2. Concurrent execution of Append. It also can be done in two
> different ways: we can try to use existed parallel workers
> infrastructure and
> replace Append with Gather. It seems to be the best approach for local
> partitioning. In case of remote (FDW) partitions, it is enough
> to split starting of execution (PQsendQuery in postgres_fdw) and
> getting results. So it requires some changes in FDW protocol.
>
>
> I wonder if somebody already investigate this problem or working in
> this direction.
> May be there are already some patches proposed?
> I have searched hackers archive, but didn't find something relevant...
> Are there any suggestions about the best approach to implement this
> feature?
>
Maybe in this thread[1] your described problem are solved through
introducing Parallel Append node?
1.
https://www.postgresql.org/message-id/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1%2BS%2BvRuUQ%40mail.gmail.com
--
Regards,
Maksim Milyutin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers