Discussion: Construction of Plan-node by CSP (RE: Custom/Foreign-Join-APIs)
> -----Original Message-----
> Sent: Friday, May 15, 2015 8:44 AM
> To: 'Tom Lane'; Kohei KaiGai
> Cc: Robert Haas; Thom Brown; Shigeru Hanada; pgsql-hackers@postgreSQL.org
> Subject: RE: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
>
> > A possible compromise that we could perhaps still wedge into 9.5 is to
> > extend CustomPath with a List of child Paths, and CustomScan with a List
> > of child Plans, which createplan.c would know to build from the Paths,
> > and other modules would then also be aware of these children. I find that
> > uglier than a separate join node type, but it would be tolerable I guess.
> >
> The attached patch implements what you suggested as is.
> It allows custom-scan providers to have child Paths without exporting
> create_plan_recurse(), and enables to represent N-way join naturally.
> Please add any solution, even if we don't reach the consensus of how
> create_plan_recurse (and other useful static functions) are visible to
> extensions.
>
I updated the patch to apply against the latest master branch.

Let me restate the problem. (I am really having a hard time with it.)
When an extension tries to implement its own join logic using the
custom-scan interface, it adds a CustomPath in set_join_pathlist_hook
with its cost estimate.
Once the planner chooses that path, the PlanCustomPath callback is
invoked, and the custom-scan provider constructs its CustomScan node
according to the path. I expected the provider to recursively initialize
the underlying Path nodes (those that work as join input, if any) using
create_plan_recurse().
However, we did not reach 100% consensus on exporting this function to
extensions, so a later commit made the function static again.

Instead of this approach, Tom suggested adding a list of child Paths to
the CustomPath node, so that createplan.c calls create_plan_recurse() for
each entry of the list, without the function being exported.

I can agree with this approach as an alternative to the previous public
create_plan_recurse(), and the attached patch implements this idea as is.
(Do I understand his suggestion correctly?)

Below is what is expected of a custom-scan provider that takes underlying
Path/Plan nodes.
1. It adds a list of underlying Path nodes to custom_children of the
   CustomPath node.
2. In PlanCustomPath, it adds the Plan nodes (initialized by createplan.c
   and passed as an argument) to lefttree, righttree and/or
   custom_children of the CustomScan node.
3. In BeginCustomScan, it calls ExecInitNode() to begin execution of each
   underlying plan node. Then, if it has more than 2 children, it attaches
   these PlanState objects to the custom_children list for EXPLAIN output.

As long as an extension follows the above interface contract, it can have
underlying child Path/Plan/PlanState nodes without calling
create_plan_recurse() directly, as previously argued.

I think this is a reasonable enough solution to the problem.
What do people think?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
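The planner side of the contract above (steps 1 and 2) can be sketched with a small, self-contained C model. This is not PostgreSQL code and not the attached patch: the Path and Plan structs, the plan_fn callback, MAX_CHILDREN, gpujoin_plan_path() and build_example_plan() are simplified, hypothetical stand-ins. Only the names create_plan_recurse, custom_children, lefttree and righttree are taken from the thread. The point it tries to show is that the recursion can stay static in the core while the provider still receives fully built child plans.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CHILDREN 8          /* hypothetical limit, for this sketch only */

typedef struct Path Path;
typedef struct Plan Plan;

/* Hypothetical callback, loosely modeled on the PlanCustomPath idea:
 * the core hands the provider the child plans it has already built. */
typedef Plan *(*PlanPathFn)(const Path *path, Plan **child_plans, int nchildren);

struct Path {                   /* simplified stand-in, not the real Path */
    const char *name;
    PlanPathFn  plan_fn;        /* NULL for an ordinary leaf path */
    Path       *custom_children[MAX_CHILDREN];
    int         nchildren;
};

struct Plan {                   /* simplified stand-in, not the real Plan */
    const char *name;
    Plan       *lefttree;       /* unused here; a provider could fill these */
    Plan       *righttree;
    Plan       *custom_children[MAX_CHILDREN];
    int         nchildren;
};

/* Core side: static, so extensions cannot call it directly. */
static Plan *create_plan_recurse(const Path *path)
{
    Plan *children[MAX_CHILDREN];
    int   i;

    assert(path->nchildren <= MAX_CHILDREN);

    /* Build a plan for every child path before consulting the provider. */
    for (i = 0; i < path->nchildren; i++)
        children[i] = create_plan_recurse(path->custom_children[i]);

    if (path->plan_fn != NULL)
        return path->plan_fn(path, children, path->nchildren);

    /* Leaf path: create a plain plan node. */
    Plan *leaf = calloc(1, sizeof(Plan));
    assert(leaf != NULL);
    leaf->name = path->name;
    return leaf;
}

/* Core side: the only entry point visible outside the planner. */
Plan *create_plan(const Path *best_path)
{
    return create_plan_recurse(best_path);
}

/* Provider side (step 2): attach the plans passed in by the core. */
Plan *gpujoin_plan_path(const Path *path, Plan **child_plans, int nchildren)
{
    Plan *scan = calloc(1, sizeof(Plan));
    int   i;

    assert(scan != NULL);
    scan->name = path->name;
    for (i = 0; i < nchildren; i++)
        scan->custom_children[i] = child_plans[i];
    scan->nchildren = nchildren;
    return scan;
}

/* Provider side (step 1): a custom path listing four child paths. */
Plan *build_example_plan(void)
{
    static Path t0 = {"t0", NULL, {NULL}, 0};
    static Path t3 = {"t3", NULL, {NULL}, 0};
    static Path t2 = {"t2", NULL, {NULL}, 0};
    static Path t1 = {"t1", NULL, {NULL}, 0};
    static Path join = {"GpuJoin", gpujoin_plan_path, {&t0, &t3, &t2, &t1}, 4};

    return create_plan(&join);
}
```

In this model the provider never calls the recursive function itself. It only lists its inputs on the path and later consumes the plans it is given, which is the separation the proposal aims for. Step 3 (initializing the children at executor startup) is left out of the sketch.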
On Mon, May 25, 2015 at 5:08 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> I updated the patch to fix up this problem towards the latest master
> branch.
[ ... ]
> Instead of this approach, Tom suggested to add a list of child Paths
> on CustomPath node, then createplan.c calls create_plan_recurse() for
> each entry of the list, without this function getting exported.

Tom, do you want to review this patch and figure out how to solve the
underlying problem? If not, I will take care of it. But I will be
unhappy if I put time and effort into this and then you insist on
changing everything afterwards, again.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> Tom, do you want to review this patch and figure out how to solve the
> underlying problem? If not, I will take care of it. But I will be
> unhappy if I put time and effort into this and then you insist on
> changing everything afterwards, again.
[ sorry for slow response, been busy ] I will take a look.
regards, tom lane
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
> Sent: Thursday, May 28, 2015 5:35 AM
> To: Robert Haas
> Cc: Kaigai Kouhei(海外 浩平); Thom Brown; Kohei KaiGai; Shigeru Hanada;
> pgsql-hackers@postgreSQL.org
> Subject: Re: [HACKERS] Construction of Plan-node by CSP (RE:
> Custom/Foreign-Join-APIs)
>
> Robert Haas <robertmhaas@gmail.com> writes:
> > Tom, do you want to review this patch and figure out how to solve the
> > underlying problem? If not, I will take care of it. But I will be
> > unhappy if I put time and effort into this and then you insist on
> > changing everything afterwards, again.
>
> [ sorry for slow response, been busy ] I will take a look.
>
Tom, what is your availability?
On my side, I adjusted my extension (PG-Strom) to fit the infrastructure you proposed,
and confirmed that it works even when a custom scan that replaces a relation join takes
more than two Path nodes in the custom_children list of CustomPath, without exporting
create_plan_recurse().
Below is an example of a custom scan (GpuJoin) that joins four relations.
The code base is the latest master + custom-join-children.v2.patch, unchanged from
the last post.
postgres=# explain analyze select avg(x) from t0 natural join t1 natural join t2 natural join t3 group by cat;
                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 HashAggregate (cost=298513.77..298514.10 rows=26 width=12) (actual time=5622.028..5622.033 rows=26 loops=1)
   Group Key: t0.cat
   -> Custom Scan (GpuJoin) (cost=4702.00..249169.85 rows=9868784 width=12) (actual time=540.718..2075.566 rows=10000000 loops=1)
        Bulkload: On (density: 100.00%)
        Depth 1: Logic: GpuHashJoin, HashKeys: (cid), JoinQual: (cid = cid), nrows_ratio: 0.98936439
        Depth 2: Logic: GpuHashJoin, HashKeys: (bid), JoinQual: (bid = bid), nrows_ratio: 0.99748135
        Depth 3: Logic: GpuHashJoin, HashKeys: (aid), JoinQual: (aid = aid), nrows_ratio: 1.00000000
        -> Custom Scan (BulkScan) on t0 (cost=0.00..242858.60 rows=10000060 width=24) (actual time=8.555..903.864 rows=10000000 loops=1)
        -> Seq Scan on t3 (cost=0.00..734.00 rows=40000 width=4) (actual time=0.019..4.370 rows=40000 loops=1)
        -> Seq Scan on t2 (cost=0.00..734.00 rows=40000 width=4) (actual time=0.004..4.182 rows=40000 loops=1)
        -> Seq Scan on t1 (cost=0.00..734.00 rows=40000 width=4) (actual time=0.005..4.275 rows=40000 loops=1)
 Planning time: 0.918 ms
 Execution time: 6178.264 ms
(13 rows)
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
> Robert Haas <robertmhaas@gmail.com> writes:
> > Tom, do you want to review this patch and figure out how to solve the
> > underlying problem? If not, I will take care of it. But I will be
> > unhappy if I put time and effort into this and then you insist on
> > changing everything afterwards, again.
>
> [ sorry for slow response, been busy ] I will take a look.
>
> regards, tom lane
>
Tom, please don't forget this problem.
It is still an issue for any custom-scan provider that tries to implement
its own join logic, so we still have to apply an additional patch (or
copy and paste createplan.c into the module's source).

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>