Discussion: Custom Plan node
Hi,

The attached patch adds a new plan node type, CustomPlan, that enables extensions to get control during query execution via registered callbacks.

Right now, all the jobs of the executor are built in, except for foreign scan, so an extension has no way to run its own code in place of a particular portion of the plan tree. That is painful for people who want to implement an edge feature in the executor, because the only option is to replace the whole executor portion, an unreasonable maintenance burden.

CustomPlan requires two steps from an extension: registration of a set of callbacks, and manipulation of the plan tree.

First, the extension has to register a set of callbacks under a unique name using RegisterCustomPlan(). The callbacks are defined as follows, and the extension is responsible for making these routines work correctly:

    void BeginCustomPlan(CustomPlanState *cestate, int eflags);
    TupleTableSlot *ExecCustomPlan(CustomPlanState *node);
    Node *MultiExecCustomPlan(CustomPlanState *node);
    void EndCustomPlan(CustomPlanState *node);
    void ExplainCustomPlan(CustomPlanState *node, ExplainState *es);
    void ReScanCustomPlan(CustomPlanState *node);
    void ExecMarkPosCustomPlan(CustomPlanState *node);
    void ExecRestrPosCustomPlan(CustomPlanState *node);

These callbacks are invoked when the plan tree contains a CustomPlan node. However, the usual code paths never construct this node type for any SQL input, so the extension needs to manipulate the already-constructed plan tree. That is the second step: the extension puts its local code on planner_hook to examine and manipulate the PlannedStmt object. It can replace particular nodes in the plan tree with CustomPlan nodes, or inject them at arbitrary points.

Though my intention is to implement GPU-accelerated table scans and other stuff on top of this feature, other useful features can probably be built on it as well. At the developer meeting, someone suggested it may be useful for the PG-XC folks to implement clustered scans.
Also, I have an idea to implement an in-memory query cache that can cut off a particular branch of the plan tree. Probably other folks have other ideas too.

The contrib/xtime module shows a simple example that records the elapsed time of the underlying plan node, then prints it at the end of execution. For example, this query constructs the following plan tree, as we usually see:

postgres=# EXPLAIN (costs off) SELECT * FROM t1 JOIN t2 ON t1.a = t2.x WHERE x BETWEEN 1000 AND 1200 ORDER BY y;
                     QUERY PLAN
-----------------------------------------------------
 Sort
   Sort Key: t2.y
   ->  Nested Loop
         ->  Seq Scan on t2
               Filter: ((x >= 1000) AND (x <= 1200))
         ->  Index Scan using t1_pkey on t1
               Index Cond: (a = t2.x)
(7 rows)

Once the xtime module manipulates the plan tree to inject CustomPlan nodes, it becomes as follows:

postgres=# LOAD '$libdir/xtime';
LOAD
postgres=# EXPLAIN (costs off) SELECT * FROM t1 JOIN t2 ON t1.a = t2.x WHERE x BETWEEN 1000 AND 1200 ORDER BY y;
                           QUERY PLAN
-----------------------------------------------------------------
 CustomPlan:xtime
   ->  Sort
         Sort Key: y
         ->  CustomPlan:xtime
               ->  Nested Loop
                     ->  CustomPlan:xtime on t2
                           Filter: ((x >= 1000) AND (x <= 1200))
                     ->  CustomPlan:xtime
                           ->  Index Scan using t1_pkey on t1
                                 Index Cond: (a = x)
(10 rows)

You can see CustomPlan nodes with the name "xtime" appearing in the plan tree; when the executor meets a CustomPlan node during recursive execution, it calls the functions registered as callbacks of "xtime".

The extension has to set the name of the custom plan provider, at least, when it constructs a CustomPlan node and puts it into the target plan tree. The set of callbacks is looked up by that name and installed in the CustomPlanState object for execution, in ExecInitNode(). The reason I didn't put function pointers into the node directly is that plan nodes need to remain compliant with copyObject() and friends.

Please send any comments.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
Attachments
Kohei KaiGai <kaigai@kaigai.gr.jp> writes:
> The attached patch adds a new plan node type; CustomPlan that enables
> extensions to get control during query execution, via registered callbacks.

TBH, I think this is really an exercise in building useless mechanism. I don't believe that any actually *interesting* new types of plan node can be inserted into a query plan without invasive changes to the planner, and so it's a bit pointless to set up hooks whereby you can avoid touching any source code in the executor.

> ... Extension will put its local code on the planner_hook
> to reference and manipulate PlannedStmt object.

That is hardly a credible design for doing anything interesting with custom plans. It's got exactly the same problem you are complaining about for the executor, ie you have to replace the whole of the planner if you try to do things that way.

One other point here is: if you need more than one kind of custom plan node, how will you tell what's what? I doubt you can completely eliminate the need for IsA-style tests, especially in the planner area. The sample contrib module here already exposes the failure mode I'm worried about: it falls down as soon as it sees a plan node type it doesn't know. If you could show me how this would work together with some other extension that's also adding custom plan nodes of its own, then I might think you had something.

In the same vein, the patch fails to provide credible behavior for ExecSupportsMarkRestore, ExecMaterializesOutput, ExplainPreScanNode, search_plan_tree, and probably some other places that need to know about all possible plan node types. Even if you'd covered every one of those bases, you've still only got support for "generic" plan nodes having no particularly unique properties.
As an example of what I'm thinking about here, NestLoop, which might be considered the most vanilla of all join plan nodes, actually has a lot of specialized infrastructure in both the planner and the executor to support its ability to pass outer-relation values into the inner-relation scan. I think that as soon as you try to do anything of real interest with custom plan nodes, you'll be finding you need special-purpose additions that no set of generic hooks could reasonably cover.

In short, I don't understand or agree with this idea that major changes should be implementable without touching any of the core code in any way. This is open source --- if you need a modified version then modify it. I used to build systems that needed hook-style extensibility because the core code was burned into ROM; but that's not what we're dealing with today, and I don't really see the argument for sacrificing readability and performance by putting hooks everywhere, especially in places with vague, ever-changing API contracts.

			regards, tom lane
On Fri, Sep 6, 2013 at 4:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Kohei KaiGai <kaigai@kaigai.gr.jp> writes:
>> The attached patch adds a new plan node type; CustomPlan that enables
>> extensions to get control during query execution, via registered callbacks.
>
> TBH, I think this is really an exercise in building useless mechanism.
> I don't believe that any actually *interesting* new types of plan node can
> be inserted into a query plan without invasive changes to the planner, and
> so it's a bit pointless to set up hooks whereby you can avoid touching any
> source code in the executor.

I find this a somewhat depressing response. Didn't we discuss this exact design at the developer meeting in Ottawa? I thought it sounded reasonable to you then, or at least I don't remember you panning it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> I find this a somewhat depressing response. Didn't we discuss this
> exact design at the developer meeting in Ottawa? I thought it sounded
> reasonable to you then, or at least I don't remember you panning it.

What I recall saying is that I didn't see how the planner side of it would work ... and I still don't see that. I'd be okay with committing executor-side fixes only if we had a vision of where we'd go on the planner side; but this patch doesn't offer any path forward there.

This is not unlike the FDW stuff, where getting a reasonable set of planner APIs in place was by far the hardest part (and isn't really done even yet, since you still can't do remote joins or remote aggregation in any reasonable fashion). But you can do simple stuff reasonably simply, without reimplementing all of the planner along the way --- and I think we should look for some equivalent level of usefulness from this before we commit it.

			regards, tom lane
2013/9/7 Tom Lane <tgl@sss.pgh.pa.us>:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I find this a somewhat depressing response. Didn't we discuss this
>> exact design at the developer meeting in Ottawa? I thought it sounded
>> reasonable to you then, or at least I don't remember you panning it.
>
> What I recall saying is that I didn't see how the planner side of it would
> work ... and I still don't see that. I'd be okay with committing
> executor-side fixes only if we had a vision of where we'd go on the
> planner side; but this patch doesn't offer any path forward there.
>
The reason this patch sticks to the executor side is that, in Ottawa, we concluded not to patch the planner code from the beginning, because of its complexity. I also agree that planner support for custom plans would help construct better execution plans; however, the feature makes sense even if it starts as functionality that offers a way to rearrange an already-constructed plan tree.

Anyway, let me investigate what kind of APIs would need to be added for the planner stage as well.

> This is not unlike the FDW stuff, where getting a reasonable set of
> planner APIs in place was by far the hardest part (and isn't really done
> even yet, since you still can't do remote joins or remote aggregation in
> any reasonable fashion). But you can do simple stuff reasonably simply,
> without reimplementing all of the planner along the way --- and I think
> we should look for some equivalent level of usefulness from this before
> we commit it.
>
Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
2013/9/7 Kohei KaiGai <kaigai@kaigai.gr.jp>:
> The reason why this patch stick on executor-side is we concluded
> not to patch the planner code from the beginning in Ottawa because
> of its complexity.
>
> Anyway, let me investigate what's kind of APIs to be added for planner
> stage also.
>
Here is a brief idea for adding planner support for custom nodes, if we need it from the beginning. Of course, it has not been considered in detail yet and needs much brushing up; however, it may serve as a draft for implementing this feature.

We may be able to categorize plan node types into three groups: scan, join, and others.

Even though the planner tries various combinations of joins and scans to minimize the estimated cost, we have fewer options for the other types, like T_Agg and so on. It seems to me the other node types are placed mostly according to the query's form, so it is not a big problem if all we can do for them is manipulate the plan tree in planner_hook. That is similar to what the proposed patch does.
So, let's focus on joins and scans. We need to give extensions a chance to override a built-in path when they can offer a cheaper one. That leads to an API that allows adding alternative paths while the built-in machinery is constructing candidate paths. Once such a path is added, the paths can be compared according to their estimated costs.

For example, assume a query joins foreign tables A and B managed by the same postgres_fdw server; a remote join likely has a cheaper cost than a local join. If an extension is capable of handling this case correctly, it can add an alternative "custom-join" path with the cheaper cost. This path would then be transformed into a "CustomJoin" node that launches a query to fetch the result of A join B, joined remotely. In this case, it does not matter if the "CustomJoin" node has underlying ForeignScan nodes in its left/right trees, because the extension can handle them however it wants.

So, the following APIs may be needed for planner support, at least:

* API to add an alternative join path, in addition to the built-in join logic.
* API to add an alternative scan path, in addition to the built-in scan logic.
* API to construct a "CustomJoin" node from the related path.
* API to construct a "CustomScan" node from the related path.

Any comments, please.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
On Sat, Sep 07, 2013 at 02:49:54PM +0200, Kohei KaiGai wrote:
> So, the following APIs may be needed for planner support, at least.
>
> * API to add an alternative join path, in addition to built-in join logic.
> * API to add an alternative scan path, in addition to built-in scan logic.
> * API to construct "CustomJoin" according to the related path.
> * API to construct "CustomScan" according to the related path.
>
> Any comment please.

The broad outlines look great.

Do we have any way, at least conceptually, to consider the graph of the cluster with edges weighted by network bandwidth and latency?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
2013/9/7 David Fetter <david@fetter.org>:
> The broad outlines look great.
>
> Do we have any way, at least conceptually, to consider the graph of
> the cluster with edges weighted by network bandwidth and latency?
>
As postgres_fdw is now doing?
Its configuration allows adding a cost for connecting to the remote server as startup cost, and also a cost for transferring data over the network, multiplied by the estimated number of rows, according to the per-server configuration. I think this is the responsibility of the custom plan provider, and it fully depends on the nature of what the provider wants to offer.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
On Sat, Sep 07, 2013 at 05:21:31PM +0200, Kohei KaiGai wrote:
> As postgres_fdw is now doing? Its configuration allows to add cost
> to connect remote server as startup cost, and also add cost to
> transfer data on network being multiplexed with estimated number of
> rows, according to per-server configuration. I think it is
> responsibility of the custom plan provider, and fully depends on the
> nature of what does it want to provide.

Interesting :)

Sorry I was unclear. I meant something that could take into account the bandwidths and latencies throughout the graph, not just the connections directly from the originating node.

Cheers,
David.
On Fri, Sep 6, 2013 at 7:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What I recall saying is that I didn't see how the planner side of it would
> work ... and I still don't see that. I'd be okay with committing
> executor-side fixes only if we had a vision of where we'd go on the
> planner side; but this patch doesn't offer any path forward there.
>
> This is not unlike the FDW stuff, where getting a reasonable set of
> planner APIs in place was by far the hardest part (and isn't really done
> even yet, since you still can't do remote joins or remote aggregation in
> any reasonable fashion). But you can do simple stuff reasonably simply,
> without reimplementing all of the planner along the way --- and I think
> we should look for some equivalent level of usefulness from this before
> we commit it.

I do think there are problems with this as written. The example consumer of the hook seems to contain a complete list of plan nodes, which is an oxymoron in the face of a facility to add custom plan nodes.

But, I guess I'm not yet convinced that one-for-one substitution of nodes is impossible even with something about this simple. If someone can do a post-pass over the plan tree and replace a SeqScan node with an AwesomeSeqScan node or a Sort node with a RadixSort node, would that constitute a sufficient POC to justify this infrastructure? Obviously, what you'd really want is to be able to inject those nodes (with proper costing) at the time they'd otherwise be generated, since it could affect whether or not a path involving a substituted node survives in the first place, but I'm not sure it's reasonable to expect the planner infrastructure for such changes in the same patch as the executor hooks.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> But, I guess I'm not yet convinced that one-for-one substitution of
> nodes is impossible even with something about this simple. If someone
> can do a post-pass over the plan tree and replace a SeqScan node with
> an AwesomeSeqScan node or a Sort node with a RadixSort node, would
> that constitute a sufficient POC to justify this infrastructure?

No, for exactly the reason you mention: such a change wouldn't have been accounted for in the planner's other choices, and thus this isn't anything more than a kluge. In these specific examples you'd have to ask whether it wouldn't make more sense to be modifying or hooking the executor's code for the existing plan node types, anyway. The main reason I can see for not attacking it like that would be if you wanted the planner to do something different --- which the above approach forecloses.

Let me be clear that I'm not against the concept of custom plan nodes. But it was obvious from the beginning that making the executor deal with them would be much easier than making the planner deal with them. I don't think we should commit a bunch of executor-side infrastructure in the absence of any (ahem) plan for doing something realistic on the planner side. Either that infrastructure will go unused, or we'll be facing a continual stream of demands for doubtless-half-baked planner changes so that people can do something with it.

I'd be willing to put in the infrastructure as soon as it's clear that we have a way forward, but not if it's never going to be more than a kluge.

			regards, tom lane
* Robert Haas (robertmhaas@gmail.com) wrote:
> But, I guess I'm not yet convinced that one-for-one substitution of
> nodes is impossible even with something about this simple.

Couldn't that be done with hooks in those specific plan nodes, or similar..? Of course, as Tom points out, that wouldn't address how the costing is done, and it could end up being wrong if the implementation of the node is completely different.

All that said, I've already been wishing for a way to change how Append works to allow for parallel execution through FDWs; eg: you have a bunch of foreign tables (say, 32) pointing to independent PG clusters on independent pieces of hardware which can all execute a given request in parallel. With a UNION ALL view created over top of those tables, it'd be great if we fired off all the queries at once and then went through collecting the responses, instead of going through them serially.

The same approach could actually apply to Appends which go across tablespaces, if you consider that independent tablespaces mean independent and parallelizable I/O access. Of course, all of this would need to deal sanely with ORDER BY and LIMIT cases.

Thanks,

Stephen
On Mon, Sep 9, 2013 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Let me be clear that I'm not against the concept of custom plan nodes.
> But it was obvious from the beginning that making the executor deal with
> them would be much easier than making the planner deal with them. I don't
> think we should commit a bunch of executor-side infrastructure in the
> absence of any (ahem) plan for doing something realistic on the planner
> side. Either that infrastructure will go unused, or we'll be facing a
> continual stream of demands for doubtless-half-baked planner changes
> so that people can do something with it.
>
> I'd be willing to put in the infrastructure as soon as it's clear that we
> have a way forward, but not if it's never going to be more than a kluge.

Fair enough, I think. So the action item for KaiGai is to think of how the planner integration might work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Sep 10, 2013 at 11:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I'd be willing to put in the infrastructure as soon as it's clear that we
>> have a way forward, but not if it's never going to be more than a kluge.
>
> Fair enough, I think. So the action item for KaiGai is to think of
> how the planner integration might work.

It's hard to imagine how the planner could possibly be pluggable in a generally useful way. It sounds like putting an insurmountable barrier in place that blocks a feature that would be useful in the Executor.

If you only want to handle nodes which directly replace one of the existing nodes providing the same semantics, then that might be possible. But I think you also want to be able to interpose new nodes in the tree, for example for the query cache idea.

Frankly, I think the planner is hard enough to change when you have full access to the source; I can't imagine trying to manipulate it from a distance like this.

-- 
greg
Greg Stark <stark@mit.edu> writes:
> It's hard to imagine how the planner could possibly be pluggable in a
> generally useful way. It sounds like putting an insurmountable barrier
> in place that blocks a feature that would be useful in the Executor.

But it's *not* useful without a credible way to modify the planner.

> Frankly I think the planner is hard enough to change when you have
> full access to the source I can't imagine trying to manipulate it from
> a distance like this.

Yeah. To tell the truth, I'm not really convinced that purely-plugin addition of new plan node types is going to be worth anything. I think pretty much any interesting project (for example, the parallelization of Append nodes that was mumbled about above) is going to require changes that won't fit within such an infrastructure. Still, I'm willing to accept a patch for plugin plan nodes, *if* it addresses the hard part of that problem and not only the easy part.

			regards, tom lane
2013/9/10 Robert Haas <robertmhaas@gmail.com>:
> Fair enough, I think. So the action item for KaiGai is to think of
> how the planner integration might work.
>
Do you think the idea I mentioned upthread is worth investigating in more detail? Or does it seem to you a short-sighted way to fit this infrastructure to the planner?

It categorizes plan nodes into three groups: join, scan, and other stuff. Cost-based estimation mostly applies to joins and scans, so abstracted scan and join nodes may make sense, to inform the core planner what a custom plan node is trying to do. On the other hand, the other stuff, like Agg, must be added according to the provided query, even if its estimated cost is not small, for execution to behave as the query describes. So I thought the important part is the integration of join and scan, while manipulation of the plan tree is sufficient for the other stuff.

How about your opinion?

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
On Tue, Sep 10, 2013 at 11:45 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> So, I thought important one is integration of join and scan, but manipulation
> of plan tree for other stuff is sufficient.
>
> How about your opinion?

Well, I don't know that I'm smart enough to predict every sort of thing that someone might want to do here, unfortunately. This is a difficult area: there are many possible things someone might want to do, and as Tom points out, there's a lot of special handling of particular node types that can make things difficult. And I can't claim to be an expert in this area.

That having been said, I think the idea of a CustomScan node is probably worth investigating. I don't know if that would work out well or poorly, but I think it would be worth some experimentation. Perhaps you could have a hook that gets called for each baserel, and can decide whether or not it wishes to inject any additional paths; and then a CustomScan node that could be used to introduce such paths. I've been thinking that we really ought to have the ability to optimize CTID range queries, like SELECT * FROM foo WHERE ctid > 'some constant'. We have a Tid Scan node, but it only handles equalities, not inequalities.
I suppose this functionality should really be in core, but since it's not, it might make an interesting test for the infrastructure you're proposing. You may be able to think of something else you like better; it's just a thought. I am a little less sanguine about the chances of a CustomJoin node working out well. I agree that we need something to handle join pushdown, but it seems to me that might be done by providing a Foreign Scan path into the joinrel rather than by adding a concept of foreign joins per se. There are other possible new join types, like the Range Join that Jeff Davis has mentioned in the past, which might be interesting. But overall, I can't see us adding very many more join types, so I'm not totally sure how much extensibility would help us here. We've added a few scan types over the years (index-only scans in 9.2, and bitmap index scans in 8.1, I think) but all of our existing join types go back to time immemorial. And I think that lumping everything else together under "not a scan or join" has the least promise of all. Is replacing Append really the same as replacing Sort? I think we'll need to think harder here about what we're trying to accomplish and how to get there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
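The CTID inequality idea can be sketched outside the backend. The following standalone C fragment is purely illustrative — `TidSketch` and `tid_compare` are invented names, not PostgreSQL source — and shows the lexicographic (block, offset) comparison a `ctid > 'constant'` range scan would need, beyond the equality check the existing Tid Scan node performs:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a tuple identifier: a (block number,
 * line offset) pair, ordered by block first and then by offset. */
typedef struct
{
    uint32_t block;     /* heap block number */
    uint16_t offset;    /* line pointer offset within the block */
} TidSketch;

/* Lexicographic comparison; a "ctid > constant" scan could begin at
 * the first tuple for which this returns a positive value. */
static int
tid_compare(TidSketch a, TidSketch b)
{
    if (a.block != b.block)
        return (a.block < b.block) ? -1 : 1;
    if (a.offset != b.offset)
        return (a.offset < b.offset) ? -1 : 1;
    return 0;
}
```

A real implementation would presumably work with the backend's own tuple-identifier type and seek the heap scan directly to the first qualifying block, rather than comparing every tuple, but the ordering itself is this simple.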
2013/9/13 Robert Haas <robertmhaas@gmail.com>: > On Tue, Sep 10, 2013 at 11:45 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: >>> Fair enough, I think. So the action item for KaiGai is to think of >>> how the planner integration might work. >>> >> Do you think the idea I mentioned upthread is worth investigating in more >> detail? Or does it seem short-sighted to try to fit this infrastructure >> into the planner? >> >> It categorizes plan nodes into three groups: joins, scans, and other stuff. >> Cost-based estimation mostly applies to joins and scans, so abstract scan >> and join nodes may make sense as a way to tell the core planner what a >> custom plan node is trying to do. >> On the other hand, other stuff, like Agg, must be added according to the >> provided query, even if its estimated cost is not small, so that execution >> matches what the query describes. >> So I think the important part is planner integration for joins and scans; >> for the other stuff, manipulation of the plan tree is sufficient. >> >> What is your opinion? > > Well, I don't know that I'm smart enough to predict every sort of > thing that someone might want to do here, unfortunately. This is a > difficult area: there are many possible things someone might want to > do, and as Tom points out, there's a lot of special handling of > particular node types that can make things difficult. And I can't > claim to be an expert in this area. > Sorry for my late response. I've been investigating the planner code to find a way to integrate this custom API; it is still in progress. One piece of special handling I found is that create_join_plan() adjusts root->curOuterRels prior to recursing into the inner tree in the NestLoop case. Probably we need some flags to control this kind of special handling in the core. Listing up all of these cases is a hard job, so it seems to me we will need to check for them as the code is built...
> That having been said, I think the idea of a CustomScan node is > probably worth investigating. I don't know if that would work out > well or poorly, but I think it would be worth some experimentation. > Perhaps you could have a hook that gets called for each baserel, and > can decide whether or not it wishes to inject any additional paths; > and then a CustomScan node that could be used to introduce such paths. > I've been thinking that we really ought to have the ability to > optimize CTID range queries, like SELECT * FROM foo WHERE ctid > 'some > constant'. We have a Tid Scan node, but it only handles equalities, > not inequalities. I suppose this functionality should really be in > core, but since it's not, it might make an interesting test for the > infrastructure you're proposing. You may be able to think of > something else you like better; it's just a thought. > The framework above is exactly what I had in mind. Probably we have to put a hook in the functions invoked by set_base_rel_pathlist() to add alternative ways to scan the given baserel; then set_cheapest() will choose the most reasonable one. The attached patch, which is just a work in progress, shows where I am trying to put the hook; please grep it for "add_custom_scan_paths". Regarding a "guest module" for this framework, another idea I have is a built-in query cache module that returns a previously cached scan result if the table contents have not been updated since the previous run. It probably makes sense in cases where most of the rows are filtered out by the scan. Anyway, I'd like to come up with something useful to demonstrate this API. > I am a little less sanguine about the chances of a CustomJoin node > working out well. I agree that we need something to handle join > pushdown, but it seems to me that might be done by providing a Foreign > Scan path into the joinrel rather than by adding a concept of foreign > joins per se.
> Indeed, if we have a hook on add_paths_to_joinrel(), it also makes sense for foreign tables; probably the planner will then choose a foreign-scan path over an existing join node stacked on top of foreign scans. > There are other possible new join types, like the Range > Join that Jeff Davis has mentioned in the past, which might be > interesting. But overall, I can't see us adding very many more join > types, so I'm not totally sure how much extensibility would help us > here. We've added a few scan types over the years (index only scans > in 9.2, and bitmap index scans in 8.1, I think) but all of our > existing join types go back to time immemorial. > That seems to me a significant point. Even if the custom node an extension adds in place of a join looks like a scan to core PostgreSQL, the extension will still be able to run its own join logic, equivalent to a join, inside it. I think the built-in query cache idea above makes sense as a demonstration of such a pseudo-join node; it could hold a materialized, already-joined result. At least, it seems sufficient for my target: table joins accelerated with a GPU device. > And I think that lumping everything else together under "not a scan or > join" has the least promise of all. Is replacing Append really the > same as replacing Sort? I think we'll need to think harder here about > what we're trying to accomplish and how to get there. > As long as the extension modifies the PlannedStmt in the planner_hook, I don't think it is so difficult, as I demonstrated in the previous patch. Unlike scans and joins, the existing code is not designed to compare multiple possible paths for these nodes, so it seems to me that a facility to adjust an already-constructed plan tree is sufficient for most usage, because the extension can decide whether it can offer a cheaper path than the built-in ones. Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
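The per-baserel hook being discussed would presumably follow PostgreSQL's usual hook-chaining convention (as planner_hook already does). Here is a standalone mock in plain C — not the actual patch; `add_custom_scan_paths_hook` and the mock types are invented from the names in this thread — showing the core calling the hook after adding its built-in paths, and an extension saving and chaining any previously installed hook:

```c
#include <stddef.h>

/* Mock stand-in for a base relation being planned. */
typedef struct
{
    int n_paths;    /* number of candidate scan paths collected so far */
} MockRelOptInfo;

/* Hook type: an extension may add more candidate scan paths. */
typedef void (*add_scan_paths_hook_type)(MockRelOptInfo *rel);

/* Core-side hook variable (mirrors how planner_hook is declared). */
static add_scan_paths_hook_type add_custom_scan_paths_hook = NULL;

/* Core side: after adding the built-in paths for a baserel, give
 * extensions a chance to add theirs; a set_cheapest() equivalent
 * would then pick the cheapest of all candidates. */
void
mock_set_base_rel_pathlist(MockRelOptInfo *rel)
{
    rel->n_paths = 1;                   /* built-in sequential-scan path */
    if (add_custom_scan_paths_hook)
        add_custom_scan_paths_hook(rel);
}

/* Extension side: remember and chain to any previous hook. */
static add_scan_paths_hook_type prev_add_scan_paths_hook = NULL;

static void
my_add_scan_paths(MockRelOptInfo *rel)
{
    if (prev_add_scan_paths_hook)
        prev_add_scan_paths_hook(rel);
    rel->n_paths++;                     /* our custom-scan path candidate */
}

/* Extension's _PG_init() equivalent: install the hook. */
void
mock_extension_init(void)
{
    prev_add_scan_paths_hook = add_custom_scan_paths_hook;
    add_custom_scan_paths_hook = my_add_scan_paths;
}
```

Saving and re-invoking the previous hook is what lets several extensions coexist on the same hook point; the hook only adds candidates, and cost comparison stays entirely in the core.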
Attachments
On Thu, Oct 3, 2013 at 3:05 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote: > Sorry for my late response. I've been investigating the planner code > to find a way to integrate this custom API; it is still in progress. > One piece of special handling I found is that create_join_plan() adjusts > root->curOuterRels prior to recursing into the inner tree in the > NestLoop case. Probably we need some flags to control this kind of > special handling in the core. > Listing up all of these cases is a hard job, so it seems to me we will > need to check for them as the code is built... I'm pretty sure that is only a very small tip of a very large iceberg. > The framework above is exactly what I had in mind. > Probably we have to put a hook in the functions invoked by > set_base_rel_pathlist() to add alternative ways to scan > the given baserel; then set_cheapest() will choose the > most reasonable one. > The attached patch, which is just a work in progress, shows > where I am trying to put the hook; please grep it for > "add_custom_scan_paths". That seems fairly reasonable to me, but I think we'd want a working demonstration. > Regarding a "guest module" for this framework, another > idea I have is a built-in query cache module that returns > a previously cached scan result if the table contents have > not been updated since the previous run. It probably makes > sense in cases where most of the rows are filtered out by > the scan. > Anyway, I'd like to come up with something useful to demonstrate > this API. I doubt that has any chance of working well. Supposing that query caching is a feature we want, I don't think that plan trees are the right place to try to install it. That sounds like something that ought to be done before we plan and execute the query in the first place. >> I am a little less sanguine about the chances of a CustomJoin node >> working out well.
I agree that we need something to handle join >> pushdown, but it seems to me that might be done by providing a Foreign >> Scan path into the joinrel rather than by adding a concept of foreign >> joins per se. >> > Indeed, if we have a hook on add_paths_to_joinrel(), it also makes > sense for foreign tables; probably the planner will then choose a > foreign-scan path over an existing join node stacked on top of > foreign scans. Yes, I think it's reasonable to think about injecting a scan path into a join node. >> And I think that lumping everything else together under "not a scan or >> join" has the least promise of all. Is replacing Append really the >> same as replacing Sort? I think we'll need to think harder here about >> what we're trying to accomplish and how to get there. >> > As long as the extension modifies the PlannedStmt in the planner_hook, > I don't think it is so difficult, as I demonstrated in the previous patch. > Unlike scans and joins, the existing code is not designed to compare > multiple possible paths for these nodes, so it seems to me that a > facility to adjust an already-constructed plan tree is sufficient for > most usage, because the extension can decide whether it can offer a > cheaper path than the built-in ones. Well, there were a lot of problems with your demonstration, which have already been pointed out upthread. I'm skeptical about the idea of simply replacing planner nodes wholesale, and Tom is outright opposed. I think you'll do better to focus on a narrower case - I'd suggest custom scan nodes - and leave the rest as a project for another time. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2013/10/3 Robert Haas <robertmhaas@gmail.com>: >>> I am a little less sanguine about the chances of a CustomJoin node >>> working out well. I agree that we need something to handle join >>> pushdown, but it seems to me that might be done by providing a Foreign >>> Scan path into the joinrel rather than by adding a concept of foreign >>> joins per se. >>> >> Indeed, if we have a hook on add_paths_to_joinrel(), it also makes >> sense for foreign tables; probably the planner will then choose a >> foreign-scan path over an existing join node stacked on top of >> foreign scans. > > Yes, I think it's reasonable to think about injecting a scan path into > a join node. > >>> And I think that lumping everything else together under "not a scan or >>> join" has the least promise of all. Is replacing Append really the >>> same as replacing Sort? I think we'll need to think harder here about >>> what we're trying to accomplish and how to get there. >>> >> As long as the extension modifies the PlannedStmt in the planner_hook, >> I don't think it is so difficult, as I demonstrated in the previous patch. >> Unlike scans and joins, the existing code is not designed to compare >> multiple possible paths for these nodes, so it seems to me that a >> facility to adjust an already-constructed plan tree is sufficient for >> most usage, because the extension can decide whether it can offer a >> cheaper path than the built-in ones. > > Well, there were a lot of problems with your demonstration, which have > already been pointed out upthread. I'm skeptical about the idea of > simply replacing planner nodes wholesale, and Tom is outright opposed. > I think you'll do better to focus on a narrower case - I'd suggest > custom scan nodes - and leave the rest as a project for another time. > Thanks, that makes it clear what we should target for v9.4 development. Towards the next commitfest, I'm planning to develop the following features: * A CustomScan node that can run custom code instead of the built-in scan nodes.
* Join push-down in postgres_fdw, using a hook placed in add_paths_to_joinrel(), for demonstration purposes. * Some new way to scan a relation; your suggested ctid scan with less-than or greater-than qualifiers is probably a good example, also for demonstration purposes. The above set of jobs will probably be the first chunk of this feature; then let's do the other stuff, like Append, Sort, Aggregate and so on, later. That seems a reasonable strategy to me. Even though it is off-topic in this thread.... >> Regarding a "guest module" for this framework, another >> idea I have is a built-in query cache module that returns >> a previously cached scan result if the table contents have >> not been updated since the previous run. It probably makes >> sense in cases where most of the rows are filtered out by >> the scan. >> Anyway, I'd like to come up with something useful to demonstrate >> this API. > > I doubt that has any chance of working well. Supposing that query > caching is a feature we want, I don't think that plan trees are the > right place to try to install it. That sounds like something that > ought to be done before we plan and execute the query in the first > place. > The query cache I mention above is intended to cache a part of the query tree. For example, suppose an ad-hoc query joins A, B, and C, and the data analyst keeps modifying the qualifiers attached to C while joining A and B the same way each time. I thought it would make sense to cache the result of that static join; the idea is similar to a materialized view. If the executor can pick up a cached result for A x B, all it needs to do is join C with the cached result. Of course, I know there may be difficulties, like cache invalidation. :-( Thanks, -- KaiGai Kohei <kaigai@kaigai.gr.jp>
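The invalidation concern at the end can be made concrete with a toy model. Everything below is invented for illustration — a per-table modification counter stands in for whatever real machinery would detect updates to A or B — and shows only the validity logic a cached A x B result would need, not any actual PostgreSQL infrastructure:

```c
#include <stdbool.h>

/* Toy table: mod_count is bumped on every update to the table. */
typedef struct
{
    int mod_count;
} MockTable;

/* Cached join result of A x B, remembered together with the
 * modification counters it was built against. */
typedef struct
{
    bool valid;
    int  a_mod_count;
    int  b_mod_count;
    int  n_rows;        /* stand-in for the materialized rows */
} MockJoinCache;

/* Cache hit only if neither input table has changed since the
 * cached result was built. */
bool
cache_lookup(const MockJoinCache *c, const MockTable *a, const MockTable *b)
{
    return c->valid
        && c->a_mod_count == a->mod_count
        && c->b_mod_count == b->mod_count;
}

/* (Re)build the cache entry against the current table states. */
void
cache_store(MockJoinCache *c, const MockTable *a, const MockTable *b,
            int n_rows)
{
    c->valid = true;
    c->a_mod_count = a->mod_count;
    c->b_mod_count = b->mod_count;
    c->n_rows = n_rows;
}
```

On a hit, the executor would join C against the cached rows instead of recomputing A x B; any update to either input forces a rebuild, which is the easy half of the problem KaiGai acknowledges.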