Custom Plan node

From: Kohei KaiGai
Subject: Custom Plan node
Msg-id: CADyhKSWaSpJy9v3R1K5t3fC9r04-yf6ta_driuLHuR-xgCvyng@mail.gmail.com
Replies: Re: Custom Plan node  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Hi,

The attached patch adds a new plan node type, CustomPlan, that lets
extensions take control during query execution via registered callbacks.
Right now, all the jobs of the executor are built in, except for foreign scan,
so an extension has no way to run its own code in place of a particular
portion of the plan tree. This is painful for people who want to implement
an edge feature in the executor, because all we can do is replace the whole
executor, which carries an unreasonable maintenance burden.

Using CustomPlan takes an extension two steps: registering a set of
callbacks, and manipulating the plan tree.
First, the extension has to register a set of callbacks under a unique name
using RegisterCustomPlan(). The callbacks are defined as follows, and the
extension is responsible for making these routines work correctly.

  void BeginCustomPlan(CustomPlanState *cestate, int eflags);
  TupleTableSlot *ExecCustomPlan(CustomPlanState *node);
  Node *MultiExecCustomPlan(CustomPlanState *node);
  void EndCustomPlan(CustomPlanState *node);
  void ExplainCustomPlan(CustomPlanState *node, ExplainState *es);
  void ReScanCustomPlan(CustomPlanState *node);
  void ExecMarkPosCustomPlan(CustomPlanState *node);
  void ExecRestrPosCustomPlan(CustomPlanState *node);
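
To make the registration step concrete, here is a minimal sketch of a
name-keyed callback registry. It is not the code in the patch: the struct
layout, the fixed-size table, and the functions register_custom_plan() and
lookup_custom_plan() are all hypothetical stand-ins, and only three of the
eight callbacks are modeled.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the executor state type; NOT the real PostgreSQL struct. */
typedef struct CustomPlanState CustomPlanState;

/* Hypothetical callback table; only three of the callbacks are modeled. */
typedef struct CustomPlanCallbacks
{
    void  (*begin)(CustomPlanState *node, int eflags);
    void *(*exec)(CustomPlanState *node);
    void  (*end)(CustomPlanState *node);
} CustomPlanCallbacks;

#define MAX_PROVIDERS 16

typedef struct ProviderEntry
{
    const char                *name;
    const CustomPlanCallbacks *callbacks;
} ProviderEntry;

static ProviderEntry providers[MAX_PROVIDERS];
static int           num_providers = 0;

/* Look up a provider by name; returns NULL if the name is unknown. */
const CustomPlanCallbacks *
lookup_custom_plan(const char *name)
{
    for (int i = 0; i < num_providers; i++)
    {
        if (strcmp(providers[i].name, name) == 0)
            return providers[i].callbacks;
    }
    return NULL;
}

/*
 * Register callbacks under a unique name (hypothetical signature).
 * Returns 0 on success, -1 if the name is taken or the table is full.
 */
int
register_custom_plan(const char *name, const CustomPlanCallbacks *callbacks)
{
    if (num_providers >= MAX_PROVIDERS || lookup_custom_plan(name) != NULL)
        return -1;
    providers[num_providers].name = name;
    providers[num_providers].callbacks = callbacks;
    num_providers++;
    return 0;
}
```

In this sketch a second registration under the same name is rejected, which
is one possible reading of "unique name"; the patch may handle it differently.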

These callbacks are invoked if the plan tree contains a CustomPlan node.
However, the usual code path never constructs this node type for any
SQL input, so the extension needs to manipulate the plan tree after it
has been constructed.
That is the second job. The extension puts its own code on the planner_hook
to inspect and manipulate the PlannedStmt object. It can replace particular
nodes in the plan tree with CustomPlan nodes, or inject one at an arbitrary point.
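
The injection step can be sketched on a toy tree. The code below does not
touch planner_hook or PlannedStmt; ToyPlan, make_plan(), inject_custom() and
count_nodes() are hypothetical names, and the tree only borrows the
lefttree/righttree shape of a plan node. It puts a custom node on top of
every existing node, which is one of several possible rewrites.

```c
#include <stdlib.h>

/* Toy node tags; these are NOT PostgreSQL's NodeTag values. */
typedef enum
{
    TOY_SORT,
    TOY_NESTLOOP,
    TOY_SEQSCAN,
    TOY_INDEXSCAN,
    TOY_CUSTOM
} ToyTag;

typedef struct ToyPlan
{
    ToyTag          tag;
    const char     *custom_name;    /* provider name, set only for TOY_CUSTOM */
    struct ToyPlan *lefttree;
    struct ToyPlan *righttree;
} ToyPlan;

/* Allocate a node with the given children. */
ToyPlan *
make_plan(ToyTag tag, ToyPlan *left, ToyPlan *right)
{
    ToyPlan *plan = calloc(1, sizeof(ToyPlan));

    if (plan == NULL)
        abort();
    plan->tag = tag;
    plan->lefttree = left;
    plan->righttree = right;
    return plan;
}

/* Recursively put a custom node, tagged with the provider name, on top of every node. */
ToyPlan *
inject_custom(ToyPlan *plan, const char *name)
{
    ToyPlan *wrapper;

    if (plan == NULL)
        return NULL;
    plan->lefttree = inject_custom(plan->lefttree, name);
    plan->righttree = inject_custom(plan->righttree, name);
    wrapper = make_plan(TOY_CUSTOM, plan, NULL);
    wrapper->custom_name = name;
    return wrapper;
}

/* Count the nodes in a tree. */
int
count_nodes(const ToyPlan *plan)
{
    if (plan == NULL)
        return 0;
    return 1 + count_nodes(plan->lefttree) + count_nodes(plan->righttree);
}
```

A real planner_hook implementation would also have to walk node types with
other child links (subplans, append lists and so on), which this sketch ignores.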

My own intention is to implement GPU-accelerated table scans and other
stuff on top of this feature, but other useful features can probably be
built on it as well. At the developer meeting, someone suggested it may be
useful for PG-XC folks to implement clustered scan. I also have an idea to
implement an in-memory query cache that can cut off a particular branch of
the plan tree. Probably, other folks have other ideas.

The contrib/xtime module shows a simple example that records the elapsed time
of the underlying plan node, then prints it at the end of execution.
For example, this query normally produces the following plan tree.

postgres=# EXPLAIN (costs off)
           SELECT * FROM t1 JOIN t2 ON t1.a = t2.x
                    WHERE x BETWEEN 1000 AND 1200 ORDER BY y;
                     QUERY PLAN
-----------------------------------------------------
 Sort
   Sort Key: t2.y
   ->  Nested Loop
         ->  Seq Scan on t2
               Filter: ((x >= 1000) AND (x <= 1200))
         ->  Index Scan using t1_pkey on t1
               Index Cond: (a = t2.x)
(7 rows)

Once the xtime module manipulates the plan tree to inject CustomPlan nodes,
the plan becomes as follows:

postgres=# LOAD '$libdir/xtime';
LOAD
postgres=# EXPLAIN (costs off)
           SELECT * FROM t1 JOIN t2 ON t1.a = t2.x
                    WHERE x BETWEEN 1000 AND 1200 ORDER BY y;
                           QUERY PLAN
-----------------------------------------------------------------
 CustomPlan:xtime
   ->  Sort
         Sort Key: y
         ->  CustomPlan:xtime
               ->  Nested Loop
                     ->  CustomPlan:xtime on t2
                           Filter: ((x >= 1000) AND (x <= 1200))
                     ->  CustomPlan:xtime
                           ->  Index Scan using t1_pkey on t1
                                 Index Cond: (a = x)
(10 rows)

You can see CustomPlan nodes with the name "xtime" in the plan tree.
The executor calls the functions registered as the callbacks of "xtime"
when it meets a CustomPlan node during recursive execution.
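
As a rough illustration of what a timing provider like xtime does at
execution time, the sketch below wraps a child node's exec function and
accumulates the time spent in it. It is not the contrib/xtime source:
ToyNode, toy_scan_exec() and toy_xtime_exec() are hypothetical, tuples are
plain ints, and clock() (CPU time) stands in for whatever timer the module
actually uses.

```c
#include <stddef.h>
#include <time.h>

typedef struct ToyNode ToyNode;

/* Returns a pointer to the next tuple, or NULL when the node is exhausted. */
typedef const int *(*ToyExecFn)(ToyNode *node);

struct ToyNode
{
    ToyExecFn exec;
    ToyNode  *child;        /* underlying node, used by the timing wrapper */
    int       remaining;    /* tuples left to emit, used by the toy scan */
    int       current;      /* last emitted value, used by the toy scan */
    clock_t   elapsed;      /* accumulated child time, used by the wrapper */
    long      calls;        /* number of exec calls, used by the wrapper */
};

/* Toy leaf scan: emits `remaining` tuples (1, 2, 3, ...), then NULL. */
const int *
toy_scan_exec(ToyNode *node)
{
    if (node->remaining <= 0)
        return NULL;
    node->remaining--;
    node->current++;
    return &node->current;
}

/* Timing wrapper: forwards to the child and records how long it took. */
const int *
toy_xtime_exec(ToyNode *node)
{
    clock_t    start = clock();
    const int *slot = node->child->exec(node->child);

    node->elapsed += clock() - start;
    node->calls++;
    return slot;
}
```

The wrapper returns the child's result unchanged, so the rest of the tree
sees the same tuples; an end-of-execution callback could then report
`elapsed` and `calls`.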

An extension has to set at least the name of the custom plan provider when
it constructs a CustomPlan node and puts it on the target plan tree.
The set of callbacks is looked up by that name and installed on the
CustomPlanState object for execution, in ExecInitNode().
The reason I didn't put function pointers directly in the node is that plan
nodes need to be compliant with copyObject() and others.

Any comments are welcome.

Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
