Re: [v9.5] Custom Plan API

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: [v9.5] Custom Plan API
Дата
Msg-id CA+Tgmoakn=qpCp0rDXMEUoaPEzKAFybAq-hG8h+t1CV5f1taeA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [v9.5] Custom Plan API  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Список pgsql-hackers
On Thu, Sep 11, 2014 at 8:40 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>> On Thu, Sep 11, 2014 at 11:24 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com>
>> wrote:
>> >> Don't the changes to src/backend/optimizer/plan/createplan.c belong
>> >> in patch #2?
>> >>
>> > The borderline between #1 and #2 is little bit bogus. So, I moved most
>> > of portion into #1, however, invocation of InitCustomScan (that is a
>> > callback in CustomPlanMethod) in create_custom_plan() is still in #2.
>>
>> Eh, create_custom_scan() certainly looks like it is in #1 from here, or
>> at least part of it is.  It calculates tlist and clauses and then does
>> nothing with them.  That clearly can't be the right division.
>>
>> I think it would make sense to have create_custom_scan() compute tlist and
>> clauses first, and then pass those to CreateCustomPlan().  Then you don't
>> need a separate InitCustomScan() - which is misnamed anyway, since it has
>> nothing to do with ExecInitCustomScan().
>>
> The only reason why I put separate hooks here is, create_custom_scan() needs
> to know exact size of the CustomScan node (including private fields), however,
> it is helpful for extensions to kick its callback to initialize the node
> next to the common initialization stuff.

Why does it need to know that?  I don't see that it's doing anything
that requires knowing the size of that node, and if it is, I think it
shouldn't be.  That should get delegated to the callback provided by
the custom plan provider.

> Regarding to the naming, how about GetCustomScan() instead of InitCustomScan()?
> It follows the manner in create_foreignscan_plan().

I guess that's a bit better, but come to think of it, I'd really like
to avoid baking in the assumption that the custom path provider has to
return any particular type of plan node.  A good start would be to
give it a name that doesn't imply that - e.g. PlanCustomPath().

>> > OK, I revised. Now custom-scan assumes it has a particular valid
>> > relation to be scanned, so no code path with scanrelid == 0 at this moment.
>> >
>> > Let us revisit this scenario when custom-scan replaces relation-joins.
>> > In this case, custom-scan will not be associated with a particular
>> > base- relation, thus it needs to admit a custom-scan node with scanrelid
>> == 0.
>>
>> Yeah, I guess the question there is whether we'll want let CustomScan have
>> scanrelid == 0 or require that CustomJoin be used there instead.
>>
> Right now, I cannot imagine a use case that requires individual CustomJoin
> node because CustomScan with scanrelid==0 (that performs like custom-plan
> rather than custom-scan in actually) is sufficient.
>
> If a CustomScan gets chosen instead of built-in join logics, it shall looks
> like a relation scan on the virtual one that is consists of two underlying
> relation. Callbacks of the CustomScan has a responsibility to join underlying
> relations; that is invisible from the core executor.
>
> It seems to me CustomScan with scanrelid==0 is sufficient to implement
> an alternative logic on relation joins, don't need an individual node
> from the standpoint of executor.

That's valid logic, but it's not the only way to do it.  If we have
CustomScan and CustomJoin, either of them will require some adaption
to handle this case.  We can either allow a custom scan that isn't
scanning any particular relation (i.e. scanrelid == 0), or we can
allow a custom join that has no children.  I don't know which way will
come out cleaner, and I think it's good to leave that decision to one
side for now.

>> >> Why can't the Custom(GpuHashJoin) node build the hash table
>> >> internally instead of using a separate node?
>> >>
>> > It's possible, however, it prevents to check sub-plans using EXPLAIN
>> > if we manage inner-plans internally. So, I'd like to have a separate
>> > node being connected to the inner-plan.
>>
>> Isn't that just a matter of letting the EXPLAIN code print more stuff?
>> Why can't it?
>>
> My GpuHashJoin takes multiple relations to load them a hash-table.
> On the other hand, Plan node can have two underlying relations at most
> (inner/outer). Outer-side is occupied by the larger relation, so it
> needs to make multiple relations visible using inner-branch.
> If CustomScan can has a list of multiple underlying plan-nodes, like
> Append node, it can represent the structure above in straightforward
> way, but I'm uncertain which is the better design.

Right.  I think the key point is that it is *possible* to make this
work without a multiexec interface, and it seems like we're agreed
that it is.  Now perhaps we will decide that there is enough benefit
in having multiexec support that we want to do it anyway, but it's
clearly not a hard requirement, because it can be done without that in
the way you describe here.  Let's leave to the future the decision as
to how to proceed here; getting the basic thing done is hard enough.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: jsonb contains behaviour weirdness
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: HEAD seems to generate larger WAL regarding GIN index