Обсуждение: Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

Поиск
Список
Период
Сортировка

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

От
Kouhei Kaigai
Дата:
> On Thu, Nov 26, 2015 at 5:27 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > I'm now implementing. The above design perfectly works on ForeignScan.
> > On the other hands, I'd like to have deeper consideration for CustomScan.
> >
> > My recent patch adds LibraryName and SymbolName on CustomScanMethods
> > to lookup the method table even if library is not loaded yet.
> > However, this ExtensibleNodeMethods relies custom scan provider shall
> > be loaded, by parallel infrastructure, prior to the deserialization.
> > It means extension has a chance to register itself as well.
> >
> > My idea is, redefine CustomScanMethod as follows:
> >
> > typedef struct ExtensibleNodeMethods
> > {
> >     const char *extnodename;
> >     Size        node_size;
> >     Node     *(*nodeCopy)(const Node *from);
> >     bool      (*nodeEqual)(const Node *a, const Node *b);
> >     void      (*nodeOut)(struct StringInfoData *str, const Node *node);
> >     void      (*nodeRead)(Node *node);
> > } ExtensibleNodeMethods;
> >
> > typedef struct CustomScanMethods
> > {
> >     union {
> >         const char *CustomName;
> >         ExtensibleNodeMethods  xnode;
> >     };
> >     /* Create execution state (CustomScanState) from a CustomScan plan node
> */
> >     Node       *(*CreateCustomScanState) (struct CustomScan *cscan);
> > } CustomScanMethods;
> >
> > It allows to pull CustomScanMethods using GetExtensibleNodeMethods
> > by CustomName; that is alias of extnodename in ExtensibleNodeMethods.
> > Thus, we don't need to care about LibraryName and SymbolName.
> >
> > This kind of trick is not needed for ForeignScan because all the pointer
> > stuff are reproducible by catalog, however, method table of CustomScan
> > is not.
> >
> > How about your opinion?
> 
> Anonymous unions are not C89, so this is a no-go.
> 
> I think we should just drop CustomName and replace it with
> ExtensibleNodeMethods.  That will break things for third-party code
> that is using the existing CustomScan stuff, but there's a good chance
> that nobody other than you has written any such code.  And even if
> someone has, updating it for this change will not be very difficult.
>
Thanks, I have same opinion.
At this moment, CustomName is not utilized so much except for EXPLAIN
and debug output, so it is not a hard stuff to replace this field by
extnodename, even if custom scan provider does not have own structure
thus no callbacks of node operations are defined.

The attached patch adds 'extnodename' fields on ForeignPath and
ForeignScan, also embeds ExtensibleNodeMethods in CustomPathMethods,
CustomScanMethods and CustomExecMethods.

Its handlers in copyfuncs.c were enhanced to utilize the callback
to allocate a larger structure and copy private fields.
Handlers in outfuncs.c and readfuncs.c have to be symmetric. The
core backend writes out 'extnodename' prior to any private fields,
then we can identify the callback to process rest of tokens for
private fields.

RegisterExtensibleNodeMethods() tracks a pair of 'extnodename'
and ExtensibleNodeMethods itself. It saves the pointer of the
method table, but not duplicate, because _readCustomScan()
cast the method pulled by 'extnodename' thus registered table
has to be a static variable on extension; that means extension
never update ExtensibleNodeMethods once registered.

The one other patch is my test using PG-Strom, however, I didn't
have a test on FDW extensions yet.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Вложения

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

От
Andres Freund
Дата:
On 2015-12-02 08:52:20 +0000, Kouhei Kaigai wrote:
> diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
> index 26264cb..c4bb76e 100644
> --- a/src/backend/nodes/copyfuncs.c
> +++ b/src/backend/nodes/copyfuncs.c
> @@ -635,8 +635,12 @@ _copyWorkTableScan(const WorkTableScan *from)
>  static ForeignScan *
>  _copyForeignScan(const ForeignScan *from)
>  {
> -    ForeignScan *newnode = makeNode(ForeignScan);
> -
> +    const ExtensibleNodeMethods *methods
> +        = GetExtensibleNodeMethods(from->extnodename, true);
> +    ForeignScan *newnode = (ForeignScan *) newNode(!methods
> +                                                   ? sizeof(ForeignScan)
> +                                                   : methods->node_size,
> +                                                   T_ForeignScan);

Changing the FDW API seems to put this squarely into 9.6
territory. Agreed?

>      /*
>       * copy node superclass fields
>       */
> @@ -645,6 +649,7 @@ _copyForeignScan(const ForeignScan *from)
>      /*
>       * copy remainder of node
>       */
> +    COPY_STRING_FIELD(extnodename);
>      COPY_SCALAR_FIELD(fs_server);
>      COPY_NODE_FIELD(fdw_exprs);
>      COPY_NODE_FIELD(fdw_private);
> @@ -652,6 +657,8 @@ _copyForeignScan(const ForeignScan *from)
>      COPY_NODE_FIELD(fdw_recheck_quals);
>      COPY_BITMAPSET_FIELD(fs_relids);
>      COPY_SCALAR_FIELD(fsSystemCol);
> +    if (methods && methods->nodeCopy)
> +        methods->nodeCopy((Node *) newnode, (const Node *) from);

I wonder if we shouldn't instead "recurse" into a generic routine for
extensible nodes here.
> +void
> +RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods)
> +{
> +    uint32            hash;
> +    int                index;
> +    ListCell       *lc;
> +    MemoryContext    oldcxt;
> +
> +    if (!xnodes_methods_slots)
> +        xnodes_methods_slots = MemoryContextAllocZero(TopMemoryContext,
> +                                          sizeof(List *) * XNODES_NUM_SLOTS);
> +
> +    hash = hash_any(methods->extnodename, strlen(methods->extnodename));
> +    index = hash % XNODES_NUM_SLOTS;
> +
> +    /* duplication check */
> +    foreach (lc, xnodes_methods_slots[index])
> +    {
> +        const ExtensibleNodeMethods *curr = lfirst(lc);
> +
> +        if (strcmp(methods->extnodename, curr->extnodename) == 0)
> +            ereport(ERROR,
> +                    (errcode(ERRCODE_DUPLICATE_OBJECT),
> +                     errmsg("ExtensibleNodeMethods \"%s\" already exists",
> +                            methods->extnodename)));
> +    }
> +    /* ok, register this entry */
> +    oldcxt = MemoryContextSwitchTo(TopMemoryContext);
> +    xnodes_methods_slots[index] = lappend(xnodes_methods_slots[index],
> +                                          (void *)methods);
> +    MemoryContextSwitchTo(oldcxt);
> +}

Why aren't you using dynahash here, and instead reimplement a hashtable?


>  static ForeignScan *
>  _readForeignScan(void)
>  {
> +    const ExtensibleNodeMethods *methods;
>      READ_LOCALS(ForeignScan);
>  
>      ReadCommonScan(&local_node->scan);
>  
> +    READ_STRING_FIELD(extnodename);
>      READ_OID_FIELD(fs_server);
>      READ_NODE_FIELD(fdw_exprs);
>      READ_NODE_FIELD(fdw_private);
> @@ -1804,6 +1806,19 @@ _readForeignScan(void)
>      READ_BITMAPSET_FIELD(fs_relids);
>      READ_BOOL_FIELD(fsSystemCol);
>  
> +    methods = GetExtensibleNodeMethods(local_node->extnodename, true);
> +    if (methods && methods->nodeRead)
> +    {
> +        ForeignScan       *local_temp = palloc0(methods->node_size);
> +
> +        Assert(methods->node_size >= sizeof(ForeignScan));
> +
> +        memcpy(local_temp, local_node, sizeof(ForeignScan));
> +        methods->nodeRead((Node *) local_temp);
> +
> +        pfree(local_node);
> +        local_node = local_temp;
> +    }
>      READ_DONE();
>  }

This imo pretty clearly shows why the 'extensible node' handling should
be delegated to separate routines.


I wonder if we shouldn't, before committing this, at least write a PoC
implementation of actual extensible nodes. I.e. new node types that are
registered at runtime by extensions. ISTM those are relatively closely
linked, and the above doesn't yet support them nicely afaics.

Greetings,

Andres Freund



Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

От
Robert Haas
Дата:
On Wed, Dec 2, 2015 at 4:06 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-12-02 08:52:20 +0000, Kouhei Kaigai wrote:
>> diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
>> index 26264cb..c4bb76e 100644
>> --- a/src/backend/nodes/copyfuncs.c
>> +++ b/src/backend/nodes/copyfuncs.c
>> @@ -635,8 +635,12 @@ _copyWorkTableScan(const WorkTableScan *from)
>>  static ForeignScan *
>>  _copyForeignScan(const ForeignScan *from)
>>  {
>> -     ForeignScan *newnode = makeNode(ForeignScan);
>> -
>> +     const ExtensibleNodeMethods *methods
>> +             = GetExtensibleNodeMethods(from->extnodename, true);
>> +     ForeignScan *newnode = (ForeignScan *) newNode(!methods
>> +                                                                                                ?
sizeof(ForeignScan)
>> +                                                                                                :
methods->node_size,
>> +                                                                                                T_ForeignScan);
>
> Changing the FDW API seems to put this squarely into 9.6
> territory. Agreed?

I don't think anybody thought this patch was 9.5 material.  I didn't,
anyway.  The FDW changes for the join-pushdown-EPQ problem are another
issue, but that can be discussed on the other thread.

>>       /*
>>        * copy node superclass fields
>>        */
>> @@ -645,6 +649,7 @@ _copyForeignScan(const ForeignScan *from)
>>       /*
>>        * copy remainder of node
>>        */
>> +     COPY_STRING_FIELD(extnodename);
>>       COPY_SCALAR_FIELD(fs_server);
>>       COPY_NODE_FIELD(fdw_exprs);
>>       COPY_NODE_FIELD(fdw_private);
>> @@ -652,6 +657,8 @@ _copyForeignScan(const ForeignScan *from)
>>       COPY_NODE_FIELD(fdw_recheck_quals);
>>       COPY_BITMAPSET_FIELD(fs_relids);
>>       COPY_SCALAR_FIELD(fsSystemCol);
>> +     if (methods && methods->nodeCopy)
>> +             methods->nodeCopy((Node *) newnode, (const Node *) from);
>
> I wonder if we shouldn't instead "recurse" into a generic routine for
> extensible nodes here.

I don't know what that means.

>> +void
>> +RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods)
>> +{
>> +     uint32                  hash;
>> +     int                             index;
>> +     ListCell           *lc;
>> +     MemoryContext   oldcxt;
>> +
>> +     if (!xnodes_methods_slots)
>> +             xnodes_methods_slots = MemoryContextAllocZero(TopMemoryContext,
>> +                                                                               sizeof(List *) * XNODES_NUM_SLOTS);
>> +
>> +     hash = hash_any(methods->extnodename, strlen(methods->extnodename));
>> +     index = hash % XNODES_NUM_SLOTS;
>> +
>> +     /* duplication check */
>> +     foreach (lc, xnodes_methods_slots[index])
>> +     {
>> +             const ExtensibleNodeMethods *curr = lfirst(lc);
>> +
>> +             if (strcmp(methods->extnodename, curr->extnodename) == 0)
>> +                     ereport(ERROR,
>> +                                     (errcode(ERRCODE_DUPLICATE_OBJECT),
>> +                                      errmsg("ExtensibleNodeMethods \"%s\" already exists",
>> +                                                     methods->extnodename)));
>> +     }
>> +     /* ok, register this entry */
>> +     oldcxt = MemoryContextSwitchTo(TopMemoryContext);
>> +     xnodes_methods_slots[index] = lappend(xnodes_methods_slots[index],
>> +                                                                               (void *)methods);
>> +     MemoryContextSwitchTo(oldcxt);
>> +}
>
> Why aren't you using dynahash here, and instead reimplement a hashtable?

I had thought we might assume that the number of extensible nodes was
small enough that we could use an array for this and just use linear
search.  But if we want to keep the door open to having enough
extensible nodes that this would be inefficient, then I agree that
dynahash is better than a hand-rolled hash table.

>>  static ForeignScan *
>>  _readForeignScan(void)
>>  {
>> +     const ExtensibleNodeMethods *methods;
>>       READ_LOCALS(ForeignScan);
>>
>>       ReadCommonScan(&local_node->scan);
>>
>> +     READ_STRING_FIELD(extnodename);
>>       READ_OID_FIELD(fs_server);
>>       READ_NODE_FIELD(fdw_exprs);
>>       READ_NODE_FIELD(fdw_private);
>> @@ -1804,6 +1806,19 @@ _readForeignScan(void)
>>       READ_BITMAPSET_FIELD(fs_relids);
>>       READ_BOOL_FIELD(fsSystemCol);
>>
>> +     methods = GetExtensibleNodeMethods(local_node->extnodename, true);
>> +     if (methods && methods->nodeRead)
>> +     {
>> +             ForeignScan        *local_temp = palloc0(methods->node_size);
>> +
>> +             Assert(methods->node_size >= sizeof(ForeignScan));
>> +
>> +             memcpy(local_temp, local_node, sizeof(ForeignScan));
>> +             methods->nodeRead((Node *) local_temp);
>> +
>> +             pfree(local_node);
>> +             local_node = local_temp;
>> +     }
>>       READ_DONE();
>>  }
>
> This imo pretty clearly shows why the 'extensible node' handling should
> be delegated to separate routines.

This does look ugly.  I'm not sure off-hand how to fix it.

> I wonder if we shouldn't, before committing this, at least write a PoC
> implementation of actual extensible nodes. I.e. new node types that are
> registered at runtime by extensions. ISTM those are relatively closely
> linked, and the above doesn't yet support them nicely afaics.

How is what you have in mind different from the pgstrom patch KaiGai attached?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company