Re: [PATCH] Add transforms feature

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: [PATCH] Add transforms feature
Дата
Msg-id 20140404222147.GB13431@awork2.anarazel.de
обсуждение исходный текст
Ответ на Re: [PATCH] Add transforms feature  (Peter Eisentraut <peter_e@gmx.net>)
Ответы Re: [PATCH] Add transforms feature  (Andres Freund <andres@2ndquadrant.com>)
Re: [PATCH] Add transforms feature  (Peter Eisentraut <peter_e@gmx.net>)
Список pgsql-hackers
On 2014-01-15 21:13:18 -0500, Peter Eisentraut wrote:
> The attached patch will probably fail to apply because of the pg_proc
> changes.  So if you want to try it out, look into the header for the Git
> hash it was based off.  I'll produce a properly merged version when this
> approach is validated.  Also, some implementation details could probably
> take some revising after a couple of nights of sleep. ;-)

You've slept since? ;)

So, I am only doign a look through the patch, to see where it has gone
in the past year.

> index 4e476c3..5ac9f05 100644
> --- a/src/Makefile.shlib
> +++ b/src/Makefile.shlib
> @@ -133,7 +133,7 @@ ifeq ($(PORTNAME), darwin)
>    else
>      # loadable module
>      DLSUFFIX        = .so
> -    LINK.shared        = $(COMPILER) -bundle -multiply_defined suppress
> +    LINK.shared        = $(COMPILER) -bundle -multiply_defined suppress -Wl,-undefined,dynamic_lookup
>    endif
>    BUILD.exports        = $(AWK) '/^[^\#]/ {printf "_%s\n",$$1}' $< >$@
>    exports_file        = $(SHLIB_EXPORTS:%.txt=%.list)

Hm. Do the linkers on other platforms support this behaviour? Linux
does, by default, but I have zap clue about the rest.

Why do we need this? I guess because a transform from e.g. hstore to $pl
needs symbols out of hstore.so and the $pl?

I wonder if it woudln't be better to rely on explicit symbol lookups for
this. That'd avoid the need for the order dependency and linker changes
like this.

> +            case OBJECT_TRANSFORM:
> +                {
> +                    TypeName   *typename = (TypeName *) linitial(objname);
> +                    char       *langname = (char *) linitial(objargs);
> +                    Oid            type_id = typenameTypeId(NULL, typename);
> +                    Oid            lang_id = get_language_oid(langname, false);
> +
> +                    address.classId = TransformRelationId;
> +                    address.objectId =
> +                        get_transform_oid(type_id, lang_id, missing_ok);
> +                    address.objectSubId = 0;
> +                }
> +                break;

Hm. I wonder if missing_ok should be forwarded to get_language_oid() and
(by changing the way things are done atm) to typenameTypeId?

> +        case OCLASS_TRANSFORM:
> +            {
> +                HeapTuple    trfTup;
> +                Form_pg_transform trfForm;
> +
> +                trfTup = SearchSysCache1(TRFOID,
> +                                          ObjectIdGetDatum(object->objectId));
> +                if (!HeapTupleIsValid(trfTup))
> +                    elog(ERROR, "could not find tuple for transform %u",
> +                         object->objectId);
> +
> +                trfForm = (Form_pg_transform) GETSTRUCT(trfTup);
> +
> +                appendStringInfo(&buffer, _("transform for %s language %s"),
> +                                 format_type_be(trfForm->trftype),
> +                                 get_language_name(trfForm->trflang, false));
> +
> +                ReleaseSysCache(trfTup);
> +                break;
> +            }
> +

Why deviate from the usual 'cache lookup failed for ..'? elog doesn't
translate so it's not particular relevant, but ...

>      referenced.objectSubId = 0;
>      recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);
>  
> +    /* dependency on transform used by return type, if any */
> +    if ((trfid = get_transform_oid(returnType, languageObjectId, true)))
> +    {
> +        referenced.classId = TransformRelationId;
> +        referenced.objectId = trfid;
> +        referenced.objectSubId = 0;
> +        recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);
> +    }
> +

Should be compared to InvalidOid imo, rather than implicitly assuming
that InvalidOid evaluates to false.

> +/*
> + * CREATE TRANSFORM
> + */
> +Oid
> +CreateTransform(CreateTransformStmt *stmt)
> +{
...
> +    if (!pg_type_ownercheck(typeid, GetUserId()))
> +        aclcheck_error_type(ACLCHECK_NOT_OWNER, typeid);
> +
> +    aclresult = pg_type_aclcheck(typeid, GetUserId(), ACL_USAGE);
> +    if (aclresult != ACLCHECK_OK)
> +        aclcheck_error_type(aclresult, typeid);
> +
> +    /*
> +     * Get the language
> +     */
> +    langid = get_language_oid(stmt->lang, false);
> +
> +    aclresult = pg_language_aclcheck(langid, GetUserId(), ACL_USAGE);
> +    if (aclresult != ACLCHECK_OK)
> +        aclcheck_error(aclresult, ACL_KIND_LANGUAGE, stmt->lang);

Hm. Is USAGE really sufficient here? Should we possibly make it
dependant on lanpltrusted like CreateFunction() does?

> +    if (stmt->fromsql)
> +    {
> +        fromsqlfuncid = LookupFuncNameTypeNames(stmt->fromsql->funcname, stmt->fromsql->funcargs, false);
> +
> +        if (!pg_proc_ownercheck(fromsqlfuncid, GetUserId()))
> +            aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_PROC, NameListToString(stmt->fromsql->funcname));

Why isn't EXECUTE sufficient here?

> +        aclresult = pg_proc_aclcheck(fromsqlfuncid, GetUserId(), ACL_EXECUTE);
> +        if (aclresult != ACLCHECK_OK)
> +            aclcheck_error(aclresult, ACL_KIND_PROC, NameListToString(stmt->fromsql->funcname));
> +
> +        tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(fromsqlfuncid));
> +        if (!HeapTupleIsValid(tuple))
> +            elog(ERROR, "cache lookup failed for function %u", fromsqlfuncid);
> +        procstruct = (Form_pg_proc) GETSTRUCT(tuple);
> +        if (procstruct->prorettype != INTERNALOID)
> +            ereport(ERROR,
> +                    (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> +                     errmsg("return data type of FROM SQL function must be \"internal\"")));
> +        check_transform_function(procstruct);
> +        ReleaseSysCache(tuple);

So, this can be used to call a function that takes INTERNAL, and returns
INTERNAL. Isn't that normally reserved for superusers? I think this at
the very least needs to be an ownercheck on the function?

> @@ -66,6 +66,7 @@ CATALOG(pg_proc,1255) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81) BKI_SCHEMA_MACRO
>      text        proargnames[1]; /* parameter names (NULL if no names) */
>      pg_node_tree proargdefaults;/* list of expression trees for argument
>                                   * defaults (NULL if none) */
> +    Oid            protrftypes[1]    /* types for which to apply transforms */
>      text        prosrc;            /* procedure source text */
>      text        probin;            /* secondary procedure info (can be NULL) */
>      text        proconfig[1];    /* procedure-local GUC settings */

I wonder if this shouldn't rather be a array that lists the transform to
be used for every single column akin to proargtypes or such. That's
going to make life easier for pl developers.
>  /**********************************************************************
> @@ -1260,6 +1264,7 @@ static SV  *plperl_call_perl_func(plperl_proc_desc *desc,
>                     bool *isnull)
>  {
>      FmgrInfo    tmp;
> +    Oid            funcid;
>  
>      /* we might recurse */
>      check_stack_depth();
> @@ -1283,6 +1288,8 @@ static SV  *plperl_call_perl_func(plperl_proc_desc *desc,
>          /* must call typinput in case it wants to reject NULL */
>          return InputFunctionCall(finfo, NULL, typioparam, typmod);
>      }
> +    else if ((funcid = get_transform_tosql(typid, current_call_data->prodesc->lang_oid)))
> +        return OidFunctionCall1(funcid, PointerGetDatum(sv));
>      else if (SvROK(sv))
>      {
>          /* handle references */

Am I missing something here? You're not looking at the proc's
transforms, but just lookup the general ones? Same for output and such.

Looks like you forgot to update this.


>      for (i = 0; i < desc->nargs; i++)
>      {
>          if (fcinfo->argnull[i])
> @@ -2055,9 +2075,16 @@ static SV  *plperl_call_perl_func(plperl_proc_desc *desc,
>          else
>          {
>              SV           *sv;
> +            Oid            funcid;
>  
>              if (OidIsValid(desc->arg_arraytype[i]))
>                  sv = plperl_ref_from_pg_array(fcinfo->arg[i], desc->arg_arraytype[i]);
> +            else if (list_member_oid(current_call_data->prodesc->trftypes, argtypes[i]))
> +            {
> +                funcid = get_transform_fromsql(argtypes[i], current_call_data->prodesc->lang_oid);
> +                Assert(funcid); // TODO
> +                sv = (SV *) DatumGetPointer(OidFunctionCall1(funcid, fcinfo->arg[i]));
> +            }

This is the behaviour I'd really would like to avoid. Searching an array
for every parameter sucks. I think this really should be a vector with
one element per argument. Yes, that's going to require special handling
of the return type, but whatever.


So, I lost my motiviation here. I like this version *much* more than the
state of a year ago. I think there's a fair amount of work required
here, but it seems to be dilligence that's required, not redesign. But I
still don't think that's doable within the next couple of days?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Stark
Дата:
Сообщение: Idea for aggregates
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [review] PostgreSQL Service on Windows does not start if data directory given is relative path