Обсуждение: deparsing utility commands

Поиск

Список

Период

Сортировка

deparsing utility commands

От

Alvaro Herrera

Дата:

15 февраля 2015 г., 07:48:26

Hi,

This is a repost of the patch to add CREATE command deparsing support to
event triggers.  It now supports not only CREATE but also ALTER and
other commands such as GRANT/REVOKE, COMMENT ON and SECURITY LABEL.
This patch series is broken up in a rather large number of patches, but
my intention would be to commit separately only the first 6 patches; the
remaining parts are split up only so that it's easy to see how deparsing
each patch is implemented, and would be committed as a single changeset
(but before that I need a bunch of extra fixes and additions to other
parts.)

One of the worries regarding this code is whether we will be able to
keep it up to date as the grammar changes.  We have come up with a
testing harness for this that's pretty simple: we will have a new make
target in src/test/regress that creates a new pg_regress schedule file,
then run pg_regress with it.  That schedule contains a "deparse
initialization" test, then the contents of serial_schedule, then a
final deparse test.  The initialization test creates a table and an
event trigger, and stores all deparsed commands in the table; the final
deparse test is about executing those commands.  (There are actually two
event triggers; the other one stores DROP commands for objects dropped,
using the sql_drop event.)  The deparse_init.sql file is attached.

Just by storing the deparsed command in the table we're doing the first
level of deparse testing: make sure that all commands deparse to
something.  Running the commands is the second level: make sure that the
commands generated by the deparsing step works correctly.  (The third
level is making sure that each command generated by deparse creates an
object idential to the one generated by the original command.  The plan
here is to run two databases through pg_dump and compare.)
One line of defense which I just tought about is that instead of
sprinkling EventTriggerStashCommand() calls all over ProcessUtilitySlow,
we should add only one at the bottom.

There are some issues remaining:

1. ruleutils.c seems to deparse temp schemas as pg_temp_X instead of
just pg_temp; so some commands like CREATE VIEW..AS SELECT FROM temp_table
fail in an ugly way.  Maybe the solution is to tell ruleutils that the
temp schema is always visible; or maybe the solution is to tell it that
it's spelled pg_temp instead.

2. I need to complete support for all object types in
getObjectIdentityParts.  The ones not implemented cause the generated
DROP commands to fail.

3. Some commands are still not supported by the deparsing code:
      1 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER TABLE OPTIONS (...)
      2 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER TABLE RESET
      2 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER TABLE SET OPTIONS
      2 + WARNING:  state: XX000 errm: unimplemented deparse of CREATE FOREIGN TABLE
      2 + WARNING:  state: XX000 errm: unimplemented deparse of CREATE LANGUAGE
      3 + WARNING:  hash indexes are not WAL-logged and their use is discouraged
      3 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER TABLE ALTER COLUMN OPTIONS
      3 + WARNING:  state: XX000 errm: unimplemented deparse of CREATE TABLE AS EXECUTE
      4 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER TABLE ALTER CONSTRAINT
      4 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER TABLE TYPE USING
      4 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER TEXT SEARCH CONFIGURATION
      4 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER USER MAPPING
      5 + WARNING:  state: XX000 errm: unimplemented deparse of CREATE OPERATOR CLASS
      5 + WARNING:  state: XX000 errm: unsupported deparse of ALTER FUNCTION SET
      5 + WARNING:  state: XX000 errm: unsupported deparse of CREATE SERVER ... OPTIONS
      7 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER FOREIGN DATA WRAPPER
      7 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER SERVER
      8 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER POLICY
      9 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER TABLE SET
     10 + WARNING:  state: XX000 errm: unimplemented deparse of CREATE FOREIGN DATA WRAPPER
     11 + WARNING:  state: XX000 errm: can't recreate text search configuration with mappings
     12 + WARNING:  state: XX000 errm: unimplemented deparse of CREATE USER MAPPING
     14 + WARNING:  state: XX000 errm: unimplemented deparse of ALTER OPERATOR FAMILY

I expect most of these are simple to implement.  The only one that seems
to present a challenge is ALTER OPERATOR FAMILY: it needs to pass back
the OID of all the added/removed operator class members.


--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Stephen Frost wrote:
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> > > > Now, we probably don't want to hack *all* the utility commands to return
> > > > ObjectAddress instead of OID, because it many cases that's just not
> > > > going to be convenient (not to speak of the code churn); so I think for
> > > > most objtypes the ProcessUtilitySlow stanza would look like this:
> >
> > > That'd be fine with me, though for my 2c, I wouldn't object to changing
> > > them all to return ObjectAddress either.  I agree that it'd cause a fair
> > > bit of code churn to do so, but there's a fair bit of code churn
> > > happening here anyway (looking at what 0008 does to ProcessUtilitySlow
> > > anyway).
> >
> > Well, that would make my life easier I think (even if it's a bit more
> > work), so unless there are objections I will do things this way.  It's a
> > bit of a pity that Robert and Dimitri went to huge lengths to have these
> > functions return OID and we're now changing it all to ObjAddress
> > instead, but oh well.
>
> Not sure that I see it as that huge a deal..  They're still returning an
> Oid, it's just embedded in the ObjAddress to provide a complete
> statement of what the object is.

I've been looking at this idea some more.  There's some churn, but it's
not that bad.  For instance, here's the patch to have RENAME return
ObjectAddress rather than OIDs.

For instance, the get_objtype_catalog_id() function, provided in a later
patch in the series, will no longer be necessary; instead, each
execution code path in the src/backend/command code must set the correct
catalog ID in the ObjectAddress it returns.  So the event triggers code
will have the catalog ID at the ready, without having to figure it out
from the ObjectType it gets from the parse node.

> btw, the hunk in 0026 which adds a 'break;' into standard_ProcessUtility
> caught me by surprise.  Looks like that 'break;' was missing from 0003
> (for T_GrantStmt).

Thanks for pointing this out -- that was broken rebase.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

0001-Have-RENAME-routines-return-ObjectAddress-rather-tha.patch

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

19 февраля 2015 г., 23:14:42

Stephen Frost wrote:

> > > > Yes, I will push these unless somebody objects soon, as they seem
> > > > perfectly reasonable to me.  The only troubling thing is that there is
> > > > no regression test for this kind of thing in event triggers (i.e. verify
> > > > which command tags get support and which don't), which seems odd to me.
> > > > Not these patches's fault, though, so I'm not considering adding any ATM.
> > >
> > > Ugh.  I dislike that when we say an event trigger will fire before
> > > 'GRANT' what we really mean is "GRANT when it's operating against a
> > > local object".  At the minimum we absolutely need to be very clear in
> > > the documentation about that limitation.  Perhaps there is something
> > > already which reflects that, but I don't see anything in the patch
> > > being added about that.
> >
> > Hmm, good point, will give this some thought.  I'm thinking perhaps we
> > can add a table of which object types are supported for generic commands
> > such as GRANT, COMMENT and SECURITY LABEL.
>
> That sounds like a good idea.

Here's a patch.  I noticed that the introduction to event trigger
already says they only act on local objects:

    The ddl_command_start event occurs just before the execution of
    a CREATE, ALTER, DROP, SECURITY LABEL, COMMENT, GRANT or REVOKE
    command. No check whether the affected object exists or doesn't
    exist is performed before the event trigger fires. As an
    exception, however, this event does not occur for DDL commands
    targeting shared objects — databases, roles, and tablespaces —
    or for commands targeting event triggers themselves.

So adding more text to the same effect would be repetitive.  I added a
sixth column "Notes" to the table of supported command tags vs. events,
with the text "Only for local objects" next to the four commands being
added here.

I think it's fair to push this patch as is.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

0001-Support-more-commands-in-event-triggers.patch

Re: deparsing utility commands

От

Andres Freund

Дата:

21 февраля 2015 г., 15:13:30

On 2015-02-19 17:14:57 -0300, Alvaro Herrera wrote:
>     The ddl_command_start event occurs just before the execution of
>     a CREATE, ALTER, DROP, SECURITY LABEL, COMMENT, GRANT or REVOKE
>     command. No check whether the affected object exists or doesn't
>     exist is performed before the event trigger fires. As an
>     exception, however, this event does not occur for DDL commands
>     targeting shared objects — databases, roles, and tablespaces —
>     or for commands targeting event triggers themselves.
>
> So adding more text to the same effect would be repetitive.  I added a
> sixth column "Notes" to the table of supported command tags vs. events,
> with the text "Only for local objects" next to the four commands being
> added here.

I think that's sufficient for now. We might want to expand the handling
of 'global objects' topic sometime in the future, but for now I think
what you have is clear enough.

> I think it's fair to push this patch as is.

+1


Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: deparsing utility commands

От

Andres Freund

Дата:

21 февраля 2015 г., 15:22:38

On 2015-02-19 14:39:27 -0300, Alvaro Herrera wrote:
> diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
> index d899dd7..2bbc15d 100644
> --- a/src/backend/catalog/objectaddress.c
> +++ b/src/backend/catalog/objectaddress.c
> @@ -531,6 +531,12 @@ ObjectTypeMap[] =
>      { "policy", OBJECT_POLICY }
>  };
>  
> +ObjectAddress InvalidObjectAddress =
> +{
> +    InvalidOid,
> +    InvalidOid,
> +    0
> +};

Maybe mark it as a constant? Then it can live in some readonly section.

> +extern ObjectAddress InvalidObjectAddress;
> +
> +#define initObjectAddress(addr, class_id, object_id) \
> +    do { \
> +        (addr).classId = (class_id); \
> +        (addr).objectId = (object_id); \
> +        (addr).objectSubId = 0; \
> +    } while (0)
> +
> +#define initObjectAddressSub(addr, class_id, object_id, object_sub_id) \
> +    do { \
> +        (addr).classId = (class_id); \
> +        (addr).objectId = (object_id); \
> +        (addr).objectSubId = (object_sub_id); \
> +    } while (0)
> +

Maybe, based on some precedent, make those ObjectAddressSet(Sub)?() -
the init bit in initObjectAddress sounds to me like like it does more
than setting a couple member variables.

I'd also be inclined to make initObjectAddress use initObjectAddressSub,
but that's reaching the realm of pedantism ;)


To me the change sounds like a good idea - drop/rename are already
handled somewhat generically, and expanding that to the return value for
renames sounds like a natural progression to me.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: deparsing utility commands

От

Andres Freund

Дата:

21 февраля 2015 г., 19:30:41

Hi,

On 2015-02-15 01:48:15 -0300, Alvaro Herrera wrote:
> One line of defense which I just tought about is that instead of
> sprinkling EventTriggerStashCommand() calls all over ProcessUtilitySlow,
> we should add only one at the bottom.

Doesn't sound like a bad idea, but I'm not sure whether it's realistic
given that some commands do multiple EventTriggerStashCommand()s?

> 1. ruleutils.c seems to deparse temp schemas as pg_temp_X instead of
> just pg_temp; so some commands like CREATE VIEW..AS SELECT FROM temp_table
> fail in an ugly way.  Maybe the solution is to tell ruleutils that the
> temp schema is always visible; or maybe the solution is to tell it that
> it's spelled pg_temp instead.

Hm. I'm not immediately following why that's happening for deparsing but
not for the normal get_viewdef stuff?

> From d03a8edcefd8f0ea1c46b315c446f96c61146a85 Mon Sep 17 00:00:00 2001
> From: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Date: Wed, 24 Sep 2014 15:53:04 -0300
> Subject: [PATCH 07/42] deparse: infrastructure needed for command deparsing
>
> This patch introduces unused infrastructure for command deparsing.
> There are three main parts:

+ "as of yet"?

> 1. A list (stash) of executed commands in Node format, stored in the
> event trigger state.  At ddl_command_end, the stored items can be
> extracted.  For now this only support "basic" commands (in particular
> not ALTER TABLE or GRANT).  It's useful enough to cover all CREATE
> command as well as many simple ALTER forms.

seems outdated.

(yea, I know you want to fold the patch anyway).

> +/* XXX merge this with ObjectTypeMap? */
>  static event_trigger_support_data event_trigger_support[] = {
>      {"AGGREGATE", true},
>      {"CAST", true},

Hm, I'd just drop the XXX for now.

> @@ -1060,6 +1066,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
>          case OBJECT_CAST:
>          case OBJECT_COLUMN:
>          case OBJECT_COLLATION:
> +        case OBJECT_COMPOSITE:
>          case OBJECT_CONVERSION:
>          case OBJECT_DEFAULT:
>          case OBJECT_DOMAIN:
> @@ -1088,6 +1095,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
>          case OBJECT_TSPARSER:
>          case OBJECT_TSTEMPLATE:
>          case OBJECT_TYPE:
> +        case OBJECT_USER_MAPPING:
>          case OBJECT_VIEW:
>              return true;
>      }

Should those two be broken out and just committed?

> @@ -1193,14 +1201,6 @@ EventTriggerBeginCompleteQuery(void)
>      EventTriggerQueryState *state;
>      MemoryContext cxt;
>
> -    /*
> -     * Currently, sql_drop and table_rewrite events are the only reason to
> -     * have event trigger state at all; so if there are none, don't install
> -     * one.
> -     */
> -    if (!trackDroppedObjectsNeeded())
> -        return false;
> -
>      cxt = AllocSetContextCreate(TopMemoryContext,
>                                  "event trigger state",
>                                  ALLOCSET_DEFAULT_MINSIZE,
> @@ -1211,7 +1211,7 @@ EventTriggerBeginCompleteQuery(void)
>      slist_init(&(state->SQLDropList));
>      state->in_sql_drop = false;
>      state->table_rewrite_oid = InvalidOid;
> -
> +    state->stash = NIL;
>      state->previous = currentEventTriggerState;
>      currentEventTriggerState = state;

Hm. I'm not clear why we don't check for other types of event triggers
in EventTriggerBeginCompleteQuery and skip the work otherwise. If we
just enable them all the time - which imo is ok given how they're split
of in the normal process utility - we should remove the boolean return
value from Begin/CompleteQuery and drop
 * Note: it's an error to call this routine if EventTriggerBeginCompleteQuery
 * returned false previously.
from EndCompleteQuery.

> +/*
> + * EventTriggerStashCommand
> + *         Save data about a simple DDL command that was just executed
> + */
> +void
> +EventTriggerStashCommand(Oid objectId, uint32 objectSubId, ObjectType objtype,
> +                         Oid secondaryOid, Node *parsetree)
> +{
> +    MemoryContext oldcxt;
> +    StashedCommand *stashed;
> +
> +    oldcxt = MemoryContextSwitchTo(currentEventTriggerState->cxt);
> +
> +    stashed = palloc(sizeof(StashedCommand));
> +
> +    stashed->type = SCT_Simple;
> +    stashed->in_extension = creating_extension;
> +
> +    stashed->d.simple.objectId = objectId;
> +    stashed->d.simple.objtype = objtype;
> +    stashed->d.simple.objectSubId = objectSubId;
> +    stashed->d.simple.secondaryOid = secondaryOid;
> +    stashed->parsetree = copyObject(parsetree);
> +
> +    currentEventTriggerState->stash = lappend(currentEventTriggerState->stash,
> +                                              stashed);
> +
> +    MemoryContextSwitchTo(oldcxt);
> +}
>
It's a bit wierd that the drop list is managed using a slist, but the
stash isn't...


> +Datum
> +pg_event_trigger_get_creation_commands(PG_FUNCTION_ARGS)
> +{

> +    foreach(lc, currentEventTriggerState->stash)
> +    {

> +        command = deparse_utility_command(cmd);
> +
> +        /*
> +         * Some parse trees return NULL when deparse is attempted; we don't
> +         * emit anything for them.
> +         */
> +        if (command != NULL)
> +        {
> +            Datum        values[9];
> +            bool        nulls[9];
> +            ObjectAddress addr;
> +            int            i = 0;
> +
> +            MemSet(nulls, 0, sizeof(nulls));
> +
> +            if (cmd->type == SCT_Simple)
> +            {
> +                Oid            classId;
> +                Oid            objId;
> +                uint32        objSubId;
> +                const char *tag;
> +                char       *identity;
> +                char       *type;
> +                char       *schema = NULL;
> +
> +                if (cmd->type == SCT_Simple)
> +                {
> +                    classId = get_objtype_catalog_oid(cmd->d.simple.objtype);
> +                    objId = cmd->d.simple.objectId;
> +                    objSubId = cmd->d.simple.objectSubId;
> +                }
> +
> +                tag = CreateCommandTag(cmd->parsetree);
> +                addr.classId = classId;
> +                addr.objectId = objId;
> +                addr.objectSubId = objSubId;
> +
> +                type = getObjectTypeDescription(&addr);
> +                identity = getObjectIdentity(&addr);
> +
> +                /*
> +                 * Obtain schema name, if any ("pg_temp" if a temp object)
> +                 */
> +                if (is_objectclass_supported(addr.classId))
> +                {
> +                    AttrNumber    nspAttnum;
> +
> +                    nspAttnum = get_object_attnum_namespace(addr.classId);
> +                    if (nspAttnum != InvalidAttrNumber)
> +                    {
> +                        Relation    catalog;
> +                        HeapTuple    objtup;
> +                        Oid            schema_oid;
> +                        bool        isnull;
> +
> +                        catalog = heap_open(addr.classId, AccessShareLock);
> +                        objtup = get_catalog_object_by_oid(catalog,
> +                                                           addr.objectId);
> +                        if (!HeapTupleIsValid(objtup))
> +                            elog(ERROR, "cache lookup failed for object %u/%u",
> +                                 addr.classId, addr.objectId);
> +                        schema_oid = heap_getattr(objtup, nspAttnum,
> +                                                  RelationGetDescr(catalog), &isnull);
> +                        if (isnull)
> +                            elog(ERROR, "invalid null namespace in object %u/%u/%d",
> +                                 addr.classId, addr.objectId, addr.objectSubId);
> +                        if (isAnyTempNamespace(schema_oid))
> +                            schema = pstrdup("pg_temp");
> +                        else
> +                            schema = get_namespace_name(schema_oid);
> +
> +                        heap_close(catalog, AccessShareLock);
> +                    }
> +                }

Hm. How do we get here for !is_objectclass_supported()?

> +/*
> + * Allocate a new object tree to store parameter values -- varargs version.
> + *
> + * The "fmt" argument is used to append as a "fmt" element in the output blob.
> + * numobjs indicates the number of extra elements to append; for each one, a
> + * name (string), type (from the ObjType enum) and value must be supplied.  The
> + * value must match the type given; for instance, ObjTypeInteger requires an
> + * int64, ObjTypeString requires a char *, ObjTypeArray requires a list (of
> + * ObjElem), ObjTypeObject requires an ObjTree, and so on.  Each element type *
> + * must match the conversion specifier given in the format string, as described
> + * in pg_event_trigger_expand_command, q.v.
> + *
> + * Note we don't have the luxury of sprintf-like compiler warnings for
> + * malformed argument lists.
> + */
> +static ObjTree *
> +new_objtree_VA(char *fmt, int numobjs,...)
> +{
> +    ObjTree    *tree;
> +    va_list        args;
> +    int            i;
> +
> +    /* Set up the toplevel object and its "fmt" */
> +    tree = new_objtree();
> +    append_string_object(tree, "fmt", fmt);
> +
> +    /* And process the given varargs */
> +    va_start(args, numobjs);
> +    for (i = 0; i < numobjs; i++)
> +    {
> +        ObjTree    *value;
> +        ObjType        type;
> +        ObjElem       *elem;
> +        char       *name;
> +        char       *strval;
> +        bool        boolval;
> +        List       *list;
> +        int64        number;
> +        float8        fnumber;
> +
> +        name = va_arg(args, char *);
> +        type = va_arg(args, ObjType);
> +
> +        /* Null params don't have a value (obviously) */
> +        if (type == ObjTypeNull)
> +        {
> +            append_null_object(tree, name);
> +            continue;
> +        }
> +
> +        /*
> +         * For all other param types there must be a value in the varargs.
> +         * Fetch it and add the fully formed subobject into the main object.
> +         */
> +        switch (type)
> +        {
> +            case ObjTypeBool:
> +                boolval = va_arg(args, int);
> +                elem = new_bool_object(boolval);
> +                break;
> +            case ObjTypeString:
> +                strval = va_arg(args, char *);
> +                elem = new_string_object(strval);
> +                break;
> +            case ObjTypeObject:
> +                value = va_arg(args, ObjTree *);
> +                elem = new_object_object(value);
> +                break;
> +            case ObjTypeArray:
> +                list = va_arg(args, List *);
> +                elem = new_array_object(list);
> +                break;
> +            case ObjTypeInteger:
> +                number = va_arg(args, int64);
> +                elem = new_integer_object(number);
> +                break;
> +            case ObjTypeFloat:
> +                fnumber = va_arg(args, double);
> +                elem = new_float_object(fnumber);
> +                break;
> +            default:
> +                elog(ERROR, "invalid parameter type %d", type);
> +        }
> +
> +        elem->name = name;
> +        append_premade_object(tree, elem);
> +    }
> +
> +    va_end(args);
> +    return tree;
> +}

Would look nicer if boolval et al weren't declared - I'd just pass the
va_arg() return value directly to new_*_object().

Attached as 0001.

Additionally I find the separate handling of ObjTypeNull not so
nice. 0002.

> +/*
> + * A helper routine to setup %{}T elements.
> + */
> +static ObjTree *
> +new_objtree_for_type(Oid typeId, int32 typmod)
> +{

> +    append_bool_object(typeParam, "is_array", is_array);

Maybe name that one typarray?

> +/*
> + * Handle deparsing of simple commands.
> + *
> + * This function contains a large switch that mirrors that in
> + * ProcessUtilitySlow.  All cases covered there should also be covered here.
> + */
> +static ObjTree *
> +deparse_simple_command(StashedCommand *cmd)
> +{
> +
> +        case T_CommentStmt:
> +            command = NULL;
> +            break;
> +
> +        case T_GrantStmt:
> +            /* handled elsewhere */
> +            elog(ERROR, "unexpected command type %s", CreateCommandTag(parsetree));
> +            break;
...
> +        case T_SecLabelStmt:
> +            elog(ERROR, "unimplemented deparse of %s", CreateCommandTag(parsetree));
> +            break;
> +    }
> +
> +    return command;
> +}

It is not clear to me why some commands error out and other don't. It
makes sense to me to error out for things like GrantStmt that shouldn't
end up here, but...  I guess that's just an artifact of the patch series
because you add the handling for the non elog()ing commands later?

I think this should use ereport() in some cases if we're not going to
support some commands for now.

> +/*
> + * Given a utility command parsetree and the OID of the corresponding object,
> + * return a JSON representation of the command.
> + *
> + * The command is expanded fully, so that there are no ambiguities even in the
> + * face of search_path changes.
> + */
> +char *
> +deparse_utility_command(StashedCommand *cmd)
> +{
> +    OverrideSearchPath *overridePath;
> +    MemoryContext    oldcxt;
> +    MemoryContext    tmpcxt;
> +    ObjTree           *tree;
> +    char           *command;
> +    StringInfoData  str;
> +
> +    /*
> +     * Allocate everything done by the deparsing routines into a temp context,
> +     * to avoid having to sprinkle them with memory handling code; but allocate
> +     * the output StringInfo before switching.
> +     */
> +    initStringInfo(&str);
> +    tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
> +                                   "deparse ctx",
> +                                   ALLOCSET_DEFAULT_MINSIZE,
> +                                   ALLOCSET_DEFAULT_INITSIZE,
> +                                   ALLOCSET_DEFAULT_MAXSIZE);
> +    oldcxt = MemoryContextSwitchTo(tmpcxt);
> +
> +    /*
> +     * Many routines underlying this one will invoke ruleutils.c functionality
> +     * in order to obtain deparsed versions of expressions.  In such results,
> +     * we want all object names to be qualified, so that results are "portable"
> +     * to environments with different search_path settings.  Rather than inject
> +     * what would be repetitive calls to override search path all over the
> +     * place, we do it centrally here.
> +     */
> +    overridePath = GetOverrideSearchPath(CurrentMemoryContext);
> +    overridePath->schemas = NIL;
> +    overridePath->addCatalog = false;
> +    overridePath->addTemp = false;
> +    PushOverrideSearchPath(overridePath);

Ah, the 'addTemp = false' probably is the answer to my question above
about view deparsing adding a fully qualified pg_temp...


> +/*------
> + * Returns a formatted string from a JSON object.
> + *
> + * The starting point is the element named "fmt" (which must be a string).
> + * This format string may contain zero or more %-escapes, which consist of an
> + * element name enclosed in { }, possibly followed by a conversion modifier,
> + * followed by a conversion specifier.    Possible conversion specifiers are:
> + *
> + * %        expand to a literal %.
> + * I        expand as a single, non-qualified identifier
> + * D        expand as a possibly-qualified identifier
> + * T        expand as a type name
> + * O        expand as an operator name
> + * L        expand as a string literal (quote using single quotes)
> + * s        expand as a simple string (no quoting)
> + * n        expand as a simple number (no quoting)
> + *
> + * The element name may have an optional separator specification preceded
> + * by a colon.    Its presence indicates that the element is expected to be
> + * an array; the specified separator is used to join the array elements.
> + *------

I think this documentation should at least be referred to from
comments in deparse_utility.c. I was wondering where it is.

> diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
> index 6fbe283..a842ef2 100644
> --- a/src/backend/utils/adt/jsonb.c
> +++ b/src/backend/utils/adt/jsonb.c
> @@ -416,7 +416,7 @@ JsonbToCString(StringInfo out, JsonbContainer *in, int estimated_len)
>  {
>      bool        first = true;
>      JsonbIterator *it;
> -    int            type = 0;
> +    JsonbIteratorToken type = WJB_DONE;
>      JsonbValue    v;
>      int            level = 0;
>      bool        redo_switch = false;
> @@ -498,7 +498,7 @@ JsonbToCString(StringInfo out, JsonbContainer *in, int estimated_len)
>                  first = false;
>                  break;
>              default:
> -                elog(ERROR, "unknown flag of jsonb iterator");
> +                elog(ERROR, "unknown jsonb iterator token type");
>          }
>      }

Huh?


I think this infrastructure pretty desperately requires some higher
level overview. Like how are we going from the node tree, to the object
tree, to jsonb, to actual DDL. I think I understand, but it took me a
while. Doesn't have to be long and super detailed...

> From dcb353c8c9bd93778a62719ff8bf32b9a419e16d Mon Sep 17 00:00:00 2001
> From: Alvaro Herrera <alvherre@alvh.no-ip.org> Date: Thu, 25 Sep 2014
> 15:54:00 -0300 Subject: [PATCH 09/42] deparse: Support CREATE TYPE AS

> +static ObjTree *
> +deparse_ColumnDef(Relation relation, List *dpcontext, bool composite,
> +                  ColumnDef *coldef)
> +{

> +    if (!composite)
> +    {
> +        /*
> +         * Emit a NOT NULL declaration if necessary.  Note that we cannot trust
> +         * pg_attribute.attnotnull here, because that bit is also set when
> +         * primary keys are specified; and we must not emit a NOT NULL
> +         * constraint in that case, unless explicitely specified.  Therefore,
> +         * we scan the list of constraints attached to this column to determine
> +         * whether we need to emit anything.
> +         * (Fortunately, NOT NULL constraints cannot be table constraints.)
> +         */

Is the primary key bit really a problem? To me it sounds like it's
actually good to retain a NOT NULL in that case. Note that pg_dump
actually *does* emit a NOT NULL here, even if you drop the primary
key. Since the column behaves as NOT NULL and is dumped as such it seems
like a good idea to fully treat it as such.

>    deparse: Support CREATE SCHEMA/TABLE/SEQUENCE/INDEX/TRIGGER

> +/*
> + * deparse_CreateTrigStmt
> + *        Deparse a CreateTrigStmt (CREATE TRIGGER)
> + *
> + * Given a trigger OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + */
> +static ObjTree *
> +deparse_CreateTrigStmt(Oid objectId, Node *parsetree)
> +{
> +    CreateTrigStmt *node = (CreateTrigStmt *) parsetree;
> +    Relation    pg_trigger;
> +    HeapTuple    trigTup;
> +    Form_pg_trigger trigForm;
> +    ObjTree       *trigger;
> +    ObjTree       *tmp;
> +    int            tgnargs;
> +    List       *list;
> +    List       *events;
> +
> +    pg_trigger = heap_open(TriggerRelationId, AccessShareLock);
> +
> +    trigTup = get_catalog_object_by_oid(pg_trigger, objectId);
> +    trigForm = (Form_pg_trigger) GETSTRUCT(trigTup);
> +
> +    /*
> +     * Some of the elements only make sense for CONSTRAINT TRIGGERs, but it
> +     * seems simpler to use a single fmt string for both kinds of triggers.
> +     */
> +    trigger =
> +        new_objtree_VA("CREATE %{constraint}s TRIGGER %{name}I %{time}s %{events: OR }s "
> +                       "ON %{relation}D %{from_table}s %{constraint_attrs: }s "
> +                       "FOR EACH %{for_each}s %{when}s EXECUTE PROCEDURE %{function}s",
> +                       2,
> +                       "name", ObjTypeString, node->trigname,
> +                       "constraint", ObjTypeString,
> +                       node->isconstraint ? "CONSTRAINT" : "");
> +
> +    if (node->timing == TRIGGER_TYPE_BEFORE)
> +        append_string_object(trigger, "time", "BEFORE");
> +    else if (node->timing == TRIGGER_TYPE_AFTER)
> +        append_string_object(trigger, "time", "AFTER");
> +    else if (node->timing == TRIGGER_TYPE_INSTEAD)
> +        append_string_object(trigger, "time", "INSTEAD OF");
> +    else
> +        elog(ERROR, "unrecognized trigger timing value %d", node->timing);
> +
> +    /*
> +     * Decode the events that the trigger fires for.  The output is a list;
> +     * in most cases it will just be a string with the even name, but when
> +     * there's an UPDATE with a list of columns, we return a JSON object.
> +     */
> +    events = NIL;
> +    if (node->events & TRIGGER_TYPE_INSERT)
> +        events = lappend(events, new_string_object("INSERT"));
> +    if (node->events & TRIGGER_TYPE_DELETE)
> +        events = lappend(events, new_string_object("DELETE"));
> +    if (node->events & TRIGGER_TYPE_TRUNCATE)
> +        events = lappend(events, new_string_object("TRUNCATE"));
> +    if (node->events & TRIGGER_TYPE_UPDATE)
> +    {
> +        if (node->columns == NIL)
> +        {
> +            events = lappend(events, new_string_object("UPDATE"));
> +        }
> +        else
> +        {
> +            ObjTree       *update;
> +            ListCell   *cell;
> +            List       *cols = NIL;
> +
> +            /*
> +             * Currently only UPDATE OF can be objects in the output JSON, but
> +             * we add a "kind" element so that user code can distinguish
> +             * possible future new event types.
> +             */
> +            update = new_objtree_VA("UPDATE OF %{columns:, }I",
> +                                    1, "kind", ObjTypeString, "update_of");
> +
> +            foreach(cell, node->columns)
> +            {
> +                char   *colname = strVal(lfirst(cell));
> +
> +                cols = lappend(cols,
> +                               new_string_object(colname));
> +            }
> +
> +            append_array_object(update, "columns", cols);
> +
> +            events = lappend(events,
> +                             new_object_object(update));
> +        }
> +    }
> +    append_array_object(trigger, "events", events);
> +
> +    tmp = new_objtree_for_qualname_id(RelationRelationId,
> +                                      trigForm->tgrelid);
> +    append_object_object(trigger, "relation", tmp);
> +
> +    tmp = new_objtree_VA("FROM %{relation}D", 0);
> +    if (trigForm->tgconstrrelid)
> +    {
> +        ObjTree       *rel;
> +
> +        rel = new_objtree_for_qualname_id(RelationRelationId,
> +                                          trigForm->tgconstrrelid);
> +        append_object_object(tmp, "relation", rel);
> +    }
> +    else
> +        append_bool_object(tmp, "present", false);
> +    append_object_object(trigger, "from_table", tmp);
> +
> +    list = NIL;
> +    if (node->deferrable)
> +        list = lappend(list,
> +                       new_string_object("DEFERRABLE"));
> +    if (node->initdeferred)
> +        list = lappend(list,
> +                       new_string_object("INITIALLY DEFERRED"));

I mildly wonder if DEFERRABLE/INITIALLY DEFERRED shouldn't instead have
an 'is_present' style representation.

> +/*
> + * deparse_CreateStmt
> + *        Deparse a CreateStmt (CREATE TABLE)

I really wish CreateStmt were named CreateTableStmt. I don't know how
often that tripped me over, let alone everyone. Yes, that's an unrelated
rant ;)

> + * Given a table OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + */
> +static ObjTree *
> +deparse_CreateStmt(Oid objectId, Node *parsetree)
> +{
...
> +    tmp = new_objtree_VA("TABLESPACE %{tablespace}I", 0);
> +    if (node->tablespacename)
> +        append_string_object(tmp, "tablespace", node->tablespacename);
> +    else
> +    {
> +        append_null_object(tmp, "tablespace");
> +        append_bool_object(tmp, "present", false);
> +    }
> +    append_object_object(createStmt, "tablespace", tmp);

Hm. I had not thought about that before, but given that
default_tablespace exists, we should maybe somehow indicate which one
was chosen?

> +static ObjElem *
> +deparse_Seq_OwnedBy(ObjTree *parent, Oid sequenceId)
> +{
> +    ObjTree    *ownedby = NULL;
> +    Relation    depRel;
> +    SysScanDesc scan;
> +    ScanKeyData keys[3];
> +    HeapTuple    tuple;
> +
> +    depRel = heap_open(DependRelationId, AccessShareLock);
> +    ScanKeyInit(&keys[0],
> +                Anum_pg_depend_classid,
> +                BTEqualStrategyNumber, F_OIDEQ,
> +                ObjectIdGetDatum(RelationRelationId));
> +    ScanKeyInit(&keys[1],
> +                Anum_pg_depend_objid,
> +                BTEqualStrategyNumber, F_OIDEQ,
> +                ObjectIdGetDatum(sequenceId));
> +    ScanKeyInit(&keys[2],
> +                Anum_pg_depend_objsubid,
> +                BTEqualStrategyNumber, F_INT4EQ,
> +                Int32GetDatum(0));
> +
> +    scan = systable_beginscan(depRel, DependDependerIndexId, true,
> +                              NULL, 3, keys);
> +    while (HeapTupleIsValid(tuple = systable_getnext(scan)))
> +    {
> +        Oid            ownerId;
> +        Form_pg_depend depform;
> +        ObjTree    *tmp;
> +        char       *colname;
> +
> +        depform = (Form_pg_depend) GETSTRUCT(tuple);
> +
> +        /* only consider AUTO dependencies on pg_class */
> +        if (depform->deptype != DEPENDENCY_AUTO)
> +            continue;
> +        if (depform->refclassid != RelationRelationId)
> +            continue;
> +        if (depform->refobjsubid <= 0)
> +            continue;
> +
> +        ownerId = depform->refobjid;
> +        colname = get_attname(ownerId, depform->refobjsubid);
> +        if (colname == NULL)
> +            continue;
> +
> +        tmp = new_objtree_for_qualname_id(RelationRelationId, ownerId);
> +        append_string_object(tmp, "attrname", colname);
> +        ownedby = new_objtree_VA("OWNED BY %{owner}D",
> +                                 2,
> +                                 "clause", ObjTypeString, "owned",
> +                                 "owner", ObjTypeObject, tmp);
> +    }
> +
> +    systable_endscan(scan);
> +    relation_close(depRel, AccessShareLock);
> +
> +    /*
> +     * If there's no owner column, emit an empty OWNED BY element, set up so
> +     * that it won't print anything.
> +     */

"column"? Should have been row?

> +    if (!ownedby)
> +        /* XXX this shouldn't happen ... */
> +        ownedby = new_objtree_VA("OWNED BY %{owner}D",
> +                                 3,
> +                                 "clause", ObjTypeString, "owned",
> +                                 "owner", ObjTypeNull,
> +                                 "present", ObjTypeBool, false);

Why shouldn't this happen? Free standing sequences are perfectly normal?
And OWNED BY NONE will need to be deparseable.

> +static ObjTree *
> +deparse_CreateSeqStmt(Oid objectId, Node *parsetree)
> +{
> +    ObjTree    *createSeq;
> +    ObjTree    *tmp;
> +    Relation    relation = relation_open(objectId, AccessShareLock);
> +    Form_pg_sequence seqdata;
> +    List       *elems = NIL;
> +
> +    seqdata = get_sequence_values(objectId);
> +
> +    createSeq =
> +        new_objtree_VA("CREATE %{persistence}s SEQUENCE %{identity}D "
> +                       "%{definition: }s",
> +                       1,
> +                       "persistence", ObjTypeString,
> +                       get_persistence_str(relation->rd_rel->relpersistence));
> +
> +    tmp = new_objtree_for_qualname(relation->rd_rel->relnamespace,
> +                                   RelationGetRelationName(relation));
> +    append_object_object(createSeq, "identity", tmp);
> +
> +    /* definition elements */
> +    elems = lappend(elems, deparse_Seq_Cache(createSeq, seqdata));
> +    elems = lappend(elems, deparse_Seq_Cycle(createSeq, seqdata));
> +    elems = lappend(elems, deparse_Seq_IncrementBy(createSeq, seqdata));
> +    elems = lappend(elems, deparse_Seq_Minvalue(createSeq, seqdata));
> +    elems = lappend(elems, deparse_Seq_Maxvalue(createSeq, seqdata));
> +    elems = lappend(elems, deparse_Seq_Startwith(createSeq, seqdata));
> +    elems = lappend(elems, deparse_Seq_Restart(createSeq, seqdata));
> +    /* we purposefully do not emit OWNED BY here */

Needs explanation about the reason. Hm. I don't think it's actually
correct. Right now the following won't be deparsed correctly:

CREATE TABLE seqowner(id int);
CREATE SEQUENCE seqowned OWNED BY seqowner.id;
that generates
CREATE  TABLE  public.seqowner (id pg_catalog.int4   )  WITH (oids=OFF)
CREATE  SEQUENCE public.seqowned CACHE 1 NO CYCLE INCREMENT BY 1 MINVALUE 1 MAXVALUE 9223372036854775807 START WITH 1
RESTART1 


> +/*
> + * deparse_IndexStmt
> + *        deparse an IndexStmt
> + *
> + * Given an index OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + *
> + * If the index corresponds to a constraint, NULL is returned.
> + */
> +static ObjTree *
> +deparse_IndexStmt(Oid objectId, Node *parsetree)
> +{

> +    /* tablespace */
> +    tmp = new_objtree_VA("TABLESPACE %{tablespace}s", 0);
> +    if (tablespace)
> +        append_string_object(tmp, "tablespace", tablespace);
> +    else
> +        append_bool_object(tmp, "present", false);
> +    append_object_object(indexStmt, "tablespace", tmp);

Same default tablespace question as with create table...


> From 4e35ba6557a6d04539d69d75821c568f9efb09b5 Mon Sep 17 00:00:00 2001
> From: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Date: Fri, 14 Feb 2014 19:04:08 -0300
> Subject: [PATCH 12/42] deparse: Support CREATE TYPE AS RANGE
>

>
> +static ObjTree *
> +deparse_CreateRangeStmt(Oid objectId, Node *parsetree)
> +{

> +    /* SUBTYPE_OPCLASS */
> +    if (OidIsValid(rangeForm->rngsubopc))
> +    {
> +        tmp = new_objtree_for_qualname_id(OperatorClassRelationId,
> +                                          rangeForm->rngsubopc);
> +        tmp = new_objtree_VA("SUBTYPE_OPCLASS = %{opclass}D",
> +                             2,
> +                             "clause", ObjTypeString, "opclass",
> +                             "opclass", ObjTypeObject, tmp);
> +        definition = lappend(definition, new_object_object(tmp));
> +    }
> +
> +    /* COLLATION */
> +    if (OidIsValid(rangeForm->rngcollation))
> +    {
> +        tmp = new_objtree_for_qualname_id(CollationRelationId,
> +                                          rangeForm->rngcollation);
> +        tmp = new_objtree_VA("COLLATION = %{collation}D",
> +                             2,
> +                             "clause", ObjTypeString, "collation",
> +                             "collation", ObjTypeObject, tmp);
> +        definition = lappend(definition, new_object_object(tmp));
> +    }
> +
> +    /* CANONICAL */
> +    if (OidIsValid(rangeForm->rngcanonical))
> +    {
> +        tmp = new_objtree_for_qualname_id(ProcedureRelationId,
> +                                          rangeForm->rngcanonical);
> +        tmp = new_objtree_VA("CANONICAL = %{canonical}D",
> +                             2,
> +                             "clause", ObjTypeString, "canonical",
> +                             "canonical", ObjTypeObject, tmp);
> +        definition = lappend(definition, new_object_object(tmp));
> +    }
> +
> +    /* SUBTYPE_DIFF */
> +    if (OidIsValid(rangeForm->rngsubdiff))
> +    {
> +        tmp = new_objtree_for_qualname_id(ProcedureRelationId,
> +                                          rangeForm->rngsubdiff);
> +        tmp = new_objtree_VA("SUBTYPE_DIFF = %{subtype_diff}D",
> +                             2,
> +                             "clause", ObjTypeString, "subtype_diff",
> +                             "subtype_diff", ObjTypeObject, tmp);
> +        definition = lappend(definition, new_object_object(tmp));
> +    }

Maybe use the present = false stuff here as well?

> From 04dc40db971dc51c7e8c646cb5d822488da306f7 Mon Sep 17 00:00:00 2001
> From: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Date: Fri, 21 Feb 2014 18:11:35 -0300
> Subject: [PATCH 13/42] deparse: Support CREATE EXTENSION

>  /*
> + * deparse_CreateExtensionStmt
> + *        deparse a CreateExtensionStmt
> + *
> + * Given an extension OID and the parsetree that created it, return the JSON
> + * blob representing the creation command.
> + *
> + * XXX the current representation makes the output command dependant on the
> + * installed versions of the extension.  Is this a problem?

I tend to think it's ok. Might be worthwhile to add the final extension
version in some unused attribute?


> From f45a9141905e93b83d28809e357f95c7fa713fe7 Mon Sep 17 00:00:00 2001
> From: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Date: Wed, 26 Feb 2014 17:26:55 -0300
> Subject: [PATCH 14/42] deparse: Support CREATE RULE

/me scheds some tears

> From 67ff742875bfadd182ae28e40e54476c9e4d220e Mon Sep 17 00:00:00 2001
> From: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Date: Fri, 25 Apr 2014 16:43:53 -0300
> Subject: [PATCH 16/42] deparse: Support for ALTER <OBJECT> RENAME
>
> It supports everything but functions, aggregates, operator classes and
> operator families.

AFAICS those are implemented, and you updated the description in git
since.
> +static ObjTree *
> +deparse_RenameStmt(Oid objectId, Node *parsetree)
> +{
> +    RenameStmt *node = (RenameStmt *) parsetree;
> +    ObjTree       *renameStmt;
> +    char       *fmtstr;
> +    Relation    relation;
> +    Oid            schemaId;
> +
> +    /*
> +     * FIXME --- this code is missing support for inheritance behavioral flags,
> +     * i.e. the "*" and ONLY elements.
> +     */

Hm. I think we probably need that?

> +     * XXX what if there's another event trigger running concurrently that
> +     * renames the schema or moves the object to another schema?  Seems
> +     * pretty far-fetched, but possible nonetheless.
> +     */

We should probably prohibit DDL from within event triggers?

> +    switch (node->renameType)
> +    {
> +        case OBJECT_COLLATION:
> +        case OBJECT_CONVERSION:
> +        case OBJECT_DOMAIN:
> +        case OBJECT_TSDICTIONARY:
> +        case OBJECT_TSPARSER:
> +        case OBJECT_TSTEMPLATE:
> +        case OBJECT_TSCONFIGURATION:
> +        case OBJECT_TYPE:
> +            {
> +                char       *identity;
> +                HeapTuple    objTup;
> +                Relation    catalog;
> +                Datum        objnsp;
> +                bool        isnull;
> +                Oid            classId = get_objtype_catalog_oid(node->renameType);
> +                AttrNumber    Anum_namespace = get_object_attnum_namespace(classId);
> +
> +                catalog = relation_open(classId, AccessShareLock);
> +                objTup = get_catalog_object_by_oid(catalog, objectId);
> +                objnsp = heap_getattr(objTup, Anum_namespace,
> +                                      RelationGetDescr(catalog), &isnull);
> +                if (isnull)
> +                    elog(ERROR, "invalid NULL namespace");
> +
> +                identity = psprintf("%s.%s", get_namespace_name(DatumGetObjectId(objnsp)),
> +                                    strVal(llast(node->object)));
> +                fmtstr = psprintf("ALTER %s %%{identity}s RENAME TO %%{newname}I",
> +                                  stringify_objtype(node->renameType));

Is that correct? I.e. will it work for a namespace that needs to be
quoted? There's a couple of these. Afaics they all should use
new_objtree_for_qualname and %{}D? I guess in some cases, like the type,
that could be slightly more complicated, but I guess it has to be done
nonetheless?

> +        case OBJECT_AGGREGATE:
> +        case OBJECT_FUNCTION:
> +            {
> +                char       *newident;
> +                ObjectAddress objaddr;
> +                const char       *quoted_newname;
> +                StringInfoData old_ident;
> +                char       *start;
> +
> +                /*
> +                 * Generating a function/aggregate identity is altogether too
> +                 * messy, so instead of doing it ourselves, we generate one for
> +                 * the renamed object, then strip out the name and replace it
> +                 * with the original name from the parse node.  This is so ugly
> +                 * that we don't dare do it for any other object kind.
> +                 */
> +
> +                objaddr.classId = get_objtype_catalog_oid(node->renameType);
> +                objaddr.objectId = objectId;
> +                objaddr.objectSubId = 0;
> +                newident = getObjectIdentity(&objaddr);
> +
> +                quoted_newname = quote_identifier(node->newname);
> +                start = strstr(newident, quoted_newname);
> +                if (!start)
> +                    elog(ERROR, "could not find %s in %s", start, newident);
> +                initStringInfo(&old_ident);
> +                appendBinaryStringInfo(&old_ident, newident, start - newident);
> +                appendStringInfoString(&old_ident,
> +                                       quote_identifier(strVal(llast(node->object))));
> +                appendStringInfoString(&old_ident, start + strlen(quoted_newname));
> +
> +                fmtstr = psprintf("ALTER %s %%{identity}s RENAME TO %%{newname}I",
> +                                  stringify_objtype(node->renameType));
> +                renameStmt = new_objtree_VA(fmtstr, 1,
> +                                            "identity", ObjTypeString, old_ident.data);

Yuck.

I think it'd be just as easy to split the argument output from the
function name output in format_procedure_internal, and make that
separately callable.


> Subject: [PATCH 18/42] deparse: Support CREATE FUNCTION

>  /*
> + * deparse_CreateFunctionStmt
> + *        Deparse a CreateFunctionStmt (CREATE FUNCTION)
> + *
> + * Given a function OID and the parsetree that created it, return the JSON
> + * blob representing the creation command.
> + *
> + * XXX this is missing the per-function custom-GUC thing.
> + */

Support attached as 0003.


> Subject: [PATCH 19/42] deparse: Support ALTER TABLE

> +/*
> + * EventTriggerStartRecordingSubcmds
> + *        Prepare to receive data on a complex DDL command about to be executed
> + *
> + * Note we don't actually stash the object we create here into the "stashed"
> + * list; instead we keep it in curcmd, and only when we're done processing the
> + * subcommands we will add it to the actual stash.
> + *
> + * FIXME -- this API isn't considering the possibility of an ALTER TABLE command
> + * being called reentrantly by an event trigger function.  Do we need stackable
> + * commands at this level?
> + */
> +void
> +EventTriggerComplexCmdStart(Node *parsetree, ObjectType objtype)
> +{
> +    MemoryContext    oldcxt;
> +    StashedCommand *stashed;
> +
> +    oldcxt = MemoryContextSwitchTo(currentEventTriggerState->cxt);
> +
> +    stashed = palloc(sizeof(StashedCommand));
> +
> +    stashed->type = SCT_AlterTable;
> +    stashed->in_extension = creating_extension;
> +
> +    stashed->d.alterTable.objectId = InvalidOid;
> +    stashed->d.alterTable.objtype = objtype;
> +    stashed->d.alterTable.subcmds = NIL;
> +    stashed->parsetree = copyObject(parsetree);
> +
> +    currentEventTriggerState->curcmd = stashed;
> +
> +    MemoryContextSwitchTo(oldcxt);
> +}

Hm. So +EventTriggerComplexCmdStart is hardcoded to use SCT_AlterTable? That's a bit odd.

> +/*
> + * EventTriggerEndRecordingSubcmds
> + *         Finish up saving a complex DDL command
> + *
> + * FIXME this API isn't considering the possibility that a xact/subxact is
> + * aborted partway through.  Probably it's best to add an
> + * AtEOSubXact_EventTriggers() to fix this.
> + */

I think that'd generally be better than requiring PG_CATCH.

> +static ObjTree *
> +deparse_AlterTableStmt(StashedCommand *cmd)
> +{

> +    foreach(cell, cmd->d.alterTable.subcmds)
> +    {
> +        StashedATSubcmd    *substashed = (StashedATSubcmd *) lfirst(cell);
> +        AlterTableCmd    *subcmd = (AlterTableCmd *) substashed->parsetree;
> +        ObjTree       *tree;
> +
> +        Assert(IsA(subcmd, AlterTableCmd));
> +
> +        switch (subcmd->subtype)
> +        {
> +            case AT_AddColumn:
> +            case AT_AddColumnRecurse:
> +                /* XXX need to set the "recurse" bit somewhere? */
> +                Assert(IsA(subcmd->def, ColumnDef));
> +                tree = deparse_ColumnDef(rel, dpcontext, false,
> +                                         (ColumnDef *) subcmd->def);
> +                tmp = new_objtree_VA("ADD COLUMN %{definition}s",
> +                                     2, "type", ObjTypeString, "add column",
> +                                     "definition", ObjTypeObject, tree);
> +                subcmds = lappend(subcmds, new_object_object(tmp));
> +                break;

Misses ONLY handling?

> +            case AT_AddIndex:
> +                {
> +                    Oid            idxOid = substashed->oid;
> +                    IndexStmt  *istmt;
> +                    Relation    idx;
> +                    const char *idxname;
> +                    Oid            constrOid;
> +
> +                    Assert(IsA(subcmd->def, IndexStmt));
> +                    istmt = (IndexStmt *) subcmd->def;
> +
> +                    if (!istmt->isconstraint)
> +                        break;
> +
> +                    idx = relation_open(idxOid, AccessShareLock);
> +                    idxname = RelationGetRelationName(idx);
> +
> +                    constrOid = get_relation_constraint_oid(
> +                        cmd->d.alterTable.objectId, idxname, false);
> +
> +                    tmp = new_objtree_VA("ADD CONSTRAINT %{name}I %{definition}s",
> +                                         3, "type", ObjTypeString, "add constraint",
> +                                         "name", ObjTypeString, idxname,
> +                                         "definition", ObjTypeString,
> +                                         pg_get_constraintdef_string(constrOid, false));
> +                    subcmds = lappend(subcmds, new_object_object(tmp));
> +
> +                    relation_close(idx, AccessShareLock);
> +                }
> +                break;

Hm, didn't you have a out of line version of this somewhere?


> +            case AT_AlterColumnType:
> +                {

> +                    /*
> +                     * Error out if the USING clause was used.  We cannot use
> +                     * it directly here, because it needs to run through
> +                     * transformExpr() before being usable for ruleutils.c, and
> +                     * we're not in a position to transform it ourselves.  To
> +                     * fix this problem, tablecmds.c needs to be modified to store
> +                     * the transformed expression somewhere in the StashedATSubcmd.
> +                     */
> +                    tmp2 = new_objtree_VA("USING %{expression}s", 0);
> +                    if (def->raw_default)
> +                        elog(ERROR, "unimplemented deparse of ALTER TABLE TYPE USING");
> +                    else
> +                        append_bool_object(tmp2, "present", false);
> +                    append_object_object(tmp, "using", tmp2);

Hm. This probably needs to be fixed :(.

> +                }
> +                break;
> +
> +            case AT_AlterColumnGenericOptions:
> +                elog(ERROR, "unimplemented deparse of ALTER TABLE ALTER COLUMN OPTIONS");
> +                break;

> +            case AT_SetRelOptions:
> +                elog(ERROR, "unimplemented deparse of ALTER TABLE SET");
> +                break;
> +
> +            case AT_ResetRelOptions:
> +                elog(ERROR, "unimplemented deparse of ALTER TABLE RESET");
> +                break;


> +            case AT_GenericOptions:
> +                elog(ERROR, "unimplemented deparse of ALTER TABLE OPTIONS (...)");
> +                break;
> +

Shouldn't be too hard...

Could you perhaps attach some string that can be searched for to all
commands that you think need to be implemented before being committable?

>                              else
>                              {
> -                                /* Recurse for anything else */
> +                                /*
> +                                 * Recurse for anything else.  If we need to do
> +                                 * so, "close" the current complex-command set,
> +                                 * and start a new one at the bottom; this is
> +                                 * needed to ensure the ordering of queued
> +                                 * commands is consistent with the way they are
> +                                 * executed here.
> +                                 */
> +                                EventTriggerComplexCmdEnd();
>                                  ProcessUtility(stmt,
>                                                 queryString,
>                                                 PROCESS_UTILITY_SUBCOMMAND,
>                                                 params,
>                                                 None_Receiver,
>                                                 NULL);
> +                                EventTriggerComplexCmdStart(parsetree, atstmt->relkind);
> +                                EventTriggerComplexCmdSetOid(relid);
>                              }

Hm. Commands are always ordered in a way this is ok?


> Subject: [PATCH 21/42] deparse: Support CREATE OPERATOR FAMILY

>  static ObjTree *
> +deparse_CreateOpFamily(Oid objectId, Node *parsetree)
> +{
> +    HeapTuple   opfTup;
> +    HeapTuple   amTup;
> +    Form_pg_opfamily opfForm;
> +    Form_pg_am  amForm;
> +    ObjTree       *copfStmt;
> +    ObjTree       *tmp;
> +
> +    opfTup = SearchSysCache1(OPFAMILYOID, ObjectIdGetDatum(objectId));
> +    if (!HeapTupleIsValid(opfTup))
> +        elog(ERROR, "cache lookup failed for operator family with OID %u", objectId);
> +    opfForm = (Form_pg_opfamily) GETSTRUCT(opfTup);
> +
> +    amTup = SearchSysCache1(AMOID, ObjectIdGetDatum(opfForm->opfmethod));
> +    if (!HeapTupleIsValid(amTup))
> +        elog(ERROR, "cache lookup failed for access method %u",
> +             opfForm->opfmethod);
> +    amForm = (Form_pg_am) GETSTRUCT(amTup);
> +
> +    copfStmt = new_objtree_VA("CREATE OPERATOR FAMILY %{identity}D USING %{amname}s",
> +                              0);
> +
> +    tmp = new_objtree_for_qualname(opfForm->opfnamespace,
> +                                   NameStr(opfForm->opfname));
> +    append_object_object(copfStmt, "identity", tmp);
> +    append_string_object(copfStmt, "amname", NameStr(amForm->amname));

Hm. Amname is unquoted? I can't believe this will ever really be a problem, but still.


> Subject: [PATCH 24/42] deparse: support ALTER THING OWNER TO
>  static ObjTree *
> +deparse_AlterOwnerStmt(Oid objectId, Node *parsetree)
> +{
> +    AlterOwnerStmt *node = (AlterOwnerStmt *) parsetree;
> +    ObjTree       *ownerStmt;
> +    ObjectAddress addr;
> +    char       *fmt;
> +
> +    fmt = psprintf("ALTER %s %%{identity}s OWNER TO %%{newname}I",
> +                   stringify_objtype(node->objectType));

newname sounds like copied from rename.


> Subject: [PATCH 25/42] deparse: Support ALTER EXTENSION / UPDATE TO

> Subject: [PATCH 26/42] deparse: Support GRANT/REVOKE

> +            else
> +            {
> +                Assert(cmd->type == SCT_Grant);
> +
> +                /* classid */
> +                nulls[i++] = true;
> +                /* objid */
> +                nulls[i++] = true;
> +                /* objsubid */
> +                nulls[i++] = true;

Might actually be nice to fill those out to the object that's being
changed. But it might end up being more work than justified, and it can
be easily added later. I guess it'd also mean we would have to generate
multiple GRANT/REVOKE commands.


> diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
> index c09915d..92bccf5 100644
> --- a/src/backend/tcop/utility.c
> +++ b/src/backend/tcop/utility.c
> @@ -781,6 +781,7 @@ standard_ProcessUtility(Node *parsetree,
>                  else
>                      ExecuteGrantStmt((GrantStmt *) parsetree);
>              }
> +            break;

Uh. Was that missing from an earlier commit?

>          case T_DropStmt:
>              {
> diff --git a/src/include/commands/event_trigger.h b/src/include/commands/event_trigger.h
> index cdce9e1..60b590b 100644
> --- a/src/include/commands/event_trigger.h
> +++ b/src/include/commands/event_trigger.h
> @@ -17,6 +17,7 @@
>  #include "catalog/objectaddress.h"
>  #include "catalog/pg_event_trigger.h"
>  #include "nodes/parsenodes.h"
> +#include "utils/aclchk.h"
>
>  typedef struct EventTriggerData
>  {
> @@ -62,6 +63,7 @@ extern void EventTriggerSQLDropAddObject(const ObjectAddress *object,
>
>  extern void EventTriggerStashCommand(Oid objectId, uint32 objectSubId,
>                           ObjectType objtype, Oid secondaryOid, Node *parsetree);
> +extern void EventTriggerStashGrant(InternalGrant *istmt);
>  extern void EventTriggerComplexCmdStart(Node *parsetree, ObjectType objtype);
>  extern void EventTriggerComplexCmdSetOid(Oid objectId);
>  extern void EventTriggerRecordSubcmd(Node *subcmd, Oid relid,
> diff --git a/src/include/tcop/deparse_utility.h b/src/include/tcop/deparse_utility.h
> index 8ebbf34..c364603 100644
> --- a/src/include/tcop/deparse_utility.h
> +++ b/src/include/tcop/deparse_utility.h
> @@ -14,6 +14,8 @@
>
>  #include "access/attnum.h"
>  #include "nodes/nodes.h"
> +#include "utils/aclchk.h"
> +
>
>  /*
>   * Support for keeping track of a command to deparse.
> @@ -26,7 +28,8 @@
>  typedef enum StashedCommandType
>  {
>      SCT_Simple,
> -    SCT_AlterTable
> +    SCT_AlterTable,
> +    SCT_Grant
>  } StashedCommandType;
>
>  /*
> @@ -61,6 +64,12 @@ typedef struct StashedCommand
>              ObjectType objtype;
>              List   *subcmds;
>          } alterTable;
> +
> +        struct GrantCommand
> +        {
> +            InternalGrant *istmt;
> +            const char *type;
> +        } grant;
>      } d;
>  } StashedCommand;
>
> diff --git a/src/include/utils/aclchk.h b/src/include/utils/aclchk.h
> new file mode 100644
> index 0000000..1ca7095
> --- /dev/null
> +++ b/src/include/utils/aclchk.h
> @@ -0,0 +1,45 @@
> +/*-------------------------------------------------------------------------
> + *
> + * aclchk.h
> + *
> + * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
> + * Portions Copyright (c) 1994, Regents of the University of California
> + *
> + * src/include/utils/aclchk.h

Maybe we should name it aclchk_internal.h?


> Subject: [PATCH 27/42] deparse: Support ALTER FUNCTION

> + * deparse_AlterFunctionStmt
> + *        Deparse a AlterFunctionStmt (ALTER FUNCTION)
> + *
> + * Given a function OID and the parsetree that created it, return the JSON
> + * blob representing the alter command.
> + *
> + * XXX this is missing the per-function custom-GUC thing.
> + */

Hm, so my earlier ptatch needs to partially be copied here. I'm running out of time now though...

> +static ObjTree *
> +deparse_AlterFunction(Oid objectId, Node *parsetree)
> +{

> +    foreach(cell, node->actions)
> +    {
> +        DefElem    *defel = (DefElem *) lfirst(cell);
> +        ObjTree       *tmp = NULL;
> +
> +        if (strcmp(defel->defname, "volatility") == 0)
> +        {
> +            tmp = new_objtree_VA(strVal(defel->arg), 0);
> +        }
> +        else if (strcmp(defel->defname, "strict") == 0)
> +        {
> +            tmp = new_objtree_VA(intVal(defel->arg) ?
> +                                 "RETURNS NULL ON NULL INPUT" :
> +                                 "CALLED ON NULL INPUT", 0);
> +        }

Ick. I wish there was NOT STRICT.


> Subject: [PATCH 28/42] deparse: support COMMENT ON
>

> +static ObjTree *
> +deparse_CommentOnConstraintSmt(Oid objectId, Node *parsetree)
> +{
> +    CommentStmt *node = (CommentStmt *) parsetree;
> +    ObjTree       *comment;
> +    HeapTuple    constrTup;
> +    Form_pg_constraint constrForm;
> +    char       *fmt;
> +    ObjectAddress addr;
> +
> +    Assert(node->objtype == OBJECT_TABCONSTRAINT || node->objtype == OBJECT_DOMCONSTRAINT);
> +
> +    if (node->comment)
> +        fmt = psprintf("COMMENT ON CONSTRAINT %%{identity}s ON %s%%{parentobj}s IS %%{comment}L",
> +                       node->objtype == OBJECT_TABCONSTRAINT ? "" : "DOMAIN ");
> +    else
> +        fmt = psprintf("COMMENT ON CONSTRAINT %%{identity}s ON %s%%{parentobj}s IS NULL",
> +                       node->objtype == OBJECT_TABCONSTRAINT ? "" : "DOMAIN ");

psprintf(...' IS %s', .... node->comment ? "%{comment}L" : "NULL") maybe?

> +
> +static ObjTree *
> +deparse_CommentStmt(Oid objectId, Oid objectSubId, Node *parsetree)
> +{

> +    if (node->comment)
> +        fmt = psprintf("COMMENT ON %s %%{identity}s IS %%{comment}L",
> +                       stringify_objtype(node->objtype));
> +    else
> +        fmt = psprintf("COMMENT ON %s %%{identity}s IS NULL",
> +                       stringify_objtype(node->objtype));

similarly here?

> +    comment = new_objtree_VA(fmt, 0);
> +    if (node->comment)
> +        append_string_object(comment, "comment", node->comment);

Or at leas tthis should be moved into the above if.

> Subject: [PATCH 29/42] deparse: support SECURITY LABEL

>  static ObjTree *
> +deparse_SecLabelStmt(Oid objectId, Oid objectSubId, Node *parsetree)
> +{
> +    SecLabelStmt *node = (SecLabelStmt *) parsetree;
> +    ObjTree       *label;
> +    ObjectAddress addr;
> +    char       *fmt;
> +
> +    if (node->label)
> +    {
> +        fmt = psprintf("SECURITY LABEL FOR %%{provider}s ON %s %%{identity}s IS %%{label}L",
> +                   stringify_objtype(node->objtype));
> +        label = new_objtree_VA(fmt, 0);
> +
> +        append_string_object(label, "label", node->label);
> +    }
> +    else
> +    {
> +        fmt = psprintf("SECURITY LABEL FOR %%{provider}s ON %s %%{identity}s IS NULL",
> +                   stringify_objtype(node->objtype));
> +        label = new_objtree_VA(fmt, 0);
> +    }

Should probably merged similarly.

"deparse: Handle default security provider." should be squashed into this.


> Subject: [PATCH 34/42] deparse: support CREATE TABLE AS

> +static ObjTree *
> +deparse_CreateTableAsStmt(Oid objectId, Node *parsetree)
> +{

> +    /*
> +     * Note that INSERT INTO is deparsed as CREATE TABLE AS.  They are
> +     * functionally equivalent synonyms so there is no harm from this.
> +     */
> +    if (node->relkind == OBJECT_MATVIEW)
> +        fmt = "CREATE %{persistence}s MATERIALIZED VIEW %{if_not_exists}s "
> +            "%{identity}D %{columns}s %{with}s %{tablespace}s "
> +            "AS %{query}s %{with_no_data}s";
> +    else
> +        fmt = "CREATE %{persistence}s TABLE %{if_not_exists}s "
> +            "%{identity}D %{columns}s %{with}s %{on_commit}s %{tablespace}s "
> +            "AS %{query}s %{with_no_data}s";

With my replication hat on, which I admit isn't very general in this
case, this isn't actually what I want... What I really would rather have
there is a normal CREATE TABLE statement that I can replay on the other
side. But I guess that has to be done somehow in a utility hook :(
> +    /* Add an ON COMMIT clause.  CREATE MATERIALIZED VIEW doesn't have one */
> +    if (node->relkind == OBJECT_TABLE)
> +    {
> +        tmp = new_objtree_VA("ON COMMIT %{on_commit_value}s", 0);
> +        switch (node->into->onCommit)
> +        {
> +            case ONCOMMIT_DROP:
> +                append_string_object(tmp, "on_commit_value", "DROP");
> +                break;
> +
> +            case ONCOMMIT_DELETE_ROWS:
> +                append_string_object(tmp, "on_commit_value", "DELETE ROWS");
> +                break;
> +
> +            case ONCOMMIT_PRESERVE_ROWS:
> +                append_string_object(tmp, "on_commit_value", "PRESERVE ROWS");
> +                break;
> +
> +            case ONCOMMIT_NOOP:
> +                append_null_object(tmp, "on_commit_value");
> +                append_bool_object(tmp, "present", false);
> +                break;
> +        }
> +        append_object_object(createStmt, "on_commit", tmp);
> +    }

That switch already exists somewhere else? Maybe add a function that
converts ONCOMMIT_* into the relevant string?





Ran out of energy here...

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

On 2015-02-21 17:30:24 +0100, Andres Freund wrote:

> >  /*
> > + * deparse_CreateFunctionStmt
> > + *        Deparse a CreateFunctionStmt (CREATE FUNCTION)
> > + *
> > + * Given a function OID and the parsetree that created it, return the JSON
> > + * blob representing the creation command.
> > + *
> > + * XXX this is missing the per-function custom-GUC thing.
> > + */

> > Subject: [PATCH 27/42] deparse: Support ALTER FUNCTION
>
> > + * deparse_AlterFunctionStmt
> > + *        Deparse a AlterFunctionStmt (ALTER FUNCTION)
> > + *
> > + * Given a function OID and the parsetree that created it, return the JSON
> > + * blob representing the alter command.
> > + *
> > + * XXX this is missing the per-function custom-GUC thing.
> > + */
>
> Hm, so my earlier ptatch needs to partially be copied here. I'm running out of time now though...

Updated 0003 attached that supports both SET for both CREATE and
ALTER. In my previous patch I'd looked at proconfig for the values. A
bit further thought revealed that that's not such a great idea: It works
well enough for CREATE FUNCTION, but already breaks down at CREATE OR
REPLACE FUNCTION unless we want to emit RESET ALL (SET ...)+ which seems
mighty ugly.

Also, the code is actually a good bit easier to understand this
way. I. hate. arrays. ;)

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

0003-Support-SET-RESET-in-CREATE-ALTER-FUNCTION.patch

Re: deparsing utility commands

От

Stephen Frost

Дата:

24 февраля 2015 г., 03:48:53

Andres,

* Andres Freund (andres@2ndquadrant.com) wrote:
> On 2015-02-21 14:51:32 -0500, Stephen Frost wrote:
> > It'd be *really* nice to be able to pass an object identifier to some
> > function and get back the CREATE (in particular, though perhaps DROP, or
> > whatever) command for it.  This gets us *awful* close to that without
> > actually giving it to us and that's bugging me.  The issue is the
> > parsetree, which I understand may be required in some cases (ALTER,
> > primairly, it seems?), but isn't always.
>
> > The USING and WITH CHECK quals can be extracted from the polForm also,
> > of course, it's what psql and pg_dump are doing, after all.
> >
> > So, why depend on the parsetree?  What about have another argument which
> > a user could use without the parsetree, for things that it's absolutely
> > required for today (eg: ALTER sub-command differentiation)?
>
> I'm really not wild about pretty massively expanding the scope at the
> eleventh hour.  There's a fair number of commands where this the
> deparsing command will give you a bunch of commands - look at CREATE
> TABLE and CREATE SCHEMA ... for the most extreme examples. For those
> there's no chance to do that without the parse tree available.

For my 2c, I don't agree with either of your assumptions above.  I don't
see this as a massive expansion of the scope, nor that we're in the 11th
hour at this point.  Agreed, it's the last commitfest, but we're near
the beginning of it and Alvaro has already made modifications to the
patch set, as is generally expected to happen for any patch in a
commitfest, without much trouble.  The changes which I'm suggesting are
nearly trivial for the larger body of work and only slightly more than
trivial for the more complicated pieces.

> Yes, it might be possible to use the same code for a bunch of minor
> commands, but not for the interesting/complex stuff.

We can clearly rebuild at least CREATE commands for all objects without
access to the parse tree, obviously pg_dump manages somehow.  I didn't
specify that a single command had to be used.  Further, the actual
deparse_CreateStmt doesn't look like it'd have a terribly hard time
producing something without access to the parsetree.

The whole premise of this approach is that we're using the results of
the catalog changes as the basis for what's needed to recreate the
command.  The places where we're referring to the parsetree look more
like a crutch of convenience than for any particular good reason to.
Consider this part of 0011 which includes deparse_CreateStmt:

/** Process table elements: column definitions and constraints.  Only* the column definitions are obtained from the
parsenode itself.  To* get constraints we rely on pg_constraint, because the parse node* might be missing some things
suchas the name of the constraints.*/ 

and then looking down through to deparse_ColumnDef, we see that the
ColumnDef passed in is used almost exclusivly for the *column name*
(which is used to look up the entry in pg_attribute..).  Looping over
the pg_attribute entries for the table in the attnum order would
certainly return the same result, or something has gone wrong.

Beyond the inheiritance consideration, the only other reference actually
leads to what you argued (correctly, in my view) was a mistake in the
code- where a NOT NULL is omitted for the primary key case.  Weren't you
also complaining about the 'OF type' form being a potential issue?
That'd also go away with the approach I'm advocating.  In short, I
suspect that if this approach had been taken originally, at least some
of the concerns and issued levied against the current implementation
wouldn't exist.  Thankfully, the way the code has been developed, the
majority of the code is general infrastructure and the changes I'm
suggesting are all in code which is simpler, thanks to that
infrastructure already being there.

All I'm suggesting is that we focus on collecting the information from
the catalog and avoid using the parsetree.  For at least the CREATE and
DROP cases, that should be entirely possible.  This does end up being a
bit different from the original goal, which was closer to "reproduce
exactly the command that was specified," but as shown above, that
probably wasn't what we ever really wanted.  The original command is
ambiguous on a number of levels and even where it isn't we can get the
canonical information we need straight from the catalog.

> > Maybe that's possible and maybe it isn't, but at least for the CREATE
> > cases we should be able to avoid forcing a user to provide a built
> > parsetree, and that'd be *really* nice and open up this feature to
> > quite a few other use-cases that I can think of.  I'd further suggest
> > that we provide these command to the SQL level, and then have a
> > wrapper which will take the name of an object, resolve it to Oid, and
> > then pass back the CREATE command for it.
>
> I think this is a different feature that might end up sharing some of
> the infrastructure, but not more.

I know you've looked through this code also and I really don't get the
comment that only "some" of this infrastructure would be shared.  As I
tried to point out, for the 'CREATE POLICY' case, a few minor
modifications would have it provide exactly what I'm suggesting and I'm
sure that most of the cases would be similar.  Simply looking through
the code with an eye towards avoiding the parsetree in favor of pulling
information from the catalog would illustrate the point I'm making, I
believe.

> > For as much commentary as there has been about event triggers and
> > replication and BDR, and about how this is going to end up being a
> > little-used side-feature, I'm amazed that there has been very little
> > discussion about how this would finally put into the backend the ability
> > to build up CREATE commands for objects and remove the need to use
> > pg_dump for that (something which isn't exactly fun for GUI applications
> > and other use-cases).
>
> I don't think it's all that related, so I'm not surprised. If you
> execute a CREATE TABLE(id serial primary key); you'll get a bunch of
> commands with this facility - it'd be much less clear what exactly shall
> be dumped in the case of some hypothetical GUI when you want to dump the
> table.

I really don't think it's all that strange or complicated to consider
and we've got a rather nice example of what a good approach would be.

This may only apply to the CREATE and DROP cases, but that's no small
thing.

Apologies for taking so long to reply, it wasn't intentional.
Unfortunately, I'm not going to have a lot of time tomorrow either but I
hope to find time later in the week to resume review and perhaps to
provide code to back the approach I'm advocating.
Thanks!
    Stephen

Re: deparsing utility commands

От

Andres Freund

Дата:

24 февраля 2015 г., 05:10:29

Hi,

On 2015-02-23 19:48:43 -0500, Stephen Frost wrote:
> > Yes, it might be possible to use the same code for a bunch of minor
> > commands, but not for the interesting/complex stuff.
> 
> We can clearly rebuild at least CREATE commands for all objects without
> access to the parse tree, obviously pg_dump manages somehow.

That's not the same. pg_dump recreates a static state, not running log
of changes that can be replayed.

> I didn't specify that a single command had to be used.

Well, if you want to get 'the create table statement' you'll need to
decide what you want. With ddl event triggers it's clear - something
that, when replayed, results in the same catalog contents. Such an easy
definition doesn't, as far as I know, exist for a set of functions you
seem to imagine.

> Further, the actual deparse_CreateStmt doesn't look like it'd have a
> terribly hard time producing something without access to the
> parsetree.

The interesting bit isn't the deparse_CreateStmt itself, but all the
stuff that's also triggered when you do a CREATE TABLE
nontrival-stuff;. utility.c will first transform the statement and then
run run and stash every created subcommand. And the subcommands will
*also* get deparsed.

Just think about what to do about CREATE TABLE foo(id serial primary
key, data text, bar_id REFERENCES foo.bar); - there's no way you can get
which other objects to dump from the catalog alone. What for a schema,
considering CREATE SCHEMA ... (schema_elements)+;?

Sure, individual subcommands could refer to the catalog instead of the
parse tree. But to get the whole thing you can't easily just refer to
it.

> All I'm suggesting is that we focus on collecting the information from
> the catalog and avoid using the parsetree.  For at least the CREATE and
> DROP cases, that should be entirely possible.

DROP's already in 9.4, the additions in 9.5 were more or less usability
things.  The problem generating DROPs is right now more identifying
which one you want to drop and checking the dependencies - the latter
I'm not sure how to do without actually executing the DROP.

> > > Maybe that's possible and maybe it isn't, but at least for the CREATE
> > > cases we should be able to avoid forcing a user to provide a built
> > > parsetree, and that'd be *really* nice and open up this feature to
> > > quite a few other use-cases that I can think of.  I'd further suggest
> > > that we provide these command to the SQL level, and then have a
> > > wrapper which will take the name of an object, resolve it to Oid, and
> > > then pass back the CREATE command for it.
> > 
> > I think this is a different feature that might end up sharing some of
> > the infrastructure, but not more.
> 
> I know you've looked through this code also and I really don't get the
> comment that only "some" of this infrastructure would be shared.  As I
> tried to point out, for the 'CREATE POLICY' case, a few minor
> modifications would have it provide exactly what I'm suggesting and I'm
> sure that most of the cases would be similar.  Simply looking through
> the code with an eye towards avoiding the parsetree in favor of pulling
> information from the catalog would illustrate the point I'm making, I
> believe.

I've no problem at all changing CREATE POLICY (and some other) deparse
functions to look more at the catalog than the command. What I don't see
is changing all of the create deparse functions, guarantee that they are
usable for getting the DDL of catalog objects and provide SQL
accessible infrastructure for that. That's a *massive* undertaking.

What I mean with 'sharing some of the infrastructure' is that I can see
a good portion of the deparse_* functions being reused for what you
desire.

But the decision about which of those to call will be an entirely
separate project. So is a whole new infrastructure to consider locking
and visibility (all the deparse stuff uses continualy evolving catalog
snapshots!) that it'll need as that's a problem the event trigger stuff
has much less to care about, because the objects are new rows and thus
can't be updated by other backends.

> > I don't think it's all that related, so I'm not surprised. If you
> > execute a CREATE TABLE(id serial primary key); you'll get a bunch of
> > commands with this facility - it'd be much less clear what exactly shall
> > be dumped in the case of some hypothetical GUI when you want to dump the
> > table.
> 
> I really don't think it's all that strange or complicated to consider
> and we've got a rather nice example of what a good approach would be.

Right. We got a *massive* program that solves dependencies and doesn't
do all that much useful/correct things if you only let it dump
individual objects. And that dumping of individual objects is what you
want... ;)

Greetings,

Andres Freund

Re: deparsing utility commands

От

Stephen Frost

Дата:

24 февраля 2015 г., 15:26:41

* Andres Freund (andres@2ndquadrant.com) wrote:
> Hi,
>
> On 2015-02-23 19:48:43 -0500, Stephen Frost wrote:
> > > Yes, it might be possible to use the same code for a bunch of minor
> > > commands, but not for the interesting/complex stuff.
> >
> > We can clearly rebuild at least CREATE commands for all objects without
> > access to the parse tree, obviously pg_dump manages somehow.
>
> That's not the same. pg_dump recreates a static state, not running log
> of changes that can be replayed.

There isn't anything particularly special about these function, that I
can see at least, when it comes to a 'running log' vs. 'static state'.

> > I didn't specify that a single command had to be used.
>
> Well, if you want to get 'the create table statement' you'll need to
> decide what you want. With ddl event triggers it's clear - something
> that, when replayed, results in the same catalog contents. Such an easy
> definition doesn't, as far as I know, exist for a set of functions you
> seem to imagine.

This is really an orthogonal consideration.  I'd draw a parallel to
CREATE TABLE .. LIKE, where you have to specify what you want.  Perhaps
that's reason enough for this initial version to not be exposed out to
psql, but it doesn't change the point that going based off of the
catalog and not the parsetree would facilitate this capability.
Requiring the parsetree precludes any progress in this direction.

> > Further, the actual deparse_CreateStmt doesn't look like it'd have a
> > terribly hard time producing something without access to the
> > parsetree.
>
> The interesting bit isn't the deparse_CreateStmt itself, but all the
> stuff that's also triggered when you do a CREATE TABLE
> nontrival-stuff;. utility.c will first transform the statement and then
> run run and stash every created subcommand. And the subcommands will
> *also* get deparsed.

Right, all of that is completely independent of deparse_CreateStmt and
there's nothing in your argument for why deparse_CreateStmt should get
access to or need the parsetree.  If I'm following correctly, your
argument is that we shouldn't care that deparse_CreateStmt gets the
parsetree because of how it's getting called and the argument I'm making
is that deparse_CreateStmt really doesn't need the parsetree and if it
wasn't required then deparse_CreateStmt could serve additional
use-cases.

> Just think about what to do about CREATE TABLE foo(id serial primary
> key, data text, bar_id REFERENCES foo.bar); - there's no way you can get
> which other objects to dump from the catalog alone. What for a schema,
> considering CREATE SCHEMA ... (schema_elements)+;?
>
> Sure, individual subcommands could refer to the catalog instead of the
> parse tree. But to get the whole thing you can't easily just refer to
> it.

I'm not entirely following this last sentence, but I'm encouragerd by
the agreement (if I understand correctly) that the subcommands could use
the catalog instead of the parsetree.  The question of if that makes
them useful for other callers is perhaps up for some debate, but they
certainly look valuable from here.  They may need to get extended to
have other options passed in (include defaults, include constraints,
etc) which then cause other sub-commands to be called (or perhaps
there's a larger function overall which is called and handles those
pieces and combines the results and the sub-command remains the same)
but all of that is good and interesting discussion which can happen if
we write these functions without the parsetree.  With the parsetree
requirement, we can't really even have those discussions.

> > All I'm suggesting is that we focus on collecting the information from
> > the catalog and avoid using the parsetree.  For at least the CREATE and
> > DROP cases, that should be entirely possible.
>
> DROP's already in 9.4, the additions in 9.5 were more or less usability
> things.  The problem generating DROPs is right now more identifying
> which one you want to drop and checking the dependencies - the latter
> I'm not sure how to do without actually executing the DROP.

I'm a bit confused by this as executing the DROP is going to follow the
entries in pg_depend, which we could do also..  What am I missing there?

> > > I think this is a different feature that might end up sharing some of
> > > the infrastructure, but not more.
> >
> > I know you've looked through this code also and I really don't get the
> > comment that only "some" of this infrastructure would be shared.  As I
> > tried to point out, for the 'CREATE POLICY' case, a few minor
> > modifications would have it provide exactly what I'm suggesting and I'm
> > sure that most of the cases would be similar.  Simply looking through
> > the code with an eye towards avoiding the parsetree in favor of pulling
> > information from the catalog would illustrate the point I'm making, I
> > believe.
>
> I've no problem at all changing CREATE POLICY (and some other) deparse
> functions to look more at the catalog than the command. What I don't see
> is changing all of the create deparse functions, guarantee that they are
> usable for getting the DDL of catalog objects and provide SQL
> accessible infrastructure for that. That's a *massive* undertaking.

Perhaps we can agree to start with the first sentence in this paragraph
then?  I'm not against waiting for the other pieces until we've had more
time to properly have that discussion.

> What I mean with 'sharing some of the infrastructure' is that I can see
> a good portion of the deparse_* functions being reused for what you
> desire.
>
> But the decision about which of those to call will be an entirely
> separate project. So is a whole new infrastructure to consider locking
> and visibility (all the deparse stuff uses continualy evolving catalog
> snapshots!) that it'll need as that's a problem the event trigger stuff
> has much less to care about, because the objects are new rows and thus
> can't be updated by other backends.

For my part, at least, I'd want my current backend to show the tables
based on what has already happened and not based on some point prior, so
either this is exactly what is wanted or I've misunderstood your point
here.  I realize that the same isn't exactly true for pg_dump, but
that's also not the use-case I was suggesting (though I do admit that it
might be interesting to figure out if there's a way for pg_dump to make
use of this, but that's an even farther afield discussion).

> > > I don't think it's all that related, so I'm not surprised. If you
> > > execute a CREATE TABLE(id serial primary key); you'll get a bunch of
> > > commands with this facility - it'd be much less clear what exactly shall
> > > be dumped in the case of some hypothetical GUI when you want to dump the
> > > table.
> >
> > I really don't think it's all that strange or complicated to consider
> > and we've got a rather nice example of what a good approach would be.
>
> Right. We got a *massive* program that solves dependencies and doesn't
> do all that much useful/correct things if you only let it dump
> individual objects. And that dumping of individual objects is what you
> want... ;)

"Dumping" as in pg_dump-style might be a bit beyond what I'm thinking
about at the moment.  Regardless, as I've tried to point out above, the
changes which I'm actually suggesting for this initial body of work are
just to avoid the parsetree and go based off of what the catalog has.
I'm hopeful that's a small enough and reasonable enough change to happen
during this commitfest.  I don't have any issue tabling the rest of the
discussion and future work based on that to a later time.
Thanks!
    Stephen

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

24 февраля 2015 г., 16:48:21

Stephen Frost wrote:

> Regardless, as I've tried to point out above, the
> changes which I'm actually suggesting for this initial body of work are
> just to avoid the parsetree and go based off of what the catalog has.
> I'm hopeful that's a small enough and reasonable enough change to happen
> during this commitfest.  I don't have any issue tabling the rest of the
> discussion and future work based on that to a later time.

At some point I did consider the idea that the CREATE cases of the
deparse code could be usable without a parse tree.  For the specific
case of deparsing a DDL command as it's being ran, it's important to
have the parse tree: options such as IF EXISTS, OR REPLACE and others
don't live in the catalogs and so they must be obtained from the parse
tree.  I think it is possible to adjust some of the code so that it can
run with a NULL parse tree, and set the options to empty in those cases;
that should return a usable command to re-create the object.

One point to bear in mind is that there is still some work left; and if
I need to additionally add extra interfaces so that this code can be
called from outside event triggers, I will not have any time to review
other people's patches from the commitfest.  So here's my suggestion:
let this be pushed with the current interfaces only, with an eye towards
not adding obstacles to future support for a different interface.
That's more in the spirit of incremental improvement than trying to cram
everything in the initial commit.

I'm thinking something likeSELECT * FROM pg_creation_commands({'pg_class'::regclass, 'sometable'::pg_class});
would return a set of commands in the JSON-blob format that creates the
table.  The input argument is an array of object addresses, so that you
can request creation commands for multiple objects.  (It's not my
intention to provide this functionality right away, but if somebody else
wants to work on top of the current deparse patch, by my guest; if it
proves simple enough we can still get it into 9.5 as part of this
patch.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: deparsing utility commands

От

Andres Freund

Дата:

24 февраля 2015 г., 17:25:11

On 2015-02-24 10:48:38 -0300, Alvaro Herrera wrote:
> Stephen Frost wrote:
> 
> > Regardless, as I've tried to point out above, the
> > changes which I'm actually suggesting for this initial body of work are
> > just to avoid the parsetree and go based off of what the catalog has.
> > I'm hopeful that's a small enough and reasonable enough change to happen
> > during this commitfest.  I don't have any issue tabling the rest of the
> > discussion and future work based on that to a later time.
> 
> At some point I did consider the idea that the CREATE cases of the
> deparse code could be usable without a parse tree.  For the specific
> case of deparsing a DDL command as it's being ran, it's important to
> have the parse tree: options such as IF EXISTS, OR REPLACE and others
> don't live in the catalogs and so they must be obtained from the parse
> tree.

Yep.

> One point to bear in mind is that there is still some work left; and if
> I need to additionally add extra interfaces so that this code can be
> called from outside event triggers, I will not have any time to review
> other people's patches from the commitfest.  So here's my suggestion:
> let this be pushed with the current interfaces only, with an eye towards
> not adding obstacles to future support for a different interface.
> That's more in the spirit of incremental improvement than trying to cram
> everything in the initial commit.

+1

> I'm thinking something like
>  SELECT * FROM pg_creation_commands({'pg_class'::regclass, 'sometable'::pg_class});

> would return a set of commands in the JSON-blob format that creates the
> table.  The input argument is an array of object addresses, so that you
> can request creation commands for multiple objects.  (It's not my
> intention to provide this functionality right away, but if somebody else
> wants to work on top of the current deparse patch, by my guest; if it
> proves simple enough we can still get it into 9.5 as part of this
> patch.)

I think that'd be usefuful functionality, but I can't see it being
realistic for 9.5 unless somebody starts working on it right now with
full force.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: deparsing utility commands

От

Ryan Pedela

Дата:

24 февраля 2015 г., 18:49:46

On Tue, Feb 24, 2015 at 6:48 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Stephen Frost wrote:

I'm thinking something like
SELECT * FROM pg_creation_commands({'pg_class'::regclass, 'sometable'::pg_class});
would return a set of commands in the JSON-blob format that creates the
table. The input argument is an array of object addresses, so that you
can request creation commands for multiple objects. (It's not my
intention to provide this functionality right away, but if somebody else
wants to work on top of the current deparse patch, by my guest; if it
proves simple enough we can still get it into 9.5 as part of this
patch.)

Another possible function could be to diff two relations to produce a set of DDL commands that could be used for schema migrations.

Also thank you very much! CREATE/ALTER support in event triggers is the upcoming feature I am most excited about, and I am happy to see progress.

Thanks,

Ryan Pedela

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

24 февраля 2015 г., 20:52:13

Andres Freund wrote:

Here's a new set of patches.  Several fixes have been merged into the
patches themselves, before rebasing (in particular, some more commands,
such as DefineRelation and AlterDomain routines are now returning
ObjectAddress rather than OID).  I have accumulated some newer fixes as
fixup commits at the end of the series.  I have handled some of your
comments that needed code changes; more changes pending.  Additionally:

> On 2015-02-15 01:48:15 -0300, Alvaro Herrera wrote:
> > One line of defense which I just tought about is that instead of
> > sprinkling EventTriggerStashCommand() calls all over ProcessUtilitySlow,
> > we should add only one at the bottom.
>
> Doesn't sound like a bad idea, but I'm not sure whether it's realistic
> given that some commands do multiple EventTriggerStashCommand()s?

What the new code does is to have a boolean flag, default false, and
only auto-stash commands if the flag is false.  Commands that process
stuff differently need to set the flag to true (meaning "already
stashed") and then the auto-stash step is skipped.

> > 1. ruleutils.c seems to deparse temp schemas as pg_temp_X instead of
> > just pg_temp; so some commands like CREATE VIEW..AS SELECT FROM temp_table
> > fail in an ugly way.  Maybe the solution is to tell ruleutils that the
> > temp schema is always visible; or maybe the solution is to tell it that
> > it's spelled pg_temp instead.
>
> Hm. I'm not immediately following why that's happening for deparsing but
> not for the normal get_viewdef stuff?

Haven't researched this closely yet, but I think it's a minor bug.


> > @@ -1060,6 +1066,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
> >          case OBJECT_CAST:
> >          case OBJECT_COLUMN:
> >          case OBJECT_COLLATION:
> > +        case OBJECT_COMPOSITE:
> >          case OBJECT_CONVERSION:
> >          case OBJECT_DEFAULT:
> >          case OBJECT_DOMAIN:
> > @@ -1088,6 +1095,7 @@ EventTriggerSupportsObjectType(ObjectType obtype)
> >          case OBJECT_TSPARSER:
> >          case OBJECT_TSTEMPLATE:
> >          case OBJECT_TYPE:
> > +        case OBJECT_USER_MAPPING:
> >          case OBJECT_VIEW:
> >              return true;
> >      }
>
> Should those two be broken out and just committed?

No, those two symbols are created by the infrastructure patch because
they are not required by the current code.

> > @@ -1193,14 +1201,6 @@ EventTriggerBeginCompleteQuery(void)
> >      EventTriggerQueryState *state;
> >      MemoryContext cxt;
> >
> > -    /*
> > -     * Currently, sql_drop and table_rewrite events are the only reason to
> > -     * have event trigger state at all; so if there are none, don't install
> > -     * one.
> > -     */
> > -    if (!trackDroppedObjectsNeeded())
> > -        return false;
> > -
> >      cxt = AllocSetContextCreate(TopMemoryContext,
> >                                  "event trigger state",
> >                                  ALLOCSET_DEFAULT_MINSIZE,
> > @@ -1211,7 +1211,7 @@ EventTriggerBeginCompleteQuery(void)
> >      slist_init(&(state->SQLDropList));
> >      state->in_sql_drop = false;
> >      state->table_rewrite_oid = InvalidOid;
> > -
> > +    state->stash = NIL;
> >      state->previous = currentEventTriggerState;
> >      currentEventTriggerState = state;
>
> Hm. I'm not clear why we don't check for other types of event triggers
> in EventTriggerBeginCompleteQuery and skip the work otherwise. If we
> just enable them all the time - which imo is ok given how they're split
> of in the normal process utility - we should remove the boolean return
> value from Begin/CompleteQuery and drop
>  * Note: it's an error to call this routine if EventTriggerBeginCompleteQuery
>  * returned false previously.
> from EndCompleteQuery.

I think this merits some more checking, to avoid event trigger overhead
when there are no triggers that are going to use this mechanism.


> > +    currentEventTriggerState->stash = lappend(currentEventTriggerState->stash,
> > +                                              stashed);

> It's a bit wierd that the drop list is managed using a slist, but the
> stash isn't...

Yeah, I think I'll change the stash to an slist too.  I used List for

> > +                /*
> > +                 * Obtain schema name, if any ("pg_temp" if a temp object)
> > +                 */
> > +                if (is_objectclass_supported(addr.classId))
> > +                {
> > +                    AttrNumber    nspAttnum;
> > +
> > +                    nspAttnum = get_object_attnum_namespace(addr.classId);
> > +                    if (nspAttnum != InvalidAttrNumber)
> > +                    {
> > +                        Relation    catalog;
> > +                        HeapTuple    objtup;
> > +                        Oid            schema_oid;
> > +                        bool        isnull;
> > +
> > +                        catalog = heap_open(addr.classId, AccessShareLock);
> > +                        objtup = get_catalog_object_by_oid(catalog,
> > +                                                           addr.objectId);
> > +                        if (!HeapTupleIsValid(objtup))
> > +                            elog(ERROR, "cache lookup failed for object %u/%u",
> > +                                 addr.classId, addr.objectId);
> > +                        schema_oid = heap_getattr(objtup, nspAttnum,
> > +                                                  RelationGetDescr(catalog), &isnull);
> > +                        if (isnull)
> > +                            elog(ERROR, "invalid null namespace in object %u/%u/%d",
> > +                                 addr.classId, addr.objectId, addr.objectSubId);
> > +                        if (isAnyTempNamespace(schema_oid))
> > +                            schema = pstrdup("pg_temp");
> > +                        else
> > +                            schema = get_namespace_name(schema_oid);
> > +
> > +                        heap_close(catalog, AccessShareLock);
> > +                    }
> > +                }
>
> Hm. How do we get here for !is_objectclass_supported()?

The whole idea here is that object classes that are not "supported" (by
the objectaddress.c generic stuff) then it's not an object type that
lives within a schema.  The main point of this block is to obtain schema
name, so objects that don't live in schemas obviously continue to have
schema = NULL.


> > +static ObjTree *
> > +new_objtree_VA(char *fmt, int numobjs,...)
> > +{

> Would look nicer if boolval et al weren't declared - I'd just pass the
> va_arg() return value directly to new_*_object().
>
> Attached as 0001.
>
> Additionally I find the separate handling of ObjTypeNull not so
> nice. 0002.

Both nice, thanks.  Folded into infrastructure commit.

> > +/*
> > + * A helper routine to setup %{}T elements.
> > + */
> > +static ObjTree *
> > +new_objtree_for_type(Oid typeId, int32 typmod)
> > +{
>
> > +    append_bool_object(typeParam, "is_array", is_array);
>
> Maybe name that one typarray?

Done.  I also fixed an issue with timestamp/timestamptz/intervals when
there were arrays of those, which I noticed a few days ago while fooling
with the test framework.

> > + * Handle deparsing of simple commands.
> > + *
> > + * This function contains a large switch that mirrors that in
> > + * ProcessUtilitySlow.  All cases covered there should also be covered here.
> > + */
> > +static ObjTree *
> > +deparse_simple_command(StashedCommand *cmd)
> > +{

> It is not clear to me why some commands error out and other don't. It
> makes sense to me to error out for things like GrantStmt that shouldn't
> end up here, but...  I guess that's just an artifact of the patch series
> because you add the handling for the non elog()ing commands later?

Yeah, exactly.  In this one I added elog(ERROR) to all cases in the
infrastructure patch.  Note that the intention is that all command types
are supported by the time we push, and therefore this:

> I think this should use ereport() in some cases if we're not going to
> support some commands for now.

is unnecessary.

> > +    /*
> > +     * Many routines underlying this one will invoke ruleutils.c functionality
> > +     * in order to obtain deparsed versions of expressions.  In such results,
> > +     * we want all object names to be qualified, so that results are "portable"
> > +     * to environments with different search_path settings.  Rather than inject
> > +     * what would be repetitive calls to override search path all over the
> > +     * place, we do it centrally here.
> > +     */
> > +    overridePath = GetOverrideSearchPath(CurrentMemoryContext);
> > +    overridePath->schemas = NIL;
> > +    overridePath->addCatalog = false;
> > +    overridePath->addTemp = false;
> > +    PushOverrideSearchPath(overridePath);
>
> Ah, the 'addTemp = false' probably is the answer to my question above
> about view deparsing adding a fully qualified pg_temp...

Not real sure about that.  Might be.  But I vaguely recall I needed
isTemp=false for something else.

>
> > +/*------
> > + * Returns a formatted string from a JSON object.
> > + *
> > + * The starting point is the element named "fmt" (which must be a string).
> > + * This format string may contain zero or more %-escapes, which consist of an
> > + * element name enclosed in { }, possibly followed by a conversion modifier,
> > + * followed by a conversion specifier.    Possible conversion specifiers are:
> > + *
> > + * %        expand to a literal %.
> > + * I        expand as a single, non-qualified identifier
> > + * D        expand as a possibly-qualified identifier
> > + * T        expand as a type name
> > + * O        expand as an operator name
> > + * L        expand as a string literal (quote using single quotes)
> > + * s        expand as a simple string (no quoting)
> > + * n        expand as a simple number (no quoting)
> > + *
> > + * The element name may have an optional separator specification preceded
> > + * by a colon.    Its presence indicates that the element is expected to be
> > + * an array; the specified separator is used to join the array elements.
> > + *------
>
> I think this documentation should at least be referred to from
> comments in deparse_utility.c. I was wondering where it is.

I'll add a reference.

> > diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
> > index 6fbe283..a842ef2 100644
> > --- a/src/backend/utils/adt/jsonb.c
> > +++ b/src/backend/utils/adt/jsonb.c
> > @@ -416,7 +416,7 @@ JsonbToCString(StringInfo out, JsonbContainer *in, int estimated_len)
> >  {
> >      bool        first = true;
> >      JsonbIterator *it;
> > -    int            type = 0;
> > +    JsonbIteratorToken type = WJB_DONE;
> >      JsonbValue    v;
> >      int            level = 0;
> >      bool        redo_switch = false;
> > @@ -498,7 +498,7 @@ JsonbToCString(StringInfo out, JsonbContainer *in, int estimated_len)
> >                  first = false;
> >                  break;
> >              default:
> > -                elog(ERROR, "unknown flag of jsonb iterator");
> > +                elog(ERROR, "unknown jsonb iterator token type");
> >          }
> >      }
>
> Huh?

Apologies -- should be somewhere separate, I guess.


> I think this infrastructure pretty desperately requires some higher
> level overview. Like how are we going from the node tree, to the object
> tree, to jsonb, to actual DDL. I think I understand, but it took me a
> while. Doesn't have to be long and super detailed...

Will add.

> > From dcb353c8c9bd93778a62719ff8bf32b9a419e16d Mon Sep 17 00:00:00 2001
> > From: Alvaro Herrera <alvherre@alvh.no-ip.org> Date: Thu, 25 Sep 2014
> > 15:54:00 -0300 Subject: [PATCH 09/42] deparse: Support CREATE TYPE AS
>
> > +static ObjTree *
> > +deparse_ColumnDef(Relation relation, List *dpcontext, bool composite,
> > +                  ColumnDef *coldef)
> > +{
>
> > +    if (!composite)
> > +    {
> > +        /*
> > +         * Emit a NOT NULL declaration if necessary.  Note that we cannot trust
> > +         * pg_attribute.attnotnull here, because that bit is also set when
> > +         * primary keys are specified; and we must not emit a NOT NULL
> > +         * constraint in that case, unless explicitely specified.  Therefore,
> > +         * we scan the list of constraints attached to this column to determine
> > +         * whether we need to emit anything.
> > +         * (Fortunately, NOT NULL constraints cannot be table constraints.)
> > +         */
>
> Is the primary key bit really a problem? To me it sounds like it's
> actually good to retain a NOT NULL in that case. Note that pg_dump
> actually *does* emit a NOT NULL here, even if you drop the primary
> key. Since the column behaves as NOT NULL and is dumped as such it seems
> like a good idea to fully treat it as such.

I think pg_dump is wrong -- as far as I recall, the standard wants you
to drop the NOT NULL bit if you drop the primary key, which would
probably not happen if you just add NOT NULL blindly.  My patch to
catalog not null constraints tries to handle this .. but of course,
that's still dormant.


> > +    list = NIL;
> > +    if (node->deferrable)
> > +        list = lappend(list,
> > +                       new_string_object("DEFERRABLE"));
> > +    if (node->initdeferred)
> > +        list = lappend(list,
> > +                       new_string_object("INITIALLY DEFERRED"));
>
> I mildly wonder if DEFERRABLE/INITIALLY DEFERRED shouldn't instead have
> an 'is_present' style representation.

Yeah, good point.  Many other clauses in various commands probably need
the same thing.

> > +/*
> > + * deparse_CreateStmt
> > + *        Deparse a CreateStmt (CREATE TABLE)
>
> I really wish CreateStmt were named CreateTableStmt. I don't know how
> often that tripped me over, let alone everyone. Yes, that's an unrelated
> rant ;)

Agreed.

> > + * Given a table OID and the parsetree that created it, return an ObjTree
> > + * representing the creation command.
> > + */
> > +static ObjTree *
> > +deparse_CreateStmt(Oid objectId, Node *parsetree)
> > +{
> ...
> > +    tmp = new_objtree_VA("TABLESPACE %{tablespace}I", 0);
> > +    if (node->tablespacename)
> > +        append_string_object(tmp, "tablespace", node->tablespacename);
> > +    else
> > +    {
> > +        append_null_object(tmp, "tablespace");
> > +        append_bool_object(tmp, "present", false);
> > +    }
> > +    append_object_object(createStmt, "tablespace", tmp);
>
> Hm. I had not thought about that before, but given that
> default_tablespace exists, we should maybe somehow indicate which one
> was chosen?

Hmm, yeah, good point.  We do likewise for WITH/WITHOUT OIDS, for
instance, so that's fair.  Will add.

> > +    if (!ownedby)
> > +        /* XXX this shouldn't happen ... */
> > +        ownedby = new_objtree_VA("OWNED BY %{owner}D",
> > +                                 3,
> > +                                 "clause", ObjTypeString, "owned",
> > +                                 "owner", ObjTypeNull,
> > +                                 "present", ObjTypeBool, false);
>
> Why shouldn't this happen? Free standing sequences are perfectly normal?
> And OWNED BY NONE will need to be deparseable.

Mumble.

> > +static ObjTree *
> > +deparse_CreateSeqStmt(Oid objectId, Node *parsetree)
> > +{

> > +    /* we purposefully do not emit OWNED BY here */
>
> Needs explanation about the reason. Hm. I don't think it's actually
> correct. Right now the following won't be deparsed correctly:
>
> CREATE TABLE seqowner(id int);
> CREATE SEQUENCE seqowned OWNED BY seqowner.id;
> that generates
> CREATE  TABLE  public.seqowner (id pg_catalog.int4   )  WITH (oids=OFF)
> CREATE  SEQUENCE public.seqowned CACHE 1 NO CYCLE INCREMENT BY 1 MINVALUE 1 MAXVALUE 9223372036854775807 START WITH 1
RESTART1 

I think I was blindfolded into supporting SERIAL.  Will look into this.

> > + * deparse_CreateExtensionStmt
> > + *        deparse a CreateExtensionStmt
> > + *
> > + * Given an extension OID and the parsetree that created it, return the JSON
> > + * blob representing the creation command.
> > + *
> > + * XXX the current representation makes the output command dependant on the
> > + * installed versions of the extension.  Is this a problem?
>
> I tend to think it's ok. Might be worthwhile to add the final extension
> version in some unused attribute?

Yes, agreed.

> > It supports everything but functions, aggregates, operator classes and
> > operator families.
>
> AFAICS those are implemented, and you updated the description in git
> since.

Yeah, sorry.  I just changed the way we handle functions/aggregates as
you suggest, by adding format_procedure_args().  However, in looking
closer, the representation is not very nice, because you end up with
veyr ugly quoting:

alvherre=# alter function "test schema"."foo bar" ("test schema"."test type") rename to "ack ick";
NOTICE:  JSON blob: {
    "fmt": "ALTER FUNCTION %{identity}s RENAME TO %{newname}I",
    "identity": "\"test schema\".\"ack ick\"(\"test schema\".\"test type\")",
    "newname": "ack ick"
}
NOTICE:  expanded: ALTER FUNCTION "test schema"."ack ick"("test schema"."test type") RENAME TO "ack ick"
ALTER FUNCTION

Note the identity bit.  I think I should use a signature object like
CREATE FUNCTION, which ends up a lot more verbose but seems more usable
programmatically:

    "signature": {
        "arguments": [
            {
                "default": {
                    "fmt": "DEFAULT %{value}s",
                    "present": false
                },
                "fmt": "%{mode}s %{name}s %{type}T %{default}s",
                "mode": "IN",
                "name": {
                    "fmt": "%{name}I",
                    "name": "a",
                    "present": true
                },
                "type": {
                    "schemaname": "test schema",
                    "typarray": false,
                    "typename": "test type",
                    "typmod": ""
                }
            }
        ],
        "fmt": "%{identity}D(%{arguments:, }s)",
        "identity": {
            "objname": "test function",
            "schemaname": "test schema"
        }

(ALTER FUNCTION RENAME has no need for things such as DEFAULT and INOUT,
though, I think, so it might end up being simpler than this)


> > +static ObjTree *
> > +deparse_RenameStmt(Oid objectId, Node *parsetree)
> > +{
> > +    RenameStmt *node = (RenameStmt *) parsetree;
> > +    ObjTree       *renameStmt;
> > +    char       *fmtstr;
> > +    Relation    relation;
> > +    Oid            schemaId;
> > +
> > +    /*
> > +     * FIXME --- this code is missing support for inheritance behavioral flags,
> > +     * i.e. the "*" and ONLY elements.
> > +     */
>
> Hm. I think we probably need that?

Sure.

> > +     * XXX what if there's another event trigger running concurrently that
> > +     * renames the schema or moves the object to another schema?  Seems
> > +     * pretty far-fetched, but possible nonetheless.
> > +     */
>
> We should probably prohibit DDL from within event triggers?

Not real sure about adding that restriction.  Would make things simpler,
sure.

> > +    switch (node->renameType)
> > +    {
> > +        case OBJECT_COLLATION:
> > +        case OBJECT_CONVERSION:
> > +        case OBJECT_DOMAIN:
> > +        case OBJECT_TSDICTIONARY:
> > +        case OBJECT_TSPARSER:
> > +        case OBJECT_TSTEMPLATE:
> > +        case OBJECT_TSCONFIGURATION:
> > +        case OBJECT_TYPE:
> > +            {
> > +                char       *identity;
> > +                HeapTuple    objTup;
> > +                Relation    catalog;
> > +                Datum        objnsp;
> > +                bool        isnull;
> > +                Oid            classId = get_objtype_catalog_oid(node->renameType);
> > +                AttrNumber    Anum_namespace = get_object_attnum_namespace(classId);
> > +
> > +                catalog = relation_open(classId, AccessShareLock);
> > +                objTup = get_catalog_object_by_oid(catalog, objectId);
> > +                objnsp = heap_getattr(objTup, Anum_namespace,
> > +                                      RelationGetDescr(catalog), &isnull);
> > +                if (isnull)
> > +                    elog(ERROR, "invalid NULL namespace");
> > +
> > +                identity = psprintf("%s.%s", get_namespace_name(DatumGetObjectId(objnsp)),
> > +                                    strVal(llast(node->object)));
> > +                fmtstr = psprintf("ALTER %s %%{identity}s RENAME TO %%{newname}I",
> > +                                  stringify_objtype(node->renameType));
>
> Is that correct? I.e. will it work for a namespace that needs to be
> quoted? There's a couple of these. Afaics they all should use
> new_objtree_for_qualname and %{}D? I guess in some cases, like the type,
> that could be slightly more complicated, but I guess it has to be done
> nonetheless?

Well, the quoting could be solved by adding a call to
quote_qualified_identifier(), but you're right that it would be simpler
to use %{}D here.  It would apply similarly to the function example I
gave above.

> > +        case OBJECT_AGGREGATE:
> > +        case OBJECT_FUNCTION:

> > +                /*
> > +                 * Generating a function/aggregate identity is altogether too
> > +                 * messy, so instead of doing it ourselves, we generate one for
> > +                 * the renamed object, then strip out the name and replace it
> > +                 * with the original name from the parse node.  This is so ugly
> > +                 * that we don't dare do it for any other object kind.
> > +                 */
>
> Yuck.
>
> I think it'd be just as easy to split the argument output from the
> function name output in format_procedure_internal, and make that
> separately callable.

Fixed, at least partially.

> > Subject: [PATCH 18/42] deparse: Support CREATE FUNCTION
>
> >  /*
> > + * deparse_CreateFunctionStmt
> > + *        Deparse a CreateFunctionStmt (CREATE FUNCTION)
> > + *
> > + * Given a function OID and the parsetree that created it, return the JSON
> > + * blob representing the creation command.
> > + *
> > + * XXX this is missing the per-function custom-GUC thing.
> > + */
>
> Support attached as 0003.

Thanks, folded.

> > +void
> > +EventTriggerComplexCmdStart(Node *parsetree, ObjectType objtype)
> > +{
> > +    MemoryContext    oldcxt;
> > +    StashedCommand *stashed;
> > +
> > +    oldcxt = MemoryContextSwitchTo(currentEventTriggerState->cxt);
> > +
> > +    stashed = palloc(sizeof(StashedCommand));
> > +
> > +    stashed->type = SCT_AlterTable;

> Hm. So +EventTriggerComplexCmdStart is hardcoded to use SCT_AlterTable? That's a bit odd.

Yeah, that's broken, isn't it.  My initial thought was that we would
have Simple commands and Complex commands only.  Only later I grew the
SCT_Grant case, and it's becoming obvious that I will need
SCT_AlterDefaultPrivileges and SCT_AlterOpFamily, at least.  I will
probably rename these functions from StuffComplexCommandSomething into
StuffAlterTableSomething, to make it clear that they only serve
AlterTable.

> > +/*
> > + * EventTriggerEndRecordingSubcmds
> > + *         Finish up saving a complex DDL command
> > + *
> > + * FIXME this API isn't considering the possibility that a xact/subxact is
> > + * aborted partway through.  Probably it's best to add an
> > + * AtEOSubXact_EventTriggers() to fix this.
> > + */
>
> I think that'd generally be better than requiring PG_CATCH.

Haven't done anything about this yet.

> > +static ObjTree *
> > +deparse_AlterTableStmt(StashedCommand *cmd)
> > +{
>
> > +    foreach(cell, cmd->d.alterTable.subcmds)
> > +    {
> > +        StashedATSubcmd    *substashed = (StashedATSubcmd *) lfirst(cell);
> > +        AlterTableCmd    *subcmd = (AlterTableCmd *) substashed->parsetree;
> > +        ObjTree       *tree;
> > +
> > +        Assert(IsA(subcmd, AlterTableCmd));
> > +
> > +        switch (subcmd->subtype)
> > +        {
> > +            case AT_AddColumn:
> > +            case AT_AddColumnRecurse:
> > +                /* XXX need to set the "recurse" bit somewhere? */
> > +                Assert(IsA(subcmd->def, ColumnDef));
> > +                tree = deparse_ColumnDef(rel, dpcontext, false,
> > +                                         (ColumnDef *) subcmd->def);
> > +                tmp = new_objtree_VA("ADD COLUMN %{definition}s",
> > +                                     2, "type", ObjTypeString, "add column",
> > +                                     "definition", ObjTypeObject, tree);
> > +                subcmds = lappend(subcmds, new_object_object(tmp));
> > +                break;
>
> Misses ONLY handling?

Yeah -- the "Recurse" bit in the subtype tells us whether to add ONLY or
not, I think.

> > +            case AT_AddIndex:
> > +                {
> > +                    Oid            idxOid = substashed->oid;
> > +                    IndexStmt  *istmt;
> > +                    Relation    idx;
> > +                    const char *idxname;
> > +                    Oid            constrOid;
> > +
> > +                    Assert(IsA(subcmd->def, IndexStmt));
> > +                    istmt = (IndexStmt *) subcmd->def;
> > +
> > +                    if (!istmt->isconstraint)
> > +                        break;
> > +
> > +                    idx = relation_open(idxOid, AccessShareLock);
> > +                    idxname = RelationGetRelationName(idx);
> > +
> > +                    constrOid = get_relation_constraint_oid(
> > +                        cmd->d.alterTable.objectId, idxname, false);
> > +
> > +                    tmp = new_objtree_VA("ADD CONSTRAINT %{name}I %{definition}s",
> > +                                         3, "type", ObjTypeString, "add constraint",
> > +                                         "name", ObjTypeString, idxname,
> > +                                         "definition", ObjTypeString,
> > +                                         pg_get_constraintdef_string(constrOid, false));
> > +                    subcmds = lappend(subcmds, new_object_object(tmp));
> > +
> > +                    relation_close(idx, AccessShareLock);
> > +                }
> > +                break;
>
> Hm, didn't you have a out of line version of this somewhere?

It's similar, but as far as I remember the other one was different.
Didn't look closely, though.


> > +            case AT_AlterColumnType:
> > +                {
>
> > +                    /*
> > +                     * Error out if the USING clause was used.  We cannot use
> > +                     * it directly here, because it needs to run through
> > +                     * transformExpr() before being usable for ruleutils.c, and
> > +                     * we're not in a position to transform it ourselves.  To
> > +                     * fix this problem, tablecmds.c needs to be modified to store
> > +                     * the transformed expression somewhere in the StashedATSubcmd.
> > +                     */
> > +                    tmp2 = new_objtree_VA("USING %{expression}s", 0);
> > +                    if (def->raw_default)
> > +                        elog(ERROR, "unimplemented deparse of ALTER TABLE TYPE USING");
> > +                    else
> > +                        append_bool_object(tmp2, "present", false);
> > +                    append_object_object(tmp, "using", tmp2);
>
> Hm. This probably needs to be fixed :(.

Sure.

> Shouldn't be too hard...
>
> Could you perhaps attach some string that can be searched for to all
> commands that you think need to be implemented before being committable?

Well, I think all the ones that say "unimplemented" should be
implemented before commit.  Most of the remaining ones should be
relatively simple to add, with the two exceptions I mentioned above that
require their own symbol in the SCT enum.

> >                              else
> >                              {
> > -                                /* Recurse for anything else */
> > +                                /*
> > +                                 * Recurse for anything else.  If we need to do
> > +                                 * so, "close" the current complex-command set,
> > +                                 * and start a new one at the bottom; this is
> > +                                 * needed to ensure the ordering of queued
> > +                                 * commands is consistent with the way they are
> > +                                 * executed here.
> > +                                 */
> > +                                EventTriggerComplexCmdEnd();
> >                                  ProcessUtility(stmt,
> >                                                 queryString,
> >                                                 PROCESS_UTILITY_SUBCOMMAND,
> >                                                 params,
> >                                                 None_Receiver,
> >                                                 NULL);
> > +                                EventTriggerComplexCmdStart(parsetree, atstmt->relkind);
> > +                                EventTriggerComplexCmdSetOid(relid);
> >                              }
>
> Hm. Commands are always ordered in a way this is ok?

Well, as far as I can tell this works well.  If we had a concrete
example of a problematic sequence of commands, we could look into how to
fix it.


> > +    copfStmt = new_objtree_VA("CREATE OPERATOR FAMILY %{identity}D USING %{amname}s",
> > +                              0);
> > +
> > +    tmp = new_objtree_for_qualname(opfForm->opfnamespace,
> > +                                   NameStr(opfForm->opfname));
> > +    append_object_object(copfStmt, "identity", tmp);
> > +    append_string_object(copfStmt, "amname", NameStr(amForm->amname));
>
> Hm. Amname is unquoted? I can't believe this will ever really be a problem, but still.

That's a bug -- fixed.

> > Subject: [PATCH 24/42] deparse: support ALTER THING OWNER TO
> >  static ObjTree *
> > +deparse_AlterOwnerStmt(Oid objectId, Node *parsetree)
> > +{
> > +    AlterOwnerStmt *node = (AlterOwnerStmt *) parsetree;
> > +    ObjTree       *ownerStmt;
> > +    ObjectAddress addr;
> > +    char       *fmt;
> > +
> > +    fmt = psprintf("ALTER %s %%{identity}s OWNER TO %%{newname}I",
> > +                   stringify_objtype(node->objectType));
>
> newname sounds like copied from rename.

Oops, true.  Renamed to "newowner".


> > +            else
> > +            {
> > +                Assert(cmd->type == SCT_Grant);
> > +
> > +                /* classid */
> > +                nulls[i++] = true;
> > +                /* objid */
> > +                nulls[i++] = true;
> > +                /* objsubid */
> > +                nulls[i++] = true;
>
> Might actually be nice to fill those out to the object that's being
> changed. But it might end up being more work than justified, and it can
> be easily added later. I guess it'd also mean we would have to generate
> multiple GRANT/REVOKE commands.

Nah -- a single GRANT command can modify several objects of the same
kind.  That's why I used a separate representation.

> > diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
> > index c09915d..92bccf5 100644
> > --- a/src/backend/tcop/utility.c
> > +++ b/src/backend/tcop/utility.c
> > @@ -781,6 +781,7 @@ standard_ProcessUtility(Node *parsetree,
> >                  else
> >                      ExecuteGrantStmt((GrantStmt *) parsetree);
> >              }
> > +            break;
>
> Uh. Was that missing from an earlier commit?

Yep, sorry, fixed.  (In fact I already pushed it.)

> > diff --git a/src/include/utils/aclchk.h b/src/include/utils/aclchk.h
> > new file mode 100644
> > index 0000000..1ca7095
> > --- /dev/null
> > +++ b/src/include/utils/aclchk.h
> > @@ -0,0 +1,45 @@
> > +/*-------------------------------------------------------------------------
> > + *
> > + * aclchk.h
> > + *
> > + * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
> > + * Portions Copyright (c) 1994, Regents of the University of California
> > + *
> > + * src/include/utils/aclchk.h
>
> Maybe we should name it aclchk_internal.h?

Might as well ...

> > +static ObjTree *
> > +deparse_CommentOnConstraintSmt(Oid objectId, Node *parsetree)
> > +{
> > +    CommentStmt *node = (CommentStmt *) parsetree;
> > +    ObjTree       *comment;
> > +    HeapTuple    constrTup;
> > +    Form_pg_constraint constrForm;
> > +    char       *fmt;
> > +    ObjectAddress addr;
> > +
> > +    Assert(node->objtype == OBJECT_TABCONSTRAINT || node->objtype == OBJECT_DOMCONSTRAINT);
> > +
> > +    if (node->comment)
> > +        fmt = psprintf("COMMENT ON CONSTRAINT %%{identity}s ON %s%%{parentobj}s IS %%{comment}L",
> > +                       node->objtype == OBJECT_TABCONSTRAINT ? "" : "DOMAIN ");
> > +    else
> > +        fmt = psprintf("COMMENT ON CONSTRAINT %%{identity}s ON %s%%{parentobj}s IS NULL",
> > +                       node->objtype == OBJECT_TABCONSTRAINT ? "" : "DOMAIN ");
>
> psprintf(...' IS %s', .... node->comment ? "%{comment}L" : "NULL") maybe?

I fixed COMMENT and SECURITY LABEL to use a different, more complex
representation, that can be used to turn the comment literal and the
NULL part on/off.  That seems better for programmatic access.

> > +    if (node->label)
> > +    {
> > +        fmt = psprintf("SECURITY LABEL FOR %%{provider}s ON %s %%{identity}s IS %%{label}L",
> > +                   stringify_objtype(node->objtype));
> > +        label = new_objtree_VA(fmt, 0);
> > +
> > +        append_string_object(label, "label", node->label);
> > +    }

> "deparse: Handle default security provider." should be squashed into this.

No, actually I think that one is wrong.  The security provider should be
returned as an Oid into secondaryOid to ProcessUtilitySlow, to avoid
messing with the parse node.

> > +    if (node->relkind == OBJECT_TABLE)
> > +    {
> > +        tmp = new_objtree_VA("ON COMMIT %{on_commit_value}s", 0);
> > +        switch (node->into->onCommit)
> > +        {
> > +            case ONCOMMIT_DROP:
> > +                append_string_object(tmp, "on_commit_value", "DROP");
> > +                break;
> > +
> > +            case ONCOMMIT_DELETE_ROWS:
> > +                append_string_object(tmp, "on_commit_value", "DELETE ROWS");
> > +                break;
> > +
> > +            case ONCOMMIT_PRESERVE_ROWS:
> > +                append_string_object(tmp, "on_commit_value", "PRESERVE ROWS");
> > +                break;
> > +
> > +            case ONCOMMIT_NOOP:
> > +                append_null_object(tmp, "on_commit_value");
> > +                append_bool_object(tmp, "present", false);
> > +                break;
> > +        }
> > +        append_object_object(createStmt, "on_commit", tmp);
> > +    }
>
> That switch already exists somewhere else? Maybe add a function that
> converts ONCOMMIT_* into the relevant string?

Some refactoring is probably in order here.

> Ran out of energy here...

You had a lot of it -- this stuff isn't small.  Many, many thanks for
the review.


> Subject: [PATCH 2/3] fixup! deparse: infrastructure needed for command
>  deparsing
>
> Slight cleanup in new_objtree_VA() by treating ObjTypeNull just like the
> other types.

BTW these two patches created a compiler warning for me; it says "elem
can be used uninitialized".  I had to add a default clause.


--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: deparsing utility commands

От

Stephen Frost

Дата:

24 февраля 2015 г., 23:21:09

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> Stephen Frost wrote:
>
> > Regardless, as I've tried to point out above, the
> > changes which I'm actually suggesting for this initial body of work are
> > just to avoid the parsetree and go based off of what the catalog has.
> > I'm hopeful that's a small enough and reasonable enough change to happen
> > during this commitfest.  I don't have any issue tabling the rest of the
> > discussion and future work based on that to a later time.
>
> At some point I did consider the idea that the CREATE cases of the
> deparse code could be usable without a parse tree.  For the specific
> case of deparsing a DDL command as it's being ran, it's important to
> have the parse tree: options such as IF EXISTS, OR REPLACE and others
> don't live in the catalogs and so they must be obtained from the parse
> tree.  I think it is possible to adjust some of the code so that it can
> run with a NULL parse tree, and set the options to empty in those cases;
> that should return a usable command to re-create the object.

Right, being able to run it with a NULL parse tree would absolutely work
for the use-cases that I'm thinking of.  I suggested somewhere up-thread
that those bits of information (IF EXISTS, OR REPLACE) might be possible
to pass in as independent arguments or through a compound argument (eg:
a List or struct or something), instead of coming from the parse tree,
but that's clearly something which could be done later.

> One point to bear in mind is that there is still some work left; and if
> I need to additionally add extra interfaces so that this code can be
> called from outside event triggers, I will not have any time to review
> other people's patches from the commitfest.  So here's my suggestion:
> let this be pushed with the current interfaces only, with an eye towards
> not adding obstacles to future support for a different interface.
> That's more in the spirit of incremental improvement than trying to cram
> everything in the initial commit.

I'm perfectly fine with having the initial commit of this work using the
existing interfaces and further that we only need one set of interfaces
for 9.5, but, for my part, I'd love to be able to build an extension on
9.5 which calls these functions and passes in a NULL parsetree.  Perhaps
that's wishful thinking at this part and maybe I'll have to wait til
9.6, but that's why I've brought this up.

> I'm thinking something like
>  SELECT * FROM pg_creation_commands({'pg_class'::regclass, 'sometable'::pg_class});
> would return a set of commands in the JSON-blob format that creates the
> table.  The input argument is an array of object addresses, so that you
> can request creation commands for multiple objects.  (It's not my
> intention to provide this functionality right away, but if somebody else
> wants to work on top of the current deparse patch, by my guest; if it
> proves simple enough we can still get it into 9.5 as part of this
> patch.)

I'm all for this, but it seems too late for it to get into 9.5.  I'd be
happy with just the functions existing and available to extensions for
9.5.
Thanks!
    Stephen

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

28 февраля 2015 г., 01:30:02

Stephen Frost wrote:

> > and at the bottom of the loop we would transform the objid/type into
> > address for the cases that need it:
> >
> >         if (!commandStashed)
> >         {
> >             if (objectId != InvalidOid)
> >             {
> >                 address.classId = get_objtype_catalog_oid(objectType);
> >                 address.objectId = objectId;
> >                 address.objectSubId = 0;
> >             }
> >             EventTriggerStashCommand(address, secondaryOid, parsetree);
> >         }
>
> That'd be fine with me, though for my 2c, I wouldn't object to changing
> them all to return ObjectAddress either.  I agree that it'd cause a fair
> bit of code churn to do so, but there's a fair bit of code churn
> happening here anyway (looking at what 0008 does to ProcessUtilitySlow
> anyway).

Thanks.  I have adjusted the code to use ObjectAddress in all functions
called by ProcessUtilitySlow; the patch is a bit tedious but seems
pretty reasonable to me.  As I said, there is quite a lot of churn.

Also attached are some patches to make some commands return info about
a secondary object regarding the execution, which can be used to know
more about what's being executed:

1) ALTER DOMAIN ADD CONSTRAINT returns (in addition to the address of
the doamin) the address of the new constraint.

2) ALTER OBJECT SET SCHEMA returns (in addition to the address of the
object) the address of the old schema.

3) ALTER EXTENSION ADD/DROP OBJECT returns (in addition to the address
of the extension) the address of the added/dropped object.

Also attached is the part of these patches that have ALTER TABLE return
the AttrNumber and OID of the affected object.

I think this is all mostly uninteresting, but thought I'd throw it out
for public review before pushing, just in case.  I have fixed many
issues in the rest of the branch meanwhile, and it's looking much better
IMO, but much work remains there and will leave them for later.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Andres Freund wrote:
> Hi,
>
> On 2015-02-15 01:48:15 -0300, Alvaro Herrera wrote:
> > This is a repost of the patch to add CREATE command deparsing support to
> > event triggers.  It now supports not only CREATE but also ALTER and
> > other commands such as GRANT/REVOKE, COMMENT ON and SECURITY LABEL.
> > This patch series is broken up in a rather large number of patches, but
> > my intention would be to commit separately only the first 6 patches; the
> > remaining parts are split up only so that it's easy to see how deparsing
> > each patch is implemented, and would be committed as a single changeset
> > (but before that I need a bunch of extra fixes and additions to other
> > parts.)
>
> I find the split of the patches not really helpful - it makes
> fixing/cleaning up things unnecessarily complicated. I'd like most of
> the individual command commits. Obviously the preparatory stuff that'll
> be independely committed should stay separate. I also like that the
> infrastructure patch is split of; same with the more wildly different
> patches like support for GRANT.

Here's a new series.  I have reduced the number of patches, per your
request.  (Of course, a number of those preparatory commits have gotten
pushed.)

One thing that Stephen commented on was the ALTER TABLE preparatory
patch.  He said why not return ObjectAddress from the subcommand
routines instead of just Oid/attnum?  The reason is that to interpret
the address correctly you will still need a lot more context; the OID
and attnum are only part of the truth anyway.  I think there are a small
number of cases where we could meaningfully return an ObjectAddress and
have the whole thing be useful, but I'm not sure it's worth the bother.

In patch 0004, I removed most of the Stash() calls in ProcessUtility,
instead adding one at the bottom that takes care of most of the simple
cases.  That means that a developer adding support for something new in
ProcessUtilitySlow without realizing that something must be added to the
command stash will get an error fairly early, which I think is helpful.

Patch 0021 (fixing a bug in SECURITY LABEL support) I'm not really sure
about.  I don't like modifying a parse node during execution -- seems a
bad habit.  It seems better to me to return the chosen security label as
an ObjectAddress in ExecSecLabelStmt, as pass that as "secondaryOid" to
the deparse_utility.c routine.

In patch 0023 (CREATE/ALTER POLICY support) I ran into trouble.  I
represent the role in the json like this:
    tmp = new_objtree_VA("TO %{role:, }I", 0);
which means that role names are quoted as identifiers.  This means it
doesn't work as soon as we get a command referencing PUBLIC (which many
in the regression test do, because when the TO clause is omitted, PUBLIC
is the default).  So this ends up as "PUBLIC" and everything goes
downhill.  I'm not sure what to do about this, except to invent a new
%{} code.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: deparsing utility commands

От

Stephen Frost

Дата:

19 марта 2015 г., 07:40:34

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> One thing that Stephen commented on was the ALTER TABLE preparatory
> patch.  He said why not return ObjectAddress from the subcommand
> routines instead of just Oid/attnum?  The reason is that to interpret
> the address correctly you will still need a lot more context; the OID
> and attnum are only part of the truth anyway.  I think there are a small
> number of cases where we could meaningfully return an ObjectAddress and
> have the whole thing be useful, but I'm not sure it's worth the bother.

Alright, fair enough.

> In patch 0004, I removed most of the Stash() calls in ProcessUtility,
> instead adding one at the bottom that takes care of most of the simple
> cases.  That means that a developer adding support for something new in
> ProcessUtilitySlow without realizing that something must be added to the
> command stash will get an error fairly early, which I think is helpful.

Right, I like this.

> Patch 0021 (fixing a bug in SECURITY LABEL support) I'm not really sure
> about.  I don't like modifying a parse node during execution -- seems a
> bad habit.  It seems better to me to return the chosen security label as
> an ObjectAddress in ExecSecLabelStmt, as pass that as "secondaryOid" to
> the deparse_utility.c routine.

I agree, changing a parse node during execution definitely seems like a
bad idea, particularly if there's a way you could avoid doing so, which
it sounds like there is.  Any reason to not follow up on that?

> In patch 0023 (CREATE/ALTER POLICY support) I ran into trouble.  I
> represent the role in the json like this:
>     tmp = new_objtree_VA("TO %{role:, }I", 0);
> which means that role names are quoted as identifiers.  This means it
> doesn't work as soon as we get a command referencing PUBLIC (which many
> in the regression test do, because when the TO clause is omitted, PUBLIC
> is the default).  So this ends up as "PUBLIC" and everything goes
> downhill.  I'm not sure what to do about this, except to invent a new
> %{} code.

How is this any different from the GRANT/REVOKE cases..?  Surely you
have to deal with 'public' being special there also.

> Subject: [PATCH 01/29] add secondaryOid to DefineTSConfiguration()

This looks fine to me.

> Subject: [PATCH 02/29] deparse/core: have ALTER TABLE return table OID and
>  other data
[...]
> diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
> index 5ce4395..9675d21 100644
> @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid)
>  /*
>   * Store a default expression for column attnum of relation rel.
>   */
> -void
> +Oid
>  StoreAttrDefault(Relation rel, AttrNumber attnum,
>                   Node *expr, bool is_internal)
>  {

Missing a comment about the return value?

> @@ -2074,7 +2083,7 @@ StoreConstraints(Relation rel, List *cooked_constraints, bool is_internal)
>      int            numchecks = 0;
>      ListCell   *lc;
>
> -    if (!cooked_constraints)
> +    if (list_length(cooked_constraints) == 0)
>          return;                    /* nothing to do */
>
>      /*

While this is used in a few places, it's by far more common to use:

if (cooked_constraints == NIL)

or to keep it the way it is..


> diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
> index 623e6bf..fc2c8a9 100644
> @@ -3416,78 +3418,95 @@ static void
>          case AT_DropColumn:        /* DROP COLUMN */
>              ATExecDropColumn(wqueue, rel, cmd->name,
> -                     cmd->behavior, false, false, cmd->missing_ok, lockmode);
> +                             cmd->behavior, false, false,
> +                             cmd->missing_ok, lockmode);
>              break;
>          case AT_DropColumnRecurse:        /* DROP COLUMN with recursion */
>              ATExecDropColumn(wqueue, rel, cmd->name,
> -                      cmd->behavior, true, false, cmd->missing_ok, lockmode);
> +                             cmd->behavior, true, false,
> +                             cmd->missing_ok, lockmode);
>              break;

[...]

>          case AT_ReplicaIdentity:
> -            ATExecReplicaIdentity(rel, (ReplicaIdentityStmt *) cmd->def, lockmode);
> +            ATExecReplicaIdentity(rel, (ReplicaIdentityStmt *) cmd->def,
> +                                  lockmode);
>              break;
>          case AT_EnableRowSecurity:
>              ATExecEnableRowSecurity(rel);

I realize the whole thing is changing anyway, but the random whitespace
changes don't make reviewing any easier.

> -static void
> +static AttrNumber
>  ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
>                  ColumnDef *colDef, bool isOid,
>                  bool recurse, bool recursing, LOCKMODE lockmode)
> @@ -4698,7 +4720,7 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
>                        colDef->colname, RelationGetRelationName(rel))));
>
>              heap_close(attrdesc, RowExclusiveLock);
> -            return;
> +            return InvalidAttrNumber;
>          }
>      }
>
> @@ -4944,6 +4966,8 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
>
>          heap_close(childrel, NoLock);
>      }
> +
> +    return newattnum;
>  }

This doesn't need to do anything for inheiriting children..?  I suppose
perhaps not, but might be useful to add a comment noting that only the
attnum for the parent table is returned.

> @@ -5058,7 +5082,7 @@ ATPrepAddOids(List **wqueue, Relation rel, bool recurse, AlterTableCmd *cmd, LOC
>  /*
>   * ALTER TABLE ALTER COLUMN DROP NOT NULL
>   */
> -static void
> +static AttrNumber
>  ATExecDropNotNull(Relation rel, const char *colName, LOCKMODE lockmode)
>  {
>      HeapTuple    tuple;
> @@ -5143,18 +5167,22 @@ ATExecDropNotNull(Relation rel, const char *colName, LOCKMODE lockmode)
>
>          /* keep the system catalog indexes current */
>          CatalogUpdateIndexes(attr_rel, tuple);
> +
> +        /* XXX should we return InvalidAttrNumber if there was no change? */
>      }

I don't think I commented on this the first time through, but on
reflection, I'm inclined to say "yes" that InvalidAttrNumber should be
returned if there was no change.

Comments about what's being returned wouldn't hurt either.

>  /*
>   * ALTER TABLE ALTER COLUMN SET NOT NULL
>   */
> -static void
> +static AttrNumber
>  ATExecSetNotNull(AlteredTableInfo *tab, Relation rel,
>                   const char *colName, LOCKMODE lockmode)
>  {
> @@ -5198,18 +5226,22 @@ ATExecSetNotNull(AlteredTableInfo *tab, Relation rel,
>
>          /* Tell Phase 3 it needs to test the constraint */
>          tab->new_notnull = true;
> +
> +        /* XXX should we return InvalidAttrNumber if there was no change? */
>      }

ditto.

>   *
>   * Subroutine for ATExecAddConstraint.
>   *
> @@ -5914,7 +5975,7 @@ ATExecAddConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
>   * "is_readd" flag for that; just setting recurse=false would result in an
>   * error if there are child tables.
>   */
> -static void
> +static Oid
>  ATAddCheckConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
>                       Constraint *constr, bool recurse, bool recursing,
>                       bool is_readd, LOCKMODE lockmode)
> @@ -5923,6 +5984,7 @@ ATAddCheckConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
>      ListCell   *lcon;
>      List       *children;
>      ListCell   *child;
> +    Oid            constrOid = InvalidOid;
>
>      /* At top level, permission check was done in ATPrepCmd, else do it */
>      if (recursing)
> @@ -5965,6 +6027,13 @@ ATAddCheckConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
>          /* Save the actually assigned name if it was defaulted */
>          if (constr->conname == NULL)
>              constr->conname = ccon->name;
> +
> +        /*
> +         * Save our return value. Note we don't expect more than one element in
> +         * this list.
> +         */
> +        Assert(constrOid == InvalidOid);
> +        constrOid = ccon->conoid;
>      }

I feel like this Assert() should really be outside of this, like so:

Assert(list_length(newcons) <= 1);

With appropriate comment that we should only ever have a list of 0 or 1
being returned from this call to AddRelationNewConstraints.

Further, it might be better as just an if() instead of a foreach, since
we aren't actually going to be ever looping here.

I further wonder if this whole thing could be simplified by moving the
CommandCounterIncrement(), if (newcons == NIL) and is_no_inherit checks
up to right under the call to AddRelationNewConstraints and then we
don't need a conditional or a foreach.

> @@ -6573,7 +6652,7 @@ ATExecAlterConstraint(Relation rel, AlterTableCmd *cmd,
>   * no need to lock children in that case, yet we wouldn't be able to avoid
>   * doing so at that level.
>   */
> -static void
> +static Oid
>  ATExecValidateConstraint(Relation rel, char *constrName, bool recurse,
>                           bool recursing, LOCKMODE lockmode)
>  {
> @@ -6583,6 +6662,7 @@ ATExecValidateConstraint(Relation rel, char *constrName, bool recurse,
>      HeapTuple    tuple;
>      Form_pg_constraint con = NULL;
>      bool        found = false;
> +    Oid            constrOid = InvalidOid;
>
>      conrel = heap_open(ConstraintRelationId, RowExclusiveLock);
>
> @@ -6624,9 +6704,10 @@ ATExecValidateConstraint(Relation rel, char *constrName, bool recurse,
>          HeapTuple    copyTuple;
>          Form_pg_constraint copy_con;
>
> +        constrOid = HeapTupleGetOid(tuple);
> +
>          if (con->contype == CONSTRAINT_FOREIGN)
>          {
> -            Oid            conid = HeapTupleGetOid(tuple);
>              Relation    refrel;
>
>              /*
> @@ -6641,7 +6722,7 @@ ATExecValidateConstraint(Relation rel, char *constrName, bool recurse,
>
>              validateForeignKeyConstraint(constrName, rel, refrel,
>                                           con->conindid,
> -                                         conid);
> +                                         constrOid);
>              heap_close(refrel, NoLock);
>
>              /*
> @@ -6722,6 +6803,8 @@ ATExecValidateConstraint(Relation rel, char *constrName, bool recurse,
>      systable_endscan(scan);
>
>      heap_close(conrel, RowExclusiveLock);
> +
> +    return constrOid;
>  }

This whole thing would be nicer if we inverted the second half to check
if (con->convalidated) { ... cleanup and return ... } ... continue, imv.

Regardless, a comment about how this will return InvalidOid if nothing
is done (because the constraint has already been validated) would be
good.

I thought the rest looked fine.

> Subject: [PATCH 03/29] deparse: infrastructure needed for command deparsing

> diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c

> +void
> +EventTriggerStashCommand(ObjectAddress address, ObjectAddress *secondaryObject,
> +                         Node *parsetree)

Why pass secondaryObject in as a pointer here?

> +{
> +    MemoryContext oldcxt;
> +    StashedCommand *stashed;
> +
> +    if (currentEventTriggerState->commandCollectionInhibited)
> +        return;
> +
> +    oldcxt = MemoryContextSwitchTo(currentEventTriggerState->cxt);
> +
> +    stashed = palloc(sizeof(StashedCommand));
> +
> +    stashed->type = SCT_Simple;
> +    stashed->in_extension = creating_extension;
> +
> +    stashed->d.simple.address = address;
> +    stashed->d.simple.secondaryObject =
> +        secondaryObject ? *secondaryObject : InvalidObjectAddress;

Per above, this could just be the same as 'address' if it wasn't a
pointer being passed in but rather either a valid object or
InvalidObjectAddress.  Perhaps a helper function could be used if that
seems awkward, which only takes a single address and then calls this
function with the secondaryObject set to InvalidObjectAddress?

> +Datum
> +pg_event_trigger_get_creation_commands(PG_FUNCTION_ARGS)

I agree with the earlier comment, seems like this should get a different
name..  pg_event_trigger_get_commands()?

> +        /*
> +         * For IF NOT EXISTS commands that attempt to create an existing
> +         * object, the returned OID is Invalid; in those cases, return an empty
> +         * command instead of trying to soldier on.

That's not quite accurate, is it?  We don't return an empty command, we
simply don't return anything, right?

> +        command = deparse_utility_command(cmd);
> +
> +        /*
> +         * Some parse trees return NULL when deparse is attempted; we don't
> +         * emit anything for them.
> +         */
> +        if (command != NULL)
> +        {

Couldn't this be flipped around to be if (!command) continue; instead
then, similar to earlier where we got an InvalidId returned?

> diff --git a/src/backend/tcop/deparse_utility.c b/src/backend/tcop/deparse_utility.c

Is it just me, or does this feel a bit odd to have in backend/tcop/?

> @@ -0,0 +1,992 @@
> +/*-------------------------------------------------------------------------
> + *
> + * deparse_utility.c
> + *      Functions to convert utility commands to machine-parseable
> + *      representation
> + *
> + * Portions Copyright (c) 1996-2012, PostgreSQL Global Development Group

Pretty sure this should be up to 2015. :)

> +#ifdef NOT_USED
> +static void append_integer_object(ObjTree *tree, char *name, int64 value);
> +#endif

...?

> +/*
> + * Given a utility command parsetree and the OID of the corresponding object,
> + * return a JSON representation of the command.
[...]
> +char *
> +deparse_utility_command(StashedCommand *cmd)

That's not quite right now, is it?  It's the parsetree and the
ObjectAddress.  It'd also be good to point out that those are what's in
that StashedCommand object.

> Subject: [PATCH 04/29] deparse: add EventTriggerStashCommand() calls to
>  ProcessUtilitySlow

No particular complaints here.

> Subject: [PATCH 05/29] deparse: Support CREATE
>  SCHEMA/TABLE/SEQUENCE/INDEX/TRIGGER/TYPE AS


> diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c

> +static char *
> +get_persistence_str(char persistence)
> +{
> +    switch (persistence)
> +    {
> +        case RELPERSISTENCE_TEMP:
> +            return "TEMPORARY";
> +        case RELPERSISTENCE_UNLOGGED:
> +            return "UNLOGGED";
> +        case RELPERSISTENCE_PERMANENT:
> +            return "";
> +        default:
> +            return "???";
> +    }
> +}

??? strikes me as less than ideal?  Should we ereport() there?

> @@ -977,6 +987,8 @@ pg_get_indexdef_columns(Oid indexrelid, bool pretty)
>   *
>   * This is now used for exclusion constraints as well: if excludeOps is not
>   * NULL then it points to an array of exclusion operator OIDs.
> + *
> + * XXX if you change this function, see pg_get_indexdef_detailed too.
>   */

Isn't that more of a 'Note:' or similar, not an 'XXX'?

...  There was a lot here and a lot of it looked pretty familiar, so I
didn't press too hard on it.  I'm sure you've been testing it with
Andres, etc, which is all good, but to the extent that issues remain,
they aren't really in our critical paths and the buildfarm along with
other testing by users will flush them out.  The structure and approach
certainly seems good to me.

> Subject: [PATCH 06/29] deparse: Support CREATE TYPE AS ENUM / RANGE

Looked fine.

> Subject: [PATCH 07/29] deparse: Support EXTENSION commands

> diff --git a/src/backend/tcop/deparse_utility.c b/src/backend/tcop/deparse_utility.c

>  /*
> + * deparse_CreateExtensionStmt
> + *        deparse a CreateExtensionStmt
> + *
> + * Given an extension OID and the parsetree that created it, return the JSON
> + * blob representing the creation command.
> + *
> + * XXX the current representation makes the output command dependant on the
> + * installed versions of the extension.  Is this a problem?
> + */

I think requiring the same version is perfectly sensible and, yeah, if
that version isn't installed then things go boom, but that shouldn't
really surprise anyone at this point.

Having been looking at pg_dump, extensions, and binary upgrades
recently, it does make me wonder if any thought has been put towards
those functions which extensions (and binary upgrade) can call to
associated objects with extensions..?  Those are modifying objects but
they don't go through the normal standard_ProcessUtility() framework.
I'm inclined to think that's a mistake, as we build this kind of
introspection into what happens in utility, but we have those cases now
and we should probably think about how to handle them.

> Subject: [PATCH 08/29] deparse: Support CREATE RULE

Didn't see any issues here.

> Subject: [PATCH 09/29] deparse: Support ALTER TYPE / ADD VALUE (enums)

Ditto.

> Subject: [PATCH 10/29] deparse: Support ALTER .. {RENAME/OWNER/SCHEMA}

> diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
> index 3ddd7ec..d28758c 100644
> --- a/src/backend/commands/alter.c
> +++ b/src/backend/commands/alter.c
> @@ -446,7 +446,6 @@ ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt,
>                  Relation    relation;
>                  Oid            classId;
>                  Oid            nspOid;
> -                ObjectAddress address;
>
>                  address = get_object_address(stmt->objectType,
>                                               stmt->object,

Wow.  I'm surprised one of the various compilers we have running around
wasn't bitching about that shadowing.  Is there any reason to think we
need to worry about this?  We don't have any code that cares, but
perhaps some external users might?

> diff --git a/src/backend/tcop/deparse_utility.c b/src/backend/tcop/deparse_utility.c

The comments in here seemed to, particularly, have a lot of unanswered
questions.  Would be good to get those addressed.

> Subject: [PATCH 11/29] deparse: Support CREATE/ALTER DOMAIN
> Subject: [PATCH 12/29] deparse: Support CREATE/ALTER FUNCTION
> Subject: [PATCH 13/29] deparse: Support ALTER TABLE
> Subject: [PATCH 14/29] deparse: Support CREATE VIEW
> Subject: [PATCH 15/29] deparse: Support CREATE OPERATOR FAMILY
> Subject: [PATCH 16/29] deparse: Support CREATE CONVERSION
> Subject: [PATCH 17/29] deparse: Support DefineStmt commands
> Subject: [PATCH 18/29] deparse: support CREATE CAST
> Subject: [PATCH 19/29] deparse: Support GRANT/REVOKE

These looked alright.

>  pg_event_trigger_get_creation_commands(PG_FUNCTION_ARGS)
>  {
> @@ -1915,6 +1997,30 @@ pg_event_trigger_get_creation_commands(PG_FUNCTION_ARGS)
>                  /* command */
>                  values[i++] = CStringGetTextDatum(command);
>              }
> +            else
> +            {
> +                Assert(cmd->type == SCT_Grant);
> +
> +                /* classid */
> +                nulls[i++] = true;
> +                /* objid */
> +                nulls[i++] = true;
> +                /* objsubid */
> +                nulls[i++] = true;
> +                /* command tag */
> +                values[i++] = CStringGetTextDatum(cmd->d.grant.istmt->is_grant ?
> +                                                  "GRANT" : "REVOKE");
> +                /* object_type */
> +                values[i++] = CStringGetTextDatum(cmd->d.grant.type);
> +                /* schema */
> +                nulls[i++] = true;
> +                /* identity */
> +                nulls[i++] = true;
> +                /* in_extension */
> +                values[i++] = BoolGetDatum(cmd->in_extension);
> +                /* command */
> +                values[i++] = CStringGetTextDatum(command);
> +            }

Really seems like we need to do better here, no?  That's a lot of things
just getting returned as NULLs...

> Subject: [PATCH 20/29] deparse: support COMMENT ON and SECURITY LABEL
> Subject: [PATCH 21/29] deparse: Handle default security provider.

> diff --git a/src/backend/commands/seclabel.c b/src/backend/commands/seclabel.c
> index 1ef98ce..aa4de97 100644
> --- a/src/backend/commands/seclabel.c
> +++ b/src/backend/commands/seclabel.c
> @@ -63,6 +63,11 @@ ExecSecLabelStmt(SecLabelStmt *stmt)
>                      (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>                       errmsg("must specify provider when multiple security label providers have been loaded")));
>          provider = (LabelProvider *) linitial(label_provider_list);
> +        /*
> +         * Set the provider in the statement so that DDL deparse can use
> +         * provider explicitly in generated statement.
> +         */
> +        stmt->provider = (char *) provider->provider_name;
>      }
>      else
>      {

This should really be in the SECURITY LABEL patch, no?  Not a big deal
tho.

> Subject: [PATCH 22/29] deparse: support CREATE TABLE AS
> Subject: [PATCH 23/29] deparse: support CREATE/ALTER POLICY
> Subject: [PATCH 24/29] deparse: support REFRESH MATERIALIZED VIEW
> Subject: [PATCH 25/29] deparse: support FDW-related commands
> Subject: [PATCH 26/29] infrastructure for ALTER OPERATOR FAMILY
> Subject: [PATCH 27/29] deparse: Support ALTER OPERATOR FAMILY
> Subject: [PATCH 28/29] deparse: support ALTER DEFAULT PRIVILEGES
> Subject: [PATCH 29/29] deparse: support CREATE OPERATOR CLASS

Looked alright.

Again though, as I got at above, I'm not sure we're really going to see
a lot of improvements or changes to this until it gets more exposure.
I'd really like to see this happen because I think it's a fantastic
introspective capability, but I do wish it had gone in about 4 months
ago.  Still, I'm on board with moving forward on this and look forward
to it being in the tree where we can get the buildfarm working on it and
other people testing and playing with it.
Thanks!
    Stephen

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

25 марта 2015 г., 21:15:23

Here's an updated version of this series.  I have fixed many issues; in
particular the supported command set is now complete, with the only
exception being CREATE TABLE AS EXECUTE.

I went to all the prior reviews and compiled a list of unhandled issues,
on which I will be working shortly.  I copied it below.  I wanted to
post an updated version so in case someone wants to give this another
look while I do so.

I have changed the ALTER TABLE preparatory patch: Stephen had previously
commented that it'd be better to have it return ObjAddress rather than
Oid/attnum.  I originally thought this wasn't an improvement, but after
looking again, it turns out that it is, so I changed it that way.
Things look better this way.

Note patch 0004: this modifies pg_event_trigger_dropped_objects to
report temporary objects as they are dropped.  In the original coding,
all temp objects are silently ignored, but for the purposes of the
testing module of this patch, this is a problem; consider the case where
a test create a temp object, then drops it, then creates another object
with the same name.  We need the sql_drop event trigger in the test
harness to report the dropped object, so that it can be dropped in the
regression_deparse database as well -- otherwise the second create
command fails because the first object still exists.

I have applied numerous other fixes.

For the first time I also include the regression test harness.  To run
this, execute "make deparsecheck" in src/test/regress.  This runs a
special schedule file, corresponding to serial_schedule minus the
tablespace test, and adds one test at the beginning and one test after
the end; those add event triggers to capture all commands and then run
the commands so captured in another database.  The output of that is
grepped for errors; there's a very short list of errors that are always
present.  Later, pg_dump is run on both databases, and the dumps are
compared.  Currently there are some differences in the dumps; those
correspond to remaining issues in the deparse code.  Regarding CREATE
TABLE AS EXECUTE, I had to add two "expected" files for two tests that
have an extra WARNING line when that command fails to deparse.  I would
love to, instead, be able to deparse the command to something sensible.
(This is a bit brittle, because it's run from a makefile rather than
inside pg_regress; it's not pretty when things fail.  Must be improved,
perhaps as a separate pg_regress binary that does all the process
control internally.)

Known remaining issues:

* Rename pg_event_trigger_get_creation_command().  Some ideas:
  - pg_event_trigger_commands()
  - pg_event_trigger_ddl_commands()
  - pg_event_trigger_deparsed_commands()
  - pg_event_trigger_deparsed_ddl_commands()

* CREATE SEQUENCE OWNED BY is not handled properly for freestanding
  sequences.

* Handle default_tablespace GUC in CREATE TABLE and others.

* Handle the extension version in some way to CREATE EXTENSION.

* RENAME and others need to handle the inheritance flags (ONLY, etc)

* Many constant strings in SQL syntax ought to use a JSON object
  representation with the "present" boolean thingy, instead of an empty
  string as currently.  That makes it easier to turn them on/off from
  event triggers without forcing the user to be aware of the exact
  spelling of each option.

* Should we prohibit DDL from within event triggers?

* Handle subxact aborts in ALTER TABLE processing.

* Maybe stash of commands should be an slist rather than List, just like
  the dropped objects list.

* For CREATE TABLE AS, the generated command does not have a column list,
  and the deparsed version might depend on temp tables for instance;
  maybe we need some way to control that one from producing a normal
  CREATE TABLE instead.

* The "provider" bit in SECURITY LABEL is messy.  There is no Oid for
  security providers, they're just loadable modules.

* event trigger currentEventTriggerContext was only being set up when
  there were sql_drop/table_rewrite events; we remove that and set it up
  always now.  That should be restored.

* There are several XXX/FIXME/TODO comments in deparse_utility.c.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: deparsing utility commands

От

Ryan Pedela

Дата:

25 марта 2015 г., 22:21:53

On Wed, Mar 25, 2015 at 11:59 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

* Should we prohibit DDL from within event triggers?

Please don't prohibit DDL unless there is a really, really good reason to do so. I have several use cases in mind for event triggers, but they are only useful if I can perform DDL.

For example, I want to create an Elasticsearch FDW and use that to index and search Postgres tables. When a table is created, I am planning to use an event trigger to capture the CREATE event and automatically create a foreign table via the Elasticsearch FDW. In addition, I would add normal triggers to the new table which capture DML and update the foreign table accordingly. In other words, I want to use FDWs and event triggers to automatically sync table DDL and DML to Elasticsearch.

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

25 марта 2015 г., 23:26:07

Alvaro Herrera wrote:
> Here's an updated version of this series.

I just pushed patches 0001 and 0002, with very small tweaks; those had
already been reviewed and it didn't seem like there was much
controversy.

To test the posted series it's probably easiest togit checkout b3196e65f5bfc997ec7fa3f91645a09289c10dee
and apply it all on top of that.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: deparsing utility commands

От

Robert Haas

Дата:

02 апреля 2015 г., 19:05:21

On Wed, Mar 25, 2015 at 3:21 PM, Ryan Pedela <rpedela@datalanche.com> wrote:
> On Wed, Mar 25, 2015 at 11:59 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
> wrote:
>> * Should we prohibit DDL from within event triggers?
>
> Please don't prohibit DDL unless there is a really, really good reason to do
> so. I have several use cases in mind for event triggers, but they are only
> useful if I can perform DDL.
>
> For example, I want to create an Elasticsearch FDW and use that to index and
> search Postgres tables. When a table is created, I am planning to use an
> event trigger to capture the CREATE event and automatically create a foreign
> table via the Elasticsearch FDW. In addition, I would add normal triggers to
> the new table which capture DML and update the foreign table accordingly. In
> other words, I want to use FDWs and event triggers to automatically sync
> table DDL and DML to Elasticsearch.

+1.  I think that if it's unsafe to run DDL from with event triggers,
then that might be a sign that it's not safe to run *any* user-defined
code at that location.  I think it's the job of the event trigger to
make sure that it doesn't assume anything about what may have changed
while the user-defined code was running.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: deparsing utility commands

От

Andres Freund

Дата:

02 апреля 2015 г., 19:11:16

On 2015-04-02 12:05:13 -0400, Robert Haas wrote:
> On Wed, Mar 25, 2015 at 3:21 PM, Ryan Pedela <rpedela@datalanche.com> wrote:
> > On Wed, Mar 25, 2015 at 11:59 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
> > wrote:
> >> * Should we prohibit DDL from within event triggers?
> >
> > Please don't prohibit DDL unless there is a really, really good reason to do
> > so. I have several use cases in mind for event triggers, but they are only
> > useful if I can perform DDL.
> >
> > For example, I want to create an Elasticsearch FDW and use that to index and
> > search Postgres tables. When a table is created, I am planning to use an
> > event trigger to capture the CREATE event and automatically create a foreign
> > table via the Elasticsearch FDW. In addition, I would add normal triggers to
> > the new table which capture DML and update the foreign table accordingly. In
> > other words, I want to use FDWs and event triggers to automatically sync
> > table DDL and DML to Elasticsearch.
> 
> +1.  I think that if it's unsafe to run DDL from with event triggers,
> then that might be a sign that it's not safe to run *any* user-defined
> code at that location.  I think it's the job of the event trigger to
> make sure that it doesn't assume anything about what may have changed
> while the user-defined code was running.

First of: I don't see a fundamental reason to forbid it, I think it's
just simpler to analyze that way. But I'm unconvinced by that
reasoning. As you know we prohibit executing DDL on some objects in
*lots* of places c.f. CheckTableNotInUse(), without that imo being a
sign of a more fundamental problem.

Greetings,

Andres Freund

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

07 апреля 2015 г., 23:32:13

Executive summary:

There is now a CommandDeparse_hook;
deparse_utility_command is provided as an extension, intended for 9.6;
rest of patch would be pushed to 9.5.


Long version:

I've made command deparsing hookable.  Attached there are three patches:
the first patch contains changes to core that just add the "command
list" stuff, so that on ddl_command_end there is access to what has been
executed.  This includes the OID of the object just created, the command
tag, and assorted other details; the deparsed command in JSON format is
not immediately part of the result.

The third patch contains all the deparse code, packaged as a contrib
module and extension named ddl_deparse.  Essentially, it's everything
that was previously in tcop/deparse_utility.c and utils/adt/ddl_json.c:
the stuff that takes the parsenode and OID of a command execution and
turns it into a JSON blob, and also the support function that takes the
JSON blob and converts back into the plain text rendering of the
command.

The second patch contains some changes to core code that support the
ddl_deparse extension; mainly some ruleutils.c changes.

What links patches 0001 and 0003 is a hook, CommandDeparse_hook.  If
unset, the pg_event_trigger_ddl_commands function returns some
boilerplate text like "no deparse function installed"; if the extension
is installed, the JSON rendering is returned instead and can be used
with the ddl_deparse_expand_command() function.

The rationale for doing things this way is that it will be useful to
have 9.5 expose the pg_event_trigger_ddl_commands() function for various
uses, while we refine the JSON bits some more and get it committed for
9.6.  In reviews, it's clear that there's some more bits to fiddle so
that it can be as general as possible.  I think we should label the
whole DDL command reporting as experimental in 9.5 and subject to
change, so that we can just remove the hook in 9.6 when the ddl_deparse
thing becomes part of core.

Thoughts?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

07 апреля 2015 г., 23:33:19

Attachments.


--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: deparsing utility commands

От

David Steele

Дата:

08 апреля 2015 г., 22:30:29

On 4/7/15 4:32 PM, Alvaro Herrera wrote:
> Executive summary:
>
> There is now a CommandDeparse_hook;
> deparse_utility_command is provided as an extension, intended for 9.6;
> rest of patch would be pushed to 9.5.
>
>
> Long version:
>
> I've made command deparsing hookable.  Attached there are three patches:
> the first patch contains changes to core that just add the "command
> list" stuff, so that on ddl_command_end there is access to what has been
> executed.  This includes the OID of the object just created, the command
> tag, and assorted other details; the deparsed command in JSON format is
> not immediately part of the result.

I'm looking forward to testing this with pg_audit.  I think the first
patch alone will give us the bulk of the desired functionality.

> The third patch contains all the deparse code, packaged as a contrib
> module and extension named ddl_deparse.  Essentially, it's everything
> that was previously in tcop/deparse_utility.c and utils/adt/ddl_json.c:
> the stuff that takes the parsenode and OID of a command execution and
> turns it into a JSON blob, and also the support function that takes the
> JSON blob and converts back into the plain text rendering of the
> command.
>
> The second patch contains some changes to core code that support the
> ddl_deparse extension; mainly some ruleutils.c changes.

> What links patches 0001 and 0003 is a hook, CommandDeparse_hook.  If
> unset, the pg_event_trigger_ddl_commands function returns some
> boilerplate text like "no deparse function installed"; if the extension
> is installed, the JSON rendering is returned instead and can be used
> with the ddl_deparse_expand_command() function.

I would prefer for the column to be NULL or to have a boolean column
that indicates if JSON output is available (or both).  I'd rather not do
string matching if I can help it (or test for invalid JSON).

> The rationale for doing things this way is that it will be useful to
> have 9.5 expose the pg_event_trigger_ddl_commands() function for various
> uses, while we refine the JSON bits some more and get it committed for
> 9.6.  In reviews, it's clear that there's some more bits to fiddle so
> that it can be as general as possible.  I think we should label the
> whole DDL command reporting as experimental in 9.5 and subject to
> change, so that we can just remove the hook in 9.6 when the ddl_deparse
> thing becomes part of core.

I'm completely on board with this.  I think deparse is a cool piece of
code and I see a bunch of potential uses for it, but at the same time
I'm not sure it has finished baking yet.  This is a smart and safe choice.

Is there a particular commit you'd recommend for applying the deparse
patches (especially the first set)?

--
- David Steele
david@pgmasters.net

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

09 апреля 2015 г., 19:14:22

Alvaro Herrera wrote:
> Executive summary:
>
> There is now a CommandDeparse_hook;
> deparse_utility_command is provided as an extension, intended for 9.6;
> rest of patch would be pushed to 9.5.

Actually here's another approach I like better: use a new pseudotype,
pg_ddl_command, which internally is just a pointer to a CollectedCommand
struct.  We cannot give access to the pointer directly in SQL, so much
like type internal or pg_node_tree the in/out functions should just
error out.  But the type can be returned as a column in
pg_event_trigger_ddl_command.  An extension can create a function that
receives that type as argument, and return a different representation of
the command; the third patch creates a function ddl_deparse_to_json()
which does that.

You can have as many extensions as you want, and your event triggers can
use the column as many times as necessary.  This removes the limitation
of the previous approach that you could only have a single extension
providing a CommandDeparse_hook function.

This patch applies cleanly on top of current master.  You need to
install the extension with "CREATE EXTENSION ddl_deparse" after building
it in contrib.

Note: the extension is NOT to be committed to core, only the first two
patches; that will give us more room to revisit the JSON representation
more promptly.  My intention is that the extension will be published
elsewhere.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: deparsing utility commands

От

David Steele

Дата:

14 апреля 2015 г., 19:56:25

On 4/9/15 12:14 PM, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>> Executive summary:
>>
>> There is now a CommandDeparse_hook;
>> deparse_utility_command is provided as an extension, intended for 9.6;
>> rest of patch would be pushed to 9.5.
>
> Actually here's another approach I like better: use a new pseudotype,
> pg_ddl_command, which internally is just a pointer to a CollectedCommand
> struct.  We cannot give access to the pointer directly in SQL, so much
> like type internal or pg_node_tree the in/out functions should just
> error out.  But the type can be returned as a column in
> pg_event_trigger_ddl_command.  An extension can create a function that
> receives that type as argument, and return a different representation of
> the command; the third patch creates a function ddl_deparse_to_json()
> which does that.

I've tested this approach and it returns the same results as the
previous approach for all calls to pg_event_trigger_ddl_command() made
by the pg_audit regression tests.  For my testing I was generally only
applying patch 0001 and 0002 (although 0001 on it's own also worked for
my case).  I applied patch 0003 only for the purpose of being sure it
did not break anything, and did not specifically test the functionality
of 0002 or 0003.

However, I've tested 0001 over a very wide range of commands and have
not found any anomalous behaviors.

I've reviewed the 0001 and 0002 patches and don't have anything to add
beyond what has been mentioned in previous reviews.  You've applied your
pattern consistently across a wide variety of utility commands which is
something of a feat.

Since 0003 will not be committed to core, I was more interested in the
mechanism used to connect 0003 to 0001 and 0002.  I like the new
approach better than the old one in that there is no connecting hook
now.  As you say, it provides more flexibility in terms of using the
command data in as many ways as you like.  Even better, the user has
access to three different representations and can choose which one is
best for them.  And, of course, they can write code for their own
representation using pg_ddl_command, JSON, or SQL.

I also like the way the patch has been broken up.  The core
functionality will finally make event triggers truly useful in a
high-level language like pl/pgsql.  I can see many uses for them now
that pg_event_trigger_ddl_command() is present.

From my review and testing, I see no barriers to committing patches 0001
and 0002.  I'd love to see this functionality in 9.5.

Regards,
--
- David Steele
david@pgmasters.net

Re: deparsing utility commands

От

Amit Kapila

Дата:

19 апреля 2015 г., 06:59:41

On Thu, Apr 9, 2015 at 9:44 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> Alvaro Herrera wrote:
> > Executive summary:
> >
> > There is now a CommandDeparse_hook;
> > deparse_utility_command is provided as an extension, intended for 9.6;
> > rest of patch would be pushed to 9.5.
>
> Actually here's another approach I like better: use a new pseudotype,
> pg_ddl_command, which internally is just a pointer to a CollectedCommand
> struct. We cannot give access to the pointer directly in SQL, so much
> like type internal or pg_node_tree the in/out functions should just
> error out. But the type can be returned as a column in
> pg_event_trigger_ddl_command. An extension can create a function that
> receives that type as argument, and return a different representation of
> the command; the third patch creates a function ddl_deparse_to_json()
> which does that.
>
> You can have as many extensions as you want, and your event triggers can
> use the column as many times as necessary. This removes the limitation
> of the previous approach that you could only have a single extension
> providing a CommandDeparse_hook function.
>

Few Comments/Questions regrading first 2 patches:

Regarding Patch 0001-deparse-infrastructure-needed-for-command-deparsing

+ * Currently, sql_drop, table_rewrite, ddL_command_end events are the only

/ddL_command_end/ddl_command_end

'L' should be in lower case.

+ * FIXME this API isn't considering the possibility that a xact/subxact is

+ * aborted partway through. Probably it's best to add an

+ * AtEOSubXact_EventTriggers() to fix this.

+ */

+void

+EventTriggerAlterTableEnd(void)

{

}

Wouldn't the same fix be required for RollbackToSavePoint case

as well? Do you intend to fix this in separate patch?

+/*-------------------------------------------------------------------------

+ *

+ * aclchk_internal.h

+ *

+/*-------------------------------------------------------------------------

+ *

+ * deparse_utility.h

+ *

+ /*

+ * copying the node is moderately challenging ... should we consider

+ * changing InternalGrant into a full-fledged node instead?

+ */

+ icopy = palloc(sizeof(InternalGrant));

+ memcpy(icopy, istmt, sizeof(InternalGrant));

+ icopy->objects = list_copy(istmt->objects);

Don't we need to copy (AclMode privileges;)?

-static void AlterOpFamilyAdd(List *opfamilyname, Oid amoid, Oid opfamilyoid,

+static void AlterOpFamilyAdd(AlterOpFamilyStmt *stmt,

+ List *opfamilyname, Oid amoid, Oid opfamilyoid,

int maxOpNumber, int maxProcNumber,

List *items);

-static void AlterOpFamilyDrop(List *opfamilyname, Oid amoid, Oid opfamilyoid,

+static void AlterOpFamilyDrop(AlterOpFamilyStmt *stmt,

+ List *opfamilyname, Oid amoid, Oid opfamilyoid,

int maxOpNumber, int maxProcNumber,

List *items);

Now that both the above functions have parameter AlterOpFamilyStmt *stmt,

so can't we get rid of second parameter List *opfamilyname as that

is part of structure AlterOpFamilyStmt?

@@ -1175,204 +1229,258 @@ ProcessUtilitySlow(Node *parsetree,

+ EventTriggerAlterTableStart(parsetree);

+ address =

+ DefineIndex(relid, /* OID of heap relation */

+ stmt,

+ InvalidOid, /* no predefined OID */

+ false, /* is_alter_table */

+ true, /* check_rights */

+ false, /* skip_build */

+ false); /* quiet */

+ /*

+ * Add the CREATE INDEX node itself to stash right away; if

+ * there were any commands stashed in the ALTER TABLE code,

+ * we need them to appear after this one.

+ */

+ EventTriggerCollectSimpleCommand(address, secondaryObject,

+ parsetree);

+ commandCollected = true;

+ EventTriggerAlterTableEnd();

Is there a reason why EventTriggerAlterTableRelid() is not called

after EventTriggerAlterTableStart() in this flow?

+void

+EventTriggerInhibitCommandCollection(void)

+void

+EventTriggerUndoInhibitCommandCollection(void)

These function names are understandable, some alternative names could be

InhibitEventTriggerCommandCollection(), PreventEventTriggerCommandCollection(),

or ProhibitEventTriggerCommandCollection() if you prefer anyone of these

over others.

case T_CreateOpClassStmt:

- DefineOpClass((CreateOpClassStmt *) parsetree);

+ address = DefineOpClass((CreateOpClassStmt *) parsetree);

+ /* command is stashed in DefineOpClass */

+ commandCollected = true;

Is there a need to remember address if command is already

collected?

Regarding Patch 0002-changes-to-core-to-support-the-deparse-extension:

+char *

+format_procedure_args(Oid procedure_oid, bool force_qualify)

Is there a reason of exposing new API instead of using existing API

format_procedure_parts()?

As far as I can see the only difference is that in new API, you have

control of whether to schema_qualify it, is that the main reason of

exposing new API?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: deparsing utility commands

От

Dimitri Fontaine

Дата:

28 апреля 2015 г., 18:23:53

Hi,

As a comment to the whole email below, I think the following approach is
the best compromise we will find, allowing everybody to come up with
their own format much as in the Logical Decoding plugins world.

Of course, as in the Logical Decoding area, BDR and similar projects
will implement their own representation, in BDR IIUC the DDL will get
translated into a JSON based AST thing.

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Actually here's another approach I like better: use a new pseudotype,
> pg_ddl_command, which internally is just a pointer to a CollectedCommand
> struct.  We cannot give access to the pointer directly in SQL, so much
> like type internal or pg_node_tree the in/out functions should just
> error out.  But the type can be returned as a column in
> pg_event_trigger_ddl_command.  An extension can create a function that
> receives that type as argument, and return a different representation of
> the command; the third patch creates a function ddl_deparse_to_json()
> which does that.
>
> You can have as many extensions as you want, and your event triggers can
> use the column as many times as necessary.  This removes the limitation
> of the previous approach that you could only have a single extension
> providing a CommandDeparse_hook function.
>
> This patch applies cleanly on top of current master.  You need to
> install the extension with "CREATE EXTENSION ddl_deparse" after building
> it in contrib.
>
> Note: the extension is NOT to be committed to core, only the first two
> patches; that will give us more room to revisit the JSON representation
> more promptly.  My intention is that the extension will be published
> elsewhere.

+1 from me,

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

04 мая 2015 г., 21:57:37

Here's a cleaned up version of this patch; I threw together a very quick
test module, and updated a conflicting OID.  As far as I can tell, I'm
only missing the documentation updates before this is push-able.

One change to note is that the AlterTable support used to ignore
commands that didn't match the OID as set by
EventTriggerAlterTableRelid(); the comment there said that the point was
to avoid collecting the same commands to child tables as recursion
occured in execution.  I think that would be imposing such a decision;
perhaps some users of this infrastructure want to know about the
operations as they happen on child tables too.  With this definition it
is up to the user module to ignore the duplicates.

Thanks for your review.  In reply to your comments:

Amit Kapila wrote:

> Few Comments/Questions regrading first 2 patches:
>
> Regarding Patch 0001-deparse-infrastructure-needed-for-command-deparsing
>
> 1.
> + * Currently, sql_drop, table_rewrite, ddL_command_end events are the only
>
> /ddL_command_end/ddl_command_end
>
> 'L' should be in lower case.

True. Fixed.

> 2.
> + * FIXME this API isn't considering the possibility that a xact/subxact is
> + * aborted partway through.  Probably it's best to add an
> + * AtEOSubXact_EventTriggers() to fix this.
> + */
> +void
> +EventTriggerAlterTableEnd(void)
> {
> ..
> }
>
> Wouldn't the same fix be required for RollbackToSavePoint case
> as well?  Do you intend to fix this in separate patch?

Acrually, I figured that this is not at issue.  When a subxact is rolled
back, the whole currentEventTriggerState thing is reverted to a previous
item in the stack; if an event trigger is fired by ALTER TABLE, and the
resulting function invokes ALTER TABLE again, they collect their
commands in separate state elements, so there is never a conflict.
I can't think of any situation in which one event trigger function adds
elements to the list of the calling command.

> 3.
> + * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
> + * Portions Copyright (c) 1996-2012, PostgreSQL Global Development Group
>
> Copyright notice years should be same.

Yeah, fixed.

> 4.
> + /*
> + * copying the node is moderately challenging ... should we consider
> + * changing InternalGrant into a full-fledged node instead?
> + */
> + icopy = palloc(sizeof(InternalGrant));
> + memcpy(icopy, istmt, sizeof(InternalGrant));
> + icopy->objects = list_copy(istmt->objects);
>
> Don't we need to copy (AclMode privileges;)?

AFAICT that's copied by memcpy.

> 5.
> -static void AlterOpFamilyAdd(List *opfamilyname, Oid amoid, Oid
> opfamilyoid,
> +static void AlterOpFamilyAdd(AlterOpFamilyStmt *stmt,
> + List *opfamilyname, Oid amoid, Oid opfamilyoid,
>   int maxOpNumber, int maxProcNumber,
>   List *items);
> -static void AlterOpFamilyDrop(List *opfamilyname, Oid amoid, Oid
> opfamilyoid,
> +static void AlterOpFamilyDrop(AlterOpFamilyStmt *stmt,
> +  List *opfamilyname, Oid amoid, Oid opfamilyoid,
>    int maxOpNumber, int maxProcNumber,
>    List *items);
>
> Now that both the above functions have parameter AlterOpFamilyStmt *stmt,
> so can't we get rid of second parameter List *opfamilyname as that
> is part of structure AlterOpFamilyStmt?

Yeah, I considered that as I wrote the code but dropped it out of
laziness.  I have done so now.

> 6.
> @@ -1175,204 +1229,258 @@ ProcessUtilitySlow(Node *parsetree,
> ..
> + EventTriggerAlterTableStart(parsetree);
> + address =
> + DefineIndex(relid, /* OID of heap relation */
> + stmt,
> + InvalidOid, /* no predefined OID */
> + false, /* is_alter_table */
> + true, /* check_rights */
> + false, /* skip_build */
> + false); /* quiet */
> + /*
> + * Add the CREATE INDEX node itself to stash right away; if
> + * there were any commands stashed in the ALTER TABLE code,
> + * we need them to appear after this one.
> + */
> + EventTriggerCollectSimpleCommand(address, secondaryObject,
> + parsetree);
> + commandCollected = true;
> + EventTriggerAlterTableEnd();
>
> Is there a reason why EventTriggerAlterTableRelid() is not called
> after EventTriggerAlterTableStart() in this flow?

All paths that go through AlterTableInternal() have the Oid set by that
function.

> 7.
> +void
> +EventTriggerInhibitCommandCollection(void)
>
> +void
> +EventTriggerUndoInhibitCommandCollection(void)
>
> These function names are understandable, some alternative names could be
> InhibitEventTriggerCommandCollection(),
> PreventEventTriggerCommandCollection(),
> or ProhibitEventTriggerCommandCollection() if you prefer anyone of these
> over others.

Hmm, most of the reason I picked these names is the EventTrigger prefix.

> 8.
>   case T_CreateOpClassStmt:
> - DefineOpClass((CreateOpClassStmt *) parsetree);
> + address = DefineOpClass((CreateOpClassStmt *) parsetree);
> + /* command is stashed in DefineOpClass */
> + commandCollected = true;
>
> Is there a need to remember address if command is already
> collected?

None.  Removed that.

> Regarding Patch 0002-changes-to-core-to-support-the-deparse-extension:

I decided against committing 0002 patch for now, as it's mostly code
that would not have any callers in core.  If needed, in BDR we can just
duplicate the ruleutils.c code ... it's not a huge problem.  We can
reconsider later.

Note: for BDR, the JSON representation is pointless complication.  All
it wants is the "normalized" command string unchanged, i.e. add schema
names everywhere and such.  We came up with the JSON representation as a
way to appease reviewers here who wanted truly generalized access to the
parse tree.  As far as BDR is concerned, we could just as well remove
all the JSON stuff and go back to some simpler representation ...  The
only reason to keep it is the expectation that someday (9.6 hopefully)
we will include the whole thing in core.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

ddl-deparse.patch

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

08 мая 2015 г., 21:13:43

Alvaro Herrera wrote:
> Here's a cleaned up version of this patch; I threw together a very quick
> test module, and updated a conflicting OID.  As far as I can tell, I'm
> only missing the documentation updates before this is push-able.

Here is a complete version.  Barring serious problems, I intend to
commit this on Monday.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: deparsing utility commands

От

Fabrízio de Royes Mello

Дата:

08 мая 2015 г., 21:49:51

<div dir="ltr"><div class="gmail_extra"><br />On Fri, May 8, 2015 at 3:14 PM, Alvaro Herrera <<a
href="mailto:alvherre@2ndquadrant.com">alvherre@2ndquadrant.com</a>>wrote:<br />><br />> Alvaro Herrera
wrote:<br/>> > Here's a cleaned up version of this patch; I threw together a very quick<br />> > test
module,and updated a conflicting OID.  As far as I can tell, I'm<br />> > only missing the documentation updates
beforethis is push-able.<br />><br />> Here is a complete version.  Barring serious problems, I intend to<br
/>>commit this on Monday.<br />><br /><br /></div><div class="gmail_extra">You forgot to attach the patches!<br
/></div><divclass="gmail_extra"><br /></div><div class="gmail_extra">Regards,<br /></div><div class="gmail_extra"><br
/>--<br/>Fabrízio de Royes Mello<br />Consultoria/Coaching PostgreSQL<br />>> Timbira: <a
href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog: <a
href="http://fabriziomello.github.io">http://fabriziomello.github.io</a><br/>>> Linkedin: <a
href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a
href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a><br/>>> Github: <a
href="http://github.com/fabriziomello">http://github.com/fabriziomello</a></div></div>

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

08 мая 2015 г., 22:28:58

Alvaro Herrera wrote:
> Alvaro Herrera wrote:
> > Here's a cleaned up version of this patch; I threw together a very quick
> > test module, and updated a conflicting OID.  As far as I can tell, I'm
> > only missing the documentation updates before this is push-able.
>
> Here is a complete version.  Barring serious problems, I intend to
> commit this on Monday.


--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

ddl-deparse-3.patch

Re: deparsing utility commands

От

David Steele

Дата:

11 мая 2015 г., 22:38:38

On 5/8/15 3:29 PM, Alvaro Herrera wrote:
>> Here is a complete version.  Barring serious problems, I intend to
>> commit this on Monday.

I have reviewed and tested this patch and everything looks good to me.
It also looks like you added better coverage for schema DDL, which is a
welcome addition.

--
- David Steele
david@pgmasters.net

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

12 мая 2015 г., 01:24:36

David Steele wrote:

> I have reviewed and tested this patch and everything looks good to me.
> It also looks like you added better coverage for schema DDL, which is a
> welcome addition.

Thanks -- I have pushed this now.

My hope is to get this test module extended quite a bit, not only to
cover existing commands, but also so that it causes future changes to
cause failure unless command collection is considered.  (In a previous
version of this patch, there was a test mode that ran everything in the
serial_schedule of regular regression tests.  That was IMV a good way to
ensure that new commands were also tested to run under command
collection.  That hasn't been enabled in the new test module, and I
think it's necessary.)

If anyone wants to contribute to the test module so that more is
covered, that would be much appreciated.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: deparsing utility commands

От

"Shulgin, Oleksandr"

Дата:

29 июля 2015 г., 13:45:03

On Tue, May 12, 2015 at 12:25 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

David Steele wrote:

> I have reviewed and tested this patch and everything looks good to me.
> It also looks like you added better coverage for schema DDL, which is a
> welcome addition.

Thanks -- I have pushed this now.

Hi,

I've tried compiling the 0003-ddl_deparse-extension part from http://www.postgresql.org/message-id/20150409161419.GC4369@alvh.no-ip.org on current master and that has failed because the 0002 part hasn't been actually pushed (I've asked Alvaro off the list about this, that's how I know the reason ;-).

I was able to update the 0002 part so it applies cleanly (updated version attached), and then the contrib module compiles after one minor change and seems to work.

I've started to look into what it would take to move 0002's code to the extension itself, and I've got a question about use of printTypmod() in format_type_detailed():

if (typemod >= 0)

*typemodstr = printTypmod(NULL, typemod, typeform->typmodout);

else

*typemodstr = pstrdup("");

Given that printTypmod() does psprintf("%s%s") one way or the other, shouldn't we pass an empty string here instead of NULL as typname argument?

My hope is to get this test module extended quite a bit, not only to
cover existing commands, but also so that it causes future changes to
cause failure unless command collection is considered. (In a previous
version of this patch, there was a test mode that ran everything in the
serial_schedule of regular regression tests. That was IMV a good way to
ensure that new commands were also tested to run under command
collection. That hasn't been enabled in the new test module, and I
think it's necessary.)

If anyone wants to contribute to the test module so that more is
covered, that would be much appreciated.

I'm planning to have a look at this part also.

Alex

Вложения

ddl_deparse-core-support.patch

Re: deparsing utility commands

От

"Shulgin, Oleksandr"

Дата:

31 июля 2015 г., 16:46:09

On Wed, Jul 29, 2015 at 12:44 PM, Shulgin, Oleksandr <oleksandr.shulgin@zalando.de> wrote:

On Tue, May 12, 2015 at 12:25 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
David Steele wrote:

> I have reviewed and tested this patch and everything looks good to me.
> It also looks like you added better coverage for schema DDL, which is a
> welcome addition.

Thanks -- I have pushed this now.

Hi,

I've tried compiling the 0003-ddl_deparse-extension part from http://www.postgresql.org/message-id/20150409161419.GC4369@alvh.no-ip.org on current master and that has failed because the 0002 part hasn't been actually pushed (I've asked Alvaro off the list about this, that's how I know the reason ;-).

I was able to update the 0002 part so it applies cleanly (updated version attached), and then the contrib module compiles after one minor change and seems to work.

I've started to look into what it would take to move 0002's code to the extension itself, and I've got a question about use of printTypmod() in format_type_detailed():

if (typemod >= 0)
*typemodstr = printTypmod(NULL, typemod, typeform->typmodout);
else
*typemodstr = pstrdup("");

Given that printTypmod() does psprintf("%s%s") one way or the other, shouldn't we pass an empty string here instead of NULL as typname argument?

Testing shows this is a bug indeed, easily triggered by deparsing any type with typmod.

My hope is to get this test module extended quite a bit, not only to
cover existing commands, but also so that it causes future changes to
cause failure unless command collection is considered. (In a previous
version of this patch, there was a test mode that ran everything in the
serial_schedule of regular regression tests. That was IMV a good way to
ensure that new commands were also tested to run under command
collection. That hasn't been enabled in the new test module, and I
think it's necessary.)

If anyone wants to contribute to the test module so that more is
covered, that would be much appreciated.

I'm planning to have a look at this part also.

While running deparsecheck suite I'm getting a number of oddly looking errors:

WARNING: state: 42883 errm: operator does not exist: pg_catalog.oid = pg_catalog.oid

This is caused by deparsing create view, e.g.:

STATEMENT: create view v1 as select * from t1 ;

ERROR: operator does not exist: pg_catalog.oid = pg_catalog.oid at character 52

HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.

QUERY: SELECT * FROM pg_catalog.pg_rewrite WHERE ev_class = $1 AND rulename = $2

CONTEXT: PL/pgSQL function test_ddl_deparse() line 1 at FOR over SELECT rows

The pg_rewrite query comes from ruleutils.c, while ddl_deparse.c calls it through pg_get_viewdef_internal() but don't understand how is it different from e.g., select pg_get_viewdef(...), and that last one is not affected.

Thoughts?

Alex

Re: deparsing utility commands

От

Jim Nasby

Дата:

05 августа 2015 г., 22:20:46

On 7/31/15 8:45 AM, Shulgin, Oleksandr wrote:
> While running deparsecheck suite I'm getting a number of oddly looking
> errors:
>
> WARNING:  state: 42883 errm: operator does not exist: pg_catalog.oid =
> pg_catalog.oid
>
> This is caused by deparsing create view, e.g.:
>
> STATEMENT:  create view v1 as select * from t1 ;
> ERROR:  operator does not exist: pg_catalog.oid = pg_catalog.oid at
> character 52
> HINT:  No operator matches the given name and argument type(s). You
> might need to add explicit type casts.
> QUERY:  SELECT * FROM pg_catalog.pg_rewrite WHERE ev_class = $1 AND
> rulename = $2
> CONTEXT:  PL/pgSQL function test_ddl_deparse() line 1 at FOR over SELECT
> rows
>
> The pg_rewrite query comes from ruleutils.c, while ddl_deparse.c calls
> it through pg_get_viewdef_internal() but don't understand how is it
> different from e.g., select pg_get_viewdef(...), and that last one is
> not affected.

I'm not sure what test_ddl_deparse is doing, is that where the oid = oid 
is coming from?

It might be enlightening to replace = with OPERATOR(pg_catalog.=) and 
see if that works.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

06 августа 2015 г., 05:55:41

Jim Nasby wrote:
> On 7/31/15 8:45 AM, Shulgin, Oleksandr wrote:

> >STATEMENT:  create view v1 as select * from t1 ;
> >ERROR:  operator does not exist: pg_catalog.oid = pg_catalog.oid at
> >character 52
> >HINT:  No operator matches the given name and argument type(s). You
> >might need to add explicit type casts.
> >QUERY:  SELECT * FROM pg_catalog.pg_rewrite WHERE ev_class = $1 AND
> >rulename = $2
> >CONTEXT:  PL/pgSQL function test_ddl_deparse() line 1 at FOR over SELECT
> >rows

> I'm not sure what test_ddl_deparse is doing, is that where the oid = oid is
> coming from?
> 
> It might be enlightening to replace = with OPERATOR(pg_catalog.=) and see if
> that works.

That worked, thanks!

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: deparsing utility commands

От

Jim Nasby

Дата:

06 августа 2015 г., 18:26:19

On 8/5/15 9:55 PM, Alvaro Herrera wrote:
> Jim Nasby wrote:
>> On 7/31/15 8:45 AM, Shulgin, Oleksandr wrote:
>
>>> STATEMENT:  create view v1 as select * from t1 ;
>>> ERROR:  operator does not exist: pg_catalog.oid = pg_catalog.oid at
>>> character 52
>>> HINT:  No operator matches the given name and argument type(s). You
>>> might need to add explicit type casts.
>>> QUERY:  SELECT * FROM pg_catalog.pg_rewrite WHERE ev_class = $1 AND
>>> rulename = $2
>>> CONTEXT:  PL/pgSQL function test_ddl_deparse() line 1 at FOR over SELECT
>>> rows
>
>> I'm not sure what test_ddl_deparse is doing, is that where the oid = oid is
>> coming from?
>>
>> It might be enlightening to replace = with OPERATOR(pg_catalog.=) and see if
>> that works.
>
> That worked, thanks!

Good, but why weren't operator resolution rules working correctly here 
to begin with?

FWIW, my interest in this is largely due to the problems I've had with 
this in the variant type. In particular, using the same resolution rules 
for functions and operators. So I'm wondering if there's a bigger issue 
here.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

06 августа 2015 г., 18:32:43

Jim Nasby wrote:

> FWIW, my interest in this is largely due to the problems I've had with this
> in the variant type. In particular, using the same resolution rules for
> functions and operators. So I'm wondering if there's a bigger issue here.

I'd be glad to review your variant stuff, but I doubt it's the same
thing at play.  Please don't hijack this thread.  Deparse is doing some
pretty nasty games with the backend innards to make the deparsed
representation usable generally.  We were just missing a simple trick.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: deparsing utility commands

От

"Shulgin, Oleksandr"

Дата:

20 августа 2015 г., 17:29:19

My hope is to get this test module extended quite a bit, not only to
cover existing commands, but also so that it causes future changes to
cause failure unless command collection is considered. (In a previous
version of this patch, there was a test mode that ran everything in the
serial_schedule of regular regression tests. That was IMV a good way to
ensure that new commands were also tested to run under command
collection. That hasn't been enabled in the new test module, and I
think it's necessary.)

If anyone wants to contribute to the test module so that more is
covered, that would be much appreciated.

I'm planning to have a look at this part also.

I've made some progress with this patch recently. Now I was able to get rid of most errors in the deparsecheck run and it's time to have a look on deparse_dump.errors. For now it can be summarized as follows:

src/test/regress$ wc -l results/deparse_dump.errors

147 results/deparse_dump.errors

src/test/regress$ < results/deparse_dump.errors sed 's/"[^"]*"\|[0-9]\+/???/g' | sort | uniq -c | sort -nr

98 ERROR: relation ??? does not exist

11 ERROR: column ??? of relation ??? does not exist

10 ERROR: role ??? does not exist

9 ERROR: large object ??? does not exist

6 ERROR: cannot alter type of column ??? twice

4 ERROR: column ??? of relation ??? already exists

2 ERROR: constraint ??? of relation ??? does not exist

1 ERROR: type public.rw_view??? does not exist

1 ERROR: schema ??? does not exist

1 ERROR: invalid value for parameter ???: ???

1 ERROR: invalid reference to FROM-clause entry for table ???

1 ERROR: index ??? does not exist

1 ERROR: column data type integer can only have storage PLAIN

1 ERROR: cannot alter type ??? because it is the type of a typed table

A particularly nasty one is:

ERROR: index "cwi_replaced_pkey" does not exist

The test statement that's causing it is this one:

ALTER TABLE cwi_test DROP CONSTRAINT cwi_uniq_idx,

ADD CONSTRAINT cwi_replaced_pkey PRIMARY KEY

USING INDEX cwi_uniq2_idx;

Which gets deparsed as:

ALTER TABLE cwi_test DROP CONSTRAINT cwi_uniq_idx,

ADD CONSTRAINT cwi_replaced_pkey PRIMARY KEY

USING INDEX cwi_replaced_pkey;

The problem is that during constraint transformation, the index is being renamed to match the constraint name and at the deparse stage the original index name appears to be lost completely... I haven't figure out if there's a way around unless we introduce a new field in IndexStmt (originalName?) to cover exactly this corner case.

The updated versions of the core-support patch and the contrib modules are attached.

Alex

Вложения

Re: deparsing utility commands

От

"Shulgin, Oleksandr"

Дата:

20 августа 2015 г., 19:27:36

On Thu, Aug 20, 2015 at 4:28 PM, Shulgin, Oleksandr <oleksandr.shulgin@zalando.de> wrote:

Which gets deparsed as:

ALTER TABLE cwi_test DROP CONSTRAINT cwi_uniq_idx,
ADD CONSTRAINT cwi_replaced_pkey PRIMARY KEY
USING INDEX cwi_replaced_pkey;

The problem is that during constraint transformation, the index is being renamed to match the constraint name and at the deparse stage the original index name appears to be lost completely... I haven't figure out if there's a way around unless we introduce a new field in IndexStmt (originalName?) to cover exactly this corner case.

The updated versions of the core-support patch and the contrib modules are attached.

Another quirk of ALTER TABLE is that due to multi-pass processing in ATRewriteCatalogs, the same command might be collected a number of times. For example, in src/test/regress/sql/inherit.sql:

alter table a alter column aa type integer using bit_length(aa);

the "alter column type" then appears 4 times in the deparsed output as identical subcommands of a single ALTER TABLE command.

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

20 августа 2015 г., 19:36:13

Shulgin, Oleksandr wrote:

> Another quirk of ALTER TABLE is that due to multi-pass processing in
> ATRewriteCatalogs, the same command might be collected a number of times.
> For example, in src/test/regress/sql/inherit.sql:
> 
> alter table a alter column aa type integer using bit_length(aa);
> 
> the "alter column type" then appears 4 times in the deparsed output as
> identical subcommands of a single ALTER TABLE command.

Yeah, I had a hack somewhere in the collection code that if the relation
ID was different from what was specified, then the command was ignored.
I removed that before commit because it seemed possible that for some
cases you actually want the command reported separately for each child.

I think our best option in this case is to ignore commands that are
reported for different relations during JSON deparsing.  Not sure how
easy that is.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: deparsing utility commands

От

Alvaro Herrera

Дата:

20 августа 2015 г., 19:46:33

Shulgin, Oleksandr wrote:

> A particularly nasty one is:
> 
> ERROR:  index "cwi_replaced_pkey" does not exist
> 
> The test statement that's causing it is this one:
> 
> ALTER TABLE cwi_test DROP CONSTRAINT cwi_uniq_idx,
> ADD CONSTRAINT cwi_replaced_pkey PRIMARY KEY
> USING INDEX cwi_uniq2_idx;
> 
> Which gets deparsed as:
> 
> ALTER TABLE cwi_test DROP CONSTRAINT cwi_uniq_idx,
> ADD CONSTRAINT cwi_replaced_pkey PRIMARY KEY
> USING INDEX cwi_replaced_pkey;
> 
> The problem is that during constraint transformation, the index is being
> renamed to match the constraint name and at the deparse stage the original
> index name appears to be lost completely...  I haven't figure out if
> there's a way around unless we introduce a new field in IndexStmt
> (originalName?) to cover exactly this corner case.

Oh :-(  Hmm, modifying parse nodes should normally be considered last
resort, and at the same time identifying objects based on names rather
than OIDs is frowned upon.  But I don't see any way to handle this
otherwise.  I'm not sure what's the best solution for this one.

> The updated versions of the core-support patch and the contrib modules are
> attached.

Please try to split things up, or at least don't mix more than it
already is.  For instance, the tsconfig mapping stuff should be its own
patch; we can probably make a case for pushing that on its own.

Also I see you added one caller of EventTriggerInhibitCommandCollection.
I don't like that crap at all and I think we would be better off if we
were able to remove it completely.  Please see whether it's possible to
handle this case in some other way.

You seem to have squashed the patches?  Please keep the split out.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: deparsing utility commands

От

"Shulgin, Oleksandr"

Дата:

21 августа 2015 г., 15:40:23

On Thu, Aug 20, 2015 at 6:46 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Shulgin, Oleksandr wrote:

> A particularly nasty one is:
>
> ERROR: index "cwi_replaced_pkey" does not exist
>
> The test statement that's causing it is this one:
>
> ALTER TABLE cwi_test DROP CONSTRAINT cwi_uniq_idx,
> ADD CONSTRAINT cwi_replaced_pkey PRIMARY KEY
> USING INDEX cwi_uniq2_idx;
>
> Which gets deparsed as:
>
> ALTER TABLE cwi_test DROP CONSTRAINT cwi_uniq_idx,
> ADD CONSTRAINT cwi_replaced_pkey PRIMARY KEY
> USING INDEX cwi_replaced_pkey;
>
> The problem is that during constraint transformation, the index is being
> renamed to match the constraint name and at the deparse stage the original
> index name appears to be lost completely... I haven't figure out if
> there's a way around unless we introduce a new field in IndexStmt
> (originalName?) to cover exactly this corner case.

Oh :-( Hmm, modifying parse nodes should normally be considered last
resort, and at the same time identifying objects based on names rather
than OIDs is frowned upon. But I don't see any way to handle this
otherwise. I'm not sure what's the best solution for this one.

Yeah, me neither. At least one can see that in the Constraint struct we have both indexname and conname, so adding conname to IndexStmt doesn't seem to be really in disconnect with that (especially since we're constructing an IndexStmt from a Constraint in this case). See patch #5.

Or maybe we should drop the isconstraint/deferrable/initdeferred fields and put there an optional pointer to Constraint struct instead? Just an idea, I didn't check how intrusive such a change might be.

> The updated versions of the core-support patch and the contrib modules are
> attached.

Please try to split things up, or at least don't mix more than it
already is. For instance, the tsconfig mapping stuff should be its own
patch; we can probably make a case for pushing that on its own.

Also I see you added one caller of EventTriggerInhibitCommandCollection.
I don't like that crap at all and I think we would be better off if we
were able to remove it completely. Please see whether it's possible to
handle this case in some other way.

Well, I've only added it to suppress the extra SET NOT NULL sub-commands produced by the original ADD CONSTRAINT PRIMARY KEY command. The deparsed command is still correct I believe, though a bit too verbose:

ALTER TABLE public.cwi_test

ALTER COLUMN a SET NOT NULL,

ALTER COLUMN b SET NOT NULL,

ADD CONSTRAINT cwi_uniq_idx PRIMARY KEY USING INDEX cwi_uniq_idx NOT DEFERRABLE INITIALLY IMMEDIATE;

The only other place where this inhibiting is employed is around refresh matview command and it prevents extra plumbing output like this:

CREATE TEMPORARY TABLE pg_temp.pg_temp_32070_2 WITH (oids=OFF) AS

SELECT mv.ctid AS tid, newdata.*::pg_temp_32070 AS newdata FROM (...);

Now, what I think would make the most sense is still capture all the intermediary low-level commands, but put them hierarchically under the command invoking them, so that interested clients can still inspect them, but for the purpose of reconstructing of captured DDL events one needs to replay only the top-level commands.

From what I can see, this can be achieved by removing check on isCompleteQuery in ProcessUtilitySlow and run all EventTrigger*() regardless, which will stack the captured commands as they are being captured.

The question when arises is how to reflect the command hierarchy in the output of pg_event_trigger_ddl_commands(). Maybe simply assigning an integer id and referencing it in an additional parent_command_id column would do the trick.

> Another quirk of ALTER TABLE is that due to multi-pass processing in
> ATRewriteCatalogs, the same command might be collected a number of times.
> For example, in src/test/regress/sql/inherit.sql:
>
> alter table a alter column aa type integer using bit_length(aa);
>
> the "alter column type" then appears 4 times in the deparsed output as
> identical subcommands of a single ALTER TABLE command.

Yeah, I had a hack somewhere in the collection code that if the relation
ID was different from what was specified, then the command was ignored.
I removed that before commit because it seemed possible that for some
cases you actually want the command reported separately for each child.

I think our best option in this case is to ignore commands that are
reported for different relations during JSON deparsing. Not sure how
easy that is.

Hm, but there is no different relation in this case, it's just that we get the "ALTER COLUMN ... SET DATA TYPE" sub-command gets repeated multiple times:

ALTER TABLE public.a

ALTER COLUMN aa SET DATA TYPE pg_catalog.int4 USING pg_catalog.bit_length(aa),

ALTER COLUMN aa SET DATA TYPE pg_catalog.int4 USING pg_catalog.bit_length(aa);

Oh, five times actually...

You seem to have squashed the patches? Please keep the split out.

Well, if that makes the review process easier :-)

Alex

Вложения

Re: deparsing utility commands

От

Ajin Cherian

Дата:

13 апреля 2022 г., 07:29:02

On Wed, Apr 13, 2022 at 2:12 PM Shulgin, Oleksandr
<oleksandr.shulgin@zalando.de> wrote:

>> You seem to have squashed the patches?  Please keep the split out.
>
>
> Well, if that makes the review process easier :-)
>
> --
> Alex
>

This patch-set has not been rebased for some time. I see that there is
interest in this patch from the logical
replication of DDL thread [1].

I will take a stab at rebasing this patch-set, I have already rebased
the first patch and will work on the other
patches in the coming days. Do review and give me feedback.

regards,
Ajin Cherian
Fujitsu Australia

Вложения

0001-ddl_deparse-core-support.patch

Re: deparsing utility commands

От

Ajin Cherian

Дата:

13 апреля 2022 г., 07:33:13

On Wed, Apr 13, 2022 at 2:29 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
>
> This patch-set has not been rebased for some time. I see that there is
> interest in this patch from the logical
> replication of DDL thread [1].
>

Forgot to add the link to the thread.

[1] - https://www.postgresql.org/message-id/202203162206.7spggyktx63e@alvherre.pgsql

Re: deparsing utility commands

От

Peter Eisentraut

Дата:

13 апреля 2022 г., 17:19:30

On 13.04.22 06:29, Ajin Cherian wrote:
> This patch-set has not been rebased for some time. I see that there is
> interest in this patch from the logical
> replication of DDL thread [1].
> 
> I will take a stab at rebasing this patch-set, I have already rebased
> the first patch and will work on the other
> patches in the coming days. Do review and give me feedback.

The patch you posted contains neither a detailed commit message nor 
documentation or test changes, so it's impossible to tell what it's 
supposed to do.

Re: deparsing utility commands

От

Ajin Cherian

Дата:

15 апреля 2022 г., 19:40:53

On Wed, Apr 13, 2022 at 2:12 PM Shulgin, Oleksandr
<oleksandr.shulgin@zalando.de> wrote:
>
>
>> You seem to have squashed the patches?  Please keep the split out.
>
>
> Well, if that makes the review process easier :-)
>
> --
> Alex
>

I've rebased patches 1, 2 and 5 (now 1,2 and 3). Patches 3 and 4 seem
to be related to the testing of the extension and contrib module
ddl_deparse which is not in this patch-set, so I have left those
patches out, as I have no way of testing the test cases added as part
of these patches.

regards,
Ajin Cherian
Fujitsu Australia

Вложения

Re: deparsing utility commands

От

Ajin Cherian

Дата:

15 апреля 2022 г., 19:42:37

On Thu, Apr 14, 2022 at 12:19 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> The patch you posted contains neither a detailed commit message nor
> documentation or test changes, so it's impossible to tell what it's
> supposed to do.
>

Sorry, I was only rebasing the patches and have kept the same commit
messages as previously present.

regards,
Ajin Cherian

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: deparsing utility commands

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения