Re: MERGE ... RETURNING

Поиск
Список
Период
Сортировка
От Gurjeet Singh
Тема Re: MERGE ... RETURNING
Дата
Msg-id CABwTF4V2H9enRQQu5qCxhSOF49Z8ODMOcU2P6e5uSWGWTE-N5A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: MERGE ... RETURNING  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Ответы Re: MERGE ... RETURNING
Re: MERGE ... RETURNING
Список pgsql-hackers
On Fri, Jul 7, 2023 at 3:48 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Thu, 6 Jul 2023 at 06:13, Gurjeet Singh <gurjeet@singh.im> wrote:
> >
> > > One minor annoyance is that, due to the way that pg_merge_action() and
> > > pg_merge_when_clause() require access to the MergeActionState, they
> > > only work if they appear directly in the RETURNING list. They can't,
> > > for example, appear in a subquery in the RETURNING list, and I don't
> > > see an easy way round that limitation.
> >
> > I believe that's a serious limitation, and would be a blocker for the feature.
>
> Yes, that was bugging me for quite a while.
>
> Attached is a new version that removes that restriction, allowing the
> merge support functions to appear anywhere. This requires a bit of
> planner support, to deal with merge support functions in subqueries,
> in a similar way to how aggregates and GROUPING() expressions are
> handled. But a number of the changes from the previous patch are no
> longer necessary, so overall, this version of the patch is slightly
> smaller.

+1

> > I think the name of function pg_merge_when_clause() can be improved.
> > How about pg_merge_when_clause_ordinal().
> >
>
> That's a bit of a mouthful, but I don't have a better idea right now.
> I've left the names alone for now, in case something better occurs to
> anyone.

+1. How do we make sure we don't forget that it needs to be named
better. Perhaps a TODO item within the patch will help.

> > In the doc page of MERGE, you've moved the 'with_query' from the
> > bottom of Parameters section to the top of that section. Any reason
> > for this? Since the Parameters section is quite large, for a moment I
> > thought the patch was _adding_ the description of 'with_query'.
> >
>
> Ah yes, I was just making the order consistent with the
> INSERT/UPDATE/DELETE pages. That could probably be committed
> separately.

I don't think that's necessary, if it improves consistency with related docs.

> > +    /*
> > +     * Merge support functions should only be called directly from a MERGE
> > +     * command, and need access to the parent ModifyTableState.  The parser
> > +     * should have checked that such functions only appear in the RETURNING
> > +     * list of a MERGE, so this should never fail.
> > +     */
> > +    if (IsMergeSupportFunction(funcid))
> > +    {
> > +        if (!state->parent ||
> > +            !IsA(state->parent, ModifyTableState) ||
> > +            ((ModifyTableState *) state->parent)->operation != CMD_MERGE)
> > +            elog(ERROR, "merge support function called in non-merge context");
> >
> > As the comment says, this is an unexpected condition, and should have
> > been caught and reported by the parser. So I think it'd be better to
> > use an Assert() here. Moreover, there's ERROR being raised by the
> > pg_merge_action() and pg_merge_when_clause() functions, when they
> > detect the function context is not appropriate.
> >
> > I found it very innovative to allow these functions to be called only
> > in a certain part of certain SQL command. I don't think  there's a
> > precedence for this in  Postgres; I'd be glad to learn if there are
> > other functions that Postgres exposes this way.
> >
> > -    EXPR_KIND_RETURNING,        /* RETURNING */
> > +    EXPR_KIND_RETURNING,        /* RETURNING in INSERT/UPDATE/DELETE */
> > +    EXPR_KIND_MERGE_RETURNING,  /* RETURNING in MERGE */
> >
> > Having to invent a whole new ParseExprKind enum, feels like a bit of
> > an overkill. My only objection is that this is exactly like
> > EXPR_KIND_RETURNING, hence EXPR_KIND_MERGE_RETURNING needs to be dealt
> > with exactly in as many places. But I don't have any alternative
> > proposals.
> >
>
> That's all gone now from the new patch, since there is no longer any
> restriction on where these functions can appear.

I believe this elog can be safely turned into an Assert.

+    switch (mergeActionCmdType)
     {
-        CmdType     commandType = relaction->mas_action->commandType;
....
+        case CMD_INSERT:
....
+        default:
+            elog(ERROR, "unrecognized commandType: %d", (int)
mergeActionCmdType);

The same treatment can be applied to the elog(ERROR) in pg_merge_action().

> > +CREATE FUNCTION merge_into_sq_target(sid int, balance int, delta int,
> > +                                     OUT action text, OUT tid int,
> > OUT new_balance int)
> > +LANGUAGE sql AS
> > +$$
> > +    MERGE INTO sq_target t
> > +    USING (VALUES ($1, $2, $3)) AS v(sid, balance, delta)
> > +    ON tid = v.sid
> > +    WHEN MATCHED AND tid > 2 THEN
> >
> > Again, for consistency, the comparison operator should be >=. There
> > are a few more occurrences of this comparison in the rest of the file,
> >  that need the same treatment.
> >
>
> I changed the new tests to use ">= 2" (and the COPY test now returns 3
> rows, with an action of each type, which is easier to read), but I
> don't think it's really necessary to change all the existing tests
> from "> 2". There's nothing wrong with the "= 2" case having no
> action, as long as the tests give decent coverage.

I was just trying to drive these tests towards a consistent pattern.
As a reader, if I see these differences, the first and the
conservative thought I have is that these differences must be there
for a reason, and then I have to work to find out what those reasons
might be. And that's a lot of wasted effort, just in case someone
intends to change something in these tests.

I performed this round of review by comparing the diff between the v7
and v8 patches (after applying to commit 4f4d73466d)

-ExecProcessReturning(ResultRelInfo *resultRelInfo,
+ExecProcessReturning(ModifyTableContext *context,
+                     ResultRelInfo *resultRelInfo,
...
+    TupleTableSlot *rslot;
...
+    if (context->relaction)
+    {
...
+        PG_TRY();
+        {
+            rslot = ExecProject(projectReturning);
+        }
+        PG_FINALLY();
+        {
+            mergeActionCmdType = saved_mergeActionCmdType;
+            mergeActionIdx = saved_mergeActionIdx;
+        }
+        PG_END_TRY();
+    }
+    else
+        rslot = ExecProject(projectReturning);
+
+    return rslot;

In the above hunk, if there's an exception/ERROR, I believe we should
PG_RE_THROW(). If there's a reason to continue, we should at least set
rslot = NULL, otherwise we may be returning an uninitialized value to
the caller.

 { oid => '9499', descr => 'command type of current MERGE action',
-  proname => 'pg_merge_action',  provolatile => 'v',
+  proname => 'pg_merge_action',  provolatile => 'v', proparallel => 'r',
....
 { oid => '9500', descr => 'index of current MERGE WHEN clause',
-  proname => 'pg_merge_when_clause',  provolatile => 'v',
+  proname => 'pg_merge_when_clause',  provolatile => 'v', proparallel => 'r',
....

I see that you've now set proparallel of these functions to 'r'. I'd
just like to understand how you got to that conclusion.

--- error when using MERGE support functions outside MERGE
-SELECT pg_merge_action() FROM sq_target;

I believe it would be worthwhile to keep a record of the expected
outputs of these invocations in the tests, just in case they change
over time.

Best regards,
Gurjeet
http://Gurje.et



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Gurjeet Singh
Дата:
Сообщение: Re: Document that server will start even if it's unable to open some TCP/IP ports
Следующее
От: Tom Lane
Дата:
Сообщение: Re: unrecognized node type while displaying a Path due to dangling pointer