Re: [HACKERS] MERGE SQL Statement for PG11

Поиск
Список
Период
Сортировка
От Pavan Deolasee
Тема Re: [HACKERS] MERGE SQL Statement for PG11
Дата
Msg-id CABOikdOueODEoWYR9Bu8yC_RvcZVUa=rXVT2DX-xzjFnKNo79A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] MERGE SQL Statement for PG11  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers


On Fri, Apr 6, 2018 at 1:30 AM, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2018-04-05 11:31:48 +0530, Pavan Deolasee wrote:
> > +/*---------------------------------------------------------
> > ----------------
> > + *
> > + * nodeMerge.c
> > + *   routines to handle Merge nodes relating to the MERGE command
> >
> > Isn't this file misnamed and it should be execMerge.c?  The rest of the
> > node*.c files are for nodes that are invoked via execProcnode /
> > ExecProcNode().  This isn't an actual executor node.
> >
>
> Makes sense. Done. (Now that the patch is committed, I don't know if there
> would be a rethink about changing file names. May be not, just raising that
> concern)

It absolutely definitely needed to be renamed. But that's been done, so
...


> > What's the reasoning behind making this be an anomaluous type of
> > executor node?

> Didn't quite get that. I think naming of the file was bad (fixed now), but
> I think it's a good idea to move the new code in a new file, from
> maintainability as well as coverage perspective. If you've something very
> different in mind, can you explain in more details?

Well, it was kinda modeled as an executor node, including the file
name. That's somewhat fixed now.   I still am extremely suspicious of
the codeflow here.

My impression is that this simply shouldn't be going through
nodeModifyTuple, but be it's own nodeMerge node. The trigger handling
would need to be abstraced into execTrigger.c

There is nothing called execTrigger.c. You probably mean creating something new or trigger.c. That being said, I don't think the current code is bad. There is already a switch to handle different command types. We add one more for CMD_MERGE and then fire individual BEFORE/AFTER statement triggers based on what actions MERGE may take. The ROW trigger handling doesn't require any change.
 
or such to avoid
duplication.  We're now going from nodeModifyTable.c:ExecModifyTable()
into execMerge.c:ExecMerge(), back to
nodeModifyTable.c:ExecUpdate/Insert(). To avoid ExecInsert() doing
things that aren't appropriate for merge, we then pass an actionState,
which neuters part of ExecUpdate/Insert(). This is just bad.

I have spent most part of today trying to hack around the executor with the goal to do what you're suggesting. Having done so, I must admit I am actually quite puzzled why you think having a new node is going to be any significant improvement.
 
I first tried to split nodeModifyTable.c into two parts. One that deals with just ModifyTable node (the exported Init/Exec/End functions to start with) and the other which abstracts out the actual Insert/Update/Delete operations. The idea was that the second part shouldn't know anything about ModifyTable and it being just one of the users.  I started increasingly disliking the result as I continued hacking. The knowledge of ModifyTable node is quite wide-spread, including execPartition.c and FDW. The ExecInsert/Update, partitioning, ON CONFLICT handling all make use of ModifyTable extensively. While MERGE does not need ON CONFLICT, it needs everything else, even FDW at some point in future. Do you agree that this is a bad choice or am I getting it completely wrong or do you really expect MERGE to completely refactor nodeModifyTable.c, including the partitioning related code?

I gave up on this approach.

I then started working on other possibility where we keep a ModifyTable node inside a MergeTable (that's the name I am using, for now). The paths/plans/executor state gets built that way. So all common members still belong to the ModifyTable (and friends) and we only add MERGE specific information to MergeTable (and friends).

This, at least looks somewhat better. We can have:

- ExecInitMergeTable(): does some basic initialisation of the embedded ModifyTable node so that it can be passed around
- ExecMergeTable(): executes the (only) subplan, fetches tuples, fires BR triggers, does work for handling transition tables (so mostly duplication of what ExecModifyTable does already) and then executes the MERGE actions. It will mostly use the ModifyTable node created during initialisation when calling ExecUpdate/Insert etc
- ExecEndMergeTable(): ends the executor

This is probably far better than the first approach. But is it really a huge improvement over the committed code? Or even an improvement at all?

If that's not what you've in mind, can you please explain in more detail how to you see the final design and how exactly it's better than what we have today? Once there is clarity, I can work on it in a fairly quick manner.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: Excessive PostmasterIsAlive calls slow down WAL redo
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: [HACKERS] logical decoding of two-phase transactions