Re: MERGE Specification
От | Heikki Linnakangas |
---|---|
Тема | Re: MERGE Specification |
Дата | |
Msg-id | 4C679DC2.2020508@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: MERGE Specification (Boxuan Zhai <bxzhai2010@gmail.com>) |
Список | pgsql-hackers |
On 15/08/10 10:31, Boxuan Zhai wrote: > On Sun, Aug 15, 2010 at 4:05 AM, Heikki Linnakangas< > heikki.linnakangas@enterprisedb.com> wrote: >> Thanks. I went through this, fixing all the spurious end-of-line whitespace >> first with "git apply --whitespace=fix", and then manually fixing comment >> and whitespace style, typos, and doing some other small comment editing. >> Resulting patch attached. It is also available at my git repository at >> git://git.postgresql.org/git/users/heikki/postgres.git, branch >> 'mergestmt'. Please base any further patch versions on this patch, so that >> we don't need to redo this cleanup. >> > Thanks for the cleanup. The codes looks better now. My problem is that I > have done some more modifications after merge_v102. Is there any way to > apply the cleanup patch without erasing my new codes? Yeah, there's multiple ways. You can create a patch of what you have now, and run interdiff against your previous patch that I worked against. That gives you a diff of changes since the last patch you sent to the list. You can then apply that patch to the codetree from my git repository. Or you can just generate a new patch of what you have as usual, and I'll incorporate the changes to the cleaned-up patch. >> I'll continue reviewing this sometime next week, but here's few >> miscellaneous issues for starters; >> >> * Explain output of actions needs some work: >> >>> Merge (cost=246.37..272.96 rows=1770 width=38) >>> ACTION: UPDATE WHEN NOT MATCHED >>> ACTION: INSERT WHEN NOT MATCHED >>> -> Merge Left Join (cost=246.37..272.96 rows=1770 width=38) >> >> Should print out more details of the action, like for normal >> updates/inserts/deletes. > > > Well, I think, in normal UPDATE/INSERT/DELETE explanation, there are no more > details than what we have here except the costs, rows and width, which is > print out at the MERGE command line. > > For example: > > Explain > update buy set volume = volume + 1; > QUERY PLAN > -------------------------------------------------------------- > Update (cost=0.00..36.75 rows=2140 width=14) > -> Seq Scan on buy (cost=0.00..36.75 rows=2140 width=14) > (2 rows) > > For the explanation of an UPDATE command, we only have a Update title > followed by the costs, then, after the arrow -> there comes the plan tree. > In a MERGE command, no action has its real private plan. They all share the > main plan. Thus, the costs for a merge command is that of the main plan. > What is useful for a merge action is only the target list and quals. So my > design is to show the merge command costs in first line. Then print out the > actions and their qualifications in a list, followed by the main plan tree. It's more more interesting with more complex statement: postgres=# explain UPDATE target SET id = (SELECT COUNT(*) FROM generate_series(1,10)); QUERY PLAN -------------------------------------------------------------------------------------- Update (cost=12.51..52.52 rows=2400width=6) InitPlan 1 (returns $0) -> Aggregate (cost=12.50..12.51 rows=1 width=0) -> FunctionScan on generate_series (cost=0.00..10.00 rows=1000 width=0) -> Seq Scan on target (cost=0.00..40.00 rows=2400 width=6) (5 rows) > Is there any other thing you suggest to print out for each action? It should match the output of a normal Update/Insert as closely as possible. >> * Is the behavior now SQL-compliant? We had long discussions about the >> default action being insert or do nothing or raise error, but I never got a >> clear picture of what the SQL spec says and whether we're compliant. >> >> * What does the "one tuple is error" notice mean? We'll have to do >> something about that.. I don't think we've properly thought through the >> meaning of RAISE ERROR. Let's cut it out from the patch until then. It's >> only a few dozen lines to put back when we know what to do about it. >> > I find that we have not reached an agreement on MERGE's syntax yet. > Personally, I support Simon's idea, that is the default action should be > RAISE ERROR. However, I am no sure what RAISE ERROR should do when we > get a missing tuple. Here I just use a simple elog(NOTICE, "a tuple is > error"); for this situation. > > I leave this for further extension when a more specific design for RAISE > ERROR is available. > > Well, I have to say the current RAISE ERROR elog is a little bit ugly. Do > you want me to chage the default action back to DO NOTHING? Or any other > suggetions? (In fact, my personal thinking is to add a non-omissible "ELSE" > clause as the end of the action list which forces the user to specify the > default action for merge). Whatever the spec says is what we should do. >> * Do you need the MergeInsert/MergeUpdate/MergeDelete alias nodes? Would it >> be simpler to just use InsertStmt/UpdateStmt/DeleteStmt directly? > > I need one flag in these statement to differentiate them from normal > InsertStmt/UpdateStmt/DeleteStmt. There are slight difference for the > process of these two kinds of statements in the transfromStmt() function. I > define a set of alias nodes which have different nodetags. This can make > the code simpler. Ok. You might also consider just adding a "isMergeAction" boolean to InsertStmt/UpdateStmt/DeleteStmt instead, or if the difference between the regular statements and merge actions grow big, create a completely separate node struct for them. >> * Do you need the out-function for DeleteStmt? Why not for UpdateStmt and >> InsertStmt? > > Ah, I add this function because I want to have a look of the content of > DeleteStmt. I did this long ago, just after I started the project. If you > don't want this function, I will remove it. Ok. >> * Instead of scanning the list of actions in ExecBS/ExecASMergeTriggers >> every time, should set flags at plan time to mark what kind of actions there >> is in the statement. > > First of all, I need to say that the functions of ExecBSMergeTriggers() and > ExecASMergeTriggers() are for per-statement triggers, and are invoked only > once in the whole process of a MERGE command. Setting flags at plan time can > save the scanning. I may add it to next patch. But don't expect it save > much execution time. Yeah, I'm not so much concerned about performance, it just seems like it might make the code slightly simpler. >> Or should we defer firing the statement triggers until we hit the first >> matching row and execute the action for the first time? > > No, we cannot do this. These are Per-statement triggers. They should be > fired before/after the main plan is executed. The statement triggers are > fired even no tuple is really matched with the actions. Ok. >> * This crashes: >> >> postgres=# CREATE TABLE target AS (SELECT 1 AS id); >> SELECT 1 >> postgres=# MERGE into target t >> USING (select 1 AS id) AS s >> >> ON t.id = s.id >> WHEN MATCHED THEN >> UPDATE SET id = (SELECT COUNT(*) FROM generate_series(1,10)) >> ; >> server closed the connection unexpectedly >> This probably means the server terminated abnormally >> before or while processing the request. >> The connection to the server was lost. Attempting reset: Failed. > > Oh, a really serious bug. I have not considered this case (user series > generator in actions) before. I will see what I can do for it. It's not specific to generate_series() but subqueries in general that are the problem. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: