Re: gSoC - ADD MERGE COMMAND - code patch submission

From Boxuan Zhai
Subject Re: gSoC - ADD MERGE COMMAND - code patch submission
Date
Msg-id AANLkTimmu8tng5Bkxw6bDlpIMoYZmBMlExEAVgb3nqq9@mail.gmail.com
In reply to Fwd: gSoC - ADD MERGE COMMAND - code patch submission  (Boxuan Zhai <bxzhai2010@gmail.com>)
List pgsql-hackers
Hi,
 
I have just moved my modifications onto the latest git version and produced a patch file with git diff as my second submission. I think the format is much better than that of my last submission.
 
As I mentioned before, our work has now reached the executor. So far, the executor can accept the top-level query and return tuples for it. The next step is to evaluate the action qualifications on the returned tuple slot.
 
Thanks
 
Boxuan


 
2010/7/17 Boxuan Zhai <bxzhai2010@gmail.com>


---------- Forwarded message ----------
From: Boxuan Zhai <bxzhai2010@gmail.com>
Date: 2010/7/17
Subject: Re: [HACKERS] gSoC - ADD MERGE COMMAND - code patch submission
To: Simon Riggs <simon@2ndquadrant.com>




2010/7/17 Simon Riggs <simon@2ndquadrant.com>

On Fri, 2010-07-16 at 08:26 +0800, Boxuan Zhai wrote:
> The merge actions are transformed into lower-level queries. I create a
> Query node for each of them and append them to a newly created List
> field, mergeActQry. The action queries have different command types and
> specific target lists and qual lists, according to their declaration by
> the user. But they all share the same range table. This is because we
> don't need the action queries to be planned later. The joining
> strategy is decided by the top query. We are only interested in their
> specific action qualifications. In other words, these action queries
> are only containers for their target lists and qualifications.
>
> 2. When the query is ready, it will be sent to the rewriter. In this part,
> we can call RewriteQuery() to handle the action queries. The UPDATE
> action will trigger rules on UPDATE, and so on. Two points need to be
> noted: 1. the actions of the same type should not be rewritten
> repeatedly. If there are two UPDATE actions in the merge command, we
> should not trigger the ON UPDATE rules twice. 2. if an action type is
> fully replaced by rules, we should remove all actions of this type
> from the action list.
> The rewriter will also do some processing on the target list of each action.
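
The two rewriter points quoted above can be illustrated with a small toy model. This is only a sketch, not the patch's code: the Action and Rule classes, the "instead" flag and rewrite_merge_actions() are hypothetical names invented for illustration, and "fully replaced by rules" is simplified to "at least one INSTEAD rule exists for that command type".

```python
from dataclasses import dataclass


@dataclass
class Action:
    """Hypothetical stand-in for the Query node of one MERGE action."""
    cmd_type: str   # "UPDATE", "DELETE" or "INSERT"
    qual: str = ""  # placeholder for the action qualification


@dataclass
class Rule:
    """Hypothetical stand-in for a rule defined on the target table."""
    event: str     # command type that fires the rule
    instead: bool  # assumption: True means the rule replaces the action
    query: str     # placeholder for the query the rule adds


def rewrite_merge_actions(actions, rules):
    """Fire rules once per action type and drop replaced action types."""
    extra_queries = []
    seen_types = set()
    replaced_types = set()
    for action in actions:
        if action.cmd_type in seen_types:
            continue  # point 1: rules for this type were already triggered
        seen_types.add(action.cmd_type)
        for rule in rules:
            if rule.event == action.cmd_type:
                extra_queries.append(rule.query)
                if rule.instead:
                    replaced_types.add(action.cmd_type)
    # point 2: remove every action whose type was replaced by a rule
    kept = [a for a in actions if a.cmd_type not in replaced_types]
    return kept, extra_queries
```

With two UPDATE actions and one rule on UPDATE, the rule's query is collected only once, and an INSTEAD rule on DELETE removes all DELETE actions from the returned list.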

IMHO it is a bad thing that we are attempting to execute each action
statement as a query. That means we need to execute an inner SQL
statement for each row returned by the top level query.

That design makes MERGE similar in performance to an upsert PL/pgsql
function, which will perform terribly on large numbers of rows.

Dear Simon,
 
Thanks for your feedback. I may not have presented my idea clearly.
In my design, the merge actions are not executed as separate queries. Only the top-level query (a query of the form "<source_table> LEFT JOIN <target_table> ON <matching_qual>") is planned and executed. For each tuple returned by this plan, we choose the proper action and perform the corresponding modification. The tables are scanned and joined only once. A merge action does not do a full join of the tables and then modify the table, as a standard UPDATE/DELETE/INSERT query would. (Is this what you are worried about?)
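
The single-pass design described above can be sketched as a toy model in Python. This is an illustration under stated assumptions, not the executor code from the patch: MergeAction, run_merge() and the dict-based "tables" are hypothetical, the join key is assumed to be a column named "id", and the first action whose qualification passes is assumed to be the one chosen.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MergeAction:
    """Hypothetical container: action type, target list and qualification."""
    cmd_type: str                                    # "UPDATE", "DELETE" or "INSERT"
    matched: bool                                    # applies to matched or unmatched rows
    qual: Callable[[dict, Optional[dict]], bool]     # additional qualification
    values: Callable[[dict, Optional[dict]], dict]   # stand-in for the target list


def run_merge(source, target, actions):
    """Join once, then pick one action per joined tuple (toy model)."""
    # The "top-level query": source LEFT JOIN target ON id, evaluated once.
    joined = [(s, target.get(s["id"])) for s in source]
    for s, t in joined:
        is_matched = t is not None
        for action in actions:
            if action.matched != is_matched or not action.qual(s, t):
                continue
            if action.cmd_type == "UPDATE":
                target[s["id"]].update(action.values(s, t))
            elif action.cmd_type == "DELETE":
                del target[s["id"]]
            elif action.cmd_type == "INSERT":
                target[s["id"]] = action.values(s, t)
            break  # assumption: at most one action fires per tuple
    return target
```

No action runs its own scan or join here; each one only contributes a qualification and a set of new values for a tuple that the single join has already produced.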
 
In fact, for each action we only need: 1. the action type (UPDATE, DELETE or INSERT); 2. the target list; and 3. the additional qualifications. A Query node is a perfect container for this information, which is why I transform the actions into Query nodes. But throughout the analyzer, rewriter, planner and executor, I just call the related functions to formalize the expressions in their target lists and qual lists. The range table and join tree are determined only by the top-level query; they are not affected by the merge actions.
 
 
 
This was exactly the point where I stopped implementation previously:
attempting to make MERGE work with rules is enough to prevent a tighter
in-executor implementation of the action list.
I am sorry, but I did not quite catch your meaning here.
As I understand it, if there is a rule on the target table, the rewriter will add a new query to the execution queue (or replace the original query). I think the rule queries will not affect the processing within the original query, because they are completely separate queries that run before or after the original query. Are you suggesting that we should not allow rules on the MERGE command?
 
 
[To Boxuan, on a personal note, you seem to be coping quite well with
the code and the process; congratulations and keep going.]
 
Thank you. Your encouragement is very important to me.
 
--

 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services



