Обсуждение: Does rewriteTargetListIU still need to add UPDATE tlist entries?

Поиск
Список
Период
Сортировка

Does rewriteTargetListIU still need to add UPDATE tlist entries?

От
Tom Lane
Дата:
The comments for rewriteTargetListIU say (or said until earlier today)

 * 2. For an UPDATE on a trigger-updatable view, add tlist entries for any
 * unassigned-to attributes, assigning them their old values.  These will
 * later get expanded to the output values of the view.  (This is equivalent
 * to what the planner's expand_targetlist() will do for UPDATE on a regular
 * table, but it's more convenient to do it here while we still have easy
 * access to the view's original RT index.)  This is only necessary for
 * trigger-updatable views, for which the view remains the result relation of
 * the query.  For auto-updatable views we must not do this, since it might
 * add assignments to non-updatable view columns.  For rule-updatable views it
 * is unnecessary extra work, since the query will be rewritten with a
 * different result relation which will be processed when we recurse via
 * RewriteQuery.

I noticed that this is referencing something that, in fact,
expand_targetlist() doesn't do anymore, so that this is a poor
justification for the behavior.  My first thought was that we still
need to do it to produce the correct row contents for the INSTEAD OF
trigger, so I updated the comment (in 08a986966) to claim that.

However, on closer inspection, that's nonsense.  nodeModifyTable.c
populates the trigger "OLD" row from the whole-row variable that is
generated for the view, and then it computes the "NEW" row using
that old row and the UPDATE tlist; there is no need there for the
UPDATE tlist to compute all the columns.  The regression tests still
pass just fine if we take out the questionable logic (cf. attached
patch).  Moreover, if you poke into it a little closer than the tests
do, you notice that the existing code is uselessly computing non-updated
columns twice, in both the extra tlist item and the whole-row variable.
As an example, consider

create table base (a int, b int);
create view v1 as select a+1 as a1, b+2 as b2 from base;
-- you also need an INSTEAD OF UPDATE trigger, not shown here
explain verbose update v1 set a1 = a1 - 44;

With HEAD you get

 Update on public.v1  (cost=0.00..60.85 rows=0 width=0)
   ->  Seq Scan on public.base  (cost=0.00..60.85 rows=2260 width=46)
         Output: ((base.a + 1) - 44), (base.b + 2), ROW((base.a + 1), (base.b + 2)), base.ctid

There's really no need to compute base.b + 2 twice, and with this
patch we don't:

 Update on public.v1  (cost=0.00..55.20 rows=0 width=0)
   ->  Seq Scan on public.base  (cost=0.00..55.20 rows=2260 width=42)
         Output: ((base.a + 1) - 44), ROW((base.a + 1), (base.b + 2)), base.ctid


I would think that this is a totally straightforward improvement,
but there's one thing in the comments for rewriteTargetListIU that
gives me a little pause: it says

 * We must do items 1,2,3 before firing rewrite rules, else rewritten
 * references to NEW.foo will produce wrong or incomplete results.

As far as I can tell, though, references to NEW values still do the
right thing.  I'm not certain whether any of the existing regression
tests really cover this point, but experimenting with the scenario shown
in the attached SQL file says that the DO ALSO rule gets the right
results.  It's possible that the expansion sequence is a bit different
than before, but we still end up with the right answer.

So, as far as I can tell, this is an oversight in 86dc90056 and we
ought to clean it up as attached.

            regards, tom lane

diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 4fa4ac2aef..4bba6eb805 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -679,18 +679,7 @@ adjustJoinTreeList(Query *parsetree, bool removert, int rt_index)
  * and UPDATE, replace explicit DEFAULT specifications with column default
  * expressions.
  *
- * 2. For an UPDATE on a trigger-updatable view, add tlist entries for any
- * unassigned-to attributes, assigning them their old values.  These will
- * later get expanded to the output values of the view.  This is only needed
- * for trigger-updatable views, for which the view remains the result relation
- * of the query; without it, we would not have a complete "new tuple" to pass
- * to triggers.  For auto-updatable views we must not do this, since it might
- * add assignments to non-updatable view columns.  For rule-updatable views it
- * is unnecessary extra work, since the query will be rewritten with a
- * different result relation which will be processed when we recurse via
- * RewriteQuery.
- *
- * 3. Merge multiple entries for the same target attribute, or declare error
+ * 2. Merge multiple entries for the same target attribute, or declare error
  * if we can't.  Multiple entries are only allowed for INSERT/UPDATE of
  * portions of an array or record field, for example
  *            UPDATE table SET foo[2] = 42, foo[4] = 43;
@@ -698,11 +687,11 @@ adjustJoinTreeList(Query *parsetree, bool removert, int rt_index)
  * the expression we want to produce in this case is like
  *        foo = array_set_element(array_set_element(foo, 2, 42), 4, 43)
  *
- * 4. Sort the tlist into standard order: non-junk fields in order by resno,
+ * 3. Sort the tlist into standard order: non-junk fields in order by resno,
  * then junk fields (these in no particular order).
  *
- * We must do items 1,2,3 before firing rewrite rules, else rewritten
- * references to NEW.foo will produce wrong or incomplete results.  Item 4
+ * We must do items 1 & 2 before firing rewrite rules, else rewritten
+ * references to NEW.foo will produce wrong or incomplete results.  Item 3
  * is not needed for rewriting, but it is helpful for the planner, and we
  * can do it essentially for free while handling the other items.
  *
@@ -984,29 +973,6 @@ rewriteTargetListIU(List *targetList,
                                           false);
         }

-        /*
-         * For an UPDATE on a trigger-updatable view, provide a dummy entry
-         * whenever there is no explicit assignment.
-         */
-        if (new_tle == NULL && commandType == CMD_UPDATE &&
-            target_relation->rd_rel->relkind == RELKIND_VIEW &&
-            view_has_instead_trigger(target_relation, CMD_UPDATE))
-        {
-            Node       *new_expr;
-
-            new_expr = (Node *) makeVar(result_rti,
-                                        attrno,
-                                        att_tup->atttypid,
-                                        att_tup->atttypmod,
-                                        att_tup->attcollation,
-                                        0);
-
-            new_tle = makeTargetEntry((Expr *) new_expr,
-                                      attrno,
-                                      pstrdup(NameStr(att_tup->attname)),
-                                      false);
-        }
-
         if (new_tle)
             new_tlist = lappend(new_tlist, new_tle);
     }
drop table base cascade;
drop table log cascade;
create table base (a int, b int);
create view v1 as select a+1 as a1, b+2 as b2 from base;
insert into base values (1,10),(11,13);

create table log(ao int, an int, bo int, bn int);

CREATE OR REPLACE FUNCTION vtrigger()
 RETURNS trigger
 LANGUAGE plpgsql
AS $function$
begin
    raise notice '% % % %', TG_TABLE_NAME, TG_WHEN, TG_OP, TG_LEVEL;
    raise NOTICE 'OLD: %, NEW: %', OLD, NEW;
    UPDATE base SET a = NEW.a1-1, b = NEW.b2-2
      WHERE a = OLD.a1-1 AND b = OLD.b2-2;
    if NOT FOUND then RETURN NULL; end if;
    RETURN NEW;
end;
$function$;

CREATE TRIGGER instead_of_update_trig INSTEAD OF UPDATE ON v1
FOR EACH ROW EXECUTE PROCEDURE vtrigger();

create rule logit as on update to v1 do also
insert into log values(old.a1, new.a1, old.b2, new.b2);

update v1 set a1 = a1-44;

Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

От
Amit Langote
Дата:
On Mon, Apr 26, 2021 at 9:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The comments for rewriteTargetListIU say (or said until earlier today)
>
>  * 2. For an UPDATE on a trigger-updatable view, add tlist entries for any
>  * unassigned-to attributes, assigning them their old values.  These will
>  * later get expanded to the output values of the view.  (This is equivalent
>  * to what the planner's expand_targetlist() will do for UPDATE on a regular
>  * table, but it's more convenient to do it here while we still have easy
>  * access to the view's original RT index.)  This is only necessary for
>  * trigger-updatable views, for which the view remains the result relation of
>  * the query.  For auto-updatable views we must not do this, since it might
>  * add assignments to non-updatable view columns.  For rule-updatable views it
>  * is unnecessary extra work, since the query will be rewritten with a
>  * different result relation which will be processed when we recurse via
>  * RewriteQuery.
>
> I noticed that this is referencing something that, in fact,
> expand_targetlist() doesn't do anymore, so that this is a poor
> justification for the behavior.  My first thought was that we still
> need to do it to produce the correct row contents for the INSTEAD OF
> trigger, so I updated the comment (in 08a986966) to claim that.

Check.

> However, on closer inspection, that's nonsense.  nodeModifyTable.c
> populates the trigger "OLD" row from the whole-row variable that is
> generated for the view, and then it computes the "NEW" row using
> that old row and the UPDATE tlist; there is no need there for the
> UPDATE tlist to compute all the columns.  The regression tests still
> pass just fine if we take out the questionable logic (cf. attached
> patch).  Moreover, if you poke into it a little closer than the tests
> do, you notice that the existing code is uselessly computing non-updated
> columns twice, in both the extra tlist item and the whole-row variable.
>
> As an example, consider
>
> create table base (a int, b int);
> create view v1 as select a+1 as a1, b+2 as b2 from base;
> -- you also need an INSTEAD OF UPDATE trigger, not shown here
> explain verbose update v1 set a1 = a1 - 44;
>
> With HEAD you get
>
>  Update on public.v1  (cost=0.00..60.85 rows=0 width=0)
>    ->  Seq Scan on public.base  (cost=0.00..60.85 rows=2260 width=46)
>          Output: ((base.a + 1) - 44), (base.b + 2), ROW((base.a + 1), (base.b + 2)), base.ctid
>
> There's really no need to compute base.b + 2 twice, and with this
> patch we don't:
>
>  Update on public.v1  (cost=0.00..55.20 rows=0 width=0)
>    ->  Seq Scan on public.base  (cost=0.00..55.20 rows=2260 width=42)
>          Output: ((base.a + 1) - 44), ROW((base.a + 1), (base.b + 2)), base.ctid

That makes sense to me, at least logically.

> I would think that this is a totally straightforward improvement,
> but there's one thing in the comments for rewriteTargetListIU that
> gives me a little pause: it says
>
>  * We must do items 1,2,3 before firing rewrite rules, else rewritten
>  * references to NEW.foo will produce wrong or incomplete results.
>
> As far as I can tell, though, references to NEW values still do the
> right thing.  I'm not certain whether any of the existing regression
> tests really cover this point, but experimenting with the scenario shown
> in the attached SQL file says that the DO ALSO rule gets the right
> results.  It's possible that the expansion sequence is a bit different
> than before, but we still end up with the right answer.

I also checked what the rewriter and the planner do for the following
DO ALSO insert:

create rule logit as on update to v1 do also
insert into log values(old.a1, new.a1, old.b2, new.b2);

and can see that the insert ends up with the right targetlist
irrespective of whether or not rewriteTargetListIU() adds an item for
NEW.b2.  So, I attached a debugger to the update query in your shared
script and focused on how ReplaceVarsFromTargetList(), running on the
insert query added by the rule, handles the item for NEW.b2 no longer
being added to the update's targetlist after your patch.  Turns out
the result (the insert's targetlist) is the same even if the path
taken in ReplaceVarsFromTargetList_callback() is different after your
patch.

Before:

explain verbose update v1 set a1 = a1-44;
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Insert on public.log  (cost=0.00..60.85 rows=0 width=0)
   ->  Seq Scan on public.base  (cost=0.00..60.85 rows=2260 width=16)
         Output: (base.a + 1), ((base.a + 1) - 44), (base.b + 2), (base.b + 2)

 Update on public.v1  (cost=0.00..60.85 rows=0 width=0)
   ->  Seq Scan on public.base  (cost=0.00..60.85 rows=2260 width=46)
         Output: ((base.a + 1) - 44), (base.b + 2), ROW((base.a + 1),
(base.b + 2)), base.ctid
(7 rows)

After:

explain verbose update v1 set a1 = a1-44;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Insert on public.log  (cost=0.00..60.85 rows=0 width=0)
   ->  Seq Scan on public.base  (cost=0.00..60.85 rows=2260 width=16)
         Output: (base.a + 1), ((base.a + 1) - 44), (base.b + 2), (base.b + 2)

 Update on public.v1  (cost=0.00..55.20 rows=0 width=0)
   ->  Seq Scan on public.base  (cost=0.00..55.20 rows=2260 width=42)
         Output: ((base.a + 1) - 44), ROW((base.a + 1), (base.b + 2)), base.ctid
(7 rows)

I didn't however study closely why REPLACEVARS_CHANGE_VARNO does the
correct thing, so am not sure if there might be cases that would be
broken.

> So, as far as I can tell, this is an oversight in 86dc90056 and we
> ought to clean it up as attached.

Thanks for noticing this and the patch.  If you are confident that
REPLACEVARS_CHANGE_VARNO covers all imaginable cases, I suppose it
makes sense to apply it.

-- 
Amit Langote
EDB: http://www.enterprisedb.com



Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

От
Tom Lane
Дата:
Amit Langote <amitlangote09@gmail.com> writes:
> On Mon, Apr 26, 2021 at 9:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I would think that this is a totally straightforward improvement,
>> but there's one thing in the comments for rewriteTargetListIU that
>> gives me a little pause: it says
>> 
>> * We must do items 1,2,3 before firing rewrite rules, else rewritten
>> * references to NEW.foo will produce wrong or incomplete results.
>> 
>> As far as I can tell, though, references to NEW values still do the
>> right thing.  I'm not certain whether any of the existing regression
>> tests really cover this point, but experimenting with the scenario shown
>> in the attached SQL file says that the DO ALSO rule gets the right
>> results.  It's possible that the expansion sequence is a bit different
>> than before, but we still end up with the right answer.

> I also checked what the rewriter and the planner do for the following
> DO ALSO insert:
> create rule logit as on update to v1 do also
> insert into log values(old.a1, new.a1, old.b2, new.b2);
> and can see that the insert ends up with the right targetlist
> irrespective of whether or not rewriteTargetListIU() adds an item for
> NEW.b2.

Thanks for looking at that.  On reflection I think this must be so,
because those rewriter mechanisms were designed long before we had
trigger-updatable views, and rewriteTargetListIU has never added
tlist items like this for any other sort of view.  So the ability
to insert the original view output column has necessarily been there
from the beginning.  This is just getting rid of a weird implementation
difference between trigger-updatable views and other views.

            regards, tom lane



Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

От
Dean Rasheed
Дата:
On Mon, 26 Apr 2021 at 15:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Thanks for looking at that.  On reflection I think this must be so,
> because those rewriter mechanisms were designed long before we had
> trigger-updatable views, and rewriteTargetListIU has never added
> tlist items like this for any other sort of view.  So the ability
> to insert the original view output column has necessarily been there
> from the beginning.  This is just getting rid of a weird implementation
> difference between trigger-updatable views and other views.
>

FWIW, I had a look at this too and came to much the same conclusion,
so I think this is a safe change that makes the code a little neater
and more efficient.

Regards,
Dean



Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

От
Tom Lane
Дата:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On Mon, 26 Apr 2021 at 15:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Thanks for looking at that.  On reflection I think this must be so,
>> because those rewriter mechanisms were designed long before we had
>> trigger-updatable views, and rewriteTargetListIU has never added
>> tlist items like this for any other sort of view.  So the ability
>> to insert the original view output column has necessarily been there
>> from the beginning.  This is just getting rid of a weird implementation
>> difference between trigger-updatable views and other views.

> FWIW, I had a look at this too and came to much the same conclusion,
> so I think this is a safe change that makes the code a little neater
> and more efficient.

Again, thanks for looking!

I checked into the commit history (how'd we ever survive without "git
blame"?) and found that my argument above is actually wrong in detail.
Before cab5dc5da of 2013-10-18, rewriteTargetListIU expanded non-updated
columns for all views not only trigger-updatable ones.  However, that
behavior itself goes back only to 2ec993a7c of 2010-10-10, which added
triggers on views; before that there was indeed no such expansion.
Of course the view rewrite mechanisms are ten or so years older than
that, so the conclusion that they weren't designed to need this still
stands.

            regards, tom lane



Re: Does rewriteTargetListIU still need to add UPDATE tlist entries?

От
Dean Rasheed
Дата:
On Mon, 26 Apr 2021 at 15:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I checked into the commit history (how'd we ever survive without "git
> blame"?) and found that my argument above is actually wrong in detail.
> Before cab5dc5da of 2013-10-18, rewriteTargetListIU expanded non-updated
> columns for all views not only trigger-updatable ones.  However, that
> behavior itself goes back only to 2ec993a7c of 2010-10-10, which added
> triggers on views; before that there was indeed no such expansion.

Ah, that makes sense. Before cab5dc5da, expanding non-updated columns
of auto-updatable views was safe because until then a view was only
auto-updatable if all it's columns were. It was still unnecessary work
though, and with 20/20 hindsight, when triggers on views were first
added in 2ec993a7c, it probably should have only expanded the
targetlist for trigger-updatable views.

> Of course the view rewrite mechanisms are ten or so years older than
> that, so the conclusion that they weren't designed to need this still
> stands.

Yeah, I think that conclusion is right. The trickiest part I found was
deciding whether any product queries from conditional rules would do
the right thing if the main trigger-updatable query no longer expands
its targetlist. But I think that has to be OK, because even before
trigger-updatable views were added, it was possible to have product
queries from conditional rules together with an unconditional
do-nothing rule, so the product queries don't rely on the expanded
targetlist, and never have.

Regards,
Dean