Обсуждение: Unnecessary checks for new rows by some RI trigger functions?
When reviewing a related patch [1], I spent some time thinking about the
"detectNewRows" argument of ri_PerformCheck(). My understanding is that it
should (via the es_crosscheck_snapshot field of EState) prevent the executor
from accidentally deleting PK value that another transaction (whose data
changes the trigger's transaction does not see) might need.
However I find it confusing that the trigger functions pass detectNewRows=true
even if they only execute SELECT statement. I think that in these cases the
trigger functions 1) detect themselves that another transaction inserted
row(s) referencing the PK whose deletion is being checked, 2) do not try to
delete the PK anyway. Furthermore (AFAICS), only heap_update() and
heap_delete() functions receive the crosscheck snapshot, so ri_PerformCheck()
does not have to pass the crosscheck snapshot to SPI_execute_snapshot() for
SELECT queries.
Following is patch proposal. Is anything wrong about that?
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index e1aa3d0044..e235ad9cd0 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -574,7 +574,7 @@ ri_Check_Pk_Match(Relation pk_rel, Relation fk_rel,
     result = ri_PerformCheck(riinfo, &qkey, qplan,
                              fk_rel, pk_rel,
                              old_row, NULL,
-                             true,    /* treat like update */
+                             false,
                              SPI_OK_SELECT);
     if (SPI_finish() != SPI_OK_FINISH)
@@ -802,7 +802,7 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
             ri_PerformCheck(riinfo, &qkey, qplan,
                             fk_rel, pk_rel,
                             old_row, NULL,
-                            true,    /* must detect new rows */
+                            false,
                             SPI_OK_SELECT);
             if (SPI_finish() != SPI_OK_FINISH)
[1] https://commitfest.postgresql.org/22/1975/
--
Antonin Houska
https://www.cybertec-postgresql.com
			
		On Wed, Feb 20, 2019 at 9:27 AM Antonin Houska <ah@cybertec.at> wrote: > However I find it confusing that the trigger functions pass detectNewRows=true > even if they only execute SELECT statement. I don't quite see what those two things have to do with each other, but I might be missing something. I stuck in a debugging elog() to see where ri_Check_Pk_Match() gets called and it fired for the following query in the regression tests: update pp set f1=f1+1; That agrees with my intuition, which is that the logic here has something to do with allowing an update that changes a referenced row but also changes some other row in the same table so that the reference remains valid -- it's just now a reference to some other row. Unfortunately, as you probably also noticed, making the change you propose here doesn't make any tests fail in either the main regression tests or the isolation tests. I suspect that this is a defect in the tests rather than a sign that the code doesn't need to be changed, though. I'd try a statement like the above in a multi-statement transaction with a higher-than-default isolation level, probably REPEATABLE READ, and maybe some concurrent activity in another session. Sorry, I'm hand-waving here; I don't know exactly what's going on. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Feb 20, 2019 at 9:27 AM Antonin Houska <ah@cybertec.at> wrote:
>> However I find it confusing that the trigger functions pass detectNewRows=true
>> even if they only execute SELECT statement.
> I don't quite see what those two things have to do with each other,
> but I might be missing something.
Me too.  It's quite easy to demonstrate that the second part of Antonin's
proposed patch (ie, don't check for new rows in the FK table during
ri_restrict) is wrong:
regression=# create table p(f1 int primary key);
CREATE TABLE
regression=# create table f(f1 int references p on delete no action deferrable initially deferred);
CREATE TABLE
regression=# insert into p values (1);
INSERT 0 1
regression=# begin transaction isolation level serializable ;
BEGIN
regression=# table f;
 f1
----
(0 rows)
regression=# delete from p where f1=1;
DELETE 1
-- now, in another session:
regression=# insert into f values (1);
INSERT 0 1
-- next, back in first session:
regression=# commit;
ERROR:  update or delete on table "p" violates foreign key constraint "f_f1_fkey" on table "f"
DETAIL:  Key (f1)=(1) is still referenced from table "f".
With the patch, the first session's commit succeeds, and we're left
with a violated FK constraint.
I'm less sure about whether the change in ri_Check_Pk_Match would be
safe.  On its face, what that is on about is the case where you have
matching rows in p and f, and a serializable transaction tries to
delete the p row, and by the time it commits some other transaction
has inserted a new p row so we could allow our deletion to succeed.
If you try that, however, you'll notice that the other transaction
can't commit its insertion because we haven't committed our deletion,
so the unique constraint on the PK side would be violated.  So maybe
there's a case for simplifying the logic there.  Or maybe there are
actually live cases for that with more complex MATCH rules than what
I tried.
But TBH I think Antonin's question is backwards: the right thing to
be questioning here is whether detectNewRows = false is *ever* OK.
I think he mischaracterized what that option really does; the comments
in ri_PerformCheck explain it this way:
     * In READ COMMITTED mode, we just need to use an up-to-date regular
     * snapshot, and we will see all rows that could be interesting. But in
     * transaction-snapshot mode, we can't change the transaction snapshot. If
     * the caller passes detectNewRows == false then it's okay to do the query
     * with the transaction snapshot; otherwise we use a current snapshot, and
     * tell the executor to error out if it finds any rows under the current
     * snapshot that wouldn't be visible per the transaction snapshot.
The question therefore is under what scenarios is it okay for an FK
constraint to be enforced (in a serializable or repeatable-read
transaction) without attention to operations already committed by
concurrent transactions?  If your gut answer isn't "damn near none",
I don't think you're being paranoid enough ;-)
The one case where we use detectNewRows == false today is in
RI_FKey_check, which we run after an insert or update on the FK
table to see if there's a matching row on the PK side.  In this
case, failing to observe a newly-added PK row won't result in a
constraint violation, just an "unnecessary" error.  I think this
choice is reasonable, since if you're running in serializable
mode you're not supposed to want to depend on transactions
committed after you start.  (Note that if somebody concurrently
*deletes* the PK row we need to match, we will notice that,
because the SELECT FOR SHARE will, and then it'll throw a
serialization error which seems like the right thing here.)
Perhaps a similar argument can be constructed to justify changing
the behavior of ri_Check_Pk_Match, but I've not thought about it
very hard.  In any case, changing ri_restrict is provably wrong.
> Unfortunately, as you probably also noticed, making the change
> you propose here doesn't make any tests fail in either the main
> regression tests or the isolation tests.
Yeah.  All the RI code predates the existence of the isolation
test machinery, so it's not so surprising that we don't have good
coverage for this type of situation.  I thought I remembered that
Peter had added some more FK tests recently, but I don't see
anything in the commit log ...
            regards, tom lane
			
		Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Wed, Feb 20, 2019 at 9:27 AM Antonin Houska <ah@cybertec.at> wrote:
> >> However I find it confusing that the trigger functions pass detectNewRows=true
> >> even if they only execute SELECT statement.
>
> > I don't quite see what those two things have to do with each other,
> > but I might be missing something.
>
> Me too.
Sure, that was the problem! I was thinking about the crosscheck snapshot so
much that I missed the fact that snapshot handling is not the only purpose of
the detectNewRows argument. What I considered a problem can be fixed this way,
if it's worth:
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index e1aa3d0044..993e778875 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -233,7 +233,7 @@ static bool ri_PerformCheck(const RI_ConstraintInfo *riinfo,
                 RI_QueryKey *qkey, SPIPlanPtr qplan,
                 Relation fk_rel, Relation pk_rel,
                 HeapTuple old_tuple, HeapTuple new_tuple,
-                bool detectNewRows, int expect_OK);
+                bool detectNewRows, bool select_only, int expect_OK);
 static void ri_ExtractValues(Relation rel, HeapTuple tup,
                  const RI_ConstraintInfo *riinfo, bool rel_is_pk,
                  Datum *vals, char *nulls);
@@ -439,7 +439,7 @@ RI_FKey_check(TriggerData *trigdata)
     ri_PerformCheck(riinfo, &qkey, qplan,
                     fk_rel, pk_rel,
                     NULL, new_row,
-                    false,
+                    false, true,
                     SPI_OK_SELECT);
     if (SPI_finish() != SPI_OK_FINISH)
@@ -575,6 +575,7 @@ ri_Check_Pk_Match(Relation pk_rel, Relation fk_rel,
                              fk_rel, pk_rel,
                              old_row, NULL,
                              true,    /* treat like update */
+                             true,
                              SPI_OK_SELECT);
     if (SPI_finish() != SPI_OK_FINISH)
@@ -803,6 +804,7 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
                             fk_rel, pk_rel,
                             old_row, NULL,
                             true,    /* must detect new rows */
+                            true,
                             SPI_OK_SELECT);
             if (SPI_finish() != SPI_OK_FINISH)
@@ -943,6 +945,7 @@ RI_FKey_cascade_del(PG_FUNCTION_ARGS)
                             fk_rel, pk_rel,
                             old_row, NULL,
                             true,    /* must detect new rows */
+                            false,
                             SPI_OK_DELETE);
             if (SPI_finish() != SPI_OK_FINISH)
@@ -1099,6 +1102,7 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
                             fk_rel, pk_rel,
                             old_row, new_row,
                             true,    /* must detect new rows */
+                            false,
                             SPI_OK_UPDATE);
             if (SPI_finish() != SPI_OK_FINISH)
@@ -1286,6 +1290,7 @@ ri_setnull(TriggerData *trigdata)
                             fk_rel, pk_rel,
                             old_row, NULL,
                             true,    /* must detect new rows */
+                            false,
                             SPI_OK_UPDATE);
             if (SPI_finish() != SPI_OK_FINISH)
@@ -1473,6 +1478,7 @@ ri_setdefault(TriggerData *trigdata)
                             fk_rel, pk_rel,
                             old_row, NULL,
                             true,    /* must detect new rows */
+                            false,
                             SPI_OK_UPDATE);
             if (SPI_finish() != SPI_OK_FINISH)
@@ -2356,7 +2362,7 @@ ri_PerformCheck(const RI_ConstraintInfo *riinfo,
                 RI_QueryKey *qkey, SPIPlanPtr qplan,
                 Relation fk_rel, Relation pk_rel,
                 HeapTuple old_tuple, HeapTuple new_tuple,
-                bool detectNewRows, int expect_OK)
+                bool detectNewRows, bool select_only, int expect_OK)
 {
     Relation    query_rel,
                 source_rel;
@@ -2423,17 +2429,20 @@ ri_PerformCheck(const RI_ConstraintInfo *riinfo,
      * that SPI_execute_snapshot will register the snapshots, so we don't need
      * to bother here.
      */
+    test_snapshot = InvalidSnapshot;
+    crosscheck_snapshot = InvalidSnapshot;
     if (IsolationUsesXactSnapshot() && detectNewRows)
     {
         CommandCounterIncrement();    /* be sure all my own work is visible */
+
         test_snapshot = GetLatestSnapshot();
-        crosscheck_snapshot = GetTransactionSnapshot();
-    }
-    else
-    {
-        /* the default SPI behavior is okay */
-        test_snapshot = InvalidSnapshot;
-        crosscheck_snapshot = InvalidSnapshot;
+
+        /*
+         * crosscheck_snapshot is actually used only for UPDATE / DELETE
+         * queries.
+         */
+        if (!select_only)
+            crosscheck_snapshot = GetTransactionSnapshot();
     }
     /*
> It's quite easy to demonstrate that the second part of Antonin's
> proposed patch (ie, don't check for new rows in the FK table during
> ri_restrict) is wrong:
>
> regression=# create table p(f1 int primary key);
> CREATE TABLE
> regression=# create table f(f1 int references p on delete no action deferrable initially deferred);
> CREATE TABLE
> regression=# insert into p values (1);
> INSERT 0 1
> regression=# begin transaction isolation level serializable ;
> BEGIN
> regression=# table f;
>  f1
> ----
> (0 rows)
>
> regression=# delete from p where f1=1;
> DELETE 1
>
> -- now, in another session:
>
> regression=# insert into f values (1);
> INSERT 0 1
>
> -- next, back in first session:
>
> regression=# commit;
> ERROR:  update or delete on table "p" violates foreign key constraint "f_f1_fkey" on table "f"
> DETAIL:  Key (f1)=(1) is still referenced from table "f".
>
> With the patch, the first session's commit succeeds, and we're left
> with a violated FK constraint.
When I was running this example, the other session got blocked until the first
(serializable) transaction committed. To achieve this ERROR (or FK violated
due to my patch proposal) I had to run the other session's INSERT before the
first session's DELETE. But I think I understand the problem.
Thanks for the detailed analysis.
--
Antonin Houska
https://www.cybertec-postgresql.com
			
		Antonin Houska <ah@cybertec.at> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's quite easy to demonstrate that the second part of Antonin's
>> proposed patch (ie, don't check for new rows in the FK table during
>> ri_restrict) is wrong:
> When I was running this example, the other session got blocked until the first
> (serializable) transaction committed. To achieve this ERROR (or FK violated
> due to my patch proposal) I had to run the other session's INSERT before the
> first session's DELETE.
Oh, right, I copied and pasted out of my terminal windows in the wrong
order :-(.  The INSERT indeed must happen before the DELETE (and the
TABLE or SELECT before that, so as to establish the serializable
transaction's snapshot before the insertion).  Sorry about that.
Not sure what I think about your new proposed patch.  What problem
do you think it solves?  Also, don't think I believe this:
+         * crosscheck_snapshot is actually used only for UPDATE / DELETE
+         * queries.
The commands we're issuing here are SELECT FOR UPDATE^H^H^HSHARE,
and those should chase up to the newest row version and do a
crosscheck just as UPDATE / DELETE would do.  If they don't, there's
a hazard of mis-enforcing the FK constraint in the face of
concurrent updates.
            regards, tom lane
			
		Tom Lane <tgl@sss.pgh.pa.us> wrote: > Not sure what I think about your new proposed patch. What problem > do you think it solves? Also, don't think I believe this: > > + * crosscheck_snapshot is actually used only for UPDATE / DELETE > + * queries. I wanted to clarify the meaning of crosscheck_snapshot, i.e. only set it when it's needed. Anyway I don't feel now it's worth the amount of code changed. > The commands we're issuing here are SELECT FOR UPDATE^H^H^HSHARE, > and those should chase up to the newest row version and do a > crosscheck just as UPDATE / DELETE would do. If they don't, there's > a hazard of mis-enforcing the FK constraint in the face of > concurrent updates. Maybe I missed something. When I searched through the code I saw the crosscheck_snapshot passed only to heap_update() and heap_delete(). -- Antonin Houska https://www.cybertec-postgresql.com