Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
От | Dean Rasheed |
---|---|
Тема | Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0 |
Дата | |
Msg-id | CAEZATCVDdYRFbF4NMXTF-NKYibbR2VSfNXRWPyyByaCpV1jwEw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0 (Peter Geoghegan <pg@heroku.com>) |
Ответы |
Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
(Stephen Frost <sfrost@snowman.net>)
|
Список | pgsql-hackers |
On 9 April 2015 at 22:18, Peter Geoghegan <pg@heroku.com> wrote: > The big idea (the fine details of which Stephen appeared to be in > tentative agreement with [1]) is that an UPSERT is a hybrid between an > INSERT and an UPDATE, and not simply an INSERT and separate UPDATE > tied together. So at the very least the exact behavior of such a > hybrid cannot be assumed to be any particular way just from > generalizing from known behaviors for INSERT and UPDATE (which is a > usability issue, since the fine details of RLS are already complex > enough without UPSERT). > I think that you're over-complicating this. From a usability point of view, I think that the best approach is to keep this as simple as possible and make the behaviour consistent with an INSERT and an UPDATE tied together, as is suggested by the new syntax. The key point is that, if you are using the RLS INSERT and UPDATE policies for this new command, the rule should be that the user has permission to insert/update a new/existing row if and only if they would have had permission to do so using a stand-alone INSERT/UPDATE command. On the other hand, if you believe that the behaviour should be something other than that, then I think that it would need a new dedicated kind of RLS policy for this command because, as I will attempt to explain below, merging together the quals from INSERT and UPDATE policies leads to logical problems. > The INSERT policies are only enforced when a tuple is inserted > because, when the alternative path isn't taken then it's really just > an INSERT. > Agreed. > For the UPDATE path, where the stickiness/hybridness begins... > <snip> > I'd appreciate it if you explicitly outlined what policies you feel > should be enforce at each of the 3 junctures within an UPSERT (post > INSERT, pre-UPDATE, post-UPDATE). I would also like you to be very > explicit about whether or not RLS WITH CHECK policies and USING quals > (presumably enforced as RLS WITH CHECK policies) from both INSERT and > UPDATE policies should be enforced at each point. In particular, > should I ever not treat RLS WCO and USING quals equivalently? (recall > that we don't want to elide an UPDATE silently, which makes much less > sense for UPSERT...I had assumed that whatever else, we'd always treat > WCO and USING quals from UPDATE (and ALL) policies equivalently, but > perhaps not). OK, I'll try to explicitly outline how I think this ought to work: For INSERTs and UPDATEs, there are 3 kinds of RLS quals that come into play: 1). insertCheckQuals - the logical OR of the quals from all INSERT WITH CHECK policy clauses. These give the user permission to insert into the table, provided that the new rows satisfy at least one of these quals. 2). updateUsingQuals - the logical OR of the quals from all UPDATE USING policy clauses. These give the user permission to update existing rows in the table, where the existing rows satisfy at least one of these quals. 3). updateCheckQuals - the logical OR of the quals from all UPDATE WITH CHECK policy clauses. These give the user permission to update the table, provided that the new rows satisfy at least one of these quals. In general these may all differ from one another. If we go forward with the idea that RLS quals should be checked before attempting to insert/update any data, as we do for regular permission checks, then stand-alone INSERT and UPDATE commands would work conceptually as follows: INSERT: 1. Check NEW against insertCheckQuals (error if the result is not true) 2. Do the actual insert of NEW UPDATE: 1. Check OLD against updateUsingQuals (skip rows that don't match) 2. Check NEW against updateCheckQuals (error ifthe result is not true) 3. Do the actual update (OLD -> NEW) Of course that's an over-simplification. The updateUsingQuals are actually merged with the user-supplied WHERE clause in a way dependent on the presence of leaky functions, but conceptually it matches the above description. I think that there is universal agreement that an INSERT .. ON CONFLICT UPDATE that follows the insert path ought to match the pure INSERT case, and only check the insertCheckQuals. That just leaves the question of what to do on the update path, where things get more complicated because there are 3 tuples involved in the process: 1). t1 - the tuple originally proposed for insertion, but which could not be inserted due to a conflict with an existing row in the table (aka EXCLUDED). 2). t2 - the existing row in the table that prevented t1 from being inserted (aka TARGET). 3). t3 - the final new row resulting from the update path. In general, we allow this to be quite different from t1 and t2. If we think of INSERT .. ON CONFLICT UPDATE as an INSERT and an UPDATE tied together, then logically the following would happen if the update path were taken: INSERT .. ON CONFLICT UPDATE (update path): 1. Check t1 against insertCheckQuals (error if not true) 2. Detect that t1 conflictswith t2 3. Test user-supplied auxiliary WHERE clause (skip if not true) 4. Check t2 against updateUsingQuals (errorif not true) 5. Check t3 against updateCheckQuals (error if not true) 6. Do the actual update (t2 -> t3) Step (4) is the only point where this differs from an INSERT and an UPDATE tied together, and only in the fact that it throws an error rather than skipping the row to be updated if the user doesn't have permission to do so. The important point is that this is consistent with the rule that it gives the user permission to insert/update a new/existing row if and only if they would have been allowed to do so using a stand-alone INSERT/UPDATE command. Note that there are 3 participating tuples, and each is checked against precisely one of the 3 relevant policies. That is important, as I will attempt to explain. As it stands, your patch classifies WCOs by command type and combines them in a way that results in 4 additional RLS checks being performed (not really in this order): 5.1. Check t2 against insertCheckQuals (error if not true) 5.2. Check t2 against updateCheckQuals (error if not true) 5.3.Check t3 against insertCheckQuals (error if not true) 5.4. Check t3 against updateUsingQuals (error if not true) Some of those additional checks really don't make any sense, e.g., (5.2) prevents you from updating an existing tuple if you wouldn't have had permission to make an update that resulted in that tuple, which just seems completely backwards. But the real problem is that by applying multiple checks to the same tuple, you're implicitly ANDing together the quals from different RLS policy types and the problem with that is that it's possible for the quals from different policy types to be completely incompatible (their logical AND may be identically equivalent to false). So by doing these additional checks it may be impossible for any tuple to ever satisfy all the checks at the same time, making the command unusable. So the general rule, as I alluded to above, should be that no 2 different kinds of RLS quals should ever be applied to the same tuple. The nub of the problem is that you're classifying WCOs by command type, which is wrong because updateUsingQuals have different semantics from updateCheckQuals, even though they both originate from the UPDATE policy. If you factor in updatable SB views, which we might hope to one day have support for INSERT .. ON CONFLICT UPDATE, there will be additional WCOs arising from the view, which will have different semantics again (the SQL spec says that they should apply after the insert/update to ensure that the final result is visible in the view). These are then another separate kind of WCO (that doesn't depend on the command type). I think this will actually be quite straightforward, as long as all the different kinds of WCO aren't lumped together. If you take a look at my patch to apply RLS checks before insert/update, you'll see that I've added a WCOKind enum that allows each of these kinds of WCOs to be treated differently and checked at different stages in the executor. I anticipated that INSERT .. ON CONFLICT UPDATE would extend that by adding a new kind of WCO for updateUsingQuals checked as if they were WCOs, allowing them to be checked at a specific point in the new code (step (4) above). In all of this, I think we should try to keep things as simple as possible, to give the end user a chance to understand it --- although I'm not sure I've achieved that with my explanation :-) Regards, Dean
В списке pgsql-hackers по дате отправления: