Обсуждение: Fix optimization of foreign-key on update actions
I came across an edge case in how our foreign key implementation works that I think is not SQL conforming. It has to do with how updates to values that "look" different but compare as equal are cascaded. A simple case involves float -0 vs. 0, but relevant cases also arise with citext and case-insensitive collations. Consider this example: create table pktable2 (a float8, b float8, primary key (a, b)); create table fktable2 (x float8, y float8, foreign key (x, y) references pktable2 (a, b) on update cascade); insert into pktable2 values ('-0', '-0'); insert into fktable2 values ('-0', '-0'); update pktable2 set a = '0' where a = '-0'; What happens now? select * from pktable2; a | b ---+---- 0 | -0 (1 row) -- buggy: did not update fktable2.x select * from fktable2; x | y ----+---- -0 | -0 (1 row) This happens because ri_KeysEqual() compares the old and new rows and decides that because they are "equal", the ON UPDATE actions don't need to be run. The SQL standard seems clear that ON CASCADE UPDATE means that an analogous UPDATE should be run on matching rows in the foreign key table, so the current behavior is wrong. Moreover, if another column is also updated, like update pktable2 set a = '0', b = '5', then the old and new rows are no longer equal, and so x will get updated in fktable2. So the context creates inconsistencies. The fix is that in these cases we have ri_KeysEqual() use a more low-level form of equality, like for example record_image_eq does. In fact, we can take the same code. See attached patches. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
>>>>> "Peter" == Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: Peter> The SQL standard seems clear ha hahaha HAHAHAHAHAHA (since when has the SQL standard ever been clear?) SQL2016, 15.17 Execution of referential actions 10) If a non-null value of a referenced column RC in the referenced table is updated to a value that is distinct from the current value of RC, then, [snip all the stuff about how ON UPDATE actions work] does that "is distinct from" mean that IS DISTINCT FROM would be true, or does it mean "is in some way distinguishable from"? Nothing I can see in the spec suggests the latter. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Peter" == Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > Peter> The SQL standard seems clear > (since when has the SQL standard ever been clear?) Point to Andrew ;-). However, I kind of like Peter's idea anyway on the grounds that byte-wise comparison is probably faster than invoking the datatype comparators. Also, language lawyering aside, I think he may be right about what people would expect "should" happen. What I *don't* like about the proposed patch is that it installs a new, different comparison rule for the ON UPDATE CASCADE case only. If we were to go in this direction, I'd think we should try to use the same comparison rule for all FK row comparisons. The inconsistencies get messier the more you think about it, really. If a referencing row was datatype-equal, but not byte-equal, to the PK row to start with, why would an update changing the PK row (perhaps to another datatype-equal value) result in forcing the referencing row to become byte-equal? How does this fit in with the fact that our notion of what uniqueness means in the PK table is no-datatype-equality, rather than no-byte-equality? On the whole we might be better off leaving this alone. regards, tom lane
Andrew Gierth wrote: > SQL2016, 15.17 Execution of referential actions > > 10) If a non-null value of a referenced column RC in the referenced > table is updated to a value that is distinct from the current value > of RC, then, > > [snip all the stuff about how ON UPDATE actions work] > > does that "is distinct from" mean that IS DISTINCT FROM would be true, > or does it mean "is in some way distinguishable from"? Nothing I can see > in the spec suggests the latter. My 2003 standard defines, and even condescends to be informal: 3.1.6.8 distinct (of a pair of comparable values): Capable of being distinguished within a given context. Informally, not equal, not both null. A null value and a non-null value are distinct. Yours, Laurenz Albe
>>>>> "Laurenz" == Laurenz Albe <laurenz.albe@cybertec.at> writes: Laurenz> Andrew Gierth wrote: >> SQL2016, 15.17 Execution of referential actions >> >> 10) If a non-null value of a referenced column RC in the referenced >> table is updated to a value that is distinct from the current value >> of RC, then, >> >> [snip all the stuff about how ON UPDATE actions work] >> >> does that "is distinct from" mean that IS DISTINCT FROM would be true, >> or does it mean "is in some way distinguishable from"? Nothing I can see >> in the spec suggests the latter. Laurenz> My 2003 standard defines, and even condescends to be informal: Laurenz> 3.1.6.8 distinct (of a pair of comparable values): Capable of Laurenz> being distinguished within a given context. Informally, not Laurenz> equal, not both null. A null value and a non-null value are Laurenz> distinct. Hrm. SQL2016 has similar language which I previously missed, but I don't think it actually helps: 3.1.6.9 distinct (of a pair of comparable values) capable of being distinguished within a given context NOTE 8 -- Informally, two values are distinct if neither is null and the values are not equal. A null value and a non- null value are distinct. Two null values are not distinct. See Subclause 4.1.5, "Properties of distinct", and the General Rules of Subclause 8.15, "<distinct predicate>". Two values which are sql-equal but not identical, such as two strings in a case-insensitive collation that differ only by case, are distinguishable in some contexts but not others, so what context actually applies to the quoted rule? I think the only reasonable interpretation is that it should use the same kind of distinctness that is being used by the unique constraint and the equality comparison that define whether the FK is satisfied. -- Andrew.
On 06/02/2019 12:23, Andrew Gierth wrote: > Two values which are sql-equal but not identical, such as two strings in > a case-insensitive collation that differ only by case, are > distinguishable in some contexts but not others, so what context > actually applies to the quoted rule? > > I think the only reasonable interpretation is that it should use the > same kind of distinctness that is being used by the unique constraint > and the equality comparison that define whether the FK is satisfied. By that logic, a command such as UPDATE t1 SET x = '0' WHERE x = '-0'; could be optimized away as a noop, because in that world there is no construct by which you can prove whether the update happened. I think that would not be satisfactory. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 05/02/2019 17:20, Tom Lane wrote: > What I *don't* like about the proposed patch is that it installs a > new, different comparison rule for the ON UPDATE CASCADE case only. > If we were to go in this direction, I'd think we should try to use > the same comparison rule for all FK row comparisons. That's easy to change. I had it like that in earlier versions of the patch. I agree it would be better for consistency, but it would create some cases where we do unnecessary extra work. > The inconsistencies get messier the more you think about it, > really. If a referencing row was datatype-equal, but not byte-equal, > to the PK row to start with, why would an update changing the PK row > (perhaps to another datatype-equal value) result in forcing the > referencing row to become byte-equal? How does this fit in with > the fact that our notion of what uniqueness means in the PK table > is no-datatype-equality, rather than no-byte-equality? This patch doesn't actually change the actual foreign key behavior. Foreign keys already work like that. The patch only touches an optimization that checks whether it's worth running the full foreign key behavior because the change was significant enough. That shouldn't affect the outcome. It should be the case that if you replace RI_FKey_pk_upd_check_required() by just "return true", then nothing user-visible changes. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-02-06 23:15, Peter Eisentraut wrote: > On 05/02/2019 17:20, Tom Lane wrote: >> What I *don't* like about the proposed patch is that it installs a >> new, different comparison rule for the ON UPDATE CASCADE case only. >> If we were to go in this direction, I'd think we should try to use >> the same comparison rule for all FK row comparisons. > > That's easy to change. I had it like that in earlier versions of the > patch. I agree it would be better for consistency, but it would create > some cases where we do unnecessary extra work. Updated patch with this change made, and some conflicts resolved. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On 2019-03-11 12:57, Peter Eisentraut wrote: > On 2019-02-06 23:15, Peter Eisentraut wrote: >> On 05/02/2019 17:20, Tom Lane wrote: >>> What I *don't* like about the proposed patch is that it installs a >>> new, different comparison rule for the ON UPDATE CASCADE case only. >>> If we were to go in this direction, I'd think we should try to use >>> the same comparison rule for all FK row comparisons. >> >> That's easy to change. I had it like that in earlier versions of the >> patch. I agree it would be better for consistency, but it would create >> some cases where we do unnecessary extra work. > > Updated patch with this change made, and some conflicts resolved. Committed like this. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services