Re: record identical operator
От | Andres Freund |
---|---|
Тема | Re: record identical operator |
Дата | |
Msg-id | 20130914185832.GA2291@awork2.anarazel.de обсуждение исходный текст |
Ответ на | Re: record identical operator (Kevin Grittner <kgrittn@ymail.com>) |
Ответы |
Re: record identical operator
Re: record identical operator |
Список | pgsql-hackers |
On 2013-09-14 11:25:52 -0700, Kevin Grittner wrote: > Andres Freund <andres@2ndquadrant.com> wrote: > > > what I am talking about is that > > e.g.: SELECT (ARRAY[1,2,3,NULL])[1:3] = ARRAY[1,2,3]; > > obviously should be true. > > The patch does not change the behavior of the = operator for any > type under any circumstances. Yes, sure. I wasn't thinking you would. > > But both arrays don't have the same binary representation since > > the former has a null bitmap, the latter not. So, if you had a > > composite type like (int4[]) and would compare that without > > invoking operators you'd return something false in some cases > > because of the null bitmaps. > > Not for the = operator. The new "identical" operator would find > them to not be identical, though. Yep. And I think that's a problem if exposed to SQL. People won't understand the hazards and end up using it because its faster or somesuch. > Since the new operator is only for the record type, I need to wrap > the values in your example: Yes. > The REFRESH causes them to match again, and later REFRESH runs > won't see a need to do any work there unless the on-disk > representation changes again. Yes, I understand that the matview code itself will just perform superflous work. We use such comparisons in other parts of the code similarly. > As far as I can see, we have four choices: > > (1) Never update values that are "equal", even if they appear > different to the users, as was demonstrated with the citext > example. I think, introducing a noticeable amount of infrastructure for this just because of citext is a bad idea. At some point we need to replace citext with proper case-insensitive collation support - then it really might become necessary. > (2) Require every data type which can be used in a matview to > implement some new operator or function for "identical". Perhaps > that could be mitigated to only implementat it if equal values can > have user-visible differences. That basically would require adding a new member to btree opclasses that btrees don't need themselves... Hm. > (3) Embed special cases into record identical tests for types > known to allow multiple on-disk representations which have no > user-visible differences. I think this is a complete nogo. a) I don't forsee we know of all these cases b) it wouldn't be extensible. Oh. Now that I've read further, I see you feel the same. Good ;) > (4) Base the need to update a matview column on whether its > on-disk representation is identical to what a new run of the > defining query would generate. If this causes performance problems > for use of a given type in a matview, one possible solution would > be to modify that particular type to use a canonical format when > storing a value into a record. For example, storing an array which > has a bitmap of null values even though there are no nulls in the > array could strip the bitmap as it is stored to the record. If matview refreshs weren't using plain SQL and thus wouldn't require exposing that operator to SQL I wouldn't have a problem with this... There's the ungodly ugly choice of having an matview_equal function (or operator) that checks if we're doing a refresh atm... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
В списке pgsql-hackers по дате отправления: