Re: record identical operator

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: record identical operator
Дата
Msg-id 20130914185832.GA2291@awork2.anarazel.de
обсуждение исходный текст
Ответ на Re: record identical operator  (Kevin Grittner <kgrittn@ymail.com>)
Ответы Re: record identical operator  (Kevin Grittner <kgrittn@ymail.com>)
Re: record identical operator  (Noah Misch <noah@leadboat.com>)
Список pgsql-hackers
On 2013-09-14 11:25:52 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
>
> > what I am talking about is that
> > e.g.: SELECT (ARRAY[1,2,3,NULL])[1:3] = ARRAY[1,2,3];
> > obviously should be true.
>
> The patch does not change the behavior of the = operator for any
> type under any circumstances.

Yes, sure. I wasn't thinking you would.

> > But both arrays don't have the same binary representation since
> > the former has a null bitmap, the latter not. So, if you had a
> > composite type like (int4[]) and would compare that without
> > invoking operators you'd return something false in some cases
> > because of the null bitmaps.
>
> Not for the = operator.  The new "identical" operator would find
> them to not be identical, though.

Yep. And I think that's a problem if exposed to SQL. People won't
understand the hazards and end up using it because its faster or
somesuch.

> Since the new operator is only for the record type, I need to wrap
> the values in your example:

Yes.

> The REFRESH causes them to match again, and later REFRESH runs
> won't see a need to do any work there unless the on-disk
> representation changes again.

Yes, I understand that the matview code itself will just perform
superflous work. We use such comparisons in other parts of the code
similarly.

> As far as I can see, we have four choices:
>
> (1)  Never update values that are "equal", even if they appear
> different to the users, as was demonstrated with the citext
> example.

I think, introducing a noticeable amount of infrastructure for this just
because of citext is a bad idea.
At some point we need to replace citext with proper case-insensitive
collation support - then it really might become necessary.

> (2)  Require every data type which can be used in a matview to
> implement some new operator or function for "identical".  Perhaps
> that could be mitigated to only implementat it if equal values can
> have user-visible differences.

That basically would require adding a new member to btree opclasses that
btrees don't need themselves... Hm.

> (3)  Embed special cases into record identical tests for types
> known to allow multiple on-disk representations which have no
> user-visible differences.

I think this is a complete nogo. a) I don't forsee we know of all these
cases b) it wouldn't be extensible.

Oh. Now that I've read further, I see you feel the same. Good ;)

> (4)  Base the need to update a matview column on whether its
> on-disk representation is identical to what a new run of the
> defining query would generate.  If this causes performance problems
> for use of a given type in a matview, one possible solution would
> be to modify that particular type to use a canonical format when
> storing a value into a record.  For example, storing an array which
> has a bitmap of null values even though there are no nulls in the
> array could strip the bitmap as it is stored to the record.

If matview refreshs weren't using plain SQL and thus wouldn't require
exposing that operator to SQL I wouldn't have a problem with this...

There's the ungodly ugly choice of having an matview_equal function (or
operator) that checks if we're doing a refresh atm...


Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Marko Tiikkaja
Дата:
Сообщение: Re: Assertions in PL/PgSQL
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: git apply vs patch -p1