Re: record identical operator

From Andres Freund
Subject Re: record identical operator
Date
Msg-id 20130913215900.GB7437@awork2.anarazel.de
In response to Re: record identical operator  (Kevin Grittner <kgrittn@ymail.com>)
Responses Re: record identical operator  (Kevin Grittner <kgrittn@ymail.com>)
List pgsql-hackers
On 2013-09-13 14:36:27 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2013-09-12 15:27:27 -0700, Kevin Grittner wrote:
> >> The new operator is logically similar to IS NOT DISTINCT FROM for a
> >> record, although its implementation is very different.  For one
> >> thing, it doesn't replace the operation with column level operators
> >> in the parser.  For another thing, it doesn't look up operators for
> >> each type, so the "identical" operator does not need to be
> >> implemented for each type to use it as shown above.  It compares
> >> values byte-for-byte, after detoasting.  The test for identical
> >> records can avoid the detoasting altogether for any values with
> >> different lengths, and it stops when it finds the first column with
> >> a difference.
> >
> > In the general case, that operator sounds dangerous to me. We don't
> > guarantee that a Datum containing the same data always has the same
> > binary representation. E.g. an array may or may not have a null bitmap,
> > depending on how it was created.
> >
> > I am not actually sure whether that's a problem for your use case, but I
> > get headaches when we try circumventing the type abstraction that way.
> >
> > Yes, we do such tricks in other places already, but afaik in all those
> > places erroneously believing two Datums are distinct is not an error, just
> > a missed optimization. Allowing a general operator with such a murky
> > definition to creep into something SQL exposed... Hm. Not sure.
>
> Well, the only two alternatives I could see were to allow
> user-visible differences not to be carried to the matview if the
> old and new values were considered "equal", or to implement an
> "identical" operator or function in every type that was to be
> allowed in a matview.  Given those options, what's in this patch
> seemed to me to be the least evil.
>
> It might be worth noting that this scheme doesn't have a problem
> with correctness if there are multiple equal values which are not
> identical, as long as any two identical values are equal.  If the
> query which generates contents for a matview generates
> non-identical but equal values from one run to the next without any
> particular reason, that might cause performance problems.

I am not actually that concerned with matviews using this; you're quite
capable of analyzing the dangers. What I am wary of is exposing to SQL an
operator that's basically broken from the get-go.
Now, the obvious issue there is that matviews use SQL to refresh :(
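
As a rough illustration of the concern quoted above, here is a toy Python
sketch. The byte layout, and the names encode, decode, type_equal and
identical, are all made up for the example; this is not PostgreSQL's actual
array format and not the patch's code. It only shows how two values can be
equal at the type level while a byte-for-byte test (with the length
short-circuit Kevin describes) reports them as different:

```python
import struct

# Toy layout (an assumption for illustration, NOT PostgreSQL's format):
#   1 flag byte (bitmap present?), 1 count byte,
#   optional null bitmap (1 bit per element, set = not null),
#   then one little-endian int32 per non-null element.

def encode(values, with_bitmap):
    """Serialize a short list of ints/None into the toy layout."""
    if any(v is None for v in values) and not with_bitmap:
        raise ValueError("nulls require a bitmap")
    out = bytearray([1 if with_bitmap else 0, len(values)])
    if with_bitmap:
        bitmap = bytearray((len(values) + 7) // 8)
        for i, v in enumerate(values):
            if v is not None:
                bitmap[i // 8] |= 1 << (i % 8)
        out += bitmap
    for v in values:
        if v is not None:
            out += struct.pack("<i", v)
    return bytes(out)

def decode(buf):
    """Inverse of encode: recover the logical list of values."""
    with_bitmap, n = buf[0], buf[1]
    pos = 2
    bitmap = b""
    if with_bitmap:
        nbytes = (n + 7) // 8
        bitmap = buf[pos:pos + nbytes]
        pos += nbytes
    values = []
    for i in range(n):
        if with_bitmap and not (bitmap[i // 8] >> (i % 8)) & 1:
            values.append(None)
        else:
            values.append(struct.unpack_from("<i", buf, pos)[0])
            pos += 4
    return values

def type_equal(a, b):
    """Type-level equality: compare the decoded logical values."""
    return decode(a) == decode(b)

def identical(a, b):
    """Byte-level test: different lengths can never be identical."""
    return len(a) == len(b) and a == b

# Same logical array, built two different ways.
a = encode([1, 2, 3], with_bitmap=False)
b = encode([1, 2, 3], with_bitmap=True)
print(type_equal(a, b), identical(a, b))
```

In this toy model identical values are always equal, which is the property
Kevin relies on; the reverse does not hold, which is the part that worries me
for a general SQL-exposed operator.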

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


