Re: record identical operator

From: Kevin Grittner
Subject: Re: record identical operator
Date:
Message-ID: 1379108187.76931.YahooMailNeo@web162904.mail.bf1.yahoo.com
In reply to: Re: record identical operator  (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: record identical operator  (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-hackers

Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-09-12 15:27:27 -0700, Kevin Grittner wrote:
>> The new operator is logically similar to IS NOT DISTINCT FROM for a
>> record, although its implementation is very different.  For one
>> thing, it doesn't replace the operation with column level operators
>> in the parser.  For another thing, it doesn't look up operators for
>> each type, so the "identical" operator does not need to be
>> implemented for each type to use it as shown above.  It compares
>> values byte-for-byte, after detoasting.  The test for identical
>> records can avoid the detoasting altogether for any values with
>> different lengths, and it stops when it finds the first column with
>> a difference.
>
> In the general case, that operator sounds dangerous to me. We don't
> guarantee that a Datum containing the same data always has the same
> binary representation. E.g. array can have a null bitmap or may not have
> one, depending on how they were created.
>
> I am not actually sure whether that's a problem for your use case, but I
> get headaches when we try circumventing the type abstraction that way.
>
> Yes, we do such tricks in other places already, but afaik in all those
> places erroneously believing two Datums are distinct is not an error, just
> a missed optimization. Allowing a general operator with such a murky
> definition to creep into something SQL exposed... Hm. Not sure.

Well, the only two alternatives I could see were to allow
user-visible differences not to be carried to the matview if the
old and new values were considered "equal", or to implement an
"identical" operator or function in every type that was to be
allowed in a matview.  Given those options, what's in this patch
seemed to me to be the least evil.
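
To make the distinction concrete, here is a minimal sketch in Python
(not the patch's C code).  It assumes the text form of a value can stand
in for its stored bytes; `encode` and `records_identical` are
hypothetical names used only for illustration.  It shows a pair of rows
that compare as equal yet differ in a user-visible way, which is the
case the first alternative would fail to carry to the matview.

```python
from decimal import Decimal


def encode(value):
    # Assumption: the text form stands in for the datum's stored bytes.
    return str(value).encode("utf-8")


def records_identical(left, right):
    """Compare two rows column by column on their byte images."""
    if len(left) != len(right):
        return False
    for a, b in zip(left, right):
        raw_a, raw_b = encode(a), encode(b)
        # Differing lengths settle the question without comparing contents.
        if len(raw_a) != len(raw_b):
            return False
        # Stop at the first column that differs.
        if raw_a != raw_b:
            return False
    return True


old_row = (1, Decimal("1.0"))
new_row = (1, Decimal("1.00"))

print(old_row == new_row)                    # equality by value
print(records_identical(old_row, new_row))   # byte-wise identity
```

The rows are equal by value, but the second column would display
differently, so only the byte-wise test notices the change.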

It might be worth noting that this scheme doesn't have a problem
with correctness if there are multiple equal values which are not
identical, as long as any two identical values are equal.  If the
query which generates contents for a matview generates
non-identical but equal values from one run to the next without any
particular reason, that might cause performance problems.
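
A rough sketch of that argument, again in Python and under stated
assumptions: `identical` uses `repr` as a stand-in for a byte-wise
comparison, and `refresh` is a hypothetical, much-simplified model of
applying fresh query results to a matview held as a dict.  It is not how
the patch performs a refresh.

```python
from decimal import Decimal


def identical(a, b):
    # Assumption: repr stands in for the stored byte image of a row.
    return repr(a) == repr(b)


def refresh(matview, fresh):
    """Apply fresh rows to matview (key -> row); return the rewritten keys."""
    rewritten = []
    for key, row in fresh.items():
        # Rewrite a row only when it is new or not identical to the old one.
        if key not in matview or not identical(matview[key], row):
            matview[key] = row
            rewritten.append(key)
    # Drop rows that the query no longer produces.
    for key in list(matview):
        if key not in fresh:
            del matview[key]
    return rewritten


matview = {"a": (Decimal("1.0"),), "b": (Decimal("2"),)}
fresh = {"a": (Decimal("1.00"),), "b": (Decimal("2"),)}

changed = refresh(matview, fresh)
print(changed)           # keys that were rewritten
print(matview == fresh)  # contents still match the query result
```

Row "a" is rewritten even though the old and new values are equal, which
costs work but leaves the matview correct; row "b" is identical and is
skipped.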

To mangle Orwell: "Among pairs of equal values, some pairs are more
equal than others."

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


