Re: record identical operator - Review

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: record identical operator - Review
Дата
Msg-id 20130926215008.GB7113@momjian.us
обсуждение исходный текст
Ответ на Re: record identical operator - Review  (Steve Singer <steve@ssinger.info>)
Ответы Re: record identical operator - Review  (Bruce Momjian <bruce@momjian.us>)
Список pgsql-hackers
On Thu, Sep 19, 2013 at 06:31:38PM -0400, Steve Singer wrote:
> On 09/12/2013 06:27 PM, Kevin Grittner wrote:
> >Attached is a patch for a bit of infrastructure I believe to be
> >necessary for correct behavior of REFRESH MATERIALIZED VIEW
> >CONCURRENTLY as well as incremental maintenance of matviews.
> 
> Here is attempt at a review of the v1 patch.
> 
> There has been a heated discussion on list about if we even want
> this type of operator. I will try to summarize it here
> 
> The arguments against it are
> * that a record_image_identical call on two records that returns
> false can still return true under the equality operator, and the
> records might (or might not) appear to be the same to users
> * Once we expose this as an operator via SQL someone will find it
> and use it and then complain when we change things such as the
> internal representation of a datatype which might
>    break there queries
> * The differences between = and record_image_identical might not be
> understood by everywhere who finds the operator leading to confusion
> * Using a criteria other than equality (the = operator provided by
> the datatype) might cause problems if we later provide 'on change'
> triggers for materialized views
> 
> The arguments for this patch are
> * We want the materialized view to return the same value as would be
> returned if the query were executed directly.  This not only means
> that the values should be the same according to a datatypes =
> operator but that they should appear the same 'to the eyeball'.
> * Supporting the materialized view refresh check via SQL makes a lot
> of sense but doing that requires exposing something via SQL
> * If we adequately document what we mean by record_image_identical
> and the operator we pick for this then users shouldn't be surprised
> at what they get if they use this

This is a good summary.  I think there are a few things that make this
issue difficult to decide:

1.  We have to use an operator to give the RMVC (REFRESH MATERIALIZED
VIEW CONCURRENTLY) SPI query the ability to optimize this query.   If we
could do this with an SQL or C function, there would be less concern
about the confusion caused by adding this capability.

Question:  If we are comparing based on some primary key, why do we need
this to optimize?  Certainly the primary key index will be used, no?

2.  Agregates are already non-deterministic, so some feel that adding
this feature doesn't improve much.  The counter-argument is that without
the binary row comparison ability, rows could be returned that could
_never_ have been produced by the base data, which is more of a user
surprise than non-deterministic rows.

3.  Our type casting and operators are already complex, and adding
another set of operators only compounds that.

4.  There are legitimate cases where tool makers might want the ability
to compare rows binarily, e.g. for replication, and adding these
operators would help with that.

I think we need to see a patch from Kevin that shows all the row
comparisons documented and we can see the impact of the user-visibile
part.

One interesting approach would be to only allow the operator to be
called from SPI queries.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: FW: REVIEW: Allow formatting in log_line_prefix
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: [PATCH] pg_upgrade: Split off pg_fatal() from pg_log()