Re: record identical operator

From: Merlin Moncure
Subject: Re: record identical operator
Msg-id: CAHyXU0x3fijRVDJhNf9+9o5+BLZu2Orqas9dY+Gq1z+SESjz_g@mail.gmail.com
In reply to: Re: record identical operator  (Stephen Frost <sfrost@snowman.net>)
List: pgsql-hackers
On Tue, Sep 24, 2013 at 2:22 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Now I admit that's an arguable point.  We could certainly define an
>> intermediate notion of equality that is more equal than whatever =
>> does, but not as equal as exact binary equality.
>
> I suggested it up-thread and don't recall seeing a response, so here it
> is again- passing the data through the binary-out function for the type
> and comparing *that* would allow us to change the internal binary
> representation of data types and would be something which we could at
> least explain to users as to why X isn't the same as Y according to this
> binary operator.
>
>> I think the conservative (and therefore correct) approach is to decide
>> that we're always going to update if we detect any difference at all.
>
> And there may be users who are surprised that a refresh changed parts of
> the table that have nothing to do with what was changed in the
> underlying relation, because a different plan was used and the result
> ended up being binary-different.  It's easy to explain to users why that
> would be when we're doing a wholesale replace but I don't think it'll be
> nearly as clear why that happened when we're not replacing the whole
> table and why REFRESH can basically end up changing anything (but
> doesn't consistently).  If we're paying attention to the records changed
> and only updating the matview's records when they're involved, that
> becomes pretty clear.  What's happening here feels very much like
> unintended consequences.

FWIW you make some interesting points (I did a triple take on the plan
dependent changes) but I'm 100% ok with the proposed behavior.
Matviews satisfy 'truth' as *defined by the underlying query only*.
This is key: there may be N candidate 'truths' for that query: it's
not IMNSHO reasonable to expect the post-refresh truth to be
approximately based on the pre-refresh truth.  Even if the
implementation happened to do what you're asking for, it would only be
demonstrating undefined but superficially useful behavior...a good
analogy would be the old scan behavior where an unordered scan would
come up in 'last update order'.  That (again, superficially useful)
behavior was undefined and we reserved the right to change it.  And we
did.  Unnecessarily defined behaviors defeat future performance
optimizations.

So Kevin's patch AIUI defines a hitherto non-user accessible (except
in the very special case of row-wise comparison) mechanic to try and
cut down the number of rows that *must* be refreshed.  It may or may
not do a good job at that on a situational basis -- if it was always
better we'd probably be overriding the default behavior.  I don't
think it's astonishing at all for a matview to pseudo-randomly adjust
case over a citext column; that's just part of the deal with
equality-ambiguous types.  As long as the matview doesn't expose a dataset that
was impossible to have been generated by the underlying query, I'm
good.
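
To make the equal-versus-identical distinction concrete, here is a minimal
Python sketch. It is only an illustration under stated assumptions, not the
patch's implementation: CIText is a hypothetical stand-in for an
equality-ambiguous type like citext, and identical() and rows_to_refresh()
are made-up helper names. It shows that a stricter "identical" test can flag
a row for refresh that the ordinary equality test considers unchanged.

```python
# Hypothetical stand-in for an equality-ambiguous type such as citext:
# values compare equal case-insensitively while their stored text can differ.
class CIText:
    def __init__(self, text: str):
        self.text = text

    def __eq__(self, other):
        return isinstance(other, CIText) and self.text.lower() == other.text.lower()

    def __hash__(self):
        return hash(self.text.lower())


def identical(a: CIText, b: CIText) -> bool:
    """Stricter than ==: the stored representations must match exactly."""
    return a.text == b.text


def rows_to_refresh(old: dict, new: dict, same) -> list:
    """Return keys of `new` that are missing from `old` or whose value fails `same`."""
    return [k for k in new if k not in old or not same(old[k], new[k])]


# Assumed data: the query result changed only the case of one value.
old = {1: CIText("Alice"), 2: CIText("bob")}
new = {1: CIText("ALICE"), 2: CIText("bob")}

by_equality = rows_to_refresh(old, new, lambda a, b: a == b)  # sees no change
by_identity = rows_to_refresh(old, new, identical)            # flags row 1
```

Either result leaves the stored data consistent with what the query could
have produced; the two only differ in which rows get rewritten.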

merlin


