Re: record identical operator

Поиск
Список
Период
Сортировка
От Kevin Grittner
Тема Re: record identical operator
Дата
Msg-id 1379689280.4270.YahooMailNeo@web162906.mail.bf1.yahoo.com
обсуждение исходный текст
Ответ на Re: record identical operator  (Stephen Frost <sfrost@snowman.net>)
Ответы Re: record identical operator  (Stephen Frost <sfrost@snowman.net>)
Список pgsql-hackers
Stephen Frost <sfrost@snowman.net> wrote:
> * Kevin Grittner (kgrittn@ymail.com) wrote:
>> The result of this is that with the patch, an incremental refresh
>> of a matview will always match what the query returned at the time
>> it was run (there is no *correctness* problem) but if someone uses
>> a query with non-deterministic results they may have a lot of
>> activity on a concurrent refresh even if there was no change to the
>> underlying data -- so you could have a *performance* penalty in
>> cases where the query returns something different, compared to
>> leaving the old "equal but not the same" results.
>
> You mean 'at the time of the incremental refresh', right?  Yet that may
> or may not match running that query directly by an end-user as the plan
> that a user might get for the entire query could be different than what
> is run for an incremental update, or due to statistics updates, etc.

I'm confused.  The refresh *does* run the query.  Sure, if the
query is run again it could return different results.  I'm missing
the point here.

>>> Consider a GROUP BY with a citext column as one of the key
>>> fields.  You're going to get whatever value the aggregate
>>> happened to come across first when building the HashAgg.
>>> Having the binary equality operator doesn't help that and seems
>>> like it could even result in change storms happening due to a
>>> different plan when the actual data didn't change.
>>
>> Yup.  A person who wants to GROUP BY a citext value for a matview
>> might want to fold it to a consistent capitalization to avoid that
>> issue.
>
> I'm trying to figure out why that's a perfectly acceptable solution for
> users running views with GROUP BYs, but apparently it isn't sufficient
> for mat views?  In other words, you would suggest telling users "sorry,
> you can't rely on the value returned by a GROUP BY on that citext column
> using a normal view, but we're going to try and do better for mat
> views".

Again, I'm lost.  If they don't do something to make the result
deterministic, it could be different on each run of the VIEW, and
on each REFRESH of the matview.  I don't see why that is an
argument for trying to suppress the effort needed make the matview
match the latest run of the query.

> I don't intend the above to imply that we should never update values in
> mat views when we can do so in a reliable way to ensure that the value
> matches what a view would return.  This matches our notion of UPDATE,
> where we will still UPDATE a value even if the old value and the new
> value are equal according to the type's equality operator, when the
> conditional for the UPDATE is using a reliable type (eg: integer).

Well, we provide a trigger function to suppress the UPDATE
operation if the old and new values are identical -- in terms of
what is stored.  We do not attempt to use the default btree equals
operator to suppress updates to different values in some
circumstances.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: record identical operator
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: record identical operator