Re: record identical operator

Поиск

Список

Период

Сортировка

От	Kevin Grittner
Тема	Re: record identical operator
Дата	20 сентября 2013 г. 15:01:27
Msg-id	1379689280.4270.YahooMailNeo@web162906.mail.bf1.yahoo.com обсуждение исходный текст
Ответ на	Re: record identical operator (Stephen Frost <sfrost@snowman.net>)
Ответы	Re: record identical operator
Список	pgsql-hackers

Дерево обсуждения

Stephen Frost <sfrost@snowman.net> wrote:
> * Kevin Grittner (kgrittn@ymail.com) wrote:
>> The result of this is that with the patch, an incremental refresh
>> of a matview will always match what the query returned at the time
>> it was run (there is no *correctness* problem) but if someone uses
>> a query with non-deterministic results they may have a lot of
>> activity on a concurrent refresh even if there was no change to the
>> underlying data -- so you could have a *performance* penalty in
>> cases where the query returns something different, compared to
>> leaving the old "equal but not the same" results.
>
> You mean 'at the time of the incremental refresh', right?  Yet that may
> or may not match running that query directly by an end-user as the plan
> that a user might get for the entire query could be different than what
> is run for an incremental update, or due to statistics updates, etc.

I'm confused.  The refresh *does* run the query.  Sure, if the
query is run again it could return different results.  I'm missing
the point here.

>>> Consider a GROUP BY with a citext column as one of the key
>>> fields.  You're going to get whatever value the aggregate
>>> happened to come across first when building the HashAgg.
>>> Having the binary equality operator doesn't help that and seems
>>> like it could even result in change storms happening due to a
>>> different plan when the actual data didn't change.
>>
>> Yup.  A person who wants to GROUP BY a citext value for a matview
>> might want to fold it to a consistent capitalization to avoid that
>> issue.
>
> I'm trying to figure out why that's a perfectly acceptable solution for
> users running views with GROUP BYs, but apparently it isn't sufficient
> for mat views?  In other words, you would suggest telling users "sorry,
> you can't rely on the value returned by a GROUP BY on that citext column
> using a normal view, but we're going to try and do better for mat
> views".

Again, I'm lost.  If they don't do something to make the result
deterministic, it could be different on each run of the VIEW, and
on each REFRESH of the matview.  I don't see why that is an
argument for trying to suppress the effort needed make the matview
match the latest run of the query.

> I don't intend the above to imply that we should never update values in
> mat views when we can do so in a reliable way to ensure that the value
> matches what a view would return.  This matches our notion of UPDATE,
> where we will still UPDATE a value even if the old value and the new
> value are equal according to the type's equality operator, when the
> conditional for the UPDATE is using a reliable type (eg: integer).

Well, we provide a trigger function to suppress the UPDATE
operation if the old and new values are identical -- in terms of
what is stored.  We do not attempt to use the default btree equals
operator to suppress updates to different values in some
circumstances.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: record identical operator