Re: record identical operator

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: record identical operator
Дата
Msg-id 20130923203241.GK2706@tamriel.snowman.net
обсуждение исходный текст
Ответ на Re: record identical operator  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
* Robert Haas (robertmhaas@gmail.com) wrote:
> I don't know why there shouldn't be a question about that.

Because anything else would be an internal optimization which must be
proven to be correct, imv, also..

> Suppose
> that the MAX() aggregate is in use.  If 4 or 5 or 6 is updated so as
> to change the maximum of the three, then matrow 1 needs updating.  But
> if the maximum remains the same, then it doesn't.  The right way to
> decide whether it needs updating is to re-aggregate those three rows
> and then see whether you get the same (read: binary identical) out of
> the aggregate that you got the last time you ran it.

You could argue the same about PG doing that for any row update- check
if anything is actually *binary* different and, if not, then don't
update it.  Of course, there's questions about if that's "right" and
what about triggers, etc..

> Also, suppose the same statement updates row 4, row 5, and row 6.
> Instead of updating the materialized view three times, you do it just
> once at end-of-statmement, like an AFTER STATEMENT trigger that
> somehow knows which rows were updated.

Sorry if I wasn't clear, but that's exactly what I was trying to
describe regarding how it should work.  I was NOT intending to suggest
that each update immediately update the matview.  It's just that we
keep *track* of what was updated and then, at some convenient point,
actually run the process to update the matview rows (maybe in an AFTER
statement, maybe 5 minutes from now).

> In this case even something
> like AVG() could produce the same result as it did before the update.

Sure it could.

> And you'd surely want to avoid updating the matview if the new value
> was the same as what was already stored in the matview (but not if it
> was equal but not the same).

I don't see why updating a row that was built with AVG() should be
avoided over a row that was built with MAX(), unless you're suggesting
there's a different set of rows involved in the two or there's some
additional optimization around figuring out if these particular changes
*should* actually change the result.  That's an analysis which could
still happen and wouldn't need to rely on any binary equality test, and
it'd need to have a whole lot more smarts than this approach anyway or
you'll still end up running a query against all of the rows involved in
the AVG() to then only decide at the last moment to not update the row,
which doesn't strike me as a great optimization.  Perhaps that's why we
didn't implement it for PG itself?
Thanks,
    Stephen

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Tomas Vondra"
Дата:
Сообщение: Re: Improving avg performance for numeric
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: record identical operator