Re: Query Rewrite for Materialized Views (Postgres Extension)
| От | Nico Williams | 
|---|---|
| Тема | Re: Query Rewrite for Materialized Views (Postgres Extension) | 
| Дата | |
| Msg-id | 20180618191355.GC4200@localhost обсуждение исходный текст | 
| Ответ на | Re: Query Rewrite for Materialized Views (Postgres Extension) (Dent John <denty@QQdd.eu>) | 
| Ответы | Re: Query Rewrite for Materialized Views (Postgres Extension) | 
| Список | pgsql-hackers | 
On Mon, Jun 18, 2018 at 07:38:13PM +0100, Dent John wrote:
> I commented to Corey (privately) that, while my rewrite extension has
> gotten me a server that responds quickly to aggregate queries, the
> constant need to refresh the supporting MVs means the system’s load
> average is constant and much higher than before. I’m happy with the
> tradeoff for now, but it’s a huge waste of energy, and I’m sure it
> must thrash my disk.
>
> I’m very interested in what other people think of Corey’s idea.
I've written an alternative materialization extension (entirely as
PlPgSQL) based on PG's internals, but my version has a few big wins that
might help here.  I'm thinking of properly integrating it with PG.  Some
of the features include:
 - you can write triggers that update the materialization
   This is because the materialization is just a regular table in my
   implementation.
 - you can mark a view as needing a refresh (e.g., in a trigger)
 - you can declare a PK, other constraints, and indexes on a
   materialization
   The DMLs used to refresh a view concurrently can take advantage of
   the PK and/or other indexes to go fast.
 - you get a history table which records updates to the materialization
   This is useful for generating incremental updates to external
   systems.
Keeping track of refresh times should help decide whether to use or not
use a materialization in some query, or whether to refresh it first, or
not use it at all.
One of the things I'd eventually like to do is analyze the view query
AST to automatically generate triggers to update materializations or
mark them as needing refreshes.  A first, very very rough sketch of such
an analysis looks like this:
 - if the view query has CTEs
   -> create triggers on all its table sources to mark the
      materialization as needing a refresh
 - else if a table appears more than once as a table source in the view
   query
   -> create triggers on that table that mark the materialization as
      needing a refresh
 - else if a table appears anywhere other than the top-level
   -> create triggers .. mark as needing refresh
 - else if a table is a right-side of a left join
   -> create triggers .. mark as needing refresh
 - else if a table has no PK
   -> create triggers .. mark as needing refresh
 - else if the query has no GROUP BY, or only does a GROUP BY on this
   table and a list of columns prefixed by the table's PK
   -> rewrite the query to have WHERE eq conditions on values for the
      table's PK columns
      analyze this query
      if the result shows this table source as the first table in the
      plan
      -> create triggers on this table to update the materialization
         directly from querying the source view
 - else
   -> create triggers .. mark as needing refresh
Nico
--
		
	В списке pgsql-hackers по дате отправления: