Re: Incrementally refreshed materialized view

From: hariprasath nallasamy
Subject: Re: Incrementally refreshed materialized view
Msg-id: CAGgejVw9D8PMqd6qifDLmps6_JPV+9+Zm9WN14bAfmUPk==n5A@mail.gmail.com
In reply to: Re: Incrementally refreshed materialized view (Kevin Grittner <kgrittn@gmail.com>)
Responses: Re: Incrementally refreshed materialized view (Nguyễn Trần Quốc Vinh <ntquocvinh@gmail.com>)
List: pgsql-general
We also tried to implement incremental refresh of materialized views, and our solution doesn't cover all use cases.

Players:
1) WAL
2) Logical decoding
3) Replication slots
4) Custom background worker

Two kinds of approaches:
1. Deferred refresh (Oracle-style: a log table is created for each base table, holding its PK and the old and new values of the aggregated columns)
      a) A log table has to be created for each base table; it keeps track of delta changes.
      b) A UDF is called to refresh the view incrementally. It runs the original materialized view query with the tracked delta PKs in its WHERE clause, so only rows that were modified or inserted are touched.
      c) The log table is filled with changed rows taken from the data delivered by a replication slot, which uses logical decoding to decode the WAL.
      d) Shared memory is used to maintain the relationship between the view and its base tables. In case of a restart, these relationships are pushed to a maintenance table.

2. Real-time refresh (update the view whenever we get change-sets related to its base tables)
      a) Delta data from the replication slot is applied to the view by checking the relationship between the delta data and the view definition. Shared memory and the maintenance table are used here as well.
      b) Work is complete only for materialized views over a single table.
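
For the single-table case, applying a change-set to the view can be pictured roughly as below. This is only a sketch under assumptions: the change record layout (an "op" field plus "old" and "new" row dicts) is invented for the example and is not the output format of any particular logical decoding plugin, and the view is modelled as a Python dict keyed by primary key with a hypothetical predicate standing in for its WHERE clause.

```python
def view_predicate(row):
    # Hypothetical view definition: SELECT * FROM base WHERE amount >= 100
    return row["amount"] >= 100


def apply_change(view, change):
    """Apply one decoded change to `view`, a dict keyed by primary key."""
    op = change["op"]
    if op in ("update", "delete"):
        # Remove the old image of the row, if the view contained it.
        view.pop(change["old"]["id"], None)
    if op in ("insert", "update"):
        new = change["new"]
        # Add the new image only if it satisfies the view definition.
        if view_predicate(new):
            view[new["id"]] = dict(new)
    return view
```

An update is treated as a delete of the old row image followed by an insert of the new one, which also handles rows that move into or out of the view when the filtered column changes.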

Main disadvantages:
1) Data inconsistency on master failure, because the slave doesn't have the replication slot as of now. The 2ndQuadrant folks are trying to create slots on the slave using the concept of failover slots, but that doesn't ship with PG :(.
2) Among aggregates, only sum, count and avg are implemented (single table); for other aggregates a full refresh is used.
3) The right join implementation requires additional queries to be run on top of the MVs.
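
On point 2), sum, count and avg are easy to maintain from deltas because each one can be updated from the old and new values of the changed row alone. A rough sketch of that bookkeeping follows; the class and method names are hypothetical and this is not the code from our extension.

```python
class SumCountAvg:
    """Per-group running SUM and COUNT; AVG is derived as SUM / COUNT."""

    def __init__(self):
        self.groups = {}  # group key -> [running_sum, row_count]

    def apply(self, group, old=None, new=None):
        """Insert: new only. Delete: old only. Update within a group: both."""
        state = self.groups.setdefault(group, [0, 0])
        if old is not None:
            state[0] -= old
            state[1] -= 1
        if new is not None:
            state[0] += new
            state[1] += 1
        if state[1] == 0:
            # No rows left in this group, so it disappears from the view.
            del self.groups[group]

    def result(self, group):
        if group not in self.groups:
            return None
        total, count = self.groups[group]
        return {"sum": total, "count": count, "avg": total / count}
```

An aggregate such as min or max does not fit this pattern: once the current minimum is deleted, the next value cannot be derived from the delta alone, which is one reason to fall back to a full refresh there.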

So we have a long way to go and don't know whether this is the right path.

Only the deferred refresh was pushed to GitHub.

I wrote a post about it on Medium.

