Re: Implementing Incremental View Maintenance

Поиск
Список
Период
Сортировка
От Yugo Nagata
Тема Re: Implementing Incremental View Maintenance
Дата
Msg-id 20190930223414.62e45861094e8ee68c2d0cbb@sraoss.co.jp
обсуждение исходный текст
Ответ на Re: Implementing Incremental View Maintenance  (Tatsuo Ishii <ishii@sraoss.co.jp>)
Ответы Re: Implementing Incremental View Maintenance  (Tatsuo Ishii <ishii@sraoss.co.jp>)
Список pgsql-hackers
Hi,

Attached is the latest patch for supporting self-join views. This also
including the following fix mentioned by Tatsuo Ishii.

> > On 2019-Aug-06, Tatsuo Ishii wrote:
> > 
> >> It's not mentioned below but some bugs including seg fault when
> >> --enable-casser is enabled was also fixed in this patch.
> >> 
> >> BTW, I found a bug with min/max support in this patch and I believe
> >> Yugo is working on it. Details:
> >> https://github.com/sraoss/pgsql-ivm/issues/20

This patch allows to support self-join views, simultaneous updates of more
than one base tables, and also multiple updates of the same base table. 
I first tried to support just self-join, but I found that this is essentially
same as to support simultaneous table updates, so I decided to support them in 
the same commit. I think this will be a base for implementing
Deferred-maintenance in future.



In the new implementation, AFTER triggers are used to collecting tuplestores 
containing transition table contents. When multiple tables are changed, 
multiple AFTER triggers are invoked, then the final AFTER trigger performs 
actual update of the matview. In addition AFTER trigger, also BEFORE trigger
is used to handle global information for view maintenance. 

For example, suppose that we have a view V joining table R,S, and new tuples are
inserted to each table, dR,dS, and dT respectively.

 V = R*S*T
 R_new = R + dR
 S_new = S + dS
 T_new = T + dT

In this situation, we can calculate the new view state as bellow.

V_new 
= R_new * S_new * T_new
= (R + dR) * (S + dS) * (T + dT)
= R*S*T + dR*(S + dS)*(T + dT) + R*dS*(T + dT) + R*S*dT
= V + dR*(S + dS)*(T + dT) + R*dS*(T + dT) + R*S*dT
= V + (dR *S_new*T_new) + (R*dS*T_new) + (R*S*dT)

To calculate view deltas, we need both pre-state (R,S, and T) and post-state 
(R_new, S_new, and T_new) of base tables. 

Post-update  states are available in AFTER trigger, and we calculate pre-update
states by filtering inserted tuples using cmin/xmin system columns, and appendding
deleted  tuples which are contained in a old transition table.

In the original core implementation, tuplestores of transition tables were 
freed for each query depth. However, we want to prolong their life plan because
we have to preserve these for a whole query assuming some base tables are changed
in other trigger functions, so I added a hack to trigger.c.

Regression tests are also added for self join view, multiple change on the same
table, simultaneous two table changes, and foreign reference constrains.

Here are behavior examples:

1. Table definition
- t: for self-join
- r,s: for 2-ways join

CREATE TABLE r (i int, v int);
CREATE TABLE
CREATE TABLE s (i int, v int);
CREATE TABLE
CREATE TABLE t (i int, v int);
CREATE TABLE

2. Initial data

INSERT INTO r VALUES (1, 10), (2, 20), (3, 30);
INSERT 0 3
INSERT INTO s VALUES (1, 100), (2, 200), (3, 300);
INSERT 0 3
INSERT INTO t VALUES (1, 10), (2, 20), (3, 30);
INSERT 0 3

3. View definition

3.1. self-join(mv_self, v_slef)

CREATE INCREMENTAL MATERIALIZED VIEW mv_self(v1, v2) AS
 SELECT t1.v, t2.v FROM t t1 JOIN t t2 ON t1.i = t2.i;
SELECT 3
CREATE VIEW v_self(v1, v2) AS
 SELECT t1.v, t2.v FROM t t1 JOIN t t2 ON t1.i = t2.i;
CREATE VIEW

3.2. 2-ways join (mv, v)

CREATE INCREMENTAL MATERIALIZED VIEW mv(v1, v2) AS
 SELECT r.v, s.v FROM r JOIN s USING(i);
SELECT 3
CREATE VIEW v(v1, v2) AS
 SELECT r.v, s.v FROM r JOIN s USING(i);
CREATE VIEW

3.3 Initial contents

SELECT * FROM mv_self ORDER BY v1;
 v1 | v2 
----+----
 10 | 10
 20 | 20
 30 | 30
(3 rows)

SELECT * FROM mv ORDER BY v1;
 v1 | v2  
----+-----
 10 | 100
 20 | 200
 30 | 300
(3 rows)

4. Update a base table for the self-join view

INSERT INTO t VALUES (4,40);
INSERT 0 1
DELETE FROM t WHERE i = 1;
DELETE 1
UPDATE t SET v = v*10 WHERE i=2;
UPDATE 1

4.1. Results
- Comparison with the normal view

SELECT * FROM mv_self ORDER BY v1;
 v1  | v2  
-----+-----
  30 |  30
  40 |  40
 200 | 200
(3 rows)

SELECT * FROM v_self ORDER BY v1;
 v1  | v2  
-----+-----
  30 |  30
  40 |  40
 200 | 200
(3 rows)

5. pdate a base table for the 2-way join view

WITH
 ins_r AS (INSERT INTO r VALUES (1,11) RETURNING 1),
 ins_r2 AS (INSERT INTO r VALUES (3,33) RETURNING 1),
 ins_s AS (INSERT INTO s VALUES (2,222) RETURNING 1),
 upd_r AS (UPDATE r SET v = v + 1000 WHERE i = 2 RETURNING 1),
 dlt_s AS (DELETE FROM s WHERE i = 3 RETURNING 1)
SELECT NULL;
 ?column? 
----------
 
(1 row)

5.1. Results
- Comparison with the normal view

SELECT * FROM mv ORDER BY v1;
  v1  | v2  
------+-----
   10 | 100
   11 | 100
 1020 | 200
 1020 | 222
(4 rows)

SELECT * FROM v ORDER BY v1;
  v1  | v2  
------+-----
   10 | 100
   11 | 100
 1020 | 200
 1020 | 222
(4 rows)

========

Best Regards,
Yugo Nagata

-- 
Yugo Nagata <nagata@sraoss.co.jp>

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Online checksums patch - once again
Следующее
От: Tom Lane
Дата:
Сообщение: Re: SSL tests failing for channel_binding with OpenSSL <= 1.0.1