Re: Implementing Incremental View Maintenance

Поиск
Список
Период
Сортировка
От Tatsuo Ishii
Тема Re: Implementing Incremental View Maintenance
Дата
Msg-id 20191225.081441.1142605825380811731.t-ishii@sraoss.co.jp
обсуждение исходный текст
Ответ на RE: Implementing Incremental View Maintenance  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Ответы RE: Implementing Incremental View Maintenance  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Список pgsql-hackers
> Materialized view reminds me of the use in a data warehouse.  Oracle handles the top in its Database Data Warehousing
Guide,and Microsoft has just started to offer the materialized view feature in its Azure Synapse Analytics (formerly
SQLData Warehouse).  AWS also has previewed Redshift's materialized view feature in re:Invent 2019.  Are you targeting
thedata warehouse (analytics) workload?
 
> 
> IIUC, to put (over) simply, the data warehouse has two kind of tables:
> 
> * Facts (transaction data): e.g. sales, user activity
> Large amount.  INSERT only on a regular basis (ETL/ELT) or continuously (streaming)
> 
> * Dimensions (master/reference data): e.g. product, customer, time, country
> Small amount.  Infrequently INSERTed or UPDATEd.
> 
> 
> The proposed trigger-based approach does not seem to be suitable for the facts, because the trigger overhead imposed
ondata loading may offset or exceed the time saved by incrementally refreshing the materialized views.
 

I think that depends on use case of the DWH. If the freshness of
materialized view tables is important for a user, then the cost of the
trigger overhead may be acceptable for the user.

> Then, does the proposed feature fit the dimension tables?  If the materialized view is only based on the dimension
data,then the full REFRESH of the materialized view wouldn't take so long.  The typical materialized view should join
thefact and dimension tables.  Then, the fact table will have to have the triggers, causing the data loading slowdown.
 
> 
> I'm saying this because I'm concerned about the trigger based overhead.  As you know, Oracle uses materialized view
logsto save changes and incrementally apply them later to the materialized views (REFRESH ON STATEMENT materialized
viewsdoesn't require the materialized view log, so it might use triggers.)  Does any commercial grade database
implementmaterialized view using triggers?  I couldn't find relevant information regarding Azure Synapse and Redshift.
 

I heard that REFRESH ON STATEMENT of Oracle has been added after ON
COMMIT materialized view. So I suspect Oracle realizes that there are
needs/use case for ON STATEMENT, but I am not sure.

> If our only handy option is a trigger, can we minimize the overhead by doing the view maintenance at transaction
commit?

I am not sure it's worth the trouble. If it involves some form of
logging, then I think it should be used for deferred IVM first because
it has more use case than on commit IVM.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: Re: mdclose() does not cope w/ FileClose() failure
Следующее
От: Amit Langote
Дата:
Сообщение: Re: unsupportable composite type partition keys