Обсуждение: [PERFORM] max partitions behind a view?

Поиск
Список
Период
Сортировка

[PERFORM] max partitions behind a view?

От
Rick Otten
Дата:
I use materialized views to cache results from a foreign data wrapper to a high latency, fairly large (cloud) Hadoop instance.  In order to boost refresh times I split the FDW and materialized views up into partitions.  

Note:  I can't use pg_partman or native partitioning because those don't really work with this architecture - they are designed for "real" tables.  I can't really use citus because it isn't FDW/matview aware at this time either.

I then join the various materialized views together with a regular view made up of a bunch of 'union all' statements.

I have a set of functions which automatically create the new partitions and then replace the top level view to add them in on the fly.  At this time I probably have about 60 partitions.

With that approach I can refresh individual chunks of data, or I can refresh several chunks in parallel.  Generally this has been working pretty well.  One side effect is that because this is not a real partition, the planner does have to check each partition whenever I run a query to see if it has the data I need.  With appropriate indexes, this is ok, checking the partitions that don't have the data is very quick.  It does make for some long explain outputs though.

The challenge is that because of an exponential rate of data growth, I might have to significantly increase the number of partitions I'm working with - to several hundred at a minimum and potentially more than 1000...

This leads me to the question how many 'union all' statements can I have in one view?   Should I create a hierarchy of views to gradually roll the data up instead of putting them all in one top-level view?

Re: [PERFORM] max partitions behind a view?

От
Tom Lane
Дата:
Rick Otten <rottenwindfish@gmail.com> writes:
> The challenge is that because of an exponential rate of data growth, I
> might have to significantly increase the number of partitions I'm working
> with - to several hundred at a minimum and potentially more than 1000...

> This leads me to the question how many 'union all' statements can I have in
> one view?

I don't think there's a hard limit short of INT32_MAX or so, but I'd be
worried about whether there are any O(N^2) algorithms that would start
to be noticeable at the O(1000) level.

> Should I create a hierarchy of views to gradually roll the data
> up instead of putting them all in one top-level view?

That would likely make things worse not better; the planner would flatten
them anyway and would expend extra cycles doing so.  You could perhaps
stop the flattening with optimization fences (OFFSET 0) but I really doubt
you want the side-effects of that.
        regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance