Обсуждение: CREATE OR REPLACE MATERIALIZED VIEW

Поиск
Список
Период
Сортировка

CREATE OR REPLACE MATERIALIZED VIEW

От
Erik Wienhold
Дата:
I like to add CREATE OR REPLACE MATERIALIZED VIEW with the attached
patches.

Patch 0001 adds CREATE OR REPLACE MATERIALIZED VIEW similar to CREATE OR
REPLACE VIEW.  It also includes regression tests and changes to docs.

Patch 0002 deprecates CREATE MATERIALIZED VIEW IF NOT EXISTS because it
no longer seems necessary with patch 0001.  Tom Lane commented[1] about
the general dislike of IF NOT EXISTS, to which I agree, but maybe this
was meant only in response to adding new commands.  Anyway, my idea is
to deprecate that usage in PG18 and eventually remove it in PG19, if
there's consensus for it.  We can drop that clause without violating any
standard because matviews are a Postgres extension.  I'm not married to
the idea, just want to put it on the table for discussion.

Motivation
----------

At $JOB we use materialized views for caching a couple of expensive
views.  But every now and then those views have to be changed, e.g., new
logic, new columns, etc.  The matviews have to be dropped and re-created
to include new columns.  (Just changing the underlying view logic
without adding new columns is trivial because the matviews are just thin
wrappers that just have to be refreshed.)

We also have several views that depend on those matviews.  The views
must also be dropped in order to re-create the matviews.  We've already
automated this with two procedures that stash and re-create dependent
view definitions.

Native support for replacing matviews would simplify our setup and it
would make CREATE MATERIALIZED VIEW more complete when compared to
CREATE VIEW.

I searched the lists for previous discussions on this topic but couldn't
find any.  So, I don't know if this was ever tried, but rejected for
some reason.  I've found slides[2] from 2013 (when matviews landed in
9.3) which have OR REPLACE on the roadmap:

> Materialised Views roadmap
>
> * CREATE **OR REPLACE** MATERIALIZED VIEW
>   * Just an oversight that it wasn't added
>  [...]

Replacing Matviews
------------------

With patch 0001, a matview can be replaced without having to drop it and
its dependent objects.  In our use case it is no longer necessary to
define the actual query in a separate view.  Replacing a matview works
analogous to CREATE OR REPLACE VIEW:

* the new query may change SELECT list expressions of existing columns
* new columns can be added to the end of the SELECT list
* existing columns cannot be renamed
* the data type of existing columns cannot be changed

In addition to that, CREATE OR REPLACE MATERIALIZED VIEW also replaces
access method, tablespace, and storage parameters if specified.  The
clause WITH [NO] DATA works as expected: it either populates the matview
or leaves it in an unscannable state.

It is an error to specify both OR REPLACE and IF NOT EXISTS.

Example
-------

postgres=# CREATE MATERIALIZED VIEW test AS SELECT 1 AS a;
SELECT 1
postgres=# SELECT * FROM test;
 a
---
 1
(1 row)

postgres=# CREATE OR REPLACE MATERIALIZED VIEW test AS SELECT 2 AS a, 3 AS b;
CREATE MATERIALIZED VIEW
postgres=# SELECT * FROM test;
 a | b
---+---
 2 | 3
(1 row)

Implementation Details
----------------------

Patch 0001 extends create_ctas_internal in order to adapt an existing
matview to the new tuple descriptor, access method, tablespace, and
storage parameters.  This logic is mostly based on DefineViewRelation.
This also reuses checkViewColumns, but adds argument is_matview in order
to tell if we want error messages for a matview (true) or view (false).
I'm not sure if that flag is the correct way to do that, or if I should
just create a separate function just for matviews with the same logic.
Do we even need to distinguish between view and matview in those error
messages?

The patch also adds tab completion in psql for CREATE OR REPLACE
MATERIALIZED VIEW.

[1] https://www.postgresql.org/message-id/226806.1693430777%40sss.pgh.pa.us
[2] https://wiki.postgresql.org/images/a/ad/Materialised_views_now_and_the_future-pgconfeu_2013.pdf#page=23

-- 
Erik

Вложения

Re: CREATE OR REPLACE MATERIALIZED VIEW

От
Aleksander Alekseev
Дата:
Hi,

> Patch 0002 deprecates CREATE MATERIALIZED VIEW IF NOT EXISTS because it
> no longer seems necessary with patch 0001.  Tom Lane commented[1] about
> the general dislike of IF NOT EXISTS, to which I agree, but maybe this
> was meant only in response to adding new commands.  Anyway, my idea is
> to deprecate that usage in PG18 and eventually remove it in PG19, if
> there's consensus for it.  We can drop that clause without violating any
> standard because matviews are a Postgres extension.  I'm not married to
> the idea, just want to put it on the table for discussion.

I can imagine how this may impact many applications and upset many
software developers worldwide. Was there even a precedent (in the
recent decade or so) when PostgreSQL broke the SQL syntax?

To clarify, I'm not opposed to this idea. If we are fine with breaking
backward compatibility on the SQL level, this would allow dropping the
support of inherited tables some day, a feature that in my humble
opinion shouldn't exist (I realize this is another and very debatable
question though). I just don't think this is something we ever do in
this project. But I admit that this information may be incorrect
and/or outdated.

-- 
Best regards,
Aleksander Alekseev



Re: CREATE OR REPLACE MATERIALIZED VIEW

От
Daniel Gustafsson
Дата:
> On 2 Jul 2024, at 03:22, Erik Wienhold <ewie@ewie.name> wrote:

> Patch 0002 deprecates CREATE MATERIALIZED VIEW IF NOT EXISTS because it
> no longer seems necessary with patch 0001.  Tom Lane commented[1] about
> the general dislike of IF NOT EXISTS, to which I agree, but maybe this
> was meant only in response to adding new commands.  Anyway, my idea is
> to deprecate that usage in PG18 and eventually remove it in PG19, if
> there's consensus for it.

Considering the runway we typically give for deprecations, that seems like a
fairly short timeframe for a SQL level command which isn't unlikely to exist
in application code.

--
Daniel Gustafsson




Re: CREATE OR REPLACE MATERIALIZED VIEW

От
Erik Wienhold
Дата:
I wrote:
> Patch 0002 deprecates CREATE MATERIALIZED VIEW IF NOT EXISTS because it
> no longer seems necessary with patch 0001.  Tom Lane commented[1] about
> the general dislike of IF NOT EXISTS, to which I agree, but maybe this
> was meant only in response to adding new commands.

One could also argue that since matviews are a hybrid of tables and
views, that CREATE MATERIALIZED VIEW should accept both OR REPLACE (as
in CREATE VIEW) and IF NOT EXISTS (as in CREATE TABLE).  But not in the
same invocation of course.

On 2024-07-02 12:46 +0200, Aleksander Alekseev wrote:
> > Anyway, my idea is to deprecate that usage in PG18 and eventually
> > remove it in PG19, if there's consensus for it.  We can drop that
> > clause without violating any standard because matviews are a
> > Postgres extension.  I'm not married to the idea, just want to put
> > it on the table for discussion.
> 
> I can imagine how this may impact many applications and upset many
> software developers worldwide. Was there even a precedent (in the
> recent decade or so) when PostgreSQL broke the SQL syntax?

A quick spelunking through the changelog with

    git log --grep deprecat -i --since '10 years ago'

turned up two commits:

578b229718 "Remove WITH OIDS support, change oid catalog column visibility."
e8d016d819 "Remove deprecated COMMENT ON RULE syntax"

Both were committed more than 10 years after deprecating the respective
feature.  My proposed one-year window seems a bit harsh in comparison.

On 2024-07-02 14:27 +0200, Daniel Gustafsson wrote:
> Considering the runway we typically give for deprecations, that seems like a
> fairly short timeframe for a SQL level command which isn't unlikely to exist
> in application code.

Is there some general agreed upon timeframe, or is decided on a
case-by-case basis?  I can imagine waiting at least until the last
release without the deprecation reaches EOL.  That would be 5 years with
the current versioning policy.

-- 
Erik



Re: CREATE OR REPLACE MATERIALIZED VIEW

От
Daniel Gustafsson
Дата:
> On 2 Jul 2024, at 15:58, Erik Wienhold <ewie@ewie.name> wrote:
> On 2024-07-02 14:27 +0200, Daniel Gustafsson wrote:
>> Considering the runway we typically give for deprecations, that seems like a
>> fairly short timeframe for a SQL level command which isn't unlikely to exist
>> in application code.
>
> Is there some general agreed upon timeframe, or is decided on a
> case-by-case basis?  I can imagine waiting at least until the last
> release without the deprecation reaches EOL.  That would be 5 years with
> the current versioning policy.

AFAIK it's all decided on a case-by-case basis depending on impact.  There are
for example the removals you listed, and there are functions in libpq which
were deprecated in the postgres 6.x days which are still around to avoid
breaking ABI.

--
Daniel Gustafsson




Re: CREATE OR REPLACE MATERIALIZED VIEW

От
Said Assemlal
Дата:

Hi,


+1 for this feature.

Replacing Matviews
------------------

With patch 0001, a matview can be replaced without having to drop it and
its dependent objects.  In our use case it is no longer necessary to
define the actual query in a separate view.  Replacing a matview works
analogous to CREATE OR REPLACE VIEW:

* the new query may change SELECT list expressions of existing columns
* new columns can be added to the end of the SELECT list
* existing columns cannot be renamed
* the data type of existing columns cannot be changed

In addition to that, CREATE OR REPLACE MATERIALIZED VIEW also replaces
access method, tablespace, and storage parameters if specified.  The
clause WITH [NO] DATA works as expected: it either populates the matview
or leaves it in an unscannable state.

It is an error to specify both OR REPLACE and IF NOT EXISTS.

I noticed replacing the materialized view is blocking all reads. Is that expected ? Even if there is a unique index ?


Best,
Saïd

Re: CREATE OR REPLACE MATERIALIZED VIEW

От
Erik Wienhold
Дата:
On 2024-07-04 22:18 +0200, Said Assemlal wrote:
> +1 for this feature.

Thanks!

> I noticed replacing the materialized view is blocking all reads. Is that
> expected ? Even if there is a unique index ?

That is expected because AccessExclusiveLock is acquired on the existing
matview.  This is also the case for CREATE OR REPLACE VIEW.

My initial idea, while writing the patch, was that one could replace the
matview without populating it and then run the concurrent refresh, like
this:

    CREATE OR REPLACE MATERIALIZED VIEW foo AS ... WITH NO DATA;
    REFRESH MATERIALIZED VIEW CONCURRENTLY foo;

But that won't work because concurrent refresh requires an already
populated matview.

Right now the patch either populates the replaced matview or leaves it
in an unscannable state.  Technically, it's also possible to skip the
refresh and leave the old data in place, perhaps by specifying
WITH *OLD* DATA.  New columns would just be null.  Of course you can't
tell if you got stale data without knowing how the matview was replaced.
Thoughts?

-- 
Erik



Re: CREATE OR REPLACE MATERIALIZED VIEW

От
Said Assemlal
Дата:
> That is expected because AccessExclusiveLock is acquired on the existing
> matview.  This is also the case for CREATE OR REPLACE VIEW.

Right, had this case many times.

>
> My initial idea, while writing the patch, was that one could replace the
> matview without populating it and then run the concurrent refresh, like
> this:
>
>      CREATE OR REPLACE MATERIALIZED VIEW foo AS ... WITH NO DATA;
>      REFRESH MATERIALIZED VIEW CONCURRENTLY foo;
>
> But that won't work because concurrent refresh requires an already
> populated matview.
>
> Right now the patch either populates the replaced matview or leaves it
> in an unscannable state.  Technically, it's also possible to skip the
> refresh and leave the old data in place, perhaps by specifying
> WITH *OLD* DATA.  New columns would just be null.  Of course you can't
> tell if you got stale data without knowing how the matview was replaced.
> Thoughts?


I believe the expectation is to get materialized views updated whenever 
it gets replaced so likely to confuse users ?







Re: CREATE OR REPLACE MATERIALIZED VIEW

От
Erik Wienhold
Дата:
On 2024-07-12 16:49 +0200, Said Assemlal wrote:
> > My initial idea, while writing the patch, was that one could replace the
> > matview without populating it and then run the concurrent refresh, like
> > this:
> > 
> >      CREATE OR REPLACE MATERIALIZED VIEW foo AS ... WITH NO DATA;
> >      REFRESH MATERIALIZED VIEW CONCURRENTLY foo;
> > 
> > But that won't work because concurrent refresh requires an already
> > populated matview.
> > 
> > Right now the patch either populates the replaced matview or leaves it
> > in an unscannable state.  Technically, it's also possible to skip the
> > refresh and leave the old data in place, perhaps by specifying
> > WITH *OLD* DATA.  New columns would just be null.  Of course you can't
> > tell if you got stale data without knowing how the matview was replaced.
> > Thoughts?
> 
> I believe the expectation is to get materialized views updated whenever it
> gets replaced so likely to confuse users ?

I agree, that could be confusing -- unless it's well documented.  The
attached 0003 implements WITH OLD DATA and states in the docs that this
is intended to be used before a concurrent refresh.

Patch 0001 now covers all matview cases in psql's tab completion.  I
missed some of them with v1.

-- 
Erik

Вложения