Обсуждение: pg_dumpall --clean versus roles and shared dependencies

Поиск
Список
Период
Сортировка

pg_dumpall --clean versus roles and shared dependencies

От
Tom Lane
Дата:
I've been trying to figure out what to do about pg_dumpall's --clean
option in view of our recent changes.  The problem is that pg_dumpall
tries to delete existing users and groups by putting this in its
output script:
DELETE FROM pg_shadow  WHERE usesysid <> (SELECT datdba FROM pg_database                     WHERE datname =
'template0');DELETEFROM pg_group;
 

CVS tip of course will just respond to these withERROR:  cannot delete from a viewHINT:  You need an unconditional ON
DELETEDO INSTEAD rule.
 

So we have two problems: what do we want 8.1 pg_dumpall to do instead,
and what are we going to do about legacy pg_dump scripts that already
contain these commands?

A couple of relevant points:

* Nowhere else do pg_dump and pg_dumpall interpret --clean as a license
for a scorched-earth policy; rather, it means "drop the specific objects
you are going to re-create".  One could argue that these commands are
therefore wrong by design, and what we should emit instead is DROP ROLE
commands for just the individual roles we are going to create.

* In view of the shared-dependencies patch, it is *highly* likely that
some or all of the deletions would fail anyway, due to the users owning
objects or permissions that haven't been deleted (yet).  We could reduce
the risk of this by emitting DROP DATABASE commands before the DROP ROLE
commands, but of course this doesn't fix things if there are additional
databases in the target installation.

I am strongly tempted to propose that --clean is wrongheaded when it
comes to roles, and that pg_dumpall should just always emit all role
information in the styleCREATE ROLE foo;ALTER ROLE foo WITH ... options ... ;
which will have the effect of ensuring that the role exists with all the
desired settings whether it pre-existed or not.

Role membership data is a different story.  Part of the effect ofDELETE FROM pg_group;
was to eliminate group membership data as well as the groups themselves.
As of CVS tip, I have the code doing this instead:DELETE FROM pg_auth_members;
followed by GRANT commands to restore membership links.  I do not like
this solution though.  In the first place, it is still following a
scorched-earth policy, which will completely mess up any pre-existing
groups in the destination installation, even (or especially) if they
are unrelated to what the dump script is loading.  In the second place,
this is certainly failing to learn from experience: we should not have
the output scripts presuming such familiarity with system catalogs of
future Postgres releases.

One possibility is to invent a "REVOKE role FROM *" kind of command
and have --clean mode issue that for each role being reloaded.

This still leaves us with the question of "what about the DELETE
commands in existing scripts?".  I thought about adding rules and
triggers to try to make those operations do approximately what is
intended, but I fear it's a hopeless cause: because of shared
dependencies there is little or no likelihood that *all* of the
individual user drops will succeed, and since they would necessarily
be happening in a single transaction, that means none of them will.
So my current idea is to just ignore the problem: those commands
won't do what is intended but they should be relatively harmless.

Thoughts?
        regards, tom lane


Re: pg_dumpall --clean versus roles and shared dependencies

От
Tom Lane
Дата:
Some time ago I wrote:
> I've been trying to figure out what to do about pg_dumpall's --clean
> option in view of our recent changes.

(for the rest, see
http://archives.postgresql.org/pgsql-hackers/2005-07/msg01143.php

Since we haven't come up with any bright ideas, and it seems far too
late in the cycle to invent new features like "revoke <role> from *",
I propose that we just punt and make "pg_dumpall --clean" do the
straightforward thing of emitting a "DROP ROLE" command for each role
(user or group) it intends to re-create.

This eliminates the previous behavior of attempting to drop every user
and group in the destination installation.  I'm of the opinion that
that's a good thing; there was no such "scorched earth" policy for other
object types, and indeed one would wonder what's the point of having a
--clean option to do that.  You might as well just initdb before loading
the dump file.

There will not be a need to special-case the superuser, either, since
any attempt to drop the owner of template0 will of course fail.

The principal problem with this approach is that since we lack "DROP
ROLE CASCADE", the drops are entirely likely to fail (they will fail
if the roles hold any privileges or own any objects).  This makes the
whole feature rather pointless.  However, that's a problem we will not
be able to solve in the 8.1 time frame, so I propose we live with it.
It's not clear to me that pg_dumpall --clean is a sufficiently widely
used feature to be worth agonizing over.
        regards, tom lane