Обсуждение: Copying databases with extensions - pg_dump question

Поиск
Список
Период
Сортировка

Copying databases with extensions - pg_dump question

От
Ivan Voras
Дата:
A fairly frequent operation I do is copying a database between servers,
for which I use pg_dump. Since the database contains some extensions -
most notably hstore and tsearch2, which need superuser privileges to
install, I have a sort of a chicken-and-egg problem: the owner of the
database (and all its objects) should be a non-superuser account so I
can't simply use the output from pg_dump and expect everything to be
correct after restoring it.

So far, I've used this workaround: install all superuser-requiring
extensions in template1 on the destination server and then restore from
pg_dump, ignoring the occasional "duplicate object" errors. This would
work out of the box but pg_dump's "create database" commands (outputted
with -C) includes the "WITH TEMPLATE=template0" clause so I made a small
utility which modifies these dumps to change the one byte so the
template becomes template1. (-C is useful so I can do "psql template1
pgsql < my_dump.sql" and get it all done).

Anyway, this works "good enough" but I wonder now if there is a better
solution for this? As a feature request, I'd like a "template database"
argument to use with "-C" so I don't have to modify the dumps, but there
could be a better solution which side-steps this.

Is there a canonical way to deal with this problem?

Re: Copying databases with extensions - pg_dump question

От
Bill Moran
Дата:
In response to Ivan Voras <ivoras@freebsd.org>:

> A fairly frequent operation I do is copying a database between servers,
> for which I use pg_dump. Since the database contains some extensions -
> most notably hstore and tsearch2, which need superuser privileges to
> install, I have a sort of a chicken-and-egg problem: the owner of the
> database (and all its objects) should be a non-superuser account so I
> can't simply use the output from pg_dump and expect everything to be
> correct after restoring it.

Why not?  If the ownership on the original database is non-superuser, then
that will be faithfully preserved when the database is restored.  What
are you doing to cause it to behave differently?

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Copying databases with extensions - pg_dump question

От
Ivan Voras
Дата:
On 21/01/2011 14:39, Bill Moran wrote:
> In response to Ivan Voras<ivoras@freebsd.org>:
>
>> A fairly frequent operation I do is copying a database between servers,
>> for which I use pg_dump. Since the database contains some extensions -
>> most notably hstore and tsearch2, which need superuser privileges to
>> install, I have a sort of a chicken-and-egg problem: the owner of the
>> database (and all its objects) should be a non-superuser account so I
>> can't simply use the output from pg_dump and expect everything to be
>> correct after restoring it.
>
> Why not?  If the ownership on the original database is non-superuser, then
> that will be faithfully preserved when the database is restored.  What
> are you doing to cause it to behave differently?

I have reviewed my operations and it looks like these are the important
differences:

* The database copy might be from a development machine to production so
I use pg_dump -O to remove any accidentally entered unwanted user
ownership data
* The database restore on the target machine is done as a nonprivileged
user (the target owner of the database)

Are there better ways to do this?

Re: Copying databases with extensions - pg_dump question

От
Bill Moran
Дата:
In response to Ivan Voras <ivoras@freebsd.org>:

> On 21/01/2011 14:39, Bill Moran wrote:
> > In response to Ivan Voras<ivoras@freebsd.org>:
> >
> >> A fairly frequent operation I do is copying a database between servers,
> >> for which I use pg_dump. Since the database contains some extensions -
> >> most notably hstore and tsearch2, which need superuser privileges to
> >> install, I have a sort of a chicken-and-egg problem: the owner of the
> >> database (and all its objects) should be a non-superuser account so I
> >> can't simply use the output from pg_dump and expect everything to be
> >> correct after restoring it.
> >
> > Why not?  If the ownership on the original database is non-superuser, then
> > that will be faithfully preserved when the database is restored.  What
> > are you doing to cause it to behave differently?
>
> I have reviewed my operations and it looks like these are the important
> differences:
>
> * The database copy might be from a development machine to production so
> I use pg_dump -O to remove any accidentally entered unwanted user
> ownership data
> * The database restore on the target machine is done as a nonprivileged
> user (the target owner of the database)
>
> Are there better ways to do this?

In my experience, the answer is yes and no.

On the "yes" side, I would force developers to keep schema information in
sql files in subversion.  Then "copy" your database using a two-step
process (schema and data separately).  If your developers are not already
doing this, then there will likely be a lot of resistance, but it's
worth it in the long run.  You can use placeholders in the schema file
for things like ROLE_OWNER, etc ... then use sed or any number of
simple tools to replace those placeholders as needed.  At work, we
have some pretty awesome tools to make this easier that we're trying to
open-source for PGCon.

On the "no" side, doing this kind of thing is always complex.  We have a
slew of other, very specialized scripts that do things like convert a
production database to a development database by sanitizing sensitive
data, or automatically deploy new database or upgrades to either our
production, development, or lab environments.  Build tools like phing,
make, ant, etc make this kind of thing easier to create, but it's still
a lot of work because each situation is special.

So, my overall answer is that you're probably on the right track, you
probably don't realize how much work is involved to get this really
working well, and I would change just a few things about your approach
based on the information you've provided so far.  The requirement to
use an unprivileged user to restore is going to make the tools you
use to prepare the input files more important, but that's not insurmountable
if you keep control over them (i.e. use schema files and a tool for
normalizing them and do the data dump separately)

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Copying databases with extensions - pg_dump question

От
Ivan Voras
Дата:
On 21/01/2011 15:55, Bill Moran wrote:

> On the "no" side, doing this kind of thing is always complex.  We have a
> slew of other, very specialized scripts that do things like convert a
> production database to a development database by sanitizing sensitive
> data, or automatically deploy new database or upgrades to either our
> production, development, or lab environments.  Build tools like phing,
> make, ant, etc make this kind of thing easier to create, but it's still
> a lot of work because each situation is special.

Yes, I've also made a set of migration and sanitizing scripts so that
aspect is covered. I was only wondering if there is some way I'm missing
that would solve this one step more elegantly.

> So, my overall answer is that you're probably on the right track, you
> probably don't realize how much work is involved to get this really
> working well, and I would change just a few things about your approach
> based on the information you've provided so far.  The requirement to
> use an unprivileged user to restore is going to make the tools you
> use to prepare the input files more important, but that's not insurmountable
> if you keep control over them (i.e. use schema files and a tool for
> normalizing them and do the data dump separately)

Thanks, I'll consider that approach.