Обсуждение: [GENERAL] Using cp to back up a database?

Поиск
Список
Период
Сортировка

[GENERAL] Using cp to back up a database?

От
Ron Johnson
Дата:
Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
cp -r /var/lib/pgsql/data/* $dumpdir/data/
/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"


Should it use rsync or pg_dump instead?

Thanks

-- 
World Peace Through Nuclear Pacification

Re: [GENERAL] Using cp to back up a database?

От
Larry Rosenman
Дата:

If you want a consistent database (you *REALLY* do), pg_dump is the correct tool.

 

-- 

Larry Rosenman                     http://www.lerctr.org/~ler

Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org

US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

 

 

From: <pgsql-general-owner@postgresql.org> on behalf of Ron Johnson <ron.l.johnson@cox.net>
Date: Monday, October 9, 2017 at 8:41 AM
To: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: [GENERAL] Using cp to back up a database?

 

Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"

cp -r /var/lib/pgsql/data/* $dumpdir/data/

/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"



Should it use rsync or pg_dump instead?

Thanks


-- 
World Peace Through Nuclear Pacification

Re: [GENERAL] Using cp to back up a database?

От
Melvin Davidson
Дата:


On Mon, Oct 9, 2017 at 9:51 AM, Larry Rosenman <ler@lerctr.org> wrote:

If you want a consistent database (you *REALLY* do), pg_dump is the correct tool.

 

-- 

Larry Rosenman                     http://www.lerctr.org/~ler

Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org

US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

 

 

From: <pgsql-general-owner@postgresql.org> on behalf of Ron Johnson <ron.l.johnson@cox.net>
Date: Monday, October 9, 2017 at 8:41 AM
To: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: [GENERAL] Using cp to back up a database?

 

Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"

cp -r /var/lib/pgsql/data/* $dumpdir/data/

/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"



Should it use rsync or pg_dump instead?

Thanks


-- 
World Peace Through Nuclear Pacification

>Should it use rsync or pg_dump instead?

It is a matter of choice. What you are doing with the script is making a backup of the entire PostgreSQL data file directory. That includes all PostgreSQL system catalogs as well as user objects. A restore of your data directory would be faster than a full restore of pg_dump as indexes and constraints would not have to be recreated. However, it does not give you the option to restore one or more selected objects.
pg_dump, on the other hand, gives you the flexibility of selecting what to backup and what to restore. FWIW, I much prefer pg_dump.

--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: [GENERAL] Using cp to back up a database?

От
Achilleas Mantzios
Дата:
On 09/10/2017 16:51, Larry Rosenman wrote:

If you want a consistent database (you *REALLY* do), pg_dump is the correct tool.


In all pg conferences I have been, ppl scream : do not use pg_dump for backups :)

 

-- 

Larry Rosenman                     http://www.lerctr.org/~ler

Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org

US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

 

 

From: <pgsql-general-owner@postgresql.org> on behalf of Ron Johnson <ron.l.johnson@cox.net>
Date: Monday, October 9, 2017 at 8:41 AM
To: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: [GENERAL] Using cp to back up a database?

 

Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"

cp -r /var/lib/pgsql/data/* $dumpdir/data/

/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"



Should it use rsync or pg_dump instead?

Thanks


-- 
World Peace Through Nuclear Pacification


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt

Re: [GENERAL] Using cp to back up a database?

От
Michael Paquier
Дата:
On Mon, Oct 9, 2017 at 11:09 PM, Achilleas Mantzios
<achill@matrix.gatewaynet.com> wrote:
> On 09/10/2017 16:51, Larry Rosenman wrote:
>
> If you want a consistent database (you *REALLY* do), pg_dump is the correct
> tool.
>
> In all pg conferences I have been, ppl scream : do not use pg_dump for
> backups :)

It depends on what you are trying to achieve, pg_dump can be fine for
small-ish databases. By relying on both logical (pg_dump) and physical
backups (base backups) brings more insurance in face of a disaster.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Using cp to back up a database?

От
Ron Johnson
Дата:

Sure I want a consistent database.  Why doesn't?

But log shipping requires you to rsync/var/lib/pgsql/data  to the remote server, and that's consistent, so why wouldn't rsync to a local directory also be consistent?

On 10/09/2017 08:51 AM, Larry Rosenman wrote:

If you want a consistent database (you *REALLY* do), pg_dump is the correct tool.

 

-- 

Larry Rosenman                     http://www.lerctr.org/~ler

Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org

US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

 

 

From: <pgsql-general-owner@postgresql.org> on behalf of Ron Johnson <ron.l.johnson@cox.net>
Date: Monday, October 9, 2017 at 8:41 AM
To: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: [GENERAL] Using cp to back up a database?

 

Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"

cp -r /var/lib/pgsql/data/* $dumpdir/data/

/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"



Should it use rsync or pg_dump instead?

Thanks



-- 
World Peace Through Nuclear Pacification

Re: [GENERAL] Using cp to back up a database?

От
Achilleas Mantzios
Дата:
On 09/10/2017 17:13, Michael Paquier wrote:
> On Mon, Oct 9, 2017 at 11:09 PM, Achilleas Mantzios
> <achill@matrix.gatewaynet.com> wrote:
>> In all pg conferences I have been, ppl scream : do not use pg_dump for
>> backups :)
> It depends on what you are trying to achieve, pg_dump can be fine for
> small-ish databases. By relying on both logical (pg_dump) and physical
> backups (base backups) brings more insurance in face of a disaster.

Basically yes, it depends on size. Where I work in order to do a proper pg_dump would take ~ one week. Business can't
waitfor that long.
 


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Using cp to back up a database?

От
Jeff Janes
Дата:
On Mon, Oct 9, 2017 at 6:40 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
cp -r /var/lib/pgsql/data/* $dumpdir/data/
/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"


That's fine, as long as you have a wal archive.  Although I don't know what is "Incremental" about it.  If you upgrade to a version which wasn't quite so ancient, you could use pg_basebackup.
 

Should it use rsync or pg_dump instead?

rsync is dangerous if not used with great care, and probably isn't going to get you much for just doing a backup.

pg_dump is also fine, but it does something different, it creates a logical backup, not a physical backup.  The backup from pg_dump cannot be used to seed a PITR or streaming replica.  On the other hand, it can restored into a database from a different version and different architecture.   And with pg_dump the pg_start_backup and pg_stop_backup are useless and confusing.

Cheers

Re: [GENERAL] Using cp to back up a database?

От
Ron Johnson
Дата:
On 10/09/2017 11:33 AM, Jeff Janes wrote:
On Mon, Oct 9, 2017 at 6:40 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
cp -r /var/lib/pgsql/data/* $dumpdir/data/
/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"


That's fine, as long as you have a wal archive.  Although I don't know what is "Incremental" about it.

From reading the docs, that field is just a label with no intrinsic meaning.


-- 
World Peace Through Nuclear Pacification

Re: [GENERAL] Using cp to back up a database?

От
Darren Douglas
Дата:
Ron:

Here is an explanation that may help a bit.

Your script is executing a PHYSICAL backup. A physical backup is simply a full copy of the cluster (instance) data directory ($PGDATA). A physical backup is your best option when you need to backup the cluster data as well as all configuration for the cluster. Essentially, if you had to rebuild the entire computer hosting the cluster, you could just reinstall the same version of postgres, copy in the backup data directory, and the cluster would run exactly as it did before with the same data. A physical backup is also necessary when the databases get very large. 

In the backup script you posted, the 'pg_start_backup' and 'pg_stop_backup' commands fulfill two purposes. The first is to create a label for the point in time the backup was started - this is done by pg_start_backup. The second is to ensure that all WAL segments that have been written since the backup began have been safely archived. That is done by pg_stop_backup. This approach is necessary to accomplish an online physical backup.

As others have mentioned pg_dump is a LOGICAL backup tool similar to any SQL dump you've done with another DBMS. The pg_dump command will do a SQL dump to recreate everything within a single database. So, if you have multiple databases in your cluster, its not the best option. pg_dumpall is the logical backup tool that will do a logical dump of all globals (schema + roles) along with all databases in the cluster. Because the pg_dump/pg_dumpall commands are not executing a physical backup, the pg_start_backup and pg_stop_backup commands do not apply.

As for times when you would elect to do a logical backup, as others have mentioned, this is the only valid option when you are restoring to a different version of Postgres. It is also a good option to do a backup of a single small database or several small databases. And, if for any reason the backup needs to be human-readable, this is the approach of choice as well.

Darren



The first

On Mon, Oct 9, 2017 at 6:40 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
cp -r /var/lib/pgsql/data/* $dumpdir/data/
/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"


Should it use rsync or pg_dump instead?

Thanks

-- 
World Peace Through Nuclear Pacification



--
Darren Douglas
Synse Solutions
520-661-5885

Re: [GENERAL] Using cp to back up a database?

От
Jeff Janes
Дата:
On Mon, Oct 9, 2017 at 7:15 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:

Sure I want a consistent database.  Why doesn't?

But log shipping requires you to rsync/var/lib/pgsql/data  to the remote server, and that's consistent, so why wouldn't rsync to a local directory also be consistent?

But it isn't consistent by itself.  That is why the log shipping is required (or an equivalent method of keeping the necessary logs around), to fix up the consistency.
 
Cheers,

Jeff

Re: [GENERAL] Using cp to back up a database?

От
Jeff Janes
Дата:
On Mon, Oct 9, 2017 at 9:41 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
On 10/09/2017 11:33 AM, Jeff Janes wrote:
On Mon, Oct 9, 2017 at 6:40 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
cp -r /var/lib/pgsql/data/* $dumpdir/data/
/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"


That's fine, as long as you have a wal archive.  Although I don't know what is "Incremental" about it.

From reading the docs, that field is just a label with no intrinsic meaning.

Its purpose is to communicate with you.  If the person who implemented my backup solution is attempting to communicate with me, and I have no idea what that communication means, then that makes me quite nervous.

Cheers,

Jeff

Re: [GENERAL] Using cp to back up a database?

От
Darren Douglas
Дата:
Sorry, forgot about your main question about cp ...

So, because the physical backup is a copy of the entire cluster data directory, ANY valid and safe method for copying that directory is ok.

In most production installations, that means that an enterprise backup tool may be used to accomplish the copy. It may mean a shell script that does backup rotation, etc. However, at its simplest, it may just be a cp or rsync command.

So yes, very simply, the cp -r command issued in the script, along with ensuring that WALs are safely archived, is a valid backup approach.

Personally, I would use cp -ar instead. The 'a' option invokes cp in "archive" mode, which will preserve file permissions.

On Mon, Oct 9, 2017 at 9:49 AM, Darren Douglas <darren@synsesolutions.com> wrote:
Ron:

Here is an explanation that may help a bit.

Your script is executing a PHYSICAL backup. A physical backup is simply a full copy of the cluster (instance) data directory ($PGDATA). A physical backup is your best option when you need to backup the cluster data as well as all configuration for the cluster. Essentially, if you had to rebuild the entire computer hosting the cluster, you could just reinstall the same version of postgres, copy in the backup data directory, and the cluster would run exactly as it did before with the same data. A physical backup is also necessary when the databases get very large. 

In the backup script you posted, the 'pg_start_backup' and 'pg_stop_backup' commands fulfill two purposes. The first is to create a label for the point in time the backup was started - this is done by pg_start_backup. The second is to ensure that all WAL segments that have been written since the backup began have been safely archived. That is done by pg_stop_backup. This approach is necessary to accomplish an online physical backup.

As others have mentioned pg_dump is a LOGICAL backup tool similar to any SQL dump you've done with another DBMS. The pg_dump command will do a SQL dump to recreate everything within a single database. So, if you have multiple databases in your cluster, its not the best option. pg_dumpall is the logical backup tool that will do a logical dump of all globals (schema + roles) along with all databases in the cluster. Because the pg_dump/pg_dumpall commands are not executing a physical backup, the pg_start_backup and pg_stop_backup commands do not apply.

As for times when you would elect to do a logical backup, as others have mentioned, this is the only valid option when you are restoring to a different version of Postgres. It is also a good option to do a backup of a single small database or several small databases. And, if for any reason the backup needs to be human-readable, this is the approach of choice as well.

Darren



The first

On Mon, Oct 9, 2017 at 6:40 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
cp -r /var/lib/pgsql/data/* $dumpdir/data/
/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"


Should it use rsync or pg_dump instead?

Thanks

-- 
World Peace Through Nuclear Pacification



--
Darren Douglas
Synse Solutions



--
Darren Douglas
Synse Solutions
520-661-5885

Re: [GENERAL] Using cp to back up a database?

От
Ron Johnson
Дата:
Maybe my original question wasn't clear, so I'll try again: is it safe to do a physical using cp (as opposed to rsync)?


On 10/09/2017 11:49 AM, Darren Douglas wrote:
Ron:

Here is an explanation that may help a bit.

Your script is executing a PHYSICAL backup. A physical backup is simply a full copy of the cluster (instance) data directory ($PGDATA). A physical backup is your best option when you need to backup the cluster data as well as all configuration for the cluster. Essentially, if you had to rebuild the entire computer hosting the cluster, you could just reinstall the same version of postgres, copy in the backup data directory, and the cluster would run exactly as it did before with the same data. A physical backup is also necessary when the databases get very large. 

In the backup script you posted, the 'pg_start_backup' and 'pg_stop_backup' commands fulfill two purposes. The first is to create a label for the point in time the backup was started - this is done by pg_start_backup. The second is to ensure that all WAL segments that have been written since the backup began have been safely archived. That is done by pg_stop_backup. This approach is necessary to accomplish an online physical backup.

As others have mentioned pg_dump is a LOGICAL backup tool similar to any SQL dump you've done with another DBMS. The pg_dump command will do a SQL dump to recreate everything within a single database. So, if you have multiple databases in your cluster, its not the best option. pg_dumpall is the logical backup tool that will do a logical dump of all globals (schema + roles) along with all databases in the cluster. Because the pg_dump/pg_dumpall commands are not executing a physical backup, the pg_start_backup and pg_stop_backup commands do not apply.

As for times when you would elect to do a logical backup, as others have mentioned, this is the only valid option when you are restoring to a different version of Postgres. It is also a good option to do a backup of a single small database or several small databases. And, if for any reason the backup needs to be human-readable, this is the approach of choice as well.

Darren



The first

On Mon, Oct 9, 2017 at 6:40 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
cp -r /var/lib/pgsql/data/* $dumpdir/data/
/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"


Should it use rsync or pg_dump instead?

Thanks

-- 
World Peace Through Nuclear Pacification



--
Darren Douglas
Synse Solutions
520-661-5885


-- 
World Peace Through Nuclear Pacification

Re: [GENERAL] Using cp to back up a database?

От
Scott Mead
Дата:


On Mon, Oct 9, 2017 at 1:19 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:
Maybe my original question wasn't clear, so I'll try again: is it safe to do a physical using cp (as opposed to rsync)?

Yes -- however you must configure WAL archiving first.  If not, no backup tool, cp, rsync, etc... will provide a good backup.

Oh, and BTW -- The obligatory: You are on an ancient, EOL version of PG.  Upgrade.

Make sure that these are set:
- archive_mode
- archive_command


Then, on when you restore the backup, you need to create a recovery.conf and configure
- restore_command





 



On 10/09/2017 11:49 AM, Darren Douglas wrote:
Ron:

Here is an explanation that may help a bit.

Your script is executing a PHYSICAL backup. A physical backup is simply a full copy of the cluster (instance) data directory ($PGDATA). A physical backup is your best option when you need to backup the cluster data as well as all configuration for the cluster. Essentially, if you had to rebuild the entire computer hosting the cluster, you could just reinstall the same version of postgres, copy in the backup data directory, and the cluster would run exactly as it did before with the same data. A physical backup is also necessary when the databases get very large. 

In the backup script you posted, the 'pg_start_backup' and 'pg_stop_backup' commands fulfill two purposes. The first is to create a label for the point in time the backup was started - this is done by pg_start_backup. The second is to ensure that all WAL segments that have been written since the backup began have been safely archived. That is done by pg_stop_backup. This approach is necessary to accomplish an online physical backup.

As others have mentioned pg_dump is a LOGICAL backup tool similar to any SQL dump you've done with another DBMS. The pg_dump command will do a SQL dump to recreate everything within a single database. So, if you have multiple databases in your cluster, its not the best option. pg_dumpall is the logical backup tool that will do a logical dump of all globals (schema + roles) along with all databases in the cluster. Because the pg_dump/pg_dumpall commands are not executing a physical backup, the pg_start_backup and pg_stop_backup commands do not apply.

As for times when you would elect to do a logical backup, as others have mentioned, this is the only valid option when you are restoring to a different version of Postgres. It is also a good option to do a backup of a single small database or several small databases. And, if for any reason the backup needs to be human-readable, this is the approach of choice as well.

Darren



The first

On Mon, Oct 9, 2017 at 6:40 AM, Ron Johnson <ron.l.johnson@cox.net> wrote:
Hi,

v8.4.20

This is what the current backup script uses:

/usr/bin/psql -U postgres -c "SELECT pg_start_backup('Incrementalbackup',true);"
cp -r /var/lib/pgsql/data/* $dumpdir/data/
/usr/bin/psql -U postgres template1 -c "SELECT pg_stop_backup();"


Should it use rsync or pg_dump instead?

Thanks

-- 
World Peace Through Nuclear Pacification



--
Darren Douglas
Synse Solutions


-- 
World Peace Through Nuclear Pacification



--
--
Scott Mead
Sr. Architect
OpenSCG

Re: [GENERAL] Using cp to back up a database?

От
Ron Johnson
Дата:
On 10/09/2017 01:02 PM, Scott Mead wrote:


On Mon, Oct 9, 2017 at 1:19 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:
Maybe my original question wasn't clear, so I'll try again: is it safe to do a physical using cp (as opposed to rsync)?

Yes -- however you must configure WAL archiving first.  If not, no backup tool, cp, rsync, etc... will provide a good backup.

Oh, and BTW -- The obligatory: You are on an ancient, EOL version of PG.  Upgrade.

Make sure that these are set:
- archive_mode
- archive_command


Then, on when you restore the backup, you need to create a recovery.conf and configure
- restore_command


This is good to know.  Thanks.

-- 
World Peace Through Nuclear Pacification

Re: [GENERAL] Using cp to back up a database?

От
Stephen Frost
Дата:
Ron,

* Ron Johnson (ron.l.johnson@cox.net) wrote:
> Maybe my original question wasn't clear, so I'll try again: is it
> safe to do a physical using cp (as opposed to rsync)?

Frankly, I'd say no.  There's nothing to guarantee that the backup is
actually sync'd out to disk.  Further, you're probably in the same boat
when it comes to WAL archiving (just using 'cp' there isn't good for the
same reason), though that's arguably worse because once you return true
from archive_command, that WAL file will be gone and you could end up
with a hole in your WAL stream.

Next, this backup method has been deprecated because of issues with the
backup_label file and what happens when the system crashes during a
backup.

So, no, you shouldn't be just using 'cp', or 'rsync', or any of those
similar, simple, tools for doing a proper PG backup.  Use a tool which
has been developed specifically for PG such as pgBackRest, barman,
WAL-E, WAL-G, etc.

Thanks!

Stephen