Обсуждение: Performance issues during backup

Поиск
Список
Период
Сортировка

Performance issues during backup

От
Dylan Luong
Дата:

Hi

 

We perform nighty base backup of our production PostgreSQL instance. We have a script that basically puts the instance into back mode and then backs up (tar) the /Data directory and then takes it out of backup mode.

Ie,

psql -c "SELECT pg_start_backup('${DATE}');"

tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}

psql -c "SELECT pg_stop_backup();"

 

The size of our database is about 250GB and it usually takes about 1 hour to backup.

During this time, we have performance issue where queries can take up to 15secs to return where normally it takes 2 to 3 seconds.

During this time (1:30am) usage is low (less than 10 users) on the system.

 

Has anyone experience the same problem and any suggestions where to look at to resolve the problem?

 

Thanks.

Dylan

 

 

Re: Performance issues during backup

От
Laurenz Albe
Дата:
Dylan Luong wrote:
> We perform nighty base backup of our production PostgreSQL instance. We have a script that basically puts the
instance
> into back mode and then backs up (tar) the /Data directory and then takes it out of backup mode.
> Ie,
> psql -c "SELECT pg_start_backup('${DATE}');"
> tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}
> psql -c "SELECT pg_stop_backup();"
>  
> The size of our database is about 250GB and it usually takes about 1 hour to backup.
> During this time, we have performance issue where queries can take up to 15secs to return where normally it takes 2
to3 seconds.
 
> During this time (1:30am) usage is low (less than 10 users) on the system.
>  
> Has anyone experience the same problem and any suggestions where to look at to resolve the problem?

The "tar" is probably taking up too much I/O bandwidth.

Assuming this is Linux, you could run it with

  ionice -c 2 -n 7 tar ...

or

  ionice -c 3 tar ...

Of course then you can expect the backup to take more time.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com


Re: Performance issues during backup

От
Rene Romero Benavides
Дата:
What about sending the backup to a different server? through ssh / rsync or something, that would save lots of IO activity

2018-02-20 2:02 GMT-06:00 Laurenz Albe <laurenz.albe@cybertec.at>:
Dylan Luong wrote:
> We perform nighty base backup of our production PostgreSQL instance. We have a script that basically puts the instance
> into back mode and then backs up (tar) the /Data directory and then takes it out of backup mode.
> Ie,
> psql -c "SELECT pg_start_backup('${DATE}');"
> tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}
> psql -c "SELECT pg_stop_backup();"
>
> The size of our database is about 250GB and it usually takes about 1 hour to backup.
> During this time, we have performance issue where queries can take up to 15secs to return where normally it takes 2 to 3 seconds.
> During this time (1:30am) usage is low (less than 10 users) on the system.
>
> Has anyone experience the same problem and any suggestions where to look at to resolve the problem?

The "tar" is probably taking up too much I/O bandwidth.

Assuming this is Linux, you could run it with

  ionice -c 2 -n 7 tar ...

or

  ionice -c 3 tar ...

Of course then you can expect the backup to take more time.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com




--
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/

Re: Performance issues during backup

От
Stephen Frost
Дата:
Greetings,

* Dylan Luong (Dylan.Luong@unisa.edu.au) wrote:
> We perform nighty base backup of our production PostgreSQL instance. We have a script that basically puts the
instanceinto back mode and then backs up (tar) the /Data directory and then takes it out of backup mode. 
> Ie,
> psql -c "SELECT pg_start_backup('${DATE}');"
> tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}
> psql -c "SELECT pg_stop_backup();"

This doesn't do anything to ensure that the backup files were actually
written out to disk, meaning that you might lose files or portions of
files if the system crashed shortly after this completed.

You don't mention anything about archive_command or how you are handling
WAL?  Note that you must have the WAL generated between the start backup
and the stop backup for the backup to be valid.  If you do have an
archive command and it looks anything like what you have for your backup
command, you likely have the same issue there where the WAL isn't being
synced out to disk and therefore you might lose some WAL on a crash and
not be able to do PITR or possibly utilize a given backup.

You also don't meniton anything about checking that you have all of the
WAL needed between the start/stop before considering the backup valid.
I'd strongly suggest you look into a designed-for-PG backup solution
instead of trying to roll your own and trying to fix all of these
issues.  There's a few of them out there that I'd suggest considering-
pgBackRest being favorite (but I'm biased), barman, or wal-e (maybe
wal-g).

> The size of our database is about 250GB and it usually takes about 1 hour to backup.
> During this time, we have performance issue where queries can take up to 15secs to return where normally it takes 2
to3 seconds. 
> During this time (1:30am) usage is low (less than 10 users) on the system.
>
> Has anyone experience the same problem and any suggestions where to look at to resolve the problem?

Some of the backup tools for PG support throttling to slow down the
backup, but my recommendation here would be to stand up a replica and
have the bulk of the data pulled from the replica for a given backup
instead of doing it from the primary.  As mentioned down-thread, you're
likely saturating either your i/o bandwidth or your network bandwidth
(if you're going to an NFS mount or similar).

Thanks!

Stephen

Вложения

RE: Performance issues during backup

От
Dylan Luong
Дата:

Hi

Thanks for the rely.

I have trialed the ionice -c 2 -n 7 tar…. change to our backup script and it appears to have helped but not by much.

The affected queries are more of the update/delete/insert queries. Could pg_start_backup be causing locking of some sort.

Regards

Dylan

 

From: Rene Romero Benavides [mailto:rene.romero.b@gmail.com]
Sent: Wednesday, 21 February 2018 1:37 AM
To: Laurenz Albe <laurenz.albe@cybertec.at>
Cc: Dylan Luong <Dylan.Luong@unisa.edu.au>; pgsql-general@lists.postgresql.org
Subject: Re: Performance issues during backup

 

What about sending the backup to a different server? through ssh / rsync or something, that would save lots of IO activity

 

2018-02-20 2:02 GMT-06:00 Laurenz Albe <laurenz.albe@cybertec.at>:

Dylan Luong wrote:
> We perform nighty base backup of our production PostgreSQL instance. We have a script that basically puts the instance
> into back mode and then backs up (tar) the /Data directory and then takes it out of backup mode.
> Ie,
> psql -c "SELECT pg_start_backup('${DATE}');"
> tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}
> psql -c "SELECT pg_stop_backup();"
>
> The size of our database is about 250GB and it usually takes about 1 hour to backup.
> During this time, we have performance issue where queries can take up to 15secs to return where normally it takes 2 to 3 seconds.
> During this time (1:30am) usage is low (less than 10 users) on the system.
>
> Has anyone experience the same problem and any suggestions where to look at to resolve the problem?

The "tar" is probably taking up too much I/O bandwidth.

Assuming this is Linux, you could run it with

  ionice -c 2 -n 7 tar ...

or

  ionice -c 3 tar ...

Of course then you can expect the backup to take more time.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com



 

--

El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/

 

Re: Performance issues during backup

От
PT
Дата:
On Fri, 23 Feb 2018 01:46:22 +0000
Dylan Luong <Dylan.Luong@unisa.edu.au> wrote:

> Hi
> Thanks for the rely.
> I have trialed the ionice -c 2 -n 7 tar…. change to our backup script and it appears to have helped but not by much.

If "ionice -c 2 -n 7" helped some, maybe try "ionice -c 3" to set IO priority
to the lowest possible level.

However, I've used this technique many times on heavily loaded systems without
issue. If you're having to squirm this much to avoid problems, you probably
need to get better hardware or investigate the possibility that your
hardware is faulty or some other IO related issue.

> The affected queries are more of the update/delete/insert queries. Could pg_start_backup be causing locking of some
sort.

Not in my experience. And the fact that they are write queries having trouble
makes me theorize that you're saturating the write capacity of your disks.

> From: Rene Romero Benavides [mailto:rene.romero.b@gmail.com]
> Sent: Wednesday, 21 February 2018 1:37 AM
> To: Laurenz Albe <laurenz.albe@cybertec.at>
> Cc: Dylan Luong <Dylan.Luong@unisa.edu.au>; pgsql-general@lists.postgresql.org
> Subject: Re: Performance issues during backup
> 
> What about sending the backup to a different server? through ssh / rsync or something, that would save lots of IO
activity
> 
> 2018-02-20 2:02 GMT-06:00 Laurenz Albe <laurenz.albe@cybertec.at<mailto:laurenz.albe@cybertec.at>>:
> Dylan Luong wrote:
> > We perform nighty base backup of our production PostgreSQL instance. We have a script that basically puts the
instance
> > into back mode and then backs up (tar) the /Data directory and then takes it out of backup mode.
> > Ie,
> > psql -c "SELECT pg_start_backup('${DATE}');"
> > tar -cvf - ${DATA_DIR} --exclude ${DATA_DIR}/pg_log | split -d -b $TAR_SPLIT_SIZE - ${BACKUP_DIR}/${BACKUP_NAME}
> > psql -c "SELECT pg_stop_backup();"
> >
> > The size of our database is about 250GB and it usually takes about 1 hour to backup.
> > During this time, we have performance issue where queries can take up to 15secs to return where normally it takes 2
to3 seconds.
 
> > During this time (1:30am) usage is low (less than 10 users) on the system.
> >
> > Has anyone experience the same problem and any suggestions where to look at to resolve the problem?
> 
> The "tar" is probably taking up too much I/O bandwidth.
> 
> Assuming this is Linux, you could run it with
> 
>   ionice -c 2 -n 7 tar ...
> 
> or
> 
>   ionice -c 3 tar ...
> 
> Of course then you can expect the backup to take more time.
> 
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
> 
> 
> 
> --
> El genio es 1% inspiración y 99% transpiración.
> Thomas Alva Edison
> http://pglearn.blogspot.mx/
> 


-- 
Bill Moran