Обсуждение: PostgreSQL Dump rate is too slow

Поиск
Список
Период
Сортировка

PostgreSQL Dump rate is too slow

От
girish R G peetle
Дата:
Hi all,
We have a customer who has 1TB database on a production server. They are trying dump based backup of this large database. Following dump command is being used. 
Dump rate is around 12 GB/hr, which will take lot of time for the backup to complete. This is affecting their production server.  
Is there is a way to increase dump data rate ?

pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>'


PostgreSQL version : 9.2.4
Platform : Linux

Thanks
Girish

Re: PostgreSQL Dump rate is too slow

От
David G Johnston
Дата:
girish R G peetle wrote
> Hi all,
> We have a customer who has 1TB database on a production server. They are
> trying dump based backup of this large database. Following dump command is
> being used.
> Dump rate is around 12 GB/hr, which will take lot of time for the backup
> to
> complete. This is affecting their production server.
> Is there is a way to increase dump data rate ?
>
> pg_dump -U
> <User>
>  -Fc -b --port=
> <Port>
>  '
> <Db-Name>
> '
>
>
> PostgreSQL version : 9.2.4
> Platform : Linux
>
> Thanks
> Girish

Use a 9.3+ version of pg_dump with the --jobs option and directory output.

David J.




--
View this message in context: http://postgresql.nabble.com/PostgreSQL-Dump-rate-is-too-slow-tp5833279p5833280.html
Sent from the PostgreSQL - admin mailing list archive at Nabble.com.


Re: PostgreSQL Dump rate is too slow

От
Jeff Frost
Дата:
> On Jan 7, 2015, at 21:35, girish R G peetle <giri.anamika0@gmail.com> wrote:
>
> Hi all,
> We have a customer who has 1TB database on a production server. They are trying dump based backup of this large
database.Following dump command is being used.  
> Dump rate is around 12 GB/hr, which will take lot of time for the backup to complete. This is affecting their
productionserver.   
> Is there is a way to increase dump data rate ?
>
> pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>'
>

You could reduce the compression ratio. Gzip defaults to -Z6, but -Z3 is quite a bit quicker with not that much larger
dumpfile.   

I did a few tests here: http://frosty-postgres.blogspot.com/2011/12/pgdump-compression-settings.html

Also, if it's is impacting production, you could set up a streaming replica and take the dump from the replica.

Re: PostgreSQL Dump rate is too slow

От
girish R G peetle
Дата:
Thanks David and Jeff.

@David:  In our case we don't want to redirect the dump data to local disk. We will redirect the dump it to a FIFO file ( pipe ). From FIFO file we read the dump data and push to backup Media ( Tape, SAN etc. ).  So we can't use directory output.

@Jeff: Sure, I will reduce compression ratio and update the thread with outcome. Also I'll recommend to setup standby with streaming replication.




On Thu, Jan 8, 2015 at 11:20 AM, Jeff Frost <jeff@pgexperts.com> wrote:

> On Jan 7, 2015, at 21:35, girish R G peetle <giri.anamika0@gmail.com> wrote:
>
> Hi all,
> We have a customer who has 1TB database on a production server. They are trying dump based backup of this large database. Following dump command is being used.
> Dump rate is around 12 GB/hr, which will take lot of time for the backup to complete. This is affecting their production server.
> Is there is a way to increase dump data rate ?
>
> pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>'
>

You could reduce the compression ratio. Gzip defaults to -Z6, but -Z3 is quite a bit quicker with not that much larger dump file.

I did a few tests here: http://frosty-postgres.blogspot.com/2011/12/pgdump-compression-settings.html

Also, if it's is impacting production, you could set up a streaming replica and take the dump from the replica.

Re: PostgreSQL Dump rate is too slow

От
Harshad Adalkonda
Дата:

On Thu, Jan 8, 2015 at 11:05 AM, girish R G peetle <giri.anamika0@gmail.com> wrote:

Hi all,
We have a customer who has 1TB database on a production server. They are trying dump based backup of this large database. Following dump command is being used. 
Dump rate is around 12 GB/hr, which will take lot of time for the backup to complete. This is affecting their production server.  
Is there is a way to increase dump data rate ?

pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>'


PostgreSQL version : 9.2.4
Platform : Linux

Thanks

Hi Girish,

As the database size is too large and tweaking  few database parameters will result towards performance improvement on pg_dump to some extent and pg_dump -j option would have helped if the DB version is PostgreSQL 9.3 but unfortunately your DB version is PostgreSQL 9.2

Another option I would think of to speed up by performing pg_dump in parallel for Database Or schema level however it would increase the load on the server.

The streaming replication option is the best solution for such higher database which will result towards to reduce the higher downtime/data loss and recovery time also helpful to perform physical and logical backups from replica instead of primary to avoid the impact on the primary servers..

Hope this helps.

Thanks & Regards,

Harshad Adalkonda
Database Administrator

http://www.shreeyansh.com

Re: PostgreSQL Dump rate is too slow

От
Robert Burgholzer
Дата:
To elaborate, I have found the rsync used in my (most?) streaming replication setups is 10x or more faster than pg_dump.  If your job specs required a pg_dump output file it would be fairly simple to do the following (moree or less):
1. setup a replicant (even on the same machine but different drive if you wanted)
2. make sure the replicant is off
3. rsync the data dir to the replicant
4. turn on the replicant
5. execute pg_dump on the replicant

regards,
/r/b

Re: PostgreSQL Dump rate is too slow

От
"ktm@rice.edu"
Дата:
On Thu, Jan 08, 2015 at 11:05:54AM +0530, girish R G peetle wrote:
> Hi all,
> We have a customer who has 1TB database on a production server. They are
> trying dump based backup of this large database. Following dump command is
> being used.
> Dump rate is around 12 GB/hr, which will take lot of time for the backup to
> complete. This is affecting their production server.
> Is there is a way to increase dump data rate ?
>
> pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>'
>
>
> PostgreSQL version : 9.2.4
> Platform : Linux
>
> Thanks
> Girish

Hi Girish,

The custom format is compressed by default and your dump rate is hamstrung by
the slow compression algorithm being used. I our case for a smaller DB, turning
off compression completely (-Z 0) dropped our time to dump the DB from 6 hours
to about 20 minutes. You can push the uncompressed dump through a faster compression
engine using lzo, snappy or lz4 if you need to have compressed output due to space
limitations.

Regards,
Ken


Re: PostgreSQL Dump rate is too slow

От
Scott Ribe
Дата:
On Jan 8, 2015, at 4:14 AM, Robert Burgholzer <rburghol@vt.edu> wrote:
>
> To elaborate, I have found the rsync used in my (most?) streaming replication setups is 10x or more faster than
pg_dump. If your job specs required a pg_dump output file it would be fairly simple to do the following (moree or
less):
> 1. setup a replicant (even on the same machine but different drive if you wanted)
> 2. make sure the replicant is off
> 3. rsync the data dir to the replicant
> 4. turn on the replicant
> 5. execute pg_dump on the replicant
>
> regards,
> /r/b

Don't forget to rsync between pg_start_backup & pg_stop_backup.

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice






Re: PostgreSQL Dump rate is too slow

От
Kevin Grittner
Дата:
Robert Burgholzer <rburghol@vt.edu> wrote:

> To elaborate, I have found the rsync used in my (most?) streaming
> replication setups is 10x or more faster than pg_dump.  If your
> job specs required a pg_dump output file it would be fairly simple
> to do the following (moree or less):
> 1. setup a replicant (even on the same machine but different
>    drive if you wanted)
> 2. make sure the replicant is off
> 3. rsync the data dir to the replicant
> 4. turn on the replicant
> 5. execute pg_dump on the replicant

If the original cluster is running, unless the steps for a PITR
backup are followed, that rsync could generated a corrupted
database.  The corruption may not be immediately apparent, and
might not happen every time, but the above is not safe without a
few more steps.

http://www.postgresql.org/docs/9.2/interactive/continuous-archiving.html

This is not intended as a complete list, but among other things
archiving should be working first, the rsync should exclude the
postmaster.pid file and the files under the pg_xlog directory,
rsync should be preceded by pg_start_backup() and followed by
pg_stop_backup(), and a recovery.conf file is needed.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: PostgreSQL Dump rate is too slow

От
David G Johnston
Дата:


On Thu, Jan 8, 2015 at 12:12 AM, Adalkonda Harshad [via PostgreSQL] <[hidden email]> wrote:

As the database size is too large and tweaking  few database parameters will result towards performance improvement on pg_dump to some extent and pg_dump -j option would have helped if the DB version is PostgreSQL 9.3 but unfortunately your DB version is PostgreSQL 9.2


​pg_dump 9.3+ can be used against the 9.2 server without problems.  Read the documentation for the --jobs option for details.

David J.
 


View this message in context: Re: PostgreSQL Dump rate is too slow
Sent from the PostgreSQL - admin mailing list archive at Nabble.com.

Re: PostgreSQL Dump rate is too slow

От
"ktm@rice.edu"
Дата:
On Thu, Jan 08, 2015 at 08:38:37AM -0700, David G Johnston wrote:
> On Thu, Jan 8, 2015 at 12:12 AM, Adalkonda Harshad [via PostgreSQL] <
> ml-node+s1045698n5833283h97@n5.nabble.com> wrote:
>
> >
> > As the database size is too large and tweaking  few database parameters
> > will result towards performance improvement on pg_dump to some extent and
> > pg_dump -j option would have helped if the DB version is PostgreSQL 9.3 but
> > unfortunately your DB version is PostgreSQL 9.2
> >
> >
> ​pg_dump 9.3+ can be used against the 9.2 server without problems.  Read
> the documentation for the --jobs option for details.
>
> David J.
> ​
>
+1

This is a nice option for allowing you to leverage more than a single CPU core
for the dump, provided you have the CPU and I/O cycles to support it. As others
have suggested, offloading to a replica can help with that.

Regards,
Ken


Re: PostgreSQL Dump rate is too slow

От
Alvaro Herrera
Дата:
ktm@rice.edu wrote:
> On Thu, Jan 08, 2015 at 08:38:37AM -0700, David G Johnston wrote:
> > On Thu, Jan 8, 2015 at 12:12 AM, Adalkonda Harshad [via PostgreSQL] <
> > ml-node+s1045698n5833283h97@n5.nabble.com> wrote:
> >
> > >
> > > As the database size is too large and tweaking  few database parameters
> > > will result towards performance improvement on pg_dump to some extent and
> > > pg_dump -j option would have helped if the DB version is PostgreSQL 9.3 but
> > > unfortunately your DB version is PostgreSQL 9.2
> > >
> > pg_dump 9.3+ can be used against the 9.2 server without problems.  Read
> > the documentation for the --jobs option for details.
>
> This is a nice option for allowing you to leverage more than a single CPU core
> for the dump, provided you have the CPU and I/O cycles to support it. As others
> have suggested, offloading to a replica can help with that.

Keep in mind, though, that while 9.3's pg_dump can read from a 9.2
server, there is no guarantee that what it writes will be readable by
9.2 other tools (pg_restore or psql).  Maybe it is currently, but there
are no promises.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PostgreSQL Dump rate is too slow

От
Eduardo Morras
Дата:
On Thu, 8 Jan 2015 11:05:54 +0530
girish R G peetle <giri.anamika0@gmail.com> wrote:

> Hi all,
> We have a customer who has 1TB database on a production server. They
> are trying dump based backup of this large database. Following dump
> command is being used.
> Dump rate is around 12 GB/hr, which will take lot of time for the
> backup to complete. This is affecting their production server.
> Is there is a way to increase dump data rate ?
>
> pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>'
>

Do not use pg_dump compression, pipe output to xz

% pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>' | xz -3 dump.xz

or pipe xz output to other program.

When I looked for the same problem in 8.3-8.4 versions, the bottleneck was in accessing TOAST tables, it's content was
decompressed,dumped and recompressed again, don't know if it has changed in current versions. 

>
> PostgreSQL version : 9.2.4
> Platform : Linux
>
> Thanks
> Girish


---   ---
Eduardo Morras <emorrasg@yahoo.es>


Re: PostgreSQL Dump rate is too slow

От
"Joshua D. Drake"
Дата:
On 01/14/2015 12:20 PM, Eduardo Morras wrote:
>
> On Thu, 8 Jan 2015 11:05:54 +0530
> girish R G peetle <giri.anamika0@gmail.com> wrote:
>
>> Hi all,
>> We have a customer who has 1TB database on a production server. They
>> are trying dump based backup of this large database. Following dump
>> command is being used.
>> Dump rate is around 12 GB/hr, which will take lot of time for the
>> backup to complete. This is affecting their production server.
>> Is there is a way to increase dump data rate ?
>>
>> pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>'
>>
>
> Do not use pg_dump compression, pipe output to xz
>
> % pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>' | xz -3 dump.xz
>
> or pipe xz output to other program.
>
> When I looked for the same problem in 8.3-8.4 versions, the bottleneck was in accessing TOAST tables, it's content
wasdecompressed, dumped and recompressed again, don't know if it has changed in current versions. 

Don't do this. You are still looking at an extremely slow dump. Instead
set up a warm or hot standby or use pg_basebackup.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
              not be surprised when they come back as Romans."


Re: PostgreSQL Dump rate is too slow

От
girish R G peetle
Дата:
Hi all,
We tried pg_dump with compression level set to zero on 1TB database. Dump data rate started with 250GB/hr and gradually dropped to 30 GB/hr with 2 hours time span. We might see this behavior on standby server too, which will be undesirable.

Any explanation on why we see this behavior ?

Dump command : pg_dump -Fc -Z 0


Thanks
Girish


On Thu, Jan 15, 2015 at 2:32 AM, Joshua D. Drake <jd@commandprompt.com> wrote:

On 01/14/2015 12:20 PM, Eduardo Morras wrote:

On Thu, 8 Jan 2015 11:05:54 +0530
girish R G peetle <giri.anamika0@gmail.com> wrote:

Hi all,
We have a customer who has 1TB database on a production server. They
are trying dump based backup of this large database. Following dump
command is being used.
Dump rate is around 12 GB/hr, which will take lot of time for the
backup to complete. This is affecting their production server.
Is there is a way to increase dump data rate ?

pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>'


Do not use pg_dump compression, pipe output to xz

% pg_dump -U <User> -Fc -b --port=<Port> '<Db-Name>' | xz -3 dump.xz

or pipe xz output to other program.

When I looked for the same problem in 8.3-8.4 versions, the bottleneck was in accessing TOAST tables, it's content was decompressed, dumped and recompressed again, don't know if it has changed in current versions.

Don't do this. You are still looking at an extremely slow dump. Instead set up a warm or hot standby or use pg_basebackup.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
             not be surprised when they come back as Romans."



--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: PostgreSQL Dump rate is too slow

От
"Joshua D. Drake"
Дата:
On 01/15/2015 09:21 AM, girish R G peetle wrote:
> Hi all,
> We tried pg_dump with compression level set to zero on 1TB database.
> Dump data rate started with 250GB/hr and gradually dropped to 30 GB/hr
> with 2 hours time span. We might see this behavior on standby server
> too, which will be undesirable.
>
> Any explanation on why we see this behavior ?

Because you have a long running transaction that is causing bloat to
pile up. Using pg_dump on a production database that size is a
non-starter. You need a warm/hot standby or snapshot to do this properly.

JD
--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
              not be surprised when they come back as Romans."


Re: PostgreSQL Dump rate is too slow

От
girish R G peetle
Дата:
Thanks Joshua. Even if we have a long transaction running on the database, pg_dump shouldn't be affected right ? As it doesn't block readers or writers.
Before getting resources to setup stand by server, I just wanna make sure that we don't this issue on stand by too.

On Thu, Jan 15, 2015 at 11:11 PM, Joshua D. Drake <jd@commandprompt.com> wrote:

On 01/15/2015 09:21 AM, girish R G peetle wrote:
Hi all,
We tried pg_dump with compression level set to zero on 1TB database.
Dump data rate started with 250GB/hr and gradually dropped to 30 GB/hr
with 2 hours time span. We might see this behavior on standby server
too, which will be undesirable.

Any explanation on why we see this behavior ?

Because you have a long running transaction that is causing bloat to pile up. Using pg_dump on a production database that size is a non-starter. You need a warm/hot standby or snapshot to do this properly.


JD
--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
             not be surprised when they come back as Romans."

Re: PostgreSQL Dump rate is too slow

От
Scott Ribe
Дата:
On Jan 15, 2015, at 10:57 AM, girish R G peetle <giri.anamika0@gmail.com> wrote:
>
> Thanks Joshua. Even if we have a long transaction running on the database, pg_dump shouldn't be affected right ? As
itdoesn't block readers or writers.  
> Before getting resources to setup stand by server, I just wanna make sure that we don't this issue on stand by too.

I believe he meant that pg_dump itself is the long-running transaction. Even though read-only transactions block
neitherreaders nor writers, they still prevent cleanup of blocks because they have to be able to see the state of the
dbas of the beginning of the transaction. 

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice






Re: PostgreSQL Dump rate is too slow

От
girish R G peetle
Дата:
Thanks Scott, Joshua. Got it.

On Thu, Jan 15, 2015 at 11:43 PM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:
On Jan 15, 2015, at 10:57 AM, girish R G peetle <giri.anamika0@gmail.com> wrote:
>
> Thanks Joshua. Even if we have a long transaction running on the database, pg_dump shouldn't be affected right ? As it doesn't block readers or writers.
> Before getting resources to setup stand by server, I just wanna make sure that we don't this issue on stand by too.

I believe he meant that pg_dump itself is the long-running transaction. Even though read-only transactions block neither readers nor writers, they still prevent cleanup of blocks because they have to be able to see the state of the db as of the beginning of the transaction.

Re: PostgreSQL Dump rate is too slow

От
"Joshua D. Drake"
Дата:
On 01/15/2015 09:57 AM, girish R G peetle wrote:
> Thanks Joshua. Even if we have a long transaction running on the
> database, pg_dump shouldn't be affected right ? As it doesn't block
> readers or writers.

Correct. It does not block readers or writers, it does however block
routine maintenance most of the time, which will cause bloat to pile up,
making your reads (and writes) go slower as the pages become more
fragmented.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
              not be surprised when they come back as Romans."