Thread: Performance of pg_dump on PGSQL 8.0

Performance of pg_dump on PGSQL 8.0

From:
"John E. Vincent"
Date:
-- this is the third time I've tried sending this and I never saw it get
through to the list. Sorry if multiple copies show up.

Hi all,

I've been lurking using the web archives for a while and haven't found
an answer that seems to answer my questions about pg_dump.

We have a 206GB data warehouse running on version 8.0.3. The server is
somewhat underpowered in terms of CPU: a single 2.8 GHz Xeon, 4GB of RAM and
a single HBA to our SAN (IBM DS4300). We're in the process of migrating to a
new server that we've repurposed from our production OLTP database: eight
2.0 GHz Xeons, 16GB of RAM and dual HBAs to the same SAN, running version 8.1.

Independent of that move, we still need to get by on the old system and
I'm concerned that even on the new system, pg_dump will still perform
poorly. I can't do a full test because we're also taking advantage of
the table partitioning in 8.1 so we're not doing a dump and restore.

We backup the database using:

pg_dump -Fc -cv ${CURDB} > ${BACKDIR}/${CURDB}-${DATE}.bak

There are three different LUNs allocated to the old warehouse on the SAN:
data, WAL and a dump area for the backups. The SAN has two controllers
(only 128MB of cache each), and the data is on one controller while the
WAL and dump area are on the other. Still a single HBA, though.

Creating the compressed backup of this database takes 12 hours. We start
at 6PM and it's done a little after 1AM, just in time for the next day's
load. The load itself takes about 5 hours.

I've watched the backup process and I/O is not a problem. Memory isn't a
problem either. It seems that we're CPU bound but NOT in I/O wait. The
server is a dedicated PGSQL box.

Here are our settings from the conf file:

maintenance_work_mem = 524288
work_mem = 1048576 (I know this is high, but you should see some of our
sorts and aggregates)
shared_buffers = 50000
effective_cache_size = 450000
wal_buffers = 64
checkpoint_segments = 256
checkpoint_timeout = 3600
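
(For reference, 8.0 reads work_mem and maintenance_work_mem in kB and the
buffer/cache settings in 8 kB pages, so the values above work out to roughly:

work_mem             = 1048576 kB     ~ 1 GB per sort/hash operation
maintenance_work_mem = 524288 kB      ~ 512 MB
shared_buffers       = 50000 x 8 kB   ~ 390 MB
effective_cache_size = 450000 x 8 kB  ~ 3.5 GB
wal_buffers          = 64 x 8 kB      = 512 kB)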

We're inserting around 3mil rows a night if you count staging, info, dim
and fact tables. The vacuum issue is a whole other problem but right now
I'm concerned about just the backup on the current hardware.

I've got some space to burn so I could go to an uncompressed backup and
compress it later during the day.

If there are any tips anyone can provide I would greatly appreciate it.
I know that the COPY performance was bumped up in 8.1 but I'm stuck on
this 8.0 box for a while longer.

Thanks,
John E. Vincent

Re: Performance of pg_dump on PGSQL 8.0

From:
Tom Lane
Date:
"John E. Vincent" <pgsql-performance@lusis.org> writes:
> I've watched the backup process and I/O is not a problem. Memory isn't a
> problem either. It seems that we're CPU bound but NOT in I/O wait.

Is it the pg_dump process, or the connected backend, that's chewing the
bulk of the CPU time?  (This should be pretty obvious in "top".)

If it's the pg_dump process, the bulk of the CPU time is likely going
into compression --- you might consider backing off the compression
level, perhaps --compress=1 or even 0 if size of the dump file isn't
a big concern.

Another possibility if your LAN is reasonably fast is to run pg_dump on
a different machine, so that you can put two CPUs to work.
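
(A rough sketch of both ideas combined; the host, database and path names
below are placeholders, not from this thread:)

# run the dump client on another box, with light compression, so its CPU
# cost lands somewhere other than the database server
pg_dump -h oldwarehouse -U postgres -Fc --compress=1 warehousedb \
    > /backups/warehousedb-$(date +%Y%m%d).bak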

            regards, tom lane

Re: Performance of pg_dump on PGSQL 8.0

From:
Scott Marlowe
Date:
On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> -- this is the third time I've tried sending this and I never saw it get
> through to the list. Sorry if multiple copies show up.
>
> Hi all,

BUNCHES SNIPPED

> work_mem = 1048576 ( I know this is high but you should see some of our
> sorts and aggregates)

Ummm.  That's REALLY high.  You might want to consider lowering the
global value here, and then crank it up on a case by case basis, like
during nighttime report generation.  Just one or two queries could
theoretically run your machine out of memory right now.  Just put a "set
work_mem=1000000" in your script before the big query runs.

> We're inserting around 3mil rows a night if you count staging, info, dim
> and fact tables. The vacuum issue is a whole other problem but right now
> I'm concerned about just the backup on the current hardware.
>
> I've got some space to burn so I could go to an uncompressed backup and
> compress it later during the day.

That's exactly what we do.  We just do a normal backup, and have a
script that gzips anything in the backup directory that doesn't end in
.gz...  If you've got space to burn, as you say, then use it at least a
few days to see how it affects backup speeds.

Seeing as how you're CPU bound, most likely the problem is just the
compressed backup.
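
(A minimal version of such a script, with a placeholder backup directory,
might look like this:)

#!/bin/sh
# compress any dump in the backup directory that isn't already gzipped
BACKDIR=/backups
find "$BACKDIR" -type f ! -name '*.gz' -exec gzip {} \;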

Re: Performance of pg_dump on PGSQL 8.0

From:
"John Vincent"
Date:


On 6/14/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> > -- this is the third time I've tried sending this and I never saw it get
> > through to the list. Sorry if multiple copies show up.
> >
> > Hi all,
>
> BUNCHES SNIPPED
>
> > work_mem = 1048576 ( I know this is high but you should see some of our
> > sorts and aggregates)
>
> Ummm.  That's REALLY high.  You might want to consider lowering the
> global value here, and then crank it up on a case by case basis, like
> during nighttime report generation.  Just one or two queries could
> theoretically run your machine out of memory right now.  Just put a "set
> work_mem=1000000" in your script before the big query runs.


I know it is, but that's what we need for some of our queries. Our ETL tool (Informatica) and BI tool (Actuate) won't let us set those things as part of our jobs. We need it for those purposes. We have some really nasty queries that will be fixed when we move to the new server.

E.g. we have a table called loan_account_agg_fact that has 200+ million rows and contains every possible combination of late status for a customer account (i.e. 1 day late, 2 days late, 3 days late), so it gets inserted for new customers but updated for existing records as part of our warehouse load. Part of the new layout is combining late ranges, so instead of a number of days we have a range of days (i.e. 1-15, 16-30, ...). Even with work_mem that large, the load of that loan_account_agg_fact table creates over 3GB of temp tables!


> That's exactly what we do.  We just do a normal backup, and have a
> script that gzips anything in the backup directory that doesn't end in
> .gz...  If you've got space to burn, as you say, then use it at least a
> few days to see how it affects backup speeds.
>
> Seeing as how you're CPU bound, most likely the problem is just the
> compressed backup.

I'm starting to think the same thing. I'll see how this COPY I'm doing of the single largest table does right now and make some judgement based on that.

--
John E. Vincent

Re: Performance of pg_dump on PGSQL 8.0

From:
"A.M."
Date:
On Wed, June 14, 2006 1:04 pm, John Vincent wrote:

> I know it is but that's what we need for some of our queries. Our ETL
> tool (informatica) and BI tool (actuate) won't let us set those things as
> part of our jobs. We need it for those purposes. We have some really nasty
> queries that will be fixed in our new server.

You could modify pgpool to insert the necessary set commands and point the
tools at pgpool.

-M


Re: Performance of pg_dump on PGSQL 8.0

From:
"John Vincent"
Date:
Out of curiosity, does anyone have any idea what the ratio of actual data size to backup size is if I use the custom format with -Z 0 compression or the tar format?

Thanks.

On 6/14/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> > -- this is the third time I've tried sending this and I never saw it get
> > through to the list. Sorry if multiple copies show up.
> >
> > Hi all,
>
> BUNCHES SNIPPED
>
> > work_mem = 1048576 ( I know this is high but you should see some of our
> > sorts and aggregates)
>
> Ummm.  That's REALLY high.  You might want to consider lowering the
> global value here, and then crank it up on a case by case basis, like
> during nighttime report generation.  Just one or two queries could
> theoretically run your machine out of memory right now.  Just put a "set
> work_mem=1000000" in your script before the big query runs.
>
> > We're inserting around 3mil rows a night if you count staging, info, dim
> > and fact tables. The vacuum issue is a whole other problem but right now
> > I'm concerned about just the backup on the current hardware.
> >
> > I've got some space to burn so I could go to an uncompressed backup and
> > compress it later during the day.
>
> That's exactly what we do.  We just do a normal backup, and have a
> script that gzips anything in the backup directory that doesn't end in
> .gz...  If you've got space to burn, as you say, then use it at least a
> few days to see how it affects backup speeds.
>
> Seeing as how you're CPU bound, most likely the problem is just the
> compressed backup.



--
John E. Vincent
lusis.org@gmail.com

Re: Performance of pg_dump on PGSQL 8.0

From:
Scott Marlowe
Date:
On Wed, 2006-06-14 at 12:04, John Vincent wrote:
>
> On 6/14/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> > On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> > > -- this is the third time I've tried sending this and I never saw it get
> > > through to the list. Sorry if multiple copies show up.
> > >
> > > Hi all,
> >
> > BUNCHES SNIPPED
> >
> > > work_mem = 1048576 ( I know this is high but you should see some of our
> > > sorts and aggregates)
> >
> > Ummm.  That's REALLY high.  You might want to consider lowering the
> > global value here, and then crank it up on a case by case basis, like
> > during nighttime report generation.  Just one or two queries could
> > theoretically run your machine out of memory right now.  Just put a "set
> > work_mem=1000000" in your script before the big query runs.
>
> I know it is but that's what we need for some of our queries. Our ETL
> tool (informatica) and BI tool (actuate) won't let us set those things
> as part of our jobs. We need it for those purposes. We have some
> really nasty queries that will be fixed in our new server.

Description of "Queries gone wild" redacted.  hehe.

Yeah, I've seen those kinds of queries before too.  You might be able to
limit your exposure by using alter user:

alter user userwhoneedslotsofworkmem set work_mem=1000000;

and then only that user will have that big of a default.  You could even
make it so that only queries that need that much log in as that user,
and all other queries log in as other folks.  Just a thought.  I just
get REAL nervous seeing a production machine with a work_mem set that
high.
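
(Concretely, with a hypothetical reporting role and database, that might look
like this, after lowering the global work_mem in postgresql.conf:)

psql -d warehouse -c "ALTER USER actuate_report SET work_mem = 1000000;"
# the per-user setting applies to new sessions for that user; verify with:
psql -d warehouse -U actuate_report -c "SHOW work_mem;"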


Re: Performance of pg_dump on PGSQL 8.0

From:
"John Vincent"
Date:


On 6/14/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote:

> Description of "Queries gone wild" redacted.  hehe.
>
> Yeah, I've seen those kinds of queries before too.  You might be able to
> limit your exposure by using alter user:
>
> alter user userwhoneedslotsofworkmem set work_mem=1000000;

Is this applicable on 8.0? We were actually LOOKING for a governor of some sort for these queries. And something that's not explicitly stated: is that memory allocated up front, or is it just a ceiling?

> and then only that user will have that big of a default.  You could even
> make it so that only queries that need that much log in as that user,
> and all other queries log in as other folks.  Just a thought.  I just
> get REAL nervous seeing a production machine with a work_mem set that
> high.

Which is actually how it's configured. We have a dedicated user connecting from Actuate. The report developers use their own logins when developing new reports. Only when the reports get published do they convert to the Actuate user.




--
John E. Vincent
lusis.org@gmail.com

Re: Performance of pg_dump on PGSQL 8.0

From:
Scott Marlowe
Date:
How long does gzip take to compress this backup?

On Wed, 2006-06-14 at 15:59, John Vincent wrote:
> Okay I did another test dumping using the uncompressed backup on the
> system unloaded and the time dropped down to 8m for the backup.
> There's still the size issue to contend with but as I said, I've got a
> fair bit of space left on the SAN to work with.
>
> On 6/14/06, John Vincent <pgsql-performance@lusis.org> wrote:
>         Well I did a test to answer my own question:
>
>         -rw-r--r--  1 postgres postgres 167M Jun 14 01:43 claDW_PGSQL-20060613170001.bak
>         -rw-r--r--  1 root     root     2.4G Jun 14 14:45 claDW_PGSQL.test.bak
>
>         the claDW_PGSQL database is a subset of the data in the main
>         schema that I'm dealing with.
>
>         I did several tests using -Fc -Z0 and a straight pg_dump with
>         no format option.
>
>         The file size is about 1300% larger and takes just as long to
>         dump even for that small database.
>
>         Interestingly enough gzip compresses about 1M smaller with no
>         gzip options.
>
>         I don't know that the uncompressed is really helping much. I'm
>         going to run another query when there's no other users on the
>         system and see how it goes.
>
>
>
> --
> John E. Vincent

Re: Performance of pg_dump on PGSQL 8.0

From:
"Jim C. Nasby"
Date:
On Wed, Jun 14, 2006 at 02:11:19PM -0400, John Vincent wrote:
> Out of curiosity, does anyone have any idea what the ratio of actual
> datasize to backup size is if I use the custom format with -Z 0 compression
> or the tar format?

-Z 0 should mean no compression.

Something you can try is piping the output of pg_dump to gzip/bzip2. On
some OSes, that will let you utilize 1 CPU for just the compression. If
you wanted to get even fancier, there is a parallelized version of bzip2
out there, which should let you use all your CPUs.

Or if you don't care about disk IO bandwidth, just compress after the
fact (though, that could just put you in a situation where pg_dump
becomes bandwidth constrained).
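
(For example, something along these lines; pbzip2 is one parallel bzip2
implementation, and the database and path names are placeholders:)

# one extra CPU doing the compression:
pg_dump -Fp warehousedb | gzip -1 > /backups/warehousedb.sql.gz

# or spread compression across all CPUs with a parallel bzip2:
pg_dump -Fp warehousedb | pbzip2 > /backups/warehousedb.sql.bz2
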
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Performance of pg_dump on PGSQL 8.0

From:
"John Vincent"
Date:
time gzip -6 claDW_PGSQL.test.bak

real    3m4.360s
user    1m22.090s
sys     0m6.050s

Which is still less time than it would take to do a compressed pg_dump.

On 6/14/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> How long does gzip take to compress this backup?
>
> On Wed, 2006-06-14 at 15:59, John Vincent wrote:
> > Okay I did another test dumping using the uncompressed backup on the
> > system unloaded and the time dropped down to 8m for the backup.
> > There's still the size issue to contend with but as I said, I've got a
> > fair bit of space left on the SAN to work with.

Re: Performance of pg_dump on PGSQL 8.0

From:
"John Vincent"
Date:


On 6/14/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> On Wed, Jun 14, 2006 at 02:11:19PM -0400, John Vincent wrote:
> > Out of curiosity, does anyone have any idea what the ratio of actual
> > datasize to backup size is if I use the custom format with -Z 0 compression
> > or the tar format?
>
> -Z 0 should mean no compression.

But the custom format is still a binary backup, no?

> Something you can try is piping the output of pg_dump to gzip/bzip2. On
> some OSes, that will let you utilize 1 CPU for just the compression. If
> you wanted to get even fancier, there is a parallelized version of bzip2
> out there, which should let you use all your CPUs.
>
> Or if you don't care about disk IO bandwidth, just compress after the
> fact (though, that could just put you in a situation where pg_dump
> becomes bandwidth constrained).

Unfortunately, if we're working with our current source box, that one CPU is already the bottleneck as far as compression goes. If I run the pg_dump from the remote server, though, I might be okay.


Re: Performance of pg_dump on PGSQL 8.0

From:
"Jim C. Nasby"
Date:
On Wed, Jun 14, 2006 at 05:18:14PM -0400, John Vincent wrote:
> On 6/14/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> >
> >On Wed, Jun 14, 2006 at 02:11:19PM -0400, John Vincent wrote:
> >> Out of curiosity, does anyone have any idea what the ratio of actual
> >> datasize to backup size is if I use the custom format with -Z 0
> >compression
> >> or the tar format?
> >
> >-Z 0 should mean no compression.
>
>
> But the custom format is still a binary backup, no?

I fail to see what that has to do with anything...

> >Something you can try is piping the output of pg_dump to gzip/bzip2. On
> >some OSes, that will let you utilize 1 CPU for just the compression. If
> >you wanted to get even fancier, there is a parallelized version of bzip2
> >out there, which should let you use all your CPUs.
> >
> >Or if you don't care about disk IO bandwidth, just compress after the
> >fact (though, that could just put you in a situation where pg_dump
> >becomes bandwidth constrained).
>
>
> Unfortunately if we working with our current source box, the 1 CPU is
> already the bottleneck in regards to compression. If I run the pg_dump from
> the remote server though, I might be okay.

Oh, right, forgot about that. Yeah, your best bet could be to use an
external machine for the dump.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Performance of pg_dump on PGSQL 8.0

From:
"Merlin Moncure"
Date:
Just a couple of suggestions:

I think on the current server you're pretty much hosed, since it looks like
you are CPU bottlenecked.  You probably should take a good look at PITR and
see if that meets your requirements.  Also, you definitely want to go to
8.1...it's faster, and every bit helps.
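
(Very roughly, PITR means turning on WAL archiving in postgresql.conf and
taking a filesystem-level base backup; a skeleton with placeholder paths:)

# postgresql.conf:
#   archive_command = 'cp %p /mnt/san/wal_archive/%f'
psql -c "SELECT pg_start_backup('nightly');"
tar -czf /mnt/san/base/base-$(date +%Y%m%d).tar.gz /var/lib/pgsql/data
psql -c "SELECT pg_stop_backup();"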

Good luck with the new IBM server ;)

merlin