Discussion: Is there anything special about pg_dump's compression?


Is there anything special about pg_dump's compression?

From:
Jean-David Beyer
Date:
When I run pg_dump, the computer spends a great amount of time in "system"
state -- around 100% of one CPU and part of another. The small part seems to
be the PostgreSQL server, and the big part the client (pg_dump) compressing
the data.

Now my tape drive has built-in compression anyway (although I could turn it
off). I prefer to let the hardware compression run since it is a nuisance to
turn it on and off and I want it on for my normal backups of the rest of the
system.

Does pg_dump's compression do anything really special that it is not likely
the tape drive already does? The drive claims 2:1 compression for average
data (e.g., not already compressed stuff like .jpeg files).


--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 10:50:01 up 23 days, 4:08, 5 users, load average: 4.16, 4.40, 4.44


Re: Is there anything special about pg_dump's compression?

From:
Andrew Sullivan
Date:
On Thu, Nov 15, 2007 at 11:05:44AM -0500, Jean-David Beyer wrote:
> Does pg_dump's compression do anything really special that it is not likely
> the tape drive already does? The drive claims 2:1 compression for average
> data (e.g., not already compressed stuff like .jpeg files).

It's zlib, if I recall correctly.  So probably not.

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke


Re: Is there anything special about pg_dump's compression?

From:
Jean-David Beyer
Date:
Andrew Sullivan wrote:
> On Thu, Nov 15, 2007 at 11:05:44AM -0500, Jean-David Beyer wrote:
>> Does pg_dump's compression do anything really special that it is not
>> likely the tape drive already does? The drive claims 2:1 compression
>> for average data (e.g., not already compressed stuff like .jpeg files).
>> 
> 
> It's zlib, if I recall correctly.  So probably not.
> 
I turned the software compression off. It took:

524487428 bytes (524 MB) copied, 125.394 seconds, 4.2 MB/s


When I let the software compression run, it uses only 30 MBytes. So whatever
compression it uses is very good on this kind of data.

29810260 bytes (30 MB) copied, 123.145 seconds, 242 kB/s


Since the whole database like that was probably in RAM, I would not expect
much IO time. Also the data transfer light was on a lot of the time instead
of short blinks. It did not seem to lighten the CPU load much. The postgres
server process got 100% of a cpu and the client took about 12% of another
when running uncompressed. I imagined the client did the compression and
writing to tape, and the server just picked up the data from the
shared_buffers (= 253000 @ 8KB each); i.e., that holds about 2 GBytes. When
the client is compressing, the client's cpu takes about 40% of a processor.
When it is not compressing, it takes about 12% of a processor.

If I am right, it seems to take a lot of time to pick up the database from
RAM if it requires 100% of a 3.06GHz Xeon processor. The tape drive (Exabyte
VXA-2) has a 12 MB/sec transfer rate, so it should be the limiting factor
(but it does not seem to be), but I do not notice a whole lot of IO-Wait
time (though there is some).

Any idea why the server is compute-limited just reading from the shared
buffers and delivering it to the client to write to tape? Is it that I have
too many shared buffers and I should reduce it from about 2 GBytes? Does it
sequentially search the shared buffers or something? I made it large so I
could get at least all the active indices in, and preferably the hot data
pages as well.



Re: Is there anything special about pg_dump's compression?

From:
Tom Lane
Date:
Jean-David Beyer <jeandavid8@verizon.net> writes:
> I turned the software compression off. It took:
> 524487428 bytes (524 MB) copied, 125.394 seconds, 4.2 MB/s

> When I let the software compression run, it uses only 30 MBytes. So whatever
> compression it uses is very good on this kind of data.
> 29810260 bytes (30 MB) copied, 123.145 seconds, 242 kB/s

Seems to me the conclusion is obvious: you are writing about the same
number of bits to physical tape either way.  The physical tape speed is
surely the real bottleneck here, and the fact that the total elapsed
time is about the same both ways proves that about the same number of
bits went onto tape both ways.

The quoted MB and MB/s numbers are not too comparable because they are
before and after compression respectively.

The software compression seems to be a percent or two better than the
hardware's compression, but that's not enough to worry about really.
What you should ask yourself is whether you have other uses for the main
CPU's cycles during the time you're taking backups.  If so, offload the
compression cycles onto the tape hardware.  If not, you might as well
gain the one or two percent win.
        regards, tom lane


Re: Is there anything special about pg_dump's compression?

From:
Jean-David Beyer
Date:
Tom Lane wrote:
> Jean-David Beyer <jeandavid8@verizon.net> writes:
>> I turned the software compression off. It took:
>> 524487428 bytes (524 MB) copied, 125.394 seconds, 4.2 MB/s
> 
>> When I let the software compression run, it uses only 30 MBytes. So whatever
>> compression it uses is very good on this kind of data.
>> 29810260 bytes (30 MB) copied, 123.145 seconds, 242 kB/s
> 
> Seems to me the conclusion is obvious: you are writing about the same
> number of bits to physical tape either way. 

I guess so. I _am_ impressed by how much compression is achieved.

> The physical tape speed is
> surely the real bottleneck here, and the fact that the total elapsed
> time is about the same both ways proves that about the same number of
> bits went onto tape both ways.

I do not get that. If the physical tape speed is the bottleneck, why is it
only about 242 kB/s in the software-compressed case, and 4.2 MB/s in the
hardware-uncompressed case? The tape drive usually gives over 6 MB/s when
running a BRU backup (similar to find > cpio) of the rest of my system,
where not all the files compress very much. Also, when doing a BRU backup,
the amount of CPU time is well under 100%. If I am right, the postgres
server is using 100% of a CPU, and the client (pg_dump), which is the one
that actually compresses (when it is enabled in software), takes either
40% or 12%.
> 
> The quoted MB and MB/s numbers are not too comparable because they are
> before and after compression respectively.
> 
> The software compression seems to be a percent or two better than the
> hardware's compression, but that's not enough to worry about really.

Agreed. The times for backup (and restore) are acceptable. Being new to
postgres, I am just interested in how it works from a user's point-of-view.

> What you should ask yourself is whether you have other uses for the main
> CPU's cycles during the time you're taking backups.  If so, offload the
> compression cycles onto the tape hardware.  If not, you might as well
> gain the one or two percent win.

Sure, I always have something to do with the excess cycles, though it is not
an obsession of mine.

But out of intellectual curiosity, why is the postgres _server_ taking 100%
of a CPU during a backup when it is the postgres _client_ that is actually
running the tape drive -- especially if it is tape-IO limited?



Re: Is there anything special about pg_dump's compression?

From:
Shane Ambler
Date:
Jean-David Beyer wrote:
> Tom Lane wrote:
>> Jean-David Beyer <jeandavid8@verizon.net> writes:
>>> I turned the software compression off. It took:
>>> 524487428 bytes (524 MB) copied, 125.394 seconds, 4.2 MB/s
>>> When I let the software compression run, it uses only 30 MBytes. So whatever
>>> compression it uses is very good on this kind of data.
>>> 29810260 bytes (30 MB) copied, 123.145 seconds, 242 kB/s
>> Seems to me the conclusion is obvious: you are writing about the same
>> number of bits to physical tape either way. 
> 
> I guess so. I _am_ impressed by how much compression is achieved.

Plain text tends to get good compression with most algorithms, and
repetitive content improves things a lot. (Think of how many CREATE TABLE,
COPY FROM stdin, ALTER TABLE ADD CONSTRAINT, GRANT ALL ON SCHEMA, REVOKE
ALL ON SCHEMA ... statements are in your backup files.)

To test that, create a text file with one line -- "this is data\n" -- then
bzip that file: the original uses 13 bytes, the compressed file 51 bytes.

Now change the file to have 4000 lines of "this is data\n": the original is
52,000 bytes and compressed it is 76 bytes. It takes only about 25 extra
bytes to indicate that the same string is repeated 4000 times.
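The experiment above can be reproduced with Python's bz2 module (a sketch;
the exact byte counts depend on the bzip2 implementation and compression
level, so they may differ by a few bytes from the figures quoted):

```python
import bz2

line = b"this is data\n"

# One 13-byte line: bzip2's fixed header overhead makes the
# "compressed" output larger than the input.
small = bz2.compress(line)
print(len(line), len(small))

# 4000 copies of the same line: the 52,000-byte input collapses to a
# few dozen bytes, since repetition costs almost nothing to encode.
big = line * 4000
compressed = bz2.compress(big)
print(len(big), len(compressed))
```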

>> The physical tape speed is
>> surely the real bottleneck here, and the fact that the total elapsed
>> time is about the same both ways proves that about the same number of
>> bits went onto tape both ways.
> 
> I do not get that. If the physical tape speed is the bottleneck, why is it
> only about 242 kB/s in the software-compressed case, and 4.2 MB/s in the
> hardware-uncompressed case? The tape drive usually gives over 6 MB/s rates
> when running a BRU (similar to find > cpio) when doing a backup of the rest

It would really depend on where the speed measurement comes from and how 
they are calculated. Is it data going to the drive controller or is it 
data going to tape? Is it the uncompressed size of data going to tape?

My guess is that it is calculated as the uncompressed size going to tape.
Your two examples show similar times for the same original uncompressed
data.

I would say that both methods send 30MB to tape, which takes around 124
seconds.

The first example states 4.2MB/s - calculated from the uncompressed size 
of 524MB, yet the drive compresses that to 30MB which is written to 
tape. So it is saying it got 524MB and saved it to tape in 125 seconds 
(4.2MB/s), but it still only put 30MB on the tape.

524MB/125 seconds = 4.192MB per second

The second example states 242KB/s - calculated from the size sent to the 
drive - as the data the drive gets is compressed it can't compress it 
any smaller - the data received is the same size as the data written to 
tape. This would indicate your tape speed.

30MB/123 seconds = 243KB/s

To verify this -

524/30=17 - the compressed data is 1/17 the original size.

242*17=4114 - that's almost the 4.2MB/s that you get sending 
uncompressed data, I would say you get a little more compression from 
the tape hardware that gives you the slightly better transfer rate.
Or sending compressed data to the drive with it set to compress incoming 
data is causing a delay as the drive tries to compress the data without 
reducing the size sent to tape. (my guess is that if you disabled the 
drive compression and sent the compressed pg_dump to the drive you would 
get about 247KB/s)
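The arithmetic above can be checked mechanically, using the byte counts and
elapsed times quoted earlier in the thread:

```python
# Back-of-envelope check of the ratios discussed above.
uncompressed = 524_487_428   # bytes sent with pg_dump compression off
compressed = 29_810_260      # bytes sent with pg_dump compression on
t_off, t_on = 125.394, 123.145   # elapsed seconds for each run

ratio = uncompressed / compressed       # ~17.6:1
rate_off = uncompressed / t_off / 1e6   # MB/s into the drive
rate_on = compressed / t_on / 1e3       # kB/s into the drive

print(f"{ratio:.1f}:1, {rate_off:.1f} MB/s vs {rate_on:.0f} kB/s")
# Scaling the compressed-path rate back up by the ratio roughly
# recovers the uncompressed-path rate, consistent with the same
# number of bits hitting the tape either way.
print(f"{rate_on * ratio:.0f} kB/s equivalent")
```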


I would also say the 6MB/s from a drive backup would come about from -
1. less overhead as data is sent directly from disk to tape. (DMA should 
reduce the software overhead as well). (pg_dump formats the data it gets 
and waits for responses from postgres - no DMA)

And maybe -
2. A variety of file contents would also offer different rates of 
compression - some of your file system contents can be compressed more 
than pg_dump output.
3. Streamed as one lot to the drive it may also allow it to treat your 
entire drive contents as one file - allowing duplicates in different 
files to be compressed the way the above example does.


-- 

Shane Ambler
pgSQL@Sheeky.Biz

Get Sheeky @ http://Sheeky.Biz


Re: Is there anything special about pg_dump's compression?

From:
Jean-David Beyer
Date:
Shane Ambler wrote:
> Jean-David Beyer wrote:

>>> The physical tape speed is surely the real bottleneck here, and the
>>> fact that the total elapsed time is about the same both ways proves
>>> that about the same number of bits went onto tape both ways.
>> 
>> I do not get that. If the physical tape speed is the bottleneck, why is
>> it only about 242 kB/s in the software-compressed case, and 4.2 MB/s in
>> the hardware-uncompressed case? The tape drive usually gives over 6
>> MB/s rates when running a BRU (similar to find > cpio) when doing a
>> backup of the rest
> 
> It would really depend on where the speed measurement comes from and how 
> they are calculated. Is it data going to the drive controller or is it 
> data going to tape? Is it the uncompressed size of data going to tape?

I imagine it is the speed, as measured by the CPU, of the data going into
the (Linux) operating system's write() calls.
> 
> My guess is that it is calculated as the uncompressed size going to tape.
> In the two examples you give similar times for the same original 
> uncompressed data.

True. But that tells me that it is the CPU that is the limiting factor. In
other words, if I send compressed data, it sends 30 Megabytes in about the
same time that if I send uncompressed data (for the tape drive hardware to
compress -- the SCSI controller driving the tape drive sure does not
compress anything much). I originally started this thread because I wanted
to know if the compression in pg_dump was anything special, and I was told
that it was probably not. And this seems to be the case, as it takes about
the same amount of time to dump the database whether I compress it in
pg_dump or in the tape drive. But then it seemed, and still seems to me,
that instead of being limited by the tape speed, it is limited by the CPU
speed of the CPU running the postgres server -- and that confuses me, since
intuitively it is not doing much.
> 
> I would say that both methods send 30MB to tape which takes around 124 
> seconds

You are right about this. In other words, the time to send the data to the
tape drive, whether it is 30 Megabytes (compressed by the program) or 524
megabytes (compressed by the drive) will put down about the same number of
bytes onto the tape. I.e., the tape head sees (about) the same number of
bytes either way. This means the transmission speed of the SCSI controller
is certainly fast enough to handle what is going on (though I do not think
there was any questioning of that). But since the tape drive can take 6
uncompressed megabytes per second (and it does -- this is not advertising
hype: I get that when doing normal backups of my system), and is getting
only 4.3, that means the bottleneck is _before_ the SCSI controller.

Here is a typical example. Bru does a backup of my entire system (except for
the postgres stuff), rewinds the tape, and reads it all back in, verifying
the checksum of every block on the tape. It does not (although it could) do
any compression.

**** bru: execution summary ****

Started:                Wed Nov 14 01:04:16 2007
Completed:              Wed Nov 14 02:02:56 2007
Archive id:             473a8fe017a4
Messages:               0 warnings,  0 errors
Archive I/O:            5588128 blocks (11176256Kb) written
Archive I/O:            5588128 blocks (11176256Kb) read
Files written:          202527 files (170332 regular, 32195 other)

So we wrote 11.176 GB, rewound the tape, read 11.176 GB back, and rewound
the tape again, all in about an hour. Ignoring rewind times, that works out
to writing or reading 6.2 uncompressed megabytes/second, and a little faster
still if we discount the rewinds, which are not really important in this
discussion. This is the rate of data going to the interface. It just shows
that the 6 megabytes/second claimed by the manufacturer is realistic -- you
actually get it in a real application.
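As a sanity check, the 6.2 MB/s figure can be recomputed from the bru
execution summary (a sketch assuming bru's "Kb" means 1024-byte units; the
elapsed time still includes the two rewinds, so the streaming rate is
slightly higher):

```python
from datetime import datetime

# Recomputing the throughput from the bru execution summary above.
start = datetime(2007, 11, 14, 1, 4, 16)
end = datetime(2007, 11, 14, 2, 2, 56)
elapsed = (end - start).total_seconds()     # 3520 s, rewinds included

kb_moved = 11_176_256 * 2                   # blocks written + read back
rate = kb_moved / elapsed / 1024            # MB/s, taking 1 MB = 1024 KB

print(f"{elapsed:.0f} s elapsed, {rate:.1f} MB/s")
```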

Now what is on this machine? A lot of binary program files that probably do
not compress much. Quite a bunch of .jpeg files that are already compressed,
so they probably do not compress much. Some .mp3 files: I do not know how
much they compress. Program source files (but not lots of them). _Lots_ of
files that have been zipped, so they probably do not compress much;
1,347,184 blocks worth of that stuff.
> 
> The first example states 4.2MB/s - calculated from the uncompressed size 
> of 524MB, yet the drive compresses that to 30MB which is written to tape.
> So it is saying it got 524MB and saved it to tape in 125 seconds 
> (4.2MB/s), but it still only put 30MB on the tape.
> 
> 524MB/125 seconds = 4.192MB per second
> 
> The second example states 242KB/s - calculated from the size sent to the 
> drive - as the data the drive gets is compressed it can't compress it any
> smaller - the data received is the same size as the data written to tape.
> This would indicate your tape speed.
> 
> 30MB/123 seconds = 243KB/s
> 
> To verify this -
> 
> 524/30=17 - the compressed data is 1/17 the original size.
> 
> 242*17=4114 - that's almost the 4.2MB/s that you get sending uncompressed
> data, I would say you get a little more compression from the tape
> hardware that gives you the slightly better transfer rate. Or sending
> compressed data to the drive with it set to compress incoming data is
> causing a delay as the drive tries to compress the data without reducing
> the size sent to tape. (my guess is that if you disabled the drive
> compression and sent the compressed pg_dump to the drive you would get
> about 247KB/s)
> 
I suppose so too, but it is too much bother to turn the compression off, so
I do not propose to test that.
> 
> I would also say the 6MB/s from a drive backup would come about from - 1.
> less overhead as data is sent directly from disk to tape. (DMA should 
> reduce the software overhead as well). (pg_dump formats the data it gets 
> and waits for responses from postgres - no DMA)

Well, the tape drive is run by an Ultra/320 LVD SCSI controller on a
dedicated PCI-X bus, and that is sent in 65536-byte blocks, so the software
overhead is minimal -- as can be seen by the fact that the client (pg_dump)
only runs at 12% of a CPU when writing uncompressed. It is the postgres
_server_ that runs at 100% of a CPU, so it is the bottleneck. The question
is: what is the postgres server doing that needs 100% of a 3.06 GHz Xeon
processor? Recall that this database is by no means fully loaded and
everything is in RAM.
> 
> And maybe - 2. A variety of file contents would also offer different
> rates of compression - some of your file system contents can be
> compressed more than pg_dump output.

If I am getting 17:1 compression on the database, I would say that is far
more compression than I get on the average files in my system. Manufacturers
normally claim you get about 2:1 compression on what they consider typical
data. Some claim even that is optimistic. Some vendors claim 2.5:1
compression, and maybe for their data, whatever it is, they do get that.

> 3. Streamed as one lot to the drive it may also allow it to treat your 
> entire drive contents as one file - allowing duplicates in different 
> files to be compressed the way the above example does.
> 
I am not sure what you mean by streamed in this case. My tape drive can
start and stop while running if required. In fact, if the computer fails to
keep up, it slows down the writing speed so that the tape need not start and
stop -- so it never has to backspace and restart if the computer has trouble
keeping up like the old DDS-2 tapes had to do all the time. The tape drive
writes one block at a time. It also compresses (or decompresses) one block
at a time.
