Re: libpq compression

From: Florian Pflug
Subject: Re: libpq compression
Date: June 25, 2012
Msg-id: 73447F47-E9A3-420B-8903-9F6A4513E229@phlo.org
In reply to: Re: libpq compression (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: libpq compression ("ktm@rice.edu" <ktm@rice.edu>)
List: pgsql-hackers
On Jun 25, 2012, at 04:04, Robert Haas wrote:
> If, for
> example, someone can demonstrate that an awesomebsdlz compresses 10x
> as fast as OpenSSL...  that'd be pretty compelling.

That, actually, is demonstrably the case for at least Google's snappy
(and for LZO, but that's not an option since its license is the GPL).
They state in their documentation that

 In our tests, Snappy usually is faster than algorithms in the same
 class (e.g. LZO, LZF, FastLZ, QuickLZ, etc.) while achieving
 comparable compression ratios.

The only widely supported compression method for SSL seems to be DEFLATE,
which is also what gzip and zlib use. I benchmarked LZO against gzip/zlib
a few months ago, and LZO outperformed zlib in fast mode (i.e. gzip -1) by
an order of magnitude.
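
For reference, zlib's fast mode is trivial to drive from C; a minimal
one-shot sketch (error handling trimmed, and the input string is just a
placeholder):

#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

int
main(void)
{
    const Bytef data[] = "some payload worth compressing ...";
    uLong   data_len = sizeof(data);
    uLongf  out_len = compressBound(data_len);  /* safe upper bound */
    Bytef  *out = malloc(out_len);

    /* level 1 == Z_BEST_SPEED, i.e. what gzip -1 uses */
    if (compress2(out, &out_len, data, data_len, 1) != Z_OK)
    {
        fprintf(stderr, "compression failed\n");
        return 1;
    }
    printf("%lu -> %lu bytes\n", data_len, (uLong) out_len);
    free(out);
    return 0;
}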

The compression ratio achieved by DEFLATE/gzip/zlib is much better, though.
The snappy documentation states
 Typical compression ratios (based on the benchmark suite) are about
 1.5-1.7x for plain text, about 2-4x for HTML, and of course 1.0x for
 JPEGs, PNGs and other already-compressed data. Similar numbers for
 zlib in its fastest mode are 2.6-2.8x, 3-7x and 1.0x, respectively.

Here are a few numbers for LZO vs. gzip. Snappy should be comparable to
LZO - I tested LZO because I still had the command-line compressor lzop
lying around on my machine, whereas I'd have needed to download and compile
snappy first.
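
Snappy would also be easy to call from our side, since it ships a plain
C binding (snappy-c.h). A minimal sketch, assuming the library is
installed and you link with -lsnappy:

#include <stdio.h>
#include <stdlib.h>
#include <snappy-c.h>

int
main(void)
{
    const char data[] = "some payload worth compressing ...";
    size_t  data_len = sizeof(data);
    size_t  out_len = snappy_max_compressed_length(data_len);
    char   *out = malloc(out_len);

    /* out_len is updated to the actual compressed size */
    if (snappy_compress(data, data_len, out, &out_len) != SNAPPY_OK)
    {
        fprintf(stderr, "compression failed\n");
        return 1;
    }
    printf("%zu -> %zu bytes\n", data_len, out_len);
    free(out);
    return 0;
}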

$ dd if=/dev/random of=data bs=1m count=128
$ time gzip -1 < data > data.gz
real    0m6.189s
user    0m5.947s
sys    0m0.224s
$ time lzop < data > data.lzo
real    0m2.697s
user    0m0.295s
sys    0m0.224s
$ ls -lh data*
-rw-r--r--  1 fgp  staff   128M Jun 25 14:43 data
-rw-r--r--  1 fgp  staff   128M Jun 25 14:44 data.gz
-rw-r--r--  1 fgp  staff   128M Jun 25 14:44 data.lzo

$ dd if=/dev/zero of=zeros bs=1m count=128
$ time gzip -1 < zeros > zeros.gz
real    0m1.083s
user    0m1.019s
sys    0m0.052s
$ time lzop < zeros > zeros.lzo
real    0m0.186s
user    0m0.123s
sys    0m0.053s
$ ls -lh zeros*
-rw-r--r--  1 fgp  staff   128M Jun 25 14:47 zeros
-rw-r--r--  1 fgp  staff   572K Jun 25 14:47 zeros.gz
-rw-r--r--  1 fgp  staff   598K Jun 25 14:47 zeros.lzo

To summarize, on my 2.66 GHz Core2 Duo MacBook Pro, LZO compresses at about
350MB/s if the data is purely random, and at about 800MB/s if the data
compresses extremely well. (Numbers are based on user time, since that
indicates the CPU time consumed and ignores the IO overhead, which is
substantial.)
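
For reference, the same user-CPU measurement can be taken in-process
with POSIX getrusage(); a minimal sketch, where compress_fn is a
hypothetical stand-in for whatever compression call is being timed:

#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

/* User CPU seconds consumed by this process so far, i.e. the
 * "user" column reported by time(1). */
static double
user_seconds(void)
{
    struct rusage ru;

    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
}

/* Bracket a compression call and derive MB/s from CPU time
 * alone, ignoring IO overhead. */
static void
report_throughput(size_t nbytes, void (*compress_fn)(void))
{
    double  t0 = user_seconds();

    compress_fn();
    printf("%.0f MB/s\n", nbytes / 1e6 / (user_seconds() - t0));
}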

IMHO, the only compelling argument (and a very compelling one) to use
SSL compression was that it requires very little code on our side. We've
since discovered that it's not actually that simple, at least if we want
to support compression without authentication or encryption, and don't
want to restrict ourselves to using OpenSSL forever. So unless we give
up at least one of those requirements, the arguments for using
SSL compression are rather thin, I think.

best regards,
Florian Pflug


