Re: Zedstore - compressed in-core columnar storage

From: Heikki Linnakangas
Subject: Re: Zedstore - compressed in-core columnar storage
Msg-id: ba7fcda3-9b7a-3ccf-d486-bd02070d482f@iki.fi
In reply to: Re: Zedstore - compressed in-core columnar storage  (Justin Pryzby <pryzby@telsasoft.com>)
List: pgsql-hackers
On 20/08/2019 05:04, Justin Pryzby wrote:
>>> it looks like zedstore with lz4 gets ~4.6x for our largest
>>> customer's largest table.  zfs using compress=gzip-1 gives 6x
>>> compression across all their partitioned tables, and I'm surprised
>>> it beats zedstore.

I did a quick test, with 10 million random IP addresses, in text format. 
I loaded it into a zedstore table ("create table ips (ip text) using 
zedstore"), and poked around a little bit to see how the space is used.

postgres=# select lokey, nitems, ncompressed, totalsz, uncompressedsz, 
freespace from  pg_zs_btree_pages('ips') where attno=1 and level=0 limit 10;
  lokey | nitems | ncompressed | totalsz | uncompressedsz | freespace
-------+--------+-------------+---------+----------------+-----------
      1 |      4 |           4 |    6785 |           7885 |      1320
    537 |      5 |           5 |    7608 |           8818 |       492
   1136 |      4 |           4 |    6762 |           7888 |      1344
   1673 |      5 |           5 |    7548 |           8776 |       540
   2269 |      4 |           4 |    6841 |           7895 |      1256
   2807 |      5 |           5 |    7555 |           8784 |       540
   3405 |      5 |           5 |    7567 |           8772 |       524
   4001 |      4 |           4 |    6791 |           7899 |      1320
   4538 |      5 |           5 |    7596 |           8776 |       500
   5136 |      4 |           4 |    6750 |           7875 |      1360
(10 rows)

There's on average about 10% of free space on the pages. We're losing 
quite a bit of ground to ZFS compression right there. I'm sure there's 
some free space on the heap pages as well, but ZFS compression will 
squeeze it out.
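(That figure comes from something like this, averaged over the leaf 
pages; the 8192 assumes the default block size:)

postgres=# select sum(freespace) * 100.0 / (count(*) * 8192) as free_pct
           from pg_zs_btree_pages('ips')
           where attno = 1 and level = 0;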

The compression ratio is indeed not very good. I think one reason is 
that zedstore does LZ4 in relatively small chunks, while ZFS surely 
compresses large blocks in one go. Looking at the above, there are on 
average about 125 datums packed into each "item" (avg(hikey-lokey) / 
nitems).
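(Computed like this, assuming the hikey column from pg_zs_btree_pages, 
which I left out of the earlier query:)

postgres=# select avg((hikey - lokey)::numeric / nitems)
           from pg_zs_btree_pages('ips')
           where attno = 1 and level = 0;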
I did a quick test with the "lz4" command-line utility, compressing flat 
files containing random IP addresses.
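(Each input file was just N addresses, one per line; something like 
this would produce them, if you want to reproduce:)

postgres=# copy (select ip from ips limit 125) to '/tmp/125-ips.txt';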

$ lz4 /tmp/125-ips.txt
Compressed filename will be : /tmp/125-ips.txt.lz4
Compressed 1808 bytes into 1519 bytes ==> 84.02% 

$ lz4 /tmp/550-ips.txt
Compressed filename will be : /tmp/550-ips.txt.lz4
Compressed 7863 bytes into 6020 bytes ==> 76.56% 

$ lz4 /tmp/750-ips.txt
Compressed filename will be : /tmp/750-ips.txt.lz4
Compressed 10646 bytes into 8035 bytes ==> 75.47%

The first case is roughly what we do in zedstore currently: we compress 
about 125 datums as one chunk. The second case is roughly what we would 
get if we collected 8k worth of datums and compressed them all as one 
chunk. And the third case simulates allowing the input to be larger 
than 8k, so that the compressed chunk just fits on an 8k page. There's 
not much difference between the second and third case, but it's pretty 
clear that we're being hurt by splitting the input into such small 
chunks.

The downside of using a larger compression chunk size is that random 
access becomes more expensive. I need to give the on-disk format some 
more thought. That said, I actually don't feel too bad about the 
current compression ratio; perfect can be the enemy of good.

- Heikki


