Re: Zedstore - compressed in-core columnar storage
From | Heikki Linnakangas
---|---
Subject | Re: Zedstore - compressed in-core columnar storage
Msg-id | ba7fcda3-9b7a-3ccf-d486-bd02070d482f@iki.fi
In reply to | Re: Zedstore - compressed in-core columnar storage (Justin Pryzby <pryzby@telsasoft.com>)
List | pgsql-hackers
On 20/08/2019 05:04, Justin Pryzby wrote:
>>> it looks like zedstore with lz4 gets ~4.6x for our largest customer's
>>> largest table. zfs using compress=gzip-1 gives 6x compression across
>>> all their partitioned tables, and I'm surprised it beats zedstore.

I did a quick test with 10 million random IP addresses, in text format. I
loaded them into a zedstore table ("create table ips (ip text) using
zedstore"), and poked around a little bit to see how the space is used.

postgres=# select lokey, nitems, ncompressed, totalsz, uncompressedsz, freespace
           from pg_zs_btree_pages('ips') where attno=1 and level=0 limit 10;
 lokey | nitems | ncompressed | totalsz | uncompressedsz | freespace
-------+--------+-------------+---------+----------------+-----------
     1 |      4 |           4 |    6785 |           7885 |      1320
   537 |      5 |           5 |    7608 |           8818 |       492
  1136 |      4 |           4 |    6762 |           7888 |      1344
  1673 |      5 |           5 |    7548 |           8776 |       540
  2269 |      4 |           4 |    6841 |           7895 |      1256
  2807 |      5 |           5 |    7555 |           8784 |       540
  3405 |      5 |           5 |    7567 |           8772 |       524
  4001 |      4 |           4 |    6791 |           7899 |      1320
  4538 |      5 |           5 |    7596 |           8776 |       500
  5136 |      4 |           4 |    6750 |           7875 |      1360
(10 rows)

There's on average about 10% of free space on the pages. We're losing
quite a bit to ZFS compression right there: I'm sure there's some free
space on the heap pages as well, but ZFS compression will squeeze it out.

The compression ratio is indeed not very good. I think one reason is that
zedstore does LZ4 in relatively small chunks, while ZFS surely compresses
large blocks in one go. Looking at the above, there are on average about
125 datums packed into each "item" (avg(hikey - lokey) / nitems). I did a
quick test with the "lz4" command-line utility, compressing flat files
containing random IP addresses.
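The "about 125 datums per item" figure can be rechecked from the query
output above. A small sketch (not part of the original mail), using the
deltas between consecutive lokey values as a stand-in for hikey - lokey:

```python
# Recompute the average number of datums per item from the
# pg_zs_btree_pages output: avg((next_lokey - lokey) / nitems).
lokeys = [1, 537, 1136, 1673, 2269, 2807, 3405, 4001, 4538, 5136]
nitems = [4, 5, 4, 5, 4, 5, 5, 4, 5]  # nitems of the first nine pages

per_item = [(hi - lo) / n for lo, hi, n in zip(lokeys, lokeys[1:], nitems)]
avg = sum(per_item) / len(per_item)
print(round(avg))  # → 126, i.e. roughly 125 datums per compressed chunk
```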
$ lz4 /tmp/125-ips.txt
Compressed filename will be : /tmp/125-ips.txt.lz4
Compressed 1808 bytes into 1519 bytes ==> 84.02%
$ lz4 /tmp/550-ips.txt
Compressed filename will be : /tmp/550-ips.txt.lz4
Compressed 7863 bytes into 6020 bytes ==> 76.56%
$ lz4 /tmp/750-ips.txt
Compressed filename will be : /tmp/750-ips.txt.lz4
Compressed 10646 bytes into 8035 bytes ==> 75.47%

The first case is roughly what we do in zedstore currently: we compress
about 125 datums as one chunk. The second case is roughly what we would
get if we collected an 8k page's worth of datums and compressed them all
as one chunk. And the third case simulates allowing the input to be
larger than 8k, so that the compressed chunk just fits on an 8k page.
There's not much difference between the second and third case, but it's
pretty clear that we're being hurt by splitting the input into such small
chunks.

The downside of using a larger compression chunk size is that random
access becomes more expensive. Need to give the on-disk format some more
thought. Although I actually don't feel too bad about the current
compression ratio; perfect can be the enemy of good.

- Heikki
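The chunk-size effect shown by the lz4 runs above can be reproduced in a
few lines. A sketch, using Python's stdlib zlib in place of LZ4 (a
different codec, but the effect is the same: a larger input block gives
the compressor more redundancy to exploit and amortizes per-chunk
overhead):

```python
# Compare compressing ~125 random IPs per chunk (zedstore's current item
# size) against compressing the whole stream as one block.
import random
import zlib

random.seed(42)

def random_ip():
    return ".".join(str(random.randint(0, 255)) for _ in range(4))

lines = [random_ip() + "\n" for _ in range(10_000)]
data = "".join(lines).encode()

# Case 1: many small chunks of ~125 datums, compressed independently.
chunked = sum(len(zlib.compress("".join(lines[i:i + 125]).encode()))
              for i in range(0, len(lines), 125))

# Case 2: one large block, compressed in one go.
whole = len(zlib.compress(data))

print(f"chunked: {chunked} bytes, whole: {whole} bytes")
# The whole-block total comes out noticeably smaller than the chunked total.
```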