Re: jsonb format is pessimal for toast compression

From: Marti Raudsepp
Subject: Re: jsonb format is pessimal for toast compression
Msg-id: CABRT9RDKfOF7+8gonQggcPXSvu8TwXOTGJKvV4=u=SHBq8Dspg@mail.gmail.com
In reply to: Re: jsonb format is pessimal for toast compression  (Hannu Krosing <hannu@2ndQuadrant.com>)
List: pgsql-hackers

On Fri, Aug 8, 2014 at 10:50 PM, Hannu Krosing <hannu@2ndquadrant.com> wrote:
> How hard and how expensive would it be to teach pg_lzcompress to
> apply a delta filter on suitable data ?
>
> So that instead of integers their deltas will be fed to the "real"
> compressor

Has anyone given this more thought? I know this might not be 9.4
material, but to me it sounds like the most promising approach, if
it's workable. This isn't a made-up idea: the 7z and LZMA formats
also have an optional delta filter.

Of course with JSONB the problem is figuring out which parts to apply
the delta filter to, and which parts not.

This would also help with integer arrays containing, for example,
foreign key values that reference a serial column. There's bound to be
some redundancy, as nearby serial values are likely to end up close
together. In one of my past projects we used to store large arrays of
integer fkeys, deliberately sorted for duplicate elimination.

For an ideal-case comparison, intar2 could be as small as intar1 when
compressed with a 4-byte-wide delta filter:

create table intar1 as
  select array(select 1::int from generate_series(1,1000000)) a;
create table intar2 as
  select array(select generate_series(1,1000000)::int) a;

In PostgreSQL 9.3 the sizes are:
select pg_column_size(a) from intar1;         45810
select pg_column_size(a) from intar2;       4000020

So a factor of 87 difference.
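
To illustrate why the delta filter closes that gap, here is a minimal
Python sketch. It is not PostgreSQL code: zlib stands in for
pg_lzcompress, the arrays are packed as bare little-endian 32-bit
integers without any array header, and the helper names (delta_encode,
delta_decode, pack_int32) are made up for this example. The point is
only that the 4-byte deltas of 1..1000000 are the same byte stream as
one million copies of 1.

```python
import struct
import zlib

WIDTH_BITS = 32
MASK = (1 << WIDTH_BITS) - 1  # values are treated as unsigned 32-bit


def delta_encode(values):
    """Replace each value with its difference from the previous one (mod 2**32)."""
    out = []
    prev = 0
    for v in values:
        out.append((v - prev) & MASK)
        prev = v
    return out


def delta_decode(deltas):
    """Inverse of delta_encode: a running sum (mod 2**32)."""
    out = []
    acc = 0
    for d in deltas:
        acc = (acc + d) & MASK
        out.append(acc)
    return out


def pack_int32(values):
    """Pack the values as consecutive little-endian 4-byte unsigned integers."""
    return struct.pack("<%dI" % len(values), *values)


n = 1_000_000
constant = [1] * n               # same contents as intar1
serial = list(range(1, n + 1))   # same contents as intar2

# zlib is an assumption here; it is a stand-in for pglz, not the same algorithm.
raw_constant = len(zlib.compress(pack_int32(constant)))
raw_serial = len(zlib.compress(pack_int32(serial)))
filtered_serial = len(zlib.compress(pack_int32(delta_encode(serial))))

print("constant array, no filter:  ", raw_constant)
print("serial array, no filter:    ", raw_serial)
print("serial array, delta filter: ", filtered_serial)
```

The absolute numbers will differ from pg_column_size() output because
the compressor is different, but the filtered serial array compresses
to exactly the same size as the constant one, since the two inputs to
the compressor are byte-for-byte identical.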

Regards,
Marti


