Re: [HACKERS] Custom compression methods

Поиск
Список
Период
Сортировка
От Ildus Kurbangaliev
Тема Re: [HACKERS] Custom compression methods
Дата
Msg-id 20170912175505.4afa11fd@wp.localdomain
обсуждение исходный текст
Ответ на [HACKERS] Custom compression methods  (Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>)
Ответы Re: [HACKERS] Custom compression methods  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Re: [HACKERS] Custom compression methods  (Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>)
Список pgsql-hackers
On Thu, 7 Sep 2017 19:42:36 +0300
Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru> wrote:

> Hello hackers!
> 
> I've attached a patch that implements custom compression
> methods. This patch is based on Nikita Glukhov's code (which he hasn't
> publish in mailing lists) for jsonb compression. This is early but
> working version of the patch, and there are still few fixes and
> features that should be implemented (like pg_dump support and support
> of compression options for types), and it requires more testing. But
> I'd like to get some feedback at the current stage first.
> 
> There's been a proposal [1] of Alexander Korotkov and some discussion
> about custom compression methods before. This is an implementation of
> per-datum compression. Syntax is similar to the one in proposal but
> not the same.
> 
> Syntax:
> 
> CREATE COMPRESSION METHOD <cmname> HANDLER <compression_handler>;
> DROP COMPRESSION METHOD <cmname>;
> 
> Compression handler is a function that returns a structure containing
> compression routines:
> 
> - configure - function called when the compression method applied to
>   an attribute
> - drop - called when the compression method is removed from an
> attribute
> - compress - compress function
> - decompress - decompress function
> 
> User can create compressed columns with the commands below:
> 
> CREATE TABLE t(a tsvector COMPRESSED <cmname> WITH <options>);
> ALTER TABLE t ALTER COLUMN a SET COMPRESSED <cmname> WITH <options>; 
> ALTER TABLE t ALTER COLUMN a SET NOT COMPRESSED;
> 
> Also there is syntax of binding compression methods to types:
> 
> ALTER TYPE <type> SET COMPRESSED <cmname>;
> ALTER TYPE <type> SET NOT COMPRESSED;
> 
> There are two new tables in the catalog, pg_compression and
> pg_compression_opt. pg_compression is used as storage of compression
> methods, and pg_compression_opt is used to store specific compression
> options for particular column.
> 
> When user binds a compression method to some column a new record in
> pg_compression_opt is created and all further attribute values will
> contain compression options Oid while old values will remain
> unchanged. And when we alter a compression method for
> the attribute it won't change previous record in pg_compression_opt.
> Instead it'll create a new one and new values will be stored
> with new Oid. That way there is no need of recompression of the old
> tuples. And also tuples containing compressed datums can be copied to
> other tables so records in pg_compression_opt shouldn't be removed. In
> the current patch they can be removed with DROP COMPRESSION METHOD
> CASCADE, but after that decompression won't be possible on compressed
> tuples. Maybe CASCADE should keep compression options.
> 
> I haven't changed the base logic of working with compressed datums. It
> means that custom compressed datums behave exactly the same as current
> LZ compressed datums, and the logic differs only in
> toast_compress_datum and toast_decompress_datum.
> 
> This patch doesn't break backward compability and should work
> seamlessly with older version of database. I used one of two free
> bits in `va_rawsize` from `varattrib_4b->va_compressed` as flag of
> custom compressed datums. Also I renamed it to `va_info` since it
> contains not only rawsize now.
> 
> The patch also includes custom compression method for tsvector which
> is used in tests.
> 
> [1]
> https://www.postgresql.org/message-id/CAPpHfdsdTA5uZeq6MNXL5ZRuNx%2BSig4ykWzWEAfkC6ZKMDy6%3DQ%40mail.gmail.com

Attached rebased version of the patch. Added support of pg_dump, the
code was simplified, and a separate cache for compression options was
added.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Chris Travers
Дата:
Сообщение: [HACKERS] pg_rewind proposed scope and interface changes
Следующее
От: Andreas Karlsson
Дата:
Сообщение: Re: [HACKERS] Patches that don't apply or don't compile: 2017-09-12