Re: Compress ReorderBuffer spill files using LZ4

From: Julien Tachoires
Subject: Re: Compress ReorderBuffer spill files using LZ4
Msg-id: CAFEQCbHJvU_paNcmWw2A0tmSGt8yNgjT4LWO7DFH41V=YSmqZg@mail.gmail.com
In reply to: Re: Compress ReorderBuffer spill files using LZ4  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List: pgsql-hackers
On Thu, Jun 6, 2024 at 07:24, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2024-Jun-06, Amit Kapila wrote:
>
> > On Thu, Jun 6, 2024 at 4:28 PM Julien Tachoires <julmon@gmail.com> wrote:
> > >
> > > When the content of a large transaction (size exceeding
> > > logical_decoding_work_mem) and its sub-transactions has to be
> > > reordered during logical decoding, then, all the changes are written
> > > on disk in temporary files located in pg_replslot/<slot_name>.
> > > Decoding very large transactions by multiple replication slots can
> > > lead to disk space saturation and high I/O utilization.
>
> I like the general idea of compressing the output of logical decoding.
> It's not so clear to me that we only want to do so for spilling to disk;
> for instance, if the two nodes communicate over a slow network, it may
> even be beneficial to compress when streaming, so to this question:
>
> > Why can't one use 'streaming' option to send changes to the client
> > once it reaches the configured limit of 'logical_decoding_work_mem'?
>
> I would say that streaming doesn't necessarily have to mean we don't
> want compression, because for some users it might be beneficial.

Interesting idea. I will try to evaluate how to compress/decompress the
data sent via streaming, and how good the compression ratio would be.

> I think a GUC would be a good idea.  Also, what if for whatever reason
> you want a different compression algorithm or different compression
> parameters?  Looking at the existing compression UI we offer in
> pg_basebackup, perhaps you could add something like this:
>
> compress_logical_decoding = none
> compress_logical_decoding = lz4:42
> compress_logical_decoding = spill-zstd:99
>
> "none" says to never use compression (perhaps should be the default),
> "lz4:42" says to use lz4 with parameters 42 on both spilling and
> streaming, and "spill-zstd:99" says to use Zstd with parameter 99 but
> only for spilling to disk.

I agree: if the server was compiled with support for multiple
compression libraries, users should be able to choose which one they
want to use.
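
To make the proposed value syntax concrete, here is a minimal sketch of how
a setting such as "none", "lz4:42" or "spill-zstd:99" could be parsed. The
type and function names (LdCompressSpec, parse_ld_compress_spec) are
hypothetical and are not part of the patch or of PostgreSQL; rejecting
"spill-none" and a parameter on "none" is an assumption, not something
decided in this thread.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names: nothing below exists in PostgreSQL. */
typedef enum
{
    LD_COMPRESS_NONE,
    LD_COMPRESS_LZ4,
    LD_COMPRESS_ZSTD
} LdCompressAlgo;

typedef struct
{
    LdCompressAlgo algo;
    bool        spill_only;     /* value started with "spill-" */
    bool        has_level;      /* a ":N" parameter was given */
    int         level;
} LdCompressSpec;

/* Parse "none", "lz4", "lz4:42", "spill-zstd:99". Returns false on error. */
static bool
parse_ld_compress_spec(const char *value, LdCompressSpec *spec)
{
    const char *p = value;
    const char *colon;
    size_t      namelen;

    spec->algo = LD_COMPRESS_NONE;
    spec->spill_only = false;
    spec->has_level = false;
    spec->level = 0;

    /* Optional "spill-" prefix restricts compression to spill files. */
    if (strncmp(p, "spill-", 6) == 0)
    {
        spec->spill_only = true;
        p += 6;
    }

    colon = strchr(p, ':');
    namelen = colon ? (size_t) (colon - p) : strlen(p);

    if (namelen == 4 && strncmp(p, "none", 4) == 0)
        spec->algo = LD_COMPRESS_NONE;
    else if (namelen == 3 && strncmp(p, "lz4", 3) == 0)
        spec->algo = LD_COMPRESS_LZ4;
    else if (namelen == 4 && strncmp(p, "zstd", 4) == 0)
        spec->algo = LD_COMPRESS_ZSTD;
    else
        return false;

    /* Assumption: "none" takes neither a prefix nor a parameter. */
    if (spec->algo == LD_COMPRESS_NONE && (spec->spill_only || colon != NULL))
        return false;

    if (colon != NULL)
    {
        char       *end;
        long        v;

        if (colon[1] == '\0')
            return false;       /* "lz4:" with nothing after the colon */
        errno = 0;
        v = strtol(colon + 1, &end, 10);
        if (errno == ERANGE || *end != '\0' || v < INT_MIN || v > INT_MAX)
            return false;
        spec->has_level = true;
        spec->level = (int) v;
    }
    return true;
}
```

The parser only validates the syntax; checking that the chosen library was
compiled in, and that the parameter is in the range the library accepts,
would be a separate step.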

> (I don't mean to say that you should implement Zstd compression with
> this patch, only that you should choose the implementation so that
> adding Zstd support (or whatever) later is just a matter of adding some
> branches here and there.  With the current #ifdef you propose, it's hard
> to do that.  Maybe separate the parts that depend on the specific
> algorithm to algorithm-agnostic functions.)

Makes sense, I will rework the patch along those lines.
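
One possible shape for such algorithm-agnostic functions is a small table of
function pointers, so that the spill code calls through the table and each
algorithm lives in its own entry. This is only a sketch under assumptions:
SpillCompressor, find_compressor and the "none" pass-through entry are
hypothetical names, not code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface; names are illustrative, not from the patch. */
typedef struct SpillCompressor
{
    const char *name;
    /* Upper bound on the output size for src_len input bytes. */
    size_t      (*bound) (size_t src_len);
    /* Both return false on failure and set *out_len on success. */
    bool        (*compress) (const char *src, size_t src_len,
                             char *dst, size_t dst_cap, size_t *out_len);
    bool        (*decompress) (const char *src, size_t src_len,
                               char *dst, size_t dst_cap, size_t *out_len);
} SpillCompressor;

/* "none": pass-through entry, used for both directions. */
static size_t
none_bound(size_t src_len)
{
    return src_len;
}

static bool
none_copy(const char *src, size_t src_len,
          char *dst, size_t dst_cap, size_t *out_len)
{
    if (src_len > dst_cap)
        return false;
    memcpy(dst, src, src_len);
    *out_len = src_len;
    return true;
}

static const SpillCompressor compressors[] = {
    {"none", none_bound, none_copy, none_copy},

    /*
     * An LZ4 entry would go here under #ifdef USE_LZ4, wrapping
     * LZ4_compressBound(), LZ4_compress_default() and
     * LZ4_decompress_safe(); a Zstd entry could be added the same way.
     */
};

/* Look up a compressor by name; returns NULL if it is not available. */
static const SpillCompressor *
find_compressor(const char *name)
{
    size_t      i;

    for (i = 0; i < sizeof(compressors) / sizeof(compressors[0]); i++)
    {
        if (strcmp(compressors[i].name, name) == 0)
            return &compressors[i];
    }
    return NULL;
}
```

With this layout the #ifdef blocks are confined to the table entries, and
the callers that write and read spill files do not need to know which
algorithm is in use.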

Thank you!

Regards,

JT


