Re: Compress ReorderBuffer spill files using LZ4

From: Tomas Vondra
Subject: Re: Compress ReorderBuffer spill files using LZ4
Date:
Msg-id: 87060757-4936-4e3b-8d49-418f71d07eb3@enterprisedb.com
In response to: Re: Compress ReorderBuffer spill files using LZ4  (Julien Tachoires <julmon@gmail.com>)
Responses: Re: Compress ReorderBuffer spill files using LZ4
Re: Compress ReorderBuffer spill files using LZ4
List: pgsql-hackers
On 7/15/24 20:50, Julien Tachoires wrote:
> Hi,
> 
> On Fri, Jun 7, 2024 at 06:18, Julien Tachoires <julmon@gmail.com> wrote:
>>
>> On Fri, Jun 7, 2024 at 05:59, Tomas Vondra
>> <tomas.vondra@enterprisedb.com> wrote:
>>>
>>> On 6/6/24 12:58, Julien Tachoires wrote:
>>>> ...
>>>>
>>>> When compiled with LZ4 support (--with-lz4), this patch enables data
>>>> compression/decompression of these temporary files. Each transaction
>>>> change that must be written on disk (ReorderBufferDiskChange) is now
>>>> compressed and encapsulated in a new structure.
>>>>
>>>
>>> I'm a bit confused, but why tie this to having lz4? Why shouldn't this
>>> be supported even for pglz, or whatever algorithms we add in the future?
>>
>> That's right, reworking this patch in that sense.
> 
> Please find a new version of this patch adding support for LZ4, pglz
> and ZSTD. It introduces the new GUC logical_decoding_spill_compression
> which is used to set the compression method. In order to stay aligned
> with the other server side GUCs related to compression methods
> (wal_compression, default_toast_compression), the compression level is
> not exposed to users.
> 

Sounds reasonable. I wonder if it might be useful to allow specifying
the compression level in those places, but that's clearly not something
this patch needs to do.

> The last patch of this set is still in WIP, it adds the machinery
> required for setting the compression methods as a subscription option:
> CREATE SUBSCRIPTION ... WITH (spill_compression = ...);
> I think there is a major problem with this approach: the logical
> decoding context is tied to one replication slot, but multiple
> subscriptions can use the same replication slot. How should this work
> if 2 subscriptions want to use the same replication slot but different
> compression methods?
> 

Do we really support multiple subscriptions sharing the same slot? I
don't think we do, but maybe I'm missing something.

> At this point, compression is only available for the changes spilled
> on disk. It is still not clear to me if the compression of data
> transiting through the streaming protocol should be addressed by this
> patch set or by another one. Thoughts?
> 

I'd stick to only compressing the data spilled to disk. It might be
useful to compress the streamed data too, but why shouldn't we compress
the regular (non-streamed) transactions too? Yeah, it's more efficient
to compress larger chunks, but we can fit quite large transactions into
logical_decoding_work_mem without spilling.

FWIW I'd expect that to be handled at the libpq level - there's already
a patch for that, but I haven't checked if it would handle this. But
maybe more importantly, I think compressing streamed data might need to
handle some sort of negotiation of the compression algorithm, which
seems fairly complex.

To conclude, I'd leave this out of scope for this patch.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


