Re: Compress ReorderBuffer spill files using LZ4

From Amit Kapila
Subject Re: Compress ReorderBuffer spill files using LZ4
Msg-id CAA4eK1+bJmDMy4ki7_3ETgMbvsnfkPgLb6fUx6qu__Ws3Sg7kg@mail.gmail.com
In reply to Re: Compress ReorderBuffer spill files using LZ4  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Compress ReorderBuffer spill files using LZ4
List pgsql-hackers
On Tue, Jul 16, 2024 at 7:31 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 7/16/24 14:52, Amit Kapila wrote:
> > On Tue, Jul 16, 2024 at 12:58 AM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >>
> >> FWIW I'd expect that to be handled at the libpq level - there's already
> >> a patch for that, but I haven't checked if it would handle this. But
> >> maybe more importantly, I think compressing streamed data might need to
> >> handle some sort of negotiation of the compression algorithm, which
> >> seems fairly complex.
> >>
> >> To conclude, I'd leave this out of scope for this patch.
> >>
> >
> > Your point sounds reasonable to me. OTOH, if we want to support
> > compression for the spill case, shouldn't we ask how often such an
> > option would be needed? Users currently have the option to stream
> > large transactions, for parallel apply or otherwise, in which case no
> > spilling is required. I feel sooner or later we will make such
> > behavior (streaming=parallel) the default, and then spilling should
> > happen in very few cases. Is it worth adding this new option and GUC
> > if that is true?
> >
>
> I don't know, but streaming is 'off' by default, and I'm not aware of
> any proposals to change this, so when you suggest "sooner or later"
> we'll change this, I'd probably bet on "later or never".
>
> I haven't been following the discussions about parallel apply very
> closely, but my impression from dealing with similar stuff in other
> tools is that it's rather easy to run into issues with some workloads,
> which just makes me more skeptical about "streaming=parallel" by default.
> But as I said, I'm out of the loop so I may be wrong ...
>

It is difficult to say whether enabling it by default would cause
issues, but so far we haven't seen many reports of problems with the
streaming = 'parallel' option. That could be because not many people
enable it in their workloads. We can probably find out by enabling it
by default.

> As for whether the GUC is needed, I don't know. I guess we might do the
> same thing we do for streaming - we don't have a GUC to enable this, but
> we default to 'off' and the client has to request that when opening the
> replication connection. So it'd be specified at the subscription level,
> more or less.
>
> But then how would we specify compression for cases that invoke decoding
> directly by pg_logical_slot_get_changes()? Through options?
>

If we decide to go with this, then yes, that is one way. Another
possibility is to make it a property of the slot, so that
pg_create_logical_replication_slot() could take a new parameter. We
could even think of inventing a new API to alter a slot's properties
if we decide to go this route.

> BTW if we specify this at subscription level, will it be possible to
> change the compression method?
>

This needs analysis, but offhand I can't see any problems with it.

--
With Regards,
Amit Kapila.


