Re: [PATCH] Compression dictionaries for JSONB

From: Andres Freund
Subject: Re: [PATCH] Compression dictionaries for JSONB
Date:
Msg-id: 20230204133123.mv6rkxloxnkfakww@alap3.anarazel.de
In response to: Re: [PATCH] Compression dictionaries for JSONB  (Pavel Borisov <pashkin.elfe@gmail.com>)
Responses: Re: [PATCH] Compression dictionaries for JSONB  (Aleksander Alekseev <aleksander@timescale.com>)
List: pgsql-hackers
Hi,

On 2023-02-03 14:39:31 +0400, Pavel Borisov wrote:
> On Fri, 3 Feb 2023 at 14:04, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > This patch came up at the developer meeting in Brussels yesterday.
> > https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2023_Developer_Meeting#v16_Patch_Triage
> >
> > First, as far as I can tell, there is a large overlap between this patch
> > and "Pluggable toaster" patch.  The approaches are completely different,
> > but they seem to be trying to fix the same problem: the fact that the
> > default TOAST stuff isn't good enough for JSONB.  I think before asking
> > developers of both patches to rebase over and over, we should take a
> > step back and decide which one we dislike the less, and how to fix that
> > one into a shape that we no longer dislike.
> >
> > (Don't get me wrong.  I'm all for having better JSONB compression.
> > However, for one thing, both patches require action from the user to set
> > up a compression mechanism by hand.  Perhaps it would be even better if
> > the system determines that a JSONB column uses a different compression
> > implementation, without the user doing anything explicitly; or maybe we
> > want to give the user *some* agency for specific columns if they want,
> > but not force them into it for every single jsonb column.)
> >
> > Now, I don't think either of these patches can get to a committable
> > shape in time for v16 -- even assuming we had an agreed design, which
> > AFAICS we don't.  But I encourage people to continue discussion and try
> > to find consensus.
> >
> Hi, Alvaro!
>
> I'd like to give my +1 in favor of implementing a pluggable toaster
> interface first. Then we can work on custom toast engines for
> different scenarios, not limited to JSON(b).

I don't think the approach in either of these threads is
promising. They add a lot of complexity and require implementation effort
for each type, manual work by the administrator for each column, etc.


One of the major justifications for work in this area is the cross-row
redundancy for types like jsonb. I think there are ways to improve that
across types, instead of requiring per-type work. We could e.g. use
compression dictionaries to achieve much higher compression
rates. Training of the dictionary could even happen automatically by
analyze, if we wanted to.  It's unlikely to get you everything a very
sophisticated per-type compression is going to give you, but it's going
to be a lot better than today, and it's going to work across types.
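
To make the cross-row redundancy point concrete, here is a minimal sketch
in Python. It is only an illustration, not anything from the patches or
from PostgreSQL: it uses zlib's preset-dictionary support (the zdict
argument) as a stand-in for whatever algorithm would actually be used,
and the "training" step is simply a concatenation of sampled rows. The
row layout and the names compress, decompress and sample_dict are
hypothetical.

```python
import json
import zlib


def compress(payload: bytes, zdict: bytes = b"") -> bytes:
    """Compress one value on its own, optionally primed with a shared dictionary."""
    c = zlib.compressobj(level=6, zdict=zdict) if zdict else zlib.compressobj(level=6)
    return c.compress(payload) + c.flush()


def decompress(blob: bytes, zdict: bytes = b"") -> bytes:
    """Inverse of compress(); the same dictionary must be supplied."""
    d = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return d.decompress(blob) + d.flush()


# Hypothetical rows: the same keys and many of the same values repeat in
# every row, which per-value compression cannot exploit.
rows = [
    json.dumps(
        {
            "customer_id": i,
            "status": "active",
            "shipping_address": {"country": "DE", "postal_code": str(10000 + i)},
            "tags": ["new", "priority"],
        }
    ).encode()
    for i in range(200)
]

# Stand-in for dictionary training: concatenate a small sample of rows.
# (A real trainer would pick the most useful substrings; this is an assumption.)
sample_dict = b"".join(rows[:20])

# Compare total compressed size over the rows that were not sampled.
plain = sum(len(compress(r)) for r in rows[20:])
with_dict = sum(len(compress(r, sample_dict)) for r in rows[20:])

print(f"per-row, no dictionary:   {plain} bytes")
print(f"per-row, with dictionary: {with_dict} bytes")
```

Each row is still compressed and decompressed independently, so random
access to a single value is preserved; only the dictionary is shared.
Nothing in the sketch is specific to JSON, which is the "works across
types" property described above.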


> For example, I find it useful to decrease WAL overhead on the
> replication of TOAST updates. It is quite a pain now that we need to
> rewrite all toast chunks at any TOAST update. Also, it could be good
> for implementing undo access methods etc., etc. Now, these kinds of
> activities in extensions face the fact that core has only one TOAST
> which is quite inefficient in many scenarios.
>
> So overall I value the extensibility part of this activity as the most
> important one and will be happy to see it completed first.

I think the complexity will just make improving toast in-core harder,
without much benefit.


Regards,

Andres


