Discussion: TOAST field serialization/deserialization performance
Hello Gents,
I have a few queries regarding TOAST field serialization/deserialization performance.
The use case I am trying to solve here is to have millions of partitions and aggregate the data in an array field.
Let me know if more input is required.
Thanks,
Piyush Katariya
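For context, a minimal sketch of the kind of schema this use case seems to describe; the table, column, and partition names are hypothetical, not taken from the thread. One detail worth noting: the built-in TOAST column compression methods are pglz and lz4, so zstd is not currently an option at the column level.

    -- Hypothetical partitioned table with a JSONB array column.
    -- COMPRESSION applies when the value is compressed inline or moved to TOAST.
    CREATE TABLE metrics (
        tenant_id bigint NOT NULL,
        bucket    date   NOT NULL,
        samples   jsonb COMPRESSION lz4   -- built-in choices: pglz, lz4
    ) PARTITION BY LIST (tenant_id);

    -- One partition per tenant (the thread mentions millions of partitions).
    CREATE TABLE metrics_t1 PARTITION OF metrics FOR VALUES IN (1);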
On Wed, 2023-07-26 at 18:15 +0530, Piyush Katariya wrote:
> I have a few queries regarding TOAST field serialization/deserialization performance.
>
> The use case I am trying to solve here is to have millions of partitions and aggregate the data in an array field.
>
> I wish to know: if I declare a certain column in a table as an "array of UDT/JSONB" and enable
> either lz4 or zstd compression on it, does appending or prepending to that array, or even
> changing the intermediate fields of UDT/JSONB objects in that array, have a runtime cost
> of full array data deserialization every single time? If I perform any UPDATE operation
> on its elements or add/remove elements at any position, does PG rewrite the new
> version of the column value regardless of its size?

Updating even a small part of a large JSONB value requires that the entire thing is
read and written, causing a lot of data churn.
This is inefficient, and you shouldn't use large JSONB values if you plan to do that.

If the data have a regular structure, use a regular relational data model.
Otherwise, one idea might be to split the JSONB in several parts and store each
of those parts in a different table row. That would reduce the impact.

Yours,
Laurenz Albe
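To make that concrete, a small sketch against the hypothetical metrics table above: any change to a single element still detoasts and decompresses the whole samples value, builds the new array in memory, and writes it back in full as part of a new row version.

    -- Appending one element rewrites the whole 'samples' datum.
    UPDATE metrics
    SET    samples = samples || '[{"ts": 1690000000, "v": 42}]'::jsonb
    WHERE  tenant_id = 1 AND bucket = '2023-07-26';

    -- Changing one nested field behaves the same way.
    UPDATE metrics
    SET    samples = jsonb_set(samples, '{0,v}', '43'::jsonb)
    WHERE  tenant_id = 1 AND bucket = '2023-07-26';

    -- pg_column_size() reports the stored (possibly compressed) size of the
    -- whole datum, which is what every such UPDATE has to read and rewrite.
    SELECT pg_column_size(samples) FROM metrics WHERE tenant_id = 1;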
Thanks for the feedback. Appreciate it.
On Thu, 27 Jul, 2023, 01:09 Laurenz Albe, <laurenz.albe@cybertec.at> wrote:
On Wed, 2023-07-26 at 18:15 +0530, Piyush Katariya wrote:
> I have a few queries regarding TOAST field serialization/deserialization performance.
>
> The use case I am trying to solve here is to have millions of partitions and aggregate the data in an array field.
>
> I wish to know: if I declare a certain column in a table as an "array of UDT/JSONB" and enable
> either lz4 or zstd compression on it, does appending or prepending to that array, or even
> changing the intermediate fields of UDT/JSONB objects in that array, have a runtime cost
> of full array data deserialization every single time? If I perform any UPDATE operation
> on its elements or add/remove elements at any position, does PG rewrite the new
> version of the column value regardless of its size?
Updating even a small part of a large JSONB value requires that the entire thing is
read and written, causing a lot of data churn.
This is inefficient, and you shouldn't use large JSONB values if you plan to do that.
If the data have a regular structure, use a regular relational data model.
Otherwise, one idea might be to split the JSONB in several parts and store each
of those parts in a different table row. That would reduce the impact.
Yours,
Laurenz Albe
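As an illustration of the "split the JSONB in several parts" suggestion, a rough sketch with hypothetical names and an assumed chunking key:

    -- One row per chunk instead of one large JSONB per key: appending new
    -- data becomes an INSERT of a small chunk, and an UPDATE only has to
    -- rewrite the chunk it touches, not the whole document.
    CREATE TABLE metrics_chunked (
        tenant_id bigint NOT NULL,
        bucket    date   NOT NULL,
        chunk_no  int    NOT NULL,
        samples   jsonb COMPRESSION lz4,
        PRIMARY KEY (tenant_id, bucket, chunk_no)
    );

    -- Append by inserting a new chunk.
    INSERT INTO metrics_chunked
    VALUES (1, '2023-07-26', 7, '[{"ts": 1690000000, "v": 42}]');

    -- Reassemble the full array only when it is actually needed.
    SELECT jsonb_agg(elem ORDER BY chunk_no, ord) AS samples
    FROM   metrics_chunked,
           LATERAL jsonb_array_elements(samples) WITH ORDINALITY AS e(elem, ord)
    WHERE  tenant_id = 1 AND bucket = '2023-07-26';

The trade-off is that reads which need the whole array must reassemble it, so the chunk size should be chosen to match the update and read patterns.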