Discussion: TOAST field serialization/deserialization performance
Hello Gents,
I have a few queries regarding TOAST field serialization/deserialization performance.
The use case I am trying to solve here is to have millions of partitions and aggregate the data in an array field.
Let me know if more input is required.
Thanks,
Piyush Katariya
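For context, a minimal sketch of the kind of schema this use case seems to describe; the table, column, and partition names are hypothetical, not taken from the thread. One detail worth noting: the built-in TOAST column compression methods are pglz and lz4, so zstd is not currently an option at the column level.

    -- Hypothetical partitioned table with a JSONB array column.
    -- COMPRESSION applies when the value is compressed inline or moved to TOAST.
    CREATE TABLE metrics (
        tenant_id bigint NOT NULL,
        bucket    date   NOT NULL,
        samples   jsonb COMPRESSION lz4   -- built-in choices: pglz, lz4
    ) PARTITION BY LIST (tenant_id);

    -- One partition per tenant (the thread mentions millions of partitions).
    CREATE TABLE metrics_t1 PARTITION OF metrics FOR VALUES IN (1);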
On Wed, 2023-07-26 at 18:15 +0530, Piyush Katariya wrote:
> I have a few queries regarding TOAST field serialization/deserialization performance.
>
> The use case I am trying to solve here is to have millions of partitions and aggregate the data in an array field.
>
> I wish to know: if I declare a certain column in a table as an "array of UDT/JSONB" and enable
> either lz4 or zstd compression on it, does appending or prepending to that array, or even
> changing the intermediate fields of UDT/JSONB objects in that array, have a runtime cost
> of full array data deserialization every single time? If I perform any UPDATE operation
> on its elements or add/remove elements at any position, does PG rewrite the new
> version of the column value regardless of its size?

Updating even a small part of a large JSONB value requires that the entire thing is
read and written, causing a lot of data churn.
This is inefficient, and you shouldn't use large JSONB values if you plan to do that.

If the data have a regular structure, use a regular relational data model.
Otherwise, one idea might be to split the JSONB in several parts and store each
of those parts in a different table row. That would reduce the impact.

Yours,
Laurenz Albe
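To make that concrete, a small sketch against the hypothetical metrics table above: any change to a single element still detoasts and decompresses the whole samples value, builds the new array in memory, and writes it back in full as part of a new row version.

    -- Appending one element rewrites the whole 'samples' datum.
    UPDATE metrics
    SET    samples = samples || '[{"ts": 1690000000, "v": 42}]'::jsonb
    WHERE  tenant_id = 1 AND bucket = '2023-07-26';

    -- Changing one nested field behaves the same way.
    UPDATE metrics
    SET    samples = jsonb_set(samples, '{0,v}', '43'::jsonb)
    WHERE  tenant_id = 1 AND bucket = '2023-07-26';

    -- pg_column_size() reports the stored (possibly compressed) size of the
    -- whole datum, which is what every such UPDATE has to read and rewrite.
    SELECT pg_column_size(samples) FROM metrics WHERE tenant_id = 1;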
Thanks for the feedback. Appreciate it.
On Thu, 27 Jul, 2023, 01:09 Laurenz Albe, <laurenz.albe@cybertec.at> wrote:
On Wed, 2023-07-26 at 18:15 +0530, Piyush Katariya wrote:
> I have a few queries regarding TOAST field serialization/deserialization performance.
>
> The use case I am trying to solve here is to have millions of partitions and aggregate the data in an array field.
>
> I wish to know: if I declare a certain column in a table as an "array of UDT/JSONB" and enable
> either lz4 or zstd compression on it, does appending or prepending to that array, or even
> changing the intermediate fields of UDT/JSONB objects in that array, have a runtime cost
> of full array data deserialization every single time? If I perform any UPDATE operation
> on its elements or add/remove elements at any position, does PG rewrite the new
> version of the column value regardless of its size?
Updating even a small part of a large JSONB value requires that the entire thing is
read and written, causing a lot of data churn.
This is inefficient, and you shouldn't use large JSONB values if you plan to do that.
If the data have a regular structure, use a regular relational data model.
Otherwise, one idea might be to split the JSONB in several parts and store each
of those parts in a different table row. That would reduce the impact.
Yours,
Laurenz Albe
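As an illustration of the "split the JSONB in several parts" suggestion, a rough sketch with hypothetical names and an assumed chunking key:

    -- One row per chunk instead of one large JSONB per key: appending new
    -- data becomes an INSERT of a small chunk, and an UPDATE only has to
    -- rewrite the chunk it touches, not the whole document.
    CREATE TABLE metrics_chunked (
        tenant_id bigint NOT NULL,
        bucket    date   NOT NULL,
        chunk_no  int    NOT NULL,
        samples   jsonb COMPRESSION lz4,
        PRIMARY KEY (tenant_id, bucket, chunk_no)
    );

    -- Append by inserting a new chunk.
    INSERT INTO metrics_chunked
    VALUES (1, '2023-07-26', 7, '[{"ts": 1690000000, "v": 42}]');

    -- Reassemble the full array only when it is actually needed.
    SELECT jsonb_agg(elem ORDER BY chunk_no, ord) AS samples
    FROM   metrics_chunked,
           LATERAL jsonb_array_elements(samples) WITH ORDINALITY AS e(elem, ord)
    WHERE  tenant_id = 1 AND bucket = '2023-07-26';

The trade-off is that reads which need the whole array must reassemble it, so the chunk size should be chosen to match the update and read patterns.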