Обсуждение: Increased storage size of jsonb in pg15

Поиск
Список
Период
Сортировка

Increased storage size of jsonb in pg15

От
Sean Flaherty
Дата:
Hello,

We are seeing an increase in the reported storage size of our jsonb columns (using pg_column_size) since upgrading to Postgres 15.5 from version 14.  The increase in size is enough to push us over the size threshold and more of our data to get written to the TOAST table.  

I can't find any information online, is anyone aware of what would cause this change in storage size for essentially the same data?

Kind regards,
Sean

Re: Increased storage size of jsonb in pg15

От
Adrian Klaver
Дата:
On 12/13/23 14:07, Sean Flaherty wrote:
> Hello,
> 
> We are seeing an increase in the reported storage size of our jsonb 
> columns (using pg_column_size) since upgrading to Postgres 15.5 from 
> version 14.  The increase in size is enough to push us over the size 
> threshold and more of our data to get written to the TOAST table.

More information needed:

1) Table definition.

2) An example of reported size for the 14.? and 15.5 cases.

3) Differences in the database attributes between the two instances.

4) Was this done on same machine? If not how are the machines different?

5) How the upgrade was done?


> 
> I can't find any information online, is anyone aware of what would cause 
> this change in storage size for essentially the same data?
> 
> Kind regards,
> Sean

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Increased storage size of jsonb in pg15

От
Sean Flaherty
Дата:
More information needed:

1) Table definition.

\d "DataRecords"
                Partitioned table "public.DataRecords"
   Column   |            Type             | Collation | Nullable |   Default  
------------+-----------------------------+-----------+----------+-------------
 filename   | character varying(255)      |           | not null |
 timestamp  | timestamp without time zone |           | not null |
 data       | jsonb                       |           | not null | '{}'::jsonb
 created_at | timestamp with time zone    |           | not null |
 updated_at | timestamp with time zone    |           | not null |
Partition key: RANGE ("timestamp")
Indexes:
    "DataRecords_pkey" PRIMARY KEY, btree (filename, "timestamp")
    "DataRecords_filename_updated_at_idx" btree (filename, updated_at)
Foreign-key constraints:
    "DataRecords_filename_fkey" FOREIGN KEY (filename) REFERENCES "DataFiles"(filename) ON UPDATE CASCADE ON DELETE CASCADE
Number of partitions: 97 (Use \d+ to list them.)

2) An example of reported size for the 14.? and 15.5 cases.

  Since upgrading from 14.8 to 15.5, the jsonb data that was previously written in 14.8 is reporting a smaller size than the same hourly data written after the upgrade (upgrade indicated in yellow):

  
filehourly_timestampfilename_bytestimestamp_bytesdata_filesizecreated_at_bytesupdated_at_bytes
datafile.dat2023-10-19 12:00:00238168288
datafile.dat2023-10-19 13:00:00238168788
datafile.dat2023-10-19 14:00:00238168588
datafile.dat2023-10-19 15:00:00238166888
datafile.dat2023-10-19 16:00:00238215588
datafile.dat2023-10-19 17:00:00238217888
datafile.dat2023-10-19 18:00:00238219988
datafile.dat2023-10-19 19:00:00238218788
datafile.dat2023-10-19 20:00:00238218088
datafile.dat2023-10-19 21:00:00238217688
datafile.dat2023-10-19 22:00:00238205388
datafile.dat2023-10-19 23:00:00238204388

3) Differences in the database attributes between the two instances.

  No differences.

4) Was this done on same machine? If not how are the machines different?

  Same AWS RDS Postgres instance.

5) How the upgrade was done?

  Using the AWS RDS upgrade.



On Wed, Dec 13, 2023 at 3:21 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 12/13/23 14:07, Sean Flaherty wrote:
> Hello,
>
> We are seeing an increase in the reported storage size of our jsonb
> columns (using pg_column_size) since upgrading to Postgres 15.5 from
> version 14.  The increase in size is enough to push us over the size
> threshold and more of our data to get written to the TOAST table.

More information needed:

1) Table definition.

2) An example of reported size for the 14.? and 15.5 cases.

3) Differences in the database attributes between the two instances.

4) Was this done on same machine? If not how are the machines different?

5) How the upgrade was done?


>
> I can't find any information online, is anyone aware of what would cause
> this change in storage size for essentially the same data?
>
> Kind regards,
> Sean

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Increased storage size of jsonb in pg15

От
Adrian Klaver
Дата:
On 12/13/23 15:49, Sean Flaherty wrote:
> More information needed:
> 

> 2) An example of reported size for the 14.? and 15.5 cases.
> 
>    Since upgrading from 14.8 to 15.5, the jsonb data that was previously 
> written in 14.8 is reporting a smaller size than the same hourly data 
> written after the upgrade (upgrade indicated in yellow):

What is producing datafile.dat and how?

> 
> *file*    *hourly_timestamp*    *filename_bytes*    *timestamp_bytes* 
> *data_filesize*    *created_at_bytes*    *updated_at_bytes*
> datafile.dat    2023-10-19 12:00:00    23    8    1682    8    8
> datafile.dat    2023-10-19 13:00:00    23    8    1687    8    8
> datafile.dat    2023-10-19 14:00:00    23    8    1685    8    8
> datafile.dat    2023-10-19 15:00:00    23    8    1668    8    8
> datafile.dat    2023-10-19 16:00:00    23    8    2155    8    8
> datafile.dat    2023-10-19 17:00:00    23    8    2178    8    8
> datafile.dat    2023-10-19 18:00:00    23    8    2199    8    8
> datafile.dat    2023-10-19 19:00:00    23    8    2187    8    8
> datafile.dat    2023-10-19 20:00:00    23    8    2180    8    8
> datafile.dat    2023-10-19 21:00:00    23    8    2176    8    8
> datafile.dat    2023-10-19 22:00:00    23    8    2053    8    8
> datafile.dat    2023-10-19 23:00:00    23    8    2043    8    8

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Increased storage size of jsonb in pg15

От
Sean Flaherty
Дата:
We have a process that runs once an hour to read the .dat file in csv format then a node script using the pg package version "8.8.0" to create the json objects and insert the data records as jsonb data.  

None of the upload process changed during the underlying database upgrade.


On Wed, Dec 13, 2023 at 4:56 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 12/13/23 15:49, Sean Flaherty wrote:
> More information needed:
>

> 2) An example of reported size for the 14.? and 15.5 cases.
>
>    Since upgrading from 14.8 to 15.5, the jsonb data that was previously
> written in 14.8 is reporting a smaller size than the same hourly data
> written after the upgrade (upgrade indicated in yellow):

What is producing datafile.dat and how?

>
> *file*        *hourly_timestamp*      *filename_bytes*        *timestamp_bytes*
> *data_filesize*       *created_at_bytes*      *updated_at_bytes*
> datafile.dat  2023-10-19 12:00:00     23      8       1682    8       8
> datafile.dat  2023-10-19 13:00:00     23      8       1687    8       8
> datafile.dat  2023-10-19 14:00:00     23      8       1685    8       8
> datafile.dat  2023-10-19 15:00:00     23      8       1668    8       8
> datafile.dat  2023-10-19 16:00:00     23      8       2155    8       8
> datafile.dat  2023-10-19 17:00:00     23      8       2178    8       8
> datafile.dat  2023-10-19 18:00:00     23      8       2199    8       8
> datafile.dat  2023-10-19 19:00:00     23      8       2187    8       8
> datafile.dat  2023-10-19 20:00:00     23      8       2180    8       8
> datafile.dat  2023-10-19 21:00:00     23      8       2176    8       8
> datafile.dat  2023-10-19 22:00:00     23      8       2053    8       8
> datafile.dat  2023-10-19 23:00:00     23      8       2043    8       8

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Increased storage size of jsonb in pg15

От
Adrian Klaver
Дата:
On 12/14/23 06:48, Sean Flaherty wrote:
> We have a process that runs once an hour to read the .dat file in csv 
> format then a node script using the pg package 
> <https://github.com/brianc/node-postgres> version "8.8.0" to create the 
> json objects and insert the data records as jsonb data.

Now I am not understanding.

1) In your OP you mentioned checking size of the column storage using 
pg_column_size, yet what you show for increase in size are datafile.dat.

2) So how is datafile.dat related to this issue?

3) Show how you are determining that the storage in the database has 
increased in size.

> 
> None of the upload process changed during the underlying database upgrade.
> 
> 
> On Wed, Dec 13, 2023 at 4:56 PM Adrian Klaver <adrian.klaver@aklaver.com 
> <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 12/13/23 15:49, Sean Flaherty wrote:
>      > More information needed:
>      >
> 
>      > 2) An example of reported size for the 14.? and 15.5 cases.
>      >
>      >    Since upgrading from 14.8 to 15.5, the jsonb data that was
>     previously
>      > written in 14.8 is reporting a smaller size than the same hourly
>     data
>      > written after the upgrade (upgrade indicated in yellow):
> 
>     What is producing datafile.dat and how?
> 
>      >
>      > *file*        *hourly_timestamp*      *filename_bytes*       
>     *timestamp_bytes*
>      > *data_filesize*       *created_at_bytes*      *updated_at_bytes*
>      > datafile.dat  2023-10-19 12:00:00     23      8       1682    8 
>           8
>      > datafile.dat  2023-10-19 13:00:00     23      8       1687    8 
>           8
>      > datafile.dat  2023-10-19 14:00:00     23      8       1685    8 
>           8
>      > datafile.dat  2023-10-19 15:00:00     23      8       1668    8 
>           8
>      > datafile.dat  2023-10-19 16:00:00     23      8       2155    8 
>           8
>      > datafile.dat  2023-10-19 17:00:00     23      8       2178    8 
>           8
>      > datafile.dat  2023-10-19 18:00:00     23      8       2199    8 
>           8
>      > datafile.dat  2023-10-19 19:00:00     23      8       2187    8 
>           8
>      > datafile.dat  2023-10-19 20:00:00     23      8       2180    8 
>           8
>      > datafile.dat  2023-10-19 21:00:00     23      8       2176    8 
>           8
>      > datafile.dat  2023-10-19 22:00:00     23      8       2053    8 
>           8
>      > datafile.dat  2023-10-19 23:00:00     23      8       2043    8 
>           8
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Increased storage size of jsonb in pg15

От
"David G. Johnston"
Дата:
On Thu, Dec 14, 2023 at 7:48 AM Sean Flaherty <sflaherty@grndwork.com> wrote:
We have a process that runs once an hour to read the .dat file in csv format then a node script using the pg package version "8.8.0" to create the json objects and insert the data records as jsonb data.  

None of the upload process changed during the underlying database upgrade.

Basic debugging requires the existence of a self-contained reproducer.  In this case ideally one that only uses psql and some static (already processed) data files, and that is known to produce the observed behaviors on non-RDS PostgreSQL.

David J.

Re: Increased storage size of jsonb in pg15

От
Sean Flaherty
Дата:
We did a little more digging on our side, which I apologize for not doing beforehand.

We wrote a script to create a simple table with a jsonb column, inserted some json data into the column and then used pg_column_size to test the size of the jsonb data column.    

We ran the script against local docker versions of postgres 14.8 and 15.5 and saw that the data in the jsonb column was the same size.

We then ran the script on AWS RDS Postgres and saw a size difference for the same json data between versions 14.8 and 15.4 (we also saw the larger size on RDS postgres 15.5 and 16.1).

Versionsize (kb)notes
RDS Postgres-14.81587 (the same size we were seeing locally)
RDS Postgres-15.42112

Thank you for your prompt responses and assistance. 

If you are curious, I can share the script we used with you.


Kind regards,
Sean


On Thu, Dec 14, 2023 at 9:47 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Thu, Dec 14, 2023 at 7:48 AM Sean Flaherty <sflaherty@grndwork.com> wrote:
We have a process that runs once an hour to read the .dat file in csv format then a node script using the pg package version "8.8.0" to create the json objects and insert the data records as jsonb data.  

None of the upload process changed during the underlying database upgrade.

Basic debugging requires the existence of a self-contained reproducer.  In this case ideally one that only uses psql and some static (already processed) data files, and that is known to produce the observed behaviors on non-RDS PostgreSQL.

David J.

Re: Increased storage size of jsonb in pg15

От
Adrian Klaver
Дата:
On 12/15/23 12:43, Sean Flaherty wrote:
> We did a little more digging on our side, which I apologize for not 
> doing beforehand.
> 
> We wrote a script to create a simple table with a jsonb column, inserted 
> some json data into the column and then used *pg_column_size* to test 
> the size of the jsonb data column.
> 
> We ran the script against local docker versions of postgres 14.8 and 
> 15.5 and saw that the data in the jsonb column was the same size.
> 
> We then ran the script on AWS RDS Postgres and saw a size difference for 
> the same json data between versions 14.8 and 15.4 (we also saw the 
> larger size on RDS postgres 15.5 and 16.1).

This points at RDS Postgres being the issue. Since that is behind the 
AWS curtain, you will need to contact AWS technical support for 
information on what changed.

> 
>     *Version*     *size (kb)*     *notes *
>     RDS Postgres-14.8     1587     (the same size we were seeing locally)
>     RDS Postgres-15.4     2112     
> 
> 
> Thank you for your prompt responses and assistance.
> 
> If you are curious, I can share the script we used with you.
> 
> 
> Kind regards,
> Sean


-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Increased storage size of jsonb in pg15

От
Sean Flaherty
Дата:
Follow-up:
Working with AWS, we found that starting in RDS Postgres 15, the default_toast_compression parameter is set to use lz4 compression instead of pglz.  This resulted in the increased json storage size we were seeing.

I have been able to reproduce the increased storage size on RDS Postgres and using my local docker instance of postgres 15.5 by changing the local default_toast_compression value in postgresql.conf.

I have attached the test script we use to create a table, insert some test records and a query to test the JSON data size on disk.


Kind regards,
Sean


On Fri, Dec 15, 2023 at 1:56 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 12/15/23 12:43, Sean Flaherty wrote:
> We did a little more digging on our side, which I apologize for not
> doing beforehand.
>
> We wrote a script to create a simple table with a jsonb column, inserted
> some json data into the column and then used *pg_column_size* to test
> the size of the jsonb data column.
>
> We ran the script against local docker versions of postgres 14.8 and
> 15.5 and saw that the data in the jsonb column was the same size.
>
> We then ran the script on AWS RDS Postgres and saw a size difference for
> the same json data between versions 14.8 and 15.4 (we also saw the
> larger size on RDS postgres 15.5 and 16.1).

This points at RDS Postgres being the issue. Since that is behind the
AWS curtain, you will need to contact AWS technical support for
information on what changed.

>
>     *Version*         *size (kb)*     *notes *
>     RDS Postgres-14.8         1587    (the same size we were seeing locally)
>     RDS Postgres-15.4         2112   
>
>
> Thank you for your prompt responses and assistance.
>
> If you are curious, I can share the script we used with you.
>
>
> Kind regards,
> Sean


--
Adrian Klaver
adrian.klaver@aklaver.com

Вложения

Re: Increased storage size of jsonb in pg15

От
Adrian Klaver
Дата:
On 12/28/23 09:13, Sean Flaherty wrote:
> Follow-up:
> Working with AWS, we found that starting in RDS Postgres 15, the 
> default_toast_compression parameter is set to use lz4 compression 
> instead of pglz.  This resulted in the increased json storage size we 
> were seeing.
> 
> I have been able to reproduce the increased storage size on RDS Postgres 
> and using my local docker instance of postgres 15.5 by changing the 
> local default_toast_compression value in postgresql.conf.
> 
> I have attached the test script we use to create a table, insert some 
> test records and a query to test the JSON data size on disk.

I can confirm I see the same results using Postgres 16 installed from 
the PGDG repo on Ubuntu 22.04. That the lz4 data size is greater then 
the pglz data size.

> 
> 
> Kind regards,
> Sean
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Increased storage size of jsonb in pg15

От
Sean Flaherty
Дата:
I'm rather new to the mailing list, are there any additional steps I should take (i.e. posting to pgsql-hackers, etc.)?


On Thu, Dec 28, 2023 at 11:23 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 12/28/23 09:13, Sean Flaherty wrote:
> Follow-up:
> Working with AWS, we found that starting in RDS Postgres 15, the
> default_toast_compression parameter is set to use lz4 compression
> instead of pglz.  This resulted in the increased json storage size we
> were seeing.
>
> I have been able to reproduce the increased storage size on RDS Postgres
> and using my local docker instance of postgres 15.5 by changing the
> local default_toast_compression value in postgresql.conf.
>
> I have attached the test script we use to create a table, insert some
> test records and a query to test the JSON data size on disk.

I can confirm I see the same results using Postgres 16 installed from
the PGDG repo on Ubuntu 22.04. That the lz4 data size is greater then
the pglz data size.

>
>
> Kind regards,
> Sean
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Increased storage size of jsonb in pg15

От
Adrian Klaver
Дата:
On 12/28/23 12:41, Sean Flaherty wrote:
> I'm rather new to the mailing list, are there any additional steps I 
> should take (i.e. posting to pgsql-hackers, etc.)?
> 

For what purpose? You are seeing differences in compression strategies 
between lz4 and pglz. The 'fix' would be to go back to pglz.

> 
> On Thu, Dec 28, 2023 at 11:23 AM Adrian Klaver 
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 12/28/23 09:13, Sean Flaherty wrote:
>      > Follow-up:
>      > Working with AWS, we found that starting in RDS Postgres 15, the
>      > default_toast_compression parameter is set to use lz4 compression
>      > instead of pglz.  This resulted in the increased json storage
>     size we
>      > were seeing.
>      >
>      > I have been able to reproduce the increased storage size on RDS
>     Postgres
>      > and using my local docker instance of postgres 15.5 by changing the
>      > local default_toast_compression value in postgresql.conf.
>      >
>      > I have attached the test script we use to create a table, insert
>     some
>      > test records and a query to test the JSON data size on disk.
> 
>     I can confirm I see the same results using Postgres 16 installed from
>     the PGDG repo on Ubuntu 22.04. That the lz4 data size is greater then
>     the pglz data size.
> 
>      >
>      >
>      > Kind regards,
>      > Sean
>      >
> 
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Increased storage size of jsonb in pg15

От
Junwang Zhao
Дата:
On Fri, Dec 29, 2023 at 4:47 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> On 12/28/23 12:41, Sean Flaherty wrote:
> > I'm rather new to the mailing list, are there any additional steps I
> > should take (i.e. posting to pgsql-hackers, etc.)?
> >
>
> For what purpose? You are seeing differences in compression strategies
> between lz4 and pglz. The 'fix' would be to go back to pglz.

Agreed, lz4 is known for its high compression speed, but lower
compression ratio, this is the trade off one should bear in mind.

>
> >
> > On Thu, Dec 28, 2023 at 11:23 AM Adrian Klaver
> > <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> >
> >     On 12/28/23 09:13, Sean Flaherty wrote:
> >      > Follow-up:
> >      > Working with AWS, we found that starting in RDS Postgres 15, the
> >      > default_toast_compression parameter is set to use lz4 compression
> >      > instead of pglz.  This resulted in the increased json storage
> >     size we
> >      > were seeing.
> >      >
> >      > I have been able to reproduce the increased storage size on RDS
> >     Postgres
> >      > and using my local docker instance of postgres 15.5 by changing the
> >      > local default_toast_compression value in postgresql.conf.
> >      >
> >      > I have attached the test script we use to create a table, insert
> >     some
> >      > test records and a query to test the JSON data size on disk.
> >
> >     I can confirm I see the same results using Postgres 16 installed from
> >     the PGDG repo on Ubuntu 22.04. That the lz4 data size is greater then
> >     the pglz data size.
> >
> >      >
> >      >
> >      > Kind regards,
> >      > Sean
> >      >
> >
> >
> >     --
> >     Adrian Klaver
> >     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> >
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
>
>


--
Regards
Junwang Zhao



Re: Increased storage size of jsonb in pg15

От
Tom Lane
Дата:
Junwang Zhao <zhjwpku@gmail.com> writes:
> On Fri, Dec 29, 2023 at 4:47 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> For what purpose? You are seeing differences in compression strategies
>> between lz4 and pglz. The 'fix' would be to go back to pglz.

> Agreed, lz4 is known for its high compression speed, but lower
> compression ratio, this is the trade off one should bear in mind.

I don't know if we can make any blanket statements like that, but
if we can, shouldn't there be some advice in the manual?  AFAICS,
right now there's exactly zip about why you should choose one over
the other.

            regards, tom lane



Re: Increased storage size of jsonb in pg15

От
Sean Flaherty
Дата:
What we found is that using lz4 compression on JSONB data is 20-25% larger on disk than pglz.  We are running a production workload that is storing jsonb data with a focus read performance.  The documented increase in write speed wasn't a large benefit, however, the increase in storage size moved the bulk of our data into TOAST and off the JSON performance cliff ("2-10× slower queries") described by Evan was impactful.

This article does a nice job describing the differences between pglz and lz4 compression for different data but does not include json or jsonb.

I believe validation of our numbers and additional documentation on the trade-offs in compression types would be very useful.

On Fri, Dec 29, 2023 at 7:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Junwang Zhao <zhjwpku@gmail.com> writes:
> On Fri, Dec 29, 2023 at 4:47 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>> For what purpose? You are seeing differences in compression strategies
>> between lz4 and pglz. The 'fix' would be to go back to pglz.

> Agreed, lz4 is known for its high compression speed, but lower
> compression ratio, this is the trade off one should bear in mind.

I don't know if we can make any blanket statements like that, but
if we can, shouldn't there be some advice in the manual?  AFAICS,
right now there's exactly zip about why you should choose one over
the other.

                        regards, tom lane

Re: Increased storage size of jsonb in pg15

От
Adrian Klaver
Дата:
On 12/29/23 07:21, Sean Flaherty wrote:
> What we found is that using lz4 compression on JSONB data is 20-25% 
> larger on disk than pglz.  We are running a production workload that is 
> storing jsonb data with a focus read performance.  The documented 
> increase in write speed wasn't a large benefit, however, the increase in 
> storage size moved the bulk of our data into TOAST and off the JSON 
> performance cliff ("2-10× slower queries") described by Evan 
> <https://www.evanjones.ca/postgres-large-json-performance.html> was 
> impactful.
> 
> This 
> <https://www.postgresql.fastware.com/blog/what-is-the-new-lz4-toast-compression-in-postgresql-14> article does a nice
jobdescribing the differences between pglz and lz4 compression for different data but does not include json or jsonb.
 
> 
> I believe validation of our numbers and additional documentation on the 
> trade-offs in compression types would be very useful.

Yes, that would be useful.

Also per this:

"Working with AWS, we found that starting in RDS Postgres 15, the
default_toast_compression parameter is set to use lz4 compression 
instead of pglz."

there is a discussion to be had with AWS about the advisability of 
changing defaults without testing what that does to the end user or 
notifying the end user.


> 
> On Fri, Dec 29, 2023 at 7:23 AM Tom Lane <tgl@sss.pgh.pa.us 
> <mailto:tgl@sss.pgh.pa.us>> wrote:
> 
>     Junwang Zhao <zhjwpku@gmail.com <mailto:zhjwpku@gmail.com>> writes:
>      > On Fri, Dec 29, 2023 at 4:47 AM Adrian Klaver
>     <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>      >> For what purpose? You are seeing differences in compression
>     strategies
>      >> between lz4 and pglz. The 'fix' would be to go back to pglz.
> 
>      > Agreed, lz4 is known for its high compression speed, but lower
>      > compression ratio, this is the trade off one should bear in mind.
> 
>     I don't know if we can make any blanket statements like that, but
>     if we can, shouldn't there be some advice in the manual?  AFAICS,
>     right now there's exactly zip about why you should choose one over
>     the other.
> 
>                              regards, tom lane
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Increased storage size of jsonb in pg15

От
Sean Flaherty
Дата:
I have included your suggestion to document any changes to the default Postgres settings to the Amazon RDS for PostgreSQL updates page in our ticket with AWS.

On Fri, Dec 29, 2023 at 9:43 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 12/29/23 07:21, Sean Flaherty wrote:
> What we found is that using lz4 compression on JSONB data is 20-25%
> larger on disk than pglz.  We are running a production workload that is
> storing jsonb data with a focus read performance.  The documented
> increase in write speed wasn't a large benefit, however, the increase in
> storage size moved the bulk of our data into TOAST and off the JSON
> performance cliff ("2-10× slower queries") described by Evan
> <https://www.evanjones.ca/postgres-large-json-performance.html> was
> impactful.
>
> This
> <https://www.postgresql.fastware.com/blog/what-is-the-new-lz4-toast-compression-in-postgresql-14> article does a nice job describing the differences between pglz and lz4 compression for different data but does not include json or jsonb.
>
> I believe validation of our numbers and additional documentation on the
> trade-offs in compression types would be very useful.

Yes, that would be useful.

Also per this:

"Working with AWS, we found that starting in RDS Postgres 15, the
default_toast_compression parameter is set to use lz4 compression
instead of pglz."

there is a discussion to be had with AWS about the advisability of
changing defaults without testing what that does to the end user or
notifying the end user.


>
> On Fri, Dec 29, 2023 at 7:23 AM Tom Lane <tgl@sss.pgh.pa.us
> <mailto:tgl@sss.pgh.pa.us>> wrote:
>
>     Junwang Zhao <zhjwpku@gmail.com <mailto:zhjwpku@gmail.com>> writes:
>      > On Fri, Dec 29, 2023 at 4:47 AM Adrian Klaver
>     <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>      >> For what purpose? You are seeing differences in compression
>     strategies
>      >> between lz4 and pglz. The 'fix' would be to go back to pglz.
>
>      > Agreed, lz4 is known for its high compression speed, but lower
>      > compression ratio, this is the trade off one should bear in mind.
>
>     I don't know if we can make any blanket statements like that, but
>     if we can, shouldn't there be some advice in the manual?  AFAICS,
>     right now there's exactly zip about why you should choose one over
>     the other.
>
>                              regards, tom lane
>

--
Adrian Klaver
adrian.klaver@aklaver.com