Обсуждение: [PATCH] Add zstd compression for TOAST using extended header format

Поиск
Список
Период
Сортировка

[PATCH] Add zstd compression for TOAST using extended header format

От
Dharin Shah
Дата:
Hello PG Hackers,

Want to submit a patch that implements zstd compression for TOAST data using a 20-byte TOAST pointer format, directly addressing the concerns raised in prior discussions [1][2][3].

A bit of a background in the 2022 thread [3], Robert Haas suggested:
"we had better reserve the fourth bit pattern for something extensible e.g. another byte or several to specify the actual method"

i.e. something like:
00 = PGLZ
01 = LZ4
10 = reserved for future emergencies
11 = extended header with additional type byte

Michael also asked whether we should have "something a bit more extensible for the design of an extensible varlena header."

This patch implements that idea.
The format:

  struct varatt_external_extended {
      int32   va_rawsize;     /* same as legacy */
      uint32  va_extinfo;     /* cmid=3 signals extended format */
      uint8   va_flags;       /* feature flags */
      uint8   va_data[3];     /* va_data[0] = compression method */
      Oid     va_valueid;     /* same as legacy */
      Oid     va_toastrelid;  /* same as legacy */
  };

A few notes:

- Zstd only applies to external TOAST, not inline compression. The 2-bit limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine anyway. Zstd's wins show up on larger values.
- A GUC use_extended_toast_header controls whether pglz/lz4 also use the 20-byte format (defaults to off for compatibility, can enable it if you want consistency).
- Legacy 16-byte pointers continue to work - we check the vartag to determine which format to read.

The 4 extra bytes per pointer is negligible for typical TOAST data sizes, and it gives us room to grow.

Regards,
Dharin 
Вложения

Re: [PATCH] Add zstd compression for TOAST using extended header format

От
Dharin Shah
Дата:
Hello,

Apologies for the spam, updated the patch with the tests corrected.

Thanks,
Dharin

On Sat, Dec 13, 2025 at 6:31 PM Dharin Shah <dharinshah95@gmail.com> wrote:
Hello PG Hackers,

Want to submit a patch that implements zstd compression for TOAST data using a 20-byte TOAST pointer format, directly addressing the concerns raised in prior discussions [1][2][3].

A bit of a background in the 2022 thread [3], Robert Haas suggested:
"we had better reserve the fourth bit pattern for something extensible e.g. another byte or several to specify the actual method"

i.e. something like:
00 = PGLZ
01 = LZ4
10 = reserved for future emergencies
11 = extended header with additional type byte

Michael also asked whether we should have "something a bit more extensible for the design of an extensible varlena header."

This patch implements that idea.
The format:

  struct varatt_external_extended {
      int32   va_rawsize;     /* same as legacy */
      uint32  va_extinfo;     /* cmid=3 signals extended format */
      uint8   va_flags;       /* feature flags */
      uint8   va_data[3];     /* va_data[0] = compression method */
      Oid     va_valueid;     /* same as legacy */
      Oid     va_toastrelid;  /* same as legacy */
  };

A few notes:

- Zstd only applies to external TOAST, not inline compression. The 2-bit limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine anyway. Zstd's wins show up on larger values.
- A GUC use_extended_toast_header controls whether pglz/lz4 also use the 20-byte format (defaults to off for compatibility, can enable it if you want consistency).
- Legacy 16-byte pointers continue to work - we check the vartag to determine which format to read.

The 4 extra bytes per pointer is negligible for typical TOAST data sizes, and it gives us room to grow.

Regards,
Dharin 
Вложения

Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Dharin Shah
Дата:
Hello PG Hackers,

Want to submit a patch that implements zstd compression for TOAST data using a 20-byte TOAST pointer format, directly addressing the concerns raised in prior discussions [1][2][3].

A bit of a background in the 2022 thread [3], The overall suggestion was to have something extensible for the TOAST header

i.e. something like:
00 = PGLZ
01 = LZ4
10 = reserved for future emergencies
11 = extended header with additional type byte

This patch implements that idea.
The new header format:

  struct varatt_external_extended {
      int32   va_rawsize;     /* same as legacy */
      uint32  va_extinfo;     /* cmid=3 signals extended format */
      uint8   va_flags;       /* feature flags */
      uint8   va_data[3];     /* va_data[0] = compression method */
      Oid     va_valueid;     /* same as legacy */
      Oid     va_toastrelid;  /* same as legacy */
  };

A few notes:

- Zstd only applies to external TOAST, not inline compression. The 2-bit limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine anyway. Zstd's wins show up on larger values.
- A GUC use_extended_toast_header controls whether pglz/lz4 also use the 20-byte format (defaults to off for compatibility, can enable it if you want consistency).
- Legacy 16-byte pointers continue to work - we check the vartag to determine which format to read.

The 4 extra bytes per pointer is negligible for typical TOAST data sizes, and it gives us room to grow.

Regards,
Dharin
Вложения

Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Peter Eisentraut
Дата:
On 16.12.25 11:51, Dharin Shah wrote:
> - Zstd only applies to external TOAST, not inline compression. The 2-bit 
> limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine 
> anyway. Zstd's wins show up on larger values.

This is a very complicated patch.  To motivate it, you should show some 
detailed performance measurements that show these wins.




Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Michael Paquier
Дата:
On Wed, Dec 17, 2025 at 04:11:38PM +0100, Peter Eisentraut wrote:
> On 16.12.25 11:51, Dharin Shah wrote:
> > - Zstd only applies to external TOAST, not inline compression. The 2-bit
> > limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine
> > anyway. Zstd's wins show up on larger values.
>
> This is a very complicated patch.  To motivate it, you should show some
> detailed performance measurements that show these wins.

Yes, this is expected for any patch posted.  Zstd is an improved
version of lz4, acting as a sort of industry standard these days, and
any byte sequences I have looked at points that zstd leads kind of
always to a better compression ratio for less or equivalent CPU cost
compared to LZ4.  Not saying that numbers are not required, they are.
But I strongly suspect numbers among these lines.

FWIW, it's not a complicated patch, it is a large mechanical patch
that enforces a bunch of TOAST code paths to do what it wants.  If we
are going to do something about that and agree on something, I think
that we should just use a new vartag_external for this matter
(spoiler: I think we should use a new vartag_external value), but
keep the toast structure at 16 bytes all the time, leaving alone the
extra bit in the existing varatt_external structure so as there is no
impact for heap relations if zstd is used, as long as the TOAST value
is 32 bits.  The patch introduces a new vartag_external with
VARTAG_ONDISK_EXTENDED, so while it leads to a better compatibility,
it also means that all zstd entries have to pay an extra amount of
space in the main relation as an effect of a different
default_toast_compression.  The difficulty is not in the
implementation, it would be on agreeing on what folks would be OK
with in terms if vartag and varatt structures, and that's one of the
oldest areas of the PG code, that has complications and assumptions of
its own.

The test implementation looks wrong to me.  Why is there any need for
an extra test module test_toast_ext?  You could just reuse the same
structure as compression_lz4.sql, but adapted to zstd.  That's why a
split with compression.sql has been done in 74a3fc36f314, FYI.

You should also aim at splitting the patch more to make it easier to
review: one of the sticky point of this area of the code is to untie
completely DefaultCompressionId, its GUC and the TOAST code.  The GUC
default_toast_compression accepts by design only 4 values.  This needs
to go further, and should be refactored as a piece of its own.

And also, I would prefer if the 32-bit value issue is tackled first,
but that's a digression here, for a different thread.  :)
--
Michael

Вложения

Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Dharin Shah
Дата:
Hi Michael (and Peter),

Thanks for the detailed feedback — this is really helpful.

I want to make sure I understand your main point: you're OK with a new `vartag_external`, but prefer we avoid increasing the heap TOAST pointer from 16 -> 20 bytes since every zstd-toasted value would pay +4 bytes in the main heap tuple.
I also realize the "compatibility" of the extended header doesn't buy us much — we'll need to support the existing 16-byte varatt_external forever for backward compatibility. Adding a 20-byte structure just means two formats to maintain indefinitely.

A couple clarifying questions if we go with new vartag (e.g., `VARTAG_ONDISK_ZSTD`), same 16-byte `varatt_external` payload, vartag as discriminator
1. How should we handle future methods beyond zstd? One tag per method, or store a method id elsewhere (e.g., in TOAST chunk header)?
2. And re: "as long as the TOAST value is 32 bits" — are you referring to the 30-bit extsize field in va_extinfo (i.e., avoid stealing bits from extsize for method encoding)?

Test

Rows

Uncompressed

PGLZ

LZ4

ZSTD

PGLZ/ZSTD

LZ4/ZSTD

T1: Large JSON (~18KB/row)

500

~9,000 KB

1496 KB

1528 KB

976 KB

1.53x

1.57x

T2: Repetitive Text (~246KB/row)

500

~123,000 KB

1672 KB

648 KB

248 KB

6.74x

2.61x

T3: MD5 Hash Data (~16KB/row)

500

~8,000 KB

8288 KB

8232 KB

4256 KB

1.95x

1.93x

T4: Server Logs (~3.5KB/row)

1000

~3,500 KB

400 KB

352 KB

456 KB

0.88x

0.77x


Key findings (i guess well known at this point):

- ZSTD excels for repetitive/pattern-heavy data (6.7x better than PGLZ)
- For low-redundancy data (MD5 hashes), ZSTD still achieves ~2x better
- The T4 result showing zstd as "worse" is not about compression quality - it's about missing inline storage support. ZSTD actually compresses better, but pays unnecessary TOAST overhead.

I'll share the detailed benchmark script with the next patch revision. But also a potential path forward could be that we could just fully replace pglz (can bring it up later in different thread)

On Testing and Patch Structure
Agreed on both points:
- I'll use `compression_zstd.sql` following the `compression_lz4.sql` pattern (removing the test_toast_ext module)
- I'll split the GUC refactoring into a separate preparatory patch

Once you confirm which representation you're advocating, I'll respin accordingly.

Thanks,
Dharin

On Thu, Dec 18, 2025 at 7:35 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Dec 17, 2025 at 04:11:38PM +0100, Peter Eisentraut wrote:
> On 16.12.25 11:51, Dharin Shah wrote:
> > - Zstd only applies to external TOAST, not inline compression. The 2-bit
> > limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine
> > anyway. Zstd's wins show up on larger values.
>
> This is a very complicated patch.  To motivate it, you should show some
> detailed performance measurements that show these wins.

Yes, this is expected for any patch posted.  Zstd is an improved
version of lz4, acting as a sort of industry standard these days, and
any byte sequences I have looked at points that zstd leads kind of
always to a better compression ratio for less or equivalent CPU cost
compared to LZ4.  Not saying that numbers are not required, they are.
But I strongly suspect numbers among these lines.

FWIW, it's not a complicated patch, it is a large mechanical patch
that enforces a bunch of TOAST code paths to do what it wants.  If we
are going to do something about that and agree on something, I think
that we should just use a new vartag_external for this matter
(spoiler: I think we should use a new vartag_external value), but
keep the toast structure at 16 bytes all the time, leaving alone the
extra bit in the existing varatt_external structure so as there is no
impact for heap relations if zstd is used, as long as the TOAST value
is 32 bits.  The patch introduces a new vartag_external with
VARTAG_ONDISK_EXTENDED, so while it leads to a better compatibility,
it also means that all zstd entries have to pay an extra amount of
space in the main relation as an effect of a different
default_toast_compression.  The difficulty is not in the
implementation, it would be on agreeing on what folks would be OK
with in terms if vartag and varatt structures, and that's one of the
oldest areas of the PG code, that has complications and assumptions of
its own.

The test implementation looks wrong to me.  Why is there any need for
an extra test module test_toast_ext?  You could just reuse the same
structure as compression_lz4.sql, but adapted to zstd.  That's why a
split with compression.sql has been done in 74a3fc36f314, FYI.

You should also aim at splitting the patch more to make it easier to
review: one of the sticky point of this area of the code is to untie
completely DefaultCompressionId, its GUC and the TOAST code.  The GUC
default_toast_compression accepts by design only 4 values.  This needs
to go further, and should be refactored as a piece of its own.

And also, I would prefer if the 32-bit value issue is tackled first,
but that's a digression here, for a different thread.  :)
--
Michael

Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Michael Paquier
Дата:
On Thu, Dec 18, 2025 at 10:44:22PM +0100, Dharin Shah wrote:
> I want to make sure I understand your main point: you're OK with a new
> `vartag_external`, but prefer we avoid increasing the heap TOAST pointer
> from 16 -> 20 bytes since every zstd-toasted value would pay +4 bytes in
> the main heap tuple.

That would be my choice, yes.  Not sure about the opinion of others on
this matter.

> I also realize the "compatibility" of the extended header doesn't buy us
> much — we'll need to support the existing 16-byte varatt_external forever
> for backward compatibility. Adding a 20-byte structure just means two
> formats to maintain indefinitely.

Yes.  Patches have to maintain on-disk compatibility.

> A couple clarifying questions if we go with new vartag (e.g.,
> `VARTAG_ONDISK_ZSTD`), same 16-byte `varatt_external` payload, vartag as
> discriminator
> 1. How should we handle future methods beyond zstd? One tag per method, or
> store a method id elsewhere (e.g., in TOAST chunk header)?

My suspicion would be that we could either use a new set of vartags in
the future for each compression method.  When it comes to zstd there
is something that comes in play: we could set some bits related to
dictionnaries at tuple level.  Not sure if this is the best design or
if using an attribute-level option is more adapted (for example a
JSONB blob could be applied as an attribute with common keys in a
dictionnary saving a lot of on-disk space even before compression),
but keeping some bits free in the 16-byte header leaves this option
open with a new vartag_external.  Saying that, zstd is good enough
that I strongly suspect that we would not regret it for quite a few
years.  One issue that has pushed towards the addition of lz4 as an
option for toast compression is that pglz was worse in terms of CPU
cost.  zlib is also more expensive than lz4 or zstd, especially at
very high compression level for usually little compression gains.

> 2. And re: "as long as the TOAST value is 32 bits" — are you referring to
> the 30-bit extsize field in va_extinfo (i.e., avoid stealing bits from
> extsize for method encoding)?

I mean extending the TOAST value to 8 bytes, as per the following
issues:
https://www.postgresql.org/message-id/764273.1669674269%40sss.pgh.pa.us
https://commitfest.postgresql.org/patch/5830/

> *Key findings (i guess well known at this point):*
> - ZSTD excels for repetitive/pattern-heavy data (6.7x better than PGLZ)
> - For low-redundancy data (MD5 hashes), ZSTD still achieves ~2x better
> - The T4 result showing zstd as "worse" is not about compression quality -
> it's about missing inline storage support. ZSTD actually compresses better,
> but pays unnecessary TOAST overhead.
>
> I'll share the detailed benchmark script with the next patch revision. But
> also a potential path forward could be that we could just fully replace
> pglz (can bring it up later in different thread)

I don't think that we will ever be able to remove pglz.  It would be
nice, as final result of course, but I also expect that not being able
to decompress pglz data is going to lead to a lot of user pain.  That
would be also very expensive to check at upgrade for large instances.

> *On Testing and Patch Structure*
> Agreed on both points:
> - I'll use `compression_zstd.sql` following the `compression_lz4.sql`
> pattern (removing the test_toast_ext module)

Okay.

> - I'll split the GUC refactoring into a separate preparatory patch

This refactoring, if done nicely, is worth an independent piece.  It's
something that I have actually done for the sake of the other thread,
though the result was not really much liked by others.  Perhaps I'm
just lacking imagination with this abstraction, and I'd surely welcome
different ideas.
--
Michael

Вложения

Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Dharin Shah
Дата:
Hello,

Following up on my earlier patch submission, I've reworked the zstd TOAST compression implementation based on our discussion here. The new patch now avoids the 20-byte extended header.

Current Approach
- New `VARTAG_ONDISK_ZSTD` (value 19) for ZSTD external storage
- Maintains existing 16-byte varatt_external structure
- ZSTD external-only (no inline compression)

Note: Using a dedicated VARTAG_ONDISK_ZSTD keeps the on-disk TOAST pointer payload at 16 bytes, but it is not a general extensible metadata carrier. If PostgreSQL later adopts a more general extensible TOAST framework, this change should not block it; VARTAG_ONDISK_ZSTD would remain as a supported legacy encoding, while new toasted values could be written using the newer framework and old values rewritten via normal table rewrites.

Storage (170 MB uncompressed):
    ZSTD: 22 MB (7.60x) - 38.7% space savings vs LZ4
    PGLZ: 36 MB (4.76x)
    LZ4:  36 MB (4.66x)

Key findings:
- Large values (>50KB): ZSTD 33% better compression than PGLZ (~30% better than LZ4)
- Low-entropy data: ZSTD compresses what LZ77 methods cannot
- Small values: ZSTD pays external overhead vs inline PGLZ/LZ4
While ZSTD uses slightly less space overall, the external storage mechanism incurs a TOAST fetch overhead for small values, potentially impacting performance.
Backwards Compatibility Tests
- Mixed compression: Rows with PGLZ, LZ4, and ZSTD coexist and decompress correctly
- Lazy recompression: ALTER COLUMN ... SET COMPRESSION zstd affects new data; existing data is lazily recompressed upon UPDATE or VACUUM FULL.
- Inline vs external: Small values remain inline; large values use appropriate external compression.
Data integrity: All data decompresses correctly across all methods.

Trade-offs and Design Considerations

- External-only avoids consuming cmid=3 and extended header complexity

- Slice access: no ZSTD-specific optimization (follow-up area)

- Hybrid inline/external for small values: not in this patch (feedback welcome)

Reviewer Questions - Is vartag-based external-only acceptable?
- Should compression level (currently 3) be configurable? - Is the external storage overhead for small values acceptable, or is hybrid inline/external behavior needed? Thanks, Dharin


On Thu, Dec 18, 2025 at 11:44 PM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Dec 18, 2025 at 10:44:22PM +0100, Dharin Shah wrote:
> I want to make sure I understand your main point: you're OK with a new
> `vartag_external`, but prefer we avoid increasing the heap TOAST pointer
> from 16 -> 20 bytes since every zstd-toasted value would pay +4 bytes in
> the main heap tuple.

That would be my choice, yes.  Not sure about the opinion of others on
this matter.

> I also realize the "compatibility" of the extended header doesn't buy us
> much — we'll need to support the existing 16-byte varatt_external forever
> for backward compatibility. Adding a 20-byte structure just means two
> formats to maintain indefinitely.

Yes.  Patches have to maintain on-disk compatibility.

> A couple clarifying questions if we go with new vartag (e.g.,
> `VARTAG_ONDISK_ZSTD`), same 16-byte `varatt_external` payload, vartag as
> discriminator
> 1. How should we handle future methods beyond zstd? One tag per method, or
> store a method id elsewhere (e.g., in TOAST chunk header)?

My suspicion would be that we could either use a new set of vartags in
the future for each compression method.  When it comes to zstd there
is something that comes in play: we could set some bits related to
dictionnaries at tuple level.  Not sure if this is the best design or
if using an attribute-level option is more adapted (for example a
JSONB blob could be applied as an attribute with common keys in a
dictionnary saving a lot of on-disk space even before compression),
but keeping some bits free in the 16-byte header leaves this option
open with a new vartag_external.  Saying that, zstd is good enough
that I strongly suspect that we would not regret it for quite a few
years.  One issue that has pushed towards the addition of lz4 as an
option for toast compression is that pglz was worse in terms of CPU
cost.  zlib is also more expensive than lz4 or zstd, especially at
very high compression level for usually little compression gains.

> 2. And re: "as long as the TOAST value is 32 bits" — are you referring to
> the 30-bit extsize field in va_extinfo (i.e., avoid stealing bits from
> extsize for method encoding)?

I mean extending the TOAST value to 8 bytes, as per the following
issues:
https://www.postgresql.org/message-id/764273.1669674269%40sss.pgh.pa.us
https://commitfest.postgresql.org/patch/5830/

> *Key findings (i guess well known at this point):*
> - ZSTD excels for repetitive/pattern-heavy data (6.7x better than PGLZ)
> - For low-redundancy data (MD5 hashes), ZSTD still achieves ~2x better
> - The T4 result showing zstd as "worse" is not about compression quality -
> it's about missing inline storage support. ZSTD actually compresses better,
> but pays unnecessary TOAST overhead.
>
> I'll share the detailed benchmark script with the next patch revision. But
> also a potential path forward could be that we could just fully replace
> pglz (can bring it up later in different thread)

I don't think that we will ever be able to remove pglz.  It would be
nice, as final result of course, but I also expect that not being able
to decompress pglz data is going to lead to a lot of user pain.  That
would be also very expensive to check at upgrade for large instances.

> *On Testing and Patch Structure*
> Agreed on both points:
> - I'll use `compression_zstd.sql` following the `compression_lz4.sql`
> pattern (removing the test_toast_ext module)

Okay.

> - I'll split the GUC refactoring into a separate preparatory patch

This refactoring, if done nicely, is worth an independent piece.  It's
something that I have actually done for the sake of the other thread,
though the result was not really much liked by others.  Perhaps I'm
just lacking imagination with this abstraction, and I'd surely welcome
different ideas.
--
Michael
Вложения

Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Robert Treat
Дата:
On Thu, Dec 18, 2025 at 5:44 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Thu, Dec 18, 2025 at 10:44:22PM +0100, Dharin Shah wrote:
> > I'll share the detailed benchmark script with the next patch revision. But
> > also a potential path forward could be that we could just fully replace
> > pglz (can bring it up later in different thread)
>
> I don't think that we will ever be able to remove pglz.  It would be
> nice, as final result of course, but I also expect that not being able
> to decompress pglz data is going to lead to a lot of user pain.  That
> would be also very expensive to check at upgrade for large instances.
>

Agreed that I can't see pglz being removed any time soon, if ever.
Thinking through what a conversion process would look like seems
unwieldy at best, so I think we definitely need it for backwards
compatibility, plus I think it is useful to have a self-contained
option. I'd almost suggest we should look at replacing lz4, but I
don't think that is significantly easier, it just has a smaller, more
invested, blast radius. That said, I do suspect ztsd could quickly
become a popular recommendation and/or default among users /
consultants / service providers.

Robert Treat
https://xzilla.net



Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Michael Paquier
Дата:
On Wed, Dec 24, 2025 at 11:50:48AM -0500, Robert Treat wrote:
> Agreed that I can't see pglz being removed any time soon, if ever.
> Thinking through what a conversion process would look like seems
> unwieldy at best, so I think we definitely need it for backwards
> compatibility, plus I think it is useful to have a self-contained
> option. I'd almost suggest we should look at replacing lz4, but I
> don't think that is significantly easier, it just has a smaller, more
> invested, blast radius.

Backward-compatibility requirements make a replacement of LZ4
basically impossible to me, for the same reasons as pglz.  We could
not replace the bit used in the va_extinfo to track if LZ4 compression
is used, either.  One thing that I do wonder is if it would make
things simpler in the long-run if we introduced a new separated vartag
for LZ4-compressed external TOAST pointers as well.  At least we'd
have a leaner design: it means that we have to keep the
varatt_external available on read, but we could update to the new
format when writing entries.  Or perhaps that's not worth the
complication based on the last sentence you are writing..

> That said, I do suspect ztsd could quickly
> become a popular recommendation and/or default among users /
> consultants / service providers.

..  Because I strongly suspect that this is going to be true, and that
zstd would just be a better replacement over lz4.  That's a trend that
I see is already going on for wal_compression.

Note that I am not on board with simply reusing varatt_external for
zstd-compressed entries, neither do I think that this is the best move
ever.  It makes the core patch simpler, but it makes things like
ToastCompressionId more complicated to think about.  If anything, I'd
consider a rename of varatt_external as the best way to go with an
intermediate "translation" structure only used in memory as I am
proposing on the other thread (something that others seem meh enough
about but I am not seeing alternate proposals floating around,
either).  This would make things like detoast_external_attr() less
confusing, I think, as the latest patch posted on this thread is
actually proving with its shortcut for toast_fetch_datum as one
example of something I'd rather not do..
--
Michael

Вложения

Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Dharin Shah
Дата:
Thanks Michael & Robert,

Agreed — I don’t think it’s realistic or practical to talk about deprecating or replacing pglz (or lz4) given on-disk compatibility requirements.

> Note that I am not on board with simply reusing varatt_external for
> zstd-compressed entries, neither do I think that this is the best move
> ever.  It makes the core patch simpler, but it makes things like
> ToastCompressionId more complicated to think about.  If anything, I'd
> consider a rename of varatt_external as the best way to go with an
> intermediate "translation" structure only used in memory as I am
> proposing on the other thread (something that others seem meh enough
> about but I am not seeing alternate proposals floating around,
> either).  This would make things like detoast_external_attr() less
> confusing, I think, as the latest patch posted on this thread is
> actually proving with its shortcut for toast_fetch_datum as one
> example of something I'd rather not do..

On the design: I understand & share the same concerns that we’d end up with multiple “sources of truth” for external compression method identification (pglz/lz4 via va_extinfo bits, zstd via vartag), and that this pushes method-specific shortcuts into detoast paths.

Would you be OK if I split this into two steps?

1.First, a refactor-only patch introducing a small decoded/in-memory representation of an external TOAST pointer, so detoast/toast code paths don’t have to reason directly about tcinfo vs vartag vs va_extinfo. This would be a cleanup with no on-disk format change and no behavioral change for existing methods. Is this the same “translation structure” approach you mentioned in the other thread? If you can point me to it, I’ll align with that proposal.

2. Then, a follow-up patch adding zstd using VARTAG_ONDISK_ZSTD, implemented on top of that abstraction to keep zstd handling centralized and minimize special-casing in detoast.
If that direction matches what you had in mind, I can first post the proposed translation structure/API for feedback before respinning the zstd patch.

Thanks,
Dharin


On Thu, Dec 25, 2025 at 1:25 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Dec 24, 2025 at 11:50:48AM -0500, Robert Treat wrote:
> Agreed that I can't see pglz being removed any time soon, if ever.
> Thinking through what a conversion process would look like seems
> unwieldy at best, so I think we definitely need it for backwards
> compatibility, plus I think it is useful to have a self-contained
> option. I'd almost suggest we should look at replacing lz4, but I
> don't think that is significantly easier, it just has a smaller, more
> invested, blast radius.

Backward-compatibility requirements make a replacement of LZ4
basically impossible to me, for the same reasons as pglz.  We could
not replace the bit used in the va_extinfo to track if LZ4 compression
is used, either.  One thing that I do wonder is if it would make
things simpler in the long-run if we introduced a new separated vartag
for LZ4-compressed external TOAST pointers as well.  At least we'd
have a leaner design: it means that we have to keep the
varatt_external available on read, but we could update to the new
format when writing entries.  Or perhaps that's not worth the
complication based on the last sentence you are writing.. 

> That said, I do suspect ztsd could quickly
> become a popular recommendation and/or default among users /
> consultants / service providers.

..  Because I strongly suspect that this is going to be true, and that
zstd would just be a better replacement over lz4.  That's a trend that
I see is already going on for wal_compression.

Note that I am not on board with simply reusing varatt_external for
zstd-compressed entries, neither do I think that this is the best move
ever.  It makes the core patch simpler, but it makes things like
ToastCompressionId more complicated to think about.  If anything, I'd
consider a rename of varatt_external as the best way to go with an
intermediate "translation" structure only used in memory as I am
proposing on the other thread (something that others seem meh enough
about but I am not seeing alternate proposals floating around,
either).  This would make things like detoast_external_attr() less
confusing, I think, as the latest patch posted on this thread is
actually proving with its shortcut for toast_fetch_datum as one
example of something I'd rather not do..
--
Michael

Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Dharin Shah
Дата:
Hello Michael,

Following up on the discussion about avoiding method-specific shortcuts in detoast paths, this patch is a refactor-only step: it introduces a small decoded/in-memory representation of an on-disk external TOAST pointer, and refactors detoast_attr() and detoast_attr_slice() to use it.

The goal is to centralize “how do we interpret an external datum?” so that detoast code paths don’t have to reason directly about va_extinfo encoding vs payload layout details. This is intended as groundwork for a follow-up patch adding a new vartag-based method (e.g., zstd) without scattering special cases in detoast paths.

Key changes
- Introduces DecodedExternalToast + ToastDecompressMethod + TOAST_EXT_HAS_TCINFO in toast_internals.h.
- Adds a small static decoder in detoast.c (decode_external_toast_pointer())
- Refactors detoast_attr() and detoast_attr_slice() to use a decode -> fetch -> decompress dispatch pattern
- No on-disk format changes; existing behavior preserved (including error behavior for unsupported compression builds).

Why HAS_TCINFO?
- Previously, “is compressed?” was used as a proxy for whether the external payload begins with tcinfo. This patch makes that explicit: HAS_TCINFO captures payload layout, which is distinct from whether the value is compressed. This separation is needed for future methods that may store external compressed payloads without tcinfo.

Testing: Core regression suites pass 

Performance: I ran a small detoast-focused benchmark that forces external storage; results were within run-to-run variance, with no measurable regression. (Benchmark script attached: benchmark_toast_detoast.sql for reproduction)

Thanks,
Dharin

On Thu, Dec 25, 2025 at 1:54 AM Dharin Shah <dharinshah95@gmail.com> wrote:
Thanks Michael & Robert,

Agreed — I don’t think it’s realistic or practical to talk about deprecating or replacing pglz (or lz4) given on-disk compatibility requirements.

> Note that I am not on board with simply reusing varatt_external for
> zstd-compressed entries, neither do I think that this is the best move
> ever.  It makes the core patch simpler, but it makes things like
> ToastCompressionId more complicated to think about.  If anything, I'd
> consider a rename of varatt_external as the best way to go with an
> intermediate "translation" structure only used in memory as I am
> proposing on the other thread (something that others seem meh enough
> about but I am not seeing alternate proposals floating around,
> either).  This would make things like detoast_external_attr() less
> confusing, I think, as the latest patch posted on this thread is
> actually proving with its shortcut for toast_fetch_datum as one
> example of something I'd rather not do..

On the design: I understand & share the same concerns that we’d end up with multiple “sources of truth” for external compression method identification (pglz/lz4 via va_extinfo bits, zstd via vartag), and that this pushes method-specific shortcuts into detoast paths.

Would you be OK if I split this into two steps?

1.First, a refactor-only patch introducing a small decoded/in-memory representation of an external TOAST pointer, so detoast/toast code paths don’t have to reason directly about tcinfo vs vartag vs va_extinfo. This would be a cleanup with no on-disk format change and no behavioral change for existing methods. Is this the same “translation structure” approach you mentioned in the other thread? If you can point me to it, I’ll align with that proposal.

2. Then, a follow-up patch adding zstd using VARTAG_ONDISK_ZSTD, implemented on top of that abstraction to keep zstd handling centralized and minimize special-casing in detoast.
If that direction matches what you had in mind, I can first post the proposed translation structure/API for feedback before respinning the zstd patch.

Thanks,
Dharin


On Thu, Dec 25, 2025 at 1:25 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Dec 24, 2025 at 11:50:48AM -0500, Robert Treat wrote:
> Agreed that I can't see pglz being removed any time soon, if ever.
> Thinking through what a conversion process would look like seems
> unwieldy at best, so I think we definitely need it for backwards
> compatibility, plus I think it is useful to have a self-contained
> option. I'd almost suggest we should look at replacing lz4, but I
> don't think that is significantly easier, it just has a smaller, more
> invested, blast radius.

Backward-compatibility requirements make a replacement of LZ4
basically impossible to me, for the same reasons as pglz.  We could
not replace the bit used in the va_extinfo to track if LZ4 compression
is used, either.  One thing that I do wonder is if it would make
things simpler in the long-run if we introduced a new separated vartag
for LZ4-compressed external TOAST pointers as well.  At least we'd
have a leaner design: it means that we have to keep the
varatt_external available on read, but we could update to the new
format when writing entries.  Or perhaps that's not worth the
complication based on the last sentence you are writing.. 

> That said, I do suspect ztsd could quickly
> become a popular recommendation and/or default among users /
> consultants / service providers.

..  Because I strongly suspect that this is going to be true, and that
zstd would just be a better replacement over lz4.  That's a trend that
I see is already going on for wal_compression.

Note that I am not on board with simply reusing varatt_external for
zstd-compressed entries, neither do I think that this is the best move
ever.  It makes the core patch simpler, but it makes things like
ToastCompressionId more complicated to think about.  If anything, I'd
consider a rename of varatt_external as the best way to go with an
intermediate "translation" structure only used in memory as I am
proposing on the other thread (something that others seem meh enough
about but I am not seeing alternate proposals floating around,
either).  This would make things like detoast_external_attr() less
confusing, I think, as the latest patch posted on this thread is
actually proving with its shortcut for toast_fetch_datum as one
example of something I'd rather not do..
--
Michael
Вложения

Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Michael Paquier
Дата:
On Mon, Dec 29, 2025 at 02:45:27PM +0100, Dharin Shah wrote:
> The goal is to centralize “how do we interpret an external datum?” so that
> detoast code paths don’t have to reason directly about va_extinfo encoding
> vs payload layout details. This is intended as groundwork for a follow-up
> patch adding a new vartag-based method (e.g., zstd) without scattering
> special cases in detoast paths.

+static bool
+decode_external_toast_pointer(const struct varlena *attr,
+                           DecodedExternalToast *decoded)
[...]
+typedef enum ToastDecompressMethod
+{
+    TOAST_DECOMP_NONE = 0,
+    TOAST_DECOMP_PGLZ = 1,
+    TOAST_DECOMP_LZ4 = 2
+} ToastDecompressMethod;
+
+typedef struct DecodedExternalToast
+{
+    Oid            toastrelid;
+    Oid            valueid;
+    uint32        rawsize;        /* Decompressed size; for future methods without tcinfo */
+    uint32        extsize;        /* On-disk payload size */
+    ToastDecompressMethod method;
+    uint8        flags;
+} DecodedExternalToast;

Yeah, honestly this is a layer I have been thinking about as well as
one option, but contrary to you I have been focusing on putting that
into varatt.h, with the exception of the value being an Oid8.  I think
that you have an interesting point in focusing your implementation to
be stored in the detoast part, though.  I'd need to spend a bit more
time to see the result this would lead at with the larger 8-byte issue
in mind, but this is something that would come at no real cost as it
has no function pointer redirection compared to what I was first
envisioning on the other thread.  That's especially true if it makes
the CompressionId business easier to mold around when adding a new
vartag.

> Why HAS_TCINFO?
> - Previously, “is compressed?” was used as a proxy for whether the external
> payload begins with tcinfo. This patch makes that explicit: HAS_TCINFO
> captures payload layout, which is distinct from whether the value is
> compressed. This separation is needed for future methods that may store
> external compressed payloads without tcinfo.

It is possible to model the on-memory data as we want.  This
suggestion would be OK with some flags.
--
Michael

Вложения

Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Dharin Shah
Дата:
Thanks Michael,

After looking more closely at your “8‑byte TOAST values / infinite loop” thread and patch series, I see this is very much the same direction you outlined there: introduce a normalized in-memory representation for external pointers (toast_external_data) and keep most call sites from having to reason about vartag_external/va_extinfo details directly [1].

For this refactor patch I kept the decoder local to detoast.c to minimize scope and avoid committing to a broader API boundary too early. But if the consensus heads toward a shared interface closer to the format definitions (as in your toast_external approach), I’m happy to respin/rework this patch to align with that direction, rather than working on parallel abstractions. It should also be straightforward to mold this refactor in the direction of the 8‑byte value-id work without changing the overall detoast structure.

On HAS_TCINFO flag: the intent is to make payload layout explicit. In the current code, “external is compressed” effectively implies “payload begins with tcinfo”, which is wired into fetch/slice logic. For a vartag-based follow-up (e.g., zstd), we may want compressed payloads without a tcinfo prefix, so having an explicit flag keeps detoast paths uniform and avoids method-specific shortcuts.

Let me know what you’d prefer for next steps: keep this patch as a detoast-local refactor, or respin it to align more directly with a shared decoded external-pointer interface in the direction of the 8‑byte work.


On Tue, Dec 30, 2025 at 12:46 AM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Dec 29, 2025 at 02:45:27PM +0100, Dharin Shah wrote:
> The goal is to centralize “how do we interpret an external datum?” so that
> detoast code paths don’t have to reason directly about va_extinfo encoding
> vs payload layout details. This is intended as groundwork for a follow-up
> patch adding a new vartag-based method (e.g., zstd) without scattering
> special cases in detoast paths.

+static bool
+decode_external_toast_pointer(const struct varlena *attr,
+                                                  DecodedExternalToast *decoded)
[...]
+typedef enum ToastDecompressMethod
+{
+       TOAST_DECOMP_NONE = 0,
+       TOAST_DECOMP_PGLZ = 1,
+       TOAST_DECOMP_LZ4 = 2
+} ToastDecompressMethod;
+
+typedef struct DecodedExternalToast
+{
+       Oid                     toastrelid;
+       Oid                     valueid;
+       uint32          rawsize;                /* Decompressed size; for future methods without tcinfo */
+       uint32          extsize;                /* On-disk payload size */
+       ToastDecompressMethod method;
+       uint8           flags;
+} DecodedExternalToast;

Yeah, honestly this is a layer I have been thinking about as well as
one option, but contrary to you I have been focusing on putting that
into varatt.h, with the exception of the value being an Oid8.  I think
that you have an interesting point in focusing your implementation to
be stored in the detoast part, though.  I'd need to spend a bit more
time to see the result this would lead at with the larger 8-byte issue
in mind, but this is something that would come at no real cost as it
has no function pointer redirection compared to what I was first
envisioning on the other thread.  That's especially true if it makes
the CompressionId business easier to mold around when adding a new
vartag.

> Why HAS_TCINFO?
> - Previously, “is compressed?” was used as a proxy for whether the external
> payload begins with tcinfo. This patch makes that explicit: HAS_TCINFO
> captures payload layout, which is distinct from whether the value is
> compressed. This separation is needed for future methods that may store
> external compressed payloads without tcinfo.

It is possible to model the on-memory data as we want.  This
suggestion would be OK with some flags.
--
Michael

Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format

От
Michael Paquier
Дата:
On Wed, Dec 31, 2025 at 04:02:24PM +0100, Dharin Shah wrote:
> Let me know what you’d prefer for next steps: keep this patch as a
> detoast-local refactor, or respin it to align more directly with a shared
> decoded external-pointer interface in the direction of the 8‑byte work.

My apologies for the rather long silence on this thread.  As the next
step of this project, I am going to put my hands of what you are
suggesting here, and see how I can align it with the 64-bit toast
value patch:
https://www.postgresql.org/message-id/CAOj6k6dEVi0NvLjMLDhyrJS_n_NZO5D_OU89AO1u53u6NCDDwQ@mail.gmail.com

What I am pretty sure about at this stage is that there is little love
for the patch set I have sent on the other thread where I have been
using pointer redirections for the TOAST function calls with
callbacks (perhaps I'll be able to apply some of the renaming patches
anyway, nobody would scream at me for that), at least nobody has put a
+1 on it or just ignored it, so this approach feels dead to me.  What
you are suggesting upthread, though, is a direction I'd like to dig
into and this comes down to how I can unify what you want to do for
zstd and what I want to do with Oid8.  Perhaps that you are right and
that it is just simpler to invest on an interface in the detoast code,
but I still see that there is nothing done for the logical decoding or
amcheck code paths, which is something my other patch is able to deal
with transparently.
--
Michael

Вложения