Re: [PATCH] Add zstd compression for TOAST using extended header format
| From | Dharin Shah |
|---|---|
| Subject | Re: [PATCH] Add zstd compression for TOAST using extended header format |
| Date | |
| Msg-id | CAOj6k6eAR=3yM8g-4Dm0jv9Tqf=ZYQ5HgC7oOgeZ5QN-JF2vaw@mail.gmail.com |
| In response to | [PATCH] Add zstd compression for TOAST using extended header format (Dharin Shah <dharinshah95@gmail.com>) |
| List | pgsql-hackers |
Hello,
Apologies for the spam. I have updated the patch with the tests corrected.
Thanks,
Dharin
On Sat, Dec 13, 2025 at 6:31 PM Dharin Shah <dharinshah95@gmail.com> wrote:
Hello PG Hackers,
I would like to submit a patch that implements zstd compression for TOAST data using a 20-byte TOAST pointer format, directly addressing the concerns raised in prior discussions [1][2][3].
A bit of background: in the 2022 thread [3], Robert Haas suggested:
"we had better reserve the fourth bit pattern for something extensible e.g. another byte or several to specify the actual method"
i.e. something like:
00 = PGLZ
01 = LZ4
10 = reserved for future emergencies
11 = extended header with additional type byte
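As a rough illustration of the bit-pattern scheme above, the two method bits could be packed and unpacked as follows. This is only a sketch: it assumes the two bits occupy the top of the 32-bit va_extinfo word, leaving the low 30 bits for the external size, and every SKETCH_/sketch_ name is hypothetical and not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical names for this sketch; they are not taken from the patch. */
#define SKETCH_EXTSIZE_BITS 30
#define SKETCH_EXTSIZE_MASK ((1U << SKETCH_EXTSIZE_BITS) - 1)

/* The four 2-bit patterns listed above. */
typedef enum SketchCmid
{
    SKETCH_CMID_PGLZ = 0,       /* 00 */
    SKETCH_CMID_LZ4 = 1,        /* 01 */
    SKETCH_CMID_RESERVED = 2,   /* 10 */
    SKETCH_CMID_EXTENDED = 3    /* 11: method lives in an extra type byte */
} SketchCmid;

/* Assumption: the method id sits in the top two bits of va_extinfo. */
static inline SketchCmid
sketch_get_cmid(uint32_t va_extinfo)
{
    return (SketchCmid) (va_extinfo >> SKETCH_EXTSIZE_BITS);
}

/* The remaining low 30 bits carry the external (stored) size. */
static inline uint32_t
sketch_get_extsize(uint32_t va_extinfo)
{
    return va_extinfo & SKETCH_EXTSIZE_MASK;
}

/* Pack a size and a method id into one 32-bit word. */
static inline uint32_t
sketch_make_extinfo(uint32_t extsize, SketchCmid cmid)
{
    return (extsize & SKETCH_EXTSIZE_MASK) |
           ((uint32_t) cmid << SKETCH_EXTSIZE_BITS);
}
```

With this layout, a reader that sees pattern 11 knows it has to look at the additional type byte to find the actual method, while patterns 00 and 01 keep their existing meaning.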
Michael also asked whether we should have "something a bit more extensible for the design of an extensible varlena header."
This patch implements that idea.
The format:
struct varatt_external_extended {
    int32  va_rawsize;     /* same as legacy */
    uint32 va_extinfo;     /* cmid=3 signals extended format */
    uint8  va_flags;       /* feature flags */
    uint8  va_data[3];     /* va_data[0] = compression method */
    Oid    va_valueid;     /* same as legacy */
    Oid    va_toastrelid;  /* same as legacy */
};

A few notes:
- Zstd only applies to external TOAST, not inline compression. The 2-bit limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine anyway. Zstd's wins show up on larger values.
- A GUC, use_extended_toast_header, controls whether pglz/lz4 also use the 20-byte format (it defaults to off for compatibility; you can enable it if you want a consistent format).
- Legacy 16-byte pointers continue to work - we check the vartag to determine which format to read.
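To make the 16-byte versus 20-byte distinction concrete, here is a minimal, self-contained sketch of the two layouts and of a reader that picks one based on a tag. It is not the patch's code: the Sketch*/sketch_* names and the tag enum are hypothetical stand-ins (the actual vartag values are not given in this message), and the legacy branch assumes the method id sits in the top two bits of va_extinfo.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t SketchOid;     /* stand-in for PostgreSQL's Oid */

/* Legacy layout: four 4-byte fields. */
typedef struct SketchLegacyPointer
{
    int32_t   va_rawsize;
    uint32_t  va_extinfo;
    SketchOid va_valueid;
    SketchOid va_toastrelid;
} SketchLegacyPointer;

/* Extended layout from the message: one flags byte plus three data bytes. */
typedef struct SketchExtendedPointer
{
    int32_t   va_rawsize;
    uint32_t  va_extinfo;
    uint8_t   va_flags;
    uint8_t   va_data[3];       /* va_data[0] = compression method */
    SketchOid va_valueid;
    SketchOid va_toastrelid;
} SketchExtendedPointer;

_Static_assert(sizeof(SketchLegacyPointer) == 16, "legacy pointer is 16 bytes");
_Static_assert(sizeof(SketchExtendedPointer) == 20, "extended pointer is 20 bytes");

/* Hypothetical tags; the real vartag values are not stated in the message. */
typedef enum SketchTag
{
    SKETCH_TAG_LEGACY,
    SKETCH_TAG_EXTENDED
} SketchTag;

/* The tag alone decides how many payload bytes to read. */
static inline size_t
sketch_pointer_size(SketchTag tag)
{
    return tag == SKETCH_TAG_EXTENDED ? sizeof(SketchExtendedPointer)
                                      : sizeof(SketchLegacyPointer);
}

/* Copy out of the (possibly unaligned) payload before reading fields. */
static inline uint8_t
sketch_compression_method(SketchTag tag, const void *payload)
{
    if (tag == SKETCH_TAG_EXTENDED)
    {
        SketchExtendedPointer ext;

        memcpy(&ext, payload, sizeof(ext));
        return ext.va_data[0];
    }
    else
    {
        SketchLegacyPointer legacy;

        memcpy(&legacy, payload, sizeof(legacy));
        /* Assumption: legacy method id is the top two bits of va_extinfo. */
        return (uint8_t) (legacy.va_extinfo >> 30);
    }
}
```

Because the reader branches on the tag before touching the payload, existing 16-byte pointers never need to be rewritten, and the extended layout can carry up to 256 method ids in va_data[0].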
The 4 extra bytes per pointer are negligible for typical TOAST data sizes, and they give us room to grow.

Regards,
Dharin