[PATCH] Add zstd compression for TOAST using extended header format
| From | Dharin Shah |
|---|---|
| Subject | [PATCH] Add zstd compression for TOAST using extended header format |
| Date | |
| Msg-id | CAOj6k6dy2CRVA6Lsb5N59zE-7KNVKt=oYwWyg8ULK8zOOY8e7A@mail.gmail.com |
| Responses | Re: [PATCH] Add zstd compression for TOAST using extended header format |
| List | pgsql-hackers |
Hello PG Hackers,
I would like to submit a patch that implements zstd compression for TOAST data using a 20-byte TOAST pointer format, directly addressing the concerns raised in prior discussions [1][2][3].
A bit of background: in the 2022 thread [3], Robert Haas suggested:
"we had better reserve the fourth bit pattern for something extensible e.g. another byte or several to specify the actual method"
i.e. something like:
00 = PGLZ
01 = LZ4
10 = reserved for future emergencies
11 = extended header with additional type byte
Michael also asked whether we should have "something a bit more extensible for the design of an extensible varlena header."
This patch implements that idea.
The format:
struct varatt_external_extended {
    int32  va_rawsize;     /* same as legacy */
    uint32 va_extinfo;     /* cmid=3 signals extended format */
    uint8  va_flags;       /* feature flags */
    uint8  va_data[3];     /* va_data[0] = compression method */
    Oid    va_valueid;     /* same as legacy */
    Oid    va_toastrelid;  /* same as legacy */
};
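To make the layout above concrete, here is a minimal standalone sketch (not code from the patch). It assumes va_extinfo keeps the split used by the existing varatt_external, with the low 30 bits holding the external size and the top 2 bits holding the compression method ID. All sketch_/SKETCH_ names are hypothetical, and the zstd value stored in va_data[0] is a made-up placeholder, since the message does not give it.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Oid;               /* stand-in for PostgreSQL's Oid */

/* Assumed split of va_extinfo: 30 bits of size, 2 bits of method ID. */
#define SKETCH_EXTSIZE_BITS  30
#define SKETCH_EXTSIZE_MASK  ((1U << SKETCH_EXTSIZE_BITS) - 1)

/* 2-bit patterns as listed in the message. */
#define SKETCH_CMID_PGLZ      0u
#define SKETCH_CMID_LZ4       1u
#define SKETCH_CMID_EXTENDED  3u

/* Hypothetical value for the method byte; the real value is not stated. */
#define SKETCH_METHOD_ZSTD    1u

struct varatt_external_extended {
    int32_t  va_rawsize;     /* same as legacy */
    uint32_t va_extinfo;     /* cmid=3 signals extended format */
    uint8_t  va_flags;       /* feature flags */
    uint8_t  va_data[3];     /* va_data[0] = compression method */
    Oid      va_valueid;     /* same as legacy */
    Oid      va_toastrelid;  /* same as legacy */
};

/* The fields add up to 20 bytes with no padding on common ABIs. */
static_assert(sizeof(struct varatt_external_extended) == 20,
              "extended TOAST pointer should be 20 bytes");

/* Pack an external size and a 2-bit method ID into va_extinfo. */
static inline uint32_t sketch_make_extinfo(uint32_t extsize, uint32_t cmid)
{
    return (extsize & SKETCH_EXTSIZE_MASK) | (cmid << SKETCH_EXTSIZE_BITS);
}

/* Extract the 2-bit method ID (top two bits). */
static inline uint32_t sketch_cmid(uint32_t va_extinfo)
{
    return va_extinfo >> SKETCH_EXTSIZE_BITS;
}

/* Extract the external (stored) size (low 30 bits). */
static inline uint32_t sketch_extsize(uint32_t va_extinfo)
{
    return va_extinfo & SKETCH_EXTSIZE_MASK;
}
```

With this arrangement a reader first looks at the 2-bit ID and consults va_data[0] only when the ID is 3, so the two existing methods keep their current encoding.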
A few notes:
- Zstd only applies to external TOAST, not inline compression. The 2-bit limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine anyway. Zstd's wins show up on larger values.
- A GUC, use_extended_toast_header, controls whether pglz/lz4 also use the 20-byte format (it defaults to off for compatibility; enable it if you want a consistent pointer format).
- Legacy 16-byte pointers continue to work - we check the vartag to determine which format to read.
The 4 extra bytes per pointer are negligible for typical TOAST data sizes, and they give us room to grow.
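The format selection described in the notes above could be sketched roughly as follows. This is an illustration under assumptions, not the patch's code: the enum names, the helper functions, and the vartag value 19 for the extended pointer are hypothetical (18 is the existing VARTAG_ONDISK value), and the GUC is modeled as a plain boolean parameter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Tag values: 18 matches the existing VARTAG_ONDISK; 19 is hypothetical. */
enum sketch_vartag {
    SKETCH_VARTAG_ONDISK          = 18,
    SKETCH_VARTAG_ONDISK_EXTENDED = 19
};

enum sketch_method { SKETCH_PGLZ, SKETCH_LZ4, SKETCH_ZSTD };

/*
 * Write side: zstd always needs the extended pointer; pglz/lz4 use it
 * only when use_extended_toast_header is on.
 */
static enum sketch_vartag
sketch_choose_tag(enum sketch_method method, bool use_extended_toast_header)
{
    if (method == SKETCH_ZSTD || use_extended_toast_header)
        return SKETCH_VARTAG_ONDISK_EXTENDED;
    return SKETCH_VARTAG_ONDISK;
}

/* Read side: the vartag alone decides which pointer layout to parse. */
static size_t
sketch_pointer_size(enum sketch_vartag tag)
{
    return (tag == SKETCH_VARTAG_ONDISK_EXTENDED) ? 20 : 16;
}
```

Because the reader dispatches on the tag rather than on the GUC, changing the setting later would not affect how already-stored pointers are interpreted.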
Regards,
Dharin