Hi,
On 05/08/2019 09:26, Andres Freund wrote:
> Hi,
>
> On 2019-08-05 00:19:28 +0200, Petr Jelinek wrote:
>> It carries that information inside the compressed value, like I said in the
>> other reply, that's why the extra byte.
>
> I'm not convinced that that is a good plan - imo the reference to the
> compressed data should carry that information.
>
> I.e. in the case of toast, at least toast pointers should hold enough
> information to determine the compression algorithm. And in the case of
> WAL, the WAL record should contain that.
>
Point taken.
>
> For external datums I suggest encoding the compression method as a
> distinct VARTAG_ONDISK_COMPRESSED, and then have that include the
> compression method as a field.
So the new reads/writes will use this and reads of old format won't
change? Sounds fine.
>
> For in-line compressed values (so VARATT_IS_4B_C), doing something
> roughly like you did, indicating the type of metadata following using
> the high bit sounds reasonable. But I think I'd make it so that if the
> highbit is set, the struct is instead entirely different, keeping a full
> 4 byte byte length, and including the compression type header inside the
> struct. Perhaps I'd combine the compression type with the high-bit-set
> part? So when the high bit is set, it'd be something like
>
> {
> int32 vl_len_; /* varlena header (do not touch directly!) */
>
> /*
> * Actually only 7 bit, the high bit determines whether this
> * is the old compression header (if unset), or this type of header
> * (if set).
> */
> uint8 type;
>
> /*
> * Stored as uint8[4], to avoid unnecessary alignment padding.
> */
> uint8[4] length;
>
> char va_data[FLEXIBLE_ARRAY_MEMBER];
> }
>
Won't this break BW compatibility on big-endian (if I understand
corretly what you are trying to achieve here)?
> I think it's worth spending some effort trying to get this right - we'll
> be bound by design choices for a while.
>
Sure, however I am not in business of redesigning TOAST from scratch
right now, even if I do agree that the current header is far from ideal.
--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/