> pg_upgrade means you can't just redefine the current toast bits so the
> compressed bit means "data is compressed, check first byte of varlena data
> for algorithm" because existing data won't have that, the first byte will
> be the start of the compressed data stream.

Is there any small sequence of initial bytes you wouldn't ever see in PGLZ output? Either something invalid, or something obviously nonoptimal like run(n,'A')||run(n,'A') where PGLZ would have just output run(2n,'A')?
I don't think we need to worry: choosing the compression method per column makes this issue go away, because the method is recorded with the column rather than inside each datum. Per-datum compression would make it easier to switch methods (no table rewrite required), at the cost of extra storage in every varlena, which probably isn't worth it anyway.
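
To make the trade-off concrete, here is a rough sketch in C of the two ways a reader could find the compression method. Everything in it is hypothetical (the CompressionMethod values, ColumnInfo, and both function names are made up for illustration and are not PostgreSQL's actual structures or APIs); it only shows where the method would come from in each scheme.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical method identifiers, not real catalog values. */
typedef enum {
    METHOD_PGLZ = 0,
    METHOD_OTHER = 1
} CompressionMethod;

/* Hypothetical per-column metadata: the method is stored once per column. */
typedef struct {
    CompressionMethod method;
} ColumnInfo;

/*
 * Per-column scheme: the method comes from column metadata and the payload
 * is handed to the decompressor untouched. Existing data, whose first byte
 * is already part of the compressed stream, stays readable.
 */
CompressionMethod
method_per_column(const ColumnInfo *col, const uint8_t *payload, size_t len,
                  const uint8_t **stream, size_t *stream_len)
{
    *stream = payload;
    *stream_len = len;
    return col->method;
}

/*
 * Per-datum scheme: the first payload byte is treated as a method tag.
 * This costs one extra byte per value, and a datum written before the tag
 * existed has its first stream byte misinterpreted as a tag.
 */
CompressionMethod
method_per_datum(const uint8_t *payload, size_t len,
                 const uint8_t **stream, size_t *stream_len)
{
    assert(len >= 1);           /* a tagged datum needs at least the tag */
    *stream = payload + 1;
    *stream_len = len - 1;
    return (CompressionMethod) payload[0];
}
```

With the per-datum reader, an untagged pre-upgrade datum that happens to begin with byte 0x01 would be reported as METHOD_OTHER and lose its first stream byte, which is the pg_upgrade problem described in the quoted text. The per-column reader never inspects the payload, so it avoids that problem without spending a byte per value.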