On Fri, Jun 7, 2013 at 10:30 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> Turns out the benefits are imo big enough to make it worth pursuing
> further.
Yeah, those were nifty numbers.
> The problem is how to discern it from pglz: on little endian the byte
> with the two high bits unset is actually the fourth byte in a toast
> datum. So we would need to store it in the 5th byte or invent some more
> complicated encoding scheme.
>
> So I think we should just define '00' as pglz, '01' as xxx, '10' as yyy
> and '11' as storing the scheme in the next byte.
Not totally following, but I'm fine with that.
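
For readers who are also not totally following, the two-bit proposal quoted
above can be sketched roughly as below. This is only an illustration under
stated assumptions, not code from any patch: the type and function names
(ToastCompressTag, ToastCompressInfo, decode_compress_tag) are hypothetical,
'xxx' and 'yyy' stay as unnamed placeholders, and the caller is assumed to
pass a pointer to whichever byte ends up carrying the two flag bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags for the two high bits of the flag byte. */
typedef enum
{
    TOAST_COMPRESS_PGLZ = 0,     /* '00': existing pglz data */
    TOAST_COMPRESS_XXX = 1,      /* '01': first new algorithm (placeholder) */
    TOAST_COMPRESS_YYY = 2,      /* '10': second new algorithm (placeholder) */
    TOAST_COMPRESS_EXTENDED = 3  /* '11': scheme id stored in the next byte */
} ToastCompressTag;

typedef struct
{
    ToastCompressTag tag;
    uint8_t     extended_id;    /* meaningful only for TOAST_COMPRESS_EXTENDED */
    size_t      header_bytes;   /* bytes consumed by the tag: 1 or 2 */
} ToastCompressInfo;

/*
 * Decode the compression tag starting at p. Which byte of the datum p
 * points at is an assumption left to the caller.
 */
static ToastCompressInfo
decode_compress_tag(const uint8_t *p)
{
    ToastCompressInfo info;

    assert(p != NULL);

    /* The two high bits select the scheme. */
    info.tag = (ToastCompressTag) (p[0] >> 6);
    info.extended_id = 0;
    info.header_bytes = 1;

    if (info.tag == TOAST_COMPRESS_EXTENDED)
    {
        /* '11' means the real scheme id lives in the following byte. */
        info.extended_id = p[1];
        info.header_bytes = 2;
    }
    return info;
}
```

The point of reserving '11' is that only three algorithms get the cheap
one-byte form, while anything added later pays one extra byte per datum.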
>> > 3) Surely choosing the compression algorithm via GUC ala SET
>> > toast_compression_algo = ... isn't the way to go. I'd say a storage
>> > attribute is more appropriate?
>>
>> The way we do caching right now supposes that attoptions will be
>> needed only occasionally. It might need to be revised if we're going
>> to need it all the time. Or else we might need to use a dedicated
>> pg_class column.
>
> Good point. It probably belongs right beside attstorage, seems to be
> the most consistent choice anyway.
Possibly, we could even store it in attstorage. We're really only
using two bits of that byte right now, so just invent some more
letters.
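
As a rough sketch of what "invent some more letters" might look like:
attstorage currently takes one of 'p' (plain), 'e' (external), 'm' (main)
or 'x' (extended), and only 'm' and 'x' allow compression. The uppercase
letters 'M' and 'X' below, and the names StorageChoice and
parse_attstorage, are purely hypothetical and exist only to illustrate the
idea of encoding an alternative compression algorithm in the same byte.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    char        strategy;         /* one of the existing 'p', 'e', 'm', 'x' */
    bool        alt_compression;  /* true if a hypothetical new algorithm is requested */
} StorageChoice;

/*
 * Map an attstorage letter to a storage strategy plus compression choice.
 * Returns false for letters this sketch does not know about.
 */
static bool
parse_attstorage(char c, StorageChoice *out)
{
    if (out == NULL)
        return false;

    switch (c)
    {
        case 'p':               /* plain: no compression, no out-of-line */
        case 'e':               /* external: out-of-line, no compression */
        case 'm':               /* main: compression, inline preferred */
        case 'x':               /* extended: compression and out-of-line */
            out->strategy = c;
            out->alt_compression = false;
            return true;
        case 'M':               /* hypothetical: 'm' with the new algorithm */
            out->strategy = 'm';
            out->alt_compression = true;
            return true;
        case 'X':               /* hypothetical: 'x' with the new algorithm */
            out->strategy = 'x';
            out->alt_compression = true;
            return true;
        default:
            return false;
    }
}
```

Only the two compressing strategies get a new letter in this sketch, since
a compression choice is meaningless for 'p' and 'e'.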
> Alternatively, if we only add one form of compression, we can just
> always store in snappy/lz4/....
Not following.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company