"Jim C. Nasby" <jim@nasby.net> writes:
> ... I'm sure there's plenty of other ways MVCC info could be
> stored without using 16/20 bytes per tuple.

I didn't really see a single workable idea there.  Keep in mind that
storage space is only one consideration (and not a real big one given
modern disk-drive sizes). Ask yourself about atomicity, failure
recovery, and update costs. RLE encoding of tuple states? Get real ---
how many rows could get wiped out by a one-bit lossage? How extensive
are the on-disk changes needed to encode a one-tuple change in state,
and how do you recover if the machine crashes when only some of those
changes are down to disk? In my opinion PG's on-disk structures are
barely reliable enough now; we don't want to introduce compression
schemes with the potential for large cross-tuple failure modes.
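
To make the one-bit question concrete, here is a rough sketch of the
failure mode. It is an illustration only, not PostgreSQL code: the RLE
layout (a list of [state, run_length] pairs), the 3000-row table, and
all function names are assumptions invented for the example. It flips a
single bit in a flat one-flag-per-tuple layout and a single bit in one
run length, then counts how many tuples come back with a wrong or
missing state.

```python
# Hypothetical illustration; this is not how PostgreSQL stores tuple state.

def rle_encode(states):
    """Collapse per-tuple states into [state, run_length] pairs."""
    runs = []
    for s in states:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return runs

def rle_decode(runs):
    """Expand [state, run_length] pairs back into per-tuple states."""
    out = []
    for state, count in runs:
        out.extend([state] * count)
    return out

def damaged(original, recovered):
    """Count tuples whose recovered state is wrong or missing."""
    wrong = sum(1 for a, b in zip(original, recovered) if a != b)
    missing = max(0, len(original) - len(recovered))
    return wrong + missing

# Assumed table: 1 = committed, 0 = aborted, in three runs of 1000 rows.
states = [1] * 1000 + [0] * 1000 + [1] * 1000

# Flat layout: one flag per tuple; corrupt one bit of one tuple.
flat = list(states)
flat[5] ^= 1
flat_damage = damaged(states, flat)

# RLE layout: corrupt one bit (value 512) in the first run length.
runs = rle_encode(states)
runs[0][1] ^= 1 << 9
rle_damage = damaged(states, rle_decode(runs))

print(flat_damage, rle_damage)
```

In the flat layout the damage stays confined to the one tuple that was
hit. In the RLE layout every tuple after the damaged run shifts
position, so the same single-bit error reaches rows that were never
touched.
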

Storing commit state in index entries has been repeatedly proposed
and repeatedly rejected, too. It converts an atomic operation
(update one word in one page) into a non-atomic, multi-page operation,
which creates lots of performance and reliability problems. And the
point of an index is to be smaller than the main table --- the more
stuff you cram into an index tuple header, the less the advantage
of having the index.
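
The atomicity point can be sketched the same way. This is a toy model
under stated assumptions, not PostgreSQL's actual write path: it assumes
a single-word write either reaches disk completely or not at all, that
each copy of the commit flag lives on a different page, and that pages
reach disk one at a time. The function names and the choice of three
indexes are made up for the example. It enumerates every point at which
a crash could occur and counts the outcomes where the copies disagree.

```python
# Toy model; names and the three-index setup are hypothetical.

def crash_outcomes(num_copies):
    """List the on-disk flag values for a crash after 0..num_copies writes.

    Each copy of the commit flag starts at 0 and is set to 1 by its own
    page write; writes are assumed to reach disk one at a time.
    """
    outcomes = []
    for writes_done in range(num_copies + 1):
        disk = [1] * writes_done + [0] * (num_copies - writes_done)
        outcomes.append(disk)
    return outcomes

def torn(outcome):
    """True if the copies of the flag disagree with each other."""
    return len(set(outcome)) > 1

# Commit state kept only in the heap tuple: one word on one page.
heap_only = crash_outcomes(1)

# Commit state also copied into entries of three indexes: four pages.
with_indexes = crash_outcomes(1 + 3)

torn_heap_only = sum(torn(o) for o in heap_only)
torn_with_indexes = sum(torn(o) for o in with_indexes)

print(torn_heap_only, torn_with_indexes)
```

With one copy there is no crash point that leaves a mixed state: the
flag is either old or new. With four copies, every crash between the
first and last write leaves the heap and the indexes disagreeing, and
recovery has to detect and repair that.
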

Criticism in the form of a patch with experimental evidence is welcome,
but I'm not really interested in debating what-if proposals, especially
not ones that are already discussed in the archives.

			regards, tom lane