Re: Quick-and-dirty compression for WAL backup blocks

From: Tom Lane
Subject: Re: Quick-and-dirty compression for WAL backup blocks
Msg-id: 26740.1117899967@sss.pgh.pa.us
In reply to: Quick-and-dirty compression for WAL backup blocks  (Tom Lane <tgl@sss.pgh.pa.us>)
Replies: Re: Quick-and-dirty compression for WAL backup blocks  ("Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk>)
         Re: Quick-and-dirty compression for WAL backup blocks  ("Jim C. Nasby" <decibel@decibel.org>)
List: pgsql-hackers

"Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk> writes:
>> A run-length compressor would be reasonably quick but I think that the
>> omit-the-middle-hole approach gets most of the possible win with even
>> less work.

> I can't think that a RLE scheme would be much more expensive than a 'count
> the hole' approach with more benefit, so I wouldn't like to discount this
> straight away...

RLE would require scanning the whole page with no certainty of win,
whereas count-the-hole is a certain win since you only examine bytes
that are potentially removable from the later CRC calculation.
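
To make the cost argument concrete, here is a minimal sketch of one way to read "count-the-hole": look for the run of zero bytes that contains the page's midpoint. The names `PageHole` and `find_middle_hole` are hypothetical and are not taken from the PostgreSQL source, and the choice of the midpoint as the starting position is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: where the zero run starts and how long it is. */
typedef struct
{
    size_t offset;   /* index of the first zero byte in the hole */
    size_t length;   /* number of zero bytes; 0 means no hole was found */
} PageHole;

/*
 * Find the run of zero bytes that contains the page's midpoint.
 *
 * The scan touches only the bytes inside the hole plus one non-zero
 * byte at each end, so its cost is proportional to the number of bytes
 * that can be dropped.  If the midpoint byte is non-zero, it gives up
 * after examining a single byte.
 */
static PageHole
find_middle_hole(const uint8_t *page, size_t page_size)
{
    PageHole hole = {0, 0};
    size_t   mid, start, end;

    assert(page != NULL && page_size > 0);

    mid = page_size / 2;
    if (page[mid] != 0)
        return hole;            /* nothing removable at the midpoint */

    /* Extend the run downward from the midpoint. */
    start = mid;
    while (start > 0 && page[start - 1] == 0)
        start--;

    /* Extend the run upward; 'end' is one past the last zero byte. */
    end = mid + 1;
    while (end < page_size && page[end] == 0)
        end++;

    hole.offset = start;
    hole.length = end - start;
    return hole;
}
```

A run-length encoder, by contrast, has to read every byte of the page before it knows whether the encoded form is any smaller.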

> If you do manage to go ahead with the code, I'd be very interested to see
> some comparisons in bytes written to XLog for old and new approaches for
> some inserts/updates. Perhaps we could ask Mark to run another TPC benchmark
> at OSDL when this and the CRC changes have been completed.

I've completed a test run for this (it's essentially MySQL's sql-bench
done immediately after initdb).  What I get is:

CVS tip of 6/1: ending WAL offset = 0/A364A780 = 2741282688 bytes written

CVS tip of 6/2: ending WAL offset = 0/8BB091DC = 2343604700 bytes written

or about a 15% savings.  This is with a checkpoint_segments setting of 30.
One can presume that the savings would be larger at smaller checkpoint
intervals and smaller at larger intervals, but I didn't try more than
one set of test conditions.
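
The arithmetic behind those figures can be checked with a short sketch. The helper names `wal_offset_to_bytes` and `wal_savings_percent` are hypothetical, and the conversion assumes the two hex words of a WAL location combine as a flat 64-bit value, (high << 32) | low. That assumption is harmless here because the high word is 0 in both runs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Convert a WAL location printed as "hi/lo" (two hexadecimal words)
 * into a byte count.  Assumption: the words combine as (hi << 32) | lo.
 * Returns 0 on success, -1 if the text does not parse.
 */
static int
wal_offset_to_bytes(const char *text, uint64_t *bytes)
{
    unsigned int hi, lo;

    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return -1;
    *bytes = ((uint64_t) hi << 32) | lo;
    return 0;
}

/* Percentage of WAL volume saved going from 'before' to 'after'. */
static double
wal_savings_percent(uint64_t before, uint64_t after)
{
    assert(before > 0 && after <= before);
    return 100.0 * (double) (before - after) / (double) before;
}
```

Fed the two ending offsets above, this reproduces the byte counts 2741282688 and 2343604700 and gives a saving of roughly 14.5%, consistent with "about 15%".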

I'd say that's an improvement worth having, especially considering that
it requires no net expenditure of CPU time.  But the floor is certainly
still open for discussion of more complicated approaches.
        regards, tom lane

