Re: Table and Index compression

From: Sam Mason
Subject: Re: Table and Index compression
Message-ID: 20090807123835.GD5407@samason.me.uk
In reply to: Re: Table and Index compression (Greg Stark <gsstark@mit.edu>)
List: pgsql-hackers
On Fri, Aug 07, 2009 at 12:59:57PM +0100, Greg Stark wrote:
> On Fri, Aug 7, 2009 at 12:48 PM, Sam Mason<sam@samason.me.uk> wrote:
> >> Well most users want compression for the space savings. So running out
> >> of space sooner than without compression when most of the space is
> >> actually unused would disappoint them.
> >
> > Note that, as far as I can tell, for a filesystem you only need to keep
> > enough reserved for the amount of uncompressed dirty buffers you have in
> > memory.  As space runs out in the filesystem all that happens is that
> > the amount of (uncompressed?) dirty buffers you can safely have around
> > decreases.
> 
> And when it drops to zero?

That was why I said you need to have one page left "to handle the base
case".  I was treating the inductive case as the interesting common case
and considered the base case of lesser interest.

> > In PG's case, it would seem possible to do the compression and then
> > check to see if the resulting size is greater than 4kB.  If it is you
> > write into the 4kB page size and write uncompressed data.  Upon reading
> > you do the inverse, if it's 4kB then no need to decompress.  I believe
> > TOAST does this already.
> 
> It does, as does gzip and afaik every compression system.

It's still a case that needs to be handled explicitly by the code.  Just
for reference, gzip does not appear to do this when I test it:
  echo -n 'a' | gzip > tmp.gz
  gzip -l --verbose tmp.gz

says the compression ratio is "-200%" (an empty string results in
an infinite increase in size yet gets displayed as "0%" for some
strange reason).  It's only when you hit six 'a's that you start to get
positive ratios.  Note that this is taking headers into account;
the compressed size is 23 bytes for both 'aaa' and 'aaaaaa' but the
uncompressed size obviously changes.
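
For anyone who wants to repeat the measurement, a rough sketch along
these lines prints the gzip output size for very short inputs.  The
helper name gzip_size is made up for illustration, and the exact byte
counts may differ between gzip versions, so none are quoted here:

```shell
#!/bin/sh
# gzip_size N: hypothetical helper that feeds N 'a' bytes to gzip and
# prints the size of the compressed stream in bytes.
gzip_size() {
    # head -c N /dev/zero yields N NUL bytes; tr turns them into 'a'.
    # The final tr strips the padding some wc implementations emit.
    head -c "$1" /dev/zero | tr '\0' 'a' | gzip -c | wc -c | tr -d ' '
}

# Show input size against output size for 0..8 bytes of input.
n=0
while [ "$n" -le 8 ]; do
    printf '%d bytes in -> %s bytes out\n' "$n" "$(gzip_size "$n")"
    n=$((n + 1))
done
```

The output size includes the gzip header and trailer, which is why such
tiny inputs always come out larger than they went in.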

gzip does indeed have a "copy" method, but it doesn't seem to be used
here.
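
To illustrate what handling the case explicitly could look like, here
is a minimal sketch that keeps whichever of the raw and gzip'ed forms
is smaller.  This is not how TOAST or gzip do it internally; the helper
name store_block and the printed "raw"/"compressed" tag (standing in
for whatever flag or size check the reader would use) are assumptions
made for the example:

```shell
#!/bin/sh
# store_block IN OUT: hypothetical helper.  Writes IN to OUT in
# whichever form is smaller and prints which form was chosen, so that
# a reader knows whether it has to decompress.
store_block() {
    in=$1
    out=$2
    tmp=$(mktemp) || return 1
    gzip -c "$in" > "$tmp"
    raw_size=$(wc -c < "$in" | tr -d ' ')
    gz_size=$(wc -c < "$tmp" | tr -d ' ')
    if [ "$gz_size" -lt "$raw_size" ]; then
        # Compression helped: keep the compressed copy.
        mv "$tmp" "$out"
        echo compressed
    else
        # Compression made things worse (or no better): store verbatim.
        cp "$in" "$out"
        rm -f "$tmp"
        echo raw
    fi
}
```

With this, a one-byte file ends up stored raw, while a page full of
zeros ends up stored compressed.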

-- 
  Sam  http://samason.me.uk/

