Re: Compression and on-disk sorting

From: Jim C. Nasby
Subject: Re: Compression and on-disk sorting
Msg-id: 20060519030243.GW64371@pervasive.com
In response to: Re: Compression and on-disk sorting  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Compression and on-disk sorting
List: pgsql-hackers

On Thu, May 18, 2006 at 04:55:17PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > Actually, I guess the amount of memory used for zlib's lookback buffer
> > (or whatever they call it) could be pretty substantial, and I'm not sure
> > if there would be a way to combine that across all tapes.
> 
> But there's only one active write tape at a time.  My recollection of
> zlib is that compression is memory-hungry but decompression not so much,
> so it seems like this shouldn't be a huge deal.

It seems more appropriate to discuss results here, rather than on
-patches...

Preliminary results are at http://jim.nasby.net/misc/compress_sort.txt.
I've run into a slight problem in that even at a compression level of
-3, zlib is cutting the on-disk size of sorts by 25x. So my pgbench sort
test with scale=150 that was producing a 2G on-disk sort is now
producing an 80M sort, which obviously fits in memory. It also cuts sort
times by more than half.

So, if nothing else, it looks like compression is definitely a win if it
means you can now fit the sort within the disk cache. While that doesn't
sound like something very worthwhile, it essentially extends work_mem
from a fraction of memory to up to ~25x memory.
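
As a rough illustration of why pgbench data can compress this well, here is
a minimal Python sketch that measures a zlib compression ratio on synthetic
rows. It is not the code from the patch. It assumes rows shaped like the
pgbench accounts table (aid, bid, abalance, and a blank-padded 84-character
filler), a made-up tab-separated text serialization rather than the real
sort tape format, and zlib level 3 as one reading of "-3". The function
names are hypothetical.

```python
import random
import zlib


def make_rows(n, seed=0):
    """Build n synthetic rows resembling pgbench accounts (an assumption).

    The tab-separated text layout is invented for illustration; it is not
    the on-disk format used by the sort code.
    """
    rng = random.Random(seed)
    filler = " " * 84  # pgbench pads the filler column with blanks
    rows = []
    for aid in range(1, n + 1):
        bid = (aid - 1) // 100000 + 1  # 100000 accounts per branch
        abalance = rng.randint(-5000, 5000)
        rows.append(f"{aid}\t{bid}\t{abalance}\t{filler}\n")
    return "".join(rows).encode("ascii")


def compression_ratio(data, level=3):
    """Return uncompressed size divided by zlib-compressed size."""
    compressed = zlib.compress(data, level)
    return len(data) / len(compressed)


def roundtrip_ok(data, level=3):
    """Check that decompressing the compressed bytes restores the input."""
    return zlib.decompress(zlib.compress(data, level)) == data


if __name__ == "__main__":
    data = make_rows(10000)
    print(f"raw bytes: {len(data)}")
    print(f"ratio at level 3: {compression_ratio(data):.1f}x")
```

The ratio this prints depends entirely on the synthetic data, so it should
not be compared directly with the 25x figure above; it only shows that the
blank filler makes rows like these highly compressible.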

I'm currently loading up a pgbench database with a scaling factor of
15000; hopefully I'll have results from that testing in the morning.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

