Re: [PATCH] Compression and on-disk sorting

From: Martijn van Oosterhout
Subject: Re: [PATCH] Compression and on-disk sorting
Msg-id: 20060518114201.GB4359@svana.org
In reply to: Re: [PATCH] Compression and on-disk sorting  (Simon Riggs <simon@2ndquadrant.com>)
List: pgsql-patches

On Thu, May 18, 2006 at 11:34:36AM +0100, Simon Riggs wrote:
> Just do a Z_FULL_FLUSH when you hit end of block. That way all blocks
> will be independent of each other and you can rewind as much as you
> like. We can choose the block size to be 32KB or even 64KB, there's no
> dependency there, just memory allocation. It should be pretty simple to
> make the block size variable at run time, so we can select it according
> to how many files and how much memory we have.

If you know you don't need to seek, there's no need to block the data
at all; one long stream is fine. So that case is easy.
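
For illustration, the no-seek case might look roughly like the sketch below. It uses Python's zlib module rather than the C API the patch would use, and the function names compress_stream and decompress_stream are made up for this example.

```python
import zlib

def compress_stream(chunks):
    """Compress an iterable of byte chunks as one continuous zlib stream."""
    comp = zlib.compressobj()
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:                      # the compressor may buffer and emit nothing
            yield out
    yield comp.flush()               # default flush mode is Z_FINISH

def decompress_stream(pieces):
    """Read the stream back strictly front to back; no seeking is possible."""
    decomp = zlib.decompressobj()
    for piece in pieces:
        out = decomp.decompress(piece)
        if out:
            yield out
    tail = decomp.flush()
    if tail:
        yield tail
```

Nothing in the stream marks a restart point, so reading always starts at the beginning.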

For seeking, you need more work. I assume you're talking about 32KB
input block sizes (uncompressed). The output blocks will be of variable
size. These compressed blocks would be divided up into fixed 8K blocks
and written to disk.
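
A rough sketch of that layout, again in Python for brevity. It assumes raw deflate (wbits=-15) so that no stream header ties later blocks to the first one, and the names compress_blocks, to_pages and decompress_block are hypothetical. Padding the last page with zeros is also just an assumption of the sketch.

```python
import zlib

BLOCK_SIZE = 32 * 1024   # uncompressed input block
PAGE_SIZE = 8 * 1024     # fixed-size on-disk block

def compress_blocks(data):
    """Compress 32KB input blocks, doing a Z_FULL_FLUSH at the end of each."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)   # raw deflate, no header
    segments = []
    for off in range(0, len(data), BLOCK_SIZE):
        seg = comp.compress(data[off:off + BLOCK_SIZE])
        seg += comp.flush(zlib.Z_FULL_FLUSH)         # resets compressor history
        segments.append(seg)                         # variable-size output
    return segments

def to_pages(segments):
    """Cut the concatenated compressed output into fixed 8K pages."""
    stream = b"".join(segments)
    pages = [stream[i:i + PAGE_SIZE] for i in range(0, len(stream), PAGE_SIZE)]
    if pages:
        pages[-1] = pages[-1].ljust(PAGE_SIZE, b"\0")  # pad the final page
    return pages

def decompress_block(segment):
    """Decompress one segment without having seen any earlier segment."""
    return zlib.decompressobj(-15).decompress(segment)
```

Because of the full flush, decompress_block works on any single segment, but the segment boundaries no longer line up with the 8K pages, which is why some kind of header or index is needed.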

To allow seeking, you'd have to do something like a header containing:

- length of previous compressed block
- length of this compressed block
- offset of block in uncompressed bytes (from beginning of tape)
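
As a sketch only, a header with the three fields above could be packed as two 32-bit lengths and a 64-bit offset. The field widths, the byte order and the helper names (write_tape, read_block, next_pos, prev_pos) are all assumptions of this example, and it ignores the split into 8K pages.

```python
import struct
import zlib

# Hypothetical layout: previous compressed length, this compressed length,
# uncompressed offset of the block from the beginning of the tape.
HEADER = struct.Struct("<IIQ")
BLOCK_SIZE = 32 * 1024

def write_tape(data):
    """Write header + compressed payload for each 32KB input block."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)   # raw deflate
    out = bytearray()
    prev_len = 0
    for off in range(0, len(data), BLOCK_SIZE):
        payload = comp.compress(data[off:off + BLOCK_SIZE])
        payload += comp.flush(zlib.Z_FULL_FLUSH)     # make the block independent
        out += HEADER.pack(prev_len, len(payload), off)
        out += payload
        prev_len = len(payload)
    return bytes(out)

def read_block(tape, pos):
    """Return (uncompressed offset, data) for the block whose header is at pos."""
    _, this_len, offset = HEADER.unpack_from(tape, pos)
    start = pos + HEADER.size
    payload = tape[start:start + this_len]
    return offset, zlib.decompressobj(-15).decompress(payload)

def next_pos(tape, pos):
    """Position of the following block's header."""
    _, this_len, _ = HEADER.unpack_from(tape, pos)
    return pos + HEADER.size + this_len

def prev_pos(tape, pos):
    """Position of the preceding block's header."""
    prev_len, _, _ = HEADER.unpack_from(tape, pos)
    return pos - HEADER.size - prev_len
```

Stepping with next_pos and prev_pos needs only the header at the current position, never a decompression of the neighbouring blocks.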

This would allow you to scan backwards and forwards. If you want to be
able to jump to anywhere in the file, you may be better off storing the
file offsets (which would be implicit if the blocks are 32KB) in the
indirect blocks, using a search to find the right block, and then a
header in the block to find the offset.
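
One possible reading of this, sketched with an in-memory list standing in for the indirect blocks. The index layout and the names build_index and read_at are assumptions of the example; with fixed 32KB input blocks the search could be replaced by a division, but bisect is used here to show the general case.

```python
import bisect
import zlib

BLOCK_SIZE = 32 * 1024

def build_index(data):
    """Compress in independent blocks and record where each one lives."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)   # raw deflate
    stream = bytearray()
    starts = []       # uncompressed offset at which each block begins
    positions = []    # byte position of each compressed block in the stream
    for off in range(0, len(data), BLOCK_SIZE):
        starts.append(off)
        positions.append(len(stream))
        stream += comp.compress(data[off:off + BLOCK_SIZE])
        stream += comp.flush(zlib.Z_FULL_FLUSH)
    positions.append(len(stream))                    # sentinel: end of last block
    return bytes(stream), starts, positions

def read_at(stream, starts, positions, offset, length):
    """Read bytes at an uncompressed offset; the read must stay in one block."""
    i = bisect.bisect_right(starts, offset) - 1      # block containing offset
    segment = stream[positions[i]:positions[i + 1]]
    block = zlib.decompressobj(-15).decompress(segment)
    skip = offset - starts[i]
    return block[skip:skip + length]
```

Only the one block containing the target offset is decompressed, whatever the size of the file.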

Still, I'd like some evidence of benefits before writing up something
like that.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
