Re: Table and Index compression
From | Pierre Frédéric Caillaud |
---|---|
Subject | Re: Table and Index compression |
Date | |
Msg-id | op.uyhszpgkcke6l8@soyouz |
In response to | Re: Table and Index compression (Sam Mason <sam@samason.me.uk>) |
Responses | Re: Table and Index compression |
List | pgsql-hackers |
Well, here is the patch. I've included a README, which I paste here.
If someone wants to play with it (after the CommitFest...) feel free to do so.
While it was an interesting thing to try, I don't think it has enough potential to justify more effort...

* How to test

- apply the patch
- copy minilzo.c and minilzo.h to src/backend/storage/smgr
- configure & make
- enjoy

* How it works

- pg block size set to 32K
- an extra field is added in the header telling the compressed length.
  THIS IS BAD: this information should be stored in a separate fork of the relation, because
    - it would then be backwards compatible
    - the number of bytes to read from a compressed page would be known in advance
- the table file is sparse
- the page header is not compressed
- pages are written at their normal positions, but only the compressed bytes are written
- if compression gains nothing, the uncompressed page is stored
- the filesystem doesn't store the unwritten blocks

* Benefits

- Sparse file holes are not cached, so OS disk cache efficiency is at least 2x
- Random access is faster, having a better probability to hit cache (sometimes a bit faster, sometimes it's spectacular)
- Yes, it does save space (> 50%)

* Problems

- Biggest problem: any write to a table that writes data that compresses less than whatever was there before can fail on a disk full error.
- ext3 sparse file handling isn't as fast as I wish it would be: on seq scans, even if it reads 2x less data, and decompresses very fast, it's still slower...
- many seq scans (especially with aggregates) are CPU bound anyway
- therefore, some kind of background-reader-decompressor would be needed
- pre-allocation has to be done to avoid extreme fragmentation of the file, which kind of defeats the purpose
- it still causes fragmentation

* Conclusion (for now)

It was a nice thing to try, but I believe it would be better if this was implemented directly in the filesystem, on the condition that it should be implemented well (i.e. not like NTFS compression).
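
For anyone who wants to see the layout from "How it works" in miniature, here is a rough Python sketch. It is not the patch (which is C inside the smgr layer), just a toy under stated assumptions: zlib stands in for LZO, the header is reduced to a single hypothetical 4-byte length field, a stored length of 0 is my own convention for "page stored uncompressed", and the names write_page, read_page and demo are made up for illustration.

```python
import os
import struct
import tempfile
import zlib

BLOCK_SIZE = 32 * 1024            # the README sets the pg block size to 32K
LEN_FIELD = struct.Struct("<I")   # hypothetical header field: compressed length
BODY_SIZE = BLOCK_SIZE - LEN_FIELD.size


def write_page(f, blockno, body):
    """Write one page at its normal offset, emitting only the bytes needed."""
    if len(body) != BODY_SIZE:
        raise ValueError("page body must be exactly BODY_SIZE bytes")
    packed = zlib.compress(body)  # zlib is a stand-in for LZO (assumption)
    if len(packed) >= BODY_SIZE:
        # Compression gained nothing: store the page uncompressed (length 0).
        header, payload = LEN_FIELD.pack(0), body
    else:
        header, payload = LEN_FIELD.pack(len(packed)), packed
    f.seek(blockno * BLOCK_SIZE)
    f.write(header + payload)     # the rest of the block is never written
    return len(header) + len(payload)


def read_page(f, blockno):
    """Read the uncompressed header first, then only the bytes it announces."""
    f.seek(blockno * BLOCK_SIZE)
    (packed_len,) = LEN_FIELD.unpack(f.read(LEN_FIELD.size))
    if packed_len == 0:
        return f.read(BODY_SIZE)
    return zlib.decompress(f.read(packed_len))


def demo():
    """Round-trip a compressible and an incompressible page via a temp file."""
    compressible = b"x" * BODY_SIZE
    incompressible = os.urandom(BODY_SIZE)
    with tempfile.TemporaryFile() as f:
        # Blocks 1 and 2 are never written, so they stay a hole in the file.
        written = [write_page(f, 0, compressible),
                   write_page(f, 3, incompressible)]
        ok = (read_page(f, 0) == compressible
              and read_page(f, 3) == incompressible)
    return written, ok
```

Calling demo() returns the number of bytes actually written for each page plus a flag saying whether both pages read back intact. Note that read_page has to fetch the header before it knows how much more to read, which is the drawback of keeping the length in the page header rather than in a separate fork. Whether the unwritten tail of each block really costs no disk space depends on the filesystem's sparse file support.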