Re: Optimize partial TOAST decompression
From: Tomas Vondra
Subject: Re: Optimize partial TOAST decompression
Msg-id: 20190930172951.gdedvexnf4d2wv5e@development
In reply to: Re: Optimize partial TOAST decompression (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses: Re: Optimize partial TOAST decompression
List: pgsql-hackers
On Mon, Sep 30, 2019 at 09:20:22PM +0500, Andrey Borodin wrote:
>
>> On 30 Sep 2019, at 20:56, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> I mean this:
>>
>>    /*
>>     * Use int64 to prevent overflow during calculation.
>>     */
>>    compressed_size = (int32) ((int64) rawsize * 9 + 8) / 8;
>>
>> I'm not very familiar with pglz internals, but I'm a bit puzzled by
>> this. My first instinct was to compare it to this:
>>
>>    #define PGLZ_MAX_OUTPUT(_dlen)   ((_dlen) + 4)
>>
>> but clearly that's a very different (much simpler) formula. So why
>> shouldn't pglz_maximum_compressed_size simply use this macro?
>
>compressed_size accounts for possible increase of size during
>compression. pglz can consume up to 1 control byte for each 8 bytes of
>data in worst case.

OK, but does that actually translate into the formula? We essentially
need to count 8-byte chunks in the raw data, and multiply that by 9.
Which gives us something like

   nchunks = ((rawsize + 7) / 8) * 9;

which is not quite what the patch does.

>Even if whole data is compressed well - there can be prefix compressed
>extremely ineffectively. Thus, if you are going to decompress rawsize
>bytes, you need at most compressed_size bytes of compressed input.

OK, that explains why we can't use the PGLZ_MAX_OUTPUT macro.

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
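To make the difference between the two bounds concrete, here is a small standalone C sketch, not part of the patch, that evaluates the patch's expression next to the per-chunk estimate from the message above. It uses <stdint.h> types in place of PostgreSQL's int32/int64, and the variable names (rawsize, the bounds) are chosen here for illustration only:

    /*
     * Standalone sketch (not PostgreSQL code): compare the worst-case
     * bound used in the patch with the per-chunk estimate discussed
     * above. <stdint.h> types stand in for PostgreSQL's int32/int64.
     */
    #include <stdio.h>
    #include <stdint.h>

    int
    main(void)
    {
        int32_t rawsizes[] = {1, 7, 8, 9, 15, 16, 2000};
        size_t  i;

        for (i = 0; i < sizeof(rawsizes) / sizeof(rawsizes[0]); i++)
        {
            int32_t rawsize = rawsizes[i];

            /* patch: raw bytes plus one control byte per 8 bytes, rounded up */
            int32_t patch_bound = (int32_t) (((int64_t) rawsize * 9 + 8) / 8);

            /* counting whole 8-byte chunks and charging 9 bytes for each */
            int32_t chunk_bound = ((rawsize + 7) / 8) * 9;

            printf("rawsize=%6d  patch=%6d  per-chunk=%6d\n",
                   (int) rawsize, (int) patch_bound, (int) chunk_bound);
        }

        return 0;
    }

For example, rawsize = 7 gives 8 under the patch's formula (7 literal bytes plus one control byte) but 9 under the per-chunk estimate, while for multiples of 8 the patch's bound comes out one byte larger; the two expressions are close but not identical, which is the discrepancy the question above is pointing at.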