Re: GIN improvements part 1: additional information

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: GIN improvements part 1: additional information
Дата
Msg-id 52B4694A.3060201@vmware.com
обсуждение исходный текст
Ответ на Re: GIN improvements part 1: additional information  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Список pgsql-hackers
On 12/19/2013 10:44 AM, Heikki Linnakangas wrote:
> On 12/19/2013 08:37 AM, Oleg Bartunov wrote:
>> Guys,
>>
>> before digging deep into the art of comp/decomp world I'd like to know
>> if you familiar with results of
>> http://wwwconference.org/www2008/papers/pdf/p387-zhangA.pdf paper and
>> some newer research ?
>
> Yeah, I saw that paper.
>
>> Do we agree in what we really want ? Basically,
>> there are three main features: size, compression, decompression speed
>> - we should take two :)
>
> According to that Zhang et al paper you linked, the Vbyte method
> actually performs the worst on all of those measures. The other
> algorithms are quite similar in terms of size (PForDelta being the most
> efficient), while PForDelta is significantly faster to compress/decompress.
>
> Just by looking at those numbers, PForDelta looks like a clear winner.
> However, it operates on much bigger batches than the other algorithms; I
> haven't looked at it in detail, but Zhang et al used 128 integer
> batches, and they say that 32 integers is the minimum batch size. If we
> want to use it for the inline posting lists stored in entry tuples, that
> would be quite wasteful if there are only a few item pointers on the tuple.
>
> Also, in the tests I've run, the compression/decompression speed is not
> a significant factor in total performance, with either varbyte encoding
> or Simple9-like encoding I hacked together.

One disadvantage of Simple9 and other encodings that operate in batches 
is that removing a value from the middle can increase the number of 
bytes required for the remaining values. That's a problem during vacuum; 
it's possible that after vacuuming away one item pointer, the remaining 
items no longer fit on the page.

- Heikki



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: ERROR during end-of-xact/FATAL
Следующее
От: Oskari Saarenmaa
Дата:
Сообщение: [PATCH] Make various variables read-only (const)