Re: Zedstore - compressed in-core columnar storage

From Konstantin Knizhnik
Subject Re: Zedstore - compressed in-core columnar storage
Date
Msg-id 8391a140-55dd-7b78-42b7-c1cbfdba0df9@postgrespro.ru
In reply to Re: Zedstore - compressed in-core columnar storage  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Zedstore - compressed in-core columnar storage  (Ashwin Agrawal <aagrawal@pivotal.io>)
List pgsql-hackers

On 09.04.2019 18:51, Alvaro Herrera wrote:
> On 2019-Apr-09, Konstantin Knizhnik wrote:
>
>> On 09.04.2019 3:27, Ashwin Agrawal wrote:
>>> Heikki and I have been hacking recently for few weeks to implement
>>> in-core columnar storage for PostgreSQL. Here's the design and initial
>>> implementation of Zedstore, compressed in-core columnar storage (table
>>> access method). Attaching the patch and link to github branch [1] to
>>> follow along.
>> Thank you for publishing this patch. IMHO Postgres is really missing normal
>> support of columnar store
> Yep.
>
>> and table access method API is the best way of integrating it.
> This is not surprising, considering that columnar store is precisely the
> reason for starting the work on table AMs.
>
> We should certainly look into integrating some sort of columnar storage
> in mainline.  Not sure which of zedstore or VOPS is the best candidate,
> or maybe we'll have some other proposal.  My feeling is that having more
> than one is not useful; if there are optimizations to one that can be
> borrowed from the other, let's do that instead of duplicating effort.
>
There are two different aspects:
1. Storage format.
2. Vector execution.

1. VOPS uses a mixed format, somewhat similar to Apache Parquet.
Tuples are stored vertically, but only inside one page.
It tries to minimize the trade-offs between true horizontal and true
vertical storage:
the former is best for queries that read all columns of a row, the latter
for queries that read a small subset of columns.
To make this approach more efficient, it is better to use a larger page
size: the default Postgres 8 kB page is not enough.
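
To make the page-local vertical layout described above concrete, here is a
minimal sketch in Python. It is not the actual VOPS on-disk format: the
schema, the names build_pages, scan_column and fetch_row, and the tiny
rows_per_page value are all assumptions chosen for illustration only.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

# Hypothetical three-column schema: (id, price, tag).
Row = Tuple[int, float, str]
COLUMNS = ("id", "price", "tag")


def build_pages(rows: Sequence[Row], rows_per_page: int) -> List[Dict[str, list]]:
    """Group rows into pages; inside each page store values column by column."""
    pages = []
    for start in range(0, len(rows), rows_per_page):
        chunk = rows[start:start + rows_per_page]
        # Transpose the chunk: one contiguous list per column.
        page = {name: [row[i] for row in chunk] for i, name in enumerate(COLUMNS)}
        pages.append(page)
    return pages


def scan_column(pages: List[Dict[str, list]], name: str) -> Iterator:
    """Read a single column: only that column's list is touched in each page."""
    for page in pages:
        yield from page[name]


def fetch_row(pages: List[Dict[str, list]], row_index: int, rows_per_page: int) -> Row:
    """Reassemble a full row: all of its values live in one page."""
    page = pages[row_index // rows_per_page]
    offset = row_index % rows_per_page
    return tuple(page[name][offset] for name in COLUMNS)


rows = [(i, i * 1.5, f"t{i % 3}") for i in range(10)]
pages = build_pages(rows, rows_per_page=4)
```

A column scan walks one list per page, while fetching a whole row needs only
a single page rather than one page per column. The larger the page, the
longer each per-column run, which is the reason given above for wanting pages
bigger than 8 kB.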

From my point of view, such a format is better than pure vertical storage,
which becomes very inefficient when a query accesses a large number of columns.
That problem can be partly addressed by creating projections, i.e. grouping
several columns together, but storing multiple projections requires more
space.

2. Regardless of which format we choose, to take full advantage of the
vertical representation we need to use vector operations.
The Postgres executor doesn't support them now. This is why VOPS uses
some hacks, which is definitely not good and does not work in all cases.
zedstore does not use such hacks and ... this is why it can never reach
VOPS performance.
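
As a rough illustration of the difference between tuple-at-a-time and vector
execution, the sketch below evaluates the equivalent of
"SELECT sum(price) WHERE qty > threshold" both ways. It is only a model of
the control flow, not the Postgres executor or VOPS code; the function names,
the two-column data and the batch size are hypothetical. In a real engine the
per-batch loops would be tight compiled loops over column arrays.

```python
from typing import Iterator, List, Sequence, Tuple


def sum_tuple_at_a_time(rows: Sequence[Tuple[int, float]], threshold: int) -> float:
    """Row-oriented style: the filter and the aggregate run once per tuple."""
    total = 0.0
    for qty, price in rows:
        if qty > threshold:
            total += price
    return total


def batches(qty: List[int], price: List[float],
            batch_size: int) -> Iterator[Tuple[List[int], List[float]]]:
    """Yield aligned slices of the two columns, batch_size values at a time."""
    for start in range(0, len(qty), batch_size):
        yield qty[start:start + batch_size], price[start:start + batch_size]


def sum_vectorized(qty: List[int], price: List[float],
                   threshold: int, batch_size: int = 4) -> float:
    """Vector style: each operator is invoked once per batch of column values."""
    total = 0.0
    for q_batch, p_batch in batches(qty, price, batch_size):
        # Operator 1: the filter produces a selection mask for the whole batch.
        mask = [q > threshold for q in q_batch]
        # Operator 2: the aggregate consumes the batch and its mask in one call.
        total += sum(p for p, keep in zip(p_batch, mask) if keep)
    return total


qty = [1, 5, 3, 8, 2, 9, 7, 4]
price = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
row_result = sum_tuple_at_a_time(list(zip(qty, price)), threshold=4)
vec_result = sum_vectorized(qty, price, threshold=4)
```

Both functions return the same sum; what changes is how often the executor
machinery is entered: once per tuple in the first case, once per batch in the
second.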

The right solution is to add support for vector operations to the Postgres
planner and executor.
But that is much harder than developing the columnar store itself.



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



